
Troubleshooting & FAQ

Common issues, what causes them, and how to fix them.

Ingest returns 429 Too Many Requests

The writer queue is full: this is backpressure, working as intended. Compatibility shims return 429, whereas the native /api/ingest endpoint blocks instead. Clients should retry with backoff. If the 429s persist, give the queue more room (raise HELIOS_BLOCK_QUEUE_CAP and restart) or investigate why flushes are slow. See Performance tuning.
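The retry-with-backoff advice can be sketched on the client side as follows. This is a minimal illustration, not HeliosLogs client code: send() is a hypothetical callable that performs one ingest request, and TooManyRequests is a hypothetical exception it raises when the server answers 429. The delays and attempt count are arbitrary starting points.

```python
import random
import time


class TooManyRequests(Exception):
    """Raised by the hypothetical send() when the server answers 429."""


def send_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it succeeds, backing off exponentially on 429."""
    for attempt in range(max_attempts):
        try:
            return send()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the backpressure to the caller
            # Exponential ceiling with full jitter, so many clients
            # do not retry in lockstep against an already full queue.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

The jitter matters when several shippers hit the same queue: without it they tend to retry at the same instant and refill the queue together.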

Ingest returns 400 "unknown environment"

The target environment must exist first. Create it under Admin → Environments, or ingest into default. Environment names starting with _ are reserved.

Ingested events don't show up in search

  • Time range — events route by their event timestamp; widen the search window to where the data actually falls.
  • Wrong env/index — confirm ?env=/?index= (or the token's pinned env) match what you're searching. Try * | stats count by index.
  • Parse errors — check the ingest response's errors count; malformed lines are counted, not ingested.
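Two of the checks above can be scripted. The sketch below assumes the ingest response is a JSON object with an integer errors field; the text names the count but not the exact response shape, so treat that as an assumption. It also assumes you can pull the event timestamps out of a sample of what you sent. Both function names are hypothetical helpers.

```python
import json
from datetime import datetime, timezone


def rejected_count(response_body):
    """Return the 'errors' count from an ingest response (JSON shape assumed)."""
    payload = json.loads(response_body)
    return int(payload.get("errors", 0))


def outside_window(event_times, start, end):
    """Return the event timestamps that fall outside the window [start, end)."""
    return [t for t in event_times if not (start <= t < end)]


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    end = datetime(2024, 1, 2, tzinfo=timezone.utc)
    sample = [datetime(2023, 12, 31, 23, 0, tzinfo=timezone.utc)]
    # Anything listed here is routed by its own timestamp and will not
    # appear in a search limited to [start, end).
    print(outside_window(sample, start, end))
```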

Logged out / 401 after a restart

The JWT secret was regenerated or isn't shared. If it isn't persisted (or, in a cluster, isn't identical on every node), tokens stop validating. Pin HELIOS_JWT_SECRET_PATH to a stable, shared file.

Multi-node: data or settings not converging

  • Sync interval — cross-node visibility is eventual (≈HELIOS_BLOCK_SYNC_SECS, default 10s). Give it a moment.
  • Mismatched secrets — every node needs the same control key and JWT secret, and the same HELIOS_CONTROL_ENCRYPTION setting. A control-key mismatch means a node can't decrypt the shared control plane. See Multi-node.
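One way to check for mismatched secrets without copying them around is to compare a short fingerprint of each secret file on every node. This is a sketch under assumptions: the secrets are high-entropy files readable by the operator, and the helper names are hypothetical. Run it on each node against the JWT secret file (the path set in HELIOS_JWT_SECRET_PATH) and the control key file, then compare the outputs.

```python
import hashlib
from pathlib import Path


def fingerprint(secret_bytes):
    """Return the first 16 hex characters of the SHA-256 of the secret."""
    return hashlib.sha256(secret_bytes).hexdigest()[:16]


def file_fingerprint(path):
    """Fingerprint a secret file byte for byte, without printing its content."""
    return fingerprint(Path(path).read_bytes())
```

Fingerprints that differ between nodes mean the files differ, including by a trailing newline, since the bytes are hashed as stored.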

Retention isn't deleting old data

  • No retention is configured — set a global default (HELIOS_RETENTION_DEFAULT_DAYS or Admin → General) or a per-env override. Unset = keep forever.
  • The sweep runs on an interval (HELIOS_RETENTION_SWEEP_SECS, default hourly). To force it now, call POST /api/admin/gc. See Indexes & retention.
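Forcing a sweep could look like the sketch below. Only the POST /api/admin/gc path comes from the text. The base URL is a placeholder, and the Bearer-token header is an assumption about how the admin API authenticates, so adjust both to your deployment.

```python
import urllib.request


def build_gc_request(base_url, token):
    """Build (but do not send) a POST to the retention sweep endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/admin/gc",
        data=b"",  # empty body; the endpoint path is the only documented part
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
        method="POST",
    )


if __name__ == "__main__":
    req = build_gc_request("http://localhost:8080", "ADMIN_TOKEN")  # placeholders
    print(req.get_method(), req.full_url)
    # To actually trigger the sweep:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```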

FIPS binary won't start

A FIPS build aborts if the validated module can't load. On macOS, set DYLD_FALLBACK_LIBRARY_PATH to the directory of libaws_lc_fips_*crypto.dylib. On Linux/Docker this is handled for you. Confirm via the crypto provider in Admin → General.
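On macOS, finding the directory to put in DYLD_FALLBACK_LIBRARY_PATH can be sketched as below. The filename pattern is the one given above. The search root is whatever directory you installed or built into, which this sketch cannot know, and find_fips_lib_dirs is a hypothetical helper name.

```python
from pathlib import Path


def find_fips_lib_dirs(search_root):
    """Return the directories under search_root that hold the FIPS crypto dylib."""
    pattern = "libaws_lc_fips_*crypto.dylib"
    return sorted({str(p.parent) for p in Path(search_root).rglob(pattern)})


if __name__ == "__main__":
    for directory in find_fips_lib_dirs("."):
        # Print a line that can be pasted into the shell before starting the binary.
        print(f"export DYLD_FALLBACK_LIBRARY_PATH={directory}")
```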

The agent or AI monitors don't work

The LLM provider isn't configured or enabled. Set a provider, enter credentials, run Test connection, and enable the agent. Check the _helioslogs self-log for provider errors.

An MCP client can't connect

  • Use the right URL: POST /mcp.
  • Include Authorization: Bearer <token> if an MCP auth token is set.
  • Check the env/index and tool allowlists — a hidden tool is also rejected if called. Smoke-test with the tools/list curl from the MCP page.
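The checklist above can be turned into a small smoke test. This sketch builds a JSON-RPC 2.0 tools/list call to POST /mcp, with the Bearer header added only when a token is set. The base URL is a placeholder, and the Accept header follows the generic MCP HTTP transport rather than anything stated here. Whether this server expects an initialize handshake first is not covered by the text, so the curl on the MCP page remains the reference.

```python
import json
import urllib.request


def build_tools_list_request(base_url, token=None):
    """Build (but do not send) a tools/list JSON-RPC request for POST /mcp."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mcp",
        data=body.encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    req = build_tools_list_request("http://localhost:8080", "MCP_TOKEN")  # placeholders
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req)
```

A tool missing from the returned list is hidden by the allowlist, which matches the rejection you would see when calling it.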

SAML login fails

The user sees a generic error; the real reason is logged. Search _helioslogs for saml_* events. Common causes: certificate/audience mismatch, an expired or replayed assertion, or no matching HeliosLogs user (SAML is match-only — create the user first).

Forgot the admin password

Set HELIOS_ADMIN_RESET=1 together with HELIOS_ADMIN_PASSWORD and restart; the admin password is reset and sessions revoked. Unset it again afterward. See First steps.

FAQ

Do I need a database? No — HeliosLogs is a single binary; the control plane is encrypted JSON on disk (or in the shared store).

How do I scale out / get HA? Point multiple nodes at one shared store and share the secret files.

How do I back up? Replicate the data dir / shared store and back up the two secret files. See Upgrades & backups.

Can existing shippers send to HeliosLogs? Yes — it accepts Elasticsearch, Splunk HEC, Loki, OTLP, and syslog. See Compatibility endpoints.