# voicegw livekit

> Diagnostics commands for inspecting agents, measuring end-to-end latency, and probing SFU health against a LiveKit server.

Diagnostics for a running LiveKit deployment. Four subcommands cover agent listing, end-to-end latency measurement, SFU health, and an all-in-one check report.

```bash
voicegw livekit [OPTIONS]
```

## Credentials

All four subcommands resolve LiveKit credentials in the same order:

1. **CLI flags**: `--url`, `--api-key`, `--api-secret` (highest priority).
2. **Environment variables**: `LIVEKIT_URL`, `LIVEKIT_API_KEY`, `LIVEKIT_API_SECRET`.
3. **Config file**: a `livekit:` block in `voicegw.yaml` (lowest priority).

If credentials are missing after all three layers, the command exits with an error before making any network calls.

```yaml
# voicegw.yaml
livekit:
  url: wss://my-project.livekit.cloud
  api_key: ${LIVEKIT_API_KEY}
  api_secret: ${LIVEKIT_API_SECRET}
```

***

## voicegw livekit agents

List agents that are currently active in rooms on the LiveKit server.

```bash
voicegw livekit agents [OPTIONS]
```

### What it reports

Queries the LiveKit server API for all active rooms and the participants currently inside them. For each participant identified as an agent (dispatched or joined), the command reports:

| Column     | Description                          |
| ---------- | ------------------------------------ |
| **Agent**  | Participant identity string.         |
| **Room**   | Room name the agent is currently in. |
| **State**  | `active` or `dispatched`.            |
| **Joined** | Timestamp the participant joined.    |

### Idle workers: the heartbeat roster

The LiveKit server API exposes in-room participants only. Agents that are registered and waiting for dispatch (the idle worker pool) are not returned by any server API.
To close that gap, instrument your agents with `voicegateway.register_worker(...)`: each worker then heartbeats its presence (idle / busy / offline) to the collector. When `VOICEGW_COLLECTOR_URL` and `VOICEGW_API_KEY` are set, `agents` also fetches that roster and prints it below the in-room table. Without the collector, only the in-room view is shown, plus a note on how to enable the roster. The roster fetch is best-effort: if the collector is unreachable, the in-room table still renders.

```
Registered workers (heartbeat roster):
AGENT      STATUS  REGION  VERSION
realty     busy    iad     0.13.0
concierge  idle    -       0.13.0

3 workers registered (1 idle, 1 busy, 1 offline).
```

### Example output

```
AGENT       ROOM       STATE       IN-CALL  AGE
agent-7f4a  demo-room  active      1        42s
agent-2c9b  qa-room    dispatched  0        8s

2 agents active in 2 rooms.
Idle/registered workers are not reported by LiveKit's server API. Set VOICEGW_COLLECTOR_URL + VOICEGW_API_KEY (and run register_worker in your agents) to also list the heartbeat roster.
```

### Options

| Flag           | Type     | Default           | Description                                                                                                 |
| -------------- | -------- | ----------------- | ----------------------------------------------------------------------------------------------------------- |
| `--url`        | `string` | (see Credentials) | LiveKit server WebSocket URL.                                                                               |
| `--api-key`    | `string` | (see Credentials) | LiveKit API key.                                                                                            |
| `--api-secret` | `string` | (see Credentials) | LiveKit API secret.                                                                                         |
| `--json`       | flag     | off               | Emit JSON instead of plain text. With the collector set, the JSON is `{"in_room": [...], "roster": [...]}`. |

***

## voicegw livekit latency

Measure end-to-end voice latency by placing real synthetic test calls to each agent.

```bash
voicegw livekit latency [OPTIONS]
```

### What it measures

Always reports **end-to-end latency**: the time from the end of the caller's speech to the first reply audio frame received from the agent. This is the number users perceive.

For each probe turn the command:

1. Joins a test room as a synthetic caller.
2. Plays a short utterance and waits for end-of-utterance (EOU).
3. Records the time from speech-end to the first reply audio frame arriving from the agent.

| Metric          | Description                                                                          |
| --------------- | ------------------------------------------------------------------------------------ |
| **E2E latency** | Caller speech-end to first reply audio (seconds). This is the number users perceive. |

### Per-component breakdown (turn-detect / STT / LLM / TTS)

When the probed agent is instrumented with `voicegateway.attach(session)`, the command also shows the latency split across turn detection, STT, LLM (time-to-first-token), and TTS. The correlation key is the probe room name: `attach` stamps it on every captured row, and the probe reads those rows back by room after the turns finish.

This read-back is **co-located** in this version: the agent and the prober must share the same local VoiceGateway store (`~/.config/voicegateway/voicegw.db`, or `VOICEGW_DB_PATH`). In collector mode (`VOICEGW_COLLECTOR_URL` set) the rows go to the collector rather than this host, so the split is not shown from the CLI. When no split is available, the command prints a one-line note explaining what is needed.

### Cost warning

**Each probe is a real agent turn.** The agent's STT, LLM, and TTS providers are invoked with live credentials and will incur real provider charges. Run with a low `--trials` value (`1` or `2`) unless you are deliberately benchmarking. Keep `--agent` scoped to avoid probing every agent.

### Options

| Flag                   | Type      | Default           | Description                                             |
| ---------------------- | --------- | ----------------- | ------------------------------------------------------- |
| `--agent`              | `string`  | all agents        | Probe only the named agent identity.                    |
| `--trials`             | `integer` | `3`               | Number of probe turns per agent.                        |
| `--warmup/--no-warmup` | flag      | warmup on         | Discard first trial as cold-start warmup.               |
| `--target-ms`          | `integer` | `1500`            | Mark agent SLOW if avg E2E exceeds this threshold (ms). |
| `--url`                | `string`  | (see Credentials) | LiveKit server WebSocket URL.                           |
| `--api-key`            | `string`  | (see Credentials) | LiveKit API key.                                        |
| `--api-secret`         | `string`  | (see Credentials) | LiveKit API secret.                                     |

### Example output

```
agent-7f4a  E2E avg 0.82s  p50 0.82s  p95 0.84s  GOOD (<1.5s)
  turn-detect 0.30 . STT 0.12 . LLM-ttft 0.45 . TTS 0.09
agent-2c9b  E2E avg 1.14s  p50 1.14s  p95 1.18s  GOOD (<1.5s)
  breakdown (turn-detect/STT/LLM/TTS) needs an instrumented agent (voicegateway.attach) writing to the same local store, co-located
```

***

## voicegw livekit sfu

Measure SFU connection quality from the host running `voicegw`.

```bash
voicegw livekit sfu [OPTIONS]
```

### What it measures

Baseline mode (no flags):

* Connects to the LiveKit SFU and sends data-channel pings.
* Reports round-trip time (RTT) and the LiveKit connection quality score.
* Runs from wherever `voicegw` is executing. If that host is co-located with the SFU (the typical self-hosted setup), the result represents the real agent-to-SFU signal.

Load-ramp mode (`--load`):

* Ramps concurrent prober connections through the levels in `--ramp`.
* At each concurrency level, runs for `--duration` and records RTT and quality score.
* Identifies the capacity knee: the concurrency level at which RTT degrades or quality drops.
* A resource monitor watches CPU and memory on the prober host. If the host itself saturates during the ramp, the output flags this so results are not mistaken for SFU limits.

### Distributed load (multi-vantage)

A single host only shows what one machine can push.
To load the SFU concurrently from several regions, run one **coordinator** and N **probers**:

```bash
# on the coordinator host (needs the [server] extra: pip install 'voicegateway[server]')
voicegw livekit sfu --coordinator --expect 3 --ramp 10,25,50 --duration 20s

# on each prober host / region (needs only the base install)
voicegw livekit sfu --report-to http://:8787 --vantage iad
voicegw livekit sfu --report-to http://:8787 --vantage sjc
voicegw livekit sfu --report-to http://:8787 --vantage lhr
```

Each prober registers, waits at a shared barrier so every vantage starts its ramp at the same instant, ramps the **same** room, and reports its per-tier measurements back. The coordinator aggregates: at each tier the SFU sees the sum of all vantages' clients, while rtt / loss / quality report the worst any single vantage saw. It then prints the combined capacity and cleans up the shared rooms.

```
SFU distributed: 3 vantages
combined: 30(3v)-> 14.0ms 0.0% Good . 75(3v)-> 22.0ms 0.1% Good . 150(3v)-> 55.0ms 1.4% Poor
combined knee ~75 clients
iad : 10-> 12.0ms 0.0% . 25-> 20.0ms 0.0% . 50-> 48.0ms 1.1%
sjc : 10-> 14.0ms 0.0% . 25-> 22.0ms 0.1% . 50-> 55.0ms 1.4%
lhr : 10-> 11.0ms 0.0% . 25-> 18.0ms 0.0% . 50-> 41.0ms 0.9%
```

To deploy probers across regions (for example on Fly.io), see [Distributed SFU probers](/deployment/distributed-sfu).

### Limitations

**Prober host saturation.** Under high `--ramp` concurrency, the machine running `voicegw` can become the bottleneck before the SFU does. The resource monitor flags CPU or memory saturation in the output so you can distinguish host limits from SFU limits. Distributing across several smaller prober hosts (above) sidesteps this.

### Options

| Flag                 | Type      | Default           | Description                                                  |
| -------------------- | --------- | ----------------- | ------------------------------------------------------------ |
| `--load`             | flag      | off               | Enable concurrency ramp mode (single host).                  |
| `--ramp`             | `string`  | `2,10,25,50`      | Comma-separated concurrency levels for the ramp.             |
| `--duration`         | `string`  | `20s`             | How long to hold each concurrency level.                     |
| `--coordinator`      | flag      | off               | Run as the distributed coordinator (needs `[server]` extra). |
| `--expect`           | `integer` | `0`               | Number of probers the coordinator waits for.                 |
| `--coordinator-port` | `integer` | `8787`            | Port the coordinator listens on.                             |
| `--report-to`        | `string`  | (none)            | Run as a prober reporting to this coordinator URL.           |
| `--vantage`          | `string`  | `$VOICEGW_REGION` | Label for this prober's vantage.                             |
| `--url`              | `string`  | (see Credentials) | LiveKit server WebSocket URL.                                |
| `--api-key`          | `string`  | (see Credentials) | LiveKit API key.                                             |
| `--api-secret`       | `string`  | (see Credentials) | LiveKit API secret.                                          |

### Example: baseline

```bash
voicegw livekit sfu
```

```
SFU vantage: co-located
baseline: rtt 11ms . loss 0.0% . Excellent
```

### Example: load ramp

```bash
voicegw livekit sfu --load --ramp 2,10,25,50 --duration 20s
```

```
SFU vantage: co-located
baseline: rtt 11ms . loss 0.0% . Excellent
ramp: 2-> 11ms 0.0% . 10-> 12ms 0.0% . 25-> 18ms 0.1% . 50-> 41ms 1.2%
knee ~25 clients
prober: ~12% CPU + ~80 kbps up per client; host sustains ~40 before CPU-bound
```

***

## voicegw livekit check

Run all three diagnostics and print a single pass/warn/fail report.

```bash
voicegw livekit check [OPTIONS]
```

### What it runs

Executes `agents`, `latency` (two trials per agent), and `sfu` (baseline) in sequence. For each item it assigns a status:

| Status   | Meaning                                                             |
| -------- | ------------------------------------------------------------------- |
| **PASS** | Metric within acceptable range.                                     |
| **WARN** | Metric degraded but not failing (e.g. latency above `--target-ms`). |
| **FAIL** | Error, unreachable, or hard threshold exceeded.                     |

The command exits 0 if everything passes, 1 if any item is WARN or FAIL.
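The exit-code rule above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: the names `Status`, `overall_verdict`, and `exit_code` are hypothetical, and the assumption that the report verdict is simply the worst individual status is inferred from the documented statuses rather than confirmed by the source.

```python
from enum import IntEnum


class Status(IntEnum):
    """Per-item statuses, ordered so that a higher value is worse."""
    PASS = 0
    WARN = 1
    FAIL = 2


def overall_verdict(statuses):
    """Assumed roll-up: the verdict is the worst status among all items."""
    return max(statuses, default=Status.PASS)


def exit_code(statuses):
    """Exit 0 only when every item is PASS; any WARN or FAIL exits 1."""
    return 0 if overall_verdict(statuses) == Status.PASS else 1


if __name__ == "__main__":
    items = [Status.PASS, Status.WARN, Status.PASS]
    print(overall_verdict(items).name, exit_code(items))
```

Because WARN already produces a non-zero exit, a CI job that should tolerate warnings would need to inspect the report itself rather than rely on the exit code alone.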
### Options

| Flag           | Type      | Default           | Description                                          |
| -------------- | --------- | ----------------- | ---------------------------------------------------- |
| `--target-ms`  | `integer` | `1500`            | Latency threshold (ms) for the WARN boundary.        |
| `--url`        | `string`  | (see Credentials) | LiveKit server WebSocket URL.                        |
| `--api-key`    | `string`  | (see Credentials) | LiveKit API key.                                     |
| `--api-secret` | `string`  | (see Credentials) | LiveKit API secret.                                  |
| `--json`       | flag      | off               | Emit a structured JSON record instead of plain text. |

### Example: plain text output

```bash
voicegw livekit check --target-ms 1000
```

```
VERDICT: WARN

AGENT       ROOM       STATE       IN-CALL  AGE
agent-7f4a  demo-room  active      1        42s
agent-2c9b  qa-room    dispatched  0        8s

2 agents active in 2 rooms.
Idle/registered workers are not reported by LiveKit's server API. Set VOICEGW_COLLECTOR_URL + VOICEGW_API_KEY (and run register_worker in your agents) to also list the heartbeat roster.

agent-7f4a  E2E avg 0.82s  p50 0.82s  p95 0.84s  GOOD (<1.0s)
  breakdown (turn-detect/STT/LLM/TTS) needs an instrumented agent (voicegateway.attach) writing to the same local store, co-located
agent-2c9b  E2E avg 1.14s  p50 1.14s  p95 1.18s  SLOW (<1.0s)
  breakdown (turn-detect/STT/LLM/TTS) needs an instrumented agent (voicegateway.attach) writing to the same local store, co-located

SFU vantage: co-located
baseline: rtt 11ms . loss 0.0% . Excellent
```

### Example: JSON output

```bash
voicegw livekit check --target-ms 1000 --json
```

```json
{
  "agents": [
    {"agent_name": "agent-7f4a", "room": "demo-room", "identity": "agent-7f4a", "state": "active", "humans": 1, "age_s": 42.0},
    {"agent_name": "agent-2c9b", "room": "qa-room", "identity": "agent-2c9b", "state": "dispatched", "humans": 0, "age_s": 8.0}
  ],
  "latency": [
    {"agent": "agent-7f4a", "stats": {"avg": 0.82, "p50": 0.82, "p95": 0.84, "min": 0.80, "max": 0.84, "trials": 2}, "components": null},
    {"agent": "agent-2c9b", "stats": {"avg": 1.14, "p50": 1.14, "p95": 1.18, "min": 1.10, "max": 1.18, "trials": 2}, "components": null}
  ],
  "sfu": {
    "baseline": {"clients": 1, "rtt_ms": 11.0, "loss_pct": 0.0, "quality": "Excellent"},
    "ramp": [],
    "knee": null
  },
  "verdict": "WARN"
}
```

### Exit codes

| Code | Meaning                                                                |
| ---- | ---------------------------------------------------------------------- |
| `0`  | All checks passed.                                                     |
| `1`  | One or more checks are WARN or FAIL, or credentials were not resolved. |

***

## Shared limitations

The following limitations apply across all four subcommands:

**In-room agents only, unless workers heartbeat.** The LiveKit server API does not expose idle (pre-dispatch) workers. `agents` shows the idle/registered roster only when your agents call `voicegateway.register_worker(...)` and the collector is configured; otherwise it (and `latency`) see only agents currently in rooms.

**Real provider cost on latency probes.** Every `latency` probe invokes the agent's actual STT, LLM, and TTS pipeline, so each one incurs provider charges. Use low `--trials` counts for routine checks.

**Per-component breakdown needs an instrumented, co-located agent.** The split across turn-detection, STT, LLM, and TTS requires agents instrumented with `voicegateway.attach(session)` writing to the same local store as the prober. In collector mode the split is not shown from the CLI.

**SFU vantage.** `sfu` measures from the host running `voicegw`.
For a self-hosted setup where the gateway and SFU share a network, the co-located result is the right signal; for end-user latency in other regions, use the distributed coordinator/prober mode above.

**Prober host saturation.** During `sfu --load`, the prober machine can saturate before the SFU does. The resource monitor flags this in the output.

***

## Related commands

* [`voicegw smoke-test`](/cli/smoke-test): validate the inference pipeline without a LiveKit server.
* [`voicegw status`](/cli/status): check provider configuration.
* [`voicegw logs`](/cli/logs): view per-request cost and latency records.
* [`voicegw costs`](/cli/costs): aggregated cost view by provider and project.
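As a closing illustration, the JSON record emitted by `voicegw livekit check --json` can be post-processed in a script. The sketch below assumes only the field names visible in the JSON output example (`latency[].agent` and `latency[].stats.avg`, in seconds). The helper `slow_agents` is hypothetical and not part of `voicegw`, and `SAMPLE` is a trimmed copy of that example record.

```python
from __future__ import annotations

import json


def slow_agents(report: dict, target_ms: int = 1500) -> list[str]:
    """Return the agents whose average E2E latency exceeds target_ms."""
    threshold_s = target_ms / 1000.0
    return [
        entry["agent"]
        for entry in report.get("latency", [])
        if entry["stats"]["avg"] > threshold_s
    ]


# Trimmed copy of the sample record from the JSON output example.
SAMPLE = json.loads("""
{
  "latency": [
    {"agent": "agent-7f4a", "stats": {"avg": 0.82, "p95": 0.84}},
    {"agent": "agent-2c9b", "stats": {"avg": 1.14, "p95": 1.18}}
  ],
  "verdict": "WARN"
}
""")

if __name__ == "__main__":
    # With a 1000 ms target only agent-2c9b (avg 1.14 s) is over the line.
    print(slow_agents(SAMPLE, target_ms=1000))
```

In practice the record would come from the command's standard output (for example via `json.load(sys.stdin)`) rather than an embedded string.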