Replay storage costs
VoiceGateway’s conversation replay captures every STT chunk, LLM token, TTS frame, and conversation-state snapshot for every session. This page surfaces the on-disk storage cost so the trade-off between fidelity and footprint is visible before you set per-project retention.
What gets stored
Four tables per the migration 0004 schema:
| Table | Row payload | Typical row size |
|---|---|---|
| replay_stt_events | JSON: text, is_final, alternatives | 100 - 400 bytes |
| replay_llm_tokens | JSON: token_text, role, is_tool_invoke, tool_args_partial | 80 - 200 bytes |
| replay_tts_frames | JSON: frame_duration_ms, underrun, voice_id | 80 - 120 bytes |
| replay_state_snapshots | JSON: system_prompt, message_history, tool_call_in_flight, structured_output_collected | 500 - 5000 bytes (depends on prompt + history size) |
Every row also carries the shared columns id, session_id, t_ms, provider, cost_usd, and created_at, which add roughly 80-120 bytes of column overhead per row. Indexes on (session_id, t_ms) add another 30-50% of payload size.
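The row-size figures above can be combined into a rough per-row on-disk estimate. The sketch below is illustrative only: the function name `on_disk_row_bytes` is hypothetical, and it assumes the low and high ends of each range line up (smallest payload with smallest overhead, and so on), which real data will not do exactly.

```python
# Payload ranges in bytes, copied from the table above.
PAYLOAD_RANGES = {
    "replay_stt_events": (100, 400),
    "replay_llm_tokens": (80, 200),
    "replay_tts_frames": (80, 120),
    "replay_state_snapshots": (500, 5000),
}

def on_disk_row_bytes(payload, column_overhead=(80, 120), index_percent=(30, 50)):
    """Return a (low, high) estimate of on-disk bytes for one replay row.

    payload, column_overhead: (low, high) in bytes.
    index_percent: (low, high) index cost as a percentage of payload size.
    """
    low = payload[0] + column_overhead[0] + payload[0] * index_percent[0] // 100
    high = payload[1] + column_overhead[1] + payload[1] * index_percent[1] // 100
    return low, high

for table, payload in PAYLOAD_RANGES.items():
    low, high = on_disk_row_bytes(payload)
    print(f"{table}: {low} - {high} bytes per row")
```

For replay_stt_events this gives roughly 210 - 720 bytes per row once column and index overhead are included.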
Per-minute estimate
For a typical voice conversation (caller speaks for half the time, agent for half, normal-cadence LLM with a 500-token system prompt):
| Modality | Events per minute | Bytes per minute |
|---|---|---|
| STT (partials + finals) | 30 - 60 | 8 KB - 24 KB |
| LLM tokens | 200 - 500 | 30 KB - 80 KB |
| TTS frames (20-50ms each) | 600 - 1500 | 60 KB - 180 KB |
| State snapshots (1/sec cap) | 60 | 30 KB - 300 KB |
If these volumes are too high for a project, setting replay.enabled: false is the fastest mitigation.
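Summing the per-minute table gives a single range to plan against. The sketch below simply adds the low and high ends of each row. The dictionary keys and the function name are illustrative, not VoiceGateway identifiers.

```python
# (low, high) KB per minute for each modality, copied from the table above.
PER_MINUTE_KB = {
    "stt": (8, 24),
    "llm_tokens": (30, 80),
    "tts_frames": (60, 180),
    "state_snapshots": (30, 300),
}

def total_kb_per_minute(rows=PER_MINUTE_KB):
    """Sum the low ends and the high ends of every modality's range."""
    low = sum(lo for lo, _ in rows.values())
    high = sum(hi for _, hi in rows.values())
    return low, high

low, high = total_kb_per_minute()
print(f"{low} KB - {high} KB per minute")  # 128 KB - 584 KB per minute
```

At the high end, state snapshots account for about half of the total, so prompt and history size are the main drivers of the spread.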
Worked example
Consider a solo developer running 100 voice calls per day, averaging 5 minutes each. The retention_days knob matters: dropping from the default 90 days to 30 days cuts steady-state storage to one-third.
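The scenario above can be projected to a steady-state footprint with a few lines of arithmetic. This is a rough sketch under stated assumptions: 128 KB and 584 KB per minute are the sums of the low and high ends of the per-minute table, decimal units are used (1 GB = 1,000,000 KB), and index overhead is not added on top. The function name `retained_gb` is hypothetical.

```python
def retained_gb(calls_per_day, minutes_per_call, kb_per_minute, retention_days):
    """Steady-state storage in GB once retention_days of history have accumulated."""
    minutes_retained = calls_per_day * minutes_per_call * retention_days
    return minutes_retained * kb_per_minute / 1_000_000  # decimal: 1 GB = 1e6 KB

# 100 calls/day at 5 minutes each, for the default and a reduced retention window.
for days in (90, 30):
    low = retained_gb(100, 5, 128, days)
    high = retained_gb(100, 5, 584, days)
    print(f"retention_days={days}: {low:.2f} GB - {high:.2f} GB")
# retention_days=90: 5.76 GB - 26.28 GB
# retention_days=30: 1.92 GB - 8.76 GB
```

Because retained volume scales linearly with the window, the 30-day figures are exactly one-third of the 90-day figures.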
Tuning knobs
Four per-project knobs in voicegw.yaml’s replay: block control capture, retention, and buffering:
| Knob | Default | Effect |
|---|---|---|
| enabled | true | Capture for every session. Set to false to skip capture entirely. |
| retention_days | 90 | Age replay rows out after this window. Lowering it reduces footprint linearly. |
| flush_size_events | 500 | Batch size for writes. Smaller values write more often; larger values hold more events in memory. No storage effect. |
| buffer_size_events | 5000 | In-memory cap. Above this, the oldest events are dropped and counted. The dashboard surfaces dropped events as “events dropped here” rather than silently misleading the developer. |
The enabled toggle is a binary on/off switch. The retention_days knob is the gradient lever. The buffer_size_events and flush_size_events knobs trade off memory pressure against write batching but do not change long-term storage volume.
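To make the buffering behavior concrete, here is a minimal sketch of a drop-oldest buffer with a dropped-event counter and batched flushes. It is not VoiceGateway's implementation: the `ReplayBuffer` class and its method names are hypothetical, and only the two knob names and their defaults come from the table above.

```python
from collections import deque

class ReplayBuffer:
    """Illustrative drop-oldest buffer; an assumption, not VoiceGateway's code."""

    def __init__(self, buffer_size_events=5000, flush_size_events=500):
        self.buffer_size_events = buffer_size_events
        self.flush_size_events = flush_size_events
        self.events = deque()
        self.dropped = 0  # counter that a dashboard could surface

    def append(self, event):
        # At the cap, evict the oldest event and count it before adding the new one.
        if len(self.events) >= self.buffer_size_events:
            self.events.popleft()
            self.dropped += 1
        self.events.append(event)

    def next_batch(self):
        """Pop up to flush_size_events events, oldest first, for one batched write."""
        n = min(self.flush_size_events, len(self.events))
        return [self.events.popleft() for _ in range(n)]

buf = ReplayBuffer(buffer_size_events=5, flush_size_events=2)
for i in range(8):
    buf.append(i)
print(buf.dropped)       # 3
print(buf.next_batch())  # [3, 4]
```

Neither knob changes how many events eventually reach disk unless the buffer overflows; they only shape memory use and write frequency.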
Dashboard storage view
GET /api/replay/storage returns per-project replay byte totals. The dashboard surfaces this as a breakdown so the developer sees the cost in real time.
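The same totals can be read outside the dashboard. The sketch below is a guess at usage, not documented behavior: only the GET /api/replay/storage path comes from this page. The base URL, the absence of authentication, and the response shape (a JSON object mapping project ID to total bytes) are all assumptions, as are the helper names.

```python
import json
import urllib.request

def fetch_replay_storage(base_url):
    """GET /api/replay/storage and return the decoded JSON body.

    base_url is an assumption (for example a local VoiceGateway address);
    any authentication the real endpoint requires is not shown here.
    """
    with urllib.request.urlopen(f"{base_url}/api/replay/storage") as resp:
        return json.load(resp)

def format_bytes(n):
    """Render a byte count using decimal units, capped at GB."""
    for unit in ("B", "KB", "MB", "GB"):
        if n < 1000 or unit == "GB":
            return f"{n:.1f} {unit}"
        n /= 1000

# Hypothetical response shape: {"<project-id>": <total_bytes>, ...}
sample = {"demo-project": 5_760_000_000, "staging": 42_000}
for project, total in sorted(sample.items(), key=lambda kv: -kv[1]):
    print(f"{project}: {format_bytes(total)}")
```

Replace `sample` with the result of `fetch_replay_storage(...)` once the real response shape is confirmed.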