Replay storage costs

VoiceGateway’s conversation replay captures every STT chunk, LLM token, TTS frame, and conversation-state snapshot for every session. This page surfaces the on-disk storage cost so the trade-off between fidelity and footprint is visible before you set per-project retention.

What gets stored

Replay data lives in four tables, defined by the migration 0004 schema:

Table	Row payload	Typical row size
`replay_stt_events`	JSON: `text`, `is_final`, `alternatives`	100 - 400 bytes
`replay_llm_tokens`	JSON: `token_text`, `role`, `is_tool_invoke`, `tool_args_partial`	80 - 200 bytes
`replay_tts_frames`	JSON: `frame_duration_ms`, `underrun`, `voice_id`	80 - 120 bytes
`replay_state_snapshots`	JSON: `system_prompt`, `message_history`, `tool_call_in_flight`, `structured_output_collected`	500 - 5000 bytes (depends on prompt + history size)

Every row also carries the same boilerplate columns: id, session_id, t_ms, provider, cost_usd, created_at. These add roughly 80-120 bytes of column overhead per row. Indexes on (session_id, t_ms) add another 30-50% of payload size.
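The per-row arithmetic above can be sketched as a small helper. This is an illustration only: the function name and the default values (100 bytes of column overhead, a 40% index ratio) are assumptions picked from the middle of the ranges quoted above, not values read from VoiceGateway.

```python
def estimated_row_bytes(payload_bytes, column_overhead_bytes=100, index_overhead_ratio=0.4):
    """Rough on-disk size of one replay row (hypothetical helper).

    payload_bytes         -- size of the JSON payload for the row
    column_overhead_bytes -- assumed midpoint of the 80-120 byte boilerplate range
    index_overhead_ratio  -- assumed midpoint of the 30-50% index overhead range
    """
    index_bytes = payload_bytes * index_overhead_ratio
    return payload_bytes + column_overhead_bytes + index_bytes


# A 100-byte TTS frame payload versus a 5000-byte state snapshot payload.
print(estimated_row_bytes(100))
print(estimated_row_bytes(5000))
```

For small rows such as TTS frames, the fixed column overhead is about as large as the payload itself, while for state snapshots the payload dominates.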

Per-minute estimate

For a typical voice conversation (caller speaks for half the time, agent for half, normal-cadence LLM with a 500-token system prompt):

Modality	Events per minute	Bytes per minute
STT (partials + finals)	30 - 60	8 KB - 24 KB
LLM tokens	200 - 500	30 KB - 80 KB
TTS frames (20-50ms each)	600 - 1500	60 KB - 180 KB
State snapshots (1/sec cap)	60	30 KB - 300 KB

Total: roughly 130 KB - 580 KB per minute of conversation. Short, crisp exchanges sit near the floor; chatty agents with long conversation histories sit near the ceiling. Even the floor of this estimate is above the 30-100 KB/min design target, so realistic agents should be expected to exceed it. If you find yourself consistently trending above 500 KB/min, the per-project replay.enabled: false toggle is the fastest mitigation.
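The total is simply the sum of the per-modality ranges in the table. A minimal sketch of that sum follows; the dictionary keys are illustrative labels, not identifiers from the codebase.

```python
# (low, high) KB per minute, copied from the per-minute estimate table.
PER_MINUTE_KB = {
    "stt": (8, 24),
    "llm_tokens": (30, 80),
    "tts_frames": (60, 180),
    "state_snapshots": (30, 300),
}

floor_kb = sum(low for low, _ in PER_MINUTE_KB.values())
ceiling_kb = sum(high for _, high in PER_MINUTE_KB.values())

print(floor_kb, ceiling_kb)  # 128 584
```

The exact sums are 128 KB and 584 KB, which the text rounds to 130 KB and 580 KB. State snapshots account for more than half of the ceiling, so prompt and history size is the main driver of the spread.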

Worked example

A solo developer running 100 voice calls per day, averaging 5 minutes each:

100 calls/day × 5 min/call × 200 KB/min = 100,000 KB/day ≈ 100 MB/day

At the default 90-day retention:

100 MB/day × 90 days ≈ 9 GB total replay storage

At AWS S3 standard storage prices (~$0.023/GB-month):

9 GB × $0.023 ≈ $0.21/month

On the local SQLite database (no cloud markup), the cost is the disk byte cost: ~$0.01/GB-month on a developer SSD, or ~$0.09/month for the same 9 GB. A team or agency running 10,000 conversations per day at 3 minutes average scales linearly: 10,000 × 3 min × 200 KB/min ≈ 6 GB/day, so ~540 GB at 90-day retention, ~$12/month on S3 standard or ~$5/month on local disk. At that point the retention_days knob matters: dropping to 30 days cuts storage to one-third.
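To rerun the solo-developer arithmetic with your own call volume, a sketch along these lines may help. The function names are hypothetical, and the defaults (200 KB/min, 90 days, decimal units where 1 GB = 1,000,000 KB) are taken from the worked example rather than from any VoiceGateway API.

```python
KB_PER_GB = 1_000_000  # decimal units, matching the 100,000 KB ≈ 100 MB rounding


def replay_storage_gb(calls_per_day, minutes_per_call, kb_per_minute=200, retention_days=90):
    """Steady-state replay storage once the retention window is full."""
    kb_per_day = calls_per_day * minutes_per_call * kb_per_minute
    return kb_per_day * retention_days / KB_PER_GB


def monthly_cost_usd(storage_gb, usd_per_gb_month):
    """Monthly cost of holding storage_gb at a flat per-GB-month price."""
    return storage_gb * usd_per_gb_month


solo_gb = replay_storage_gb(calls_per_day=100, minutes_per_call=5)
print(solo_gb)                                      # 9.0
print(round(monthly_cost_usd(solo_gb, 0.023), 2))   # 0.21

# Dropping retention to 30 days cuts the footprint to one-third.
print(replay_storage_gb(100, 5, retention_days=30))  # 3.0
```

The figures are steady-state values: storage grows linearly until the retention window fills, then stays flat as old rows age out.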

Tuning knobs

Four per-project knobs live in voicegw.yaml’s replay: block. Only the first two change storage volume:

Knob	Default	Effect
`enabled`	`true`	Capture for every session. Set `false` to skip capture entirely.
`retention_days`	`90`	Age replay rows out after this window. Lower to reduce footprint linearly.
`flush_size_events`	`500`	Number of events per batched write. Smaller values write more often; larger values hold more events in memory. No storage effect.
`buffer_size_events`	`5000`	In-memory cap. Above this, oldest events are dropped with a counter. The dashboard surfaces dropped events as “events dropped here” rather than silently misleading the developer.

The enabled toggle is the binary on/off. The retention_days knob is the gradient lever. The buffer_size_events and flush_size_events knobs trade off memory pressure and write batching but do not change long-term storage volume.
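The interaction between buffer_size_events and flush_size_events can be pictured with a toy model. The class below is a hypothetical sketch of a drop-oldest buffer with a dropped-event counter; it is not VoiceGateway's implementation, and the class and method names are invented for illustration.

```python
from collections import deque


class ReplayBuffer:
    """Toy bounded event buffer: drops the oldest event when full (assumed behavior)."""

    def __init__(self, buffer_size_events=5000, flush_size_events=500):
        self.events = deque()
        self.buffer_size_events = buffer_size_events
        self.flush_size_events = flush_size_events
        self.dropped = 0  # counter a dashboard could surface

    def append(self, event):
        if len(self.events) >= self.buffer_size_events:
            self.events.popleft()  # discard the oldest pending event
            self.dropped += 1
        self.events.append(event)

    def take_batch(self):
        """Return one full batch, oldest first, or [] if a full batch is not yet pending."""
        if len(self.events) < self.flush_size_events:
            return []
        return [self.events.popleft() for _ in range(self.flush_size_events)]


buf = ReplayBuffer(buffer_size_events=3, flush_size_events=2)
for event_id in range(5):
    buf.append(event_id)

print(buf.dropped)       # 2
print(buf.take_batch())  # [2, 3]
```

In this model neither knob changes how many bytes eventually reach disk when the writer keeps up; they only decide how much is held in memory and how many events are lost if it falls behind.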

Dashboard storage view

GET /api/replay/storage returns per-project replay byte totals. The dashboard surfaces this as a breakdown so the developer sees the cost in real time:

{
  "total_replay_size_bytes": 9234567890,
  "by_project": [
    {"project": "acme", "replay_size_bytes": 8000000000},
    {"project": "default", "replay_size_bytes": 1234567890}
  ]
}

A future follow-up will surface this in a sidebar panel on the dashboard with cost-per-month estimates inline.
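Until then, a monthly estimate can be derived client-side from the response. The sketch below parses the sample payload shown above and applies a flat per-GB-month price; the function name and the default price (the S3 standard figure used in the worked example) are assumptions, and fetching the endpoint itself is left out.

```python
import json


def monthly_cost_by_project(storage_response, usd_per_gb_month=0.023):
    """Map each project to an estimated monthly storage cost in USD (hypothetical helper)."""
    costs = {}
    for entry in storage_response["by_project"]:
        size_gb = entry["replay_size_bytes"] / 1e9  # decimal GB
        costs[entry["project"]] = round(size_gb * usd_per_gb_month, 4)
    return costs


# Sample body of GET /api/replay/storage, copied from the example above.
sample = json.loads("""
{
  "total_replay_size_bytes": 9234567890,
  "by_project": [
    {"project": "acme", "replay_size_bytes": 8000000000},
    {"project": "default", "replay_size_bytes": 1234567890}
  ]
}
""")

print(monthly_cost_by_project(sample))
```

With the sample payload, acme comes out at roughly $0.18/month and default at roughly $0.03/month on S3 standard pricing.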
