Docker Deployment

Deploy VoiceGateway in production with Docker Compose. The daemon serves the HTTP API and the web dashboard on the same port, so a single service is enough. The setup below includes persistent storage, health checks, and an optional Ollama sidecar for local LLM inference. For hosting the collector on a VPS, Railway, or Fly.io, see Deployment.

Project Structure

your-project/
  docker-compose.yml
  voicegw.yaml
  .env

Environment Variables

Create a .env file with your provider API keys:

# .env
DEEPGRAM_API_KEY=your-deepgram-key
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GROQ_API_KEY=your-groq-key
CARTESIA_API_KEY=your-cartesia-key
ELEVENLABS_API_KEY=your-elevenlabs-key
ASSEMBLYAI_API_KEY=your-assemblyai-key

# Optional: set a fixed Fernet key for encryption across container restarts
VOICEGW_SECRET=your-base64-fernet-key

Never commit .env files to version control. Add .env to your .gitignore.

Generating a Fernet Key

If you do not set VOICEGW_SECRET, VoiceGateway auto-generates one on first run and stores it in the container. Since containers are ephemeral, set this explicitly for production:

python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
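Before starting the stack, it can help to confirm that the value in .env really has the shape of a Fernet key: URL-safe base64 that decodes to exactly 32 bytes. The sketch below uses only the standard library; the helper name is_valid_fernet_key is illustrative and not part of VoiceGateway.

```python
import base64
import binascii


def is_valid_fernet_key(value: str) -> bool:
    """Return True if value is URL-safe base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except (binascii.Error, ValueError):
        # Bad padding, or characters that cannot be encoded as ASCII.
        return False
    return len(raw) == 32
```

A placeholder such as your-base64-fernet-key fails this check, which makes it a cheap guard against deploying with the sample value still in place.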

Configuration

Create voicegw.yaml:

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
  deepgram:
    api_key: ${DEEPGRAM_API_KEY}
  cartesia:
    api_key: ${CARTESIA_API_KEY}
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
  groq:
    api_key: ${GROQ_API_KEY}
  elevenlabs:
    api_key: ${ELEVENLABS_API_KEY}

models:
  stt:
    deepgram/nova-3:
      provider: deepgram
      model: nova-3
  llm:
    openai/gpt-4.1-mini:
      provider: openai
      model: gpt-4.1-mini
    anthropic/claude-sonnet-4-20250514:
      provider: anthropic
      model: claude-sonnet-4-20250514
  tts:
    cartesia/sonic-3:
      provider: cartesia
      model: sonic-3
      default_voice: 794f9389-aac1-45b6-b726-9d9369183238

stacks:
  premium:
    stt: deepgram/nova-3
    llm: openai/gpt-4.1-mini
    tts: cartesia/sonic-3

fallbacks:
  stt:
    - deepgram/nova-3
  llm:
    - openai/gpt-4.1-mini
    - anthropic/claude-sonnet-4-20250514
  tts:
    - cartesia/sonic-3
    - elevenlabs/turbo-v2.5

projects:
  prod:
    name: Production
    daily_budget: 100.00
    budget_action: throttle
    default_stack: premium
    tags: [production]

cost_tracking:
  enabled: true

rate_limits:
  openai:
    requests_per_minute: 60
  deepgram:
    requests_per_minute: 100
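The configuration above refers to provider keys through ${VAR} placeholders, so an unset variable means a provider without credentials. As a rough pre-flight check, the sketch below scans the raw file text for placeholders and lists the ones that are missing or empty in the environment. It assumes placeholders are resolved from environment variables, as the example suggests, and it does not parse YAML, so commented-out lines are scanned too. The function name missing_env_vars is hypothetical.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; only NAME is captured.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-[^}]*)?\}")


def missing_env_vars(config_text: str, env=None) -> list[str]:
    """Return placeholder names that are unset or empty in env, sorted."""
    env = os.environ if env is None else env
    names = sorted(set(_PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    with open("voicegw.yaml", encoding="utf-8") as fh:
        for name in missing_env_vars(fh.read()):
            print(f"missing: {name}")
```

Run it from a shell that has the same variables exported as the container will receive; Docker Compose reads .env on its own, but a plain Python process does not.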

Docker Compose

version: "3.8"

services:
  voicegateway:
    build:
      context: .
      dockerfile: src/voicegateway/Dockerfile
    container_name: voicegateway
    ports:
      - "8080:8080"
    volumes:
      - voicegw-data:/data
      - ./voicegw.yaml:/app/voicegw.yaml:ro
    environment:
      - VOICEGW_CONFIG=/app/voicegw.yaml
      - VOICEGW_DB_PATH=/data/voicegw.db
      - VOICEGW_SECRET=${VOICEGW_SECRET:-}
      # Provider API keys from .env
      - DEEPGRAM_API_KEY=${DEEPGRAM_API_KEY:-}
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - GROQ_API_KEY=${GROQ_API_KEY:-}
      - CARTESIA_API_KEY=${CARTESIA_API_KEY:-}
      - ELEVENLABS_API_KEY=${ELEVENLABS_API_KEY:-}
      - ASSEMBLYAI_API_KEY=${ASSEMBLYAI_API_KEY:-}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    networks:
      - voicegw-net

  # The dashboard runs inside the voicegateway service: the daemon
  # mounts the React SPA at / and the dashboard API at /api/* on
  # the same port as the public HTTP API. No second service needed.

  # Optional: local LLM via Ollama
  ollama:
    image: ollama/ollama:latest
    container_name: voicegateway-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama-models:/root/.ollama
    profiles:
      - local  # Only starts with: docker compose --profile local up
    restart: unless-stopped
    networks:
      - voicegw-net

volumes:
  voicegw-data:
    driver: local
  ollama-models:
    driver: local

networks:
  voicegw-net:
    driver: bridge

Starting the Services

Cloud-only (API + Dashboard)

docker compose up -d

This starts the voicegateway service on port 8080, which serves both the HTTP API and the web dashboard.

With Local Ollama

docker compose --profile local up -d

# Pull a model into Ollama
docker exec voicegateway-ollama ollama pull qwen2.5:3b

This adds:

ollama on port 11434

Update voicegw.yaml to use the container hostname:

providers:
  ollama:
    base_url: http://ollama:11434
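To confirm that the pulled model is actually available, you can ask Ollama for its local model list through its /api/tags endpoint. The sketch below is a minimal check using the standard library. Note that the hostname ollama resolves only from containers on voicegw-net; from the host, use http://localhost:11434. The function names are illustrative.

```python
import json
import urllib.request


def model_names(tags_payload: bytes) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    data = json.loads(tags_payload)
    return [m["name"] for m in data.get("models", [])]


def ollama_has_model(base_url: str, model: str, timeout_s: float = 5.0) -> bool:
    """Return True if the Ollama server at base_url lists the given model."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return model in model_names(resp.read())


if __name__ == "__main__":
    print(ollama_has_model("http://localhost:11434", "qwen2.5:3b"))
```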

Fleet collector (Postgres)

To run the self-hosted collector that many LiveKit agents push telemetry to, use the Postgres-backed stack. The same container serves POST /v1/ingest, the dashboard API, and the SPA on port 8080.

docker compose -f docker-compose.collector.yml up -d

This starts a postgres service and a collector service. The collector reads its database URL from VOICEGW_DB_URL; setting that variable enables storage automatically, so cost_tracking does not need to be turned on by hand. The image ships the postgres extra (asyncpg) and the migrations, so it builds its schema on first start.

Provide a voicegw.yaml next to the compose file (the collector mainly needs auth.api_keys for the agents’ virtual keys), and set VOICEGW_PG_PASSWORD for anything beyond a local trial.

To skip Compose, point the official image at any existing Postgres instance:

docker run -p 8080:8080 \
  -e VOICEGW_DB_URL="postgresql+asyncpg://user:pass@host:5432/voicegw" \
  -v "$(pwd)/voicegw.yaml:/app/voicegw.yaml:ro" \
  -e VOICEGW_CONFIG=/app/voicegw.yaml \
  mahimairaja/voicegateway:latest
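Passwords that contain characters such as @, / or : need percent-encoding before they go into a URL like the one above, otherwise the host part may be misread. The sketch below builds the value for VOICEGW_DB_URL with the standard library, assuming the collector parses it as a conventional database URL. The helper name build_db_url is hypothetical.

```python
from urllib.parse import quote


def build_db_url(user: str, password: str, host: str, database: str, port: int = 5432) -> str:
    """Assemble a postgresql+asyncpg URL with percent-encoded credentials."""
    return (
        "postgresql+asyncpg://"
        f"{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


if __name__ == "__main__":
    print(build_db_url("voicegw", "p@ss/w:rd", "db.internal", "voicegw"))
    # -> postgresql+asyncpg://voicegw:p%40ss%2Fw%3Ard@db.internal:5432/voicegw
```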

Verifying the Deployment

Health Check

curl http://localhost:8080/health

{
  "status": "ok",
  "uptime_seconds": 42.3,
  "version": "0.1.0"
}
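In deploy scripts it is often more convenient to wait for the service than to poll by hand. The sketch below retries the /health endpoint until it returns the "status": "ok" body shown above or a deadline passes. It uses only the standard library; the function names and default timings are illustrative choices, not VoiceGateway settings.

```python
import http.client
import json
import time
import urllib.error
import urllib.request


def is_healthy(payload: bytes) -> bool:
    """True if the /health body is a JSON object with status == "ok"."""
    try:
        data = json.loads(payload)
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("status") == "ok"


def wait_until_healthy(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll url until it reports healthy; return False once timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200 and is_healthy(resp.read()):
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # Service not accepting connections yet; retry after a pause.
        time.sleep(interval_s)
    return False


if __name__ == "__main__":
    ok = wait_until_healthy("http://localhost:8080/health")
    print("healthy" if ok else "timed out")
```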

Provider Status

curl http://localhost:8080/v1/status

Dashboard

Open http://localhost:8080 in your browser to see the dashboard with cost charts, latency metrics, and request logs. The daemon serves both the React UI and the dashboard API at this port.

Production Considerations

Persistent Storage

The voicegw-data volume stores the SQLite database. This persists across container restarts and rebuilds. To back up:

# Copy the database out of the volume
docker cp voicegateway:/data/voicegw.db ./backup-$(date +%Y%m%d).db
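A file copied while the daemon is writing can, in principle, be an inconsistent snapshot, so it is worth checking a backup before relying on it. The sketch below opens the copy read-only and runs SQLite's PRAGMA integrity_check. It assumes Python is available on the machine that holds the backup; the function name backup_is_intact is illustrative.

```python
import sqlite3
from pathlib import Path


def backup_is_intact(path: str) -> bool:
    """Return True if the SQLite file at path opens and passes integrity_check."""
    # mode=ro opens the file read-only and never creates a missing file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error:
        return False
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a SQLite database.
        return False
    finally:
        conn.close()
    return rows == [("ok",)]
```

Call it with the path of the copied file and discard any backup for which it returns False.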

Encryption Key Persistence

If you do not set VOICEGW_SECRET, a new Fernet key is generated on first run and stored in the container filesystem (not the volume). This means:

Rebuilding the container loses the key
Encrypted API keys in the database become undecryptable
You will need to re-add managed providers

Always set VOICEGW_SECRET in production via the .env file or a secrets manager.

Reverse Proxy

For TLS termination, put Nginx or Caddy in front:

  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - ./certs:/etc/nginx/certs:ro
    depends_on:
      - voicegateway
    networks:
      - voicegw-net

Resource Limits

For production deployments, add resource constraints:

  voicegateway:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 1G
        reservations:
          cpus: "0.5"
          memory: 256M

Logging

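VoiceGateway logs to stdout. Configure Docker’s logging driver to rotate logs locally or to ship them elsewhere; the example below caps the default json-file driver at three files of 10 MB each: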

  voicegateway:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

Connecting Your Application

From your voice agent application, point inference factories at the deployed VoiceGateway by sharing the same voicegw.yaml:

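import os

# Set the config path before importing voicegateway.
os.environ["VOICEGW_CONFIG"] = "/path/to/voicegw.yaml"

from voicegateway import inference

# Model IDs match the keys under models: in voicegw.yaml.
stt = inference.STT("deepgram/nova-3")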

For pure observability calls (status, costs, logs), hit the HTTP API:

import httpx
resp = httpx.get("http://localhost:8080/v1/status")

If your application runs in a separate container on the same Docker network, use the service name:

# From another container on voicegw-net
resp = httpx.get("http://voicegateway:8080/v1/status")
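Since the base URL differs between the host (localhost) and other containers (the service name), one option is to read it from an environment variable. The sketch below does that with the standard library. The variable name VOICEGW_URL and both function names are assumptions made for illustration, not VoiceGateway settings, and the shape of the /v1/status response is not documented here, so the parsed JSON is returned as-is.

```python
import json
import os
import urllib.request

DEFAULT_BASE_URL = "http://localhost:8080"


def gateway_base_url(env=None) -> str:
    """Read the gateway base URL from VOICEGW_URL (hypothetical), else use localhost."""
    env = os.environ if env is None else env
    return env.get("VOICEGW_URL", DEFAULT_BASE_URL).rstrip("/")


def fetch_status(base_url: str, timeout_s: float = 5.0):
    """GET /v1/status and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/v1/status", timeout=timeout_s) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(fetch_status(gateway_base_url()))
```

In a container on voicegw-net you would set VOICEGW_URL=http://voicegateway:8080; on the host the default applies.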

Updating

# Pull latest code and rebuild
git pull
docker compose build
docker compose up -d

# The SQLite database auto-migrates on startup
# No manual migration steps needed