# Architecture Overview

VoiceGateway provides cost tracking and reconciliation for LiveKit voice agents. It returns native LiveKit STT, LLM, and TTS plugin instances backed by cloud providers or local models. On top of those instances it adds modality-aware unit accounting (audio-minutes, tokens, characters), resolver-time fallback chains, rate limiting, budget enforcement, a `voicegw reconcile` command, and a web dashboard.

## System Architecture

```mermaid
graph TB
    subgraph UserCode["User Code / LiveKit Agent"]
        A["gateway.stt('deepgram/nova-3')"]
        B["gateway.llm('openai/gpt-4.1-mini')"]
        C["gateway.tts('cartesia/sonic-3')"]
    end

    subgraph Core["Core Layer"]
        GW[Gateway]
        R[Router]
        REG[Registry]
        MID[ModelId Parser]
    end

    subgraph Middleware["Middleware Pipeline"]
        BE[BudgetEnforcer]
        IP[InstrumentedProvider]
        CT[CostTracker]
        LM[LatencyMonitor]
        RL[RateLimiter]
        FB[FallbackChain]
        LG[RequestLogger]
    end

    subgraph Providers["Provider Layer"]
        BP[BaseProvider ABC]
        OAI[OpenAI]
        DG[Deepgram]
        CA[Cartesia]
        AN[Anthropic]
        GR[Groq]
        EL[ElevenLabs]
        AA[AssemblyAI]
        OL[Ollama]
        WH[Whisper]
        KO[Kokoro]
        PI[Piper]
    end

    subgraph Storage["Storage Layer"]
        DB[(SQLite)]
        CM[ConfigManager]
        CR[Crypto / Fernet]
    end

    subgraph Interfaces["External Interfaces"]
        API[FastAPI HTTP Server]
        DASH[Dashboard - React/Vite]
        MCP[MCP Server]
        CLI[CLI - voicegw]
    end

    A --> GW
    B --> GW
    C --> GW
    GW --> BE
    BE --> R
    R --> MID
    R --> REG
    REG --> BP
    BP --> OAI & DG & CA & AN & GR & EL & AA & OL & WH & KO & PI
    GW --> IP
    IP --> CT
    IP --> LM
    GW --> FB
    CT --> DB
    API --> GW
    DASH --> API
    MCP --> GW
    CLI --> GW
    CM --> DB
    CR --> DB
```

## Request Flow

Every call to `gateway.stt()`, `gateway.llm()`, or `gateway.tts()` follows the same path:

```mermaid
sequenceDiagram
    participant App as User Code
    participant GW as Gateway
    participant BE as BudgetEnforcer
    participant R as Router
    participant MID as ModelId
    participant REG as Registry
    participant P as Provider
    participant IP as InstrumentedProvider
    participant CT as CostTracker
    participant DB as SQLite

    App->>GW: gateway.stt("deepgram/nova-3", project="prod")
    GW->>BE: check_budget("prod")
    BE->>DB: get_cost_summary("today", project="prod")
    DB-->>BE: $4.20 / $10.00 budget
    BE-->>GW: OK (under budget)
    GW->>R: resolve("deepgram/nova-3", "stt")
    R->>MID: parse("deepgram/nova-3")
    MID-->>R: ModelId(provider="deepgram", model="nova-3")
    R->>REG: create_provider("deepgram", config)
    REG-->>R: DeepgramProvider instance
    R->>P: create_stt(model="nova-3")
    P-->>R: STT instance
    R-->>GW: STT instance
    GW->>IP: wrap_provider(instance, "stt", ...)
    IP-->>GW: InstrumentedSTT wrapper
    GW-->>App: InstrumentedSTT (proxies all access)
    Note over IP,DB: On first byte/completion, records TTFB + latency
    IP->>CT: create_record(...)
    CT->>DB: log_request(record)
```
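
The final wrapping step in the flow above can be illustrated with a small proxy. This is a minimal sketch, not VoiceGateway's actual `InstrumentedSTT`: the class name `InstrumentedProxy`, the `on_call` callback, and `FakeSTT` are hypothetical, and it times synchronous method calls only, whereas the real wrappers are described as recording TTFB and latency on first byte or completion.

```python
import time
from typing import Any, Callable


class InstrumentedProxy:
    """Hypothetical stand-in for an InstrumentedSTT/LLM/TTS wrapper."""

    def __init__(self, inner: Any, on_call: Callable[[str, float], None]) -> None:
        self._inner = inner      # the provider instance being wrapped
        self._on_call = on_call  # receives (method_name, elapsed_ms)

    def __getattr__(self, name: str) -> Any:
        # __getattr__ runs only when normal lookup fails, so _inner and
        # _on_call are normally found before we get here. The guard avoids
        # infinite recursion if _inner is missing (e.g. during copying).
        if name == "_inner":
            raise AttributeError(name)
        attr = getattr(self._inner, name)
        if not callable(attr):
            return attr  # plain attributes pass straight through

        def timed(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return attr(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                self._on_call(name, elapsed_ms)

        return timed


class FakeSTT:
    """Hypothetical provider instance used only for this demonstration."""

    model = "nova-3"

    def recognize(self, audio: bytes) -> str:
        return f"text:{len(audio)}"


if __name__ == "__main__":
    records = []
    stt = InstrumentedProxy(FakeSTT(), lambda name, ms: records.append((name, ms)))
    print(stt.model)               # attribute access is forwarded
    print(stt.recognize(b"abc"))   # method call is forwarded and timed
    print(records[0][0])           # name of the recorded call
```

Because every attribute lookup that misses on the wrapper falls through to the wrapped object, calling code does not need to know whether it holds the raw instance or the instrumented one.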

## Directory Structure

The Python package lives under `voicegateway/`. A separate `dashboard/`
tree carries the FastAPI backend and the React + TypeScript + Vite +
Recharts frontend for the local cost dashboard.

```
voicegateway/
├── core/
│   ├── gateway.py
│   ├── config.py
│   ├── config_manager.py
│   ├── registry.py
│   ├── schema.py
│   └── crypto.py
├── inference/
│   ├── __init__.py
│   ├── _factory.py
│   ├── _project.py
│   ├── _session_context.py
│   ├── _resolution.py
│   ├── _stt.py
│   ├── _llm.py
│   └── _tts.py
├── providers/
│   ├── base.py
│   ├── openai_provider.py
│   ├── deepgram_provider.py
│   ├── cartesia_provider.py
│   ├── anthropic_provider.py
│   ├── groq_provider.py
│   ├── elevenlabs_provider.py
│   ├── assemblyai_provider.py
│   ├── ollama_provider.py
│   ├── whisper_provider.py
│   ├── kokoro_provider.py
│   └── piper_provider.py
├── middleware/
│   ├── cost_tracker.py
│   ├── latency_monitor.py
│   ├── rate_limiter.py
│   ├── logger.py
│   ├── budget_enforcer.py
│   └── instrumented_provider.py
├── storage/
│   ├── sqlite.py
│   └── models.py
├── server.py
├── mcp/
│   ├── server.py
│   ├── auth.py
│   ├── errors.py
│   ├── schemas.py
│   └── tools/
└── pricing/
    └── catalog.py
dashboard/
├── api/
└── frontend/
```

## Design Principles

1. **Async throughout** -- all database, HTTP, and provider operations use async/await. The Gateway provides synchronous wrapper methods for convenience.

2. **Lazy loading** -- providers are imported and instantiated only on first use, so each provider SDK stays an optional extra: `pip install voicegateway[openai]` installs only the OpenAI SDK.

3. **Transparent instrumentation** -- `InstrumentedSTT/LLM/TTS` wrappers proxy all attribute access via `__getattr__`, so user code sees the exact same API as the underlying provider instance.

4. **Config layering** -- three sources are merged at startup, listed here from highest to lowest priority: environment variables, SQLite managed tables (written by the dashboard and MCP), and YAML (base config). Each resource carries a `source` field (`"yaml"` or `"db"`).

5. **Encryption at rest** -- all API keys stored in SQLite are encrypted with Fernet (AES-128-CBC + HMAC-SHA256). Keys in API responses are masked to `secr...2345` format.
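
The precedence described under config layering can be sketched as a three-step merge. This is an illustration under stated assumptions, not the `ConfigManager` implementation: the function name `merge_layers`, the `<NAME>_API_KEY` environment variable convention, and the choice to let a SQLite row replace a same-named YAML entry wholesale are all hypothetical.

```python
from typing import Any, Dict, Mapping

Resource = Dict[str, Any]


def merge_layers(
    yaml_cfg: Mapping[str, Resource],
    db_cfg: Mapping[str, Resource],
    env: Mapping[str, str],
) -> Dict[str, Resource]:
    """Merge YAML < SQLite < environment, tagging each resource's origin."""
    merged: Dict[str, Resource] = {}

    # Lowest priority: the YAML base config.
    for name, resource in yaml_cfg.items():
        merged[name] = {**resource, "source": "yaml"}

    # Middle priority: SQLite-managed entries replace same-named YAML ones
    # (whole-entry replacement is an assumption of this sketch).
    for name, resource in db_cfg.items():
        merged[name] = {**resource, "source": "db"}

    # Highest priority: environment variables override individual fields.
    # The <NAME>_API_KEY naming scheme is hypothetical.
    for name, resource in merged.items():
        override = env.get(f"{name.upper()}_API_KEY")
        if override:
            resource["api_key"] = override

    return merged


if __name__ == "__main__":
    result = merge_layers(
        yaml_cfg={"deepgram": {"api_key": "yaml-key"}},
        db_cfg={"openai": {"api_key": "db-key"}},
        env={"DEEPGRAM_API_KEY": "env-key"},
    )
    print(result)
```

In this sketch the `source` field records where the resource was defined, not where its final field values came from, so an entry defined in YAML keeps `"yaml"` even when an environment variable overrides its key.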

## Key Components

| Component                                      | File                                     | Purpose                                                           |
| ---------------------------------------------- | ---------------------------------------- | ----------------------------------------------------------------- |
| [Gateway Core](./gateway-core)                 | `core/gateway.py`                        | Main orchestrator, entry point for all requests                   |
| [Provider Abstraction](./provider-abstraction) | `providers/base.py`                      | ABC for all 11 provider implementations                           |
| [Middleware](./middleware)                     | `middleware/`                            | Cost, latency, rate limiting, fallback, budget                    |
| [Cost Tracking](./cost-tracking)               | `pricing/`, `middleware/cost_tracker.py` | Per-request cost calculation, pricing layer, streaming validation |
| [Storage](./storage)                           | `storage/sqlite.py`                      | SQLite schema, tables, views, indexes                             |
| [Config Layers](./config-layers)               | `core/config_manager.py`                 | YAML + SQLite + env merge strategy                                |
| [Security](./security)                         | `core/crypto.py`                         | Fernet encryption, secret management, masking                     |
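
The masking listed for the Security component, which the design principles describe as producing values like `secr...2345`, can be approximated in a few lines. This is a sketch only: the function name `mask_key`, the `visible` parameter, and the handling of short keys are assumptions, and Fernet encryption itself is not shown.

```python
def mask_key(key: str, visible: int = 4) -> str:
    """Show the first and last `visible` characters and hide the rest."""
    # Assumption: a key too short to hide anything meaningful is fully masked.
    if len(key) <= visible * 2:
        return "*" * len(key)
    return f"{key[:visible]}...{key[-visible:]}"


if __name__ == "__main__":
    print(mask_key("secret-key-12345"))  # prints secr...2345
```

Keeping a few leading and trailing characters lets an operator tell keys apart in the dashboard or API responses without exposing enough of the value to reuse it.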
