Architecture
This page is the system map: boundaries, flow, and extension points.
System Boundaries
- Transport boundary converts telephony events into frames.
- Provider boundary isolates vendor SDKs behind adapters.
- Core pipeline contains deterministic processors.
- Observers record what happened without changing behavior.
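As a rough illustration of the provider and observer boundaries above, the sketch below hides a vendor behind a small adapter interface and lets observers read frames without altering them. `STTAdapter`, `FakeVendorSTT`, `run_stt`, and the `TextFrame` fields shown are hypothetical names chosen for this example, not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


@dataclass(frozen=True)
class TextFrame:
    # Frozen so that observers cannot mutate what flows through the pipeline.
    text: str
    source: str
    is_final: bool = False


class STTAdapter(Protocol):
    """Hypothetical provider boundary: the core only sees this interface."""

    def transcribe(self, audio: bytes) -> TextFrame: ...


class FakeVendorSTT:
    """Stand-in for a vendor SDK wrapper; a real adapter would call the SDK here."""

    def transcribe(self, audio: bytes) -> TextFrame:
        return TextFrame(text=f"<{len(audio)} bytes>", source="stt", is_final=True)


Observer = Callable[[TextFrame], None]


def run_stt(adapter: STTAdapter, audio: bytes, observers: List[Observer]) -> TextFrame:
    frame = adapter.transcribe(audio)
    for observe in observers:
        observe(frame)  # observers record the frame; the return value is unchanged
    return frame


seen: List[TextFrame] = []
result = run_stt(FakeVendorSTT(), b"\x00" * 320, [seen.append])
```

Swapping vendors in this sketch means supplying a different object that satisfies `STTAdapter`; `run_stt` and the observers stay the same.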
End‑to‑End Flow
- Transport receives audio and emits `AudioFrame` with metadata.
- STT converts audio into `TextFrame` with `source=stt` and `is_final=true` when complete.
- Router optionally adds `agent` and `global_*` fields based on the final text.
- LLM consumes text and context, emits streaming text and optional tool calls.
- Tool calls go to `ToolDispatcher`, then back as `tool_result` system frames.
- TTS converts LLM text into `AudioFrame` for the transport.
- Turn manager watches frames to handle barge‑in, end‑of‑turn, and silence reprompts.
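The flow above can be sketched as a chain of small functions over frames. This is a toy model under stated assumptions: the frame field names (`pcm`, `metadata`, `fields`), the stage functions, and the routing rule are all invented for illustration, and each stage stands in for a real provider.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Iterator


@dataclass(frozen=True)
class AudioFrame:
    pcm: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class TextFrame:
    text: str
    source: str
    is_final: bool = False
    fields: Dict[str, str] = field(default_factory=dict)


def stt(frame: AudioFrame) -> TextFrame:
    # Stand-in for speech recognition: treat the bytes as UTF-8 text.
    return TextFrame(text=frame.pcm.decode("utf-8"), source="stt", is_final=True)


def router(frame: TextFrame) -> TextFrame:
    # Only final text is routed; the keyword rule is purely illustrative.
    if not frame.is_final:
        return frame
    agent = "billing" if "invoice" in frame.text.lower() else "general"
    return replace(frame, fields={**frame.fields, "agent": agent})


def llm(frame: TextFrame) -> Iterator[TextFrame]:
    # Stand-in for a streaming model: emit one chunk per word.
    words = f"Routing you to {frame.fields.get('agent', 'general')}".split()
    for i, word in enumerate(words):
        yield TextFrame(text=word, source="llm", is_final=(i == len(words) - 1))


def tts(chunks: Iterator[TextFrame]) -> AudioFrame:
    # Stand-in for synthesis: join the streamed text and encode it as bytes.
    text = " ".join(chunk.text for chunk in chunks)
    return AudioFrame(pcm=text.encode("utf-8"), metadata={"source": "tts"})


inbound = AudioFrame(pcm=b"Where is my invoice?", metadata={"call_id": "demo"})
outbound = tts(llm(router(stt(inbound))))
```

Tool calls and the turn manager are left out here to keep the happy path visible.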
Component Diagram
```mermaid
flowchart LR
    Caller((Caller))
    Transport[Transport]
    STT[STT Processor]
    Router[Router Processor]
    LLM[LLM Processor]
    Dispatcher[Tool Dispatcher]
    TTS[TTS Processor]
    Turn[Turn Manager]
    Observers[Observers]
    Caller --> Transport
    Transport -->|AudioFrame| STT
    STT -->|TextFrame final| Router
    Router -->|TextFrame agent| LLM
    LLM -->|Control tool_call| Dispatcher
    Dispatcher -->|System tool_result| LLM
    LLM -->|TextFrame llm| TTS
    TTS -->|AudioFrame| Transport
    Transport --> Caller
    STT -->|Control flush| Turn
    LLM -->|System thinking_start| Turn
    TTS -->|Control audio_ready| Turn
    Turn -->|Control cancel| LLM
    Observers -.-> STT
    Observers -.-> Router
    Observers -.-> LLM
    Observers -.-> TTS
```
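To make the Turn Manager edges in the diagram concrete, here is a minimal barge-in sketch. The `Frame` shape, the `bot_responding` flag, and the rule "caller speech while the bot holds the turn triggers `cancel`" are assumptions made for this example; the real turn manager may track more state (end-of-turn, silence reprompts).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    kind: str         # "text", "control", or "system" (hypothetical tagging)
    name: str         # e.g. "final", "thinking_start", "audio_ready", "cancel"
    source: str = ""  # which component emitted the frame


class TurnManager:
    """Tracks whether the bot is mid-response and cancels it on barge-in."""

    def __init__(self) -> None:
        self.bot_responding = False

    def observe(self, frame: Frame) -> Optional[Frame]:
        # The LLM started thinking or TTS audio is ready: the bot holds the turn.
        if frame.name in ("thinking_start", "audio_ready"):
            self.bot_responding = True
            return None
        # Caller speech while the bot holds the turn is treated as barge-in.
        if frame.kind == "text" and frame.source == "stt" and self.bot_responding:
            self.bot_responding = False
            return Frame(kind="control", name="cancel", source="turn")
        return None


turn = TurnManager()
events = [
    Frame("text", "final", "stt"),             # caller speaks, bot idle
    Frame("system", "thinking_start", "llm"),  # bot starts responding
    Frame("control", "audio_ready", "tts"),
    Frame("text", "final", "stt"),             # caller interrupts
]
emitted = [turn.observe(event) for event in events]
```

Only the last event produces a frame, which corresponds to the `Turn -->|Control cancel| LLM` edge.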
Extension Points (Where You Add Logic)
- Before LLM: text normalization, custom prompts, sensitive filtering.
- Before TTS: response shaping, truncation, localization.
- Pre/Post processors: logging, serialization, analytics hooks.
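One way to picture these extension points is as small text-to-text functions chained before the LLM and before TTS. The helper names below (`normalize_whitespace`, `mask_card_numbers`, `truncate`, `chain`) are hypothetical, and the digit-masking rule is only a placeholder for whatever sensitive filtering a deployment needs.

```python
import re
from typing import Callable, List

TextProcessor = Callable[[str], str]


def normalize_whitespace(text: str) -> str:
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


def mask_card_numbers(text: str) -> str:
    # Illustrative filter: mask standalone runs of 13 to 16 digits.
    return re.sub(r"\b\d{13,16}\b", "[redacted]", text)


def truncate(limit: int) -> TextProcessor:
    # Build a processor that cuts long responses to at most `limit` characters.
    def _truncate(text: str) -> str:
        return text if len(text) <= limit else text[:limit].rstrip() + "..."

    return _truncate


def chain(processors: List[TextProcessor]) -> TextProcessor:
    # Apply processors left to right.
    def _run(text: str) -> str:
        for process in processors:
            text = process(text)
        return text

    return _run


before_llm = chain([normalize_whitespace, mask_card_numbers])
before_tts = chain([normalize_whitespace, truncate(20)])
```

For example, `before_llm("  my card is   4111111111111111 ")` yields `"my card is [redacted]"` in this sketch.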
What You Can Swap Without Code Changes
- STT, TTS, LLM, and Transport providers via config.
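Config-driven swapping usually comes down to a lookup from a role and provider name to an adapter factory. The registry layout, class names, and config keys below are assumptions for illustration; the project's real config schema is not shown on this page.

```python
from typing import Callable, Dict


class VendorASTT:
    """Placeholder adapter; a real one would wrap a vendor SDK."""


class VendorBSTT:
    """Second placeholder adapter for the same role."""


class VendorATTS:
    """Placeholder TTS adapter."""


# Hypothetical registry: role -> provider name -> factory.
REGISTRY: Dict[str, Dict[str, Callable[[], object]]] = {
    "stt": {"vendor_a": VendorASTT, "vendor_b": VendorBSTT},
    "tts": {"vendor_a": VendorATTS},
}


def build_providers(config: Dict[str, str]) -> Dict[str, object]:
    # Resolve each configured role to an adapter instance, failing loudly on typos.
    providers: Dict[str, object] = {}
    for role, name in config.items():
        try:
            factory = REGISTRY[role][name]
        except KeyError:
            raise ValueError(f"unknown {role} provider: {name!r}") from None
        providers[role] = factory()
    return providers


providers = build_providers({"stt": "vendor_b", "tts": "vendor_a"})
```

Changing `"vendor_b"` to `"vendor_a"` in the config swaps the STT adapter without touching pipeline code.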
What Requires Code Changes
- New processors.
- Custom transport protocols.
- New tool registry logic.
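As a sketch of what tool registry logic might involve, the class below maps tool names to callables and wraps each outcome as a `tool_result` system frame. This is not the project's `ToolDispatcher`: the class name `SketchToolDispatcher`, the `register` decorator, and the dictionary frame shape are all hypothetical.

```python
from typing import Any, Callable, Dict


class SketchToolDispatcher:
    """Hypothetical stand-in for a tool dispatcher with a simple registry."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        # Decorator that records a function under a tool name.
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = func
            return func

        return decorator

    def dispatch(self, call: Dict[str, Any]) -> Dict[str, Any]:
        # Turn a tool call into a tool_result system frame (shape is assumed).
        name = call["name"]
        frame = {"type": "system", "name": "tool_result", "tool": name}
        tool = self._tools.get(name)
        if tool is None:
            return {**frame, "ok": False, "error": "unknown tool"}
        return {**frame, "ok": True, "result": tool(**call.get("arguments", {}))}


dispatcher = SketchToolDispatcher()


@dispatcher.register("add")
def add(a: int, b: int) -> int:
    return a + b


ok_frame = dispatcher.dispatch({"name": "add", "arguments": {"a": 2, "b": 3}})
missing_frame = dispatcher.dispatch({"name": "lookup_order"})
```

Returning an error frame for unknown tools, rather than raising, keeps the LLM loop running so the model can recover; that is a design choice of this sketch, not a documented behavior.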