Media Processing Pipeline
Omni processes incoming media automatically: transcribing audio, describing images, and extracting text from documents. The pipeline is designed for resilience, with circuit breakers, retry logic, and health metrics.
Processing Flow
Processors
| Type | What It Does | Output Field |
|---|---|---|
| Image | AI-powered image description | description |
| Audio | Speech-to-text transcription | transcription |
| Video | Frame extraction + audio transcription | description + transcription |
| Document | PDF/DOCX text extraction + AI summarization | description |
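The mapping in the table above can be expressed as a small lookup, which is handy when checking whether a processed message has every field its media type should produce. This is only an illustrative sketch: `MediaType`, `OUTPUT_FIELDS`, `ProcessedMedia`, and `missingFields` are hypothetical names, not part of Omni's API; only the field names `description` and `transcription` come from the table.

```typescript
// Hypothetical types mirroring the processor table; not Omni's actual API.
type MediaType = "image" | "audio" | "video" | "document";
type OutputField = "description" | "transcription";

// Which output fields each processor is expected to fill in.
const OUTPUT_FIELDS: Record<MediaType, OutputField[]> = {
  image: ["description"],
  audio: ["transcription"],
  video: ["description", "transcription"],
  document: ["description"],
};

// Assumed shape of a processed result: both fields optional.
interface ProcessedMedia {
  description?: string;
  transcription?: string;
}

// Returns the expected fields that are absent or empty for this media type.
function missingFields(type: MediaType, result: ProcessedMedia): OutputField[] {
  return OUTPUT_FIELDS[type].filter((field) => !result[field]);
}
```

For example, a video result that carries only a description would report `transcription` as missing.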
Prompt Overrides
Each processor uses a configurable LLM prompt. You can override the defaults for image, video, document, and gate.
Circuit Breaker
The media pipeline uses a circuit breaker pattern to prevent cascading failures when the AI backend is down or slow:
| State | Behavior |
|---|---|
| Closed | Normal operation — all requests go through |
| Open | Backend is failing — requests are short-circuited, no API calls made |
| Half-Open | Testing recovery — a few requests are sent to check if the backend is back |
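The three states above can be sketched as a small state machine. This is a generic illustration of the circuit breaker pattern, not Omni's implementation: the class name, `failureThreshold`, `cooldownMs`, and the injectable `now` clock are all assumptions made for the example, and for simplicity the half-open state here lets requests through until the first outcome is recorded rather than limiting them to a fixed count.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Generic circuit breaker sketch; thresholds and names are hypothetical.
class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Returns true if a request may be sent to the backend right now.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow probe requests
    }
    return this.state !== "open";
  }

  // A success closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed probe, or too many consecutive failures, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A caller would check `allowRequest()` before each backend call and report the outcome with `recordSuccess()` or `recordFailure()`. Injecting the clock keeps the state transitions easy to test without real delays.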
Retry Logic
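As a rough sketch of the behavior this section describes, the following retries a processing task with exponential backoff and, once retries are exhausted, returns a result marked for later reprocessing instead of throwing. The function names, the 500 ms base delay, the 30 s cap, the default of three retries, and the `needs_reprocessing` status are all assumptions for illustration; Omni's actual values and types may differ.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
// The base and cap values are hypothetical defaults.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical result shape: either processed media or a flag for batch reprocessing.
type MediaResult<T> =
  | { status: "processed"; value: T }
  | { status: "needs_reprocessing"; lastError: unknown };

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task` once plus up to `maxRetries` retries, waiting between attempts.
async function processWithRetry<T>(
  task: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<MediaResult<T>> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { status: "processed", value: await task() };
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < maxRetries) await sleep(backoffDelayMs(attempt));
    }
  }
  // Retries exhausted: the caller stores the message without processed media.
  return { status: "needs_reprocessing", lastError };
}
```

Returning a tagged result rather than throwing lets the caller store the message either way and decide what to do with the flag.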
Failed processing attempts are retried with exponential backoff. Once the maximum number of retries is reached, the message is stored without processed media and flagged for batch reprocessing.
Health Metrics
Monitor media processing health:
Batch Processing
For bulk media operations, such as reprocessing failed items, processing historical messages, or running ad-hoc transcription:
Media Storage
Downloaded media files are stored locally at:
See also
Messaging
omni media list, download, and inline TTS via omni send --tts.
Batch processing
omni batch for bulk transcription and description backfills.
Voice gateway
Real-time voice streaming and DAVE-protected sessions.
Providers
Configure Gemini, ElevenLabs, and Groq for transcription and TTS.