Audio Ingestion Overview
Veronese accepts audio through four channels. Each channel results in the same outcome: a new episode in your library, queued for transcription.
Channels at a glance
| Channel | Best for |
|---|---|
| Web upload | Quick one-off files from your computer |
| YouTube / URL | Public videos, podcasts, web audio |
| | Mobile recordings, quick captures from any device |
| Telegram | Voice notes from mobile |
The ingestion pipeline
Every episode goes through the same pipeline regardless of channel:
- Source resolution — Download the remote file or retrieve the uploaded attachment.
- Normalization — FFmpeg converts the audio to a clean WAV at a fixed sample rate.
- Duration probe — `ffprobe` measures the audio length and stores it on the episode.
- Billing check — Your available credit balance must be ≥ 300 seconds to proceed.
- Transcription queue — A transcription job is enqueued for async processing.
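
The normalization, duration probe, and billing check steps can be sketched roughly as follows. This is an illustration under stated assumptions, not Veronese's actual implementation: the 16 kHz sample rate and all function names are hypothetical (the text above only says "a fixed sample rate"), while the 300-second threshold comes from the billing check description. The functions only build command-line arguments and parse output, so nothing here invokes FFmpeg.

```python
from typing import List

# Assumption: the real sample rate is not documented; 16 kHz is a placeholder.
ASSUMED_SAMPLE_RATE_HZ = 16_000
# Stated in the pipeline description: balance must be >= 300 seconds.
MIN_CREDIT_SECONDS = 300


def build_normalize_cmd(src: str, dst: str,
                        sample_rate: int = ASSUMED_SAMPLE_RATE_HZ) -> List[str]:
    """Build FFmpeg arguments that convert an input to 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",            # overwrite dst if it already exists
        "-i", src,                 # input file (audio, or video with an audio track)
        "-vn",                     # ignore any video stream
        "-ar", str(sample_rate),   # resample to the fixed rate
        "-c:a", "pcm_s16le",       # uncompressed 16-bit PCM
        dst,
    ]


def build_probe_cmd(path: str) -> List[str]:
    """Build ffprobe arguments that print only the duration in seconds."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def parse_duration(ffprobe_stdout: str) -> float:
    """Convert ffprobe's plain-text duration output to a float."""
    return float(ffprobe_stdout.strip())


def has_enough_credit(balance_seconds: float,
                      minimum: float = MIN_CREDIT_SECONDS) -> bool:
    """Billing check: the balance must be at least the minimum to proceed."""
    return balance_seconds >= minimum
```

In a real worker, the two command lists would typically be passed to `subprocess.run`, and the probed duration would be stored on the episode before the billing check runs.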
Episode states

    draft
      ↓
    ingesting                  ← source audio being downloaded/normalized
      ↓
    ready_for_transcription
      ↓
    transcribing               ← AI model processing the audio
      ↓
    ready                      ← transcript available for editing

If anything goes wrong, the episode moves to `failed` and a notification is sent.
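
One way to reason about these states is as a small state machine. The sketch below is hypothetical: the state names come from this page, but the `EpisodeState` enum, the `can_transition` helper, and the rule that any state other than `ready` and `failed` may move to `failed` are assumptions made for illustration.

```python
from enum import Enum


class EpisodeState(str, Enum):
    DRAFT = "draft"
    INGESTING = "ingesting"
    READY_FOR_TRANSCRIPTION = "ready_for_transcription"
    TRANSCRIBING = "transcribing"
    READY = "ready"
    FAILED = "failed"


# Happy-path order, as listed in the state diagram.
_NEXT = {
    EpisodeState.DRAFT: EpisodeState.INGESTING,
    EpisodeState.INGESTING: EpisodeState.READY_FOR_TRANSCRIPTION,
    EpisodeState.READY_FOR_TRANSCRIPTION: EpisodeState.TRANSCRIBING,
    EpisodeState.TRANSCRIBING: EpisodeState.READY,
}

# Assumption: finished or already-failed episodes cannot fail again.
_TERMINAL = {EpisodeState.READY, EpisodeState.FAILED}


def can_transition(current: EpisodeState, target: EpisodeState) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    if target is EpisodeState.FAILED:
        return current not in _TERMINAL
    return _NEXT.get(current) is target
```

For example, `can_transition(EpisodeState.DRAFT, EpisodeState.INGESTING)` is allowed, while skipping straight from `draft` to `ready` is not.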
Supported audio formats
Veronese accepts any audio format that FFmpeg can decode, including:
- MP3, M4A, AAC, OGG, FLAC, WAV, AIFF, OPUS
- Video with audio tracks (MP4, MOV, WebM) — audio is extracted automatically
- URLs pointing directly to audio files
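
If you want to sort files or direct URLs before submitting them, a rough pre-check by file extension might look like the sketch below. This is a hypothetical client-side helper, not part of Veronese: the extension sets simply mirror the list above, and because the real criterion is whether FFmpeg can decode the input, an `"unknown"` result is only a hint and should not be treated as a rejection.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions mirroring the formats listed above (not an exhaustive list).
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".aac", ".ogg", ".flac", ".wav", ".aiff", ".opus"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}


def classify_source(path_or_url: str) -> str:
    """Guess 'audio', 'video', or 'unknown' from a file name or direct URL."""
    # urlparse separates the path from any query string or fragment.
    path = urlparse(path_or_url).path
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return "unknown"
```

Query strings are ignored, so a URL such as `https://example.com/shows/ep1.flac?token=abc` is still classified by its `.flac` suffix.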