flux
Automatic Speech Recognition • Deepgram
Date: 2025-10-31
Flux is the first conversational speech recognition model built specifically for voice agents.
Model Info
| Partner | Yes |
| Real-time | Yes |
Usage
Step 1: Create a Worker that establishes a WebSocket connection
Step 2: Deploy your Worker
Step 3: Write a client script to connect to your Worker and send audio
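Since Step 3 sends audio and the `encoding` parameter only accepts raw signed little-endian 16-bit PCM, a client usually has to convert and chunk its samples before writing them to the socket. The following is a minimal sketch of that preparation step only, not the page's own client script. The helper names `floats_to_pcm16le` and `chunk_pcm`, the mono-audio assumption, and the 80 ms chunk size are all illustrative assumptions.

```python
import struct
from typing import Iterator, Sequence


def floats_to_pcm16le(samples: Sequence[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to raw signed little-endian 16-bit PCM.

    Hypothetical helper: values outside the range are clipped first.
    """
    ints = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        ints.append(int(round(clipped * 32767)))
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_pcm(pcm: bytes, sample_rate: int, chunk_ms: int = 80) -> Iterator[bytes]:
    """Split mono 16-bit PCM into chunks of roughly chunk_ms milliseconds.

    The 80 ms default is an assumption, not a documented requirement.
    """
    bytes_per_chunk = (sample_rate * chunk_ms // 1000) * 2  # 2 bytes per sample
    if bytes_per_chunk <= 0:
        raise ValueError("sample_rate and chunk_ms must yield at least one sample")
    for offset in range(0, len(pcm), bytes_per_chunk):
        yield pcm[offset:offset + bytes_per_chunk]
```

Each chunk yielded by `chunk_pcm` would then be sent as one binary WebSocket message by whatever client library is in use.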
Parameters
* indicates a required field
Input
- encoding (string)
  Encoding of the audio stream. Currently only raw signed little-endian 16-bit PCM is supported.
- sample_rate (string)
  Sample rate of the audio stream in Hz.
- eager_eot_threshold (string)
  End-of-turn confidence required to fire an eager end-of-turn event. When set, enables EagerEndOfTurn and TurnResumed events. Valid values: 0.3 to 0.9.
- eot_threshold (string, default 0.7)
  End-of-turn confidence required to finish a turn. Valid values: 0.5 to 0.9.
- eot_timeout_ms (string, default 5000)
  A turn is finished once this much time (in milliseconds) has passed after speech, regardless of EOT confidence.
- keyterm (string)
  Keyterm prompting can improve recognition of specialized terminology. Pass multiple keyterm query parameters to boost multiple keyterms.
- mip_opt_out (string, default false)
  Opts out requests from the Deepgram Model Improvement Program. Refer to the Deepgram docs for pricing impacts before setting this to true: https://dpgr.am/deepgram-mip
- tag (string)
  Label your requests for identification during usage reporting.
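The input parameters above are passed as strings, and `keyterm` may be repeated. As a rough sketch, a client could validate the documented ranges and assemble a query string as shown below. The function `build_flux_query` is hypothetical, and the `linear16` encoding value is an assumption (it is the name Deepgram commonly uses for 16-bit PCM, but this page does not state it).

```python
from typing import List, Optional, Sequence, Tuple
from urllib.parse import urlencode


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError when value is outside the documented [low, high] range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_flux_query(
    sample_rate: int,
    encoding: str = "linear16",  # assumed value, not confirmed by this page
    eager_eot_threshold: Optional[float] = None,
    eot_threshold: Optional[float] = None,
    eot_timeout_ms: Optional[int] = None,
    keyterms: Sequence[str] = (),
    mip_opt_out: Optional[bool] = None,
    tag: Optional[str] = None,
) -> str:
    """Build a URL query string from the documented input parameters."""
    params: List[Tuple[str, str]] = [
        ("encoding", encoding),
        ("sample_rate", str(sample_rate)),
    ]
    if eager_eot_threshold is not None:
        _check_range("eager_eot_threshold", eager_eot_threshold, 0.3, 0.9)
        params.append(("eager_eot_threshold", str(eager_eot_threshold)))
    if eot_threshold is not None:
        _check_range("eot_threshold", eot_threshold, 0.5, 0.9)
        params.append(("eot_threshold", str(eot_threshold)))
    if eot_timeout_ms is not None:
        params.append(("eot_timeout_ms", str(eot_timeout_ms)))
    # One keyterm query parameter per term, as the keyterm description requires.
    for term in keyterms:
        params.append(("keyterm", term))
    if mip_opt_out is not None:
        params.append(("mip_opt_out", "true" if mip_opt_out else "false"))
    if tag is not None:
        params.append(("tag", tag))
    return urlencode(params)
```

For example, `build_flux_query(16000, eot_threshold=0.8, keyterms=["Deepgram", "Workers AI"])` returns `encoding=linear16&sample_rate=16000&eot_threshold=0.8&keyterm=Deepgram&keyterm=Workers+AI`, which can be appended to the Worker URL used by the client.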
Output
- request_id (string)
  The unique identifier of the request (UUID).
- sequence_id (integer, min 0)
  Starts at 0 and increments for each message the server sends to the client.
- event (string)
  The type of event being reported.
- turn_index (integer, min 0)
  The index of the current turn.
- audio_window_start (number)
  Start time in seconds of the audio range that was transcribed.
- audio_window_end (number)
  End time in seconds of the audio range that was transcribed.
- transcript (string)
  Text that was said over the course of the current turn.
- words (array)
  The words in the transcript.
  - items (object)
    - word (string, required)
      The individual punctuated, properly cased word from the transcript.
    - confidence (number, required)
      Confidence that this word was transcribed correctly.
- end_of_turn_confidence (number)
  Confidence that no more speech is coming in this turn.
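To illustrate how the output fields fit together, the sketch below parses one JSON message into typed objects and checks `sequence_id` values for gaps. The names `Word`, `TurnMessage`, `parse_turn_message`, and `find_sequence_gaps` are hypothetical, and `SAMPLE` is made-up data shaped after the field list above, not a captured server response.

```python
import json
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class Word:
    word: str          # required in each item of "words"
    confidence: float  # required in each item of "words"


@dataclass
class TurnMessage:
    request_id: Optional[str] = None
    sequence_id: Optional[int] = None
    event: Optional[str] = None
    turn_index: Optional[int] = None
    audio_window_start: Optional[float] = None
    audio_window_end: Optional[float] = None
    transcript: Optional[str] = None
    words: List[Word] = field(default_factory=list)
    end_of_turn_confidence: Optional[float] = None


def parse_turn_message(raw: str) -> TurnMessage:
    """Parse one JSON text message; absent top-level fields become None."""
    data = json.loads(raw)
    words = [Word(w["word"], w["confidence"]) for w in data.get("words", [])]
    scalars = {k: data.get(k) for k in TurnMessage.__dataclass_fields__ if k != "words"}
    return TurnMessage(words=words, **scalars)


def find_sequence_gaps(sequence_ids: Iterable[int]) -> List[int]:
    """Return ids missing from a stream that should start at 0 and step by 1."""
    seen = set(sequence_ids)
    if not seen:
        return []
    return [i for i in range(max(seen) + 1) if i not in seen]


# Made-up example message; all values are illustrative only.
SAMPLE = json.dumps({
    "request_id": "00000000-0000-0000-0000-000000000000",
    "sequence_id": 3,
    "event": "EagerEndOfTurn",
    "turn_index": 0,
    "audio_window_start": 0.0,
    "audio_window_end": 1.2,
    "transcript": "Hello, world.",
    "words": [
        {"word": "Hello,", "confidence": 0.98},
        {"word": "world.", "confidence": 0.95},
    ],
    "end_of_turn_confidence": 0.62,
})
```

A client could call `parse_turn_message` on every text frame it receives and pass the collected `sequence_id` values to `find_sequence_gaps` to detect dropped messages.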
API Schemas
The following schemas are based on JSON Schema
