MiniMax Speech 2.8 Turbo

Text-to-Speech · MiniMax · Proxied

MiniMax Speech 2.8 Turbo turns text into natural, expressive speech with voice cloning, emotion control, and 40+ language support at faster speeds.

Model Info
Terms and License
More information
Pricing: View pricing in the Cloudflare dashboard

Usage

TypeScript
const response = await env.AI.run(
  'minimax/speech-2.8-turbo',
  {
    format: 'mp3',
    pitch: 0,
    speed: 1,
    text: 'Hello! Welcome to Cloudflare AI Gateway. Let me show you what we can do.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)
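
Every parameter except `text` and `emotion` has a documented default (see Parameters), so a small helper can fill those in and let callers pass only what they change. The following is a minimal sketch under that assumption. `SpeechInput`, `DEFAULTS`, and `buildSpeechInput` are hypothetical names for illustration and are not part of the Workers AI API; the default values are copied from the Parameters section.

```typescript
// Hypothetical helper, not part of the Workers AI API.
// Field names and defaults are copied from the Parameters section.
type SpeechFormat = 'mp3' | 'flac' | 'wav';
type SpeechEmotion =
  | 'happy' | 'sad' | 'angry' | 'fearful'
  | 'disgusted' | 'surprised' | 'calm' | 'fluent';

interface SpeechInput {
  emotion?: SpeechEmotion;
  format: SpeechFormat;
  pitch: number;
  speed: number;
  text: string;
  voice_id: string;
  volume: number;
}

// Documented defaults for every field that has one.
const DEFAULTS: Omit<SpeechInput, 'text' | 'emotion'> = {
  format: 'mp3',
  pitch: 0,
  speed: 1,
  voice_id: 'English_expressive_narrator',
  volume: 1,
};

// Merge caller overrides onto the defaults; `text` is always supplied explicitly.
function buildSpeechInput(
  text: string,
  overrides: Partial<Omit<SpeechInput, 'text'>> = {},
): SpeechInput {
  return { ...DEFAULTS, ...overrides, text };
}
```

The returned object can be passed as the second argument to `env.AI.run`, for example `buildSpeechInput('Hello!', { speed: 1.5 })`.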

Examples

Fast Narration: Speed up narration for quick playback
TypeScript
const response = await env.AI.run(
  'minimax/speech-2.8-turbo',
  {
    format: 'mp3',
    pitch: 0,
    speed: 1.5,
    text: 'This is a fast-paced summary of the key findings from the quarterly report. Revenue is up fifteen percent and user growth exceeded expectations.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)

Calm Tone: Calm and steady speech for meditation or relaxation
TypeScript
const response = await env.AI.run(
  'minimax/speech-2.8-turbo',
  {
    emotion: 'calm',
    format: 'mp3',
    pitch: 0,
    speed: 0.8,
    text: 'Take a deep breath in. Hold it for a moment. Now slowly exhale. Let your shoulders relax and release any tension.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)

Adjusted Pitch: Lower the pitch for a deeper voice
TypeScript
const response = await env.AI.run(
  'minimax/speech-2.8-turbo',
  {
    format: 'mp3',
    pitch: -6,
    speed: 1,
    text: 'Good evening. Tonight we explore the mysteries of the deep ocean and the creatures that live in total darkness.',
    voice_id: 'English_expressive_narrator',
    volume: 1,
  },
)
console.log(response)
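
The three examples above differ from the basic call in only a few fields (`speed`, `emotion`, `pitch`). One way to capture that, sketched here, is a table of named overrides merged onto a base input. `PRESETS`, `PresetName`, and `applyPreset` are hypothetical names, not part of the API; the override values are taken directly from the examples.

```typescript
// Hypothetical presets; override values are copied from the examples above.
interface SpeechInput {
  emotion?: string;
  format: string;
  pitch: number;
  speed: number;
  text: string;
  voice_id: string;
  volume: number;
}

type PresetName = 'fastNarration' | 'calmTone' | 'deepVoice';

const PRESETS: Record<PresetName, Partial<SpeechInput>> = {
  fastNarration: { speed: 1.5 },              // Fast Narration example
  calmTone: { emotion: 'calm', speed: 0.8 },  // Calm Tone example
  deepVoice: { pitch: -6 },                   // Adjusted Pitch example
};

// Return a new input with the preset's fields overriding the base.
// The base object is not mutated.
function applyPreset(base: SpeechInput, name: PresetName): SpeechInput {
  return { ...base, ...PRESETS[name] };
}
```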

Parameters

emotion
  string; enum: happy, sad, angry, fearful, disgusted, surprised, calm, fluent
  Emotion control for synthesized speech
format
  string; required; default: mp3; enum: mp3, flac, wav
  Output audio format
pitch
  integer; required; default: 0; minimum: -12; maximum: 12
  Pitch adjustment (-12 to 12)
speed
  number; required; default: 1; minimum: 0.5; maximum: 2
  Speech speed (0.5 to 2)
text
  string; required; maxLength: 10000
  The text to convert to speech. Maximum 10,000 characters.
voice_id
  string; required; default: English_expressive_narrator
  The voice ID to use for synthesis
volume
  number; required; default: 1; minimum: 0; maximum: 10
  Speech volume (0 to 10)
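
The ranges and enums listed above can be checked on the client before a request is sent. The sketch below is one possible way to do that; `validateSpeechInput` is a hypothetical helper rather than part of the API, and it assumes the documented limits are inclusive. It also measures `text` with JavaScript's `length`, which counts UTF-16 code units and may not match how the service counts characters.

```typescript
// Hypothetical client-side check; limits are copied from the Parameters list.
const FORMATS = ['mp3', 'flac', 'wav'];
const EMOTIONS = [
  'happy', 'sad', 'angry', 'fearful',
  'disgusted', 'surprised', 'calm', 'fluent',
];

interface SpeechInput {
  emotion?: string;
  format: string;
  pitch: number;
  speed: number;
  text: string;
  voice_id: string;
  volume: number;
}

// Returns a list of human-readable problems; an empty list means the input
// satisfies every documented constraint checked here.
function validateSpeechInput(input: SpeechInput): string[] {
  const errors: string[] = [];
  if (input.emotion !== undefined && !EMOTIONS.includes(input.emotion)) {
    errors.push(`emotion must be one of: ${EMOTIONS.join(', ')}`);
  }
  if (!FORMATS.includes(input.format)) {
    errors.push(`format must be one of: ${FORMATS.join(', ')}`);
  }
  if (!Number.isInteger(input.pitch) || input.pitch < -12 || input.pitch > 12) {
    errors.push('pitch must be an integer from -12 to 12');
  }
  // Written as a negated range test so that NaN is rejected as well.
  if (!(input.speed >= 0.5 && input.speed <= 2)) {
    errors.push('speed must be from 0.5 to 2');
  }
  // Assumption: length in UTF-16 code units approximates the service's count.
  if (input.text.length > 10000) {
    errors.push('text must be at most 10,000 characters');
  }
  if (!(input.volume >= 0 && input.volume <= 10)) {
    errors.push('volume must be from 0 to 10');
  }
  return errors;
}
```

Returning a list of messages instead of throwing on the first failure lets a caller report every problem at once.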
