
whisper

Automatic Speech Recognition · OpenAI · Hosted

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

Model Info
Unit Pricing: $0.00045 per audio minute
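
As a rough illustration of the unit price, the sketch below estimates the cost of transcribing a clip of a given length. The helper name `estimateWhisperCostUsd` is hypothetical, and the code assumes billing is strictly proportional to audio duration, with no rounding or minimum charge; actual billing may differ.

```typescript
// Unit price from Model Info, in USD per audio minute.
const USD_PER_AUDIO_MINUTE = 0.00045;

// Hypothetical helper: estimates the cost of transcribing `seconds` of audio.
// Assumption: cost scales linearly with duration (no rounding, no minimum).
function estimateWhisperCostUsd(seconds: number): number {
  if (!Number.isFinite(seconds) || seconds < 0) {
    throw new RangeError("seconds must be a non-negative finite number");
  }
  return (seconds / 60) * USD_PER_AUDIO_MINUTE;
}

// One hour of audio is 60 minutes at $0.00045 each.
console.log(estimateWhisperCostUsd(3600).toFixed(4)); // prints "0.0270"
```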

Usage

export interface Env {
  // Workers AI binding
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    // Download a sample WAV file to transcribe.
    const res = await fetch(
      "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/samples/cpp/windows/console/samples/enrollment_audio_katie.wav"
    );
    const blob = await res.arrayBuffer();

    // The model takes the audio as an array of byte values.
    const input = {
      audio: [...new Uint8Array(blob)],
    };

    const response = await env.AI.run(
      "@cf/openai/whisper",
      input
    );

    // Return an empty audio array in place of the raw bytes to keep the response small.
    return Response.json({ input: { audio: [] }, response });
  },
} satisfies ExportedHandler<Env>;
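
The conversion step in the example above (raw bytes to an array of numbers) can be isolated into a small helper, which makes it easier to reuse and test without a network call. This is only a sketch: `WhisperInput` and `toWhisperInput` are hypothetical names, and the `{ audio: number[] }` shape is taken from the usage example rather than from a published type.

```typescript
// Input shape as used in the usage example: audio as an array of byte values.
// (Hypothetical interface name; not a type exported by Workers AI.)
interface WhisperInput {
  audio: number[];
}

// Hypothetical helper: turns a raw audio buffer into the input object.
function toWhisperInput(buffer: ArrayBufferLike): WhisperInput {
  // Each element of the Uint8Array view is one byte in the range 0-255.
  return { audio: Array.from(new Uint8Array(buffer)) };
}

// Four bytes spelling "RIFF" in ASCII, the start of a WAV header.
const sampleBytes = new Uint8Array([82, 73, 70, 70]);
console.log(toWhisperInput(sampleBytes.buffer).audio); // prints [ 82, 73, 70, 70 ]
```

Inside the Worker, the result of `res.arrayBuffer()` could be passed to such a helper before calling `env.AI.run`.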

Parameters

Option 1
string (format: binary)
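
Option 1 suggests the audio can be supplied as raw binary rather than as a JSON array. The sketch below builds a description of such a request. It rests on several assumptions not stated on this page: that the model is reachable over the Cloudflare REST API at the URL shown, that it accepts a bearer token, and that `application/octet-stream` is an acceptable content type. `WhisperHttpRequest` and `buildWhisperRequest` are hypothetical names.

```typescript
// Hypothetical description of an HTTP request; not a Cloudflare type.
interface WhisperHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: Uint8Array;
}

// Hypothetical helper: describes a POST that sends the audio bytes as the raw body.
// Assumptions: the REST endpoint path, bearer-token auth, and the content type.
function buildWhisperRequest(
  accountId: string,
  apiToken: string,
  audio: Uint8Array
): WhisperHttpRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai/run/@cf/openai/whisper`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/octet-stream",
    },
    body: audio, // raw audio bytes, not JSON
  };
}

// Placeholder credentials for illustration only.
const req = buildWhisperRequest("my-account-id", "my-api-token", new Uint8Array([82, 73, 70, 70]));
console.log(req.method, req.url);
```

The returned object can be handed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` in an environment that provides `fetch`.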
