Whisper-large-v3-turbo with Cloudflare Workers AI
Date: 2024-09-23
In this tutorial you will learn how to:
- Transcribe large audio files: Use the Whisper-large-v3-turbo model from Cloudflare Workers AI to perform automatic speech recognition (ASR) or translation.
- Handle large files: Split large audio files into smaller chunks for processing, which helps overcome memory and execution time limitations.
- Deploy using Cloudflare Workers: Create a scalable, low‑latency transcription pipeline in a serverless environment.
Before you start, complete the following prerequisites:
- Sign up for a Cloudflare account.
- Install Node.js.
Node.js version manager
Use a Node version manager like Volta or nvm to avoid permission issues and change Node.js versions. Wrangler, discussed later in this guide, requires a Node version of 16.17.0 or later.
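If you want to confirm the Node.js requirement from a script, a minimal sketch follows. The helper name isAtLeast is hypothetical and not part of Wrangler or this guide. It compares dotted version strings number by number, because a plain string comparison would wrongly rank 16.9.0 above 16.17.0.

```typescript
// Hypothetical helper: compare dotted version strings numerically.
// Any pre-release suffix (for example "-nightly") is ignored.
function isAtLeast(version: string, minimum: string): boolean {
  const a = version.split("-")[0].split(".").map(Number);
  const b = minimum.split("-")[0].split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0; // missing parts count as 0, so "16.17" equals "16.17.0"
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // all parts equal
}

const MIN_NODE = "16.17.0"; // minimum stated for Wrangler in this guide
const current = process.versions.node;
const verdict = isAtLeast(current, MIN_NODE) ? "meets" : "is below";
console.log(`Node ${current} ${verdict} the ${MIN_NODE} minimum`);
```

Running node --version in a terminal gives you the same information; the sketch is only useful if you want to automate the check.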
You will create a new Worker project using the create-cloudflare CLI (C3). C3 is a command-line tool designed to help you set up and deploy new applications to Cloudflare.
Create a new project named whisper-tutorial by running:
npm create cloudflare@latest -- whisper-tutorial
Running npm create cloudflare@latest will prompt you to install the create-cloudflare package and lead you through setup. C3 will also install Wrangler, the Cloudflare Developer Platform CLI.
For setup, select the following options:
- For What would you like to start with?, choose Hello World Starter.
- For Which template would you like to use?, choose Worker only.
- For Which language do you want to use?, choose TypeScript.
- For Do you want to use git for version control?, choose Yes.
- For Do you want to deploy your application?, choose No (we will be making some changes before deploying).
This will create a new whisper-tutorial directory. Your new whisper-tutorial directory will include:
- A "Hello World" Worker at src/index.ts.
- A wrangler.jsonc configuration file.
Go to your application directory:
cd whisper-tutorial
You must create an AI binding for your Worker to connect to Workers AI. Bindings allow your Workers to interact with resources, like Workers AI, on the Cloudflare Developer Platform.
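To bind Workers AI to your Worker, add an ai binding named AI to your Wrangler configuration file. In wrangler.jsonc, add "ai": { "binding": "AI" }. If your project uses a wrangler.toml file instead, add an [ai] table containing binding = "AI" to the end of the file.
Your binding is available in your Worker code on env.AI.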
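To make the binding concrete, here is a minimal sketch of how env.AI might be typed and called from TypeScript. Several names are assumptions rather than something this tutorial confirms: the AiLike interface is a simplified stand-in for the generated Workers AI types, transcribeChunk is a hypothetical helper, @cf/openai/whisper-large-v3-turbo is assumed to be the model identifier, and the model is assumed to accept base64-encoded audio and return an object with a text property.

```typescript
import { Buffer } from "node:buffer";

// Simplified stand-in for the Workers AI binding type (an assumption for this
// sketch; real projects would normally use the types Wrangler generates).
interface AiLike {
  run(model: string, inputs: Record<string, unknown>): Promise<{ text?: string }>;
}

interface Env {
  AI: AiLike; // property name matches the binding name "AI"
}

// Assumed Workers AI identifier for Whisper-large-v3-turbo.
const MODEL = "@cf/openai/whisper-large-v3-turbo";

// Hypothetical helper: base64-encode one piece of audio and send it to the model.
async function transcribeChunk(chunk: ArrayBuffer, env: Env): Promise<string> {
  const audio = Buffer.from(chunk).toString("base64");
  const result = await env.AI.run(MODEL, { audio });
  return result.text ?? ""; // fall back to an empty string if no text came back
}
```

Because the helper only depends on the shape of env.AI, you can exercise it locally with a stub object before wiring it to the real binding.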
In your Wrangler configuration file, add or update the following settings to enable Node.js APIs and polyfills: add nodejs_compat to compatibility_flags, and set compatibility_date to 2024-09-23 or later.
Replace the contents of your src/index.ts file with code that does the following:
(1) Extracts an audio file URL from the query parameters.
(2) Fetches the audio file while explicitly following redirects.
(3) Splits the audio file into smaller chunks (for example, 1 MB chunks).
(4) Transcribes each chunk using the Whisper-large-v3-turbo model via the Cloudflare AI binding.
(5) Returns the aggregated transcription as plain text.
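The five steps above could be sketched as follows. This is an illustration under stated assumptions, not necessarily the tutorial's original code: the query parameter name url, the model identifier @cf/openai/whisper-large-v3-turbo, the simplified Env type, the splitIntoChunks helper, and the text property on the model result are all assumptions you should check against your own setup.

```typescript
import { Buffer } from "node:buffer";

// Simplified stand-in for the Workers AI binding type (assumption for this sketch).
interface Env {
  AI: { run(model: string, inputs: { audio: string }): Promise<{ text?: string }> };
}

const MODEL = "@cf/openai/whisper-large-v3-turbo"; // assumed model identifier
const CHUNK_SIZE = 1024 * 1024; // 1 MB, as suggested in step (3)

// Step (3): split a buffer into fixed-size chunks; the last chunk may be shorter.
function splitIntoChunks(data: ArrayBuffer, chunkSize: number = CHUNK_SIZE): ArrayBuffer[] {
  const chunks: ArrayBuffer[] = [];
  for (let offset = 0; offset < data.byteLength; offset += chunkSize) {
    chunks.push(data.slice(offset, offset + chunkSize));
  }
  return chunks;
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Step (1): read the audio location from the query string ("url" is an assumed name).
    const audioUrl = new URL(request.url).searchParams.get("url");
    if (!audioUrl) {
      return new Response("Missing 'url' query parameter", { status: 400 });
    }
    try {
      // Step (2): fetch the audio, explicitly following redirects.
      const audioResponse = await fetch(audioUrl, { redirect: "follow" });
      if (!audioResponse.ok) {
        throw new Error(`Failed to fetch audio: ${audioResponse.status}`);
      }
      const chunks = splitIntoChunks(await audioResponse.arrayBuffer());

      // Step (4): transcribe chunks one at a time so the transcript keeps its order.
      const parts: string[] = [];
      for (const chunk of chunks) {
        const audio = Buffer.from(chunk).toString("base64");
        const result = await env.AI.run(MODEL, { audio });
        parts.push((result.text ?? "").trim());
      }

      // Step (5): return the aggregated transcription as plain text.
      return new Response(parts.join(" "), {
        headers: { "Content-Type": "text/plain; charset=utf-8" },
      });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return new Response(`Transcription failed: ${message}`, { status: 500 });
    }
  },
};

export default worker;
```

Processing the chunks sequentially keeps the example simple and preserves ordering. The Buffer import relies on the Node.js compatibility settings in your Wrangler configuration file.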
- Run the Worker locally: Use Wrangler's development mode to test your Worker locally:
npx wrangler dev
Open your browser and go to http://localhost:8787, or use curl, passing the direct link to your audio file in the URL query parameter. (For GitHub-hosted files, ensure you use the raw file URL.)
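When the audio link itself contains characters such as ? or &, it helps to encode it before appending it to the Worker address. The sketch below shows one way to build the request URL. The helper name buildTranscribeUrl, the query parameter name url, and the example.com audio address are all hypothetical.

```typescript
// Hypothetical helper: build the request URL for a Worker that reads the
// audio location from a "url" query parameter (the parameter name is assumed).
function buildTranscribeUrl(workerOrigin: string, audioUrl: string): string {
  const target = new URL(workerOrigin);
  // searchParams.set percent-encodes the value, so "?" or "&" inside the
  // audio URL cannot be mistaken for the Worker's own query string.
  target.searchParams.set("url", audioUrl);
  return target.toString();
}

// Placeholder audio address, used only for illustration.
const localUrl = buildTranscribeUrl(
  "http://localhost:8787",
  "https://example.com/audio/sample.mp3?download=1",
);
console.log(localUrl);
```

You can paste the printed address into a browser or pass it to curl in quotes. The same helper works for a deployed Worker if you swap in its address.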
- Deploy the Worker: Once testing is complete, deploy your Worker with:
npx wrangler deploy
- Test the deployed Worker: After deployment, test your Worker by passing the audio URL as a query parameter. Make sure to replace <your-worker-subdomain>, your-username, your-repo, and your-audio-file.mp3 with your actual details.
If successful, the Worker will return a transcript of the audio file as plain text.
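If you prefer to check the result from code rather than a browser, a small client could look like the sketch below. The fetchTranscript helper is hypothetical; it assumes only that the Worker answers with plain text on success and a non-2xx status on failure. The fetchImpl parameter exists so that the network call can be replaced with a stub.

```typescript
// Hypothetical client helper; fetchImpl is injectable so it can be stubbed in tests.
async function fetchTranscript(
  endpoint: string,
  fetchImpl: (input: string) => Promise<Response> = fetch,
): Promise<string> {
  const response = await fetchImpl(endpoint);
  if (!response.ok) {
    // Surface the status and any error text the Worker sent back.
    throw new Error(`Worker returned ${response.status}: ${await response.text()}`);
  }
  return response.text();
}
```

Call it with the full Worker address, including the encoded audio link, and print or store the returned string.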
