In this tutorial, you will learn how to create a Voice Notes App with automatic transcription of voice recordings and optional post-processing. The following tools will be used to build the application:
Workers AI to transcribe the voice recordings and for the optional post-processing
D1 database to store the notes
R2 storage to store the voice recordings
Nuxt framework to build the full-stack application
Use a Node version manager like Volta ↗ or nvm ↗ to avoid permission issues and change Node.js versions. Wrangler, discussed later in this guide, requires a Node version of 16.17.0 or later.
1. Create a new Worker project
Create a new Worker project using the c3 CLI with the nuxt framework preset.
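If you have not created the project yet, a command along these lines should work (the project name is only an example, and c3 flag names can vary slightly between versions):

```sh
# Scaffold a new Worker project using the Nuxt framework preset.
npm create cloudflare@latest -- voice-notes-app --framework=nuxt

# Start the local development server.
cd voice-notes-app
npm run dev
```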
If everything is set up correctly, you should see a Nuxt welcome page at http://localhost:3000.
2. Create the transcribe API endpoint
This API makes use of Workers AI to transcribe the voice recordings. To use Workers AI within your project, you first need to bind it to the Worker.
Add the AI binding to the wrangler.toml file.
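A minimal sketch of the binding; the binding name AI is an assumption that the later code sketches in this tutorial reuse:

```toml
# wrangler.toml
[ai]
binding = "AI"
```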
Once the AI binding has been configured, run the cf-typegen command to generate the necessary Cloudflare type definitions. This makes the type definitions available in the server event contexts.
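In a project scaffolded with c3, this is typically exposed as an npm script:

```sh
npm run cf-typegen
```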
Create a transcribe POST endpoint by creating a transcribe.post.ts file inside the /server/api directory.
The above code does the following:
Extracts the audio blob from the event.
Transcribes the blob using the @cf/openai/whisper model and returns the transcription text as the response.
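Assuming the Cloudflare bindings are exposed on event.context.cloudflare (the default for Nuxt's Cloudflare preset) and that the client sends the recording as an audio form field, the endpoint could look roughly like this:

```typescript
// server/api/transcribe.post.ts — a minimal sketch, not the project's exact code.
export default defineEventHandler(async (event) => {
  const { cloudflare } = event.context;

  // Pull the recorded audio out of the multipart form body.
  const form = await readFormData(event);
  const blob = form.get('audio') as Blob | null;
  if (!blob) {
    throw createError({ statusCode: 400, message: 'No audio file provided' });
  }

  // Run the Whisper speech-to-text model on the raw audio bytes.
  const buffer = await blob.arrayBuffer();
  const response = await cloudflare.env.AI.run('@cf/openai/whisper', {
    audio: [...new Uint8Array(buffer)],
  });

  return response.text;
});
```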
3. Create an API endpoint for uploading audio recordings to R2
Before uploading the audio recordings to R2, you need to create a bucket first. You will also need to add the R2 binding to your wrangler.toml file and regenerate the Cloudflare type definitions.
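The R2 binding could look like the following; the binding name R2 and the bucket name placeholder are assumptions reused by the later sketches:

```toml
# wrangler.toml
# Create the bucket first, for example: npx wrangler r2 bucket create <BUCKET_NAME>
[[r2_buckets]]
binding = "R2"
bucket_name = "<BUCKET_NAME>"
```

A sketch of the upload endpoint itself; the filename, the files form field, and the object key scheme are assumptions, but the idea is that the endpoint returns the R2 object keys so they can later be stored with a note:

```typescript
// server/api/upload.put.ts — a sketch of the recordings upload endpoint.
export default defineEventHandler(async (event) => {
  const { cloudflare } = event.context;

  const form = await readFormData(event);
  const files = form.getAll('files') as File[];
  if (!files.length) {
    throw createError({ statusCode: 400, message: 'No files provided' });
  }

  const uploadedKeys: string[] = [];
  for (const file of files) {
    // Use a unique object key per recording and keep the original content type.
    const key = `${Date.now()}-${file.name}`;
    await cloudflare.env.R2.put(key, await file.arrayBuffer(), {
      httpMetadata: { contentType: file.type },
    });
    uploadedKeys.push(key);
  }

  return uploadedKeys;
});
```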
4. Create an API endpoint for saving notes
The notes will be stored in a D1 database. Create the database, add the D1 binding to your wrangler.toml file, regenerate the Cloudflare type definitions, and then create a database migration for the notes table.
This will create a new migrations folder in the project's root directory, and add an empty 0001_create_notes_table.sql file to it. Replace the contents of this file with the code below.
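A possible schema, inferred from the note fields used later in this tutorial (the exact columns in the original project may differ):

```sql
-- migrations/0001_create_notes_table.sql
CREATE TABLE IF NOT EXISTS notes (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  text TEXT NOT NULL,
  audio_urls TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```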
And then apply this migration to create the notes table.
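For example, replacing <DATABASE_NAME> with the database name from your wrangler.toml file:

```sh
# Apply the migration locally first, then to the remote database.
npx wrangler d1 migrations apply <DATABASE_NAME> --local
npx wrangler d1 migrations apply <DATABASE_NAME> --remote
```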
Now you can create the API endpoint. Create a new file index.post.ts in the server/api/notes directory, and change its content to the following:
The above code does the following:
Extracts the text, and the optional audioUrls, from the event body.
Saves the note to the database after converting the audioUrls array to a JSON string.
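A sketch of the endpoint; the request body shape and the D1 binding name DB are assumptions:

```typescript
// server/api/notes/index.post.ts — a sketch of the note-creation endpoint.
export default defineEventHandler(async (event) => {
  const { cloudflare } = event.context;

  const { text, audioUrls } = await readBody<{ text: string; audioUrls?: string[] }>(event);
  if (!text) {
    throw createError({ statusCode: 400, message: 'Missing note text' });
  }

  // Store the optional list of R2 object keys as a JSON string.
  await cloudflare.env.DB.prepare('INSERT INTO notes (text, audio_urls) VALUES (?1, ?2)')
    .bind(text, audioUrls?.length ? JSON.stringify(audioUrls) : null)
    .run();

  setResponseStatus(event, 201);
  return { success: true };
});
```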
5. Handle note creation on the client-side
Now you're ready to work on the client side. Let's start with the note creation part.
Recording user audio
Create a composable to handle audio recording using the MediaRecorder API. This will be used to record notes through the user's microphone.
Create a new file useMediaRecorder.ts in the app/composables folder, and add the following code to it:
The above code does the following:
Exposes functions to start and stop audio recordings in a Vue application.
Captures audio input from the user's microphone using the MediaRecorder API.
Processes real-time audio data for visualization using AudioContext and AnalyserNode.
Stores recording state including duration and recording status.
Maintains chunks of audio data and combines them into a final audio blob when recording stops.
Updates audio visualization data continuously using animation frames while recording.
Automatically cleans up all audio resources when recording stops or component unmounts.
Returns audio recordings in webm format for further processing.
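A condensed sketch of such a composable; the returned state names and the webm mime type are assumptions (browser support for audio/webm varies, notably in Safari):

```typescript
// app/composables/useMediaRecorder.ts — a simplified sketch of the composable described above.
import { ref, onUnmounted } from 'vue';

export function useMediaRecorder() {
  const isRecording = ref(false);
  const recordingDuration = ref(0); // seconds
  const audioData = ref<Uint8Array>(new Uint8Array(0)); // visualization data

  let mediaRecorder: MediaRecorder | null = null;
  let audioContext: AudioContext | null = null;
  let analyser: AnalyserNode | null = null;
  let chunks: Blob[] = [];
  let animationFrame: number | null = null;
  let timer: ReturnType<typeof setInterval> | null = null;

  // Continuously sample the analyser while recording, for the waveform visualization.
  const updateVisualization = () => {
    if (!analyser) return;
    const data = new Uint8Array(analyser.frequencyBinCount);
    analyser.getByteFrequencyData(data);
    audioData.value = data;
    animationFrame = requestAnimationFrame(updateVisualization);
  };

  const startRecording = async () => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

    // Wire the microphone stream into an AnalyserNode for live visualization.
    audioContext = new AudioContext();
    analyser = audioContext.createAnalyser();
    audioContext.createMediaStreamSource(stream).connect(analyser);

    chunks = [];
    mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
    mediaRecorder.ondataavailable = (e) => chunks.push(e.data);
    mediaRecorder.start();

    isRecording.value = true;
    recordingDuration.value = 0;
    timer = setInterval(() => recordingDuration.value++, 1000);
    updateVisualization();
  };

  // Stops the recorder and resolves with the combined audio blob.
  const stopRecording = (): Promise<Blob> =>
    new Promise((resolve) => {
      if (!mediaRecorder) return resolve(new Blob());
      mediaRecorder.onstop = () => {
        cleanup();
        resolve(new Blob(chunks, { type: 'audio/webm' }));
      };
      mediaRecorder.stop();
      mediaRecorder.stream.getTracks().forEach((track) => track.stop());
    });

  // Release the microphone, timers, and audio graph.
  const cleanup = () => {
    isRecording.value = false;
    if (timer) clearInterval(timer);
    if (animationFrame) cancelAnimationFrame(animationFrame);
    audioContext?.close();
    mediaRecorder = audioContext = analyser = null;
  };

  onUnmounted(cleanup);

  return { isRecording, recordingDuration, audioData, startRecording, stopRecording };
}
```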
Create a component for note creation
This component allows users to create notes by either typing or recording audio. It also handles audio transcription and uploading the recordings to the server.
Create a new file named CreateNote.vue inside the app/components folder. Add the following template code to the newly created file:
The above template results in the following:
A panel with a textarea inside to type the note manually.
Another panel to start and stop audio recordings, and to list the recordings made so far.
A bottom panel to reset or save the note (along with the recordings).
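An unstyled sketch of such a template; class names and exact markup are assumptions, and the referenced state and functions come from the script portion added later in this step:

```vue
<!-- Template portion of app/components/CreateNote.vue -->
<template>
  <div class="create-note">
    <!-- Panel 1: type the note manually -->
    <textarea v-model="note" placeholder="Type your note, or record it below" />

    <!-- Panel 2: start/stop a recording and list the recordings made so far -->
    <div class="recordings">
      <button v-if="!isRecording" @click="startRecording">Start recording</button>
      <button v-else @click="handleRecordingStop">Stop recording</button>
      <audio v-for="rec in recordings" :key="rec.url" :src="rec.url" controls />
    </div>

    <!-- Panel 3: reset or save the note (along with the recordings) -->
    <div class="actions">
      <button @click="resetNote">Reset</button>
      <button :disabled="!note" @click="saveNote">Save</button>
    </div>
  </div>
</template>
```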
Now, add the following code below the template code in the same file:
The above code does the following:
When a recording is stopped by calling the handleRecordingStop function, the audio blob is sent to the transcribe API endpoint for transcription.
The transcription response text is appended to the existing textarea content.
When the note is saved by calling the saveNote function, the audio recordings are first uploaded to R2 using the upload endpoint created earlier. Then the note content, along with the audioUrls (the R2 object keys), is saved by calling the notes POST endpoint.
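A sketch of the script portion; the endpoint paths match the sketches from the earlier steps, and the emitted created event is an assumption used by the page in the next step:

```vue
<!-- Script portion of app/components/CreateNote.vue -->
<script setup lang="ts">
const emit = defineEmits(['created']);

const note = ref('');
const recordings = ref<{ blob: Blob; url: string }[]>([]);
const { isRecording, startRecording, stopRecording } = useMediaRecorder();

// Stop the recorder, keep the blob for upload, and request a transcription.
async function handleRecordingStop() {
  const blob = await stopRecording();
  recordings.value.push({ blob, url: URL.createObjectURL(blob) });

  const formData = new FormData();
  formData.append('audio', blob);

  const text = await $fetch<string>('/api/transcribe', { method: 'POST', body: formData });
  note.value = note.value ? `${note.value} ${text}` : text;
}

function resetNote() {
  note.value = '';
  recordings.value = [];
}

// Upload the recordings to R2 first, then save the note with the returned object keys.
async function saveNote() {
  let audioUrls: string[] = [];
  if (recordings.value.length) {
    const formData = new FormData();
    recordings.value.forEach((rec, i) =>
      formData.append('files', rec.blob, `recording-${i + 1}.webm`)
    );
    audioUrls = await $fetch<string[]>('/api/upload', { method: 'PUT', body: formData });
  }

  await $fetch('/api/notes', { method: 'POST', body: { text: note.value, audioUrls } });
  emit('created');
}
</script>
```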
Create a new page route for showing the component
You can use this component in a Nuxt page to show it to the user. But before that you need to modify your app.vue file. Update the content of your app.vue to the following:
The above code renders the current Nuxt page for the user, along with an app header and a navigation sidebar.
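A structural sketch; AppHeader and AppSidebar stand in for the header and navigation components, which are not covered in this article:

```vue
<!-- app/app.vue -->
<template>
  <div class="app-layout">
    <AppHeader />
    <div class="app-body">
      <AppSidebar />
      <main>
        <!-- Renders the current page (index.vue, new.vue, settings.vue, and so on) -->
        <NuxtPage />
      </main>
    </div>
  </div>
</template>
```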
Next, add a new file named new.vue inside the app/pages folder, and add the following code to it:
The above code shows the CreateNote component inside a modal, and navigates back to the home page on successful note creation.
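A bare-bones sketch of the page (the original wraps the component in a styled modal):

```vue
<!-- app/pages/new.vue -->
<script setup lang="ts">
const router = useRouter();
</script>

<template>
  <div class="modal-overlay">
    <CreateNote @created="router.push('/')" />
  </div>
</template>
```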
6. Showing the notes on the client side
To show the notes from the database on the client side, create an API endpoint first that will interact with the database.
Create an API endpoint to fetch notes from the database
Create a new file named index.get.ts inside the server/api/notes directory, and add the following code to it:
The above code fetches the last 50 notes from the database, ordered by their creation date in descending order. The audio_urls field is stored as a string in the database, so it is converted back into an array using JSON.parse, which lets the client handle multiple audio files per note.
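A sketch of the endpoint, again assuming a D1 binding named DB:

```typescript
// server/api/notes/index.get.ts — a sketch of the notes listing endpoint.
export default defineEventHandler(async (event) => {
  const { cloudflare } = event.context;

  const { results } = await cloudflare.env.DB.prepare(
    'SELECT * FROM notes ORDER BY created_at DESC LIMIT 50'
  ).all();

  // audio_urls is stored as a JSON string; parse it into an array for the client.
  return results.map((note) => ({
    ...note,
    audio_urls: note.audio_urls ? JSON.parse(note.audio_urls as string) : [],
  }));
});
```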
Next, create a page named index.vue inside the app/pages directory. This will be the home page of the application. Add the following code to it:
The above code fetches the notes from the database by calling the /api/notes endpoint you just created, and renders them as note cards.
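A sketch of the page; NoteCard is a hypothetical presentational component that renders a note's text and its audio players:

```vue
<!-- app/pages/index.vue -->
<script setup lang="ts">
const { data: notes } = await useFetch('/api/notes');
</script>

<template>
  <div class="notes-grid">
    <NoteCard v-for="(note, i) in notes ?? []" :key="i" :note="note" />
  </div>
</template>
```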
Serving the saved recordings from R2
To be able to play the audio recordings of these notes, you need to serve the saved recordings from the R2 storage.
Create a new file named [...pathname].get.ts inside the server/routes/recordings directory, and add the following code to it:
The above code extracts the path name from the event params, and serves the saved recording matching that object key from the R2 bucket.
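A sketch of the route handler; it assumes the R2 binding from step 3 and that the requested path is the R2 object key:

```typescript
// server/routes/recordings/[...pathname].get.ts — a sketch of serving recordings from R2.
export default defineEventHandler(async (event) => {
  const { cloudflare } = event.context;

  const pathname = getRouterParam(event, 'pathname');
  if (!pathname) {
    throw createError({ statusCode: 400, message: 'Missing object key' });
  }

  // Look up the recording in the R2 bucket using the path as the object key.
  const object = await cloudflare.env.R2.get(decodeURIComponent(pathname));
  if (!object) {
    throw createError({ statusCode: 404, message: 'Recording not found' });
  }

  setHeader(event, 'Content-Type', object.httpMetadata?.contentType ?? 'audio/webm');
  return object.body;
});
```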
7. [Optional] Post-processing the transcriptions
Even though speech-to-text transcription models perform satisfactorily, sometimes you may want to post-process the transcriptions, for example to clean up discrepancies or to change the tone or style of the final text.
Create a settings page
Create a new file named settings.vue in the app/pages folder, and add the following code to it:
The above code renders a toggle button that enables or disables the post-processing of transcriptions. If enabled, users can change the prompt that will be used while post-processing the transcription with an AI model.
The transcription settings are saved using useStorageAsync, which utilizes the browser's local storage. This ensures that users' preferences are retained even after refreshing the page.
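A sketch of the settings page, assuming VueUse is available in the project; the storage keys and the default prompt are placeholders:

```vue
<!-- app/pages/settings.vue -->
<script setup lang="ts">
import { useStorageAsync } from '@vueuse/core';

// Persist the post-processing preferences in the browser's local storage.
const isPostProcessingEnabled = useStorageAsync('vnotes-post-process', false);
const postProcessingPrompt = useStorageAsync(
  'vnotes-post-process-prompt',
  'Fix the punctuation and remove filler words from the transcription.'
);
</script>

<template>
  <form>
    <label>
      <input v-model="isPostProcessingEnabled" type="checkbox" />
      Post-process transcriptions
    </label>
    <textarea v-if="isPostProcessingEnabled" v-model="postProcessingPrompt" />
  </form>
</template>
```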
Send the post-processing prompt with recorded audio
Modify the CreateNote component to send the post-processing prompt along with the audio blob when calling the transcribe API endpoint.
The code added above checks the saved post-processing setting. If it is enabled and a prompt is defined, the prompt is sent to the transcribe API endpoint along with the audio.
Handle post-processing in the transcribe API endpoint
Modify the transcribe API endpoint, and update it to the following:
The above code does the following:
Extracts the post-processing prompt from the event FormData.
If present, it calls the Workers AI API to process the transcription text using the @cf/meta/llama-3.1-8b-instruct model.
Finally, it returns the response from Workers AI to the client.
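A sketch of the updated endpoint; the prompt form field name mirrors the client-side change described above:

```typescript
// server/api/transcribe.post.ts — updated sketch with optional post-processing.
export default defineEventHandler(async (event) => {
  const { cloudflare } = event.context;

  const form = await readFormData(event);
  const blob = form.get('audio') as Blob | null;
  const prompt = form.get('prompt') as string | null;
  if (!blob) {
    throw createError({ statusCode: 400, message: 'No audio file provided' });
  }

  // Transcribe the audio first.
  const buffer = await blob.arrayBuffer();
  const { text } = await cloudflare.env.AI.run('@cf/openai/whisper', {
    audio: [...new Uint8Array(buffer)],
  });

  // Without a prompt, return the raw transcription unchanged.
  if (!prompt) {
    return text;
  }

  // Otherwise, rewrite the transcription with an instruction-tuned model.
  const result = await cloudflare.env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
    messages: [
      { role: 'system', content: prompt },
      { role: 'user', content: text },
    ],
  });

  return result.response;
});
```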
8. Deploy the application
Now you are ready to deploy the project to a .workers.dev subdomain by running the deploy command.
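In a project scaffolded with c3 this is typically exposed as an npm script that builds the Nuxt app and runs Wrangler:

```sh
npm run deploy
```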
You can preview your application at <YOUR_WORKER>.<YOUR_SUBDOMAIN>.workers.dev.
Conclusion
In this tutorial, you have gone through the steps of building a voice notes application using Nuxt 3, Cloudflare Workers, D1, and R2 storage. You learned how to:
Set up the backend to store and manage notes
Create API endpoints to fetch and display notes
Handle audio recordings
Implement optional post-processing for transcriptions
Deploy the application using the Cloudflare module syntax
The complete source code of the project is available on GitHub. You can go through it to see the code for various frontend components not covered in the article. You can find it here: github.com/ra-jeev/vnotes ↗.