Llama 3.2 11B Vision Instruct model on Cloudflare Workers AI
Before you begin, ensure you have the following:
- A Cloudflare account with Workers and Workers AI enabled.
- Your `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_AUTH_TOKEN`.
- You can generate an API token in your Cloudflare dashboard under API Tokens.
- Node.js installed for working with Cloudflare Workers (optional but recommended).
The first time you use the Llama 3.2 11B Vision Instruct model, you need to agree to Meta's License and Acceptable Use Policy.
```sh
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/meta/llama-3.2-11b-vision-instruct \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
  -d '{ "prompt": "agree" }'
```
Replace `$CLOUDFLARE_ACCOUNT_ID` and `$CLOUDFLARE_AUTH_TOKEN` with your actual account ID and API token.
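If you would rather make this one-time call from a script instead of `curl`, here is a minimal Node.js sketch against the same REST endpoint. It assumes Node.js 18+ (built-in `fetch`), an ES module context for top-level `await`, and that both environment variables from the prerequisites are set in your shell:

```ts
// Assumes CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_AUTH_TOKEN are exported in your environment.
const accountId = process.env.CLOUDFLARE_ACCOUNT_ID;
const authToken = process.env.CLOUDFLARE_AUTH_TOKEN;

const res = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/@cf/meta/llama-3.2-11b-vision-instruct`,
  {
    method: "POST",
    headers: { Authorization: `Bearer ${authToken}` },
    body: JSON.stringify({ prompt: "agree" }),
  }
);

// The API responds with JSON; a successful call indicates the license was accepted.
console.log(await res.json());
```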
- Create a Worker project

You will create a new Worker project using the `create-cloudflare` CLI (C3). This tool simplifies setting up and deploying new applications to Cloudflare. Run the following command in your terminal:
```sh
# npm
npm create cloudflare@latest -- llama-vision-tutorial

# pnpm
pnpm create cloudflare@latest llama-vision-tutorial

# yarn
yarn create cloudflare llama-vision-tutorial
```
For setup, select the following options:

- For What would you like to start with?, choose `Hello World example`.
- For Which template would you like to use?, choose `Hello World Worker`.
- For Which language do you want to use?, choose `TypeScript` (the sample Worker code below is TypeScript).
- For Do you want to use git for version control?, choose `Yes`.
- For Do you want to deploy your application?, choose `No` (we will be making some changes before deploying).
After completing the setup, a new directory called `llama-vision-tutorial` will be created.
- Navigate to your application directory

Change into the project directory:

```sh
cd llama-vision-tutorial
```

- Project structure

Your `llama-vision-tutorial` directory will include:

- A "Hello World" Worker at `src/index.ts`.
- A `wrangler.json` configuration file for managing deployment settings.
- A "Hello World" Worker at
- Edit the `src/index.ts` (or `index.js` if you are not using TypeScript) file and replace its content with the following code:
```ts
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Describe the image I'm providing." },
    ];

    // Replace this with your image data encoded as base64 or a URL
    const imageBase64 = "data:image/png;base64,IMAGE_DATA_HERE";

    const response = await env.AI.run("@cf/meta/llama-3.2-11b-vision-instruct", {
      messages,
      image: imageBase64,
    });

    return Response.json(response);
  },
} satisfies ExportedHandler<Env>;
```
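Note that the handler above sends a hard-coded `imageBase64` placeholder and ignores the incoming request. If you want the Worker to accept the image from the POST body instead (matching the `curl` test later in this tutorial), one possible variation of the `fetch` handler is sketched below; the `{ "image": "..." }` request shape is an assumption used for illustration, not a documented contract:

```ts
export default {
  async fetch(request, env): Promise<Response> {
    // Assumes the client POSTs JSON like { "image": "<base64 or data URL>" }.
    const { image } = (await request.json()) as { image: string };

    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Describe the image I'm providing." },
    ];

    const response = await env.AI.run("@cf/meta/llama-3.2-11b-vision-instruct", {
      messages,
      image,
    });

    return Response.json(response);
  },
} satisfies ExportedHandler<Env>;
```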
- Open the Wrangler configuration file and add the following configuration:

If you are using `wrangler.json`:

```json
{
  "env": {},
  "ai": {
    "binding": "AI",
    "model": "@cf/meta/llama-3.2-11b-vision-instruct"
  }
}
```

If you are using `wrangler.toml`:

```toml
[env]

[ai]
binding = "AI"
model = "@cf/meta/llama-3.2-11b-vision-instruct"
```
- Save the file.
Run the following command to deploy your Worker:
```sh
wrangler deploy
```
- After deployment, you will receive a unique URL for your Worker (for example, `https://llama-vision-tutorial.<your-subdomain>.workers.dev`).
- Use a tool like `curl` or Postman to send a request to your Worker:
```sh
curl -X POST https://llama-vision-tutorial.<your-subdomain>.workers.dev \
  -d '{ "image": "BASE64_ENCODED_IMAGE" }'
```
Replace `BASE64_ENCODED_IMAGE` with an actual base64-encoded image string.
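If you prefer to script this test, the following Node.js sketch reads a local image, base64-encodes it, and sends it to the Worker. It assumes Node.js 18+ (built-in `fetch`), an ES module context for top-level `await`, and placeholder values for the image path and Worker URL:

```ts
import { readFile } from "node:fs/promises";

// Placeholders: replace with your own image file and deployed Worker URL.
const imagePath = "example.png";
const workerUrl = "https://llama-vision-tutorial.<your-subdomain>.workers.dev";

// Read the image and base64-encode it.
const imageBase64 = (await readFile(imagePath)).toString("base64");

const res = await fetch(workerUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ image: imageBase64 }),
});

// Print the model's JSON response.
console.log(await res.json());
```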
The response will include the output from the model, such as a description or answer to your prompt based on the image provided.
Example response:
{ "result": "This is a golden retriever sitting in a grassy park."}