Llama 3.2 11B Vision Instruct model on Cloudflare Workers AI
Before you begin, ensure you have the following:
- A Cloudflare account with Workers and Workers AI enabled.
- Your `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_AUTH_TOKEN`.
- You can generate an API token in your Cloudflare dashboard under API Tokens.
- Node.js installed for working with Cloudflare Workers (optional but recommended).
The first time you use the Llama 3.2 11B Vision Instruct model, you need to agree to Meta's License and Acceptable Use Policy.
```sh
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/meta/llama-3.2-11b-vision-instruct \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
  -d '{ "prompt": "agree" }'
```
Replace `$CLOUDFLARE_ACCOUNT_ID` and `$CLOUDFLARE_AUTH_TOKEN` with your actual account ID and API token.
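If you would rather make this one-time call from a script instead of `curl`, here is a minimal Node.js sketch against the same REST endpoint. It assumes Node.js 18+ (built-in `fetch`), an ES module context for top-level `await`, and that both environment variables from the prerequisites are set in your shell:

```ts
// Assumes CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_AUTH_TOKEN are exported in your environment.
const accountId = process.env.CLOUDFLARE_ACCOUNT_ID;
const authToken = process.env.CLOUDFLARE_AUTH_TOKEN;

const res = await fetch(
  `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/@cf/meta/llama-3.2-11b-vision-instruct`,
  {
    method: "POST",
    headers: { Authorization: `Bearer ${authToken}` },
    body: JSON.stringify({ prompt: "agree" }),
  }
);

// The API responds with JSON; a successful call indicates the license was accepted.
console.log(await res.json());
```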
- Create a Worker project

You will create a new Worker project using the `create-cloudflare` CLI (C3). This tool simplifies setting up and deploying new applications to Cloudflare. Run the following command in your terminal:
```sh
# npm
npm create cloudflare@latest -- llama-vision-tutorial

# pnpm
pnpm create cloudflare@latest llama-vision-tutorial

# yarn
yarn create cloudflare llama-vision-tutorial
```
For setup, select the following options:

- For What would you like to start with?, choose `Hello World example`.
- For Which template would you like to use?, choose `Hello World Worker`.
- For Which language do you want to use?, choose `TypeScript` (the sample Worker code below is TypeScript).
- For Do you want to use git for version control?, choose `Yes`.
- For Do you want to deploy your application?, choose `No` (we will be making some changes before deploying).
After completing the setup, a new directory called `llama-vision-tutorial` will be created.
- Navigate to your application directory

Change into the project directory:

```sh
cd llama-vision-tutorial
```

- Project structure

Your `llama-vision-tutorial` directory will include:

- A "Hello World" Worker at `src/index.ts`.
- A `wrangler.json` configuration file for managing deployment settings.
- A "Hello World" Worker at
- Edit the `src/index.ts` (or `index.js` if you are not using TypeScript) file and replace its content with the following code:
```ts
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Describe the image I'm providing." },
    ];

    // Replace this with your image data encoded as base64 or a URL
    const imageBase64 = "data:image/png;base64,IMAGE_DATA_HERE";

    const response = await env.AI.run("@cf/meta/llama-3.2-11b-vision-instruct", {
      messages,
      image: imageBase64,
    });

    return Response.json(response);
  },
} satisfies ExportedHandler<Env>;
```
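Note that the handler above sends a hard-coded `imageBase64` placeholder and ignores the incoming request. If you want the Worker to accept the image from the POST body instead (matching the `curl` test later in this tutorial), one possible variation of the `fetch` handler is sketched below; the `{ "image": "..." }` request shape is an assumption used for illustration, not a documented contract:

```ts
export default {
  async fetch(request, env): Promise<Response> {
    // Assumes the client POSTs JSON like { "image": "<base64 or data URL>" }.
    const { image } = (await request.json()) as { image: string };

    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Describe the image I'm providing." },
    ];

    const response = await env.AI.run("@cf/meta/llama-3.2-11b-vision-instruct", {
      messages,
      image,
    });

    return Response.json(response);
  },
} satisfies ExportedHandler<Env>;
```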
- Open the Wrangler configuration file and add the following configuration:

If you are using `wrangler.json`:

```json
{
  "env": {},
  "ai": {
    "binding": "AI",
    "model": "@cf/meta/llama-3.2-11b-vision-instruct"
  }
}
```

If you are using `wrangler.toml`:

```toml
[env]

[ai]
binding = "AI"
model = "@cf/meta/llama-3.2-11b-vision-instruct"
```
- Save the file.
Run the following command to deploy your Worker:
```sh
wrangler deploy
```
- After deployment, you will receive a unique URL for your Worker (for example, `https://llama-vision-tutorial.<your-subdomain>.workers.dev`).
- Use a tool like `curl` or Postman to send a request to your Worker:
```sh
curl -X POST https://llama-vision-tutorial.<your-subdomain>.workers.dev \
  -d '{ "image": "BASE64_ENCODED_IMAGE" }'
```
Replace `BASE64_ENCODED_IMAGE` with an actual base64-encoded image string.
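If you prefer to script this test, the following Node.js sketch reads a local image, base64-encodes it, and sends it to the Worker. It assumes Node.js 18+ (built-in `fetch`), an ES module context for top-level `await`, and placeholder values for the image path and Worker URL:

```ts
import { readFile } from "node:fs/promises";

// Placeholders: replace with your own image file and deployed Worker URL.
const imagePath = "example.png";
const workerUrl = "https://llama-vision-tutorial.<your-subdomain>.workers.dev";

// Read the image and base64-encode it.
const imageBase64 = (await readFile(imagePath)).toString("base64");

const res = await fetch(workerUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ image: imageBase64 }),
});

// Print the model's JSON response.
console.log(await res.json());
```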
The response will include the output from the model, such as a description or answer to your prompt based on the image provided.
Example response:
{ "result": "This is a golden retriever sitting in a grassy park."}