Use browser rendering with AI
The ability to browse websites can be crucial when building workflows with AI. Here, we provide an example where we use Browser Rendering to visit https://labs.apnic.net/ and then, using a machine learning model available in Workers AI, extract the first post as JSON with a specified schema.
- Use the create-cloudflare CLI to generate a new Hello World Cloudflare Worker script:
npm create cloudflare@latest -- browser-worker
- Install @cloudflare/puppeteer, which allows you to control the Browser Rendering instance:
npm i @cloudflare/puppeteer
- Install zod so we can define our output format, and zod-to-json-schema so we can convert it into a JSON schema format:
npm i zod
npm i zod-to-json-schema
- Activate the Node.js compatibility flag (nodejs_compat) and add your Browser Rendering binding to your new Wrangler configuration. For a JSON configuration file:
{
  "compatibility_flags": ["nodejs_compat"],
  "browser": { "binding": "MY_BROWSER" }
}
For a TOML configuration file:
compatibility_flags = [ "nodejs_compat" ]

[browser]
binding = "MY_BROWSER"
- In order to use Workers AI, you need to get your Account ID and API token. Once you have those, create a .dev.vars file and set them there:
ACCOUNT_ID=
API_TOKEN=
We use .dev.vars here because these values are only needed for local development; for a deployed Worker you would use Secrets instead.
In the code below, we launch a browser using await puppeteer.launch(env.MY_BROWSER), extract the rendered text, and close the browser. Then, with the user prompt, the desired output schema, and the rendered text, we prepare a prompt to send to the LLM.
Replace the contents of src/index.ts with the following skeleton script:
import { z } from "zod";
import puppeteer from "@cloudflare/puppeteer";
import zodToJsonSchema from "zod-to-json-schema";

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    if (url.pathname !== "/") {
      return new Response("Not found", { status: 404 });
    }

    // Your prompt and site to scrape
    const userPrompt = "Extract the first post only.";
    const targetUrl = "https://labs.apnic.net/";

    // Launch browser
    const browser = await puppeteer.launch(env.MY_BROWSER);
    const page = await browser.newPage();
    await page.goto(targetUrl);

    // Get website text
    const renderedText = await page.evaluate(() => {
      // @ts-ignore js code to run in the browser context
      const body = document.querySelector("body");
      return body ? body.innerText : "";
    });
    // Close browser since we no longer need it
    await browser.close();

    // Define your desired JSON schema
    const outputSchema = zodToJsonSchema(
      z.object({ title: z.string(), url: z.string(), date: z.string() })
    );

    // Example prompt
    const prompt = `
    You are a sophisticated web scraper. You are given the user data extraction goal and the JSON schema for the output data format.
    Your task is to extract the requested information from the text and output it in the specified JSON schema format:

        ${JSON.stringify(outputSchema)}

    DO NOT include anything else besides the JSON output, no markdown, no plaintext, just JSON.

    User Data Extraction Goal: ${userPrompt}

    Text extracted from the webpage: ${renderedText}`;

    // TODO call llm
    //const result = await getLLMResult(env, prompt, outputSchema);
    //return Response.json(result);

    // Temporary response so the skeleton runs before the LLM call is added
    return new Response(prompt);
  }
} satisfies ExportedHandler<Env>;
With the webpage text, the user's goal, and the output schema in hand, we can now use an LLM to transform the text into JSON according to the user's request.
The example below uses @hf/thebloke/deepseek-coder-6.7b-instruct-awq, but other models, or services like OpenAI, could be used with minimal changes:
async function getLLMResult(env: Env, prompt: string, schema?: any) {
  const model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq";
  const requestBody = {
    messages: [{ role: "user", content: prompt }],
  };
  const aiUrl = `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/ai/run/${model}`;

  const response = await fetch(aiUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${env.API_TOKEN}`,
    },
    body: JSON.stringify(requestBody),
  });
  if (!response.ok) {
    console.log(JSON.stringify(await response.text(), null, 2));
    throw new Error(`LLM call failed ${aiUrl} ${response.status}`);
  }

  // Process response
  const data = (await response.json()) as { result: { response: string } };
  const text = data.result.response || "";
  // Strip a Markdown code fence if the model added one; otherwise use the text as-is
  const match = text.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
  const value = match ? match[1] : text;
  try {
    return JSON.parse(value);
  } catch (e) {
    console.error(`${e} . Response: ${value}`);
  }
}
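Models often wrap their JSON in a Markdown code fence even when told not to, which is why getLLMResult runs the response through a regular expression before parsing. As a minimal sketch, that step can be pulled out into a standalone helper and tried on sample strings. The name extractJson is hypothetical and not part of the original script, and the pattern is built from a string here only so that the snippet contains no literal fence.

```typescript
// Three backticks, built at runtime so this snippet contains no literal fence.
const FENCE = "`".repeat(3);

// Equivalent to the pattern used in getLLMResult: an optional "json" tag after
// the opening fence, then a lazy capture of everything up to the closing fence.
const FENCED_JSON = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE);

// Hypothetical helper (not part of the original script): parses JSON from raw
// model output, whether or not it is wrapped in a Markdown code fence.
function extractJson(text: string): unknown {
  const match = text.match(FENCED_JSON);
  // Fall back to the whole response when the model returned bare JSON.
  const candidate = match ? match[1] : text;
  try {
    return JSON.parse(candidate);
  } catch {
    // Report failure to the caller as undefined instead of throwing.
    return undefined;
  }
}

const fenced = FENCE + 'json\n{ "title": "Hello" }\n' + FENCE;
const bare = '{ "title": "Hello" }';

console.log(extractJson(fenced)); // parsed object with title "Hello"
console.log(extractJson(bare)); // same object, no fence needed
console.log(extractJson("not json")); // undefined
```

Returning undefined on a parse failure mirrors what getLLMResult does after logging the error; a caller that needs to distinguish failures could rethrow instead.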
If you want to use Browser Rendering with OpenAI instead, you only need to change the aiUrl endpoint and requestBody (or check out the llm-scraper-worker package).
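As a rough sketch of what that change could look like, the helper below builds the URL and request options for OpenAI's Chat Completions endpoint in place of the Workers AI ones. The names buildOpenAIRequest and readOpenAIText, and the gpt-4o-mini model choice, are assumptions for illustration and are not part of the original tutorial.

```typescript
// Hypothetical shape for the two values getLLMResult passes to fetch().
interface LLMRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds a Chat Completions request for the given prompt.
// The default model is an assumption; use whichever model your account offers.
function buildOpenAIRequest(apiKey: string, prompt: string, model = "gpt-4o-mini"): LLMRequest {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // Unlike the Workers AI URL, the model name travels in the request body.
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// The generated text sits under choices[0].message.content in the response,
// rather than under result.response as with Workers AI.
function readOpenAIText(data: { choices?: { message?: { content?: string | null } }[] }): string {
  return data.choices?.[0]?.message?.content ?? "";
}
```

Inside getLLMResult, the result of buildOpenAIRequest would replace aiUrl and the options object passed to fetch, and readOpenAIText would replace the data.result.response lookup. The API key would come from a variable of your own, set the same way as API_TOKEN.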
The full Worker script now looks as follows:
import { z } from "zod";
import puppeteer from "@cloudflare/puppeteer";
import zodToJsonSchema from "zod-to-json-schema";

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    if (url.pathname !== "/") {
      return new Response("Not found", { status: 404 });
    }

    // Your prompt and site to scrape
    const userPrompt = "Extract the first post only.";
    const targetUrl = "https://labs.apnic.net/";

    // Launch browser
    const browser = await puppeteer.launch(env.MY_BROWSER);
    const page = await browser.newPage();
    await page.goto(targetUrl);

    // Get website text
    const renderedText = await page.evaluate(() => {
      // @ts-ignore js code to run in the browser context
      const body = document.querySelector("body");
      return body ? body.innerText : "";
    });
    // Close browser since we no longer need it
    await browser.close();

    // Define your desired JSON schema
    const outputSchema = zodToJsonSchema(
      z.object({ title: z.string(), url: z.string(), date: z.string() })
    );

    // Example prompt
    const prompt = `
    You are a sophisticated web scraper. You are given the user data extraction goal and the JSON schema for the output data format.
    Your task is to extract the requested information from the text and output it in the specified JSON schema format:

        ${JSON.stringify(outputSchema)}

    DO NOT include anything else besides the JSON output, no markdown, no plaintext, just JSON.

    User Data Extraction Goal: ${userPrompt}

    Text extracted from the webpage: ${renderedText}`;

    // Call LLM
    const result = await getLLMResult(env, prompt, outputSchema);
    return Response.json(result);
  }
} satisfies ExportedHandler<Env>;

async function getLLMResult(env: Env, prompt: string, schema?: any) {
  const model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq";
  const requestBody = {
    messages: [{ role: "user", content: prompt }],
  };
  const aiUrl = `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/ai/run/${model}`;

  const response = await fetch(aiUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${env.API_TOKEN}`,
    },
    body: JSON.stringify(requestBody),
  });
  if (!response.ok) {
    console.log(JSON.stringify(await response.text(), null, 2));
    throw new Error(`LLM call failed ${aiUrl} ${response.status}`);
  }

  // Process response
  const data = (await response.json()) as { result: { response: string } };
  const text = data.result.response || "";
  // Strip a Markdown code fence if the model added one; otherwise use the text as-is
  const match = text.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
  const value = match ? match[1] : text;
  try {
    return JSON.parse(value);
  } catch (e) {
    console.error(`${e} . Response: ${value}`);
  }
}
You can run this script to test it using Wrangler's --remote flag:
npx wrangler dev --remote
With your script now running, you can go to http://localhost:8787/ and should see something like the following:
{
  "title": "IP Addresses in 2024",
  "url": "http://example.com/ip-addresses-in-2024",
  "date": "11 Jan 2025"
}
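The model's output is free text that is only asked to follow the schema, so it may be worth checking the parsed result before returning it. The sketch below uses a hand-written type guard to stay dependency-free; the Post type and isPost function are hypothetical names. In the Worker itself, a similar check could be done by keeping the zod object in a variable and calling its safeParse method on the result.

```typescript
// Shape described by the zod schema in the Worker: three string fields.
interface Post {
  title: string;
  url: string;
  date: string;
}

// Hypothetical type guard: true only when all three fields are present and are strings.
function isPost(value: unknown): value is Post {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.title === "string" &&
    typeof record.url === "string" &&
    typeof record.date === "string"
  );
}

const good: unknown = {
  title: "IP Addresses in 2024",
  url: "http://example.com/ip-addresses-in-2024",
  date: "11 Jan 2025",
};
const bad: unknown = { title: "IP Addresses in 2024" };

console.log(isPost(good)); // true
console.log(isPost(bad)); // false
```

A Worker using such a check could return an error status when it fails, rather than passing an incomplete object on to the client.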
For more complex websites or prompts, you might need a better model. Check out the latest models in Workers AI.