Browse the web

Agents can browse the web using the Browser Rendering API or your preferred headless browser service.

Browser Rendering API

The Browser Rendering allows you to spin up headless browser instances, render web pages, and interact with websites through your Agent.

You can define a method that uses Puppeteer to pull the content of a web page, parse the DOM, and extract relevant information by calling the OpenAI model:

JavaScript
TypeScript

export class MyAgent extends Agent {
  async browse(browserInstance, urls) {
    let responses = [];
    for (const url of urls) {
      const browser = await puppeteer.launch(browserInstance);
      const page = await browser.newPage();
      await page.goto(url);

      await page.waitForSelector("body");
      const bodyContent = await page.$eval(
        "body",
        (element) => element.innerHTML,
      );
      const client = new OpenAI({
        apiKey: this.env.OPENAI_API_KEY,
      });

      let resp = await client.chat.completions.create({
        model: this.env.MODEL,
        messages: [
          {
            role: "user",
            content: `Return a JSON object with the product names, prices and URLs with the following format: { "name": "Product Name", "price": "Price", "url": "URL" } from the website content below. <content>${bodyContent}</content>`,
          },
        ],
        response_format: {
          type: "json_object",
        },
      });

      responses.push(resp);
      await browser.close();
    }

    return responses;
  }
}

interface Env {
  BROWSER: Fetcher;
}

export class MyAgent extends Agent<Env> {
  async browse(browserInstance: Fetcher, urls: string[]) {
    let responses = [];
    for (const url of urls) {
      const browser = await puppeteer.launch(browserInstance);
      const page = await browser.newPage();
      await page.goto(url);

      await page.waitForSelector("body");
      const bodyContent = await page.$eval(
        "body",
        (element) => element.innerHTML,
      );
      const client = new OpenAI({
        apiKey: this.env.OPENAI_API_KEY,
      });

      let resp = await client.chat.completions.create({
        model: this.env.MODEL,
        messages: [
          {
            role: "user",
            content: `Return a JSON object with the product names, prices and URLs with the following format: { "name": "Product Name", "price": "Price", "url": "URL" } from the website content below. <content>${bodyContent}</content>`,
          },
        ],
        response_format: {
          type: "json_object",
        },
      });

      responses.push(resp);
      await browser.close();
    }

    return responses;
  }
}

You'll also need to add install the @cloudflare/puppeteer package and add the following to the wrangler configuration of your Agent:

npm i -D @cloudflare/puppeteer

yarn add -D @cloudflare/puppeteer

pnpm add -D @cloudflare/puppeteer

wrangler.jsonc
wrangler.toml

{
  // ...
  "browser": {
    "binding": "MYBROWSER",
  },
  // ...
}

[browser]
binding = "MYBROWSER"

Browserbase

You can also use Browserbase ↗ by using the Browserbase API directly from within your Agent.

Once you have your Browserbase API key ↗, you can add it to your Agent by creating a secret:

cd your-agent-project-folder
npx wrangler@latest secret put BROWSERBASE_API_KEY

Enter a secret value: ******
Creating the secret for the Worker "agents-example"
Success! Uploaded secret BROWSERBASE_API_KEY

Install the @cloudflare/puppeteer package and use it from within your Agent to call the Browserbase API:

npm i @cloudflare/puppeteer

yarn add @cloudflare/puppeteer

pnpm add @cloudflare/puppeteer

JavaScript
TypeScript

export class MyAgent extends Agent {
  constructor(env) {
    super(env);
  }
}

interface Env {
  BROWSERBASE_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  constructor(env: Env) {
    super(env);
  }
}

Was this helpful?

Community
X
Discord
YouTube
GitHub