# Changelog URL: https://developers.cloudflare.com/browser-rendering/changelog/ import { ProductReleaseNotes } from "~/components"; {/* <!-- Actual content lives in /src/content/release-notes/radar.yaml. Update the file there for new entries to appear here. For more details, refer to https://developers.cloudflare.com/style-guide/documentation-content-strategy/content-types/changelog/#yaml-file --> */} <ProductReleaseNotes /> --- # FAQ URL: https://developers.cloudflare.com/browser-rendering/faq/ import { GlossaryTooltip } from "~/components"; Below you will find answers to our most commonly asked questions. If you cannot find the answer you are looking for, refer to the [Discord](https://discord.cloudflare.com) to explore additional resources. ##### Uncaught (in response) TypeError: Cannot read properties of undefined (reading 'fetch') Make sure that you are passing your Browser binding to the `puppeteer.launch` api and that you have [Workers for Platforms Paid plan](/cloudflare-for-platforms/workers-for-platforms/platform/pricing/). ##### Will browser rendering bypass Cloudflare's Bot Protection? Browser rendering requests are always identified as bots by Cloudflare. If you are trying to **scan** your **own zone**, you can create a [WAF skip rule](/waf/custom-rules/skip/) to bypass the bot protection using a header or a custom user agent. ## Puppeteer ##### Code generation from strings disallowed for this context while using an Xpath selector Currently it's not possible to use Xpath to select elements since this poses a security risk to Workers. 
As an alternative try to use a css selector or `page.evaluate` for example: ```ts const innerHtml = await page.evaluate(() => { return ( // @ts-ignore this runs on browser context new XPathEvaluator() .createExpression("/html/body/div/h1") // @ts-ignore this runs on browser context .evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE).singleNodeValue .innerHTML ); }); ``` :::note Keep in mind that `page.evaluate` can only return primitive types like strings, numbers, etc. Returning an `HTMLElement` will not work. ::: --- # Browser Rendering URL: https://developers.cloudflare.com/browser-rendering/ import { CardGrid, Description, LinkTitleCard, Plan, RelatedProduct, } from "~/components"; <Description> Browser automation for [Cloudflare Workers](/workers/). </Description> <Plan type="workers-paid" /> The Workers Browser Rendering API allows developers to programmatically control and interact with a headless browser instance and create automation flows for their applications and products. Once you configure the service, Workers Browser Rendering gives you access to a WebSocket endpoint that speaks the [DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/). DevTools is what allows Cloudflare to instrument a Chromium instance running in the Cloudflare global network. Use Browser Rendering to: - Take screenshots of pages. - Convert a page to a PDF. - Test web applications. - Gather page load performance metrics. - Crawl web pages for information retrieval. ## Related products <RelatedProduct header="Workers" href="/workers/" product="workers"> Build serverless applications and deploy instantly across the globe for exceptional performance, reliability, and scale. </RelatedProduct> <RelatedProduct header="Durable Objects" href="/durable-objects/" product="durable-objects"> A globally distributed coordination API with strongly consistent storage. 
</RelatedProduct> ## More resources <CardGrid> <LinkTitleCard title="Get started" href="/browser-rendering/get-started/" icon="open-book" > Deploy your first Browser Rendering project using Wrangler and Cloudflare's version of Puppeteer. </LinkTitleCard> <LinkTitleCard title="Learning Path" href="/learning-paths/workers/concepts/" icon="pen" > New to Workers? Get started with the Workers Learning Path. </LinkTitleCard> <LinkTitleCard title="Limits" href="/browser-rendering/platform/limits/" icon="document" > Learn about Browser Rendering limits. </LinkTitleCard> <LinkTitleCard title="Developer Discord" href="https://discord.cloudflare.com" icon="discord" > Connect with the Workers community on Discord to ask questions, show what you are building, and discuss the platform with other developers. </LinkTitleCard> <LinkTitleCard title="@CloudflareDev" href="https://x.com/cloudflaredev" icon="x.com" > Follow @CloudflareDev on Twitter to learn about product announcements, and what is new in Cloudflare Workers. </LinkTitleCard> </CardGrid> --- # Get started URL: https://developers.cloudflare.com/browser-rendering/get-started/ Browser rendering can be used in two ways: - [Workers Binding API](/browser-rendering/workers-binding-api) for complex scripts. - [REST API](/browser-rendering/rest-api/) for simple actions. --- # Use browser rendering with AI URL: https://developers.cloudflare.com/browser-rendering/how-to/ai/ import { Aside, WranglerConfig } from "~/components"; The ability to browse websites can be crucial when building workflows with AI. Here, we provide an example where we use Browser Rendering to visit `https://labs.apnic.net/` and then, using a machine learning model available in [Workers AI](/workers-ai/), extract the first post as JSON with a specified schema. ## Prerequisites 1. Use the `create-cloudflare` CLI to generate a new Hello World Cloudflare Worker script: ```sh npm create cloudflare@latest -- browser-worker ``` 2. 
Install `@cloudflare/puppeteer`, which allows you to control the Browser Rendering instance:

```sh
npm i @cloudflare/puppeteer
```

3. Install `zod` so we can define our output format and `zod-to-json-schema` so we can convert it into a JSON schema format:

```sh
npm i zod
npm i zod-to-json-schema
```

4. Activate the nodejs compatibility flag and add your Browser Rendering binding to your new Wrangler configuration:

<WranglerConfig>

```toml
compatibility_flags = [ "nodejs_compat" ]
```

</WranglerConfig>

<WranglerConfig>

```toml
[browser]
binding = "MY_BROWSER"
```

</WranglerConfig>

5. In order to use [Workers AI](/workers-ai/), you need to get your [Account ID and API token](/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id). Once you have those, create a [`.dev.vars`](/workers/configuration/environment-variables/#add-environment-variables-via-wrangler) file and set them there:

```
ACCOUNT_ID=
API_TOKEN=
```

We use `.dev.vars` here since it's only for local development. In production, you'd use [Secrets](/workers/configuration/secrets/) instead.

## Load the page using Browser Rendering

In the code below, we launch a browser using `await puppeteer.launch(env.MY_BROWSER)`, extract the rendered text and close the browser. Then, with the user prompt, the desired output schema and the rendered text, we prepare a prompt to send to the LLM.
Replace the contents of `src/index.ts` with the following skeleton script: ```ts import { z } from "zod"; import puppeteer from "@cloudflare/puppeteer"; import zodToJsonSchema from "zod-to-json-schema"; export default { async fetch(request, env) { const url = new URL(request.url); if (url.pathname != "/") { return new Response("Not found"); } // Your prompt and site to scrape const userPrompt = "Extract the first post only."; const targetUrl = "https://labs.apnic.net/"; // Launch browser const browser = await puppeteer.launch(env.MY_BROWSER); const page = await browser.newPage(); await page.goto(targetUrl); // Get website text const renderedText = await page.evaluate(() => { // @ts-ignore js code to run in the browser context const body = document.querySelector("body"); return body ? body.innerText : ""; }); // Close browser since we no longer need it await browser.close(); // define your desired json schema const outputSchema = zodToJsonSchema( z.object({ title: z.string(), url: z.string(), date: z.string() }) ); // Example prompt const prompt = ` You are a sophisticated web scraper. You are given the user data extraction goal and the JSON schema for the output data format. Your task is to extract the requested information from the text and output it in the specified JSON schema format: ${JSON.stringify(outputSchema)} DO NOT include anything else besides the JSON output, no markdown, no plaintext, just JSON. User Data Extraction Goal: ${userPrompt} Text extracted from the webpage: ${renderedText}`; // TODO call llm //const result = await getLLMResult(env, prompt, outputSchema); //return Response.json(result); } } satisfies ExportedHandler<Env>; ``` ## Call an LLM Having the webpage text, the user's goal and output schema, we can now use an LLM to transform it to JSON according to the user's request. 
The example below uses `@hf/thebloke/deepseek-coder-6.7b-instruct-awq` but other [models](/workers-ai/models/), or services like OpenAI, could be used with minimal changes:

```ts
async function getLLMResult(env, prompt: string, schema?: any) {
  const model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq"
  const requestBody = {
    messages: [{
      role: "user",
      content: prompt
    }
    ],
  };
  const aiUrl = `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/ai/run/${model}`

  const response = await fetch(aiUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${env.API_TOKEN}`,
    },
    body: JSON.stringify(requestBody),
  });
  if (!response.ok) {
    console.log(JSON.stringify(await response.text(), null, 2));
    throw new Error(`LLM call failed ${aiUrl} ${response.status}`);
  }

  // process response
  const data = await response.json() as { result: { response: string }};
  const text = data.result.response || '';
  // Use the contents of a fenced block if the model added one, else the raw text
  const value = (text.match(/```(?:json)?\s*([\s\S]*?)\s*```/) || [null, text])[1];
  try {
    return JSON.parse(value);
  } catch(e) {
    console.error(`${e} . Response: ${value}`)
  }
}
```

If you want to use Browser Rendering with OpenAI instead, you only need to change the `aiUrl` endpoint and `requestBody` (or check out the [llm-scraper-worker](https://www.npmjs.com/package/llm-scraper-worker) package).
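The response handling in `getLLMResult` can also be pulled out into a small helper so that it is easier to test on its own. The following is a minimal sketch of that idea, not part of the Workers AI API: `extractJson` and `FENCE` are hypothetical names introduced here, and the helper returns `null` on unparseable output, which is a choice made for this illustration.

```ts
// Hypothetical helper (not part of Workers AI): pulls a JSON value out of model
// output that may or may not be wrapped in a Markdown code fence.
const FENCE = "`".repeat(3); // built at runtime to avoid a literal fence in this sample

function extractJson(text: string): unknown {
  // Match the contents of the first fenced block, with an optional "json" tag.
  const fenced = text.match(
    new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE),
  );
  // Fall back to the raw text when the model did not add a fence.
  const candidate = (fenced ? fenced[1] : text).trim();
  try {
    return JSON.parse(candidate);
  } catch {
    // Assumption for this sketch: signal failure with null instead of throwing.
    return null;
  }
}
```

Returning `null` lets the caller decide whether to retry the prompt, fall back to another model, or respond with an error.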
## Conclusion The full Worker script now looks as follows: ```ts import { z } from "zod"; import puppeteer from "@cloudflare/puppeteer"; import zodToJsonSchema from "zod-to-json-schema"; export default { async fetch(request, env) { const url = new URL(request.url); if (url.pathname != "/") { return new Response("Not found"); } // Your prompt and site to scrape const userPrompt = "Extract the first post only."; const targetUrl = "https://labs.apnic.net/"; // Launch browser const browser = await puppeteer.launch(env.MY_BROWSER); const page = await browser.newPage(); await page.goto(targetUrl); // Get website text const renderedText = await page.evaluate(() => { // @ts-ignore js code to run in the browser context const body = document.querySelector("body"); return body ? body.innerText : ""; }); // Close browser since we no longer need it await browser.close(); // define your desired json schema const outputSchema = zodToJsonSchema( z.object({ title: z.string(), url: z.string(), date: z.string() }) ); // Example prompt const prompt = ` You are a sophisticated web scraper. You are given the user data extraction goal and the JSON schema for the output data format. Your task is to extract the requested information from the text and output it in the specified JSON schema format: ${JSON.stringify(outputSchema)} DO NOT include anything else besides the JSON output, no markdown, no plaintext, just JSON. 
User Data Extraction Goal: ${userPrompt} Text extracted from the webpage: ${renderedText}`; // call llm const result = await getLLMResult(env, prompt, outputSchema); return Response.json(result); } } satisfies ExportedHandler<Env>; async function getLLMResult(env, prompt: string, schema?: any) { const model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq" const requestBody = { messages: [{ role: "user", content: prompt } ], }; const aiUrl = `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/ai/run/${model}` const response = await fetch(aiUrl, { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${env.API_TOKEN}`, }, body: JSON.stringify(requestBody), }); if (!response.ok) { console.log(JSON.stringify(await response.text(), null, 2)); throw new Error(`LLM call failed ${aiUrl} ${response.status}`); } // process response const data = await response.json() as { result: { response: string }}; const text = data.result.response || ''; const value = (text.match(/```(?:json)?\s*([\s\S]*?)\s*```/) || [null, text])[1]; try { return JSON.parse(value); } catch(e) { console.error(`${e} . Response: ${value}`) } } ``` You can run this script to test it using Wrangler's `--remote` flag: ```sh npx wrangler dev --remote ``` With your script now running, you can go to `http://localhost:8787/` and should see something like the following: ```json { "title": "IP Addresses in 2024", "url": "http://example.com/ip-addresses-in-2024", "date": "11 Jan 2025" } ``` For more complex websites or prompts, you might need a better model. Check out the latest models in [Workers AI](/workers-ai/models/). 
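Model output is not guaranteed to follow the requested schema, so it may be worth checking the parsed value before returning it. The sketch below uses a hand-written type guard to stay self-contained; `Post` and `isPost` are hypothetical names that mirror the `title`, `url` and `date` fields of the example schema, and in the Worker itself you could instead validate with the `zod` schema you already defined.

```ts
// Shape requested from the model in the example above.
interface Post {
  title: string;
  url: string;
  date: string;
}

// Hypothetical type guard (not from any library): true only when every
// expected field is present and is a string.
function isPost(value: unknown): value is Post {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return ["title", "url", "date"].every(
    (key) => typeof record[key] === "string",
  );
}
```

A Worker could call `isPost(result)` before `Response.json(result)` and return an error response when the check fails.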
--- # How To URL: https://developers.cloudflare.com/browser-rendering/how-to/ import { DirectoryListing } from "~/components"; <DirectoryListing /> --- # Generate PDFs Using HTML and CSS URL: https://developers.cloudflare.com/browser-rendering/how-to/pdf-generation/ import { Aside, WranglerConfig } from "~/components"; As seen in the [Getting Started guide](/browser-rendering/workers-binding-api/screenshots/), Browser Rendering can be used to generate screenshots for any given URL. Alongside screenshots, you can also generate full PDF documents for a given webpage, and can also provide the webpage markup and style ourselves. ## Prerequisites 1. Use the `create-cloudflare` CLI to generate a new Hello World Cloudflare Worker script: ```sh npm create cloudflare@latest -- browser-worker ``` 2. Install `@cloudflare/puppeteer`, which allows you to control the Browser Rendering instance: ```sh npm install @cloudflare/puppeteer --save-dev ``` 3. Add your Browser Rendering binding to your new Wrangler configuration: <WranglerConfig> ```toml title="wrangler.toml" browser = { binding = "BROWSER" } ``` </WranglerConfig> 4. Replace the contents of `src/index.ts` (or `src/index.js` for JavaScript projects) with the following skeleton script: ```ts import puppeteer from "@cloudflare/puppeteer"; const generateDocument = (name: string) => {}; export default { async fetch(request, env) { const { searchParams } = new URL(request.url); let name = searchParams.get("name"); if (!name) { return new Response("Please provide a name using the ?name= parameter"); } const browser = await puppeteer.launch(env.BROWSER); const page = await browser.newPage(); // Step 1: Define HTML and CSS const document = generateDocument(name); // Step 2: Send HTML and CSS to our browser await page.setContent(document); // Step 3: Generate and return PDF return new Response(); }, }; ``` ## 1. 
Define HTML and CSS Rather than using Browser Rendering to navigate to a user-provided URL, manually generate a webpage, then provide that webpage to the Browser Rendering instance. This allows you to render any design you want. :::note You can generate your HTML or CSS using any method you like. This example uses string interpolation, but the method is also fully compatible with web frameworks capable of rendering HTML on Workers such as React, Remix, and Vue. ::: For this example, we're going to take in user-provided content (via a '?name=' parameter), and have that name output in the final PDF document. To start, fill out your `generateDocument` function with the following: ```ts const generateDocument = (name: string) => { return ` <!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8" /> <style> html, body, #container { width: 100%; height: 100%; margin: 0; } body { font-family: Baskerville, Georgia, Times, serif; background-color: #f7f1dc; } strong { color: #5c594f; font-size: 128px; margin: 32px 0 48px 0; } em { font-size: 24px; } #container { flex-direction: column; display: flex; align-items: center; justify-content: center; text-align: center; } </style> </head> <body> <div id="container"> <em>This is to certify that</em> <strong>${name}</strong> <em>has rendered a PDF using Cloudflare Workers</em> </div> </body> </html> `; }; ``` This example HTML document should render a beige background imitating a certificate showing that the user-provided name has successfully rendered a PDF using Cloudflare Workers. :::note It is usually best to avoid directly interpolating user-provided content into an image or PDF renderer in production applications. To render contents like an invoice, it would be best to validate the data input and fetch the data yourself using tools like [D1](/d1/) or [Workers KV](/kv/). ::: ## 2. 
Load HTML and CSS Into Browser

Now that you have your fully styled HTML document, you can take the contents and send it to your browser instance. Create an empty page to store this document as follows:

```ts
const browser = await puppeteer.launch(env.BROWSER);
const page = await browser.newPage();
```

The [`page.setContent()`](https://github.com/cloudflare/puppeteer/blob/main/docs/api/puppeteer.page.setcontent.md) function can then be used to set the page's HTML contents from a string, so you can pass in your created document directly like so:

```ts
await page.setContent(document);
```

## 3. Generate and Return PDF

With your Browser Rendering instance now rendering your provided HTML and CSS, you can use the [`page.pdf()`](https://github.com/cloudflare/puppeteer/blob/main/docs/api/puppeteer.page.pdf.md) command to generate a PDF file and return it to the client.

```ts
// page.pdf() returns a promise, so await it before using the result
const pdf = await page.pdf({ printBackground: true });
```

The `page.pdf()` call supports a [number of options](https://github.com/cloudflare/puppeteer/blob/main/docs/api/puppeteer.pdfoptions.md), including setting the dimensions of the generated PDF to a specific paper size, setting specific margins, and allowing fully transparent backgrounds. For now, you are only overriding the `printBackground` option to allow your `body` background styles to show up.
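As an illustration of the other options mentioned above, the following sketch sets a paper size and margins alongside `printBackground`. The `PdfOptionsSubset` interface is a local stand-in written for this example so that it compiles without Puppeteer installed; it mirrors only a few fields of Puppeteer's `PDFOptions`, and the specific values are assumptions rather than recommendations.

```ts
// Local stand-in for a small subset of Puppeteer's PDFOptions (assumption:
// written here only so the sketch is self-contained).
interface PdfOptionsSubset {
  format?: string; // paper size, for example "A4" or "Letter"
  printBackground?: boolean; // render CSS backgrounds
  omitBackground?: boolean; // drop the default white page background
  margin?: { top?: string; right?: string; bottom?: string; left?: string };
}

// Example values only; adjust them to your own layout.
const certificateOptions: PdfOptionsSubset = {
  format: "A4",
  printBackground: true,
  margin: { top: "20mm", right: "15mm", bottom: "20mm", left: "15mm" },
};

// In the Worker, the object would be passed straight to page.pdf():
// const pdf = await page.pdf(certificateOptions);
```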
Now that you have your PDF data, return it to the client in the `Response` with an `application/pdf` content type:

```ts
return new Response(pdf, {
  headers: {
    "content-type": "application/pdf",
  },
});
```

## Conclusion

The full Worker script now looks as follows:

```ts
import puppeteer from "@cloudflare/puppeteer";

const generateDocument = (name: string) => {
  return `
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <style>
      html, body, #container {
        width: 100%;
        height: 100%;
        margin: 0;
      }
      body {
        font-family: Baskerville, Georgia, Times, serif;
        background-color: #f7f1dc;
      }
      strong {
        color: #5c594f;
        font-size: 128px;
        margin: 32px 0 48px 0;
      }
      em {
        font-size: 24px;
      }
      #container {
        flex-direction: column;
        display: flex;
        align-items: center;
        justify-content: center;
        text-align: center
      }
    </style>
  </head>

  <body>
    <div id="container">
      <em>This is to certify that</em>
      <strong>${name}</strong>
      <em>has rendered a PDF using Cloudflare Workers</em>
    </div>
  </body>
</html>
`;
};

export default {
  async fetch(request, env) {
    const { searchParams } = new URL(request.url);
    let name = searchParams.get("name");

    if (!name) {
      return new Response("Please provide a name using the ?name= parameter");
    }

    const browser = await puppeteer.launch(env.BROWSER);
    const page = await browser.newPage();

    // Step 1: Define HTML and CSS
    const document = generateDocument(name);

    // Step 2: Send HTML and CSS to our browser
    await page.setContent(document);

    // Step 3: Generate and return PDF
    const pdf = await page.pdf({ printBackground: true });

    return new Response(pdf, {
      headers: {
        "content-type": "application/pdf",
      },
    });
  },
};
```

You can run this script to test it using Wrangler's `--remote` flag:

```sh
npx wrangler@latest dev --remote
```

With your script now running, you can pass in a `?name` parameter to the local URL (such as `http://localhost:8787/?name=Harley`) and should see the generated PDF certificate showing that name.
--- Dynamically generating PDF documents solves a number of common use-cases, from invoicing customers to archiving documents to creating dynamic certificates (as seen in the simple example here). --- # Browser close reasons URL: https://developers.cloudflare.com/browser-rendering/platform/browser-close-reasons/ A browser session may close for a variety of reasons, occasionally due to connection errors or errors in the headless browser instance. As a best practice, wrap `puppeteer.connect` or `puppeteer.launch` in a [`try/catch`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/try...catch) statement. The reason that a browser closed can be found on the Browser Rendering Dashboard in the [logs tab](https://dash.cloudflare.com/?to=/:account/workers/browser-renderingl/logs). When Cloudflare begins charging for the Browser Rendering API, we will not charge when errors are due to underlying Browser Rendering infrastructure. | Reasons a session may end | | ---------------------------------------------------- | | User opens and closes browser normally. | | Browser is idle for 60 seconds. | | Chromium instance crashes. | | Error connecting with the client, server, or Worker. | | Browser session is evicted. 
| --- # Platform URL: https://developers.cloudflare.com/browser-rendering/platform/ import { DirectoryListing } from "~/components"; <DirectoryListing /> --- # Limits URL: https://developers.cloudflare.com/browser-rendering/platform/limits/ import { Render, Plan } from "~/components"; <Plan type="workers-paid" /> ## Workers Binding API | Feature | Limit | | -------------------------------- | ------------------- | | Concurrent browsers per account | 10 per account [^1] | | New browser instances per minute | 10 per minute [^1] | | Browser timeout | 60 seconds [^1][^2] | ## REST API | Feature | Limit | | -------------------------------- | ------------------- | | Concurrent browsers per account | 10 per account [^1] | | New browser instances per minute | 10 per minute [^1] | | Browser timeout | 60 seconds [^1][^2] | | Total requests per minute | 60 per minute [^1] | [^1]: Contact our team to request increases to this limit. [^2]: By default, a browser instance gets killed if it does not get any [devtools](https://chromedevtools.github.io/devtools-protocol/) command for 60 seconds, freeing one instance. Users can optionally increase this by using the `keep_alive` [option](/browser-rendering/platform/puppeteer/#keep-alive). `browser.close()` releases the browser instance. <Render file="limits-increase" product="browser-rendering" /> --- # Puppeteer URL: https://developers.cloudflare.com/browser-rendering/platform/puppeteer/ import { TabItem, Tabs } from "~/components"; [Puppeteer](https://pptr.dev/) is one of the most popular libraries that abstract the lower-level DevTools protocol from developers and provides a high-level API that you can use to easily instrument Chrome/Chromium and automate browsing sessions. Puppeteer is used for tasks like creating screenshots, crawling pages, and testing web applications. Puppeteer typically connects to a local Chrome or Chromium browser using the DevTools port. 
Refer to the [Puppeteer API documentation on the `Puppeteer.connect()` method](https://pptr.dev/api/puppeteer.puppeteer.connect) for more information. The Workers team forked a version of Puppeteer and patched it to connect to the Workers Browser Rendering API instead. After connecting, the developers can then use the full [Puppeteer API](https://github.com/cloudflare/puppeteer/blob/main/docs/api/index.md) as they would on a standard setup. Our version is open sourced and can be found in [Cloudflare's fork of Puppeteer](https://github.com/cloudflare/puppeteer). The npm can be installed from [npmjs](https://www.npmjs.com/) as [@cloudflare/puppeteer](https://www.npmjs.com/package/@cloudflare/puppeteer): ```bash npm install @cloudflare/puppeteer --save-dev ``` ## Use Puppeteer in a Worker Once the [browser binding](/browser-rendering/platform/wrangler/#bindings) is configured and the `@cloudflare/puppeteer` library is installed, Puppeteer can be used in a Worker: <Tabs> <TabItem label="JavaScript" icon="seti:javascript"> ```js import puppeteer from "@cloudflare/puppeteer"; export default { async fetch(request, env) { const browser = await puppeteer.launch(env.MYBROWSER); const page = await browser.newPage(); await page.goto("https://example.com"); const metrics = await page.metrics(); await browser.close(); return Response.json(metrics); }, }; ``` </TabItem> <TabItem label="TypeScript" icon="seti:typescript"> ```ts import puppeteer from "@cloudflare/puppeteer"; interface Env { MYBROWSER: Fetcher; } export default { async fetch(request, env): Promise<Response> { const browser = await puppeteer.launch(env.MYBROWSER); const page = await browser.newPage(); await page.goto("https://example.com"); const metrics = await page.metrics(); await browser.close(); return Response.json(metrics); }, } satisfies ExportedHandler<Env>; ``` </TabItem> </Tabs> This script [launches](https://pptr.dev/api/puppeteer.puppeteernode.launch) the `env.MYBROWSER` browser, opens a [new 
page](https://pptr.dev/api/puppeteer.browser.newpage), [goes to](https://pptr.dev/api/puppeteer.page.goto) [https://example.com/](https://example.com/), gets the page load [metrics](https://pptr.dev/api/puppeteer.page.metrics), [closes](https://pptr.dev/api/puppeteer.browser.close) the browser and prints metrics in JSON. ### Keep Alive If users omit the `browser.close()` statement, it will stay open, ready to be connected to again and [re-used](/browser-rendering/workers-binding-api/reuse-sessions/) but it will, by default, close automatically after 1 minute of inactivity. Users can optionally extend this idle time up to 10 minutes, by using the `keep_alive` option, set in milliseconds: ```js const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 600000 }); ``` Using the above, the browser will stay open for up to 10 minutes, even if inactive. ## Session management In order to facilitate browser session management, we've added new methods to `puppeteer`: ### List open sessions `puppeteer.sessions()` lists the current running sessions. It will return an output similar to this: ```json [ { "connectionId": "2a2246fa-e234-4dc1-8433-87e6cee80145", "connectionStartTime": 1711621704607, "sessionId": "478f4d7d-e943-40f6-a414-837d3736a1dc", "startTime": 1711621703708 }, { "sessionId": "565e05fb-4d2a-402b-869b-5b65b1381db7", "startTime": 1711621703808 } ] ``` Notice that the session `478f4d7d-e943-40f6-a414-837d3736a1dc` has an active worker connection (`connectionId=2a2246fa-e234-4dc1-8433-87e6cee80145`), while session `565e05fb-4d2a-402b-869b-5b65b1381db7` is free. While a connection is active, no other workers may connect to that session. ### List recent sessions `puppeteer.history()` lists recent sessions, both open and closed. It's useful to get a sense of your current usage. 
```json [ { "closeReason": 2, "closeReasonText": "BrowserIdle", "endTime": 1711621769485, "sessionId": "478f4d7d-e943-40f6-a414-837d3736a1dc", "startTime": 1711621703708 }, { "closeReason": 1, "closeReasonText": "NormalClosure", "endTime": 1711123501771, "sessionId": "2be00a21-9fb6-4bb2-9861-8cd48e40e771", "startTime": 1711123430918 } ] ``` Session `2be00a21-9fb6-4bb2-9861-8cd48e40e771` was closed explicitly with `browser.close()` by the client, while session `478f4d7d-e943-40f6-a414-837d3736a1dc` was closed due to reaching the maximum idle time (check [limits](/browser-rendering/platform/limits/)). You should also be able to access this information in the dashboard, albeit with a slight delay. ### Active limits `puppeteer.limits()` lists your active limits: ```json { "activeSessions": [ "478f4d7d-e943-40f6-a414-837d3736a1dc", "565e05fb-4d2a-402b-869b-5b65b1381db7" ], "allowedBrowserAcquisitions": 1, "maxConcurrentSessions": 2, "timeUntilNextAllowedBrowserAcquisition": 0 } ``` - `activeSessions` lists the IDs of the current open sessions - `maxConcurrentSessions` defines how many browsers can be open at the same time - `allowedBrowserAcquisitions` specifies if a new browser session can be opened according to the rate [limits](/browser-rendering/platform/limits/) in place - `timeUntilNextAllowedBrowserAcquisition` defines the waiting period before a new browser can be launched. ## Puppeteer API The full Puppeteer API can be found in the [Cloudflare's fork of Puppeteer](https://github.com/cloudflare/puppeteer/blob/main/docs/api/index.md). --- # Wrangler URL: https://developers.cloudflare.com/browser-rendering/platform/wrangler/ import { Render, WranglerConfig } from "~/components" [Wrangler](/workers/wrangler/) is a command-line tool for building with Cloudflare developer products. Use Wrangler to deploy projects that use the Workers Browser Rendering API. ## Install To install Wrangler, refer to [Install and Update Wrangler](/workers/wrangler/install-and-update/). 
## Bindings [Bindings](/workers/runtime-apis/bindings/) allow your Workers to interact with resources on the Cloudflare developer platform. A browser binding will provide your Worker with an authenticated endpoint to interact with a dedicated Chromium browser instance. To deploy a Browser Rendering Worker, you must declare a [browser binding](/workers/runtime-apis/bindings/) in your Worker's Wrangler configuration file. <Render file="nodejs-compat-howto" product="workers" /> <WranglerConfig> ```toml # Top-level configuration name = "browser-rendering" main = "src/index.ts" workers_dev = true compatibility_flags = ["nodejs_compat_v2"] browser = { binding = "MYBROWSER" } ``` </WranglerConfig> After the binding is declared, access the DevTools endpoint using `env.MYBROWSER` in your Worker code: ```javascript const browser = await puppeteer.launch(env.MYBROWSER); ``` Run [`npx wrangler dev --remote`](/workers/wrangler/commands/#dev) to test your Worker remotely before deploying to Cloudflare's global network. Local mode support does not exist for Browser Rendering so `--remote` is required. To deploy, run [`npx wrangler deploy`](/workers/wrangler/commands/#deploy). --- # Fetch HTML URL: https://developers.cloudflare.com/browser-rendering/rest-api/content-endpoint/ The `/content` endpoint instructs the browser to navigate to a website and capture the fully rendered HTML of a page, including the `head` section, after JavaScript execution. This is ideal for capturing content from JavaScript-heavy or interactive websites. ## Basic usage Go to `https://example.com` and return the rendered HTML. ```bash curl -X 'POST' 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/content' \ -H 'Content-Type: application/json' \ -H 'Authorization: Bearer <apiToken>' \ -d '{"url": "https://example.com"}' ``` ## Advanced usage Navigate to `https://cloudflare.com/` but block images and stylesheets from loading. 
Undesired requests can be blocked by resource type (`rejectResourceTypes`) or by using a regex pattern (`rejectRequestPattern`). The opposite can also be done, only allow requests that match `allowRequestPattern` or `allowResourceTypes`. ```bash curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/content' \ -H 'Authorization: Bearer <apiToken>' \ -H 'Content-Type: application/json' \ -d '{ "url": "https://cloudflare.com/", "rejectResourceTypes": ["image"], "rejectRequestPattern": ["/^.*\\.(css)"] }' ``` Many more options exist, like setting HTTP headers using `setExtraHTTPHeaders`, setting `cookies`, and using `gotoOptions` to control page load behaviour - check the endpoint [reference](/api/resources/browser_rendering/subresources/content/methods/create/) for all available parameters. --- # REST API URL: https://developers.cloudflare.com/browser-rendering/rest-api/ The REST API is a RESTful interface that provides endpoints for common browser actions such as capturing screenshots, extracting HTML content, generating PDFs, and more. The following are the available options: import { DirectoryListing } from "~/components"; <DirectoryListing /> Use the REST API when you need a fast, simple way to perform common browser tasks such as capturing screenshots, extracting HTML, or generating PDFs without writing complex scripts. If you require more advanced automation, custom workflows, or persistent browser sessions, the [Workers Binding API](/browser-rendering/workers-binding-api/) is the better choice. ## Before you begin Before you begin, make sure you [create a custom API Token](/fundamentals/api/get-started/create-token/) with the following permissions: - `Browser Rendering - Edit` --- # Render PDF URL: https://developers.cloudflare.com/browser-rendering/rest-api/pdf-endpoint/ The `/pdf` endpoint instructs the browser to render the webpage as a PDF document. 
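The REST endpoints all follow the same URL and header pattern, so a request can also be assembled from TypeScript. The sketch below is an illustration only: `buildRenderRequest` and `RenderRequest` are hypothetical names and not part of any Cloudflare SDK, and the account ID and API token are placeholders.

```ts
// Hypothetical helper (not part of any Cloudflare SDK): assembles the URL and
// fetch options for a Browser Rendering REST API call.
interface RenderRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildRenderRequest(
  accountId: string,
  apiToken: string,
  endpoint: "content" | "pdf",
  payload: Record<string, unknown>,
): RenderRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/${endpoint}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(payload),
    },
  };
}

// Placeholders stand in for real credentials.
const pdfRequest = buildRenderRequest("<accountId>", "<apiToken>", "pdf", {
  url: "https://example.com/",
});

// Sending the request is left commented out so the sketch has no side effects:
// const response = await fetch(pdfRequest.url, pdfRequest.init);
// const pdfBytes = await response.arrayBuffer();
```

Keeping request construction separate from the `fetch` call makes it straightforward to inspect or log the payload before it is sent.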
## Basic usage

Navigate to `https://example.com/` and inject custom CSS and an external stylesheet. Then return the rendered page as a PDF.

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/pdf' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "addStyleTag": [
      { "content": "body { font-family: Arial; }" },
      { "url": "https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/css/bootstrap.min.css" }
    ]
  }' \
  --output "output.pdf"
```

## Advanced usage

Navigate to `https://example.com`, first setting an additional HTTP request header and configuring the page size (`viewport`). Then, wait until there are no more than 2 network connections for at least 500 ms, or until the maximum timeout of 45000 ms is reached, before considering the page loaded and returning the rendered PDF document.

The `gotoOptions` parameter exposes most of [Puppeteer's API](https://pptr.dev/api/puppeteer.gotooptions).

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/pdf' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "setExtraHTTPHeaders": {
      "X-Custom-Header": "value"
    },
    "viewport": {
      "width": 1200,
      "height": 800
    },
    "gotoOptions": {
      "waitUntil": "networkidle2",
      "timeout": 45000
    }
  }' \
  --output "advanced-output.pdf"
```

## Blocking images and styles when generating a PDF

The options `rejectResourceTypes` and `rejectRequestPattern` can be used to block requests. The opposite can also be done: allow _only_ certain requests using `allowResourceTypes` and `allowRequestPattern`.
```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/pdf' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://cloudflare.com/",
    "rejectResourceTypes": ["image"],
    "rejectRequestPattern": ["/^.*\\.(css)"]
  }' \
  --output "cloudflare.pdf"
```

## Generate PDF from custom HTML

If you have HTML you'd like to generate a PDF from, the `html` option can be used. The option `addStyleTag` can be used to add custom styles.

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/pdf' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "html": "<html><body>Advanced Snapshot</body></html>",
    "addStyleTag": [
      { "content": "body { font-family: Arial; }" },
      { "url": "https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/css/bootstrap.min.css" }
    ]
  }' \
  --output "invoice.pdf"
```

Many more options exist, like setting HTTP credentials using `authenticate`, setting `cookies`, and using `gotoOptions` to control page load behaviour. Check the endpoint [reference](/api/resources/browser_rendering/subresources/pdf/methods/create/) for all available parameters.

---

# Scrape HTML elements

URL: https://developers.cloudflare.com/browser-rendering/rest-api/scrape-endpoint/

The `/scrape` endpoint extracts structured data from specific elements on a webpage, returning details such as element dimensions and inner HTML.

## Basic usage

Go to `https://example.com` and extract metadata from all `h1` and `a` elements in the DOM.
```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/scrape' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "elements": [{ "selector": "h1" }, { "selector": "a" }]
  }'
```

### JSON response

```json title="json response"
{
  "success": true,
  "result": [
    {
      "results": [
        {
          "attributes": [],
          "height": 39,
          "html": "Example Domain",
          "left": 100,
          "text": "Example Domain",
          "top": 133.4375,
          "width": 600
        }
      ],
      "selector": "h1"
    },
    {
      "results": [
        {
          "attributes": [
            { "name": "href", "value": "https://www.iana.org/domains/example" }
          ],
          "height": 20,
          "html": "More information...",
          "left": 100,
          "text": "More information...",
          "top": 249.875,
          "width": 142
        }
      ],
      "selector": "a"
    }
  ]
}
```

Many more options exist, like setting HTTP credentials using `authenticate`, setting `cookies`, and using `gotoOptions` to control page load behaviour. Check the endpoint [reference](/api/resources/browser_rendering/subresources/scrape/methods/create/) for all available parameters.

### Response fields

- `result` _(array of objects)_ - Contains extracted data for each selector.
  - `selector` _(string)_ - The CSS selector used.
  - `results` _(array of objects)_ - List of extracted elements matching the selector.
    - `text` _(string)_ - Inner text of the element.
    - `html` _(string)_ - Inner HTML of the element.
    - `attributes` _(array of objects)_ - List of extracted attributes such as `href` for links.
    - `height`, `width`, `top`, `left` _(number)_ - Position and dimensions of the element.

---

# Capture screenshot

URL: https://developers.cloudflare.com/browser-rendering/rest-api/screenshot-endpoint/

The `/screenshot` endpoint renders the webpage by processing its HTML and JavaScript, then captures a screenshot of the fully rendered page.

## Basic usage

Set the HTML content of the page to `Hello World!` and then take a screenshot.
The option `omitBackground` hides the default white background and allows capturing screenshots with transparency.

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/screenshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "html": "Hello World!",
    "screenshotOptions": {
      "omitBackground": true
    }
  }' \
  --output "screenshot.png"
```

For more options to control the final screenshot, like `clip`, `captureBeyondViewport`, `fullPage` and others, check the endpoint [reference](/api/resources/browser_rendering/subresources/screenshot/methods/create/).

## Advanced usage

Navigate to `https://cloudflare.com/`, changing the page size (`viewport`) and waiting until there are no active network connections (`waitUntil`) or up to a maximum of `45000ms` (`timeout`). Then take a `fullPage` screenshot.

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/screenshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://cloudflare.com/",
    "screenshotOptions": {
      "fullPage": true
    },
    "viewport": {
      "width": 1280,
      "height": 720
    },
    "gotoOptions": {
      "waitUntil": "networkidle0",
      "timeout": 45000
    }
  }' \
  --output "advanced-screenshot.png"
```

## Customize CSS and embed custom JavaScript

Instruct the browser to go to `https://example.com`, embed custom JavaScript (`addScriptTag`) and add extra styles (`addStyleTag`), both inline (`addStyleTag.content`) and by loading an external stylesheet (`addStyleTag.url`).
```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/screenshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "addScriptTag": [
      { "content": "document.querySelector(`h1`).innerText = `Hello World!!!`" }
    ],
    "addStyleTag": [
      {
        "content": "div { background: linear-gradient(45deg, #2980b9 , #82e0aa ); }"
      },
      {
        "url": "https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/css/bootstrap.min.css"
      }
    ]
  }' \
  --output "screenshot.png"
```

Many more options exist, like setting HTTP credentials using `authenticate`, setting `cookies`, and using `gotoOptions` to control page load behaviour. Check the endpoint [reference](/api/resources/browser_rendering/subresources/screenshot/methods/create/) for all available parameters.

---

# Take a webpage snapshot

URL: https://developers.cloudflare.com/browser-rendering/rest-api/snapshot/

The `/snapshot` endpoint captures both the HTML content and a screenshot of the webpage in one request. It returns the HTML as a text string and the screenshot as a Base64-encoded image.

## Basic usage

1. Go to `https://example.com/`.
2. Inject custom JavaScript.
3. Capture the rendered HTML.
4. Take a screenshot.

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/snapshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "addScriptTag": [
      { "content": "document.body.innerHTML = \"Snapshot Page\";" }
    ]
  }'
```

### JSON response

```json title="json response"
{
  "success": true,
  "result": {
    "screenshot": "Base64EncodedScreenshotString",
    "content": "<html>...</html>"
  }
}
```

## Advanced usage

The `html` property in the JSON payload sets the page content to `<html><body>Advanced Snapshot</body></html>`. The request then does the following:

1. Disables JavaScript.
2. Sets the screenshot to `fullPage`.
3. Changes the page size (`viewport`).
4. Waits up to `30000ms` or until the `DOMContentLoaded` event fires.
5. Returns the rendered HTML content and a Base64-encoded screenshot of the page.

```bash
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/snapshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "html": "<html><body>Advanced Snapshot</body></html>",
    "setJavaScriptEnabled": false,
    "screenshotOptions": {
      "fullPage": true
    },
    "viewport": {
      "width": 1200,
      "height": 800
    },
    "gotoOptions": {
      "waitUntil": "domcontentloaded",
      "timeout": 30000
    }
  }'
```

### JSON response

```json title="json response"
{
  "success": true,
  "result": {
    "screenshot": "AdvancedBase64Screenshot",
    "content": "<html><body>Advanced Snapshot</body></html>"
  }
}
```

Many more options exist, like setting HTTP credentials using `authenticate`, setting `cookies`, and using `gotoOptions` to control page load behaviour. Check the endpoint [reference](/api/resources/browser_rendering/subresources/snapshot/) for all available parameters.

---

# Deploy a Browser Rendering Worker with Durable Objects

URL: https://developers.cloudflare.com/browser-rendering/workers-binding-api/browser-rendering-with-do/

import { Render, PackageManagers, WranglerConfig } from "~/components";

By following this guide, you will create a Worker that uses the Browser Rendering API along with [Durable Objects](/durable-objects/) to take screenshots from web pages and store them in [R2](/r2/).

Using Durable Objects to persist browser sessions improves performance by eliminating the time that it takes to spin up a new browser session. Because the Durable Object reuses sessions, fewer concurrent sessions are needed.

<Render file="prereqs" product="workers" />

## 1. Create a Worker project

[Cloudflare Workers](/workers/) provides a serverless execution environment that allows you to create new applications or augment existing ones without configuring or maintaining infrastructure.
Your Worker application is a container to interact with a headless browser to do actions, such as taking screenshots.

Create a new Worker project named `browser-worker` by running:

<PackageManagers type="create" pkg="cloudflare@latest" args={"browser-worker"} />

## 2. Enable Durable Objects in the dashboard

To enable Durable Objects, you will need to purchase the Workers Paid plan:

1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/), and select your account.
2. Go to **Workers & Pages** > **Plans**.
3. Select **Purchase Workers Paid** and complete the payment process to enable Durable Objects.

## 3. Install Puppeteer

In your `browser-worker` directory, install Cloudflare’s [fork of Puppeteer](/browser-rendering/platform/puppeteer/):

```sh
npm install @cloudflare/puppeteer --save-dev
```

## 4. Create an R2 bucket

Create two R2 buckets, one for production, and one for development.

Note that bucket names can contain only lowercase letters, numbers, and hyphens.

```sh
wrangler r2 bucket create screenshots
wrangler r2 bucket create screenshots-test
```

To check that your buckets were created, run:

```sh
wrangler r2 bucket list
```

After running the `list` command, you will see all bucket names, including the ones you have just created.

## 5. Configure your Wrangler configuration file

Configure your `browser-worker` project's [Wrangler configuration file](/workers/wrangler/configuration/) by adding a browser [binding](/workers/runtime-apis/bindings/) and a [Node.js compatibility flag](/workers/configuration/compatibility-flags/#nodejs-compatibility-flag). Browser bindings allow for communication between a Worker and a headless browser which allows you to do actions such as taking a screenshot, generating a PDF and more.
Update your Wrangler configuration file with the Browser Rendering API binding, the R2 bucket you created and a Durable Object:

<WranglerConfig>

```toml
name = "rendering-api-demo"
main = "src/index.js"
compatibility_date = "2023-09-04"
compatibility_flags = [ "nodejs_compat"]
account_id = "<ACCOUNT_ID>"

# Browser Rendering API binding
browser = { binding = "MYBROWSER" }

# Bind an R2 Bucket
[[r2_buckets]]
binding = "BUCKET"
bucket_name = "screenshots"
preview_bucket_name = "screenshots-test"

# Binding to a Durable Object
[[durable_objects.bindings]]
name = "BROWSER"
class_name = "Browser"

[[migrations]]
tag = "v1" # Should be unique for each entry
new_classes = ["Browser"] # Array of new classes
```

</WranglerConfig>

## 6. Code

The code below uses a Durable Object to instantiate a browser using Puppeteer. It then opens a series of web pages with different resolutions, takes a screenshot of each, and uploads it to R2.

The Durable Object keeps a browser session open for 60 seconds after last use. If a browser session is open, any requests will re-use the existing session rather than creating a new one. Update your Worker code by copying and pasting the following:

```js
import puppeteer from "@cloudflare/puppeteer";

export default {
  async fetch(request, env) {
    let id = env.BROWSER.idFromName("browser");
    let obj = env.BROWSER.get(id);

    // Send a request to the Durable Object, then await its response.
    let resp = await obj.fetch(request.url);

    return resp;
  },
};

const KEEP_BROWSER_ALIVE_IN_SECONDS = 60;

export class Browser {
  constructor(state, env) {
    this.state = state;
    this.env = env;
    this.keptAliveInSeconds = 0;
    this.storage = this.state.storage;
  }

  async fetch(request) {
    // screen resolutions to test out
    const width = [1920, 1366, 1536, 360, 414];
    const height = [1080, 768, 864, 640, 896];

    // use the current date and time (rounded to 5 minutes) to create a folder structure for R2
    const nowDate = new Date();
    const coeff = 1000 * 60 * 5;
    const roundedDate = new Date(
      Math.round(nowDate.getTime() / coeff) * coeff,
    ).toString();
    const folder = roundedDate.split(" GMT")[0];

    // if there's a browser session open, re-use it
    if (!this.browser || !this.browser.isConnected()) {
      console.log(`Browser DO: Starting new instance`);
      try {
        this.browser = await puppeteer.launch(this.env.MYBROWSER);
      } catch (e) {
        console.log(
          `Browser DO: Could not start browser instance. Error: ${e}`,
        );
        // Without a browser there is nothing to screenshot, so stop here
        // instead of failing later on `this.browser.newPage()`.
        return new Response("Could not start browser instance", {
          status: 500,
        });
      }
    }

    // Reset keptAlive after each call to the DO
    this.keptAliveInSeconds = 0;

    const page = await this.browser.newPage();

    // take screenshots of each screen size
    for (let i = 0; i < width.length; i++) {
      await page.setViewport({ width: width[i], height: height[i] });
      await page.goto("https://workers.cloudflare.com/");
      const fileName = "screenshot_" + width[i] + "x" + height[i];
      const sc = await page.screenshot({ path: fileName + ".jpg" });

      await this.env.BUCKET.put(folder + "/" + fileName + ".jpg", sc);
    }

    // Close tab when there is no more work to be done on the page
    await page.close();

    // Reset keptAlive after performing tasks to the DO.
    this.keptAliveInSeconds = 0;

    // set the first alarm to keep DO alive
    let currentAlarm = await this.storage.getAlarm();
    if (currentAlarm == null) {
      console.log(`Browser DO: setting alarm`);
      const TEN_SECONDS = 10 * 1000;
      await this.storage.setAlarm(Date.now() + TEN_SECONDS);
    }

    return new Response("success");
  }

  async alarm() {
    this.keptAliveInSeconds += 10;

    // Extend browser DO life
    if (this.keptAliveInSeconds < KEEP_BROWSER_ALIVE_IN_SECONDS) {
      console.log(
        `Browser DO: has been kept alive for ${this.keptAliveInSeconds} seconds. Extending lifespan.`,
      );
      await this.storage.setAlarm(Date.now() + 10 * 1000);
      // You could ensure the ws connection is kept alive by requesting something
      // or just let it close automatically when there is no work to be done
      // for example, `await this.browser.version()`
    } else {
      console.log(
        `Browser DO: exceeded life of ${KEEP_BROWSER_ALIVE_IN_SECONDS}s.`,
      );
      if (this.browser) {
        console.log(`Closing browser.`);
        await this.browser.close();
      }
    }
  }
}
```

## 7. Test

Run [`npx wrangler dev --remote`](/workers/wrangler/commands/#dev) to test your Worker remotely before deploying to Cloudflare's global network. Local mode support does not exist for Browser Rendering so `--remote` is required.

## 8. Deploy

Run [`npx wrangler deploy`](/workers/wrangler/commands/#deploy) to deploy your Worker to the Cloudflare global network.

## Related resources

- Other [Puppeteer examples](https://github.com/cloudflare/puppeteer/tree/main/examples)
- Get started with [Durable Objects](/durable-objects/get-started/)
- [Using R2 from Workers](/r2/api/workers/workers-api-usage/)

---

# Workers Binding API

URL: https://developers.cloudflare.com/browser-rendering/workers-binding-api/

import { DirectoryListing } from "~/components";

The Workers Binding API allows you to execute advanced browser rendering scripts within Cloudflare Workers. It provides developers the flexibility to automate and control complex workflows and browser interactions.
The following options are available for browser rendering tasks:

<DirectoryListing />

Use the Workers Binding API when you need advanced browser automation, custom workflows, or complex interactions beyond basic rendering. For quick, one-off tasks like capturing screenshots or extracting HTML, the [REST API](/browser-rendering/rest-api/) is the simpler choice.

---

# Reuse sessions

URL: https://developers.cloudflare.com/browser-rendering/workers-binding-api/reuse-sessions/

import { Render, PackageManagers, WranglerConfig } from "~/components";

The best way to improve the performance of your browser rendering Worker is to reuse sessions. One way to do that is via [Durable Objects](/browser-rendering/workers-binding-api/browser-rendering-with-do/), which allows you to keep a long running connection from a Worker to a browser. Another way is to keep the browser open after you've finished with it, and connect to that session each time you have a new request.

In short, this entails using `browser.disconnect()` instead of `browser.close()`, and, if there are available sessions, using `puppeteer.connect(env.MYBROWSER, sessionId)` instead of launching a new browser session.

## 1. Create a Worker project

[Cloudflare Workers](/workers/) provides a serverless execution environment that allows you to create new applications or augment existing ones without configuring or maintaining infrastructure. Your Worker application is a container to interact with a headless browser to do actions, such as taking screenshots.

Create a new Worker project named `browser-worker` by running:

<PackageManagers type="create" pkg="cloudflare@latest" args={"browser-worker"} />

<Render
	file="c3-post-run-steps"
	product="workers"
	params={{
		category: "hello-world",
		type: "Hello World Worker",
		lang: "TypeScript",
	}}
/>

## 2. Install Puppeteer

In your `browser-worker` directory, install Cloudflare's [fork of Puppeteer](/browser-rendering/platform/puppeteer/):

```sh
npm install @cloudflare/puppeteer --save-dev
```

## 3. Configure the [Wrangler configuration file](/workers/wrangler/configuration/)

<WranglerConfig>

```toml
name = "browser-worker"
main = "src/index.ts"
compatibility_date = "2023-03-14"
compatibility_flags = [ "nodejs_compat" ]

browser = { binding = "MYBROWSER" }
```

</WranglerConfig>

## 4. Code

The script below starts by fetching the current running sessions. If there are any that don't already have a worker connection, it picks a random session ID and attempts to connect (`puppeteer.connect(..)`) to it. If that fails or there were no running sessions to start with, it launches a new browser session (`puppeteer.launch(..)`). Then, it goes to the website and fetches the DOM. Once that's done, it disconnects (`browser.disconnect()`), making the connection available to other workers.

Take into account that if the browser is idle, i.e. does not get any command, for more than the current [limit](/browser-rendering/platform/limits/), it will close automatically, so you must have enough requests per minute to keep it alive.

```ts
import puppeteer from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    let reqUrl = url.searchParams.get("url") || "https://example.com";
    reqUrl = new URL(reqUrl).toString(); // normalize

    // Pick random session from open sessions
    let sessionId = await this.getRandomSession(env.MYBROWSER);
    let browser, launched;
    if (sessionId) {
      try {
        browser = await puppeteer.connect(env.MYBROWSER, sessionId);
      } catch (e) {
        // another worker may have connected first
        console.log(`Failed to connect to ${sessionId}. Error ${e}`);
      }
    }
    if (!browser) {
      // No open sessions, launch new session
      browser = await puppeteer.launch(env.MYBROWSER);
      launched = true;
    }

    sessionId = browser.sessionId(); // get current session id

    // Do your work here
    const page = await browser.newPage();
    const response = await page.goto(reqUrl);
    const html = await response!.text();

    // All work done, so free connection (IMPORTANT!)
    await browser.disconnect();

    return new Response(
      `${launched ? "Launched" : "Connected to"} ${sessionId} \n-----\n` + html,
      {
        headers: {
          "content-type": "text/plain",
        },
      },
    );
  },

  // Pick random free session, or return undefined when none is free
  // Other custom logic could be used instead
  async getRandomSession(
    endpoint: puppeteer.BrowserWorker,
  ): Promise<string | undefined> {
    const sessions: puppeteer.ActiveSession[] =
      await puppeteer.sessions(endpoint);
    console.log(`Sessions: ${JSON.stringify(sessions)}`);
    const sessionsIds = sessions
      .filter((v) => {
        return !v.connectionId; // remove sessions with workers connected to them
      })
      .map((v) => {
        return v.sessionId;
      });
    if (sessionsIds.length === 0) {
      return;
    }

    const sessionId =
      sessionsIds[Math.floor(Math.random() * sessionsIds.length)];

    return sessionId!;
  },
};
```

Besides `puppeteer.sessions()`, we've added other methods to facilitate [Session Management](/browser-rendering/platform/puppeteer/#session-management).

## 5. Test

Run [`npx wrangler dev --remote`](/workers/wrangler/commands/#dev) to test your Worker remotely before deploying to Cloudflare's global network. Local mode support does not exist for Browser Rendering so `--remote` is required.

To test, go to the following URL:

`<LOCAL_HOST_URL>/?url=https://example.com`

## 6. Deploy

Run `npx wrangler deploy` to deploy your Worker to the Cloudflare global network, and then go to the following URL:

`<YOUR_WORKER>.<YOUR_SUBDOMAIN>.workers.dev/?url=https://example.com`

---

# Deploy a Browser Rendering Worker

URL: https://developers.cloudflare.com/browser-rendering/workers-binding-api/screenshots/

import { Render, TabItem, Tabs, PackageManagers, WranglerConfig } from "~/components";

By following this guide, you will create a Worker that uses the Browser Rendering API to take screenshots from web pages. This is a common use case for browser automation.

<Render file="prereqs" product="workers" />

## 1. Create a Worker project

[Cloudflare Workers](/workers/) provides a serverless execution environment that allows you to create new applications or augment existing ones without configuring or maintaining infrastructure. Your Worker application is a container to interact with a headless browser to do actions, such as taking screenshots.

Create a new Worker project named `browser-worker` by running:

<PackageManagers type="create" pkg="cloudflare@latest" args={"browser-worker"} />

<Render
	file="c3-post-run-steps"
	product="workers"
	params={{
		category: "hello-world",
		type: "Hello World Worker",
		lang: "JavaScript / TypeScript",
	}}
/>

## 2. Install Puppeteer

In your `browser-worker` directory, install Cloudflare’s [fork of Puppeteer](/browser-rendering/platform/puppeteer/):

```sh
npm install @cloudflare/puppeteer --save-dev
```

## 3. Create a KV namespace

Browser Rendering can be used with other developer products. You might need a [relational database](/d1/), an [R2 bucket](/r2/) to archive your crawled pages and assets, a [Durable Object](/durable-objects/) to keep your browser instance alive and share it with multiple requests, or [Queues](/queues/) to handle your jobs asynchronously.

For the purpose of this guide, you are going to use a [KV store](/kv/concepts/kv-namespaces/) to cache your screenshots.
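
The caching approach used here is a cache-aside lookup: normalize the URL into a key, return the stored bytes when they exist, and otherwise render once and store the result with a time-to-live. The following is a minimal sketch of that idea only. `ByteCache` and `getOrRender` are hypothetical names, `ByteCache` is a stand-in covering the `get` and `put` calls a KV binding is assumed to offer, and `render` stands for whatever produces the screenshot.

```typescript
// Stand-in for the subset of a KV namespace used here (an assumption for
// illustration; the real binding is provided by the Workers runtime).
interface ByteCache {
  get(key: string): Promise<ArrayBuffer | null>;
  put(
    key: string,
    value: ArrayBuffer,
    options?: { expirationTtl?: number },
  ): Promise<void>;
}

const ONE_DAY_IN_SECONDS = 60 * 60 * 24;

// Hypothetical helper: return cached bytes for `rawUrl`, or call `render`
// once, store its result with a TTL, and return it.
async function getOrRender(
  cache: ByteCache,
  rawUrl: string,
  render: (url: string) => Promise<ArrayBuffer>,
): Promise<ArrayBuffer> {
  // Normalize so that equivalent spellings of a URL share one cache key.
  const key = new URL(rawUrl).toString();

  const cached = await cache.get(key);
  if (cached !== null) {
    return cached; // cache hit: no browser session needed
  }

  const fresh = await render(key); // cache miss: do the expensive work once
  await cache.put(key, fresh, { expirationTtl: ONE_DAY_IN_SECONDS });
  return fresh;
}
```

Because the key is the normalized URL, `https://example.com` and `https://example.com/` resolve to the same entry, so the second request does not trigger another render until the entry expires.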
Create two namespaces, one for production, and one for development.

```sh
npx wrangler kv namespace create BROWSER_KV_DEMO
npx wrangler kv namespace create BROWSER_KV_DEMO --preview
```

Take note of the IDs for the next step.

## 4. Configure the Wrangler configuration file

Configure your `browser-worker` project's [Wrangler configuration file](/workers/wrangler/configuration/) by adding a browser [binding](/workers/runtime-apis/bindings/) and a [Node.js compatibility flag](/workers/configuration/compatibility-flags/#nodejs-compatibility-flag). Bindings allow your Workers to interact with resources on the Cloudflare developer platform. You choose the name of your browser `binding`; this guide uses the name `MYBROWSER`. Browser bindings allow for communication between a Worker and a headless browser which allows you to do actions such as taking a screenshot, generating a PDF and more.

Update your [Wrangler configuration file](/workers/wrangler/configuration/) with the Browser Rendering API binding and the KV namespaces you created:

<WranglerConfig>

```toml title="wrangler.toml"
name = "browser-worker"
main = "src/index.js"
compatibility_date = "2023-03-14"
compatibility_flags = [ "nodejs_compat" ]

browser = { binding = "MYBROWSER" }
kv_namespaces = [
  { binding = "BROWSER_KV_DEMO", id = "22cf855786094a88a6906f8edac425cd", preview_id = "e1f8b68b68d24381b57071445f96e623" }
]
```

</WranglerConfig>

## 5. Code

<Tabs> <TabItem label="JavaScript" icon="seti:javascript">

Update `src/index.js` with your Worker code:

```js
import puppeteer from "@cloudflare/puppeteer";

export default {
  async fetch(request, env) {
    const { searchParams } = new URL(request.url);
    let url = searchParams.get("url");
    let img;
    if (url) {
      url = new URL(url).toString(); // normalize
      // Look for a cached screenshot first
      img = await env.BROWSER_KV_DEMO.get(url, { type: "arrayBuffer" });
      if (img === null) {
        const browser = await puppeteer.launch(env.MYBROWSER);
        const page = await browser.newPage();
        await page.goto(url);
        img = await page.screenshot();
        // Cache the screenshot for one day
        await env.BROWSER_KV_DEMO.put(url, img, {
          expirationTtl: 60 * 60 * 24,
        });
        await browser.close();
      }
      return new Response(img, {
        headers: {
          "content-type": "image/jpeg",
        },
      });
    } else {
      return new Response("Please add an ?url=https://example.com/ parameter");
    }
  },
};
```

</TabItem> <TabItem label="TypeScript" icon="seti:typescript">

Update `src/index.ts` with your Worker code:

```ts
import puppeteer from "@cloudflare/puppeteer";

interface Env {
  MYBROWSER: Fetcher;
  BROWSER_KV_DEMO: KVNamespace;
}

export default {
  async fetch(request, env): Promise<Response> {
    const { searchParams } = new URL(request.url);
    let url = searchParams.get("url");
    // KV returns an ArrayBuffer (or null on a miss); Puppeteer returns a Buffer
    let img: ArrayBuffer | Buffer | null;
    if (url) {
      url = new URL(url).toString(); // normalize
      // Look for a cached screenshot first
      img = await env.BROWSER_KV_DEMO.get(url, { type: "arrayBuffer" });
      if (img === null) {
        const browser = await puppeteer.launch(env.MYBROWSER);
        const page = await browser.newPage();
        await page.goto(url);
        img = (await page.screenshot()) as Buffer;
        // Cache the screenshot for one day
        await env.BROWSER_KV_DEMO.put(url, img, {
          expirationTtl: 60 * 60 * 24,
        });
        await browser.close();
      }
      return new Response(img, {
        headers: {
          "content-type": "image/jpeg",
        },
      });
    } else {
      return new Response("Please add an ?url=https://example.com/ parameter");
    }
  },
} satisfies ExportedHandler<Env>;
```

</TabItem> </Tabs>

This Worker instantiates a browser using Puppeteer, opens a new page, navigates to what you put in the `"url"` parameter, takes a screenshot of the page, stores the screenshot in KV, closes the browser, and responds with the JPEG image of the screenshot.

If your Worker is running in production, it will store the screenshot to the production KV namespace. If you are running `wrangler dev`, it will store the screenshot to the dev KV namespace.

If the same `"url"` is requested again, it will use the cached version in KV instead, unless it has expired.

## 6. Test

Run [`npx wrangler dev --remote`](/workers/wrangler/commands/#dev) to test your Worker remotely before deploying to Cloudflare's global network. Local mode support does not exist for Browser Rendering so `--remote` is required.

To test taking your first screenshot, go to the following URL:

`<LOCAL_HOST_URL>/?url=https://example.com`

## 7. Deploy

Run `npx wrangler deploy` to deploy your Worker to the Cloudflare global network.

To take your first screenshot, go to the following URL:

`<YOUR_WORKER>.<YOUR_SUBDOMAIN>.workers.dev/?url=https://example.com`

## Related resources

- Other [Puppeteer examples](https://github.com/cloudflare/puppeteer/tree/main/examples)

---