
Crawl websites.

POST/accounts/{account_id}/browser-rendering/crawl

Starts a crawl job for the provided URL and its child pages. Use options such as gotoOptions and the waitFor* parameters to control page-load behaviour.
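
A hedged sketch of a request body using those options, in Python. Field values here are illustrative and the account ID and token handling are omitted; the parameter names match the body parameters documented below.

```python
import json

# Illustrative crawl request body: gotoOptions and waitForSelector control
# page-load behaviour; formats and limit bound the output and crawl size.
body = {
    "url": "https://example.com",
    "gotoOptions": {"waitUntil": "networkidle0", "timeout": 45000},
    "waitForSelector": {"selector": "main", "timeout": 10000},
    "formats": ["markdown"],
    "limit": 50,
}
payload = json.dumps(body)  # send as the JSON body of the POST request
```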

Security
API Token

The preferred authorization scheme for interacting with the Cloudflare API. Create a token.

Example: Authorization: Bearer Sn3lZJTBX6kkg7OdcBUAxOO963GEIyGQqnFTOFYY
API Email + API Key

The previous authorization scheme for interacting with the Cloudflare API, used in conjunction with a Global API key. When possible, use API tokens instead of Global API keys.

Example: X-Auth-Email: user@example.com
Example: X-Auth-Key: 144c9defac04969c7bfad8efaa8ea194
Accepted Permissions (at least one required)
Browser Rendering Write
Path Parameters
account_id: string

Account ID.

Query Parameters
cacheTTL: optional number

Cache TTL in seconds; the default is 5. Set to 0 to disable caching.

maximum: 86400
Body Parameters (JSON)
body: object { url, actionTimeout, addScriptTag, 25 more } or object { render, url, crawlPurposes, 8 more }
One of the following:
object { url, actionTimeout, addScriptTag, 25 more }
url: string

URL to navigate to, e.g. https://example.com.

format: uri
actionTimeout: optional number

The maximum duration allowed for the browser action to complete after the page has loaded (such as taking screenshots, extracting content, or generating PDFs). If this time limit is exceeded, the action stops and returns a timeout error.

maximum: 120000
addScriptTag: optional array of object { id, content, type, url }

Adds a <script> tag into the page with the desired URL or content.

id: optional string
content: optional string
type: optional string
url: optional string
addStyleTag: optional array of object { content, url }

Adds a <link rel="stylesheet"> tag into the page with the desired URL or a <style type="text/css"> tag with the content.

content: optional string
url: optional string
allowRequestPattern: optional array of string

Only allow requests that match the provided regex patterns, e.g. '/^.*.(css)'.

allowResourceTypes: optional array of "document" or "stylesheet" or "image" or 15 more

Only allow requests that match the provided resource types, e.g. 'image' or 'script'.

One of the following:
"document"
"stylesheet"
"image"
"media"
"font"
"script"
"texttrack"
"xhr"
"fetch"
"prefetch"
"eventsource"
"websocket"
"manifest"
"signedexchange"
"ping"
"cspviolationreport"
"preflight"
"other"
authenticate: optional object { password, username }

Provide credentials for HTTP authentication.

password: string
minLength: 1
username: string
minLength: 1
bestAttempt: optional boolean

Attempt to proceed when 'awaited' events fail or time out.

cookies: optional array of object { name, value, domain, 11 more }

Cookies to set on the page. Fields mirror Puppeteer's CookieParam options.

name: string

Cookie name.

value: string
domain: optional string
expires: optional number
httpOnly: optional boolean
partitionKey: optional string
path: optional string
priority: optional "Low" or "Medium" or "High"
One of the following:
"Low"
"Medium"
"High"
sameParty: optional boolean
sameSite: optional "Strict" or "Lax" or "None"
One of the following:
"Strict"
"Lax"
"None"
secure: optional boolean
sourcePort: optional number
sourceScheme: optional "Unset" or "NonSecure" or "Secure"
One of the following:
"Unset"
"NonSecure"
"Secure"
url: optional string
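
A hedged sketch of the cookies array in Python. Only name and value are required; the remaining fields are the optional ones listed above, with placeholder values.

```python
# Illustrative cookie entry for the crawl request body; value and domain
# are placeholders, not real credentials.
session_cookie = {
    "name": "session",
    "value": "abc123",
    "domain": "example.com",
    "path": "/",
    "secure": True,
    "sameSite": "Lax",  # one of "Strict", "Lax", "None"
}
body = {"url": "https://example.com", "cookies": [session_cookie]}
```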
crawlPurposes: optional array of "search" or "ai-input" or "ai-train"

List of crawl purposes to respect Content-Signal directives in robots.txt. Allowed values: 'search', 'ai-input', 'ai-train'. Learn more: https://contentsignals.org/. Default: ['search', 'ai-input', 'ai-train'].

One of the following:
"search"
"ai-input"
"ai-train"
depth: optional number

Maximum number of levels deep the crawler will traverse from the starting URL.

maximum: 100000
minimum: 1
emulateMediaType: optional string
formats: optional array of "html" or "markdown" or "json"

Formats to return. Default is html.

One of the following:
"html"
"markdown"
"json"
gotoOptions: optional object { referer, referrerPolicy, timeout, waitUntil }

Options applied when navigating to the URL, mirroring Puppeteer's page.goto options.

referer: optional string
referrerPolicy: optional string
timeout: optional number
maximum: 60000
waitUntil: optional "load" or "domcontentloaded" or "networkidle0" or "networkidle2", or an array of those values
One of the following:
"load"
"domcontentloaded"
"networkidle0"
"networkidle2"
jsonOptions: optional object { custom_ai, prompt, response_format }

Options for JSON extraction.

custom_ai: optional array of object { authorization, model }

Optional list of custom AI models to use for the request. Models are tried in the order provided; if one returns an error, the next is used as a fallback.

authorization: string

Authorization token for the AI model: Bearer <token>.

model: string

AI model to use for the request. Must be formed as <provider>/<model_name>, e.g. workers-ai/@cf/meta/llama-3.3-70b-instruct-fp8-fast.

prompt: optional string
response_format: optional object { type, json_schema }
type: string
json_schema: optional map[string or number or boolean or 2 more]

Schema for the response format. More information here: https://developers.cloudflare.com/workers-ai/json-mode/.

One of the following:
string
number
boolean
unknown
array of string
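
A hedged sketch of jsonOptions for structured extraction, in Python. The response_format `type` value and the schema layout are assumptions following the Workers AI JSON mode docs linked above; the property names are illustrative.

```python
# Illustrative jsonOptions: prompt plus a response schema. The "json_schema"
# type string and schema shape follow Workers AI JSON mode (assumption).
json_options = {
    "prompt": "Extract the page title and a one-sentence summary.",
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "summary": {"type": "string"},
            },
        },
    },
}
body = {"url": "https://example.com", "formats": ["json"], "jsonOptions": json_options}
```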
limit: optional number

Maximum number of URLs to crawl.

maximum: 100000
minimum: 1
maxAge: optional number

Maximum age of a resource that can be returned from cache in seconds. Default is 1 day.

maximum: 604800
minimum: 0
modifiedSince: optional number

Unix timestamp (seconds since epoch) indicating to only crawl pages that were modified since this time. For sitemap URLs with a lastmod field, this is compared directly. For other URLs, the crawler will use If-Modified-Since header when fetching. URLs without modification information (no lastmod in sitemap and no Last-Modified header support) will be crawled. Note: This works in conjunction with maxAge - both filters must pass for a cached resource to be used. Must be within the last year and not in the future.

minimum: 0 (exclusive)
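
A hedged sketch of computing a valid modifiedSince value in Python: a Unix timestamp in seconds, within the last year and not in the future. "Seven days ago" is an illustrative choice.

```python
import time

# modifiedSince is seconds since the epoch; compute "7 days ago" so only
# pages modified in the last week are crawled.
SEVEN_DAYS = 7 * 24 * 60 * 60
modified_since = int(time.time()) - SEVEN_DAYS
body = {"url": "https://example.com", "modifiedSince": modified_since}
```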
options: optional object { excludePatterns, includeExternalLinks, includePatterns, includeSubdomains }

Additional options for the crawler.

excludePatterns: optional array of string

Exclude links matching the provided wildcard patterns in the crawl job. Example: 'https://example.com/privacy/**'.

includePatterns: optional array of string

Include only links matching the provided wildcard patterns in the crawl job. Include patterns are evaluated before exclude patterns. URLs that match any of the specified include patterns will be included in the crawl job. Example: 'https://example.com/blog/**'.

includeSubdomains: optional boolean

Include links to subdomains in the crawl job. This option is ignored if includeExternalLinks is true.
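
A hedged sketch of the crawler options object in Python. Since include patterns are evaluated before exclude patterns, this crawls the blog while skipping its archive; the URLs are illustrative.

```python
# Illustrative crawler options: include the blog, but exclude its archive.
# Include patterns are evaluated before exclude patterns.
crawler_options = {
    "includePatterns": ["https://example.com/blog/**"],
    "excludePatterns": ["https://example.com/blog/archive/**"],
    "includeSubdomains": False,
}
body = {"url": "https://example.com", "options": crawler_options}
```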

rejectRequestPattern: optional array of string

Block undesired requests that match the provided regex patterns, e.g. '/^.*.(css)'.

rejectResourceTypes: optional array of "document" or "stylesheet" or "image" or 15 more

Block undesired requests that match the provided resource types, e.g. 'image' or 'script'.

One of the following:
"document"
"stylesheet"
"image"
"media"
"font"
"script"
"texttrack"
"xhr"
"fetch"
"prefetch"
"eventsource"
"websocket"
"manifest"
"signedexchange"
"ping"
"cspviolationreport"
"preflight"
"other"
render: optional true

Whether to render the page or fetch static content. True by default.

setExtraHTTPHeaders: optional map[string]
setJavaScriptEnabled: optional boolean
source: optional "sitemaps" or "links" or "all"

Source of links to crawl. 'sitemaps' - only crawl URLs from sitemaps, 'links' - only crawl URLs scraped from pages, 'all' - crawl both sitemap and scraped links (default).

One of the following:
"sitemaps"
"links"
"all"
viewport: optional object { height, width, deviceScaleFactor, 3 more }

Viewport dimensions and device settings, mirroring Puppeteer's Viewport options.

height: number
width: number
deviceScaleFactor: optional number
hasTouch: optional boolean
isLandscape: optional boolean
isMobile: optional boolean
waitForSelector: optional object { selector, hidden, timeout, visible }

Wait for the selector to appear on the page before continuing.

selector: string
hidden: optional true
timeout: optional number
maximum: 120000
visible: optional true
waitForTimeout: optional number

Waits for a specified timeout before continuing.

maximum: 120000
object { render, url, crawlPurposes, 8 more }
render: false

Whether to render the page or fetch static content. True by default.

url: string

URL to navigate to, e.g. https://example.com.

format: uri
crawlPurposes: optional array of "search" or "ai-input" or "ai-train"

List of crawl purposes to respect Content-Signal directives in robots.txt. Allowed values: 'search', 'ai-input', 'ai-train'. Learn more: https://contentsignals.org/. Default: ['search', 'ai-input', 'ai-train'].

One of the following:
"search"
"ai-input"
"ai-train"
depth: optional number

Maximum number of levels deep the crawler will traverse from the starting URL.

maximum: 100000
minimum: 1
formats: optional array of "html" or "markdown" or "json"

Formats to return. Default is html.

One of the following:
"html"
"markdown"
"json"
jsonOptions: optional object { custom_ai, prompt, response_format }

Options for JSON extraction.

custom_ai: optional array of object { authorization, model }

Optional list of custom AI models to use for the request. Models are tried in the order provided; if one returns an error, the next is used as a fallback.

authorization: string

Authorization token for the AI model: Bearer <token>.

model: string

AI model to use for the request. Must be formed as <provider>/<model_name>, e.g. workers-ai/@cf/meta/llama-3.3-70b-instruct-fp8-fast.

prompt: optional string
response_format: optional object { type, json_schema }
type: string
json_schema: optional map[string or number or boolean or 2 more]

Schema for the response format. More information here: https://developers.cloudflare.com/workers-ai/json-mode/.

One of the following:
string
number
boolean
unknown
array of string
limit: optional number

Maximum number of URLs to crawl.

maximum: 100000
minimum: 1
maxAge: optional number

Maximum age of a resource that can be returned from cache in seconds. Default is 1 day.

maximum: 604800
minimum: 0
modifiedSince: optional number

Unix timestamp (seconds since epoch) indicating to only crawl pages that were modified since this time. For sitemap URLs with a lastmod field, this is compared directly. For other URLs, the crawler will use If-Modified-Since header when fetching. URLs without modification information (no lastmod in sitemap and no Last-Modified header support) will be crawled. Note: This works in conjunction with maxAge - both filters must pass for a cached resource to be used. Must be within the last year and not in the future.

minimum: 0 (exclusive)
options: optional object { excludePatterns, includeExternalLinks, includePatterns, includeSubdomains }

Additional options for the crawler.

excludePatterns: optional array of string

Exclude links matching the provided wildcard patterns in the crawl job. Example: 'https://example.com/privacy/**'.

includePatterns: optional array of string

Include only links matching the provided wildcard patterns in the crawl job. Include patterns are evaluated before exclude patterns. URLs that match any of the specified include patterns will be included in the crawl job. Example: 'https://example.com/blog/**'.

includeSubdomains: optional boolean

Include links to subdomains in the crawl job. This option is ignored if includeExternalLinks is true.

source: optional "sitemaps" or "links" or "all"

Source of links to crawl. 'sitemaps' - only crawl URLs from sitemaps, 'links' - only crawl URLs scraped from pages, 'all' - crawl both sitemap and scraped links (default).

One of the following:
"sitemaps"
"links"
"all"
Returns
result: string

Crawl job ID.

success: boolean

Response status.

errors: optional array of object { code, message }
code: number

Error code.

message: string

Error message.

Example request

curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/browser-rendering/crawl \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
    -d '{
          "url": "https://example.com"
        }'
{
  "result": "result",
  "success": true,
  "errors": [
    {
      "code": 0,
      "message": "message"
    }
  ]
}
{
  "errors": [
    {
      "code": 2001,
      "message": "Rate limit exceeded"
    }
  ],
  "success": false
}
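A hedged sketch of handling the response in Python: on success, `result` carries the crawl job ID; on failure, `errors` lists code/message pairs. The `raw` string stands in for an HTTP response body like the examples above.

```python
import json

# Illustrative response handling; `raw` is a stand-in for the HTTP body.
raw = '{"result": "result", "success": true, "errors": []}'
resp = json.loads(raw)
if resp["success"]:
    job_id = resp["result"]  # the crawl job ID
else:
    for err in resp.get("errors", []):
        print(f"error {err['code']}: {err['message']}")
```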