Crawl websites.
Starts a crawl job for the provided URL and its child pages. Use options such as gotoOptions and the waitFor* parameters to control page-load behaviour.
Security

API Token
The preferred authorization scheme for interacting with the Cloudflare API. Create a token.
Authorization: Bearer Sn3lZJTBX6kkg7OdcBUAxOO963GEIyGQqnFTOFYY

Email + API Key
The previous authorization scheme for interacting with the Cloudflare API, used in conjunction with a Global API key. When possible, use API tokens instead of Global API keys.
X-Auth-Email: user@example.com
X-Auth-Key: 144c9defac04969c7bfad8efaa8ea194

Accepted Permissions (at least one required)
Browser Rendering Write

Query Parameters

Body Parameters (JSON)
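As a sketch, the two authorization schemes above translate into the following request headers. The token and key values are the documentation's placeholders, not real credentials:

```python
# Placeholder credentials from the examples above -- substitute real values.
API_TOKEN = "Sn3lZJTBX6kkg7OdcBUAxOO963GEIyGQqnFTOFYY"

# Preferred scheme: API Token.
token_headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json",
}

# Legacy scheme: Email + Global API Key.
key_headers = {
    "X-Auth-Email": "user@example.com",
    "X-Auth-Key": "144c9defac04969c7bfad8efaa8ea194",
    "Content-Type": "application/json",
}
```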
body: object { url, actionTimeout, addScriptTag, 25 more } or object { render, url, crawlPurposes, 8 more }
object { url, actionTimeout, addScriptTag, 25 more }
The maximum duration allowed for the browser action to complete after the page has loaded (such as taking screenshots, extracting content, or generating PDFs). If this time limit is exceeded, the action stops and returns a timeout error.
addScriptTag: optional array of object { id, content, type, url } Adds a <script> tag into the page with the desired URL or content.
addStyleTag: optional array of object { content, url } Adds a <link rel="stylesheet"> tag into the page with the desired URL or a <style type="text/css"> tag with the content.
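For illustration, a request body that injects both a script and a stylesheet might look like the sketch below. The script URL and style content are hypothetical:

```python
import json

body = {
    "url": "https://example.com",
    # Inject a script by URL and a stylesheet by inline content.
    "addScriptTag": [{"url": "https://example.com/extra.js"}],
    "addStyleTag": [{"content": "body { background: #fff; }"}],
}
payload = json.dumps(body)
```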
Only allow requests that match the provided regex patterns, e.g. '/^.*\.(css)'.
allowResourceTypes: optional array of "document" or "stylesheet" or "image" or 15 more
Only allow requests that match the provided resource types, e.g. 'image' or 'script'.
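A body restricting the crawl to a few resource types could be sketched as follows; the chosen values ("document", "script") come from the enumeration and example above:

```python
import json

body = {
    "url": "https://example.com",
    # Only "document" and "script" resources are fetched; everything else is skipped.
    "allowResourceTypes": ["document", "script"],
}
payload = json.dumps(body)
```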
cookies: optional array of object { name, value, domain, 11 more } Check options.
crawlPurposes: optional array of "search" or "ai-input" or "ai-train"
List of crawl purposes to respect Content-Signal directives in robots.txt. Allowed values: 'search', 'ai-input', 'ai-train'. Learn more: https://contentsignals.org/. Default: ['search', 'ai-input', 'ai-train'].
Maximum number of levels deep the crawler will traverse from the starting URL.
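As a sketch, a job that only honours Content-Signal directives for search indexing would narrow crawlPurposes from its default of all three values:

```python
import json

body = {
    "url": "https://example.com",
    # Default is ["search", "ai-input", "ai-train"]; restrict to search indexing.
    "crawlPurposes": ["search"],
}
payload = json.dumps(body)
```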
gotoOptions: optional object { referer, referrerPolicy, timeout, waitUntil } Check options.
jsonOptions: optional object { custom_ai, prompt, response_format } Options for JSON extraction.
custom_ai: optional array of object { authorization, model } Optional list of custom AI models to use for the request. The models will be tried in the order provided, and in case a model returns an error, the next one will be used as fallback.
response_format: optional object { type, json_schema }
json_schema: optional map[string or number or boolean or 2 more]
Schema for the response format. More information here: https://developers.cloudflare.com/workers-ai/json-mode/.
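A minimal jsonOptions sketch combining a prompt with a response schema might look like this. The prompt text and the schema fields ("title", "summary") are made up for illustration, and the exact response_format contract is defined by the Workers AI JSON mode documentation linked above:

```python
import json

body = {
    "url": "https://example.com",
    "jsonOptions": {
        # Hypothetical extraction prompt.
        "prompt": "Extract the page title and a one-sentence summary.",
        "response_format": {
            "type": "json_schema",
            # Hypothetical schema; see the Workers AI JSON mode docs for details.
            "json_schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "summary": {"type": "string"},
                },
            },
        },
    },
}
payload = json.dumps(body)
```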
Maximum age of a resource that can be returned from cache in seconds. Default is 1 day.
Unix timestamp (seconds since epoch); only pages modified since this time are crawled. For sitemap URLs with a lastmod field, this is compared directly. For other URLs, the crawler sends an If-Modified-Since header when fetching. URLs without modification information (no lastmod in the sitemap and no Last-Modified header support) will still be crawled. Note: this works in conjunction with maxAge; both filters must pass for a cached resource to be used. Must be within the last year and not in the future.
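The timestamp constraint above (within the last year, not in the future) can be checked with a short sketch; here a value 30 days in the past is used as an example:

```python
import time

# A "modified since" timestamp 30 days in the past -- within the last year
# and not in the future, as the parameter requires.
now = int(time.time())
since = now - 30 * 24 * 60 * 60
one_year = 365 * 24 * 60 * 60
assert now - one_year <= since <= now
```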
options: optional object { excludePatterns, includeExternalLinks, includePatterns, includeSubdomains } Additional options for the crawler.
Exclude links matching the provided wildcard patterns in the crawl job. Example: 'https://example.com/privacy/**'.
Include external links in the crawl job. If set to true, includeSubdomains is ignored.
Include only links matching the provided wildcard patterns in the crawl job. Include patterns are evaluated before exclude patterns. URLs that match any of the specified include patterns will be included in the crawl job. Example: 'https://example.com/blog/**'.
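The include/exclude ordering described above (include patterns are evaluated first) can be sketched with a body like the following; the blog and drafts URLs are hypothetical:

```python
import json

body = {
    "url": "https://example.com",
    "options": {
        # Include patterns are evaluated before exclude patterns.
        "includePatterns": ["https://example.com/blog/**"],
        "excludePatterns": ["https://example.com/blog/drafts/**"],
        "includeSubdomains": False,
    },
}
payload = json.dumps(body)
```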
Block undesired requests that match the provided regex patterns, e.g. '/^.*\.(css)'.
rejectResourceTypes: optional array of "document" or "stylesheet" or "image" or 15 more
Block undesired requests that match the provided resource types, e.g. 'image' or 'script'.
source: optional "sitemaps" or "links" or "all"
Source of links to crawl. 'sitemaps' - only crawl URLs from sitemaps, 'links' - only crawl URLs scraped from pages, 'all' - crawl both sitemap and scraped links (default).
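Combining the two parameters above, a sitemap-only crawl that also skips stylesheet and image requests could be sketched as:

```python
import json

body = {
    "url": "https://example.com",
    # Crawl only URLs discovered via sitemaps; skip stylesheet/image requests.
    "source": "sitemaps",
    "rejectResourceTypes": ["stylesheet", "image"],
}
payload = json.dumps(body)
```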
viewport: optional object { height, width, deviceScaleFactor, 3 more } Check options.
waitForSelector: optional object { selector, hidden, timeout, visible } Wait for the selector to appear in page. Check options.
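A viewport plus waitForSelector sketch is shown below; the selector "#content" and the viewport dimensions are illustrative, not defaults:

```python
import json

body = {
    "url": "https://example.com",
    "viewport": {"width": 1280, "height": 720},
    # Wait for a (hypothetical) main-content selector to become visible
    # before the crawler acts on the page.
    "waitForSelector": {"selector": "#content", "visible": True},
}
payload = json.dumps(body)
```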
object { render, url, crawlPurposes, 8 more }
crawlPurposes: optional array of "search" or "ai-input" or "ai-train"
List of crawl purposes to respect Content-Signal directives in robots.txt. Allowed values: 'search', 'ai-input', 'ai-train'. Learn more: https://contentsignals.org/. Default: ['search', 'ai-input', 'ai-train'].
Maximum number of levels deep the crawler will traverse from the starting URL.
jsonOptions: optional object { custom_ai, prompt, response_format } Options for JSON extraction.
custom_ai: optional array of object { authorization, model } Optional list of custom AI models to use for the request. The models will be tried in the order provided, and in case a model returns an error, the next one will be used as fallback.
response_format: optional object { type, json_schema }
json_schema: optional map[string or number or boolean or 2 more]
Schema for the response format. More information here: https://developers.cloudflare.com/workers-ai/json-mode/.
Maximum age of a resource that can be returned from cache in seconds. Default is 1 day.
Unix timestamp (seconds since epoch); only pages modified since this time are crawled. For sitemap URLs with a lastmod field, this is compared directly. For other URLs, the crawler sends an If-Modified-Since header when fetching. URLs without modification information (no lastmod in the sitemap and no Last-Modified header support) will still be crawled. Note: this works in conjunction with maxAge; both filters must pass for a cached resource to be used. Must be within the last year and not in the future.
options: optional object { excludePatterns, includeExternalLinks, includePatterns, includeSubdomains } Additional options for the crawler.
Exclude links matching the provided wildcard patterns in the crawl job. Example: 'https://example.com/privacy/**'.
Include external links in the crawl job. If set to true, includeSubdomains is ignored.
Include only links matching the provided wildcard patterns in the crawl job. Include patterns are evaluated before exclude patterns. URLs that match any of the specified include patterns will be included in the crawl job. Example: 'https://example.com/blog/**'.
curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/browser-rendering/crawl \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -d '{
    "url": "https://example.com"
  }'

Returns Examples
{
  "result": "result",
  "success": true,
  "errors": [
    {
      "code": 0,
      "message": "message"
    }
  ]
}

{
  "errors": [
    {
      "code": 2001,
      "message": "Rate limit exceeded"
    }
  ],
  "success": false
}
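The two response shapes above can be distinguished by the success flag; a minimal handling sketch using the sample success body:

```python
import json

# The sample success response from the Returns Examples above.
sample = '{"result": "result", "success": true, "errors": [{"code": 0, "message": "message"}]}'
resp = json.loads(sample)

if resp["success"]:
    outcome = resp["result"]
else:
    # On failure, report each error, e.g. code 2001: "Rate limit exceeded".
    outcome = "; ".join(f'{e["code"]}: {e["message"]}' for e in resp["errors"])
```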