
Crawl websites.

browser_rendering.crawl.create(**kwargs: CrawlCreateParams) -> CrawlCreateResponse
POST /accounts/{account_id}/browser-rendering/crawl

Starts a crawl job for the provided URL and its children. Check available options such as gotoOptions and waitFor* to control page-load behaviour.

Security
API Token

The preferred authorization scheme for interacting with the Cloudflare API. Create a token.

Example: Authorization: Bearer Sn3lZJTBX6kkg7OdcBUAxOO963GEIyGQqnFTOFYY
API Email + API Key

The previous authorization scheme for interacting with the Cloudflare API: the account email used in conjunction with a Global API key. When possible, use API tokens instead of Global API keys.

Example: X-Auth-Email: user@example.com

Example: X-Auth-Key: 144c9defac04969c7bfad8efaa8ea194
Accepted Permissions (at least one required)
Browser Rendering Write
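The two schemes above map to raw request headers as follows; a minimal sketch using the placeholder credentials from the examples above:

```python
# Raw request headers for each authentication scheme.
# The token and key values are the placeholders shown above, not real credentials.
token_headers = {
    "Authorization": "Bearer Sn3lZJTBX6kkg7OdcBUAxOO963GEIyGQqnFTOFYY",
}
legacy_headers = {
    "X-Auth-Email": "user@example.com",
    "X-Auth-Key": "144c9defac04969c7bfad8efaa8ea194",
}
```

The Python client below handles these headers for you; you only supply the token (or email and key) when constructing the client.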
Parameters
account_id: str

Account ID.

url: str

URL to navigate to, e.g. https://example.com.

Format: uri
cache_ttl: Optional[float]

Cache TTL in seconds. Default is 5; set to 0 to disable caching.

Maximum: 86400
action_timeout: Optional[float]

The maximum duration allowed for the browser action to complete after the page has loaded (such as taking screenshots, extracting content, or generating PDFs). If this time limit is exceeded, the action stops and returns a timeout error.

Maximum: 120000
add_script_tag: Optional[Iterable[Variant0AddScriptTag]]

Adds a <script> tag into the page with the desired URL or content.

id: Optional[str]
content: Optional[str]
type: Optional[str]
url: Optional[str]
add_style_tag: Optional[Iterable[Variant0AddStyleTag]]

Adds a <link rel="stylesheet"> tag into the page with the desired URL or a <style type="text/css"> tag with the content.

content: Optional[str]
url: Optional[str]
allow_request_pattern: Optional[SequenceNotStr[str]]

Only allow requests that match the provided regex patterns, e.g. '/^.*\.(css)'.

allow_resource_types: Optional[List[Literal["document", "stylesheet", "image", 15 more]]]

Only allow requests that match the provided resource types, e.g. 'image' or 'script'.

One of the following:
"document"
"stylesheet"
"image"
"media"
"font"
"script"
"texttrack"
"xhr"
"fetch"
"prefetch"
"eventsource"
"websocket"
"manifest"
"signedexchange"
"ping"
"cspviolationreport"
"preflight"
"other"
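As a sketch, a request that only fetches documents and scripts could pass allow_resource_types like this (the account ID and URL are placeholders; pass the dict to client.browser_rendering.crawl.create(**params)):

```python
# Hypothetical kwargs restricting the crawl to HTML documents and scripts.
params = {
    "account_id": "account_id",
    "url": "https://example.com",
    "allow_resource_types": ["document", "script"],
}

# Every entry must be one of the resource types listed above.
ALLOWED = {
    "document", "stylesheet", "image", "media", "font", "script",
    "texttrack", "xhr", "fetch", "prefetch", "eventsource", "websocket",
    "manifest", "signedexchange", "ping", "cspviolationreport",
    "preflight", "other",
}
assert set(params["allow_resource_types"]) <= ALLOWED
```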
authenticate: Optional[Variant0Authenticate]

Provide credentials for HTTP authentication.

password: str
Minimum length: 1
username: str
Minimum length: 1
best_attempt: Optional[bool]

Attempt to proceed when 'awaited' events fail or timeout.

cookies: Optional[Iterable[Variant0Cookie]]

Check options.

name: str

Cookie name.

value: str
domain: Optional[str]
expires: Optional[float]
http_only: Optional[bool]
partition_key: Optional[str]
path: Optional[str]
priority: Optional[Literal["Low", "Medium", "High"]]
One of the following:
"Low"
"Medium"
"High"
same_party: Optional[bool]
same_site: Optional[Literal["Strict", "Lax", "None"]]
One of the following:
"Strict"
"Lax"
"None"
secure: Optional[bool]
source_port: Optional[float]
source_scheme: Optional[Literal["Unset", "NonSecure", "Secure"]]
One of the following:
"Unset"
"NonSecure"
"Secure"
url: Optional[str]
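A minimal sketch of setting a session cookie for the crawl, using the snake_case field names listed above (the cookie name and value are hypothetical):

```python
# Hypothetical session cookie passed via the cookies parameter.
params = {
    "account_id": "account_id",
    "url": "https://example.com",
    "cookies": [
        {
            "name": "session_id",  # required
            "value": "abc123",     # required
            "domain": "example.com",
            "path": "/",
            "secure": True,
            "http_only": True,
            "same_site": "Lax",
        }
    ],
}
```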
crawl_purposes: Optional[List[Literal["search", "ai-input", "ai-train"]]]

List of crawl purposes to respect Content-Signal directives in robots.txt. Allowed values: 'search', 'ai-input', 'ai-train'. Learn more: https://contentsignals.org/. Default: ['search', 'ai-input', 'ai-train'].

One of the following:
"search"
"ai-input"
"ai-train"
depth: Optional[float]

Maximum number of levels deep the crawler will traverse from the starting URL.

Maximum: 100000
Minimum: 1
emulate_media_type: Optional[str]
formats: Optional[List[Literal["html", "markdown", "json"]]]

Formats to return. Default is html.

One of the following:
"html"
"markdown"
"json"
goto_options: Optional[Variant0GotoOptions]

Check options.

referer: Optional[str]
referrer_policy: Optional[str]
timeout: Optional[float]
Maximum: 60000
wait_until: Optional[Union[Literal["load", "domcontentloaded", "networkidle0", "networkidle2"], List[Literal["load", "domcontentloaded", "networkidle0", "networkidle2"]]]]
One of the following:
Literal["load", "domcontentloaded", "networkidle0", "networkidle2"]
One of the following:
"load"
"domcontentloaded"
"networkidle0"
"networkidle2"
List[Literal["load", "domcontentloaded", "networkidle0", "networkidle2"]]
One of the following:
"load"
"domcontentloaded"
"networkidle0"
"networkidle2"
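For example, waiting for network idle with a 30-second navigation timeout might look like this (the values are illustrative, not recommendations):

```python
# Illustrative goto_options: wait until there are at most two in-flight
# network connections, with a 30 s navigation timeout.
params = {
    "account_id": "account_id",
    "url": "https://example.com",
    "goto_options": {
        "wait_until": "networkidle2",
        "timeout": 30_000,  # milliseconds; must not exceed 60000
    },
}
```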
json_options: Optional[Variant0JsonOptions]

Options for JSON extraction.

custom_ai: Optional[Iterable[Variant0JsonOptionsCustomAI]]

Optional list of custom AI models to use for the request. The models will be tried in the order provided, and in case a model returns an error, the next one will be used as fallback.

authorization: str

Authorization token for the AI model: Bearer <token>.

model: str

AI model to use for the request. Must be formed as <provider>/<model_name>, e.g. workers-ai/@cf/meta/llama-3.3-70b-instruct-fp8-fast.

prompt: Optional[str]
response_format: Optional[Variant0JsonOptionsResponseFormat]
type: str
json_schema: Optional[Dict[str, Union[str, float, bool, 2 more]]]

Schema for the response format. More information here: https://developers.cloudflare.com/workers-ai/json-mode/.

One of the following:
str
float
bool
SequenceNotStr[str]
object
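A sketch of requesting JSON extraction with a schema. The prompt and schema contents are hypothetical; see the json-mode link above for the exact schema format Workers AI expects:

```python
# Hypothetical json_options asking the model to extract a page title and
# outbound link URLs, constrained by a JSON schema.
params = {
    "account_id": "account_id",
    "url": "https://example.com",
    "formats": ["json"],
    "json_options": {
        "prompt": "Extract the page title and all outbound link URLs.",
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "links": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["title"],
            },
        },
    },
}
```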
limit: Optional[float]

Maximum number of URLs to crawl.

Maximum: 100000
Minimum: 1
max_age: Optional[float]

Maximum age of a resource that can be returned from cache in seconds. Default is 1 day.

Maximum: 604800
Minimum: 0
modified_since: Optional[int]

Unix timestamp (seconds since epoch) indicating to only crawl pages that were modified since this time. For sitemap URLs with a lastmod field, this is compared directly. For other URLs, the crawler will use If-Modified-Since header when fetching. URLs without modification information (no lastmod in sitemap and no Last-Modified header support) will be crawled. Note: This works in conjunction with maxAge - both filters must pass for a cached resource to be used. Must be within the last year and not in the future.

Exclusive minimum: 0
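Since modified_since is a Unix timestamp in seconds, a value meaning "pages changed in the last 30 days" can be computed like this:

```python
import time

# Only crawl pages modified within the last 30 days.
# The value must be within the last year and not in the future.
thirty_days = 30 * 24 * 60 * 60
params = {"modified_since": int(time.time()) - thirty_days}
```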
options: Optional[Variant0Options]

Additional options for the crawler.

exclude_patterns: Optional[SequenceNotStr[str]]

Exclude links matching the provided wildcard patterns in the crawl job. Example: 'https://example.com/privacy/**'.

include_patterns: Optional[SequenceNotStr[str]]

Include only links matching the provided wildcard patterns in the crawl job. Include patterns are evaluated before exclude patterns. URLs that match any of the specified include patterns will be included in the crawl job. Example: 'https://example.com/blog/**'.

include_subdomains: Optional[bool]

Include links to subdomains in the crawl job. This option is ignored if includeExternalLinks is true.
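For instance, crawling only the blog while skipping its drafts could combine both pattern lists. The paths are hypothetical, and this sketch assumes the pattern fields nest under options as listed above:

```python
# Hypothetical wildcard patterns: include patterns are evaluated first,
# then matching URLs are filtered by the exclude patterns.
params = {
    "account_id": "account_id",
    "url": "https://example.com",
    "options": {
        "include_patterns": ["https://example.com/blog/**"],
        "exclude_patterns": ["https://example.com/blog/drafts/**"],
    },
}
```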

reject_request_pattern: Optional[SequenceNotStr[str]]

Block undesired requests that match the provided regex patterns, e.g. '/^.*\.(css)'.

reject_resource_types: Optional[List[Literal["document", "stylesheet", "image", 15 more]]]

Block undesired requests that match the provided resource types, e.g. 'image' or 'script'.

One of the following:
"document"
"stylesheet"
"image"
"media"
"font"
"script"
"texttrack"
"xhr"
"fetch"
"prefetch"
"eventsource"
"websocket"
"manifest"
"signedexchange"
"ping"
"cspviolationreport"
"preflight"
"other"
render: Optional[Literal[True]]

Whether to render the page or fetch static content. True by default.

set_extra_http_headers: Optional[Dict[str, str]]
set_java_script_enabled: Optional[bool]
source: Optional[Literal["sitemaps", "links", "all"]]

Source of links to crawl. 'sitemaps' - only crawl URLs from sitemaps, 'links' - only crawl URLs scraped from pages, 'all' - crawl both sitemap and scraped links (default).

One of the following:
"sitemaps"
"links"
"all"
viewport: Optional[Variant0Viewport]

Check options.

height: float
width: float
device_scale_factor: Optional[float]
has_touch: Optional[bool]
is_landscape: Optional[bool]
is_mobile: Optional[bool]
wait_for_selector: Optional[Variant0WaitForSelector]

Wait for the selector to appear in page. Check options.

selector: str
hidden: Optional[Literal[True]]
timeout: Optional[float]
Maximum: 120000
visible: Optional[Literal[True]]
wait_for_timeout: Optional[float]

Waits for a specified timeout before continuing.

Maximum: 120000
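A sketch of waiting for a hypothetical content container to become visible before the page is captured:

```python
# Hypothetical selector: wait up to 10 s for #content to become visible.
params = {
    "account_id": "account_id",
    "url": "https://example.com",
    "wait_for_selector": {
        "selector": "#content",
        "visible": True,
        "timeout": 10_000,  # milliseconds; must not exceed 120000
    },
}
```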
Returns
str

Crawl job ID.

Request Example

import os
from cloudflare import Cloudflare

client = Cloudflare(
    api_token=os.environ.get("CLOUDFLARE_API_TOKEN"),  # This is the default and can be omitted
)
crawl = client.browser_rendering.crawl.create(
    account_id="account_id",
    url="https://example.com",
)
print(crawl)
Returns Examples
{
  "result": "result",
  "success": true,
  "errors": [
    {
      "code": 0,
      "message": "message"
    }
  ]
}
{
  "errors": [
    {
      "code": 2001,
      "message": "Rate limit exceeded"
    }
  ],
  "success": false
}