Skip to content

Redirects for AI Training

Redirects for AI Training enforces your existing <link rel="canonical"> tags as 301 redirects for verified AI training crawlers. When a verified bot with the AI Crawler category requests a page whose canonical tag points to a different same-origin URL, Cloudflare returns a 301 Moved Permanently to the canonical. All other visitors—browsers, search engines, AI Assistants—receive the original page unchanged.

To learn more about why this feature can be useful, refer to the announcement blog post.

Enable Redirects for AI Training

Redirects for AI Training is available as a toggle in AI Crawl Control > Quick Actions, alongside Markdown for Agents and Managed robots.txt.

To enable Redirects for AI Training for your zone in the dashboard:

  1. Log into the Cloudflare dashboard and select your account (you need a Pro or Business plan).
  2. Select the zone you want to configure.
  3. Visit the AI Crawl Control section.
  4. Enable Redirects for AI Training.

Enable for specific subdomains or paths

To enable Redirects for AI Training for specific subdomains or paths instead of your entire zone, create a configuration rule:

  1. Log in to the Cloudflare dashboard and select your account.
  2. Select the zone you want to configure.
  3. Go to Rules > Overview and select Create rule > Configuration Rules.
  4. Under When incoming requests match, build an expression to match your subdomain (for example, http.host eq "docs.example.com") or path.
  5. Under Then the settings are, select Add setting > Redirects for AI Training and set it to On.
  6. Select Deploy.

How it works

Cloudflare inspects the origin HTML response to verified AI training crawlers and:

  1. Stream-parses the <head> of the origin response to extract <link rel="canonical" href="...">
  2. Resolves relative canonical URLs against the request URL
  3. Validates that the canonical URL is same-origin and differs from the current URL
  4. Returns a 301 redirect to the canonical URL

If no canonical tag is found, the canonical is cross-origin, or the page is self-canonical, the origin response passes through unchanged.

Canonical tag parsing

The feature parses the <link rel="canonical"> tag from the <head> section of the origin response:

<!DOCTYPE html>
<html>
<head>
<link rel="canonical" href="https://example.com/current-version">
</head>
<body>
<!-- Page content -->
</body>
</html>

Both absolute and relative URLs are supported:

<!-- Absolute URL -->
<link rel="canonical" href="https://example.com/page">
<!-- Relative URL (resolved against request URL) -->
<link rel="canonical" href="/current-page">

Interaction with other redirect features

Redirects for AI Training operates at a different layer than Single Redirects and Bulk Redirects. Those redirect rules execute before the origin is contacted. Redirects for AI Training executes after the origin responds, because it needs to read the canonical tag from the origin HTML.

If a Single Redirect or Bulk Redirect matches first, the request is redirected before Redirects for AI Training has a chance to evaluate it.

Availability

Available on Pro, Business, and Enterprise plans at no additional cost.

Limitations

  • Only HTML responses (content-type: text/html) from the origin are evaluated. Other content types pass through unchanged.
  • The canonical tag must appear within the first 256 KB of the uncompressed HTML response body.
  • Only same-origin canonical URLs trigger a redirect. Cross-origin canonicals are ignored.
  • Only verified bots with the AI Crawler category are redirected. AI Assistants and AI Search bots are not affected.
  • Self-canonical pages (where the canonical URL matches the request URL) are not redirected.
  • Best-effort loop detection uses the Referer header. If a crawler was just redirected from the canonical URL back to the current page, the origin HTML is served instead of redirecting. This handles common two-page canonical misconfigurations (Page A canonical points to Page B, Page B canonical points to Page A).

Logging

When a redirect is issued, Cloudflare logs the canonical target URL in your HTTP request logs via the redirects_for_ai_training_target field. You can use the GraphQL Analytics API or Logpush to query this data.