AI Crawl Control with Cloudflare WAF
AI Crawl Control works alongside other Cloudflare products, such as Cloudflare Web Application Firewall (WAF). WAF checks incoming web and API requests, and filters undesired traffic based on rules. WAF custom rules allow you to perform certain actions such as enforcing robots.txt.
- AI Crawl Control uses WAF custom rules to block the AI crawlers that the site owner has chosen to block.
- AI Crawl Control's pay per crawl feature takes place after WAF.
graph LR
    A[Traffic] --> B[WAF custom rules<br>AI Crawl Control: Crawler blocks]
    B --> C[Cloudflare<br>Bot Solutions]
    C --> D[AI Crawl Control:<br>Pay Per Crawl]
    classDef highlight fill:#F6821F,color:white
For this reason, if you plan to use AI Crawl Control to manage AI crawlers, you may wish to modify your existing WAF custom rules so that they do not affect AI crawlers. This allows you to manage AI crawlers from AI Crawl Control alone, which streamlines your workflow.
Consider the following examples.
You may have both of the following features enabled:
- WAF custom rule to block traffic from specific countries
- AI Crawl Control's pay per crawl to charge AI crawlers when they request access to your content
Since WAF custom rules are enforced before pay per crawl, traffic from your blocked countries (including AI crawlers) will continue to be blocked, even if a crawler provides the required headers for pay per crawl.
You may have both of the following features enabled:
- WAF custom rule to allow search engine bots
- AI Crawl Control's pay per crawl to charge all AI crawlers when they request access to your content (including search engine bots).
Since custom rules are enforced before pay per crawl:
- Only search engine bots will be able to access your site (enforced by custom rule).
- The search engine bots will then be charged for access to your content (enforced by AI Crawl Control's pay per crawl).
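Both examples above come down to stage ordering: a request that a WAF custom rule blocks never reaches pay per crawl. The following is a minimal sketch of that ordering as a toy model, not Cloudflare's implementation. The `Request` fields, the function names, the placeholder country codes `XX` and `YY`, and the outcome strings are all hypothetical, and the Bot Solutions stage is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # All fields are hypothetical stand-ins for request properties.
    user_agent: str
    country: str
    is_search_bot: bool = False
    has_payment_headers: bool = False


def waf_blocks_country(req, blocked_countries):
    """Example 1: a custom rule that blocks traffic from listed countries."""
    return req.country in blocked_countries


def waf_blocks_non_search_bots(req):
    """Example 2: a custom rule that lets only search engine bots through."""
    return not req.is_search_bot


def handle(req, waf_rule_blocks):
    # Stage 1: WAF custom rules are evaluated first.
    if waf_rule_blocks(req):
        return "blocked by WAF custom rule"
    # Stage 2: pay per crawl only sees traffic that passed the WAF.
    if req.has_payment_headers:
        return "served, crawler charged"
    return "payment required"


# Example 1: payment headers do not help a crawler from a blocked country.
crawler = Request("ExampleCrawler", "XX", has_payment_headers=True)
print(handle(crawler, lambda r: waf_blocks_country(r, {"XX"})))

# Example 2: a search engine bot passes the WAF and is then charged.
search_bot = Request("ExampleSearchBot", "YY", is_search_bot=True,
                     has_payment_headers=True)
print(handle(search_bot, waf_blocks_non_search_bots))
```

In this sketch the first call reports a WAF block and the second reports a charged, served request, which matches the outcomes described in the two examples.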
If you have set certain AI crawlers to Allow in AI Crawl Control, but they are still being blocked, check for upstream WAF custom rules that may be blocking them. Since the AI Crawl Control rule only includes blocked bots, allowed bots may still be affected by other security rules that execute before the AI Crawl Control rule.
These upstream rules will affect traffic but may not be visible in AI Crawl Control analytics. Review your WAF custom rules to identify and modify any rules that may be blocking AI crawlers you intend to allow.
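The review described above can be thought of as scanning the rules ordered before the AI Crawl Control rule for any block rule that matches the crawler you intended to allow. The sketch below illustrates that reasoning on an in-memory list. The `Rule` class, the rule names, and the User-Agent predicates are hypothetical; real custom rules use Cloudflare's rule expressions, which this toy model does not parse.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    action: str  # a label in this toy model, for example "block" or "skip"
    matches: Callable[[str], bool]  # predicate on the User-Agent string


def upstream_blockers(rules: List[Rule], acc_rule_name: str,
                      user_agent: str) -> List[str]:
    """Names of block rules ordered before the AI Crawl Control rule
    that match the given User-Agent."""
    names = [r.name for r in rules]
    acc_index = names.index(acc_rule_name)
    return [r.name for r in rules[:acc_index]
            if r.action == "block" and r.matches(user_agent)]


rules = [
    Rule("block-all-bots", "block", lambda ua: "bot" in ua.lower()),
    Rule("block-scanner", "block", lambda ua: "scanner" in ua.lower()),
    Rule("ai-crawl-control", "block", lambda ua: ua == "BlockedCrawler"),
]

print(upstream_blockers(rules, "ai-crawl-control", "AllowedBot/1.0"))
# prints ['block-all-bots']
```

Any rule name returned here is a candidate to narrow or reorder so that the allowed crawler is no longer caught upstream.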
If you have set certain AI crawlers to Block in AI Crawl Control, but they are still accessing your content, check for upstream rules that may be bypassing the AI Crawl Control rule. Since the AI Crawl Control rule is added at the end of existing WAF custom rules, the following types of rules may allow bots to bypass the block:
- Skip rules that bypass WAF custom rules
- Redirect rules that change the request path
- Transform rules that modify the request
To ensure blocked bots are properly blocked, move the AI Crawl Control rule to the top of your WAF custom rules, so it executes before other rules.
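To see why rule position matters, consider a simplified evaluator in which rules run in order and a matching skip rule stops evaluation of the remaining custom rules. This is a sketch under that assumption only: the rule names and predicates are hypothetical, and Redirect and Transform rules are not modeled.

```python
from typing import Callable, List, Tuple

# (name, action, predicate on the User-Agent string)
Rule = Tuple[str, str, Callable[[str], bool]]


def evaluate(rules: List[Rule], user_agent: str) -> str:
    """Evaluate rules in order and stop at the first matching block or skip."""
    for name, action, matches in rules:
        if not matches(user_agent):
            continue
        if action == "block":
            return f"blocked by {name}"
        if action == "skip":
            return f"passed (remaining custom rules skipped by {name})"
    return "passed"


def move_to_top(rules: List[Rule], name: str) -> List[Rule]:
    """Return a new list with the named rule first, other rules in order."""
    target = [r for r in rules if r[0] == name]
    rest = [r for r in rules if r[0] != name]
    return target + rest


rules = [
    ("skip-partner-traffic", "skip", lambda ua: ua.startswith("Partner")),
    ("ai-crawl-control", "block", lambda ua: "PartnerCrawler" in ua),
]

print(evaluate(rules, "PartnerCrawler/2.0"))
# prints passed (remaining custom rules skipped by skip-partner-traffic)

print(evaluate(move_to_top(rules, "ai-crawl-control"), "PartnerCrawler/2.0"))
# prints blocked by ai-crawl-control
```

With the AI Crawl Control rule last, the upstream skip rule matches first and the block is never evaluated; with the rule first, the block applies before any skip rule can match.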
You may have both of the following features enabled:
- A WAF custom rule which blocks all bots.
- AI Crawl Control selection which allows certain AI crawlers.
In this scenario, you have two custom rules that apply conflicting logic to AI crawlers. To resolve this issue:
- In the Cloudflare dashboard, go to the Security rules page.
- Filter by Custom rules.
- Identify your custom rule and the AI Crawl Control rule.
- Drag the rule you wish to prioritize to the top, or modify your custom rule to ensure it does not conflict with your AI Crawl Control configurations.
Alternatively, if your dashboard uses the Security > WAF navigation:
- Log in to the Cloudflare dashboard, and select your account and domain.
- Go to Security > WAF > Custom rules tab.
- Identify your WAF custom rule and the AI Crawl Control rule.
- Drag the rule you wish to prioritize to the top, or modify your WAF custom rule to ensure it does not conflict with your AI Crawl Control configurations.