Troubleshooting crawl errors
Cloudflare allows search engine crawlers and bots. If you observe crawl issues or Cloudflare challenges presented to a search engine crawler or bot, contact Cloudflare support with the information you gather when troubleshooting the crawl errors via the methods outlined in this guide.
Disable Anti-bot modules
Search engine crawlers’ requests, when proxied through Cloudflare, can be blocked by anti-bot modules installed on your origin server. Try disabling any anti-bot modules to prevent your origin from blocking these requests.
Adjust Google and Bing crawl rates
Google and Bing assign special crawl rates to websites that use CDN services in order to optimize CDN performance. Special crawl rates do not negatively affect Search Engine Optimization (SEO) or Search Engine Results Pages (SERPs). To change your crawl rates for Google and Bing, follow the guides below:
- Change the Google crawl rate by following Google’s documentation.
- Change your Bing crawl rate by following Bing’s documentation.
Prevent crawl errors
Review the following recommendations to prevent crawler errors:
- Monitor the performance and availability of your website using a third-party tool.
- Do not block the United States via custom rules or IP Access rules within the Security app.
- Do not block search engine crawler IP addresses or User-Agents in your .htaccess, server configuration, or web application.
- Do not allow crawling of files in the /cdn-cgi/ directory. This path is used internally by Cloudflare, and Google encounters errors when crawling it. Disallow crawls of /cdn-cgi/ via robots.txt.
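The robots.txt rule for the last recommendation is conventionally a `Disallow: /cdn-cgi/` line under a `User-agent` group. The shell sketch below writes a sample file containing that rule and then checks for it. The temporary file and the helper name `has_cdn_cgi_rule` are illustrative assumptions, not part of any Cloudflare tooling; point the helper at your own robots.txt to check a real site.

```shell
# Sample robots.txt that blocks all crawlers from the /cdn-cgi/ path.
# The temporary file is a stand-in for your site's real robots.txt.
robots_file="$(mktemp)"
cat > "$robots_file" <<'EOF'
User-agent: *
Disallow: /cdn-cgi/
EOF

# Hypothetical helper: succeeds if the given file contains a Disallow rule
# for /cdn-cgi/ (matched case-insensitively, ignoring leading whitespace).
has_cdn_cgi_rule() {
  grep -Eiq '^[[:space:]]*Disallow:[[:space:]]*/cdn-cgi/' "$1"
}

if has_cdn_cgi_rule "$robots_file"; then
  echo "cdn-cgi is disallowed"
else
  echo "cdn-cgi rule is missing"
fi
```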
Troubleshoot crawl errors
Troubleshooting steps for the most commonly reported crawl errors are listed below.
HTTP 4XX Errors
HTTP 4XX errors are the most common type of crawl error. Cloudflare passes these errors from your web server to Google. They have various causes, such as a missing page on your web server or a malformed link in your HTML. The solution depends on the problem encountered.
HTTP 5XX Errors
HTTP 5XX errors indicate that either Cloudflare or your origin web server experienced an internal error. To correlate occurrences of crawl errors with site outages, monitor your origin web server’s health. Monitoring your website both through Cloudflare and directly against your origin web server IPs helps determine whether the errors came from Cloudflare or from your origin web server.
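One way to compare the two monitoring paths is to request the same URL through Cloudflare and directly from the origin IP, then compare the status codes. The sketch below is a minimal illustration under stated assumptions: the helper names, `example.com`, and `203.0.113.10` are placeholders, and the classification is a rough heuristic rather than a definitive diagnosis. A direct request may also fail for unrelated reasons, for example if the origin only accepts connections from Cloudflare.

```shell
# Return success if the argument is a three-digit 5XX status code.
is_5xx() {
  case "$1" in
    5[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the HTTP status code for a URL.
# The optional second argument is a curl --resolve entry (host:port:address)
# that sends the request straight to the origin IP instead of using DNS.
fetch_status() {
  if [ -n "${2:-}" ]; then
    curl -s -o /dev/null -w '%{http_code}' --resolve "$2" "$1"
  else
    curl -s -o /dev/null -w '%{http_code}' "$1"
  fi
}

# Compare the proxied status with the direct-to-origin status.
classify_5xx() {
  proxied="$1"
  direct="$2"
  if is_5xx "$direct"; then
    echo "origin"
  elif is_5xx "$proxied"; then
    echo "cloudflare-or-network"
  else
    echo "none"
  fi
}

# Example usage (commented out because it needs network access;
# example.com and 203.0.113.10 are placeholders):
# proxied="$(fetch_status https://example.com/)"
# direct="$(fetch_status https://example.com/ example.com:443:203.0.113.10)"
# classify_5xx "$proxied" "$direct"
```

If the direct request also returns a 5XX code, the origin is the likely source; if only the proxied request fails, look at Cloudflare or the network path between Cloudflare and the origin.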
Troubleshooting steps vary depending on whether your domain is on Cloudflare via a Full or CNAME setup. To verify which setup your domain uses, open a terminal and execute the following command (replace <domain> with your Cloudflare domain):
dig +short SOA <domain>
For domains on a CNAME setup, the response contains cdn.cloudflare.net.
For domains on a Full setup, the response contains the cloudflare.com domain in the nameservers listed. For example:
josh.ns.cloudflare.com. dns.cloudflare.com. 2013050901 10000 2400 604800 3600
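The check described above can be scripted. The following sketch classifies a SOA response using only the two markers named in this guide (cdn.cloudflare.net for a CNAME setup, cloudflare.com for a Full setup). The helper name `classify_setup` is hypothetical, and any response matching neither marker is reported as unknown.

```shell
# Hypothetical helper: classify the setup from the text of a SOA response.
classify_setup() {
  case "$1" in
    *cdn.cloudflare.net*) echo "cname" ;;
    *cloudflare.com*) echo "full" ;;
    *) echo "unknown" ;;
  esac
}

# Sample response taken from the Full setup example above.
sample='josh.ns.cloudflare.com. dns.cloudflare.com. 2013050901 10000 2400 604800 3600'
classify_setup "$sample"   # prints "full"

# Against a live domain (needs dig and network access; example.com is a placeholder):
# classify_setup "$(dig +short SOA example.com)"
```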
Once you’ve confirmed how your domain was set up with Cloudflare, proceed with the troubleshooting steps appropriate to your domain setup.
Contact your hosting provider to investigate DNS errors, and provide the date on which Google encountered them. Additionally, review the status page for any network outages on that date.
Requesting troubleshooting assistance
If the above troubleshooting steps do not resolve your crawl errors, follow the steps below to export crawler errors as a .csv file from your Google Webmaster Tools Dashboard. Include this .csv file when contacting Cloudflare support.
- Log in to your Google Webmaster Tools account and navigate to the Health section of the affected domain.
- Click Crawl Errors in the left-hand navigation.
- Click Download to export the list of errors as a .csv file.
- Provide the downloaded .csv file to Cloudflare support.