Network Error Logging

Failures

When a user connects to Cloudflare and the site they are visiting has NEL enabled, Cloudflare returns two headers that tell the browser to report network failures to the endpoint specified in those headers. The browser then operates as usual. If something later prevents it from connecting to the site, the browser logs the failure as a report and sends it to that endpoint.
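As a rough illustration of the exchange described above, the sketch below builds a header pair and a sample failure report in Python. It assumes the two headers are the Report-To and NEL headers defined in the W3C Reporting and Network Error Logging drafts. The group name, endpoint URL, and max_age value are placeholders chosen for this example, not the values Cloudflare actually sends.

```python
import json


def build_nel_headers(endpoint_url, group="nel", max_age=604800):
    """Return a Report-To / NEL header pair that enables NEL for an origin.

    The default group name and max_age are hypothetical placeholders.
    """
    report_to = {
        "group": group,
        "max_age": max_age,
        "endpoints": [{"url": endpoint_url}],
    }
    # The NEL header points at the reporting group declared in Report-To.
    nel = {"report_to": group, "max_age": max_age}
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}


# A report shaped like the examples in the W3C draft. All values are invented.
sample_report = {
    "age": 20,
    "type": "network-error",
    "url": "https://example.com/",
    "body": {
        "elapsed_time": 18,
        "phase": "connection",
        "protocol": "http/1.1",
        "sampling_fraction": 1.0,
        "server_ip": "",
        "status_code": 0,
        "type": "tcp.timed_out",
    },
}


def error_type(report):
    """Extract the failure type (for example tcp.timed_out) from a report."""
    return report["body"]["type"]
```

The failure types discussed in the rest of this page, such as tcp.timed_out, correspond to the `type` field inside the report body.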

Failures reported through Network Error Logging can have several causes, which are outlined below.

Internet Service Provider (ISP) outage

An ISP outage appears to NEL users as failures from one particular last-mile network. By viewing NEL data by client autonomous system number (ASN), you can see which networks account for the most failures.

For customers, this scenario appears as an influx of tcp.timed_out errors, as well as tcp.failed, h2.protocol_error and h3.protocol_error.

In the event of a last-mile outage, the best course of action is to contact the provider to investigate.
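The ASN view described above can be approximated with a simple count. This is a minimal sketch that assumes each report has already been enriched with a `client_asn` field. That field name, the 0.8 threshold, and the helper names are hypothetical and are not part of any Cloudflare API.

```python
from collections import Counter


def failures_by_asn(reports):
    """Count failure reports per client ASN, most affected network first."""
    counts = Counter(r["client_asn"] for r in reports)
    return counts.most_common()


def dominant_asn(reports, threshold=0.8):
    """Return the ASN behind at least `threshold` of all reports, else None.

    A single dominant ASN is consistent with a last-mile (ISP) outage.
    """
    ranked = failures_by_asn(reports)
    if not ranked:
        return None
    asn, count = ranked[0]
    return asn if count / len(reports) >= threshold else None
```

For example, if nine of ten reports carry the same `client_asn`, `dominant_asn` returns that ASN, which is the network to contact.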

Transit Flap

Transit flaps are momentary outages caused by transit providers re-establishing BGP sessions.

To customers, this will appear as tcp.timed_out reports from a variety of ASNs over a short period of time. This could happen for several reasons:

  • Maintenance in the transit network necessitated a reset of the session.
  • Maintenance or reboots in Cloudflare necessitated a reset of the BGP session.
  • Packet loss in the network caused the session to flap.

Heavy packet loss in the network will likely result in a series of flaps over time. Maintenance typically causes a single impact period that lasts no more than two minutes.
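One way to look for the pattern described above is to group tcp.timed_out reports into short bursts and keep the bursts that span several ASNs. The sketch below is a heuristic under stated assumptions: each report has hypothetical `timestamp` (seconds) and `client_asn` fields, the 120-second limit comes from the two-minute figure above, and the 30-second gap and three-ASN minimum are arbitrary choices for illustration.

```python
def find_flap_candidates(reports, max_gap=30, max_duration=120, min_asns=3):
    """Return (start, end, asn_count) for bursts that resemble a transit flap."""
    timed_out = sorted(
        (r for r in reports if r["type"] == "tcp.timed_out"),
        key=lambda r: r["timestamp"],
    )

    # Split the sorted reports into bursts separated by quiet gaps.
    bursts, current = [], []
    for r in timed_out:
        if current and r["timestamp"] - current[-1]["timestamp"] > max_gap:
            bursts.append(current)
            current = []
        current.append(r)
    if current:
        bursts.append(current)

    # Keep short bursts that touch a variety of ASNs.
    candidates = []
    for burst in bursts:
        start, end = burst[0]["timestamp"], burst[-1]["timestamp"]
        asns = {r["client_asn"] for r in burst}
        if end - start <= max_duration and len(asns) >= min_asns:
            candidates.append((start, end, len(asns)))
    return candidates
```

Following the text above, a single candidate is more consistent with maintenance, while several candidates spread over time point toward packet loss.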

Infrastructure outage

Infrastructure outages occur at shared peering points, such as Internet exchanges.

These outages appear to customers as an increase in tcp.timed_out, tcp.failed, and tcp.aborted reports. These failures will likely appear across multiple networks for an extended period of time.

Depending on the severity of the report volume, Cloudflare may declare an incident to track remediation. Alternatively, Cloudflare may deactivate peering from these shared points until the issue is resolved.

Cloudflare outage

Cloudflare outages consist of issues within Cloudflare’s data-center fabric.

These outages appear to customers as an increase in tcp.timed_out, tcp.failed, and tcp.aborted reports and will likely appear across multiple networks for a short period of time.

By pivoting by data center, customers can track the impact across Cloudflare points of presence. Cloudflare-based incidents will always be tracked through a status page, which will indicate whether or not there are issues within the impacted region.
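The data center pivot mentioned above could be sketched as follows. The example assumes each report carries a hypothetical `data_center` field identifying the point of presence. How that field is populated is outside the scope of this sketch.

```python
from collections import Counter, defaultdict


def pivot_by_data_center(reports):
    """Return {data_center: {error_type: count}} for a list of reports."""
    pivot = defaultdict(Counter)
    for r in reports:
        pivot[r["data_center"]][r["type"]] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {dc: dict(counts) for dc, counts in pivot.items()}
```

Comparing the per-location counts against the status page for the same region helps confirm whether the failures line up with a tracked incident.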

Provider sending traffic through scrubbing center/blocking traffic

This type of outage manifests as TLS errors, such as tls.cert.authority_invalid or tls.cert.name_invalid, and may also produce tcp.aborted errors.

Customers can uncover this behavior by checking which last-mile ASNs show increased failures. Typically, only one ASN is affected.

Customers can seek remediation by contacting the provider that they believe is scrubbing their traffic.

Certificate issues

Certificate issues are also detectable through NEL. Errors such as tls.version_or_cipher_mismatch may appear across multiple ISPs and multiple Cloudflare locations.

If this is detected in NEL, the issue can be remediated by deploying new certificates manually or by using Cloudflare’s SSL management suite to deploy them automatically.
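TLS errors appear in both the scrubbing scenario and the certificate scenario, and the distinguishing signal is how widely they are spread. The sketch below is a rough triage heuristic, not a definitive classifier. It assumes hypothetical `client_asn` and `data_center` fields on each report, and the 0.8 share threshold and the returned labels are invented for this example.

```python
from collections import Counter


def classify_tls_failures(reports, single_asn_share=0.8):
    """Label TLS failures by how concentrated they are in one last-mile ASN."""
    tls = [r for r in reports if r["type"].startswith("tls.")]
    if not tls:
        return "no-tls-errors"

    asn_counts = Counter(r["client_asn"] for r in tls)
    top_share = asn_counts.most_common(1)[0][1] / len(tls)
    locations = {r["data_center"] for r in tls}

    # Concentrated in one ASN: consistent with a provider scrubbing or blocking.
    if top_share >= single_asn_share:
        return "possible-scrubbing-or-blocking"
    # Spread over several ASNs and locations: consistent with a certificate issue.
    if len(asn_counts) > 1 and len(locations) > 1:
        return "possible-certificate-issue"
    return "inconclusive"
```

The result is only a starting point for investigation. Confirming either cause still requires checking the certificate itself or contacting the provider.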