Date: 2025-08-19

Rate limiting controls the traffic that reaches your application, which helps prevent unexpectedly high bills and limits suspicious activity.

Parameters

You can define rate limits as the number of requests that get sent in a specific time frame. For example, you can limit your application to 100 requests per 60 seconds.

You can also choose between a fixed and a sliding rate limiting technique. Both allow a certain number of requests within a window of time; they differ in how the window is defined. With a fixed window, the windows are consecutive, non-overlapping blocks of time, so there would be no more than x requests in each ten-minute window. With a sliding window, the window always covers the most recent period, so there would be no more than x requests in the last ten minutes.

To illustrate this, suppose you had a limit of ten requests per ten minutes, starting at 12:00. The fixed windows are then 12:00-12:10, 12:10-12:20, and so on. If you sent ten requests at 12:09 and ten requests at 12:11, all 20 requests would succeed under a fixed window strategy, because they fall into two different windows. Under a sliding window strategy, the ten requests at 12:11 would fail, because ten requests had already been made in the preceding ten minutes.
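The difference between the two techniques can be sketched with a small simulation of the example above. This is an illustrative model only, not how AI Gateway implements rate limiting; the class names `FixedWindowLimiter` and `SlidingWindowLimiter` and the `simulate` helper are hypothetical, and time is measured in seconds after 12:00.

```python
from collections import deque


class FixedWindowLimiter:
    """Allow at most `limit` requests per consecutive window of `interval` seconds."""

    def __init__(self, limit: int, interval: int) -> None:
        self.limit = limit
        self.interval = interval
        self.window_start = None  # start of the window currently being counted
        self.count = 0

    def allow(self, now: int) -> bool:
        # Windows are aligned to multiples of `interval`: 12:00-12:10, 12:10-12:20, ...
        start = now - (now % self.interval)
        if start != self.window_start:
            self.window_start = start
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLimiter:
    """Allow at most `limit` requests in the trailing `interval` seconds."""

    def __init__(self, limit: int, interval: int) -> None:
        self.limit = limit
        self.interval = interval
        self.accepted = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now: int) -> bool:
        # Forget accepted requests that have left the trailing window.
        while self.accepted and self.accepted[0] <= now - self.interval:
            self.accepted.popleft()
        if len(self.accepted) < self.limit:
            self.accepted.append(now)
            return True
        return False


def simulate(limiter) -> int:
    """Send ten requests at 12:09 and ten at 12:11; return how many are accepted."""
    times = [9 * 60] * 10 + [11 * 60] * 10  # seconds after 12:00
    return sum(limiter.allow(t) for t in times)


fixed_accepted = simulate(FixedWindowLimiter(limit=10, interval=600))
sliding_accepted = simulate(SlidingWindowLimiter(limit=10, interval=600))
print(fixed_accepted, sliding_accepted)  # prints "20 10"
```

The fixed limiter resets its counter at 12:10, so the second burst is accepted in full, while the sliding limiter still counts the ten requests from 12:09 and rejects the second burst.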

Handling rate limits

When your requests exceed the allowed rate, you will encounter rate limiting. This means the server will respond with a 429 Too Many Requests status code and your request will not be processed.
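One common way for a client to react to a 429 response is to wait and retry with exponential backoff. The following is a minimal sketch of that idea, not an official client: `send_with_backoff` and its parameters are hypothetical, and `send` stands in for whatever function performs one request and returns the HTTP status code.

```python
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns a status other than 429 or attempts run out.

    Returns the last status code seen.
    """
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status


# Demo with a fake sender that is rate limited twice before succeeding.
responses = iter([429, 429, 200])
delays = []  # records the waits instead of actually sleeping
result = send_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)  # prints "200 [1.0, 2.0]"
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in production code the default `time.sleep` would apply.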

Default configuration

Dashboard

To set the default rate limiting configuration in the dashboard:

1. Log into the Cloudflare dashboard and select your account.
2. Go to AI > AI Gateway.
3. Go to Settings.
4. Enable Rate-limiting.
5. Adjust the rate, time period, and rate limiting method as desired.

API

To set the default rate limiting configuration using the API:

1. Create an API token with the following permissions:
   - AI Gateway - Read
   - AI Gateway - Edit
2. Get your Account ID.
3. Using that API token and Account ID, send a POST request to create a new Gateway and include a value for the rate_limiting_interval, rate_limiting_limit, and rate_limiting_technique.

This rate limiting behavior will be uniformly applied to all requests for that gateway.
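As a rough sketch of the API approach, the request could be assembled as follows. Only the three `rate_limiting_*` field names come from this page. The endpoint path, the `id` field, the string values `"fixed"` and `"sliding"`, and the use of seconds for the interval are assumptions to confirm against the Cloudflare API reference, and `build_create_gateway_request` is a hypothetical helper. The request is built but not sent.

```python
import json
import urllib.request


def build_create_gateway_request(
    account_id: str,
    api_token: str,
    gateway_id: str,
    limit: int,
    interval: int,
    technique: str,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a gateway."""
    # Assumed endpoint path; verify against the Cloudflare API reference.
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai-gateway/gateways"
    )
    payload = {
        "id": gateway_id,                      # assumed field name for the gateway
        "rate_limiting_limit": limit,          # number of requests allowed
        "rate_limiting_interval": interval,    # window length, assumed to be seconds
        "rate_limiting_technique": technique,  # assumed values: "fixed" or "sliding"
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder credentials; replace with your own Account ID and API token.
request = build_create_gateway_request(
    account_id="YOUR_ACCOUNT_ID",
    api_token="YOUR_API_TOKEN",
    gateway_id="my-gateway",
    limit=100,
    interval=60,
    technique="sliding",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```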