AI Gateway

Beta
Observe and control your AI applications.
Available on all plans

Cloudflare’s AI Gateway gives you visibility into and control over your AI apps. By connecting your apps to AI Gateway, you can gather insights on how people use your application through analytics and logging, and then control how your application scales with features such as caching, rate limiting, request retries, and model fallback. Better yet, it takes only one line of code to get started.
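As a rough illustration of what that one-line change might look like: the application keeps calling its provider as before, but points the client at a gateway base URL instead of the provider's default endpoint. The sketch below only builds such a URL. The URL layout, the `gateway_base_url` helper, and the placeholder identifiers are assumptions for illustration and are not taken from this page; the Get started guide is the authority on the exact format.

```python
# Hypothetical placeholders: substitute your own account and gateway identifiers.
ACCOUNT_ID = "your-account-id"
GATEWAY_ID = "your-gateway-id"


def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build a base URL that routes a provider's requests through AI Gateway.

    The URL layout here is an assumption for illustration only.
    """
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"


# The "one line" is replacing the provider's default base URL with this value
# wherever the provider client is configured.
base_url = gateway_base_url(ACCOUNT_ID, GATEWAY_ID, "openai")
```

Most provider SDKs accept a custom base URL when the client is constructed, so the rest of the application code can stay unchanged.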

Key features include:

  • Analytics: View metrics such as the number of requests, the number of tokens, and the cost of running your application
  • Real-time logs: Gain insight into requests and errors
  • Caching: Serve requests directly from Cloudflare’s cache instead of the original model provider for faster responses and cost savings
  • Rate limiting: Control how your application scales by limiting the number of requests it receives
  • Request retry and fallback: Improve resilience by defining request retries and model fallbacks in case of an error
  • Support for your favorite providers: Workers AI, OpenAI, Azure OpenAI, HuggingFace, and Replicate all work with AI Gateway (more to come)
  • Response streaming: AI Gateway supports response streaming
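To make the retry and fallback idea concrete, here is a minimal conceptual sketch of the pattern in plain Python. It is not how AI Gateway is implemented or configured; the gateway applies this behavior on its side. The names `call_with_fallback`, `primary`, and `secondary` are hypothetical stand-ins for model calls.

```python
from typing import Callable, Optional, Sequence


def call_with_fallback(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    retries_per_provider: int = 2,
) -> str:
    """Try each provider in order, retrying each one before falling back."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for _ in range(retries_per_provider):
            try:
                return provider(prompt)
            except Exception as err:  # any failure counts as a failed attempt
                last_error = err
    raise RuntimeError("all providers failed") from last_error


# Hypothetical stand-ins for two model providers.
def primary(prompt: str) -> str:
    raise TimeoutError("primary model unavailable")


def secondary(prompt: str) -> str:
    return f"secondary: {prompt}"


result = call_with_fallback([primary, secondary], "hello")
```

In this sketch the primary provider is attempted twice, both attempts fail, and the request is then answered by the secondary provider.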

Check out the Get started guide to learn how to configure your applications with AI Gateway.

More resources