Skip to content

Fallbacks

Specify model or provider fallbacks with your Universal endpoint to handle request failures and ensure reliability.

Fallbacks are currently triggered only when a request encounters an error. We are working to expand fallback functionality to include time-based triggers, which will allow requests that exceed a predefined response time to timeout and fallback.

Example

In the following example, a request first goes to the Workers AI Inference API. If the request fails, it falls back to OpenAI. The response header cf-aig-step indicates which provider successfully processed the request.

  1. Sends a request to Workers AI Inference API.
  2. If that request fails, proceeds to OpenAI.
graph TD
    A[AI Gateway] --> B[Request to Workers AI Inference API]
    B -->|Success| C[Return Response]
    B -->|Failure| D[Request to OpenAI API]
    D --> E[Return Response]

You can add as many fallbacks as you need, just by adding another object in the array.

Request
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
--header 'Content-Type: application/json' \
--data '[
{
"provider": "workers-ai",
"endpoint": "@cf/meta/llama-3.1-8b-instruct",
"headers": {
"Authorization": "Bearer {cloudflare_token}",
"Content-Type": "application/json"
},
"query": {
"messages": [
{
"role": "system",
"content": "You are a friendly assistant"
},
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
},
{
"provider": "openai",
"endpoint": "chat/completions",
"headers": {
"Authorization": "Bearer {open_ai_token}",
"Content-Type": "application/json"
},
"query": {
"model": "gpt-4o-mini",
"stream": true,
"messages": [
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
}
]'

Response header(cf-aig-step)

When using the Universal endpoint with fallbacks, the response header cf-aig-step indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.

  • cf-aig-step:0 – The first (primary) model was used successfully.
  • cf-aig-step:1 – The request fell back to the second model.
  • cf-aig-step:2 – The request fell back to the third model.
  • Subsequent steps – Each fallback increments the step number by 1.