Fallbacks
Specify model or provider fallbacks with your Universal endpoint to handle request failures and ensure reliability.
Fallbacks are currently triggered only when a request encounters an error. We are working to expand fallback functionality to include time-based triggers, which will allow requests that exceed a predefined response time to timeout and fallback.
In the following example, a request first goes to the Workers AI Inference API. If the request fails, it falls back to OpenAI. The response header cf-aig-step
indicates which provider successfully processed the request.
- Sends a request to Workers AI Inference API.
- If that request fails, proceeds to OpenAI.
graph TD A[AI Gateway] --> B[Request to Workers AI Inference API] B -->|Success| C[Return Response] B -->|Failure| D[Request to OpenAI API] D --> E[Return Response]
You can add as many fallbacks as you need, just by adding another object in the array.
When using the Universal endpoint with fallbacks, the response header cf-aig-step
indicates which model successfully processed the request by returning the step number. This header provides visibility into whether a fallback was triggered and which model ultimately processed the response.
cf-aig-step:0
– The first (primary) model was used successfully.cf-aig-step:1
– The request fell back to the second model.cf-aig-step:2
– The request fell back to the third model.- Subsequent steps – Each fallback increments the step number by 1.