Use your Universal endpoint to configure model or provider fallbacks, which define what happens if a request fails.

For example, you could set up a gateway endpoint that:

  1. Sends a request to the Workers AI Inference API.
  2. If that request fails, proceeds to OpenAI.

You can add as many fallbacks as you need by adding another object to the array. The gateway tries the objects in order, starting with the first.

curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
  --header 'Content-Type: application/json' \
  --data '[
  {
    "provider": "workers-ai",
    "endpoint": "@cf/meta/llama-3.1-8b-instruct",
    "headers": {
      "Authorization": "Bearer {cloudflare_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "messages": [
        {
          "role": "system",
          "content": "You are a friendly assistant"
        },
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  },
  {
    "provider": "openai",
    "endpoint": "chat/completions",
    "headers": {
      "Authorization": "Bearer {open_ai_token}",
      "Content-Type": "application/json"
    },
    "query": {
      "model": "gpt-3.5-turbo",
      "stream": true,
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
  }
]'