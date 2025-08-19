
Cerebras

Cerebras offers developers a low-latency solution for AI model inference.

Endpoint

https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cerebras

Prerequisites

When making requests to Cerebras, ensure you have the following:

  • Your AI Gateway Account ID.
  • Your AI Gateway gateway name.
  • An active Cerebras API token.
  • The name of the Cerebras model you want to use.

Examples

cURL

Example cURL request
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cerebras/chat/completions \
 --header 'content-type: application/json' \
 --header 'Authorization: Bearer CEREBRAS_TOKEN' \
 --data '{
    "model": "llama3.1-8b",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'
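
The same request can be built from Python. The following is a minimal sketch using only the standard library, not an official client. The helper name build_cerebras_request is hypothetical, the cerebras path segment and the model name are taken from the cURL example above, and the function only constructs the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request

GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"


def build_cerebras_request(account_id, gateway_id, token, model, prompt):
    """Build (but do not send) a chat completion request routed through AI Gateway.

    Hypothetical helper: it mirrors the URL, headers, and body of the cURL example.
    """
    url = f"{GATEWAY_BASE}/{account_id}/{gateway_id}/cerebras/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values; replace them with your own account ID, gateway name, and token.
request = build_cerebras_request(
    "ACCOUNT_ID", "GATEWAY_ID", "CEREBRAS_TOKEN", "llama3.1-8b", "What is Cloudflare?"
)
print(request.full_url)

# To actually send the request (requires real credentials and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping construction separate from sending makes it easy to check the URL and payload before spending any API quota.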

OpenAI-Compatible Endpoint

You can also use the OpenAI-compatible endpoint to access Cerebras models using the OpenAI API schema. To do so, send your requests to:

https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions

In the request body, specify the model with the cerebras/ prefix:

{
"model": "cerebras/{model}"
}
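
As an illustration, the URL and body for the OpenAI-compatible endpoint could be assembled as below. This is a sketch under assumptions: the helper names compat_url and compat_body are hypothetical, the messages array follows the OpenAI chat completions schema, and the llama3.1-8b model name is borrowed from the cURL example. Authentication headers are left out because this section does not specify them for the compat endpoint.

```python
import json


def compat_url(account_id, gateway_id):
    """Return the OpenAI-compatible chat completions URL for a gateway."""
    return (
        "https://gateway.ai.cloudflare.com/v1/"
        f"{account_id}/{gateway_id}/compat/chat/completions"
    )


def compat_body(model, prompt):
    """Return a request body whose model carries the cerebras/ provider prefix."""
    return {
        "model": f"cerebras/{model}",
        "messages": [{"role": "user", "content": prompt}],
    }


# Placeholder identifiers; replace them with your own values.
print(compat_url("ACCOUNT_ID", "GATEWAY_ID"))
print(json.dumps(compat_body("llama3.1-8b", "What is Cloudflare?"), indent=2))
```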