Date: 2025-08-19

Cerebras offers developers a low-latency solution for AI model inference.

Endpoint

https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cerebras-ai

Prerequisites

When making requests to Cerebras, ensure you have the following:

Your AI Gateway Account ID.

Your AI Gateway gateway name.

An active Cerebras API token.

The name of the Cerebras model you want to use.

Examples

cURL

Example fetch request

curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cerebras/chat/completions \
  --header 'content-type: application/json' \
  --header 'Authorization: Bearer CEREBRAS_TOKEN' \
  --data '{
    "model": "llama3.1-8b",
    "messages": [
      {
        "role": "user",
        "content": "What is Cloudflare?"
      }
    ]
  }'
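The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_chat_request` and `send` are hypothetical, and the URL path, headers, and JSON body are copied from the cURL example above. Building the request is kept separate from sending it so the URL and payload can be inspected first.

```python
import json
import urllib.request

# Base URL taken from the cURL example above.
GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"


def build_chat_request(account_id, gateway_id, token, model, prompt):
    """Build (but do not send) a chat completions request routed through AI Gateway.

    Hypothetical helper: it mirrors the path, headers, and body of the cURL example.
    """
    url = f"{GATEWAY_BASE}/{account_id}/{gateway_id}/cerebras/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To use it, pass your own account ID, gateway name, and Cerebras API token, for example `send(build_chat_request(account_id, gateway_id, token, "llama3.1-8b", "What is Cloudflare?"))`.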

OpenAI-Compatible Endpoint

You can also use the OpenAI-compatible endpoint (/ai-gateway/usage/chat-completion/) to access Cerebras models using the OpenAI API schema. To do so, send your requests to:

https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions

Specify: