Skip to content
Start here

Execute AI model

POST/accounts/{account_id}/ai/run/{model_name}

This endpoint provides users with the capability to run specific AI models on-demand.

By submitting the required input data, users can receive real-time predictions or results generated by the chosen AI model. The endpoint supports various AI model types, ensuring flexibility and adaptability for diverse use cases.

Model specific inputs available in Cloudflare Docs.

Security
API Token

The preferred authorization scheme for interacting with the Cloudflare API. Create a token.

Example:Authorization: Bearer Sn3lZJTBX6kkg7OdcBUAxOO963GEIyGQqnFTOFYY
API Email + API Key

The previous authorization scheme for interacting with the Cloudflare API, used in conjunction with a Global API key.

Example:X-Auth-Email: user@example.com

The previous authorization scheme for interacting with the Cloudflare API. When possible, use API tokens instead of Global API keys.

Example:X-Auth-Key: 144c9defac04969c7bfad8efaa8ea194
Accepted Permissions (at least one required)
Workers AI WriteWorkers AI Read
Path ParametersExpand Collapse
account_id: string
model_name: string
Body ParametersJSONExpand Collapse
body: optional { text } or { prompt, guidance, height, 8 more } or { prompt, lang } or 12 more
One of the following:
TextClassification { text }
text: string

The text that you want to classify

minLength1
TextToImage { prompt, guidance, height, 8 more }
prompt: string

A text description of the image you want to generate

minLength1
guidance: optional number

Controls how closely the generated image should adhere to the prompt; higher values make the image more aligned with the prompt

height: optional number

The height of the generated image in pixels

maximum2048
minimum256
image: optional array of number

For use with img2img tasks. An array of integers that represent the image data constrained to 8-bit unsigned integer values

image_b64: optional string

For use with img2img tasks. A base64-encoded string of the input image

mask: optional array of number

An array representing An array of integers that represent mask image data for inpainting constrained to 8-bit unsigned integer values

negative_prompt: optional string

Text describing elements to avoid in the generated image

num_steps: optional number

The number of diffusion steps; higher values can improve quality but take longer

maximum20
seed: optional number

Random seed for reproducibility of the image generation

strength: optional number

A value between 0 and 1 indicating how strongly to apply the transformation during img2img tasks; lower values make the output closer to the input image

width: optional number

The width of the generated image in pixels

maximum2048
minimum256
TextToSpeech { prompt, lang }
prompt: string

A text description of the audio you want to generate

minLength1
lang: optional string

The speech language (e.g., ‘en’ for English, ‘fr’ for French). Defaults to ‘en’ if not specified

TextEmbeddings { text }
text: string or array of string

The text to embed

One of the following:
string

The text to embed

array of string

Batch of text values to embed

AutomaticSpeechRecognition { audio, source_lang, target_lang }
audio: array of number

An array of integers that represent the audio data constrained to 8-bit unsigned integer values

source_lang: optional string

The language of the recorded audio

target_lang: optional string

The language to translate the transcription into. Currently only English is supported.

ImageClassification { image }
image: array of number

An array of integers that represent the image data constrained to 8-bit unsigned integer values

ObjectDetection { image }
image: optional array of number

An array of integers that represent the image data constrained to 8-bit unsigned integer values

Prompt { prompt, frequency_penalty, lora, 10 more }
prompt: string

The input text prompt for the model to generate a response.

minLength1
frequency_penalty: optional number

Decreases the likelihood of the model repeating the same lines verbatim.

maximum2
minimum-2
lora: optional string

Name of the LoRA (Low-Rank Adaptation) model to fine-tune the base model.

max_tokens: optional number

The maximum number of tokens to generate in the response.

presence_penalty: optional number

Increases the likelihood of the model introducing new topics.

maximum2
minimum-2
raw: optional boolean

If true, a chat template is not applied and you must adhere to the specific model’s expected formatting.

repetition_penalty: optional number

Penalty for repeated tokens; higher values discourage repetition.

maximum2
minimum0
response_format: optional { json_schema, type }
json_schema: optional unknown
type: optional "json_object" or "json_schema"
One of the following:
"json_object"
"json_schema"
seed: optional number

Random seed for reproducibility of the generation.

maximum9999999999
minimum1
stream: optional boolean

If true, the response will be streamed back incrementally using SSE, Server Sent Events.

temperature: optional number

Controls the randomness of the output; higher values produce more random results.

maximum5
minimum0
top_k: optional number

Limits the AI to choose from the top ‘k’ most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises.

maximum50
minimum1
top_p: optional number

Adjusts the creativity of the AI’s responses by controlling how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses.

maximum1
minimum0.001
TextGeneration { messages, frequency_penalty, functions, 11 more }
messages: array of { content, role }

An array of message objects representing the conversation history.

content: string or array of { text, type }

The content of the message as a string.

One of the following:
string

The content of the message as a string.

array of { text, type }

Array of text content parts.

text: optional string

Text content

type: optional string

Type of the content (text)

role: string

The role of the message sender (e.g., ‘user’, ‘assistant’, ‘system’, ‘tool’).

frequency_penalty: optional number

Decreases the likelihood of the model repeating the same lines verbatim.

maximum2
minimum-2
functions: optional array of { code, name }
code: string
name: string
max_tokens: optional number

The maximum number of tokens to generate in the response.

presence_penalty: optional number

Increases the likelihood of the model introducing new topics.

maximum2
minimum-2
raw: optional boolean

If true, a chat template is not applied and you must adhere to the specific model’s expected formatting.

repetition_penalty: optional number

Penalty for repeated tokens; higher values discourage repetition.

maximum2
minimum0
response_format: optional { json_schema, type }
json_schema: optional unknown
type: optional "json_object" or "json_schema"
One of the following:
"json_object"
"json_schema"
seed: optional number

Random seed for reproducibility of the generation.

maximum9999999999
minimum1
stream: optional boolean

If true, the response will be streamed back incrementally using SSE, Server Sent Events.

temperature: optional number

Controls the randomness of the output; higher values produce more random results.

maximum5
minimum0
tools: optional array of { description, name, parameters } or { function, type }

A list of tools available for the assistant to use.

One of the following:
{ description, name, parameters }
description: string

A brief description of what the tool does.

name: string

The name of the tool. More descriptive the better.

parameters: { properties, type, required }

Schema defining the parameters accepted by the tool.

properties: map[ { description, type } ]

Definitions of each parameter.

description: string

A description of the expected parameter.

type: string

The data type of the parameter.

type: string

The type of the parameters object (usually ‘object’).

required: optional array of string

List of required parameter names.

Function { function, type }
function: { description, name, parameters }

Details of the function tool.

description: string

A brief description of what the function does.

name: string

The name of the function.

parameters: { properties, type, required }

Schema defining the parameters accepted by the function.

properties: map[ { description, type } ]

Definitions of each parameter.

description: string

A description of the expected parameter.

type: string

The data type of the parameter.

type: string

The type of the parameters object (usually ‘object’).

required: optional array of string

List of required parameter names.

type: string

Specifies the type of tool (e.g., ‘function’).

top_k: optional number

Limits the AI to choose from the top ‘k’ most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises.

maximum50
minimum1
top_p: optional number

Adjusts the creativity of the AI’s responses by controlling how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses.

maximum1
minimum0.001
Translation { target_lang, text, source_lang }
target_lang: string

The language code to translate the text into (e.g., ‘es’ for Spanish)

text: string

The text to be translated

minLength1
source_lang: optional string

The language code of the source text (e.g., ‘en’ for English). Defaults to ‘en’ if not specified

Summarization { input_text, max_length }
input_text: string

The text that you want the model to summarize

minLength1
max_length: optional number

The maximum length of the generated summary in tokens

ImageToText { image, frequency_penalty, max_tokens, 8 more }
image: array of number

An array of integers that represent the image data constrained to 8-bit unsigned integer values

frequency_penalty: optional number

Decreases the likelihood of the model repeating the same lines verbatim.

max_tokens: optional number

The maximum number of tokens to generate in the response.

presence_penalty: optional number

Increases the likelihood of the model introducing new topics.

prompt: optional string

The input text prompt for the model to generate a response.

raw: optional boolean

If true, a chat template is not applied and you must adhere to the specific model’s expected formatting.

repetition_penalty: optional number

Penalty for repeated tokens; higher values discourage repetition.

seed: optional number

Random seed for reproducibility of the generation.

temperature: optional number

Controls the randomness of the output; higher values produce more random results.

top_k: optional number

Limits the AI to choose from the top ‘k’ most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises.

top_p: optional number

Controls the creativity of the AI’s responses by adjusting how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses.

{ image, prompt, frequency_penalty, 8 more }
image: string

Image in base64 encoded format.

prompt: string

The input text prompt for the model to generate a response.

minLength1
frequency_penalty: optional number

Decreases the likelihood of the model repeating the same lines verbatim.

ignore_eos: optional boolean

Whether to ignore the EOS token and continue generating tokens after the EOS token is generated.

max_tokens: optional number

The maximum number of tokens to generate in the response.

presence_penalty: optional number

Increases the likelihood of the model introducing new topics.

repetition_penalty: optional number

Penalty for repeated tokens; higher values discourage repetition.

seed: optional number

Random seed for reproducibility of the generation.

temperature: optional number

Controls the randomness of the output; higher values produce more random results.

top_k: optional number

Limits the AI to choose from the top ‘k’ most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises.

top_p: optional number

Controls the creativity of the AI’s responses by adjusting how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses.

ImageTextToText { image, messages, frequency_penalty, 8 more }
image: string

Image in base64 encoded format.

messages: array of { content, role }

An array of message objects representing the conversation history.

content: string or array of { type, image_url, text }

The content of the message as a string.

One of the following:
string

The content of the message as a string.

array of { type, image_url, text }

Array of content parts (text, image_url, etc.).

type: string

Type of the content part (e.g. ‘text’, ‘image_url’).

image_url: optional { url }

Image URL object (when type is ‘image_url’).

url: string

Image URI with data (e.g. data:image/jpeg;base64,/9j/…).

text: optional string

Text content (when type is ‘text’).

role: string

The role of the message sender (e.g., ‘user’, ‘assistant’, ‘system’, ‘tool’).

frequency_penalty: optional number

Decreases the likelihood of the model repeating the same lines verbatim.

ignore_eos: optional boolean

Whether to ignore the EOS token and continue generating tokens after the EOS token is generated.

max_tokens: optional number

The maximum number of tokens to generate in the response.

presence_penalty: optional number

Increases the likelihood of the model introducing new topics.

repetition_penalty: optional number

Penalty for repeated tokens; higher values discourage repetition.

seed: optional number

Random seed for reproducibility of the generation.

temperature: optional number

Controls the randomness of the output; higher values produce more random results.

top_k: optional number

Limits the AI to choose from the top ‘k’ most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises.

top_p: optional number

Controls the creativity of the AI’s responses by adjusting how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses.

MultimodalEmbeddings { image, text }
image: optional string

Image in base64 encoded format.

minLength1
text: optional array of string
ReturnsExpand Collapse
result: optional array of { label, score } or string or { audio } or 12 more

An array of classification results for the input text

One of the following:
TextClassification = array of { label, score }

An array of classification results for the input text

label: optional string

The classification label assigned to the text (e.g., ‘POSITIVE’ or ‘NEGATIVE’)

score: optional number

Confidence score indicating the likelihood that the text belongs to the specified label

TextToImage = string

The generated image in PNG format

Audio { audio }
audio: optional string

The generated audio in MP3 format, base64-encoded

string

The generated audio in MP3 format

TextEmbeddings { data, shape }
data: optional array of array of number

Embeddings of the requested text values

shape: optional array of number
AutomaticSpeechRecognition { text, vtt, word_count, words }
text: string

The transcription

vtt: optional string
word_count: optional number
words: optional array of { end, start, word }
end: optional number

The ending second when the word completes

start: optional number

The second this word begins in the recording

word: optional string
ImageClassification = array of { label, score }
label: optional string

The predicted category or class for the input image based on analysis

score: optional number

A confidence value, between 0 and 1, indicating how certain the model is about the predicted label

ObjectDetection = array of { box, label, score }

An array of detected objects within the input image

box: optional { xmax, xmin, ymax, ymin }

Coordinates defining the bounding box around the detected object

xmax: optional number

The x-coordinate of the bottom-right corner of the bounding box

xmin: optional number

The x-coordinate of the top-left corner of the bounding box

ymax: optional number

The y-coordinate of the bottom-right corner of the bounding box

ymin: optional number

The y-coordinate of the top-left corner of the bounding box

label: optional string

The class label or name of the detected object

score: optional number

Confidence score indicating the likelihood that the detection is correct

{ response, tool_calls, usage }
response: string

The generated text response from the model

tool_calls: optional array of { arguments, name }

An array of tool calls requests made during the response generation

arguments: optional unknown

The arguments passed to be passed to the tool call request

name: optional string

The name of the tool to be called

usage: optional { completion_tokens, prompt_tokens, total_tokens }

Usage statistics for the inference request

completion_tokens: optional number

Total number of tokens in output

prompt_tokens: optional number

Total number of tokens in input

total_tokens: optional number

Total number of input and output tokens

string
Translation { translated_text }
translated_text: optional string

The translated text in the target language

Summarization { summary }
summary: optional string

The summarized version of the input text

ImageToText { description }
description: optional string
ImageTextToText { description }
description: optional string
MultimodalEmbeddings { data, shape }
data: optional array of array of number
shape: optional array of number

Execute AI model

curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai/run/$MODEL_NAME \
    -X POST \
    -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"
{
  "result": [
    {
      "label": "label",
      "score": 0
    }
  ]
}
Returns Examples
{
  "result": [
    {
      "label": "label",
      "score": 0
    }
  ]
}