
glm-4.7-flash

Text Generation · Zhipu AI · Hosted

GLM-4.7-Flash is a fast and efficient multilingual text generation model with a 131,072 token context window. Optimized for dialogue, instruction-following, and multi-turn tool calling across 100+ languages.

Model Info
Context Window: 131,072 tokens
Function Calling: Yes
Reasoning: Yes
Unit Pricing: $0.06 per M input tokens, $0.40 per M output tokens

Playground

Try out this model with the Workers AI LLM Playground. It requires no setup or authentication and is an instant way to preview and test a model directly in the browser.

Launch the LLM Playground

Usage

TypeScript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const messages = [
      { role: "system", content: "You are a friendly assistant" },
      {
        role: "user",
        content: "What is the origin of the phrase Hello, World",
      },
    ];

    const stream = await env.AI.run("@cf/zai-org/glm-4.7-flash", {
      messages,
      stream: true,
    });

    return new Response(stream, {
      headers: { "content-type": "text/event-stream" },
    });
  },
} satisfies ExportedHandler<Env>;
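With stream: true the response body is a server-sent-event stream. A minimal sketch of collecting the generated text from such a stream, assuming each event is a `data: `-prefixed line carrying either a Workers AI-style `response` field or an OpenAI-style `choices[0].delta.content` field, terminated by `[DONE]`; verify the chunk shape against actual responses before relying on it:

```typescript
// Sketch: accumulate text from an SSE response body (already decoded to a string).
// The chunk field names below are assumptions; check them against real output.
export function extractDeltas(sse: string): string {
  let out = "";
  for (const line of sse.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break;
    try {
      const chunk = JSON.parse(payload);
      // Prefer OpenAI-style delta content, fall back to a flat `response` field.
      out += chunk.choices?.[0]?.delta?.content ?? chunk.response ?? "";
    } catch {
      // Ignore unparsable fragments; a real reader would buffer partial JSON
      // across network chunks before parsing.
    }
  }
  return out;
}
```

In a Worker you would pipe the stream through a TextDecoderStream and feed each decoded chunk into a buffered variant of this parser rather than a single string.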

Parameters

Synchronous — Send a request and receive a complete response
prompt (string, required, minLength: 1): The input text prompt for the model to generate a response.
model (string): ID of the model to use (e.g. '@cf/zai-org/glm-4.7-flash').
frequency_penalty (number | null): Penalizes new tokens based on their existing frequency in the text so far.
logit_bias (object | null): Modify the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values from -100 to 100.
logprobs (boolean | null): Whether to return log probabilities of the output tokens.
top_logprobs (integer | null): How many top log probabilities to return at each token position (0-20). Requires logprobs=true.
max_tokens (integer | null): Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate.
max_completion_tokens (integer | null): An upper bound for the number of tokens that can be generated for a completion.
metadata (object | null): Set of 16 key-value pairs that can be attached to the object.
modalities (array | null): Output types requested from the model (e.g. ['text'] or ['text', 'audio']).
n (integer | null): How many chat completion choices to generate for each input message.
parallel_tool_calls (boolean, default: true): Whether to enable parallel function calling during tool use.
presence_penalty (number | null): Penalizes new tokens based on whether they appear in the text so far.
reasoning_effort (string | null): Constrains effort on reasoning for reasoning models.
seed (integer | null): If specified, the system will make a best effort to sample deterministically.
service_tier (string | null): Specifies the processing type used for serving the request.
store (boolean | null): Whether to store the output for model distillation / evals.
stream (boolean | null): If true, partial message deltas will be sent as server-sent events.
temperature (number | null): Sampling temperature between 0 and 2.
top_p (number | null): Nucleus sampling: considers the results of the tokens with top_p probability mass.
user (string): A unique identifier representing your end-user, for abuse monitoring.
Streaming — Send a request with `stream: true` and receive server-sent events
prompt (string, required, minLength: 1): The input text prompt for the model to generate a response.
model (string): ID of the model to use (e.g. '@cf/zai-org/glm-4.7-flash').
frequency_penalty (number | null): Penalizes new tokens based on their existing frequency in the text so far.
logit_bias (object | null): Modify the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values from -100 to 100.
logprobs (boolean | null): Whether to return log probabilities of the output tokens.
top_logprobs (integer | null): How many top log probabilities to return at each token position (0-20). Requires logprobs=true.
max_tokens (integer | null): Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate.
max_completion_tokens (integer | null): An upper bound for the number of tokens that can be generated for a completion.
metadata (object | null): Set of 16 key-value pairs that can be attached to the object.
modalities (array | null): Output types requested from the model (e.g. ['text'] or ['text', 'audio']).
n (integer | null): How many chat completion choices to generate for each input message.
parallel_tool_calls (boolean, default: true): Whether to enable parallel function calling during tool use.
presence_penalty (number | null): Penalizes new tokens based on whether they appear in the text so far.
reasoning_effort (string | null): Constrains effort on reasoning for reasoning models.
seed (integer | null): If specified, the system will make a best effort to sample deterministically.
service_tier (string | null): Specifies the processing type used for serving the request.
store (boolean | null): Whether to store the output for model distillation / evals.
stream (boolean | null): If true, partial message deltas will be sent as server-sent events.
temperature (number | null): Sampling temperature between 0 and 2.
top_p (number | null): Nucleus sampling: considers the results of the tokens with top_p probability mass.
user (string): A unique identifier representing your end-user, for abuse monitoring.
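The parameters above compose into a single request body. A sketch of a body combining several of them, including a structured-output request via response_format; all values here are illustrative, and the json_schema name "summary" is made up for the example:

```typescript
// Illustrative request body using parameters from the list above.
const body = {
  prompt: "Summarize the plot of Hamlet in one sentence.",
  temperature: 0.3,           // range 0 to 2; lower values are more deterministic
  max_completion_tokens: 128, // upper bound on generated tokens
  seed: 42,                   // best-effort deterministic sampling
  stop: ["\n\n"],             // up to 4 stop sequences
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "summary", // hypothetical schema name for this example
      schema: {
        type: "object",
        properties: { summary: { type: "string" } },
        required: ["summary"],
      },
    },
  },
};
```

Such a body would be passed as the second argument to env.AI.run, in place of the messages-only object shown in the Usage section.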

API Schemas (Raw)

Synchronous — Send a request and receive a complete response
{
"title": "Prompt",
"properties": {
"prompt": {
"type": "string",
"minLength": 1,
"description": "The input text prompt for the model to generate a response."
},
"model": {
"type": "string",
"description": "ID of the model to use (e.g. '@cf/zai-org/glm-4.7-flash')."
},
"audio": {
"anyOf": [
{
"type": "object",
"description": "Parameters for audio output. Required when modalities includes 'audio'.",
"properties": {
"voice": {
"oneOf": [
{
"type": "string"
},
{
"type": "object",
"properties": {
"id": {
"type": "string"
}
},
"required": [
"id"
]
}
]
},
"format": {
"type": "string",
"enum": [
"wav",
"aac",
"mp3",
"flac",
"opus",
"pcm16"
]
}
},
"required": [
"voice",
"format"
]
}
]
},
"frequency_penalty": {
"anyOf": [
{
"type": "number",
"minimum": -2,
"maximum": 2
},
{
"type": "null"
}
],
"default": 0,
"description": "Penalizes new tokens based on their existing frequency in the text so far."
},
"logit_bias": {
"anyOf": [
{
"type": "object"
},
{
"type": "null"
}
],
"description": "Modify the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values from -100 to 100."
},
"logprobs": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to return log probabilities of the output tokens."
},
"top_logprobs": {
"anyOf": [
{
"type": "integer",
"minimum": 0,
"maximum": 20
},
{
"type": "null"
}
],
"description": "How many top log probabilities to return at each token position (0-20). Requires logprobs=true."
},
"max_tokens": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"description": "Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate."
},
"max_completion_tokens": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"description": "An upper bound for the number of tokens that can be generated for a completion."
},
"metadata": {
"anyOf": [
{
"type": "object"
},
{
"type": "null"
}
],
"description": "Set of 16 key-value pairs that can be attached to the object."
},
"modalities": {
"anyOf": [
{
"type": "array",
"items": {
"type": "string",
"enum": [
"text",
"audio"
]
}
},
{
"type": "null"
}
],
"description": "Output types requested from the model (e.g. ['text'] or ['text', 'audio'])."
},
"n": {
"anyOf": [
{
"type": "integer",
"minimum": 1,
"maximum": 128
},
{
"type": "null"
}
],
"default": 1,
"description": "How many chat completion choices to generate for each input message."
},
"parallel_tool_calls": {
"type": "boolean",
"default": true,
"description": "Whether to enable parallel function calling during tool use."
},
"prediction": {
"anyOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"content"
]
},
"content": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"text"
]
},
"text": {
"type": "string"
}
},
"required": [
"type",
"text"
]
}
}
]
}
},
"required": [
"type",
"content"
]
}
]
},
"presence_penalty": {
"anyOf": [
{
"type": "number",
"minimum": -2,
"maximum": 2
},
{
"type": "null"
}
],
"default": 0,
"description": "Penalizes new tokens based on whether they appear in the text so far."
},
"reasoning_effort": {
"anyOf": [
{
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
{
"type": "null"
}
],
"description": "Constrains effort on reasoning for reasoning models (o1, o3-mini, etc.)."
},
"chat_template_kwargs": {
"type": "object",
"properties": {
"enable_thinking": {
"type": "boolean",
"default": true,
"description": "Whether to enable reasoning, enabled by default."
},
"clear_thinking": {
"type": "boolean",
"default": false,
"description": "If false, preserves reasoning context between turns."
}
}
},
"response_format": {
"anyOf": [
{
"description": "Specifies the format the model must output.",
"oneOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"text"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"json_object"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"json_schema"
]
},
"json_schema": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"schema": {
"type": "object"
},
"strict": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
]
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"json_schema"
]
}
]
}
]
},
"seed": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"description": "If specified, the system will make a best effort to sample deterministically."
},
"service_tier": {
"anyOf": [
{
"type": "string",
"enum": [
"auto",
"default",
"flex",
"scale",
"priority"
]
},
{
"type": "null"
}
],
"default": "auto",
"description": "Specifies the processing type used for serving the request."
},
"stop": {
"description": "Up to 4 sequences where the API will stop generating further tokens.",
"anyOf": [
{
"type": "null"
},
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"maxItems": 4
}
]
},
"store": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to store the output for model distillation / evals."
},
"stream": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "If true, partial message deltas will be sent as server-sent events."
},
"stream_options": {
"anyOf": [
{
"type": "object",
"properties": {
"include_usage": {
"type": "boolean"
},
"include_obfuscation": {
"type": "boolean"
}
}
}
]
},
"temperature": {
"anyOf": [
{
"type": "number",
"minimum": 0,
"maximum": 2
},
{
"type": "null"
}
],
"default": 1,
"description": "Sampling temperature between 0 and 2."
},
"tool_choice": {
"anyOf": [
{
"description": "Controls which (if any) tool is called by the model. 'none' = no tools, 'auto' = model decides, 'required' = must call a tool.",
"oneOf": [
{
"type": "string",
"enum": [
"none",
"auto",
"required"
]
},
{
"type": "object",
"description": "Force a specific function tool.",
"properties": {
"type": {
"type": "string",
"enum": [
"function"
]
},
"function": {
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"function"
]
},
{
"type": "object",
"description": "Force a specific custom tool.",
"properties": {
"type": {
"type": "string",
"enum": [
"custom"
]
},
"custom": {
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"custom"
]
},
{
"type": "object",
"description": "Constrain to an allowed subset of tools.",
"properties": {
"type": {
"type": "string",
"enum": [
"allowed_tools"
]
},
"allowed_tools": {
"type": "object",
"properties": {
"mode": {
"type": "string",
"enum": [
"auto",
"required"
]
},
"tools": {
"type": "array",
"items": {
"type": "object"
}
}
},
"required": [
"mode",
"tools"
]
}
},
"required": [
"type",
"allowed_tools"
]
}
]
}
]
},
"tools": {
"type": "array",
"description": "A list of tools the model may call.",
"items": {
"oneOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"function"
]
},
"function": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the function to be called."
},
"description": {
"type": "string",
"description": "A description of what the function does."
},
"parameters": {
"type": "object",
"description": "The parameters the function accepts, described as a JSON Schema object."
},
"strict": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to enable strict schema adherence."
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"function"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"custom"
]
},
"custom": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"format": {
"oneOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"text"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"grammar"
]
},
"grammar": {
"type": "object",
"properties": {
"definition": {
"type": "string"
},
"syntax": {
"type": "string",
"enum": [
"lark",
"regex"
]
}
},
"required": [
"definition",
"syntax"
]
}
},
"required": [
"type",
"grammar"
]
}
]
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"custom"
]
}
]
}
},
"top_p": {
"anyOf": [
{
"type": "number",
"minimum": 0,
"maximum": 1
},
{
"type": "null"
}
],
"default": 1,
"description": "Nucleus sampling: considers the results of the tokens with top_p probability mass."
},
"user": {
"type": "string",
"description": "A unique identifier representing your end-user, for abuse monitoring."
},
"web_search_options": {
"anyOf": [
{
"type": "object",
"description": "Options for the web search tool (when using built-in web search).",
"properties": {
"search_context_size": {
"type": "string",
"enum": [
"low",
"medium",
"high"
],
"default": "medium"
},
"user_location": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"approximate"
]
},
"approximate": {
"type": "object",
"properties": {
"city": {
"type": "string"
},
"country": {
"type": "string"
},
"region": {
"type": "string"
},
"timezone": {
"type": "string"
}
}
}
},
"required": [
"type",
"approximate"
]
}
}
}
]
},
"function_call": {
"anyOf": [
{
"type": "string",
"enum": [
"none",
"auto"
]
},
{
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
]
},
"functions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the function to be called."
},
"description": {
"type": "string",
"description": "A description of what the function does."
},
"parameters": {
"type": "object",
"description": "The parameters the function accepts, described as a JSON Schema object."
},
"strict": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to enable strict schema adherence."
}
},
"required": [
"name"
]
},
"minItems": 1,
"maxItems": 128
}
},
"required": [
"prompt"
]
}
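The tools and tool_choice fields in the schema above accept OpenAI-style function definitions. A sketch of one function tool plus a tool_choice that forces it, following the "function" branches of the schema; the get_weather tool is a made-up example, not a built-in:

```typescript
// A function tool matching the "function" branch of the tools schema above.
const getWeather = {
  type: "function",
  function: {
    name: "get_weather", // hypothetical tool for this example
    description: "Get the current weather for a city.",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// tool_choice forcing that specific function, per the schema's
// { type: "function", function: { name } } variant.
const request = {
  prompt: "What's the weather in Lisbon?",
  tools: [getWeather],
  tool_choice: { type: "function", function: { name: "get_weather" } },
  parallel_tool_calls: false, // request one tool call at a time
};
```

The model would then respond with a tool call whose arguments conform to the parameters schema, which your code executes before sending the result back in a follow-up turn.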
Streaming — Send a request with `stream: true` and receive server-sent events
{
"title": "Prompt",
"properties": {
"prompt": {
"type": "string",
"minLength": 1,
"description": "The input text prompt for the model to generate a response."
},
"model": {
"type": "string",
"description": "ID of the model to use (e.g. '@cf/zai-org/glm-4.7-flash')."
},
"audio": {
"anyOf": [
{
"type": "object",
"description": "Parameters for audio output. Required when modalities includes 'audio'.",
"properties": {
"voice": {
"oneOf": [
{
"type": "string"
},
{
"type": "object",
"properties": {
"id": {
"type": "string"
}
},
"required": [
"id"
]
}
]
},
"format": {
"type": "string",
"enum": [
"wav",
"aac",
"mp3",
"flac",
"opus",
"pcm16"
]
}
},
"required": [
"voice",
"format"
]
}
]
},
"frequency_penalty": {
"anyOf": [
{
"type": "number",
"minimum": -2,
"maximum": 2
},
{
"type": "null"
}
],
"default": 0,
"description": "Penalizes new tokens based on their existing frequency in the text so far."
},
"logit_bias": {
"anyOf": [
{
"type": "object"
},
{
"type": "null"
}
],
"description": "Modify the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values from -100 to 100."
},
"logprobs": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to return log probabilities of the output tokens."
},
"top_logprobs": {
"anyOf": [
{
"type": "integer",
"minimum": 0,
"maximum": 20
},
{
"type": "null"
}
],
"description": "How many top log probabilities to return at each token position (0-20). Requires logprobs=true."
},
"max_tokens": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"description": "Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate."
},
"max_completion_tokens": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"description": "An upper bound for the number of tokens that can be generated for a completion."
},
"metadata": {
"anyOf": [
{
"type": "object"
},
{
"type": "null"
}
],
"description": "Set of 16 key-value pairs that can be attached to the object."
},
"modalities": {
"anyOf": [
{
"type": "array",
"items": {
"type": "string",
"enum": [
"text",
"audio"
]
}
},
{
"type": "null"
}
],
"description": "Output types requested from the model (e.g. ['text'] or ['text', 'audio'])."
},
"n": {
"anyOf": [
{
"type": "integer",
"minimum": 1,
"maximum": 128
},
{
"type": "null"
}
],
"default": 1,
"description": "How many chat completion choices to generate for each input message."
},
"parallel_tool_calls": {
"type": "boolean",
"default": true,
"description": "Whether to enable parallel function calling during tool use."
},
"prediction": {
"anyOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"content"
]
},
"content": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"text"
]
},
"text": {
"type": "string"
}
},
"required": [
"type",
"text"
]
}
}
]
}
},
"required": [
"type",
"content"
]
}
]
},
"presence_penalty": {
"anyOf": [
{
"type": "number",
"minimum": -2,
"maximum": 2
},
{
"type": "null"
}
],
"default": 0,
"description": "Penalizes new tokens based on whether they appear in the text so far."
},
"reasoning_effort": {
"anyOf": [
{
"type": "string",
"enum": [
"low",
"medium",
"high"
]
},
{
"type": "null"
}
],
"description": "Constrains effort on reasoning for reasoning models (o1, o3-mini, etc.)."
},
"chat_template_kwargs": {
"type": "object",
"properties": {
"enable_thinking": {
"type": "boolean",
"default": true,
"description": "Whether to enable reasoning, enabled by default."
},
"clear_thinking": {
"type": "boolean",
"default": false,
"description": "If false, preserves reasoning context between turns."
}
}
},
"response_format": {
"anyOf": [
{
"description": "Specifies the format the model must output.",
"oneOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"text"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"json_object"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"json_schema"
]
},
"json_schema": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"schema": {
"type": "object"
},
"strict": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
]
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"json_schema"
]
}
]
}
]
},
"seed": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"description": "If specified, the system will make a best effort to sample deterministically."
},
"service_tier": {
"anyOf": [
{
"type": "string",
"enum": [
"auto",
"default",
"flex",
"scale",
"priority"
]
},
{
"type": "null"
}
],
"default": "auto",
"description": "Specifies the processing type used for serving the request."
},
"stop": {
"description": "Up to 4 sequences where the API will stop generating further tokens.",
"anyOf": [
{
"type": "null"
},
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"maxItems": 4
}
]
},
"store": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to store the output for model distillation / evals."
},
"stream": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "If true, partial message deltas will be sent as server-sent events."
},
"stream_options": {
"anyOf": [
{
"type": "object",
"properties": {
"include_usage": {
"type": "boolean"
},
"include_obfuscation": {
"type": "boolean"
}
}
}
]
},
"temperature": {
"anyOf": [
{
"type": "number",
"minimum": 0,
"maximum": 2
},
{
"type": "null"
}
],
"default": 1,
"description": "Sampling temperature between 0 and 2."
},
"tool_choice": {
"anyOf": [
{
"description": "Controls which (if any) tool is called by the model. 'none' = no tools, 'auto' = model decides, 'required' = must call a tool.",
"oneOf": [
{
"type": "string",
"enum": [
"none",
"auto",
"required"
]
},
{
"type": "object",
"description": "Force a specific function tool.",
"properties": {
"type": {
"type": "string",
"enum": [
"function"
]
},
"function": {
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"function"
]
},
{
"type": "object",
"description": "Force a specific custom tool.",
"properties": {
"type": {
"type": "string",
"enum": [
"custom"
]
},
"custom": {
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"custom"
]
},
{
"type": "object",
"description": "Constrain to an allowed subset of tools.",
"properties": {
"type": {
"type": "string",
"enum": [
"allowed_tools"
]
},
"allowed_tools": {
"type": "object",
"properties": {
"mode": {
"type": "string",
"enum": [
"auto",
"required"
]
},
"tools": {
"type": "array",
"items": {
"type": "object"
}
}
},
"required": [
"mode",
"tools"
]
}
},
"required": [
"type",
"allowed_tools"
]
}
]
}
]
},
"tools": {
"type": "array",
"description": "A list of tools the model may call.",
"items": {
"oneOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"function"
]
},
"function": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the function to be called."
},
"description": {
"type": "string",
"description": "A description of what the function does."
},
"parameters": {
"type": "object",
"description": "The parameters the function accepts, described as a JSON Schema object."
},
"strict": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to enable strict schema adherence."
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"function"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"custom"
]
},
"custom": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"format": {
"oneOf": [
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"text"
]
}
},
"required": [
"type"
]
},
{
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"grammar"
]
},
"grammar": {
"type": "object",
"properties": {
"definition": {
"type": "string"
},
"syntax": {
"type": "string",
"enum": [
"lark",
"regex"
]
}
},
"required": [
"definition",
"syntax"
]
}
},
"required": [
"type",
"grammar"
]
}
]
}
},
"required": [
"name"
]
}
},
"required": [
"type",
"custom"
]
}
]
}
},
"top_p": {
"anyOf": [
{
"type": "number",
"minimum": 0,
"maximum": 1
},
{
"type": "null"
}
],
"default": 1,
"description": "Nucleus sampling: considers the results of the tokens with top_p probability mass."
},
"user": {
"type": "string",
"description": "A unique identifier representing your end-user, for abuse monitoring."
},
"web_search_options": {
"anyOf": [
{
"type": "object",
"description": "Options for the web search tool (when using built-in web search).",
"properties": {
"search_context_size": {
"type": "string",
"enum": [
"low",
"medium",
"high"
],
"default": "medium"
},
"user_location": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": [
"approximate"
]
},
"approximate": {
"type": "object",
"properties": {
"city": {
"type": "string"
},
"country": {
"type": "string"
},
"region": {
"type": "string"
},
"timezone": {
"type": "string"
}
}
}
},
"required": [
"type",
"approximate"
]
}
}
}
]
},
"function_call": {
"anyOf": [
{
"type": "string",
"enum": [
"none",
"auto"
]
},
{
"type": "object",
"properties": {
"name": {
"type": "string"
}
},
"required": [
"name"
]
}
]
},
"functions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The name of the function to be called."
},
"description": {
"type": "string",
"description": "A description of what the function does."
},
"parameters": {
"type": "object",
"description": "The parameters the function accepts, described as a JSON Schema object."
},
"strict": {
"anyOf": [
{
"type": "boolean"
},
{
"type": "null"
}
],
"default": false,
"description": "Whether to enable strict schema adherence."
}
},
"required": [
"name"
]
},
"minItems": 1,
"maxItems": 128
}
},
"required": [
"prompt"
]
}
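Several of the schema's numeric bounds (temperature 0-2, top_p 0-1, n 1-128, at most 4 stop sequences) are easy to trip client-side. A sketch of a pre-flight check covering just those bounds; this is not a full JSON Schema validator, only the common tripwires:

```typescript
// Minimal client-side check of a few numeric bounds from the schemas above.
export function checkBounds(body: Record<string, unknown>): string[] {
  const errors: string[] = [];
  // A field passes if absent/non-numeric (left to the server) or within range.
  const inRange = (v: unknown, lo: number, hi: number) =>
    typeof v !== "number" || (v >= lo && v <= hi);

  if (!inRange(body.temperature, 0, 2)) errors.push("temperature must be in 0..2");
  if (!inRange(body.top_p, 0, 1)) errors.push("top_p must be in 0..1");
  if (!inRange(body.n, 1, 128)) errors.push("n must be in 1..128");
  if (Array.isArray(body.stop) && body.stop.length > 4)
    errors.push("stop allows at most 4 sequences");
  return errors;
}
```

Running this before env.AI.run turns a server-side 400 into an immediate, descriptive local error.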