Cloudflare Docs

bge-base-en-v1.5

Text Embeddings (baai)
@cf/baai/bge-base-en-v1.5

BAAI General Embedding (BGE) base model that transforms any given text into a 768-dimensional vector.

Model Info
Maximum Input Tokens: 512
Output Dimensions: 768
Batch: Yes
Unit Pricing: $0.067 per M input tokens

Usage

Workers - TypeScript

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    // Can be a string or an array of strings
    const stories = [
      "This is a story about an orange cloud",
      "This is a story about a llama",
      "This is a story about a hugging emoji",
    ];

    const embeddings = await env.AI.run(
      "@cf/baai/bge-base-en-v1.5",
      {
        text: stories,
      }
    );

    return Response.json(embeddings);
  },
} satisfies ExportedHandler<Env>;

Python

import os
import requests

ACCOUNT_ID = "your-account-id"
AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

stories = [
    'This is a story about an orange cloud',
    'This is a story about a llama',
    'This is a story about a hugging emoji'
]

response = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/baai/bge-base-en-v1.5",
    headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
    json={"text": stories}
)
print(response.json())

curl

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/baai/bge-base-en-v1.5 \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -d '{ "text": ["This is a story about an orange cloud", "This is a story about a llama", "This is a story about a hugging emoji"] }'

Parameters

* indicates a required field

Input

  • 0 object

    • text * one of

      • 0 string min 1

        The text to embed

      • 1 array

        Batch of text values to embed

        • items string min 1

          The text to embed

    • pooling string default mean

      The pooling method used in the embedding process. `cls` pooling generates more accurate embeddings on larger inputs; however, embeddings created with `cls` pooling are not compatible with embeddings generated with `mean` pooling. The default remains `mean` to avoid a breaking change, but `cls` pooling is strongly recommended for better accuracy.

  • 1 object

    • requests * array

      Batch of embedding requests to run using the async queue

      • items object

        • text * one of

          • 0 string min 1

            The text to embed

          • 1 array

            Batch of text values to embed

            • items string min 1

              The text to embed

        • pooling string default mean

          The pooling method used in the embedding process. `cls` pooling generates more accurate embeddings on larger inputs; however, embeddings created with `cls` pooling are not compatible with embeddings generated with `mean` pooling. The default remains `mean` to avoid a breaking change, but `cls` pooling is strongly recommended for better accuracy.
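
The two input shapes listed above can be assembled with a small helper. The following is a minimal Python sketch, not part of any official client: `build_sync_payload` and `build_async_payload` are hypothetical names, and the limit of 100 items per array is taken from the `maxItems` value in the API schema on this page. The checks only mirror the constraints listed here (non-empty strings, `mean` or `cls` pooling); the service may enforce more.

```python
from typing import Any, Dict, List, Union

TextInput = Union[str, List[str]]
POOLING_METHODS = ("mean", "cls")  # enum values listed in the schema
MAX_BATCH_ITEMS = 100  # `maxItems` for an array of text values


def build_sync_payload(text: TextInput, pooling: str = "mean") -> Dict[str, Any]:
    """Build the first input shape: {"text": ..., "pooling": ...}."""
    if pooling not in POOLING_METHODS:
        raise ValueError(f"pooling must be one of {POOLING_METHODS}")
    items = [text] if isinstance(text, str) else list(text)
    if len(items) > MAX_BATCH_ITEMS:
        raise ValueError(f"at most {MAX_BATCH_ITEMS} text values per array")
    for item in items:
        # Every text value must be a string of length >= 1.
        if not isinstance(item, str) or len(item) < 1:
            raise ValueError("each text value must be a non-empty string")
    return {"text": text, "pooling": pooling}


def build_async_payload(requests: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build the second input shape: {"requests": [...]}."""
    return {"requests": list(requests)}


# Example: one synchronous payload and one async batch wrapping it.
sync_payload = build_sync_payload(["orange cloud", "llama"], pooling="cls")
async_payload = build_async_payload([sync_payload])
```

Either dictionary can then be passed as the JSON body of the request shown in the Usage section. Because `cls` and `mean` embeddings are not compatible, it is safest to pick one pooling method per index and keep it fixed.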

Output

  • 0 object

    • shape array

      • items number

    • data array

      Embeddings of the requested text values

      • items array

        Floating point embedding representation shaped by the embedding model

        • items number

    • pooling string

      The pooling method used in the embedding process.

  • Async response object

    • request_id string

      The async request id that can be used to obtain the results.
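
To illustrate how the synchronous output fields relate, here is a minimal Python sketch that checks `shape` against `data` and compares two embeddings by cosine similarity. It assumes `shape` is `[number_of_embeddings, dimensions]`, which this page does not state explicitly, and it uses a hand-written sample object with 3-dimensional vectors rather than real 768-dimensional model output. `check_shape` and `cosine_similarity` are hypothetical helper names.

```python
import math
from typing import Any, Dict, Sequence


def check_shape(output: Dict[str, Any]) -> bool:
    """Return True if `data` has the row and column counts given in `shape`."""
    rows, dims = output["shape"]  # assumed layout: [count, dimensions]
    data = output["data"]
    return len(data) == rows and all(len(vec) == dims for vec in data)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-written stand-in for a synchronous response (not real model output).
sample_output = {
    "shape": [2, 3],
    "data": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "pooling": "mean",
}

if check_shape(sample_output):
    first, second = sample_output["data"]
    similarity = cosine_similarity(first, second)
```

Similarity scores are only meaningful between embeddings produced with the same `pooling` value, so the `pooling` field in the output is worth storing alongside the vectors.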

API Schemas

The following schemas are based on JSON Schema

{
  "type": "object",
  "oneOf": [
    {
      "properties": {
        "text": {
          "oneOf": [
            {
              "type": "string",
              "description": "The text to embed",
              "minLength": 1
            },
            {
              "type": "array",
              "description": "Batch of text values to embed",
              "items": {
                "type": "string",
                "description": "The text to embed",
                "minLength": 1
              },
              "maxItems": 100
            }
          ]
        },
        "pooling": {
          "type": "string",
          "enum": [
            "mean",
            "cls"
          ],
          "default": "mean",
          "description": "The pooling method used in the embedding process. `cls` pooling will generate more accurate embeddings on larger inputs - however, embeddings created with cls pooling are not compatible with embeddings generated with mean pooling. The default pooling method is `mean` in order for this to not be a breaking change, but we highly suggest using the new `cls` pooling for better accuracy."
        }
      },
      "required": [
        "text"
      ]
    },
    {
      "properties": {
        "requests": {
          "type": "array",
          "description": "Batch of the embeddings requests to run using async-queue",
          "items": {
            "properties": {
              "text": {
                "oneOf": [
                  {
                    "type": "string",
                    "description": "The text to embed",
                    "minLength": 1
                  },
                  {
                    "type": "array",
                    "description": "Batch of text values to embed",
                    "items": {
                      "type": "string",
                      "description": "The text to embed",
                      "minLength": 1
                    },
                    "maxItems": 100
                  }
                ]
              },
              "pooling": {
                "type": "string",
                "enum": [
                  "mean",
                  "cls"
                ],
                "default": "mean",
                "description": "The pooling method used in the embedding process. `cls` pooling will generate more accurate embeddings on larger inputs - however, embeddings created with cls pooling are not compatible with embeddings generated with mean pooling. The default pooling method is `mean` in order for this to not be a breaking change, but we highly suggest using the new `cls` pooling for better accuracy."
              }
            },
            "required": [
              "text"
            ]
          }
        }
      },
      "required": [
        "requests"
      ]
    }
  ]
}
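
Because the schema caps an array of text values at 100 items (`maxItems`), a larger collection has to be split across several requests. The following is a minimal Python sketch of that splitting; `chunk_texts` is a hypothetical helper, the sample strings are placeholders, and no request is actually sent.

```python
from typing import Iterator, List

MAX_ITEMS = 100  # `maxItems` from the input schema


def chunk_texts(texts: List[str], max_items: int = MAX_ITEMS) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `max_items` entries."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    for start in range(0, len(texts), max_items):
        yield texts[start:start + max_items]


# 250 placeholder strings become three payloads of 100, 100, and 50 items.
texts = [f"story {i}" for i in range(250)]
payloads = [{"text": chunk} for chunk in chunk_texts(texts)]
```

Each entry in `payloads` matches the first input shape and could be sent on its own, or the entries could be wrapped together under `requests` to use the second, asynchronous shape.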