bge-m3

Text Embeddings • BAAI

Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.

Model Info
Context Window	60,000 tokens
Unit Pricing	$0.012 per M input tokens

Usage

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {

    // Can be a string or an array of strings
    const stories = [
      "This is a story about an orange cloud",
      "This is a story about a llama",
      "This is a story about a hugging emoji",
    ];

    const embeddings = await env.AI.run(
      "@cf/baai/bge-m3",
      {
        text: stories,
      }
    );

    return Response.json(embeddings);
  },
} satisfies ExportedHandler<Env>;

import os
import requests


ACCOUNT_ID = "your-account-id"
AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

stories = [
  'This is a story about an orange cloud',
  'This is a story about a llama',
  'This is a story about a hugging emoji'
]

response = requests.post(
  f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/baai/bge-m3",
  headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
  json={"text": stories}
)

print(response.json())

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/baai/bge-m3  \
  -X POST  \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"  \
  -d '{ "text": ["This is a story about an orange cloud", "This is a story about a llama", "This is a story about a hugging emoji"] }'
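The usage examples above return embeddings but stop before doing anything with them. A common next step is comparing vectors with cosine similarity. The following is a minimal sketch, not part of the model's API: it assumes the vectors have already been extracted from the response as plain lists of floats (the response layout is not shown in this section), and the function names `cosine_similarity` and `most_similar` are hypothetical helpers.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vec: list[float], vectors: list[list[float]]) -> tuple[int, float]:
    """Return (index, score) of the vector closest to query_vec."""
    scores = [cosine_similarity(query_vec, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

With the three stories from the examples, `most_similar` would take the embedding of a search phrase as `query_vec` and the three story embeddings as `vectors`, returning the index of the closest story.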

Parameters

Synchronous — Send a request and receive a complete response

Input
Output

query

string, minLength: 1. A query to run against the provided contexts. If no query is provided, the model will respond with embeddings for the contexts.

contexts[]

array, required. List of provided contexts. Note that the index of each item in this array is important, as the response will refer to it.

truncate_inputs

boolean, default: false. Controls what happens when a context is too long: when false, the model returns an error; when true, the context is truncated to fit.

request_id

string. The async request ID that can be used to obtain the results.
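The `query`, `contexts`, and `truncate_inputs` fields listed above can be combined into a request body along these lines. This is a sketch under stated assumptions: the item shape of `contexts[]` is collapsed in this section, so wrapping each context as an object with a `text` field is an assumption, and `build_query_payload` and `run_bge_m3` are hypothetical helper names. The endpoint URL and header are the ones used in the Usage examples.

```python
from typing import Optional


def build_query_payload(
    contexts: list[str],
    query: Optional[str] = None,
    truncate_inputs: bool = False,
) -> dict:
    """Build a request body from the query / contexts / truncate_inputs fields."""
    if query is not None and len(query) < 1:
        raise ValueError("query must be at least 1 character (minLength: 1)")
    payload = {
        # ASSUMPTION: each context is sent as an object with a "text" field.
        # The item shape of contexts[] is not shown in this section.
        "contexts": [{"text": c} for c in contexts],
        "truncate_inputs": truncate_inputs,
    }
    if query is not None:
        # Omitting query asks for embeddings of the contexts instead.
        payload["query"] = query
    return payload


def run_bge_m3(payload: dict, account_id: str, token: str) -> dict:
    """POST the payload to the endpoint used in the Usage examples."""
    import requests  # third-party; imported lazily so the builder works without it

    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/@cf/baai/bge-m3"
    )
    response = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json=payload,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

Because the response refers to contexts by their position in the array, keep the original `contexts` list around so that returned indices can be mapped back to the source text.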

Batch — Send multiple requests in a single API call

Input
Output

requests[]

array, required. Batch of embeddings requests to run using the async queue.

request_id

string. The async request ID that can be used to obtain the results.
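A batch body wraps several embedding requests in the `requests` array. The sketch below shows one way to assemble it; it is illustrative only. The item shape of `requests[]` is collapsed in this section, so giving each entry the same `{"text": [...]}` shape as the Usage examples is an assumption, as is looking for `request_id` either at the top level or inside a `result` envelope. The names `chunk`, `build_batch_payload`, and `extract_request_id` are hypothetical. How a batch is submitted and how results are later fetched with the `request_id` is not covered here.

```python
from typing import Optional


def chunk(texts: list[str], size: int) -> list[list[str]]:
    """Split texts into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [texts[i:i + size] for i in range(0, len(texts), size)]


def build_batch_payload(texts: list[str], per_request: int) -> dict:
    """Wrap groups of texts in the `requests` array of a batch body."""
    # ASSUMPTION: each batch entry uses the {"text": [...]} shape from the
    # Usage examples; the real item schema is not shown in this section.
    return {"requests": [{"text": group} for group in chunk(texts, per_request)]}


def extract_request_id(body: dict) -> Optional[str]:
    """Find request_id in a response body, if present."""
    # ASSUMPTION: the id may sit at the top level or under "result".
    if "request_id" in body:
        return body["request_id"]
    result = body.get("result")
    if isinstance(result, dict):
        return result.get("request_id")
    return None
```

Grouping texts before batching keeps each individual request small, which makes it easier to stay within the context window per request.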
