bge-m3

Text Embeddings • BAAI • Hosted

Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.

Model Info
Context Window	60,000 tokens
Unit Pricing	$0.012 per M input tokens

Usage

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {

    // Can be a string or an array of strings
    const stories = [
      "This is a story about an orange cloud",
      "This is a story about a llama",
      "This is a story about a hugging emoji",
    ];

    const embeddings = await env.AI.run(
      "@cf/baai/bge-m3",
      {
        text: stories,
      }
    );

    return Response.json(embeddings);
  },
} satisfies ExportedHandler<Env>;

import os
import requests


ACCOUNT_ID = "your-account-id"
AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

stories = [
  'This is a story about an orange cloud',
  'This is a story about a llama',
  'This is a story about a hugging emoji'
]

response = requests.post(
  f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/baai/bge-m3",
  headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
  json={"text": stories}
)

print(response.json())

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/baai/bge-m3  \
  -X POST  \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"  \
  -d '{ "text": ["This is a story about an orange cloud", "This is a story about a llama", "This is a story about a hugging emoji"] }'
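
Once the embeddings come back, a common next step is to compare them. The following is a minimal sketch of cosine similarity over plain lists of floats. This page does not show the response layout, so pulling the vectors out of the response is left to the caller. The helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, and the toy vectors are placeholders rather than real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidate_vecs):
    """Return candidate indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, v) for v in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only).
toy = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
order = rank_by_similarity(toy[0], toy[1:])
print(order)
```

With real embeddings, the first story's vector could be passed as `query_vec` and the remaining vectors as `candidate_vecs` to see which story is closest.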

Parameters

Synchronous — Send a request and receive a complete response

Input

query

string, minLength: 1. A query to run against the provided contexts. If no query is provided, the model will respond with embeddings for the contexts.

contexts[]

array, required. List of provided contexts. The index of each item in this array matters, because the response refers to it.

truncate_inputs

boolean, default: false. Whether the model should truncate a context that is too long to fit; when false, it returns an error instead.

Output

request_id

string. The async request ID that can be used to obtain the results.
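
The query-and-contexts input described above can be assembled with a small helper. This is a sketch only: `build_query_payload` is a hypothetical name, and the checks mirror the constraints listed on this page (non-empty strings, `contexts` required) rather than anything enforced by an official client.

```python
def build_query_payload(contexts, query=None, truncate_inputs=False):
    """Build a request body with the `query`, `contexts` and `truncate_inputs` fields."""
    for i, text in enumerate(contexts):
        # Each context text must be a string of at least one character (minLength: 1).
        if not isinstance(text, str) or len(text) < 1:
            raise ValueError(f"contexts[{i}] must be a non-empty string")

    payload = {
        # Each context is wrapped in an object with a `text` property.
        "contexts": [{"text": text} for text in contexts],
        "truncate_inputs": bool(truncate_inputs),
    }

    # `query` is optional; when omitted, embeddings for the contexts are returned.
    if query is not None:
        if not isinstance(query, str) or len(query) < 1:
            raise ValueError("query must be a non-empty string")
        payload["query"] = query

    return payload


payload = build_query_payload(
    ["This is a story about an orange cloud", "This is a story about a llama"],
    query="Which story is about an animal?",
)
print(payload)
```

The resulting dictionary can be sent as the JSON body in place of `{"text": stories}` in the Python usage example.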

Batch — Send multiple requests in a single API call

Input

requests[]

array, required. Batch of embedding requests to run using the async queue.

Output

request_id

string. The async request ID that can be used to obtain the results.
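
A batch body is a `requests` array whose items follow one of the input shapes. The raw batch schema caps a `text` array at 100 items, so a long list of texts needs to be split across several request items. The sketch below does that; `build_batch_payload` and `MAX_ITEMS` are hypothetical names chosen for illustration.

```python
MAX_ITEMS = 100  # maxItems for a `text` array in the raw batch schema


def build_batch_payload(texts, truncate_inputs=False, max_items=MAX_ITEMS):
    """Split `texts` into chunks of at most `max_items` and wrap them in `requests`."""
    for i, text in enumerate(texts):
        # Every text must be a string of at least one character (minLength: 1).
        if not isinstance(text, str) or len(text) < 1:
            raise ValueError(f"texts[{i}] must be a non-empty string")

    chunks = [texts[i:i + max_items] for i in range(0, len(texts), max_items)]
    return {
        "requests": [
            {"text": chunk, "truncate_inputs": bool(truncate_inputs)}
            for chunk in chunks
        ]
    }


batch = build_batch_payload([f"story {n}" for n in range(250)])
print([len(item["text"]) for item in batch["requests"]])
```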

API Schemas (Raw)

Synchronous — Send a request and receive a complete response

Input schema, followed by the output schema:

{
  "title": "Input Query and Contexts",
  "properties": {
    "query": {
      "type": "string",
      "minLength": 1,
      "description": "A query you wish to perform against the provided contexts. If no query is provided the model will respond with embeddings for contexts"
    },
    "contexts": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "text": {
            "type": "string",
            "minLength": 1,
            "description": "One of the provided context content"
          }
        }
      },
      "description": "List of provided contexts. Note that the index in this array is important, as the response will refer to it."
    },
    "truncate_inputs": {
      "type": "boolean",
      "default": false,
      "description": "When provided with too long context should the model error out or truncate the context to fit?"
    }
  },
  "required": [
    "contexts"
  ]
}

{
  "type": "object",
  "contentType": "application/json",
  "title": "Async response",
  "properties": {
    "request_id": {
      "type": "string",
      "description": "The async request id that can be used to obtain the results."
    }
  }
}
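
The constraints in the synchronous input schema can also be checked client-side before a request is sent. The function below is a hand-written sketch of those checks, not a general JSON Schema validator; `validate_sync_input` is a hypothetical name, and it only covers the fields shown in the schema above.

```python
def validate_sync_input(payload):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    errors = []

    # `contexts` is the only required property.
    if "contexts" not in payload:
        errors.append("contexts is required")
    elif not isinstance(payload["contexts"], list):
        errors.append("contexts must be an array")
    else:
        for i, item in enumerate(payload["contexts"]):
            if not isinstance(item, dict):
                errors.append(f"contexts[{i}] must be an object")
                continue
            text = item.get("text")
            # `text` is checked only when present, as the schema does not mark it required.
            if text is not None and (not isinstance(text, str) or len(text) < 1):
                errors.append(f"contexts[{i}].text must be a non-empty string")

    query = payload.get("query")
    if query is not None and (not isinstance(query, str) or len(query) < 1):
        errors.append("query must be a non-empty string")

    truncate = payload.get("truncate_inputs")
    if truncate is not None and not isinstance(truncate, bool):
        errors.append("truncate_inputs must be a boolean")

    return errors


good = validate_sync_input({"contexts": [{"text": "This is a story about a llama"}]})
bad = validate_sync_input({"query": "", "truncate_inputs": "yes"})
print(good, bad)
```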

Batch — Send multiple requests in a single API call

Input schema, followed by the output schema:

{
  "properties": {
    "requests": {
      "type": "array",
      "description": "Batch of the embeddings requests to run using async-queue",
      "items": {
        "type": "object",
        "oneOf": [
          {
            "title": "Input Query and Contexts",
            "properties": {
              "query": {
                "type": "string",
                "minLength": 1,
                "description": "A query you wish to perform against the provided contexts. If no query is provided the model will respond with embeddings for contexts"
              },
              "contexts": {
                "type": "array",
                "items": {
                  "type": "object",
                  "properties": {
                    "text": {
                      "type": "string",
                      "minLength": 1,
                      "description": "One of the provided context content"
                    }
                  }
                },
                "description": "List of provided contexts. Note that the index in this array is important, as the response will refer to it."
              },
              "truncate_inputs": {
                "type": "boolean",
                "default": false,
                "description": "When provided with too long context should the model error out or truncate the context to fit?"
              }
            },
            "required": [
              "contexts"
            ]
          },
          {
            "title": "Input Embedding",
            "properties": {
              "text": {
                "oneOf": [
                  {
                    "type": "string",
                    "description": "The text to embed",
                    "minLength": 1
                  },
                  {
                    "type": "array",
                    "description": "Batch of text values to embed",
                    "items": {
                      "type": "string",
                      "description": "The text to embed",
                      "minLength": 1
                    },
                    "maxItems": 100
                  }
                ]
              },
              "truncate_inputs": {
                "type": "boolean",
                "default": false,
                "description": "When provided with too long context should the model error out or truncate the context to fit?"
              }
            },
            "required": [
              "text"
            ]
          }
        ]
      }
    }
  },
  "required": [
    "requests"
  ]
}

{
  "type": "object",
  "contentType": "application/json",
  "title": "Async response",
  "properties": {
    "request_id": {
      "type": "string",
      "description": "The async request id that can be used to obtain the results."
    }
  }
}