bge-m3

Text Embeddings • baai
@cf/baai/bge-m3

Multi-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.

Model Info
Pricing: $0.012 per M input tokens

Usage

Workers - TypeScript

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    // Can be a string or an array of strings
    const stories = [
      "This is a story about an orange cloud",
      "This is a story about a llama",
      "This is a story about a hugging emoji",
    ];

    const embeddings = await env.AI.run(
      "@cf/baai/bge-m3",
      {
        text: stories,
      }
    );

    return Response.json(embeddings);
  },
} satisfies ExportedHandler<Env>;

Python

import os
import requests

ACCOUNT_ID = "your-account-id"
AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

stories = [
    'This is a story about an orange cloud',
    'This is a story about a llama',
    'This is a story about a hugging emoji'
]

response = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/baai/bge-m3",
    headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
    json={"text": stories}
)
print(response.json())

curl

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/baai/bge-m3 \
-X POST \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-d '{ "text": ["This is a story about an orange cloud", "This is a story about a llama", "This is a story about a hugging emoji"] }'

Parameters

* indicates a required field

Input

  • BGE M3 Input Query and Contexts object

    • query string min 1

      A query you wish to perform against the provided contexts. If no query is provided, the model will respond with embeddings for the contexts instead (see the request sketch after this list).

    • contexts * array

      List of provided contexts. Note that the index in this array is important, as the response will refer to it.

      • items object

        • text string min 1

          The text content of one of the provided contexts.

    • truncate_inputs boolean

      If the provided context is too long, should the model return an error or truncate the context to fit?

  • BGE M3 Input Embedding object

    • text * one of

      • 0 string min 1

        The text to embed

      • 1 array

        Batch of text values to embed (up to 100 items)

        • items string min 1

          The text to embed

    • truncate_inputs boolean

      If the provided context is too long, should the model return an error or truncate the context to fit?
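
The Input schema accepts two shapes: a query scored against a list of contexts, or plain text to embed (the shape used in the Usage examples above). Below is a minimal sketch of the query-and-contexts shape using the same Workers binding as the Usage example; the field names (query, contexts, text, truncate_inputs) come from the schema, while the handler code and the example query string are illustrative.

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    // Query-and-contexts shape: the model scores each context against the query.
    const scored = await env.AI.run("@cf/baai/bge-m3", {
      query: "a story about the weather",
      contexts: [
        { text: "This is a story about an orange cloud" },
        { text: "This is a story about a llama" },
        { text: "This is a story about a hugging emoji" },
      ],
      // Optional: truncate over-long contexts instead of returning an error (defaults to false).
      truncate_inputs: true,
    });

    return Response.json(scored);
  },
} satisfies ExportedHandler<Env>;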

Output

  • BGE M3 Output Query object

    • response array

      • items object

        • id integer

          Index of the context in the request

        • score number

          Score of the context at this index (see the sketch after the Output list).

  • BGE M3 Output Embedding for Contexts object

    • response array

      • items array

        • items number

    • shape array

      • items number

    • pooling string

      The pooling method used in the embedding process.

  • BGE M3 Output Embedding object

    • shape array

      • items number

    • data array

      Embeddings of the requested text values

      • items array

        Floating point embedding representation shaped by the embedding model

        • items number

    • pooling string

      The pooling method used in the embedding process.
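
As a hedged sketch of how these output shapes might be consumed: the field names (response, id, score, shape, data) come from the descriptions above, while the helper functions themselves are illustrative.

// Query output: rank the scored contexts by score, highest first.
// `id` maps each score back to the index of the context in the request.
type ScoredContext = { id: number; score: number };

function rankContexts(output: { response: ScoredContext[] }): ScoredContext[] {
  return [...output.response].sort((a, b) => b.score - a.score);
}

// Embedding output: `data` holds one floating point vector per input text,
// with its dimensions given by `shape`.
function firstVector(output: { shape: number[]; data: number[][] }): number[] {
  return output.data[0];
}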

API Schemas

The following schemas are based on JSON Schema; a client-side validation sketch follows the input schema below.

{
  "type": "object",
  "oneOf": [
    {
      "title": "BGE M3 Input Query and Contexts",
      "properties": {
        "query": {
          "type": "string",
          "minLength": 1,
          "description": "A query you wish to perform against the provided contexts. If no query is provided the model with respond with embeddings for contexts"
        },
        "contexts": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "text": {
                "type": "string",
                "minLength": 1,
                "description": "One of the provided context content"
              }
            }
          },
          "description": "List of provided contexts. Note that the index in this array is important, as the response will refer to it."
        },
        "truncate_inputs": {
          "type": "boolean",
          "default": false,
          "description": "When provided with too long context should the model error out or truncate the context to fit?"
        }
      },
      "required": [
        "contexts"
      ]
    },
    {
      "title": "BGE M3 Input Embedding",
      "properties": {
        "text": {
          "oneOf": [
            {
              "type": "string",
              "description": "The text to embed",
              "minLength": 1
            },
            {
              "type": "array",
              "description": "Batch of text values to embed",
              "items": {
                "type": "string",
                "description": "The text to embed",
                "minLength": 1
              },
              "maxItems": 100
            }
          ]
        },
        "truncate_inputs": {
          "type": "boolean",
          "default": false,
          "description": "When provided with too long context should the model error out or truncate the context to fit?"
        }
      },
      "required": [
        "text"
      ]
    }
  ]
}
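
Because the input schema is plain JSON Schema, it can also be compiled with an off-the-shelf validator to check a payload before calling the REST endpoint. The sketch below uses the Ajv package and a local copy of the schema; both are assumptions of this example, not part of the API.

import Ajv from "ajv";

// Illustration only: assumes the input schema above has been saved locally as
// bge-m3-input.json (and that the bundler can import JSON modules).
import inputSchema from "./bge-m3-input.json";

const ajv = new Ajv();
const validateInput = ajv.compile(inputSchema);

const payload = { text: ["This is a story about an orange cloud"] };

if (!validateInput(payload)) {
  // validateInput.errors describes which branch of the oneOf failed and why.
  console.error(validateInput.errors);
}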