You can use Workers Bindings to interact with the Batch API.

Send a Batch request

Send your initial batch inference request by composing a JSON payload that contains an array of individual inference requests, and pass the queueRequest: true option, which controls queueing behavior. In the example below, the option is the third argument to env.AI.run.

Note: Ensure that the total payload is under 10 MB.
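The size limit above can be checked before sending. The following is a minimal sketch, not part of the Workers AI API: the helper names are hypothetical, and it assumes that "10 MB" means 10,000,000 bytes of the JSON-serialized payload, which is the more conservative reading.

```typescript
// Hypothetical helpers for illustration; they are not part of the Workers AI API.
// Assumption: the limit applies to the JSON-serialized payload, and 10 MB = 10,000,000 bytes.
const MAX_BATCH_PAYLOAD_BYTES = 10 * 1000 * 1000;

function payloadSizeInBytes(payload: unknown): number {
  // Count UTF-8 bytes rather than string length, because
  // non-ASCII characters occupy more than one byte.
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}

function isUnderBatchLimit(payload: unknown): boolean {
  return payloadSizeInBytes(payload) < MAX_BATCH_PAYLOAD_BYTES;
}
```

Calling isUnderBatchLimit on the object holding the requests array before env.AI.run lets a Worker reject or split an oversized batch early.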

src/index.ts

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const embeddings = await env.AI.run(
      "@cf/baai/bge-m3",
      {
        requests: [
          {
            query: "This is a story about Cloudflare",
            contexts: [
              { text: "This is a story about an orange cloud" },
              { text: "This is a story about a llama" },
              { text: "This is a story about a hugging emoji" },
            ],
          },
        ],
      },
      { queueRequest: true },
    );

    return Response.json(embeddings);
  },
} satisfies ExportedHandler<Env>;

{
  "status": "queued",
  "model": "@cf/baai/bge-m3",
  "request_id": "000-000-000"
}

You will get a response with the following values:

status: Indicates that your request is queued.
request_id: A unique identifier for the batch request.

Of these, keep the request_id: you need it to poll the batch status.
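One way to capture the request_id safely is to validate the response shape before using it. This is a sketch under stated assumptions: the interface and function names are hypothetical, and the field names come only from the sample queued response shown above.

```typescript
// Hypothetical type for illustration, modeled on the sample queued response above.
interface QueuedBatchResponse {
  status: string;
  model: string;
  request_id: string;
}

// Returns true only when the value looks like a queued batch response,
// so the caller can read request_id without unsafe casts.
function isQueuedBatchResponse(value: unknown): value is QueuedBatchResponse {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return candidate.status === "queued" && typeof candidate.request_id === "string";
}
```

A Worker could store the request_id (for example in a response to the caller) only when this check passes, and treat any other shape as an error.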

Poll batch status

Once your batch request is queued, use the request_id to poll for its status. During processing, the API returns a status of queued or running, indicating that the request is still in the queue or being processed. The example response below shows the results returned once processing has finished.

src/index.ts

export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const status = await env.AI.run("@cf/baai/bge-m3", {
      request_id: "000-000-000",
    });

    return Response.json(status);
  },
} satisfies ExportedHandler<Env>;

{
  "responses": [
    {
      "id": 0,
      "result": {
        "response": [
          { "id": 0, "score": 0.73974609375 },
          { "id": 1, "score": 0.642578125 },
          { "id": 2, "score": 0.6220703125 }
        ]
      },
      "success": true,
      "external_reference": null
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 0,
    "total_tokens": 12
  }
}
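The polling step above can be wrapped in a small retry loop. This is a rough sketch, not an official pattern: pollBatch, isStillProcessing, and the option names are hypothetical, the poll function is injected so the loop stays independent of the binding, and the interval and attempt count are arbitrary values that would need tuning against your Worker's execution limits.

```typescript
// Hypothetical helpers for illustration; only the "queued" and "running"
// status values come from the documentation above.
type PollFn = (requestId: string) => Promise<unknown>;

function isStillProcessing(result: unknown): boolean {
  if (typeof result !== "object" || result === null) return false;
  const status = (result as Record<string, unknown>).status;
  return status === "queued" || status === "running";
}

async function pollBatch(
  poll: PollFn,
  requestId: string,
  options: { intervalMs: number; maxAttempts: number } = { intervalMs: 1000, maxAttempts: 10 },
): Promise<unknown> {
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    const result = await poll(requestId);
    // Anything other than queued/running is treated as the final response.
    if (!isStillProcessing(result)) return result;
    // Wait before the next attempt, but not after the last one.
    if (attempt < options.maxAttempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, options.intervalMs));
    }
  }
  throw new Error(`Batch ${requestId} still processing after ${options.maxAttempts} attempts`);
}
```

Inside a Worker, the poll function could be (id) => env.AI.run("@cf/baai/bge-m3", { request_id: id }), mirroring the single poll call shown earlier. For long-running batches, returning the request_id to the client and polling on a later request may fit better than waiting inside one invocation.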