Cloudflare Docs
Workers AI

llava-1.5-7b-hf

Beta

Model ID: @cf/llava-hf/llava-1.5-7b-hf

LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.

Properties

Task Type: Image-to-Text

Code Examples

Workers - TypeScript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Fetch a sample image and read it as raw bytes.
    const res = await fetch("https://cataas.com/cat");
    const blob = await res.arrayBuffer();
    // The model takes the image as an array of byte values.
    const input = {
      image: [...new Uint8Array(blob)],
      prompt: "Generate a caption for this image",
      max_tokens: 512,
    };
    const response = await env.AI.run(
      "@cf/llava-hf/llava-1.5-7b-hf",
      input
    );
    // Return the model output as a JSON string.
    return new Response(JSON.stringify(response));
  },
} satisfies ExportedHandler<Env>;

API Schema

The following schemas are based on JSON Schema.

Input JSON Schema
{
  "oneOf": [
    {
      "type": "string",
      "format": "binary"
    },
    {
      "type": "object",
      "properties": {
        "image": {
          "type": "array",
          "items": {
            "type": "number"
          }
        },
        "prompt": {
          "type": "string"
        },
        "max_tokens": {
          "type": "integer",
          "default": 512
        }
      }
    }
  ]
}
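
The object branch of the input schema can be mirrored in TypeScript to catch malformed inputs before they reach the model. The following is a minimal sketch, not part of the Workers AI API: the `LlavaInput` type and the `buildLlavaInput` helper are hypothetical names introduced here for illustration, and the integer check on `max_tokens` is the only validation taken from the schema.

```typescript
// Hypothetical type mirroring the object branch of the input schema.
interface LlavaInput {
  image: number[];     // image bytes as an array of numbers
  prompt?: string;     // optional text prompt
  max_tokens?: number; // integer; the schema default is 512
}

// Hypothetical helper (not part of Workers AI): builds the object-form
// input from raw image bytes and applies the schema default for max_tokens.
function buildLlavaInput(
  bytes: Uint8Array,
  prompt?: string,
  maxTokens: number = 512
): LlavaInput {
  // The schema declares max_tokens as an integer.
  if (!Number.isInteger(maxTokens)) {
    throw new RangeError("max_tokens must be an integer");
  }
  const input: LlavaInput = {
    image: Array.from(bytes),
    max_tokens: maxTokens,
  };
  // Only include prompt when the caller supplied one.
  if (prompt !== undefined) {
    input.prompt = prompt;
  }
  return input;
}
```

In a Worker, the result could be passed as the second argument to `env.AI.run`, for example `buildLlavaInput(new Uint8Array(blob), "Generate a caption for this image")`.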
Output JSON Schema
{
  "type": "object",
  "contentType": "application/json",
  "properties": {
    "description": {
      "type": "string"
    }
  }
}
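
Because the Worker example returns the model output as a JSON string, a caller may want to check its shape before using it. The following is a minimal sketch under stated assumptions: `LlavaOutput`, `isLlavaOutput`, and `parseLlavaOutput` are hypothetical names, and treating `description` as required is an assumption, since the output schema lists the property without marking it required.

```typescript
// Hypothetical type mirroring the output schema.
interface LlavaOutput {
  description: string;
}

// Type guard: true when the value is an object with a string `description`.
// Assumption: `description` is treated as required here.
function isLlavaOutput(value: unknown): value is LlavaOutput {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { description?: unknown }).description === "string"
  );
}

// Parses a JSON response body and returns the description text,
// throwing if the body does not match the expected shape.
function parseLlavaOutput(json: string): string {
  const parsed: unknown = JSON.parse(json);
  if (!isLlavaOutput(parsed)) {
    throw new TypeError("Response does not match the output schema");
  }
  return parsed.description;
}
```

A client could call `parseLlavaOutput(await res.text())` on the Worker's response to get the caption as a plain string.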