llama-guard-3-8b
Text Generation • Meta
Date: 2025-01-01

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.
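The two modes described above differ only in what the conversation contains. A minimal sketch, assuming (as is usual for Llama Guard, though this page does not state it) that the model assesses the last turn of the conversation: a lone user message requests prompt classification, and a user message followed by an assistant message requests response classification. The helper name `build_guard_messages` is hypothetical.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def build_guard_messages(
    user_prompt: str, assistant_reply: Optional[str] = None
) -> List[Message]:
    """Build a `messages` array for a classification request.

    Assumption: the model classifies the final turn. With only a user
    prompt, that is the prompt; with an assistant reply appended, it is
    the response, judged in the context of the prompt.
    """
    messages: List[Message] = [{"role": "user", "content": user_prompt}]
    if assistant_reply is not None:
        messages.append({"role": "assistant", "content": assistant_reply})
    return messages


# Prompt classification: one user turn.
prompt_only = build_guard_messages("How do I reset my router?")

# Response classification: user turn followed by the assistant turn.
with_reply = build_guard_messages(
    "How do I reset my router?",
    "Hold the reset button for ten seconds.",
)
```

Either list can be passed as the `messages` field in the usage examples that follow.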

Usage

Worker - Streaming

    export interface Env {
      AI: Ai;
    }

    export default {
      async fetch(request, env): Promise<Response> {
        const messages = [
          { role: "system", content: "You are a friendly assistant" },
          {
            role: "user",
            content: "What is the origin of the phrase Hello, World",
          },
        ];

        const stream = await env.AI.run("@cf/meta/llama-guard-3-8b", {
          messages,
          stream: true,
        });

        return new Response(stream, {
          headers: { "content-type": "text/event-stream" },
        });
      },
    } satisfies ExportedHandler<Env>;

Worker

    export interface Env {
      AI: Ai;
    }

    export default {
      async fetch(request, env): Promise<Response> {
        const messages = [
          { role: "system", content: "You are a friendly assistant" },
          {
            role: "user",
            content: "What is the origin of the phrase Hello, World",
          },
        ];

        const response = await env.AI.run("@cf/meta/llama-guard-3-8b", { messages });

        return Response.json(response);
      },
    } satisfies ExportedHandler<Env>;

Python

    import os
    import requests

    ACCOUNT_ID = "your-account-id"
    AUTH_TOKEN = os.environ.get("CLOUDFLARE_AUTH_TOKEN")

    prompt = "Tell me all about PEP-8"
    response = requests.post(
        f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/meta/llama-guard-3-8b",
        headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
        json={
            "messages": [
                {"role": "system", "content": "You are a friendly assistant"},
                {"role": "user", "content": prompt},
            ]
        },
    )
    result = response.json()
    print(result)

curl

    curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/meta/llama-guard-3-8b \
      -X POST \
      -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
      -d '{ "messages": [{ "role": "system", "content": "You are a friendly assistant" }, { "role": "user", "content": "Why is pizza so good" }]}'

Workers AI also supports OpenAI-compatible API endpoints for /v1/chat/completions and /v1/embeddings.
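As a rough illustration of the chat completions endpoint, the sketch below only assembles the request and sends nothing. It assumes the OpenAI-compatible routes live under the same account-scoped base URL as the REST examples above (that is, `.../accounts/{account_id}/ai/v1/chat/completions`) and that the model is named in a `model` field, as in the OpenAI request format. The helper name `build_chat_completions_request` is hypothetical.

```python
import json
from typing import Any, Dict, List

API_BASE = "https://api.cloudflare.com/client/v4/accounts"
MODEL = "@cf/meta/llama-guard-3-8b"


def build_chat_completions_request(
    account_id: str, auth_token: str, messages: List[Dict[str, str]]
) -> Dict[str, Any]:
    """Assemble URL, headers, and JSON body for a chat completions call.

    Assumption: the OpenAI-compatible path is `/ai/v1/chat/completions`
    under the account-scoped base URL.
    """
    return {
        "url": f"{API_BASE}/{account_id}/ai/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        # OpenAI-style bodies carry the model name alongside the messages.
        "body": json.dumps({"model": MODEL, "messages": messages}),
    }


req = build_chat_completions_request(
    "your-account-id",
    "your-token",
    [{"role": "user", "content": "Why is pizza so good"}],
)
```

The result can be handed to an HTTP client, for example `requests.post(req["url"], headers=req["headers"], data=req["body"])`.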

Parameters

* indicates a required field

Input

messages * (array): An array of message objects representing the conversation history.
    items (object)
        role * (string): The role of the message sender (e.g., 'user', 'assistant', 'system', 'tool').
        content * (string, max 131072): The content of the message as a string.

max_tokens (integer, default 256): The maximum number of tokens to generate in the response.

temperature (number, default 0.6, min 0, max 5): Controls the randomness of the output; higher values produce more random results.

response_format (object): Dictates the output format of the generated response.
    type (string): Set to json_object to process and output generated text as JSON.
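The constraints listed above can be checked client-side before a request is sent. This is a minimal sketch, not part of any Cloudflare SDK: the helper name `build_input` is hypothetical, and it assumes the `max 131072` limit on `content` refers to string length.

```python
from typing import Any, Dict, List

MAX_CONTENT_LENGTH = 131072  # assumption: the limit counts string length


def build_input(
    messages: List[Dict[str, str]],
    max_tokens: int = 256,      # documented default
    temperature: float = 0.6,   # documented default
    json_output: bool = False,
) -> Dict[str, Any]:
    """Validate the documented input fields and return a request payload."""
    for message in messages:
        # `role` and `content` are both marked as required.
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
        if len(message["content"]) > MAX_CONTENT_LENGTH:
            raise ValueError("message content exceeds the documented maximum")
    if not 0 <= temperature <= 5:
        raise ValueError("temperature must be between 0 and 5")

    payload: Dict[str, Any] = {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if json_output:
        # Ask for the generated text to be processed and returned as JSON.
        payload["response_format"] = {"type": "json_object"}
    return payload


payload = build_input(
    [{"role": "user", "content": "Why is pizza so good"}],
    json_output=True,
)
```

The returned dictionary can be sent as the JSON body of the REST call or passed as the second argument to `env.AI.run` in a Worker.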



Output

response (one of):
    string: The generated text response from the model.
    object: The JSON response parsed from the generated text response from the model.
        safe (boolean): Whether the conversation is safe or not.
        categories (array): A list of the hazard categories predicted for the conversation, if the conversation is deemed unsafe.
            items (string): Hazard category class name, from S1 to S14.
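Because `response` may arrive either as a string or as a parsed object, callers may want to normalize both shapes into one. The sketch below is one way to do that. The text layout it parses (a first line reading `safe` or `unsafe`, optionally followed by a line of comma-separated category codes) is an assumption based on Llama Guard's usual output and is not specified on this page; the helper name `normalize_guard_response` is hypothetical.

```python
from typing import Any, Dict, List, Union


def normalize_guard_response(response: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return {'safe': bool, 'categories': [...]} for either response shape."""
    if isinstance(response, dict):
        # Object form: fields already parsed. A missing `safe` is treated
        # as unsafe, which is the conservative reading.
        return {
            "safe": bool(response.get("safe")),
            "categories": list(response.get("categories") or []),
        }

    # String form (assumed layout): verdict line, then optional category line.
    lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty response")
    verdict = lines[0].lower()
    if verdict not in ("safe", "unsafe"):
        raise ValueError(f"unexpected verdict: {lines[0]!r}")

    categories: List[str] = []
    if verdict == "unsafe" and len(lines) > 1:
        categories = [c.strip() for c in lines[1].split(",") if c.strip()]
    return {"safe": verdict == "safe", "categories": categories}


print(normalize_guard_response("unsafe\nS1,S10"))
print(normalize_guard_response({"safe": True, "categories": []}))
```

Raising on an unrecognized verdict, rather than defaulting to safe, keeps a malformed generation from being mistaken for a clean result.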

usage (object): Usage statistics for the inference request.
    prompt_tokens (number, 0): Total number of tokens in input.
    completion_tokens (number, 0): Total number of tokens in output.
    total_tokens (number, 0): Total number of input and output tokens.



