P-Video-Avatar

Image-to-Video • Pruna AI

Pruna's P-Video-Avatar generates talking-head videos from a single portrait image, driven by either a text script or an audio file. It offers 30 voices, 10 output languages, and 720p or 1080p resolution.

Model Info
More information	link ↗
Pricing	View pricing in the Cloudflare dashboard ↗

Usage

TypeScript

const response = await env.AI.run(
  'pruna/p-video-avatar',
  {
    image: 'https://huggingface.co/spaces/yisol/IDM-VTON/resolve/main/example/human/00121_00.jpg',
    voice_script: 'Hello, welcome to our product demo!',
    voice: 'Zephyr (Female)',
    resolution: '720p',
  },
)
console.log(response)

cURL

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "model": "pruna/p-video-avatar",
  "input": {
    "image": "https://huggingface.co/spaces/yisol/IDM-VTON/resolve/main/example/human/00121_00.jpg",
    "voice_script": "Hello, welcome to our product demo!",
    "voice": "Zephyr (Female)",
    "resolution": "720p"
  }
}'
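
For callers outside a Worker, the cURL request above can be reproduced with any HTTP client. The following is a minimal sketch that only assembles the URL, headers, and JSON body shown in the cURL example. `buildRunRequest`, `RunRequest`, and the parameter names are hypothetical; sending the request (for example with `fetch(url, init)`) is left to the caller.

```typescript
// Hypothetical container for the pieces of the HTTP request.
interface RunRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the same request the cURL example sends.
function buildRunRequest(
  accountId: string,
  apiToken: string,
  input: Record<string, unknown>,
): RunRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai/run`,
    init: {
      method: "POST", // curl switches to POST when --data is given
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      // Model name and input are wrapped exactly as in the cURL payload.
      body: JSON.stringify({ model: "pruna/p-video-avatar", input }),
    },
  };
}
```

The helper returns plain data rather than performing the call, so it can be inspected or logged before any credentials leave the process.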

Output
Raw response

{
  "state": "Completed",
  "result": {
    "video": "https://examples.aig.cloudflare.com/pruna/p-video-avatar/product-demo-greeting.mp4"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
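
To use the result, the video URL has to be pulled out of `result.video`. The sketch below does that defensively, assuming the response always has the shape of the sample above. `extractVideoUrl` and `AvatarResponse` are hypothetical names, and since "Completed" is the only `state` value shown on this page, any other value is simply treated as a failure here.

```typescript
// Shape inferred from the sample response; only `state` and
// `result.video` are relied on.
interface AvatarResponse {
  state: string;
  result?: { video?: string };
}

function isAvatarResponse(value: unknown): value is AvatarResponse {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["state"] === "string";
}

// Hypothetical helper: returns the video URL or throws a descriptive error.
function extractVideoUrl(response: unknown): string {
  if (!isAvatarResponse(response)) {
    throw new Error("Unexpected response shape");
  }
  // Assumption: any state other than "Completed" means no video is available.
  if (response.state !== "Completed") {
    throw new Error(`Generation not completed (state: ${response.state})`);
  }
  const video = response.result?.video;
  if (typeof video !== "string" || video.length === 0) {
    throw new Error("Response has no result.video URL");
  }
  return video;
}
```

The parameter reference describes `video` as a presigned URL, so it is probably safer to download or forward the video promptly than to store the link long term; the expiry time is not stated here.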

audio

string. URL of uploaded audio to drive speech. HTTP(S) URL or data URI. If both audio and voice_script are provided, audio takes priority.
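
The priority rule above can be mirrored on the client to catch requests where a script would be silently ignored. This is a minimal sketch: `resolveSpeechSource` is a hypothetical helper, and treating an empty string as "not provided" is an assumption rather than documented behavior.

```typescript
type SpeechSource = "audio" | "voice_script" | "none";

// Hypothetical helper mirroring the documented rule: audio wins over voice_script.
// Assumption: an empty string counts as "not provided".
function resolveSpeechSource(input: {
  audio?: string;
  voice_script?: string;
}): SpeechSource {
  if (input.audio !== undefined && input.audio !== "") return "audio";
  if (input.voice_script !== undefined && input.voice_script !== "") {
    return "voice_script";
  }
  return "none";
}
```

A caller might log a warning when both fields are set and the result is "audio", since the script text will not be spoken in that case.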

disable_prompt_upsampling

boolean, required, default: false. When true, skip the prompt upsampler and pass the raw user prompt.

disable_safety_filter

boolean, required, default: true. Disable safety filter for prompts and input image.

image

string, required. Input portrait image (first frame). HTTP(S) URL or data URI. Supports jpg, jpeg, png, webp.
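
When the portrait is only available as raw bytes, it can be passed as a data URI instead of a hosted URL. The sketch below builds one for the four listed formats. `toImageDataUri` and `base64Encode` are hypothetical helpers; the Base64 encoder is written out by hand only to keep the example free of runtime-specific APIs. Any size limit the service may apply to data URIs is not stated on this page.

```typescript
type PortraitFormat = "jpg" | "jpeg" | "png" | "webp";

// Standard MIME types for the supported formats.
const MIME_TYPES: Record<PortraitFormat, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  webp: "image/webp",
};

const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Plain Base64 encoder with "=" padding.
function base64Encode(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1] ?? 0;
    const b2 = bytes[i + 2] ?? 0;
    const triple = (b0 << 16) | (b1 << 8) | b2;
    out += BASE64_ALPHABET.charAt((triple >> 18) & 63);
    out += BASE64_ALPHABET.charAt((triple >> 12) & 63);
    // Pad with "=" when the final group has fewer than three bytes.
    out += i + 1 < bytes.length ? BASE64_ALPHABET.charAt((triple >> 6) & 63) : "=";
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(triple & 63) : "=";
  }
  return out;
}

// Hypothetical helper: wraps image bytes in a data URI for the `image` field.
function toImageDataUri(bytes: Uint8Array, format: PortraitFormat): string {
  return `data:${MIME_TYPES[format]};base64,${base64Encode(bytes)}`;
}
```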

negative_prompt

string, required, default: empty string. Describes what you do NOT want in the video. Disabled if empty.

resolution

string, required, default: 720p, enum: 720p, 1080p. Resolution of the video.

seed

integer, minimum: -9007199254740991, maximum: 9007199254740991. Random seed for reproducible generation.

strength_negative_prompt

number, required, default: 0.5, minimum: 0, maximum: 4. Strength of the negative prompt (0-4).
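
The numeric bounds for seed and strength_negative_prompt can be checked before sending a request. This is a minimal sketch of such a check; `checkNumericOptions` is a hypothetical helper, and how the service itself reports out-of-range values is not documented here. The seed limits equal JavaScript's safe-integer range, so `Number.isSafeInteger` covers both bounds.

```typescript
interface NumericOptions {
  seed?: number;
  strength_negative_prompt?: number;
}

// Hypothetical helper: returns a list of problems, empty when the values
// fall inside the documented ranges.
function checkNumericOptions(opts: NumericOptions): string[] {
  const problems: string[] = [];
  // Safe integers span exactly -9007199254740991 to 9007199254740991.
  if (opts.seed !== undefined && !Number.isSafeInteger(opts.seed)) {
    problems.push(
      "seed must be an integer between -9007199254740991 and 9007199254740991",
    );
  }
  const strength = opts.strength_negative_prompt;
  // The negated comparison also rejects NaN.
  if (strength !== undefined && !(strength >= 0 && strength <= 4)) {
    problems.push("strength_negative_prompt must be between 0 and 4");
  }
  return problems;
}
```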

video_prompt

string, required, default: "The person is talking." Optional prompt for the video.

voice

string, required, default: Zephyr (Female), enum: Zephyr (Female), Puck (Male), Charon (Male), Kore (Female), Fenrir (Male), Leda (Female), Orus (Male), Aoede (Female), Callirrhoe (Female), Autonoe (Female), Enceladus (Male), Iapetus (Male), Umbriel (Male), Algenib (Male), Despina (Female), Erinome (Female), Laomedeia (Female), Achernar (Female), Algieba (Male), Schedar (Male), Gacrux (Female), Pulcherrima (Female), Achird (Male), Zubenelgenubi (Male), Vindemiatrix (Female), Sadachbia (Male), Sadaltager (Male), Sulafat (Female), Alnilam (Male), Rasalgethi (Male). Voice for generated speech.

voice_language

string, required, default: English (US), enum: English (US), English (UK), Spanish, French, German, Italian, Portuguese (Brazil), Japanese, Korean, Hindi. Output language.
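
Because voice_language accepts only the listed values, a typed list can catch typos at compile time and at runtime. The sketch below covers the ten languages; `VOICE_LANGUAGES` and `isVoiceLanguage` are hypothetical names, and the exact, case-sensitive match is an assumption about how the API compares values.

```typescript
// The ten values listed for voice_language.
const VOICE_LANGUAGES = [
  "English (US)",
  "English (UK)",
  "Spanish",
  "French",
  "German",
  "Italian",
  "Portuguese (Brazil)",
  "Japanese",
  "Korean",
  "Hindi",
] as const;

type VoiceLanguage = (typeof VOICE_LANGUAGES)[number];

// Hypothetical type guard. Assumption: values must match exactly, including case.
function isVoiceLanguage(value: string): value is VoiceLanguage {
  return (VOICE_LANGUAGES as readonly string[]).indexOf(value) !== -1;
}
```

The same pattern would extend to the voice and resolution enums.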

voice_prompt

string, required, default: "Say the following." Optional speaking style, tone, pacing or emotion instructions.

voice_script

string, required, default: empty string. Script for the person to say when no audio is uploaded.

video

string, format: uri. Presigned URL for the generated avatar video.
