P-Video-Avatar

Image-to-Video • Pruna AI

Pruna's P-Video-Avatar generates talking-head videos from a single portrait image, driven by either a text script or an audio file. It supports 30 voices, 10 output languages, and 720p or 1080p resolution.

Model Info
More information	link ↗
Pricing	View pricing in the Cloudflare dashboard ↗

Usage

TypeScript

const response = await env.AI.run(
  'pruna/p-video-avatar',
  {
    image: 'https://huggingface.co/spaces/yisol/IDM-VTON/resolve/main/example/human/00121_00.jpg',
    voice_script: 'Hello, welcome to our product demo!',
    voice: 'Zephyr (Female)',
    resolution: '720p',
  },
)
console.log(response)

cURL

curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "model": "pruna/p-video-avatar",
  "input": {
    "image": "https://huggingface.co/spaces/yisol/IDM-VTON/resolve/main/example/human/00121_00.jpg",
    "voice_script": "Hello, welcome to our product demo!",
    "voice": "Zephyr (Female)",
    "resolution": "720p"
  }
}'

Output
Raw response

{
  "state": "Completed",
  "result": {
    "video": "https://examples.aig.cloudflare.com/pruna/p-video-avatar/product-demo-greeting.mp4"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
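
To pull the video URL out of a response shaped like the sample above, a small guard can help. This is a minimal sketch: the `AvatarResponse` type is inferred only from the sample, `extractVideoUrl` is a hypothetical helper name, and treating any state other than "Completed" as an error is an assumption, since other states are not documented here.

```typescript
// Shape inferred from the sample raw response; other fields are not assumed.
interface AvatarResponse {
  state: string;
  result?: { video?: string };
}

// Hypothetical helper: return the video URL, or throw if it is missing.
// Assumption: any state other than "Completed" is treated as a failure.
function extractVideoUrl(response: AvatarResponse): string {
  if (response.state !== "Completed") {
    throw new Error(`Unexpected state: ${response.state}`);
  }
  const url = response.result?.video;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("Response did not include result.video");
  }
  return url;
}
```

Because the URL is described as presigned, it may be worth downloading or copying the video promptly rather than storing the link long term.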

image

string, required
Input portrait image (first frame). HTTP(S) URL or data URI. Supports jpg, jpeg, png, webp.
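
When the portrait is available as raw bytes rather than a public URL, it can be passed as a data URI. The sketch below assumes a runtime with a global `btoa` (Workers, browsers, recent Node.js); `toDataUri` and `MIME_TYPES` are illustrative names, not part of the API.

```typescript
// File extensions listed as supported for the image field.
type PortraitExt = "jpg" | "jpeg" | "png" | "webp";

// Standard MIME types for those extensions.
const MIME_TYPES: Record<PortraitExt, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  webp: "image/webp",
};

// Hypothetical helper: encode image bytes as a base64 data URI.
function toDataUri(bytes: Uint8Array, ext: PortraitExt): string {
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return `data:${MIME_TYPES[ext]};base64,${btoa(binary)}`;
}
```

The returned string can then be used as the `image` value in place of a URL.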

audio

string
Audio used to drive speech, given as an HTTP(S) URL or data URI. If both audio and voice_script are provided, audio takes priority.

voice

string, required, default: Zephyr (Female)
enum: Zephyr (Female), Puck (Male), Charon (Male), Kore (Female), Fenrir (Male), Leda (Female), Orus (Male), Aoede (Female), Callirrhoe (Female), Autonoe (Female), Enceladus (Male), Iapetus (Male), Umbriel (Male), Algenib (Male), Despina (Female), Erinome (Female), Laomedeia (Female), Achernar (Female), Algieba (Male), Schedar (Male), Gacrux (Female), Pulcherrima (Female), Achird (Male), Zubenelgenubi (Male), Vindemiatrix (Female), Sadachbia (Male), Sadaltager (Male), Sulafat (Female), Alnilam (Male), Rasalgethi (Male)
Voice for generated speech.
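
Since voice must match one of the enum strings exactly, including the gender suffix, it can be handy to validate it on the client before sending a request. The following sketch copies the list from the enum above; `VOICES`, `isVoice`, and `voicesByGender` are hypothetical names introduced here for illustration.

```typescript
// Voice labels copied from the documented enum.
const VOICES = [
  "Zephyr (Female)", "Puck (Male)", "Charon (Male)", "Kore (Female)",
  "Fenrir (Male)", "Leda (Female)", "Orus (Male)", "Aoede (Female)",
  "Callirrhoe (Female)", "Autonoe (Female)", "Enceladus (Male)",
  "Iapetus (Male)", "Umbriel (Male)", "Algenib (Male)", "Despina (Female)",
  "Erinome (Female)", "Laomedeia (Female)", "Achernar (Female)",
  "Algieba (Male)", "Schedar (Male)", "Gacrux (Female)",
  "Pulcherrima (Female)", "Achird (Male)", "Zubenelgenubi (Male)",
  "Vindemiatrix (Female)", "Sadachbia (Male)", "Sadaltager (Male)",
  "Sulafat (Female)", "Alnilam (Male)", "Rasalgethi (Male)",
] as const;

type Voice = (typeof VOICES)[number];

// Type guard: true only for an exact match with a documented label.
function isVoice(value: string): value is Voice {
  return (VOICES as readonly string[]).includes(value);
}

// Filter the list by the gender suffix in each label.
function voicesByGender(gender: "Female" | "Male"): Voice[] {
  return VOICES.filter((v) => v.endsWith(`(${gender})`));
}
```

For example, `isVoice("Zephyr")` is false because the suffix is missing, while `isVoice("Zephyr (Female)")` is true.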

voice_script

string, required, default: (empty)
Script for the person to say when no audio is uploaded.
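
Because audio takes priority over voice_script when both are present, a caller may prefer to send only one of them so the request is unambiguous. This is a minimal sketch of that choice: `speechFields` is a hypothetical helper, and rejecting a request with neither source is an assumption, since the API's behavior in that case is not documented here.

```typescript
// Exactly one speech source is emitted for the request body.
type SpeechFields = { audio: string } | { voice_script: string };

// Hypothetical helper mirroring the documented precedence: audio wins.
function speechFields(opts: { audio?: string; voiceScript?: string }): SpeechFields {
  if (opts.audio) {
    return { audio: opts.audio };
  }
  if (opts.voiceScript) {
    return { voice_script: opts.voiceScript };
  }
  // Assumption: having neither source is treated as a caller error.
  throw new Error("Provide either audio or voiceScript");
}
```

The result can be spread into the input object, for example `{ image, ...speechFields({ voiceScript: "Hello" }) }`.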

voice_language

string, required, default: English (US)
enum: English (US), English (UK), Spanish, French, German, Italian, Portuguese (Brazil), Japanese, Korean, Hindi
Output language.

resolution

string, required, default: 720p
enum: 720p, 1080p
Resolution of the video.

video_prompt

string, required, default: The person is talking.
Optional prompt for the video.

voice_prompt

string, required, default: Say the following.
Optional speaking style, tone, pacing or emotion instructions.

negative_prompt

string, required, default: (empty)
Describes what you do NOT want in the video. Disabled if empty.

strength_negative_prompt

number, required, default: 0.5, minimum: 0, maximum: 4
Strength of the negative prompt (0-4).
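
The negative prompt and its strength work as a pair: the strength only matters when negative_prompt is non-empty, and it must stay within 0 to 4. One way to prepare both fields is sketched below. `negativePromptFields` is a hypothetical helper, and both clamping out-of-range values (rather than rejecting them) and treating whitespace-only prompts as empty are assumptions made for illustration.

```typescript
interface NegativePromptFields {
  negative_prompt: string;
  strength_negative_prompt?: number;
}

// Hypothetical helper: build the two negative-prompt fields together.
function negativePromptFields(prompt: string, strength: number = 0.5): NegativePromptFields {
  // Assumption: a whitespace-only prompt counts as empty (feature disabled).
  if (prompt.trim() === "") {
    return { negative_prompt: "" };
  }
  if (!Number.isFinite(strength)) {
    throw new Error("strength must be a finite number");
  }
  // Clamp into the documented range [0, 4].
  const clamped = Math.min(4, Math.max(0, strength));
  return { negative_prompt: prompt, strength_negative_prompt: clamped };
}
```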

seed

integer, minimum: -9007199254740991, maximum: 9007199254740991
Random seed for reproducible generation.
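
The seed bounds match JavaScript's safe-integer range (`Number.MAX_SAFE_INTEGER` is 9007199254740991), so `Number.isSafeInteger` is enough to check a candidate value. The helper names below are hypothetical and only sketch the idea.

```typescript
// True when the value is an integer within the documented seed range.
function isValidSeed(seed: number): boolean {
  return Number.isSafeInteger(seed);
}

// Pick a non-negative seed that can be logged and reused later
// to attempt to reproduce a generation.
function randomSeed(): number {
  return Math.floor(Math.random() * Number.MAX_SAFE_INTEGER);
}
```

Recording the seed alongside the other inputs makes it possible to resend the same request later.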

disable_safety_filter

boolean, required, default: true
Disable safety filter for prompts and input image.

disable_prompt_upsampling

boolean, required, default: false
When true, skip the prompt upsampler and pass the raw user prompt.

video

string, format: uri
Presigned URL for the generated avatar video.
