P-Video-Avatar
Image-to-Video • Pruna AI

Pruna's P-Video-Avatar generates talking-head videos from a single portrait image driven by a text script or audio file, with multiple voices, languages, and output resolutions.
| Model Info | |
|---|---|
| More information | link ↗ |
| Pricing | View pricing in the Cloudflare dashboard ↗ |
Usage
```ts
const response = await env.AI.run(
  'pruna/p-video-avatar',
  {
    image: 'https://huggingface.co/spaces/yisol/IDM-VTON/resolve/main/example/human/00121_00.jpg',
    voice_script: 'Hello, welcome to our product demo!',
    voice: 'Zephyr (Female)',
    resolution: '720p',
  },
)
console.log(response)
```

```sh
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "pruna/p-video-avatar",
    "input": {
      "image": "https://huggingface.co/spaces/yisol/IDM-VTON/resolve/main/example/human/00121_00.jpg",
      "voice_script": "Hello, welcome to our product demo!",
      "voice": "Zephyr (Female)",
      "resolution": "720p"
    }
  }'
```

```json
{
  "state": "Completed",
  "result": {
    "video": "https://examples.aig.cloudflare.com/pruna/p-video-avatar/product-demo-greeting.mp4"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
```

Parameters
image
string, required
Input portrait image (first frame). HTTP(S) URL or data URI. Supports jpg, jpeg, png, webp.

audio
string
URL of uploaded audio to drive speech. HTTP(S) URL or data URI. If both audio and voice_script are provided, audio takes priority.

voice
string, required, default: Zephyr (Female)
enum: Zephyr (Female), Puck (Male), Charon (Male), Kore (Female), Fenrir (Male), Leda (Female), Orus (Male), Aoede (Female), Callirrhoe (Female), Autonoe (Female), Enceladus (Male), Iapetus (Male), Umbriel (Male), Algenib (Male), Despina (Female), Erinome (Female), Laomedeia (Female), Achernar (Female), Algieba (Male), Schedar (Male), Gacrux (Female), Pulcherrima (Female), Achird (Male), Zubenelgenubi (Male), Vindemiatrix (Female), Sadachbia (Male), Sadaltager (Male), Sulafat (Female), Alnilam (Male), Rasalgethi (Male)
Voice for generated speech.

voice_script
string, required, default: "" (empty)
Script for the person to say when no audio is uploaded.

voice_language
string, required, default: English (US)
enum: English (US), English (UK), Spanish, French, German, Italian, Portuguese (Brazil), Japanese, Korean, Hindi
Output language.

resolution
string, required, default: 720p
enum: 720p, 1080p
Resolution of the video.

video_prompt
string, required, default: The person is talking.
Optional prompt for the video.

voice_prompt
string, required, default: Say the following.
Optional speaking style, tone, pacing or emotion instructions.

negative_prompt
string, required, default: "" (empty)
Mention what you do NOT want in the video. Disabled if empty.

strength_negative_prompt
number, required, default: 0.5, minimum: 0, maximum: 4
Strength of the negative prompt (0-4).

seed
integer, minimum: -9007199254740991, maximum: 9007199254740991
Random seed for reproducible generation.

disable_safety_filter
boolean, required, default: true
Disable safety filter for prompts and input image.

disable_prompt_upsampling
boolean, required, default: false
When true, skip the prompt upsampler and pass the raw user prompt.

video
string, format: uri
Presigned URL for the generated avatar video. Returned in the response as result.video.
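
Several of the constraints listed above (URL or data URI inputs, the resolution enum, the 0-4 range for strength_negative_prompt, the integer bounds on seed, and audio taking priority over voice_script) can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not part of the Workers AI API: `AvatarInput`, `buildAvatarInput`, and `extractVideoUrl` are hypothetical names introduced here for illustration. It also assumes that a request with neither audio nor voice_script is not useful and rejects it, and that the response has the shape shown in the example under Usage.

```typescript
// Hypothetical client-side helpers; names and behavior are illustrative assumptions.
interface AvatarInput {
  image: string;
  audio?: string;
  voice_script?: string;
  voice?: string;
  resolution?: string;
  strength_negative_prompt?: number;
  seed?: number;
}

const RESOLUTIONS = ['720p', '1080p'];
const URL_OR_DATA_URI = /^(https?:\/\/|data:)/i;

function buildAvatarInput(input: AvatarInput): {
  input: AvatarInput;
  speechSource: 'audio' | 'voice_script';
} {
  if (!URL_OR_DATA_URI.test(input.image)) {
    throw new Error('image must be an HTTP(S) URL or data URI');
  }
  const hasAudio = typeof input.audio === 'string' && input.audio.length > 0;
  const hasScript =
    typeof input.voice_script === 'string' && input.voice_script.trim().length > 0;
  if (hasAudio && !URL_OR_DATA_URI.test(input.audio as string)) {
    throw new Error('audio must be an HTTP(S) URL or data URI');
  }
  // Assumption: a request with no speech source at all is treated as a mistake.
  if (!hasAudio && !hasScript) {
    throw new Error('provide either audio or voice_script');
  }
  if (input.resolution !== undefined && !RESOLUTIONS.includes(input.resolution)) {
    throw new Error('resolution must be 720p or 1080p');
  }
  const strength = input.strength_negative_prompt;
  if (strength !== undefined && !(strength >= 0 && strength <= 4)) {
    throw new Error('strength_negative_prompt must be between 0 and 4');
  }
  if (input.seed !== undefined && !Number.isSafeInteger(input.seed)) {
    throw new Error('seed must be an integer within the documented bounds');
  }
  // Documented behavior: when both are provided, audio takes priority.
  return { input, speechSource: hasAudio ? 'audio' : 'voice_script' };
}

// Reads result.video, assuming the response shape from the example response.
function extractVideoUrl(response: unknown): string {
  const video = (response as { result?: { video?: unknown } } | null)?.result?.video;
  if (typeof video !== 'string') {
    throw new Error('response has no result.video URL');
  }
  return video;
}
```

In a Worker, the validated object could then be passed along as `env.AI.run('pruna/p-video-avatar', buildAvatarInput(params).input)`, and `extractVideoUrl` applied to the result. The voice and voice_language enums are left unchecked here to keep the sketch short.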