1. Create the workflow
2. Attach a voiceover with captions
3. Attach a music bed with ducking
Setting ducking: true lowers the music by about 8 dB whenever the voiceover speaks,
so the narration stays on top.
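To get a feel for what an 8 dB duck means, the sketch below converts decibels to a linear amplitude multiplier. It assumes the usual audio convention for amplitude gain (20·log10). How the ducking is actually applied (attack, release, exact depth) is not described here, so treat this as arithmetic only, not as the service's implementation.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude multiplier back to decibels."""
    return 20 * math.log10(gain)


# An 8 dB reduction leaves the music at roughly 40% of its original amplitude.
duck_gain = db_to_gain(-8.0)
print(f"{duck_gain:.3f}")  # 0.398
```

In other words, while the narrator is speaking the music bed sits at a bit under half of its normal amplitude.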
4. Set the audio mix
We want the shot’s baked audio blended at 60% with the new tracks.
5. Render and poll
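The polling half of this step can be sketched generically. The example below is a minimal sketch, not the Lavendly client: `fetch_status` is a hypothetical callable that you supply (for example, a function that GETs the render job), and the status strings `"succeeded"` and `"failed"` and the `error` field are assumptions chosen for illustration. It polls with exponential backoff and gives up after a timeout.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],  # hypothetical: returns the job as a dict
    *,
    interval: float = 2.0,       # first wait between polls, in seconds
    max_interval: float = 30.0,  # cap for the backoff
    timeout: float = 600.0,      # overall budget, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a render job until it finishes, doubling the wait after each attempt."""
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        job = fetch_status()
        status = job.get("status")  # assumed field name
        if status == "succeeded":   # assumed terminal state
            return job
        if status == "failed":      # assumed terminal state
            raise RuntimeError(f"render failed: {job.get('error', 'unknown error')}")
        if time.monotonic() + delay > deadline:
            raise TimeoutError("render did not finish before the timeout")
        sleep(delay)
        delay = min(delay * 2, max_interval)
```

Backing off keeps short renders responsive while avoiding a tight loop on long ones. Swap in the real status values and field names from the API reference before using this.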
The same flow, agent-driven
Once an agent has the Lavendly MCP and a skill loaded, all five steps above collapse into one prompt:
Make me a 5-second video: a sleepy fox in a bookshop discovering an old map. Narrate it warmly. Cozy acoustic bed underneath. Captions on. Send me the URL.
The agent runs the same calls in the same order, applies the same idempotency keys, and returns the URL. See the storyteller skill for the exact operating manual.
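How the keys are derived is not spelled out here. One common way to make "the same idempotency keys" reproducible is to hash the step name together with a canonical form of its request payload, so a retry of the same step yields the same key. The sketch below illustrates that idea only. The function name, the key format, and the payload fields other than `ducking` are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


def idempotency_key(step: str, payload: Dict[str, Any]) -> str:
    """Derive a stable key from a step name and its request payload."""
    # Sorting keys and fixing separators makes the JSON independent of dict order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{step}\n{canonical}".encode("utf-8")).hexdigest()
    return f"{step}-{digest[:32]}"


# Hypothetical payload for the music step; only `ducking` comes from this guide.
first = idempotency_key("music", {"ducking": True, "style": "cozy acoustic"})
retry = idempotency_key("music", {"style": "cozy acoustic", "ducking": True})
print(first == retry)  # True
```

With keys like these, a human script and an agent that send identical requests also send identical keys, so a retried call can be recognised as a duplicate instead of starting a second render.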