Build your first video

This guide walks through building a 5-second clip with a voiceover and a music bed, end-to-end through the API. The same flow takes about 30 seconds on the canvas, but seeing the API version makes the data model click.

export LAVENDLY_API_KEY=lv_live_…
export API=https://api.lavendly.ai
AUTH="Authorization: Bearer $LAVENDLY_API_KEY"

1. Create the workflow

WF=$(curl -s -X POST $API/v1/workflows \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{
    "name": "Bookshop fox",
    "shots": [
      { "id": "shot_1",
        "prompt": "a sleepy fox in a bookshop discovering an old map",
        "duration": 5 }
    ]
  }' | jq -r .id)
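
Every later step depends on $WF holding a real workflow id. Because jq -r prints the literal string null when a field is absent, a failed request can leave WF set to "null" without any visible error. A minimal guard, where the helper name require_id is hypothetical and not part of Lavendly's tooling:

```shell
# Hypothetical guard: reject an empty value and the literal "null" that
# `jq -r` prints when the requested field is missing from the response.
require_id() {
  case "$1" in
    ""|null)
      echo "no id returned; inspect the API response" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

Calling require_id "$WF" || exit 1 right after step 1 stops the script before it posts tracks to a workflow that does not exist.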

2. Attach a voiceover with captions

curl -s -X POST $API/v1/workflows/$WF/clips/shot_1/tracks \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -H "Idempotency-Key: vo-$WF" \
  -d '{
    "kind":          "voiceover",
    "script":        "Some maps lead nowhere. This one led home.",
    "subtitleStyle": "tiktok"
  }'

The platform synthesizes the voiceover, transcribes it with word-level timing, and stores both the audio and the caption data on the workflow.
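
The Idempotency-Key header on the request above is what makes a retry safe to script. The sketch below assumes the server deduplicates requests that share a key, which is the usual meaning of the header but is not spelled out in this guide. The function name post_with_retry and the three-attempt limit are illustrative choices, not part of the Lavendly API.

```shell
# Retry a POST up to three times, reusing one Idempotency-Key so that a
# repeated request is not applied twice (assuming server-side deduplication).
# Relies on $AUTH being set as at the top of this guide.
post_with_retry() {
  local url="$1" key="$2" body="$3" attempt
  for attempt in 1 2 3; do
    # -f makes curl exit non-zero on HTTP errors (status >= 400).
    if curl -sf -X POST "$url" \
         -H "$AUTH" -H 'Content-Type: application/json' \
         -H "Idempotency-Key: $key" \
         -d "$body"; then
      return 0
    fi
    sleep "$attempt"   # wait a little longer after each failure
  done
  return 1
}
```

For the voiceover this would be called as post_with_retry "$API/v1/workflows/$WF/clips/shot_1/tracks" "vo-$WF" "$BODY", with BODY holding the same JSON payload as above.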

3. Attach a music bed with ducking

curl -s -X POST $API/v1/workflows/$WF/clips/shot_1/tracks \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -H "Idempotency-Key: music-$WF" \
  -d '{
    "kind":    "music",
    "mood":    "warm cozy acoustic",
    "volume":  0.4,
    "ducking": true
  }'

Setting ducking to true lowers the music by about 8 dB whenever the voiceover is speaking, so the narration stays on top.
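
For a sense of scale, a change in decibels maps to a linear amplitude factor of 10^(dB/20). The arithmetic below is only a sketch: it assumes that volume is a linear amplitude gain, which this guide does not state, and it takes the ~8 dB figure from the sentence above. The helper name db_to_gain is made up for illustration.

```shell
# Convert a dB change to a linear amplitude factor: 10^(dB/20).
db_to_gain() {
  awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(log(10) * db / 20) }'
}

db_to_gain -8   # prints 0.398

# Ducked level for a base volume of 0.4, under the linear-gain assumption.
awk -v v=0.4 -v db=-8 'BEGIN { printf "%.3f\n", v * exp(log(10) * db / 20) }'   # prints 0.159
```

Under that assumption, an 8 dB duck leaves the music at roughly 40% of its set amplitude while the narration plays.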

4. Set the audio mix

Blend the shot's native audio (the audio baked into the generated clip) at 60% volume with the new tracks:

curl -s -X PATCH $API/v1/workflows/$WF/clips/shot_1/audio \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{ "native_audio": { "mode": "mix", "volume": 0.6 } }'

5. Render and poll

JOB=$(curl -s -X POST $API/v1/workflows/$WF/renders \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -H "Idempotency-Key: render-$WF" -d '{}' | jq -r .job_id)

while :; do
  S=$(curl -s $API/v1/workflows/$WF/renders/$JOB -H "$AUTH" | jq -r .status)
  echo "status=$S"
  [[ "$S" == "done" || "$S" == "failed" ]] && break
  sleep 4
done

curl -s $API/v1/workflows/$WF/renders/$JOB -H "$AUTH" | jq .result
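
The loop above keeps running until the job reports done or failed, so a network error or an unexpected status would leave it polling forever. A bounded variant might look like the sketch below. It assumes done and failed are the only terminal statuses, as in the loop above; the names poll_status and render_status are hypothetical.

```shell
# Poll a status command until it reports a terminal state or attempts run out.
# Usage: poll_status MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# COMMAND must print the current status on stdout.
# Returns 0 on "done", 1 on "failed", 2 on timeout.
poll_status() {
  local max="$1" delay="$2" i=0 s
  shift 2
  while [ "$i" -lt "$max" ]; do
    s=$("$@")
    case "$s" in
      "done")   echo "$s"; return 0 ;;
      "failed") echo "$s"; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timeout"
  return 2
}

# Status fetcher using the same request as the loop above.
render_status() {
  curl -s "$API/v1/workflows/$WF/renders/$JOB" -H "$AUTH" | jq -r .status
}
```

Calling poll_status 60 4 render_status checks every 4 seconds and gives up after 60 attempts, about four minutes.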

Expected result:

{
  "video_url":     "https://cdn.lavendly.ai/videos/abc.mp4",
  "duration":      5,
  "quality_check": { "ok": true, "rubric": { "overall": 0.86 } }
}

The same flow, agent-driven

Once an agent has the Lavendly MCP and a skill loaded, all five steps above collapse into one prompt:

Make me a 5-second video: a sleepy fox in a bookshop discovering an old map. Narrate it warmly. Cozy acoustic bed underneath. Captions on. Send me the URL.

The agent runs the same calls in the same order, applies the same idempotency keys, and returns the URL. See the storyteller skill for the exact operating manual.
