# Build your first video

> End-to-end walkthrough: shot → voiceover → music → render.

This guide walks through building a 5-second clip with a voiceover and
a music bed, end-to-end through the API. The same flow takes about 30
seconds on the canvas, but seeing the API version makes the data
model click.

```bash
export LAVENDLY_API_KEY=lv_live_…
export API=https://api.lavendly.ai
AUTH="Authorization: Bearer $LAVENDLY_API_KEY"
```
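
Every later call depends on these variables, so it can help to fail fast when one is missing. A minimal sketch, where `require_env` is a hypothetical helper name and not part of any Lavendly tooling:

```bash
# Hypothetical helper: returns non-zero if any named variable is unset or empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the two variables the walkthrough relies on before making any calls.
require_env LAVENDLY_API_KEY API || echo "export the variables above first" >&2
```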

## 1. Create the workflow

```bash
# Create the workflow and capture its ID for the later calls.
WF=$(curl -s -X POST "$API/v1/workflows" \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{
    "name": "Bookshop fox",
    "shots": [
      { "id": "shot_1",
        "prompt": "a sleepy fox in a bookshop discovering an old map",
        "duration": 5 }
    ]
  }' | jq -r .id)
```
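
If the create call fails, `jq -r .id` prints the literal string `null` (or nothing at all), and every later URL would be built from a bad ID. A small guard, sketched here with a hypothetical `have_id` helper, catches that early:

```bash
# Hypothetical helper: succeeds only if the value looks like a real ID,
# i.e. it is neither empty nor the literal "null" that jq -r prints
# when the requested field is absent.
have_id() {
  [ -n "${1:-}" ] && [ "$1" != "null" ]
}

# WF is expected to be set by the create call; ${WF:-} tolerates it being unset.
if have_id "${WF:-}"; then
  echo "workflow id: $WF"
else
  echo "workflow creation failed; inspect the raw response" >&2
fi
```

The same check applies to any other ID captured with `jq -r`.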

## 2. Attach a voiceover with captions

```bash
curl -s -X POST "$API/v1/workflows/$WF/clips/shot_1/tracks" \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -H "Idempotency-Key: vo-$WF" \
  -d '{
    "kind":          "voiceover",
    "script":        "Some maps lead nowhere. This one led home.",
    "subtitleStyle": "tiktok"
  }'
```

The platform synthesizes the voiceover, transcribes it with
word-level timing, and stores both the audio and the caption data on
the workflow.
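
The request above sends an `Idempotency-Key` header. Assuming the API treats repeated requests with the same key as a single operation (this guide does not spell that out), the call can be retried after a network error without creating a duplicate track. A sketch of such a wrapper, with `retry` as a hypothetical helper:

```bash
# Hypothetical helper: run a command up to $1 times, pausing $2 seconds
# between attempts, and stop at the first success.
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage sketch (not executed here). curl's -f flag makes HTTP errors return
# a non-zero exit code, which is what triggers the next attempt:
#   retry 3 2 curl -sf -X POST "$API/v1/workflows/$WF/clips/shot_1/tracks" \
#     -H "$AUTH" -H 'Content-Type: application/json' \
#     -H "Idempotency-Key: vo-$WF" -d @voiceover.json
```

Here `voiceover.json` is a placeholder file holding the request body shown above. The key stays the same across attempts, which is the point of the wrapper.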

## 3. Attach a music bed with ducking

```bash
curl -s -X POST "$API/v1/workflows/$WF/clips/shot_1/tracks" \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -H "Idempotency-Key: music-$WF" \
  -d '{
    "kind":    "music",
    "mood":    "warm cozy acoustic",
    "volume":  0.4,
    "ducking": true
  }'
```

`ducking: true` drops the music by about 8 dB whenever the voiceover
speaks, so the narration stays on top.
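
For a sense of scale, a decibel change maps to a linear amplitude factor of 10^(dB/20), so an 8 dB cut leaves roughly 40% of the amplitude. The sketch below does the arithmetic with `awk`. `db_to_gain` is a hypothetical helper, and treating `volume` as a linear gain is an assumption this guide does not confirm:

```bash
# Convert a decibel change to a linear amplitude factor: 10^(dB/20).
db_to_gain() {
  awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(db / 20 * log(10)) }'
}

db_to_gain -8   # prints 0.398

# Assumption: "volume" is a linear gain, so the ducked level is volume * factor.
awk -v vol=0.4 -v g="$(db_to_gain -8)" 'BEGIN { printf "%.3f\n", vol * g }'   # prints 0.159
```

Under that assumption, a bed set to `0.4` would sit near `0.16` while the narration plays.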

## 4. Set the audio mix

Keep the audio baked into the shot and mix it in at 60% volume
alongside the new tracks:

```bash
curl -s -X PATCH "$API/v1/workflows/$WF/clips/shot_1/audio" \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{ "native_audio": { "mode": "mix", "volume": 0.6 } }'
```
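
To try different mix levels without hand-editing JSON, the body can be generated from a shell variable. This is a sketch only: `mix_payload` is a hypothetical helper, and the 0 to 1 range check is an assumption inferred from the `0.6` example, not a documented limit.

```bash
# Hypothetical helper: print the PATCH body for a given native-audio volume.
# Assumption: volume is a fraction between 0 and 1.
mix_payload() {
  local vol=$1
  if ! awk -v v="$vol" 'BEGIN { exit !(v ~ /^[0-9]*\.?[0-9]+$/ && v + 0 <= 1) }'; then
    echo "volume must be between 0 and 1, got: $vol" >&2
    return 1
  fi
  printf '{ "native_audio": { "mode": "mix", "volume": %s } }\n' "$vol"
}

mix_payload 0.6   # prints the same body as the request above
```

The output can be passed to curl with `-d "$(mix_payload 0.6)"`.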

## 5. Render and poll

```bash
# Start the render and capture the job ID.
JOB=$(curl -s -X POST "$API/v1/workflows/$WF/renders" \
  -H "$AUTH" -H 'Content-Type: application/json' \
  -H "Idempotency-Key: render-$WF" -d '{}' | jq -r .job_id)

# Poll every 4 seconds until the job reaches a terminal state.
while :; do
  S=$(curl -s "$API/v1/workflows/$WF/renders/$JOB" -H "$AUTH" | jq -r .status)
  echo "status=$S"
  [[ "$S" == "done" || "$S" == "failed" ]] && break
  sleep 4
done

# Print the render result.
curl -s "$API/v1/workflows/$WF/renders/$JOB" -H "$AUTH" | jq .result
```

Expected result:

```json theme={null}
{
  "video_url":     "https://cdn.lavendly.ai/videos/abc.mp4",
  "duration":      5,
  "quality_check": { "ok": true, "rubric": { "overall": 0.86 } }
}
```
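
The polling loop in step 5 runs until the status is `done` or `failed`, so it never exits if the status call keeps erroring or the job stalls. A bounded variant might look like the sketch below. `poll_status` and `render_status` are hypothetical names, and the attempt budget is arbitrary:

```bash
# Hypothetical helper: call a status command until it prints "done" or
# "failed", or until the attempt budget runs out.
#   $1 = max attempts, $2 = seconds between attempts, rest = status command
# Returns 0 on "done", 1 on "failed", 2 on timeout.
poll_status() {
  local max=$1 delay=$2 n s
  shift 2
  for ((n = 1; n <= max; n++)); do
    s=$("$@") || s="error"   # treat a failed status call as transient
    echo "attempt $n: status=$s" >&2
    case "$s" in
      "done")   return 0 ;;
      "failed") return 1 ;;
    esac
    sleep "$delay"
  done
  echo "timed out after $max attempts" >&2
  return 2
}

# Usage sketch (not executed here), reusing the variables from step 5:
#   render_status() {
#     curl -s "$API/v1/workflows/$WF/renders/$JOB" -H "$AUTH" | jq -r .status
#   }
#   poll_status 75 4 render_status   # about five minutes at a 4-second interval
```

Distinct return codes let a calling script tell a failed render apart from a timeout.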

## The same flow, agent-driven

Once an agent has the Lavendly MCP and a skill loaded, all five steps
above collapse into one prompt:

> Make me a 5-second video: a sleepy fox in a bookshop discovering an
> old map. Narrate it warmly. Cozy acoustic bed underneath. Captions
> on. Send me the URL.

The agent runs the same calls in the same order, applies the same
idempotency keys, and returns the URL. See
[the storyteller skill](/agent-skills/storyteller) for the exact
operating manual.