Overview

Upload Workflow

End-to-end flow when a user uploads a file via the CLI. The flow involves three systems: the Python SDK (CLI), BSS (storage), and DreamLake Server (metadata).

Sequence

CLI (dreamlake upload)
 │
 ├─ 1. Detect file type from extension
 │     .mp4 → video, .wav → audio, .vtt → text-track, .jsonl → label-track
 │
 ├─ 2. Upload to BSS (multipart)
 │     POST /videos/upload/multipart/init     → { uploadId, key }
 │     POST /videos/upload/multipart/parts    → { parts: [{ url }] }
 │     PUT  each part to presigned S3 URL
 │     POST /videos/upload/multipart/complete → { success }
 │
 │     File lands in S3: {owner}/{project}/staging/{hash}
 │
 ├─ 3. Register in BSS
 │     POST /videos  → { videoId }
 │     (or /audio, /labels, /text-tracks depending on type)
 │
 ├─ 4. Trigger Lambda (video/audio only)
 │     POST /lambdas/hls-split    { videoId }    → HLS chunks + playlist
 │     POST /lambdas/audio-process { audioId }   → processed audio
 │
 └─ 5. Register in DreamLake Server
       POST /assets/video  → { id, projectId, episodeId }
       (or /assets/audio, /assets/label-track, /assets/text-track)
 
       Server side effects:
       ├─ Resolve namespace + project (upsert)
       ├─ Resolve episode (upsert if episodeName provided)
       ├─ Parse name "/camera/front/run01.mp4"
       │   → folder path: /camera/front
       │   → filename: run01.mp4
       ├─ ensureFolderHierarchy → upsert Node records
       └─ Create Video/Audio/etc. with folderId + filename
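
The sequence above can be sketched as a single function. This is a minimal illustration, not the SDK's actual code: `upload_video`, the `Transport` callable, and every request body field are assumptions, since the diagram only documents the endpoints and their response shapes. HTTP is replaced by in-memory fakes so the call order is easy to inspect.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: (method, path, json_body) -> parsed JSON response.
Transport = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def upload_video(
    bss: Transport,
    server: Transport,
    put_part: Callable[[str, bytes], None],
    name: str,
    parts: List[bytes],
) -> Dict[str, Any]:
    """Run steps 2-5 of the sequence for a video already split into parts."""
    # Step 2: multipart upload. Request bodies are assumed; only the
    # response shapes ({uploadId, key}, {parts: [{url}]}) come from the doc.
    init = bss("POST", "/videos/upload/multipart/init", {"name": name})
    upload = {"uploadId": init["uploadId"], "key": init["key"]}
    urls = bss("POST", "/videos/upload/multipart/parts", {**upload, "count": len(parts)})
    for part, data in zip(urls["parts"], parts):
        put_part(part["url"], data)  # PUT to the presigned S3 URL
    bss("POST", "/videos/upload/multipart/complete", upload)

    # Step 3: register the uploaded object in BSS.
    video_id = bss("POST", "/videos", upload)["videoId"]

    # Step 4: ask BSS to dispatch the HLS-split Lambda.
    bss("POST", "/lambdas/hls-split", {"videoId": video_id})

    # Step 5: register the asset in DreamLake Server.
    return server("POST", "/assets/video", {"name": name, "videoId": video_id})


# Usage with in-memory fakes instead of real HTTP calls.
calls: List[str] = []
BSS_REPLIES = {
    "/videos/upload/multipart/init": {"uploadId": "u1", "key": "k1"},
    "/videos/upload/multipart/parts": {"parts": [{"url": "s3://p1"}, {"url": "s3://p2"}]},
    "/videos/upload/multipart/complete": {"success": True},
    "/videos": {"videoId": "v1"},
    "/lambdas/hls-split": {},
}


def fake_bss(method: str, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(f"{method} {path}")
    return BSS_REPLIES[path]


def fake_server(method: str, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(f"{method} {path}")
    return {"id": "a1", "projectId": "p1", "episodeId": None}


def fake_put(url: str, data: bytes) -> None:
    calls.append(f"PUT {url}")


asset = upload_video(fake_bss, fake_server, fake_put, "/camera/front/run01.mp4", [b"aa", b"bb"])
```

Passing the transports in as callables keeps the ordering logic separate from any particular HTTP client, which also makes the flow testable offline.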

Python SDK (CLI)

The CLI auto-detects the file type from the extension and routes to the matching BSS and DreamLake Server endpoints:

Extension	Type	BSS Endpoint	DL Server Endpoint	Lambda
`.mp4`, `.mov`, `.avi`, `.mkv`, `.webm`	video	`/videos`	`/assets/video`	`hls-split`
`.wav`, `.mp3`, `.flac`, `.aac`, `.ogg`	audio	`/audio`	`/assets/audio`	`audio-process`
`.vtt`, `.srt`	text-track	`/text-tracks`	`/assets/text-track`	—
`.jsonl`, `.csv`	label-track	`/labels`	`/assets/label-track`	—
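
The routing table above maps naturally onto a lookup keyed by extension. The following is a sketch only: `Route`, `ROUTE_BY_EXT`, and `route_for` are hypothetical names, and lowercasing the extension before lookup is an assumption about how the CLI treats names such as `CLIP.MP4`.

```python
from pathlib import PurePosixPath
from typing import Dict, NamedTuple, Optional, Tuple


class Route(NamedTuple):
    file_type: str
    bss_endpoint: str
    server_endpoint: str
    lambda_name: Optional[str]  # None when no Lambda runs for this type


# One entry per row of the table above.
_ROWS: Dict[Tuple[str, ...], Route] = {
    (".mp4", ".mov", ".avi", ".mkv", ".webm"): Route("video", "/videos", "/assets/video", "hls-split"),
    (".wav", ".mp3", ".flac", ".aac", ".ogg"): Route("audio", "/audio", "/assets/audio", "audio-process"),
    (".vtt", ".srt"): Route("text-track", "/text-tracks", "/assets/text-track", None),
    (".jsonl", ".csv"): Route("label-track", "/labels", "/assets/label-track", None),
}

# Flatten to a direct extension -> route lookup.
ROUTE_BY_EXT: Dict[str, Route] = {ext: route for exts, route in _ROWS.items() for ext in exts}


def route_for(path: str) -> Route:
    """Return the route for a file path, or raise ValueError if unsupported."""
    ext = PurePosixPath(path).suffix.lower()  # assumed: matching is case-insensitive
    try:
        return ROUTE_BY_EXT[ext]
    except KeyError:
        raise ValueError(f"unsupported file extension: {ext!r}") from None
```

For example, `route_for("/camera/front/run01.mp4")` yields the video route, while `route_for("notes.txt")` raises `ValueError`.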

BSS (Storage)

BSS handles all binary data. Key responsibilities:

Multipart upload — chunked upload to S3 staging pool
Metadata — JSON sidecar per video/audio/track in S3
Lambda dispatch — trigger HLS splitting or audio processing
Presigned URLs — download/streaming URLs for clients
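
As an illustration of the chunked upload, the sketch below computes the byte range of each part. `plan_parts` and the 8 MiB default are assumptions, not values taken from the CLI. The 5 MiB floor is the S3 minimum size for every part except the last.

```python
from typing import List, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for all parts except the last


def plan_parts(file_size: int, part_size: int = 8 * 1024 * 1024) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) per part; part numbers start at 1."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum of 5 MiB")
    parts = []
    offset = 0
    number = 1
    while offset < file_size:
        # The final part carries whatever remains and may be shorter.
        length = min(part_size, file_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

Each tuple identifies the slice of the file to `PUT` to the presigned URL for that part.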

S3 layout after upload:

{owner}/{project}/
  staging/{hash}                    # Raw file
  videos/{videoId}/meta.json        # After Lambda completes
  videos/{videoId}/stream/*.m3u8    # HLS playlists
chunks/{hash}.ts                    # Global deduped chunk pool
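
The layout can be expressed as a few key-building helpers. The function names are hypothetical. The key formats follow the listing above, with the chunk pool deliberately outside the `{owner}/{project}/` prefix.

```python
def staging_key(owner: str, project: str, file_hash: str) -> str:
    """Key of the raw uploaded file."""
    return f"{owner}/{project}/staging/{file_hash}"


def video_meta_key(owner: str, project: str, video_id: str) -> str:
    """Key of the JSON sidecar written after the Lambda completes."""
    return f"{owner}/{project}/videos/{video_id}/meta.json"


def video_stream_prefix(owner: str, project: str, video_id: str) -> str:
    """Prefix under which the HLS playlists (*.m3u8) are stored."""
    return f"{owner}/{project}/videos/{video_id}/stream/"


def chunk_key(chunk_hash: str) -> str:
    """Key in the global chunk pool, shared across owners and projects."""
    return f"chunks/{chunk_hash}.ts"
```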

DreamLake Server (Metadata)

DreamLake Server is the index layer. It does NOT touch S3 directly. On asset registration:

Resolves namespace slug → Namespace record
Upserts project → Project record
Upserts episode (if provided) → Episode record
Parses name into folder path + filename
Calls ensureFolderHierarchy → upserts Node records for each path segment
Creates the type-specific record (Video, Audio, etc.) with folderId, filename, and BSS reference
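
The name parsing and folder upsert steps might look roughly like the sketch below. It is not the server's implementation: `NodeStore` is a hypothetical in-memory stand-in for the Node table, and `ensure_folder_hierarchy` only mirrors the behaviour described for `ensureFolderHierarchy` (one upsert per path segment).

```python
import posixpath
from typing import Dict, Optional, Tuple


def split_asset_name(name: str) -> Tuple[str, str]:
    """Split '/camera/front/run01.mp4' into ('/camera/front', 'run01.mp4')."""
    return posixpath.split(name)


class NodeStore:
    """In-memory stand-in for Node records, keyed by (parent_id, name)."""

    def __init__(self) -> None:
        self.nodes: Dict[Tuple[Optional[int], str], int] = {}

    def upsert(self, parent_id: Optional[int], name: str) -> int:
        key = (parent_id, name)
        if key not in self.nodes:
            self.nodes[key] = len(self.nodes) + 1  # create only when missing
        return self.nodes[key]


def ensure_folder_hierarchy(store: NodeStore, folder_path: str) -> Optional[int]:
    """Upsert one node per path segment; return the deepest folder's id.

    Returns None for the root (empty or '/' path).
    """
    parent: Optional[int] = None
    for segment in folder_path.strip("/").split("/"):
        if segment:
            parent = store.upsert(parent, segment)
    return parent


store = NodeStore()
folder, filename = split_asset_name("/camera/front/run01.mp4")
folder_id = ensure_folder_hierarchy(store, folder)
```

Because every segment goes through `upsert`, registering a second asset under `/camera/front` reuses the existing nodes instead of creating duplicates.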

Lambda Workflow

BSS dispatches the Lambdas once the upload completes and the CLI calls the matching `/lambdas` endpoint (step 4 of the sequence). They run asynchronously.

HLS Split (video)

BSS                          AWS Lambda                    S3
 │                              │                           │
 ├─ POST /lambdas/hls-split ──→│                           │
 │   { videoId,                 │                           │
 │     stagingHash,             │                           │
 │     owner, project }         │                           │
 │                              ├─ GET staging/{hash} ─────→│
 │                              │←─ raw video bytes ────────│
 │                              │                           │
 │                              ├─ ffmpeg: split into       │
 │                              │  TS segments + m3u8       │
 │                              │                           │
 │                              ├─ PUT chunks/{hash}.ts ───→│ (deduped)
 │                              ├─ PUT stream/{h}.m3u8 ────→│
 │                              ├─ PUT meta.json ──────────→│
 │                              │                           │
 │←─ update video record ───────│                           │
 │   (length, durationSec,      │                           │
 │    streams, chunks)          │                           │
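
The deduplicated write to `chunks/{hash}.ts` can be illustrated with a content-addressed store. This is a sketch under stated assumptions: the diagram does not name the hash function, so SHA-256 is used here purely for illustration, and a plain dict stands in for S3.

```python
import hashlib
from typing import Dict, Iterable, List


def store_chunks(pool: Dict[str, bytes], segments: Iterable[bytes]) -> List[str]:
    """Write each segment under a content-derived key, skipping existing keys.

    Returns the key for every segment in order, so a playlist can reference
    the chunks even when identical segments share one stored object.
    """
    keys = []
    for segment in segments:
        digest = hashlib.sha256(segment).hexdigest()  # assumed hash function
        key = f"chunks/{digest}.ts"
        pool.setdefault(key, segment)  # no-op when the chunk already exists
        keys.append(key)
    return keys
```

Identical segments hash to the same key, so uploading the same footage twice adds no new objects to the pool.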

Audio Process

The audio Lambda follows the same pattern: it reads the raw audio from staging, processes it, writes the output back to S3, and updates the BSS audio record.

Error Handling

Stage	Failure	Recovery
Multipart upload	Part upload fails	CLI retries; BSS tracks completed parts
Lambda	Timeout or crash	Re-trigger via `POST /lambdas/hls-split` (or `POST /lambdas/audio-process` for audio)
Asset registration	Server down	CLI retries; BSS record exists independently
Folder creation	Race condition	`upsert` handles concurrent creates
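
The "CLI retries" entries could be implemented with a small helper like the one below. The table does not specify a retry policy, so the attempt count, the exponential backoff, and the choice of `ConnectionError` as the retryable error are all assumptions, and `with_retries` is a hypothetical name.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

Retrying is safe here because the registration calls are upserts and BSS tracks completed parts, so a repeated request does not create duplicates.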