# Key Design

## Chunked Upload
The CLI uploads files to BSS using S3 multipart upload with resumable state and parallel workers.
### Constants
| Setting | Value |
|---|---|
| Chunk size | 10 MB (S3 minimum is 5 MB, except for the last part) |
| Max parallel workers | 4 |
| Hash algorithm | SHA256, first 16 hex chars |
| S3 PUT timeout | 300 s per part |
| State directory | `~/.dreamlake/uploads/` |
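As a reference point, these settings could live in a single constants module. A minimal sketch; the module and constant names are illustrative, not the actual CLI source:

```python
# upload_constants.py -- illustrative constants mirroring the table above
from pathlib import Path

CHUNK_SIZE = 10 * 1024 * 1024      # 10 MB per part (S3 minimum is 5 MB, except the last part)
MAX_WORKERS = 4                    # parallel upload workers
HASH_PREFIX_LEN = 16               # first 16 hex chars of the SHA256 digest
PART_PUT_TIMEOUT = 300             # seconds per S3 PUT
STATE_DIR = Path.home() / ".dreamlake" / "uploads"   # resume state lives here
```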
### Full Upload Flow
Example: uploading a video. All 4 types follow the same pattern — only the BSS URL prefix differs.
```text
dreamlake upload ./run01.mp4 --episode alice@robotics:run-042 --to /camera/front
│
├─ 1. Read file, compute hash
│ hash = SHA256(content)[:16]
│ parts = ceil(fileSize / 10MB)
│
├─ 2. Check for resume state
│ ~/.dreamlake/uploads/{hash}.json
│ If exists → verify with BSS parts-done endpoint
│ If valid → skip already-uploaded parts
│
├─ 3. Init multipart (if no resume)
│ POST /videos/upload/multipart/init
│ → { uploadId, key }
│ Save state to disk
│
├─ 4. Get presigned S3 URLs for remaining parts
│ POST /videos/upload/multipart/parts
│ → { "1": "https://s3...", "2": "https://s3...", ... }
│
├─ 5. Upload parts in parallel (4 workers)
│ PUT chunk bytes directly to S3 presigned URLs
│ Extract ETag from response
│ Save state after each part completes
│
├─ 6. Complete multipart
│ POST /videos/upload/multipart/complete
│ → { success: true }
│ Clear state file
│
├─ 7. Register in BSS
│ POST /videos { owner, project, stagingHash, ... }
│ → { id: bssVideoId }
│
├─ 8. Register in DreamLake Server
│ POST /assets/video { namespace, space, episodeName, name, bssVideoId }
│ → { id, lambdaUrl }
│ Server creates: episode node → folder hierarchy → asset leaf node
│
└─ 9. Trigger Lambda via presigned URL
POST {lambdaUrl} (no auth header — URL is HMAC-signed)
→ 202 dispatched
        Lambda splits file into HLS segments in the background
```
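Condensed into code, steps 1–6 of this flow might look like the sketch below. The endpoint paths and response shapes come from the diagram above; `bss_post`, `load_resume_state`, `save_state`, `clear_state`, and `upload_parts` are hypothetical helpers (the next two sections sketch the resume check and the worker pool), and the request payload field names are assumptions.

```python
import hashlib
import math

def upload_file(path: str, prefix: str = "/videos/upload/multipart") -> str:
    content = open(path, "rb").read()

    # Step 1: content hash and part count.
    file_hash = hashlib.sha256(content).hexdigest()[:HASH_PREFIX_LEN]
    total_parts = math.ceil(len(content) / CHUNK_SIZE)

    # Step 2: resume if a valid state file exists (see Resume & Retry).
    state = load_resume_state(file_hash)

    # Step 3: otherwise init a fresh multipart upload and persist the state.
    if state is None:
        init = bss_post(f"{prefix}/init", {"hash": file_hash})  # -> {uploadId, key}
        state = {"uploadId": init["uploadId"], "key": init["key"],
                 "totalParts": total_parts, "completedParts": []}
        save_state(file_hash, state)

    # Step 4: presigned URLs for the parts not yet uploaded
    # (keyed by part number as a string, per the diagram above).
    done = {p["partNumber"] for p in state["completedParts"]}
    remaining = [n for n in range(1, total_parts + 1) if n not in done]
    urls = bss_post(f"{prefix}/parts",
                    {"uploadId": state["uploadId"], "key": state["key"],
                     "partNumbers": remaining})

    # Step 5: parallel PUTs to S3 (see Parallel Upload).
    parts = state["completedParts"] + upload_parts(content, urls, file_hash)

    # Step 6: finalize, then drop the local state file.
    bss_post(f"{prefix}/complete",
             {"uploadId": state["uploadId"], "key": state["key"],
              "parts": sorted(parts, key=lambda p: p["partNumber"])})
    clear_state(file_hash)
    return file_hash
```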
### Resume & Retry

Uploads are resumable across CLI invocations. State is persisted to `~/.dreamlake/uploads/{hash}.json`:
```json
{
"uploadId": "s3-multipart-upload-id",
"key": "alice/robotics/staging/abc123def456",
"totalParts": 12,
"completedParts": [
{ "partNumber": 1, "etag": "\"abc...\"" },
{ "partNumber": 2, "etag": "\"def...\"" }
]
}
```

On resume:
- Load state file by hash
- Call `GET /{type}/upload/multipart/parts-done?uploadId=...&key=...`
- If not expired → skip completed parts, upload only remaining
- If expired → start fresh
On failure mid-upload, the CLI prints `upload paused — re-run to resume` and exits. There is no explicit abort on this path; the S3 multipart upload expires naturally (typically after 7 days, via a bucket lifecycle rule).
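A minimal sketch of that resume check, shown for the video type. The `parts-done` path is from this doc; `BSS_BASE`, the response handling, and the helper shape are assumptions:

```python
import json
import requests

def load_resume_state(file_hash: str) -> dict | None:
    """Return the saved multipart state if BSS still recognizes the upload."""
    state_file = STATE_DIR / f"{file_hash}.json"
    if not state_file.exists():
        return None
    state = json.loads(state_file.read_text())

    # Verify against BSS before trusting the local part list.
    resp = requests.get(
        f"{BSS_BASE}/videos/upload/multipart/parts-done",
        params={"uploadId": state["uploadId"], "key": state["key"]},
        timeout=30,
    )
    if resp.status_code != 200:
        state_file.unlink()   # upload expired server-side: start fresh
        return None
    return state
```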
### Parallel Upload
Parts are uploaded concurrently using a `ThreadPoolExecutor`:
```text
Worker 1: ──── part 1 ──── part 5 ──── part 9 ────
Worker 2: ──── part 2 ──── part 6 ──── part 10 ────
Worker 3: ──── part 3 ──── part 7 ──── part 11 ────
Worker 4: ──── part 4 ──── part 8 ──── part 12 ────
```

- State is saved (with a thread lock) after each part completes
- On first failure, all remaining futures are cancelled
- The ETag from each S3 response is stored per part for the `complete` call (see the sketch below)
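A minimal sketch of this worker pool, reusing the constants above. `append_completed_part` is a hypothetical persistence helper, and the slicing assumes the whole file is already in memory:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading
import requests

def upload_parts(content: bytes, urls: dict[str, str], file_hash: str) -> list[dict]:
    """PUT each chunk to its presigned URL; return {partNumber, etag} records."""
    etags: list[dict] = []
    lock = threading.Lock()

    def put_part(part_number: int, url: str) -> dict:
        start = (part_number - 1) * CHUNK_SIZE
        chunk = content[start:start + CHUNK_SIZE]
        resp = requests.put(url, data=chunk, timeout=PART_PUT_TIMEOUT)
        resp.raise_for_status()
        part = {"partNumber": part_number, "etag": resp.headers["ETag"]}
        with lock:                       # the state file is shared across workers
            etags.append(part)
            append_completed_part(file_hash, part)
        return part

    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(put_part, int(n), u) for n, u in urls.items()]
        for fut in as_completed(futures):
            if fut.exception():
                for f in futures:
                    f.cancel()           # stop scheduling the remaining parts
                raise fut.exception()
    return etags
```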
### BSS Endpoints (per type)
Each file type uses its own set of multipart endpoints. The protocol is identical — only the URL prefix differs.
| Type | Prefix |
|---|---|
| Video | `/videos/upload/multipart/` |
| Audio | `/audio/upload/multipart/` |
| Labels | `/labels/upload/multipart/` |
| Text Tracks | `/text-tracks/upload/multipart/` |
Each prefix supports: `POST .../init`, `POST .../parts`, `GET .../parts-done`, `POST .../complete`, and `POST .../abort`.
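Since the protocol is identical across types, the per-type routing reduces to a lookup table. A trivial sketch with illustrative names:

```python
# Lookup table sketch: the four prefixes mirror the table above
# (the trailing slash is applied when joining the action segment).
MULTIPART_PREFIXES = {
    "video": "/videos/upload/multipart",
    "audio": "/audio/upload/multipart",
    "labels": "/labels/upload/multipart",
    "text-tracks": "/text-tracks/upload/multipart",
}

def multipart_url(file_type: str, action: str) -> str:
    # action is one of: "init", "parts", "parts-done", "complete", "abort"
    return f"{MULTIPART_PREFIXES[file_type]}/{action}"
```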
### S3 Staging Pool
All types land in the same flat staging pool, content-addressed by hash:
```text
{owner}/{project}/staging/{hash}
```

Re-uploading the same file produces the same S3 key — the file is overwritten, not duplicated.
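The key is minted by BSS during `init`, so the CLI never constructs it itself. As an illustration of the content addressing, the server-side derivation presumably looks like this sketch:

```python
import hashlib

def staging_key(owner: str, project: str, content: bytes) -> str:
    # Same content -> same hash -> same key, so re-uploads overwrite in place.
    return f"{owner}/{project}/staging/{hashlib.sha256(content).hexdigest()[:16]}"
```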
### Lambda Trigger
After registration, DreamLake Server returns a `lambdaUrl` — a presigned BSS URL carrying an HMAC signature. The CLI POSTs to this URL without any auth header; BSS verifies the signature and dispatches the Lambda.
This avoids the CLI needing a BSS auth token. See the Presigned URL Auth design doc for details.
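A minimal sketch of the trigger call; the 202 status is from the flow above, and the timeout value is an assumption:

```python
import requests

def trigger_lambda(lambda_url: str) -> None:
    # No Authorization header: the HMAC signature embedded in the URL
    # is the credential, and BSS verifies it server-side.
    resp = requests.post(lambda_url, timeout=30)
    resp.raise_for_status()   # expect 202: the Lambda runs asynchronously
```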