DreamLake

Semantic Search

Search video content with natural language. Videos are split into 2s HLS chunks, embedded with CLIP, and indexed in Qdrant.

Pipeline

Upload video → Lambda splits into 2s HLS chunks
    → dreamlake vectorize (CLIP + LLaVA per chunk)
    → Qdrant index
    → Query: text → CLIP embed → nearest neighbor
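
The final query step can be illustrated with a minimal sketch. It assumes cosine similarity as the distance metric (the source does not say which metric the Qdrant collection uses) and replaces both CLIP and Qdrant with toy 3-dimensional vectors and a brute-force scan, so it only shows the ranking idea, not the real implementation. The chunk IDs and vector values are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest(query_vec, index, limit=10):
    """Return up to `limit` (chunk_id, score) pairs, best match first."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Toy 3-d vectors standing in for 768-d CLIP embeddings (made-up values).
index = {
    "chunk-000": [0.9, 0.1, 0.0],
    "chunk-001": [0.0, 1.0, 0.0],
    "chunk-002": [0.7, 0.7, 0.1],
}

# Stand-in for the CLIP text embedding of the user's query.
query = [1.0, 0.0, 0.0]

results = nearest(query, index, limit=2)
for chunk_id, score in results:
    print(f"{chunk_id}  {score:.3f}")
```

A vector database such as Qdrant replaces the linear scan with an approximate nearest-neighbor index, which is what keeps query time low as the number of chunks grows.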

Vectorize

dreamlake vectorize --episode robotics@alice:run-042
dreamlake vectorize --bindr "front-camera" --project robotics@alice
dreamlake vectorize --dataset "training-v1" --project robotics@alice

Add --zaku-url http://host:9000 to distribute processing through a Zaku task queue.
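
When vectorization is launched from a script rather than typed by hand, the command line can be assembled programmatically. The sketch below is only an illustration: `vectorize_argv` is a hypothetical helper, not part of DreamLake, and it uses only the flags shown above. It assumes `--zaku-url` can be combined with any of the three target flags.

```python
import shlex

def vectorize_argv(target_flag, target, project=None, zaku_url=None):
    """Build the argument list for one `dreamlake vectorize` call.

    Hypothetical helper: it only knows the flags listed in this document.
    """
    if target_flag not in ("--episode", "--bindr", "--dataset"):
        raise ValueError(f"unsupported target flag: {target_flag}")
    argv = ["dreamlake", "vectorize", target_flag, target]
    if project is not None:
        argv += ["--project", project]
    if zaku_url is not None:
        argv += ["--zaku-url", zaku_url]
    return argv

argv = vectorize_argv(
    "--bindr", "front-camera",
    project="robotics@alice",
    zaku_url="http://host:9000",
)

# Print the command in copy-pasteable shell form.
print(shlex.join(argv))
```

The list form can be handed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting problems when names contain spaces.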

Each chunk produces two vectors: a CLIP ViT-L/14 image embedding (768 dimensions) and a CLIP text embedding of a caption generated by LLaVA 13B.

Search

GET /namespaces/:ns/projects/:project/semantic-search?q=robot+picking+up+cup

| Param | Description |
|---|---|
| q | Natural language query |
| episode | Scope to episode |
| bindr | Scope to bindr |
| limit | Max results (default 10) |
| using | Vector to match against: image (default) or caption |

Returns matched 2s clips with playback URLs and scores.
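
As a rough illustration, the request URL can be built with the Python standard library. The base URL `http://localhost:8000` is a placeholder, `semantic_search_url` is a hypothetical helper, and mapping `robotics@alice` to project `robotics` in namespace `alice` is an assumption. The response schema is not documented here, so the sketch stops at constructing the URL.

```python
from urllib.parse import quote, urlencode

def semantic_search_url(base_url, ns, project, q, **params):
    """Build a semantic-search URL; optional params set to None are skipped."""
    path = (
        f"/namespaces/{quote(ns, safe='')}"
        f"/projects/{quote(project, safe='')}/semantic-search"
    )
    query = {"q": q}
    query.update({k: v for k, v in params.items() if v is not None})
    # urlencode encodes spaces as '+', matching the example request above.
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"

url = semantic_search_url(
    "http://localhost:8000",  # placeholder server address (assumption)
    ns="alice",
    project="robotics",
    q="robot picking up cup",
    limit=5,
    using="caption",
)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.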

Performance

| Metric | Value |
|---|---|
| Vectorize per chunk | ~14.5s (CLIP + LLaVA) |
| Search query | ~50ms |
| Storage per chunk | ~6 KB |
| 1 hour video | 1,800 points, ~11 MB in Qdrant |
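
The one-hour figures follow directly from the per-chunk numbers, as the short calculation below shows. The vectorization estimate assumes chunks are processed one at a time on a single worker, which the source does not state. The ~6 KB per chunk is consistent with two 768-dimensional float32 vectors (2 × 768 × 4 = 6,144 bytes), though that breakdown is also an assumption.

```python
import math

CHUNK_SECONDS = 2                   # chunk length from the pipeline
VECTORIZE_SECONDS_PER_CHUNK = 14.5  # approximate, from the table above
STORAGE_KB_PER_CHUNK = 6            # approximate, from the table above

def estimate(video_seconds):
    """Estimate chunk count, index size, and sequential vectorize time."""
    chunks = math.ceil(video_seconds / CHUNK_SECONDS)
    return {
        "chunks": chunks,
        "storage_mb": chunks * STORAGE_KB_PER_CHUNK / 1000,
        # Assumes one chunk at a time on a single worker.
        "vectorize_hours": chunks * VECTORIZE_SECONDS_PER_CHUNK / 3600,
    }

one_hour = estimate(3600)
print(one_hour)
```

At roughly 7 hours of sequential vectorization per hour of video under these assumptions, spreading the work across workers with --zaku-url is likely worthwhile for anything beyond short clips.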