DB Schema
Embedding
Vector embeddings for semantic search across track content. Uses MongoDB Atlas Vector Search.
| Field | Type | Description |
|---|---|---|
| id | ObjectId | Primary key |
| videoId | String | → Video.id |
| trackHash | String | → chunks/{hash}.jsonl in S3 |
| lineNum | Int | Line number within the chunk file |
| text | String | Cached text for display (avoids re-fetching S3) |
| vector | Float[] | Embedding vector (MongoDB Atlas Vector Search) |
| startMs | Int? | Start time in milliseconds |
| endMs | Int? | End time in milliseconds |
| createdAt | DateTime | Created timestamp |
Unique: [videoId, trackHash, lineNum]
Indexes: videoId
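
As a rough illustration of the schema above, the record could be modeled in TypeScript as follows. This is a sketch, not the project's actual model: the `EmbeddingDoc` interface and the `chunkKey` and `uniqueKey` helpers are hypothetical names, and the `:` separator in the composite key is an arbitrary choice.

```typescript
// Hypothetical TypeScript shape for one Embedding record, mirroring the field table.
interface EmbeddingDoc {
  id: string;             // ObjectId, represented here as its hex string
  videoId: string;        // references Video.id
  trackHash: string;      // identifies chunks/{hash}.jsonl in S3
  lineNum: number;        // line number within the chunk file
  text: string;           // cached text for display
  vector: number[];       // embedding vector
  startMs: number | null; // optional start time in milliseconds
  endMs: number | null;   // optional end time in milliseconds
  createdAt: Date;
}

// S3 object key of the chunk file a record points at (pattern taken from the table).
function chunkKey(trackHash: string): string {
  return `chunks/${trackHash}.jsonl`;
}

// Composite key mirroring the unique constraint [videoId, trackHash, lineNum].
// The ":" separator is an assumption made for this example only.
function uniqueKey(
  doc: Pick<EmbeddingDoc, "videoId" | "trackHash" | "lineNum">
): string {
  return [doc.videoId, doc.trackHash, doc.lineNum].join(":");
}
```

A helper like `uniqueKey` can be handy for de-duplicating rows in memory before a bulk insert, since the database rejects duplicates on the same three fields.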
How it works
- Track content (JSONL chunks) is stored in S3
- Each line is embedded via a model (text, audio, frame)
- The vector + a cached copy of the text are stored here
- trackHash + lineNum uniquely identify the source line in S3
- Queries use MongoDB Atlas Vector Search for nearest-neighbor lookup
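
The nearest-neighbor lookup could be expressed as an aggregation pipeline built around the `$vectorSearch` stage. The sketch below only constructs the pipeline and makes several assumptions: the index name `embedding_vector_index` is hypothetical, the `numCandidates` heuristic (10x the limit, capped at 10000) is an arbitrary choice, and the optional `videoId` pre-filter only works if `videoId` is declared as a filter field in the vector index definition.

```typescript
// Options for the hypothetical pipeline builder below.
interface VectorSearchOptions {
  queryVector: number[]; // embedding of the search query
  limit: number;         // number of nearest neighbors to return
  videoId?: string;      // optional pre-filter (assumes videoId is a filter field in the index)
  indexName?: string;    // assumed index name; replace with the real one
}

// Builds an aggregation pipeline for Atlas Vector Search over the "vector" field.
function buildVectorSearchPipeline(
  opts: VectorSearchOptions
): Record<string, unknown>[] {
  const { queryVector, limit, videoId, indexName = "embedding_vector_index" } = opts;

  const vectorSearch: Record<string, unknown> = {
    index: indexName,
    path: "vector",
    queryVector,
    // Consider more candidates than results for better recall (heuristic, capped at 10000).
    numCandidates: Math.min(limit * 10, 10000),
    limit,
  };
  if (videoId !== undefined) {
    vectorSearch["filter"] = { videoId: { $eq: videoId } };
  }

  return [
    { $vectorSearch: vectorSearch },
    {
      // Return the cached text and source pointers, plus the similarity score.
      $project: {
        _id: 0,
        videoId: 1,
        trackHash: 1,
        lineNum: 1,
        text: 1,
        startMs: 1,
        endMs: 1,
        score: { $meta: "vectorSearchScore" },
      },
    },
  ];
}
```

The resulting array can be passed to a collection's `aggregate` call in the MongoDB Node.js driver. Because the cached text is projected directly, displaying results does not require a round trip to S3.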