Views
DetectionBoxView
Overlay per-frame bounding-box detections on top of a video or image.
Compatible dtypes: detection_2d
Purpose
Spatial companion to ActionLabelView. While ActionLabelView shows when something is happening on a temporal ribbon, DetectionBoxView shows where it's happening on a spatial overlay. The two are designed to compose with VideoPlayer inside a position: relative wrapper.
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| `src` | `string` | — | m3u8 playlist URL |
| `clock` | `TimelineClock \| null` | context | Falls back to `<ClockProvider>` |
| `className` | `string` | — | Extra classes on the overlay container |
| `fps` | `number` | `10` | Refresh rate for the active-set check |
| `hideLabels` | `boolean` | `false` | Render just the rectangles, no label pill |
The view is `absolute inset-0 pointer-events-none` by default — the parent must be `position: relative` (or similar) for the overlay to fill it.
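The `fps` prop caps how often the active set is recomputed. A throttle of that shape can be sketched as follows (`shouldRefresh` is an illustrative helper, not the view's actual internals):

```typescript
// Illustrative sketch of an fps-capped refresh check: only recompute the
// active box set once at least one frame interval has elapsed.
function shouldRefresh(lastMs: number, nowMs: number, fps: number): boolean {
  return nowMs - lastMs >= 1000 / fps;
}
```

At the default `fps={10}`, the check runs at most every 100 ms regardless of how fast the clock ticks.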
Data Schema
```ts
import type { DetectionEvent } from '@vuer-ai/vuer-m3u';

type DetectionEvent = {
  ts: number;                              // event start, seconds
  te: number;                              // event end, seconds
  label: string;                           // class name
  bbox: [number, number, number, number];  // [x, y, w, h] normalized 0..1
  confidence?: number;                     // 0..1 shown next to label
  id?: number | string;                    // stable tracked-object id (color key)
  [extra: string]: unknown;
};
```

Bbox coordinates are normalized to the overlay's width / height, so the view renders correctly no matter the underlying video resolution. Use `id` to keep a color stable across frames for a single tracked object; without `id` the color is hashed from `label`.
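A stable label-to-color fallback can be sketched as a string hash mapped onto a hue (this is an illustrative sketch, not the library's actual hash function):

```typescript
// Illustrative sketch: derive a stable hue from a label string so the same
// class name always renders in the same color. Not the library's real hash.
function labelHue(label: string): number {
  let h = 0;
  for (let i = 0; i < label.length; i++) {
    h = (h * 31 + label.charCodeAt(i)) >>> 0; // unsigned 32-bit rolling hash
  }
  return h % 360; // hue on the HSL color wheel
}

const boxColor = (label: string) => `hsl(${labelHue(label)}, 80%, 55%)`;
```

Because the hue depends only on the string, every "cube" box gets the same color in every frame, even without a tracked `id`.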
For dense per-frame detections, emit one event per frame with a short `te - ts` (e.g. the frame duration). The library's segment cache keeps the current chunk in memory.
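Converting one frame's raw boxes into events of that shape can be sketched like this (`frameToEvents` and its input shape are hypothetical helpers, not part of the library; the `DetectionEvent` type is repeated locally so the sketch is self-contained):

```typescript
// Local copy of the relevant DetectionEvent fields, for self-containment.
type DetectionEvent = {
  ts: number;
  te: number;
  label: string;
  bbox: [number, number, number, number];
  confidence?: number;
  id?: number | string;
};

// Hypothetical helper: wrap one frame's boxes as events whose te - ts
// equals the frame duration (e.g. 10 fps => 0.1 s per event).
function frameToEvents(
  frameIndex: number,
  fps: number,
  boxes: { label: string; bbox: [number, number, number, number]; id?: string }[],
): DetectionEvent[] {
  const ts = frameIndex / fps;
  const te = (frameIndex + 1) / fps; // exactly one frame long
  return boxes.map((b) => ({ ts, te, ...b }));
}
```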
Example
One per-frame sample for each of four tracked objects:
```json
{"ts": 0.00, "te": 0.25, "label": "cube",    "id": "cube_A",    "bbox": [0.08, 0.55, 0.08, 0.08], "confidence": 0.87}
{"ts": 0.50, "te": 0.75, "label": "gripper", "id": "gripper_1", "bbox": [0.49, 0.45, 0.07, 0.07], "confidence": 0.95}
{"ts": 2.00, "te": 2.25, "label": "person",  "id": "op_alice",  "bbox": [0.78, 0.18, 0.14, 0.55], "confidence": 0.92}
{"ts": 7.00, "te": 7.25, "label": "shelf",   "id": "shelf_B",   "bbox": [0.60, 0.60, 0.18, 0.15], "confidence": 0.88}
```

Usage
```tsx
import {
  useTimeline,
  ClockProvider,
  TimelineController,
  VideoPlayer,
  DetectionBoxView,
} from '@vuer-ai/vuer-m3u'

const VIDEO_URL = '/vuer-m3u-demo/video/playlist.m3u8'
const DETECTIONS_URL = '/vuer-m3u-demo/detections/playlist.m3u8'

export function DetectionBoxViewDemo() {
  const { clock, state, play, pause, seek, setPlaybackRate, setLoop } = useTimeline()

  return (
    <ClockProvider clock={clock}>
      {/* The wrapper must be position: relative so the overlay fills the video */}
      <div className="relative aspect-video bg-black">
        <VideoPlayer src={VIDEO_URL} className="w-full h-full" />
        <DetectionBoxView src={DETECTIONS_URL} />
      </div>
      <TimelineController
        state={state}
        onPlay={play}
        onPause={pause}
        onSeek={seek}
        onSpeedChange={setPlaybackRate}
        onLoopChange={setLoop}
      />
    </ClockProvider>
  )
}
```

Under the Hood
`useSegment<DetectionEvent[]>(engine)` returns the current chunk's detections; the view filters to events overlapping the current clock time and renders one `<div>` per active box with absolute-percentage positioning. No 3D, no Canvas — just HTML + CSS, so it composes cleanly on top of any element (including a `<video>` or `<img>`).
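The filtering and positioning described above can be sketched roughly like this (helper names are illustrative, not the view's actual internals; the type is repeated locally so the sketch stands alone):

```typescript
type DetectionEvent = {
  ts: number;
  te: number;
  label: string;
  bbox: [number, number, number, number]; // [x, y, w, h], normalized 0..1
};

// Events whose [ts, te) interval overlaps the current clock time t.
function activeBoxes(events: DetectionEvent[], t: number): DetectionEvent[] {
  return events.filter((e) => e.ts <= t && t < e.te);
}

// Normalized bbox -> absolute-percentage CSS, one style object per box <div>.
function bboxStyle([x, y, w, h]: [number, number, number, number]) {
  return {
    position: "absolute" as const,
    left: `${x * 100}%`,
    top: `${y * 100}%`,
    width: `${w * 100}%`,
    height: `${h * 100}%`,
  };
}
```

Percentage units are what make the overlay resolution-independent: the same style works whether the wrapper is 320 px or 4K wide.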