Views
DetectionBoxView
Overlay per-frame bounding-box detections on top of a video or image.
Compatible dtypes: detection_2d
Purpose
Spatial companion to ActionLabelView. While ActionLabelView shows when something is happening on a temporal ribbon, DetectionBoxView shows where it's happening on a spatial overlay. The two are designed to compose with VideoPlayer inside a position: relative wrapper.
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| `src` | `string` | — | m3u8 playlist URL |
| `clock` | `TimelineClock \| null` | context | Falls back to `<ClockProvider>` |
| `className` | `string` | — | Extra classes on the overlay container |
| `fps` | `number` | `10` | Refresh rate for the active-set check |
| `hideLabels` | `boolean` | `false` | Render just the rectangles, no label pill |
The view is `absolute inset-0 pointer-events-none` by default — the parent must be `position: relative` (or similar) for the overlay to fill it.
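The `fps` prop caps how often the active set is recomputed. A throttle of that shape can be sketched as follows (`shouldRefresh` is an illustrative helper, not the view's actual internals):

```typescript
// Illustrative sketch of an fps-capped refresh check: only recompute the
// active box set once at least one frame interval has elapsed.
function shouldRefresh(lastMs: number, nowMs: number, fps: number): boolean {
  return nowMs - lastMs >= 1000 / fps;
}
```

At the default `fps={10}`, the check runs at most every 100 ms regardless of how fast the clock ticks.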
Data Schema
```ts
import type { DetectionEvent } from '@vuer-ai/vuer-m3u';

type DetectionEvent = {
  ts: number;                              // event start, seconds
  te: number;                              // event end, seconds
  label: string;                           // class name
  bbox: [number, number, number, number];  // [x, y, w, h] normalized 0..1
  confidence?: number;                     // 0..1 shown next to label
  id?: number | string;                    // stable tracked-object id (color key)
  [extra: string]: unknown;
};
```

Bbox coordinates are normalized to the overlay's width / height, so the view renders correctly no matter the underlying video resolution. Use `id` to keep a color stable across frames for a single tracked object; without `id` the color is hashed from `label`.
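A stable label-to-color fallback can be sketched as a string hash mapped onto a hue (this is an illustrative sketch, not the library's actual hash function):

```typescript
// Illustrative sketch: derive a stable hue from a label string so the same
// class name always renders in the same color. Not the library's real hash.
function labelHue(label: string): number {
  let h = 0;
  for (let i = 0; i < label.length; i++) {
    h = (h * 31 + label.charCodeAt(i)) >>> 0; // unsigned 32-bit rolling hash
  }
  return h % 360; // hue on the HSL color wheel
}

const boxColor = (label: string) => `hsl(${labelHue(label)}, 80%, 55%)`;
```

Because the hue depends only on the string, every "cube" box gets the same color in every frame, even without a tracked `id`.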
For dense per-frame detections, emit one event per frame with a short `te - ts` (e.g. the frame duration). The library's segment cache keeps the current chunk in memory.
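Converting one frame's raw boxes into events of that shape can be sketched like this (`frameToEvents` and its input shape are hypothetical helpers, not part of the library; the `DetectionEvent` type is repeated locally so the sketch is self-contained):

```typescript
// Local copy of the relevant DetectionEvent fields, for self-containment.
type DetectionEvent = {
  ts: number;
  te: number;
  label: string;
  bbox: [number, number, number, number];
  confidence?: number;
  id?: number | string;
};

// Hypothetical helper: wrap one frame's boxes as events whose te - ts
// equals the frame duration (e.g. 10 fps => 0.1 s per event).
function frameToEvents(
  frameIndex: number,
  fps: number,
  boxes: { label: string; bbox: [number, number, number, number]; id?: string }[],
): DetectionEvent[] {
  const ts = frameIndex / fps;
  const te = (frameIndex + 1) / fps; // exactly one frame long
  return boxes.map((b) => ({ ts, te, ...b }));
}
```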
Example
One per-frame sample for each of four tracked objects:
```json
{"ts": 0.00, "te": 0.25, "label": "cube",    "id": "cube_A",    "bbox": [0.08, 0.55, 0.08, 0.08], "confidence": 0.87}
{"ts": 0.50, "te": 0.75, "label": "gripper", "id": "gripper_1", "bbox": [0.49, 0.45, 0.07, 0.07], "confidence": 0.95}
{"ts": 2.00, "te": 2.25, "label": "person",  "id": "op_alice",  "bbox": [0.78, 0.18, 0.14, 0.55], "confidence": 0.92}
{"ts": 7.00, "te": 7.25, "label": "shelf",   "id": "shelf_B",   "bbox": [0.60, 0.60, 0.18, 0.15], "confidence": 0.88}
```

Usage
```tsx
import {
  useTimeline,
  ClockProvider,
  TimelineController,
  VideoPlayer,
  DetectionBoxView,
} from '@vuer-ai/vuer-m3u'

const VIDEO_URL = '/vuer-m3u-demo/video/playlist.m3u8'
const DETECTIONS_URL = '/vuer-m3u-demo/detections/playlist.m3u8'

export function DetectionBoxViewDemo() {
  const { clock, state, play, pause, seek, setPlaybackRate, setLoop } = useTimeline()

  return (
    <ClockProvider clock={clock}>
      {/* The wrapper must be position: relative so the overlay fills the video */}
      <div className="relative aspect-video bg-black">
        <VideoPlayer src={VIDEO_URL} className="w-full h-full" />
        <DetectionBoxView src={DETECTIONS_URL} />
      </div>
      <TimelineController
        state={state}
        onPlay={play}
        onPause={pause}
        onSeek={seek}
        onSpeedChange={setPlaybackRate}
        onLoopChange={setLoop}
      />
    </ClockProvider>
  )
}
```

Under the Hood
`useSegment<DetectionEvent[]>(engine)` returns the current chunk's detections; the view filters to events overlapping the current clock time and renders one `<div>` per active box with absolute-percentage positioning. No 3D, no Canvas — just HTML + CSS, so it composes cleanly on top of any element (including a `<video>` or `<img>`).
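The filtering and positioning described above can be sketched roughly like this (helper names are illustrative, not the view's actual internals; the type is repeated locally so the sketch stands alone):

```typescript
type DetectionEvent = {
  ts: number;
  te: number;
  label: string;
  bbox: [number, number, number, number]; // [x, y, w, h], normalized 0..1
};

// Events whose [ts, te) interval overlaps the current clock time t.
function activeBoxes(events: DetectionEvent[], t: number): DetectionEvent[] {
  return events.filter((e) => e.ts <= t && t < e.te);
}

// Normalized bbox -> absolute-percentage CSS, one style object per box <div>.
function bboxStyle([x, y, w, h]: [number, number, number, number]) {
  return {
    position: "absolute" as const,
    left: `${x * 100}%`,
    top: `${y * 100}%`,
    width: `${w * 100}%`,
    height: `${h * 100}%`,
  };
}
```

Percentage units are what make the overlay resolution-independent: the same style works whether the wrapper is 320 px or 4K wide.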