Part of the Agentic Family · Zero Dependencies · Browser Native

Agentic Sense

Give your AI agent eyes. 478 face landmarks, 52 blendshapes, hand tracking, body pose, object detection — all running locally in the browser via MediaPipe. Single file, zero dependencies.

Live Demo · GitHub →
Core Output

Raw perception, not opinions

Every frame, agentic-sense outputs structured data about every face, hand, and body in view. Your agent decides what it means.

```json
{
  "faceCount": 1,
  "faces": [{
    "head": { "yaw": -0.023, "pitch": 0.041, "facing": true },
    "eyes": { "avgEAR": 0.312, "ipd": 0.089, "iris": { ... } },
    "blendshapes": { "jawOpen": 0.03, "smileL": 0.12, ... 38 values },
    "interpretation": {
      "expression": "smiling",
      "focus": { "score": 82, "level": "high" },
      "gaze": { "region": "center", "looking": true },
      "blinkRate": 14,
      "distance": 0.8
    }
  }],
  "hands": [{ "gesture": "Open_Palm", "fingers": { ... } }],
  "body": { "joints": { ... }, "shoulderWidth": 0.34 },
  "segmentation": { "personRatio": 0.42 },
  "objects": [{ "label": "laptop", "confidence": 0.91 }]
}
```
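Because the library emits raw perception rather than conclusions, mapping a frame to an agent-level signal is your code. A minimal sketch, assuming the frame shape shown above; `classifyEngagement` is a hypothetical helper, not part of the library:

```javascript
// Hypothetical helper: collapse a SenseFrame (shape as in the sample
// output above) into a coarse engagement label an agent can act on.
function classifyEngagement(frame) {
  if (!frame || frame.faceCount === 0) return 'absent'
  const { focus, gaze, expression } = frame.faces[0].interpretation
  if (!gaze.looking) return 'distracted'
  if (focus.score >= 70) {
    return expression === 'smiling' ? 'engaged-positive' : 'engaged'
  }
  return 'passive'
}

// Mock frame mirroring the sample output.
const frame = {
  faceCount: 1,
  faces: [{ interpretation: {
    expression: 'smiling',
    focus: { score: 82, level: 'high' },
    gaze: { region: 'center', looking: true },
  } }],
}
console.log(classifyEngagement(frame)) // → 'engaged-positive'
```

The thresholds here are arbitrary; the point is that interpretation beyond what `detect()` provides lives in your agent, not the sensor.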
👁
478 Landmarks
Full face mesh with iris tracking. Head pose, eye corners, lip contour — sub-pixel precision at 30fps.
🎭
52 Blendshapes
Every facial muscle as a 0–1 weight. jawOpen, smileL, browDown — the raw signal for expression analysis.
Hand Tracking
21 landmarks per hand, gesture recognition (8 built-in), per-finger extension state. Up to 2 hands.
🦴
Body Pose
33 skeletal landmarks. Shoulder width, torso length, joint positions with visibility scores.
🔒
Fully Local
Zero network requests after model load. MediaPipe WASM in-browser. Camera feed never leaves the device.
📦
Single File
~480 lines of JS. No npm, no build step. One import and you're sensing.
Try It

See it in action

Click to activate your camera. All processing happens locally.

Architecture

Library returns data, you draw

AgenticSense wraps MediaPipe into a single class. detect() gives you structured data. Overlay, dashboard, synthesis — all optional, in the demo folder.

AgenticSense
agentic-sense.js (~480 lines)
init({ face, hands, pose, segment, objects })
detect() → SenseFrame
rawResults → MediaPipe objects
↑ the library
Interpretation
expression classifier · focus scorer · gaze estimator · blink detector · head pose EMA
↑ built into detect()
Demo Only
overlay.js · dashboard.js · synthesis engine · camera switcher
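Since the library returns normalized data and leaves rendering to you, a demo-style overlay reduces to a coordinate mapping plus a canvas loop. A sketch under the assumption (standard for MediaPipe) that landmark coordinates are normalized to [0, 1]; the `landmarks` field name on a face is assumed, not taken from the sample output:

```javascript
// Map a normalized MediaPipe-style landmark ({x, y} in [0, 1])
// to canvas pixel coordinates.
function toPixels(landmark, width, height) {
  return { x: landmark.x * width, y: landmark.y * height }
}

// Sketch of an overlay pass: dot every landmark on a 2D canvas.
// (Hypothetical `face.landmarks` array; browser-only, needs a canvas ctx.)
function drawFace(ctx, face, width, height) {
  ctx.fillStyle = '#0f0'
  for (const lm of face.landmarks) {
    const { x, y } = toPixels(lm, width, height)
    ctx.fillRect(x - 1, y - 1, 2, 2)
  }
}

console.log(toPixels({ x: 0.5, y: 0.25 }, 640, 480)) // → { x: 320, y: 120 }
```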
Quick Start

Three steps to perceive

```javascript
// 1. Import
import { AgenticSense } from 'agentic-sense'

// 2. Init
const video = document.getElementById('cam')
const sense = new AgenticSense(video)
await sense.init({ wasmPath: './mediapipe/', face: true, hands: true })

// 3. Sense
function loop() {
  const frame = sense.detect()
  if (frame?.faceCount > 0) {
    console.log(frame.faces[0].interpretation.expression)  // 'smiling'
    console.log(frame.faces[0].interpretation.focus.score) // 82
    console.log(frame.faces[0].blendshapes.jawOpen)        // 0.031
  }
  if (frame?.hands.length > 0) {
    console.log(frame.hands[0].gesture) // 'Open_Palm'
  }
  requestAnimationFrame(loop)
}
loop()
```
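The interpretation layer mentions a head pose EMA. For readers unfamiliar with the technique, here is a minimal sketch of exponential-moving-average smoothing on a per-frame scalar; the smoothing factor and helper are illustrative, not the library's internals:

```javascript
// Exponential moving average: each new sample is blended with the
// running value, damping per-frame jitter in noisy signals like yaw.
function makeEMA(alpha = 0.3) {
  let value = null
  return (sample) => {
    value = value === null ? sample : alpha * sample + (1 - alpha) * value
    return value
  }
}

const smoothYaw = makeEMA(0.5)
smoothYaw(0.0)               // first sample passes through
console.log(smoothYaw(0.4))  // → 0.2 (halfway toward the new sample)
```

A higher `alpha` tracks fast head motion more closely; a lower one gives a steadier but laggier pose estimate.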
Ecosystem

Part of the agentic family

🧠
agentic-core
LLM + vision. The brain.
👁
agentic-sense
Perception engine. The eyes.
agentic-act
Intent → action. The will.
🎨
agentic-render
Dynamic UI generation. The expression.
🗣️
agentic-voice
TTS + STT. The voice.
💭
agentic-memory
Context + retrieval. The memory.
📦
agentic-store
SQLite persistence. Long-term storage.
🦀
agentic-claw
Runtime + skills. The body.