A local-first application for semantic video search and discovery. Index videos via YouTube URL or local upload and perform natural language queries to find, play, and download specific moments.
- Semantic Search: Query video content using natural language (e.g., "Where is the battery replaced?").
- Local Vector Storage: Uses ChromaDB for local indexing and storage.
- Hardware Acceleration: Clips are transcoded with FFmpeg's h264_videotoolbox encoder (Apple VideoToolbox, available on macOS), which keeps clipping fast.
- Progress Tracking: Real-time feedback during the multi-stage indexing process.
- Segment Downloads: Export any discovered video segment directly to your disk.
- Project-Based Library: Each ingestion creates its own project index so previously indexed videos are preserved.
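
As a rough illustration of the hardware-accelerated clipping mentioned above, the following sketch builds an FFmpeg command line for one segment. The function name `build_clip_command` and the exact flag layout are assumptions for illustration; the project's actual invocation may differ. Only the encoder name h264_videotoolbox comes from this README.

```python
import shlex


def build_clip_command(source, start_time, end_time, output,
                       encoder="h264_videotoolbox"):
    """Return an FFmpeg argv list that re-encodes one time range of `source`.

    Hypothetical helper: times are in seconds; the encoder default matches
    the one named in this README and is only available on macOS builds.
    """
    if end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")
    duration = end_time - start_time
    return [
        "ffmpeg",
        "-y",                          # overwrite the output file if it exists
        "-ss", f"{start_time:.3f}",    # seek on the input before decoding
        "-i", source,
        "-t", f"{duration:.3f}",       # length of the clip in seconds
        "-c:v", encoder,               # video encoder (hardware on macOS)
        "-c:a", "aac",                 # re-encode audio to AAC
        output,
    ]


if __name__ == "__main__":
    cmd = build_clip_command("video.mp4", 12.5, 20.0, "clip.mp4")
    print(shlex.join(cmd))
```

On machines without VideoToolbox, a software encoder such as libx264 could be passed as `encoder` instead.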
- Ingest a video from YouTube (/process) or local upload (/upload).
- Backend creates a new project_id and analyzes the video into semantic segments.
- Segment descriptions are embedded and indexed in a project-scoped ChromaDB collection.
- Queries (/query) search only within the selected project using semantic similarity.
- Matching time ranges can be clipped and downloaded via /clip.
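
The project-scoped search described in the steps above can be sketched in plain Python. This is a simplified stand-in, not the project's implementation: an in-memory dictionary replaces ChromaDB, hand-written vectors replace sentence-transformers embeddings, and the `ProjectIndex` class and its method names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ProjectIndex:
    """Toy index that keeps one list of segments per project_id."""

    def __init__(self):
        self._projects = {}

    def add_segment(self, project_id, start_time, end_time, description, embedding):
        # Each project gets its own list, so projects never mix.
        self._projects.setdefault(project_id, []).append({
            "start_time": start_time,
            "end_time": end_time,
            "description": description,
            "embedding": embedding,
        })

    def query(self, project_id, query_embedding, top_k=3):
        # Only segments of the selected project are ranked.
        segments = self._projects.get(project_id, [])
        ranked = sorted(
            segments,
            key=lambda s: cosine_similarity(query_embedding, s["embedding"]),
            reverse=True,
        )
        return ranked[:top_k]


if __name__ == "__main__":
    index = ProjectIndex()
    index.add_segment("proj-a", 0, 10, "Host introduces the device", [0.0, 1.0])
    index.add_segment("proj-a", 10, 25, "The battery is replaced", [1.0, 0.0])
    index.add_segment("proj-b", 0, 5, "Unrelated battery review", [1.0, 0.0])
    best = index.query("proj-a", [0.9, 0.1], top_k=1)[0]
    print(best["description"], best["start_time"], best["end_time"])
```

The returned start_time and end_time are the values a client would then pass to /clip.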
- Python 3.10+
- FFmpeg
- uv (Recommended for dependency management)
- Node.js & npm
- Google AI API Key (set in backend/.env as GOOGLE_API_KEY)
The project includes a Makefile for simplified orchestration.
Running make setup configures the Python environment and installs frontend dependencies:

    make setup

Create a .env file in the backend/ directory:

    echo "GOOGLE_API_KEY=your_key_here" > backend/.env

Running make dev launches both the backend and frontend in a single terminal:

    make dev

Press Ctrl+C to stop both servers.
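
Before starting the servers, it can help to confirm that the key in backend/.env is no longer the placeholder. The sketch below is a standard-library-only check; the helper names `parse_env` and `require_api_key` are hypothetical, and how the backend itself loads the file is not specified here.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(values):
    """Return GOOGLE_API_KEY or raise if it is missing or a placeholder."""
    key = values.get("GOOGLE_API_KEY", "")
    if not key or key == "your_key_here":
        raise RuntimeError("GOOGLE_API_KEY is missing or still the placeholder")
    return key


if __name__ == "__main__":
    # Path taken from the setup step above.
    with open("backend/.env", encoding="utf-8") as handle:
        require_api_key(parse_env(handle.read()))
    print("GOOGLE_API_KEY is set")
```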
- POST /process: Index a YouTube video via URL.
- POST /upload: Index a local video file.
- POST /query: Search within a project index (requires project_id and query).
- POST /clip: Generate a segment from a project video (requires project_id, start_time, end_time).
- GET /projects: List indexed projects.
- GET /projects/{project_id}: Get project metadata.
- DELETE /projects/{project_id}: Delete a project and its indexed data.
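
A minimal client sketch for the /query and /clip endpoints follows, using only the standard library. Several details are assumptions rather than documented behavior: the base URL http://localhost:5000, JSON request bodies, and the helper names. The field names project_id, query, start_time, and end_time come from the endpoint list above; response formats are not specified, so `send` returns raw bytes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: adjust to your backend address


def build_json_post(path, payload, base_url=BASE_URL):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_request(project_id, query):
    """Request for POST /query with the two required fields."""
    return build_json_post("/query", {"project_id": project_id, "query": query})


def clip_request(project_id, start_time, end_time):
    """Request for POST /clip with the three required fields."""
    return build_json_post("/clip", {
        "project_id": project_id,
        "start_time": start_time,
        "end_time": end_time,
    })


def send(request):
    """Send a prepared request and return the raw response bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()


if __name__ == "__main__":
    request = query_request("my-project", "Where is the battery replaced?")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect before the backend is running.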
- AI: Gemini (Analysis), sentence-transformers (Local Embeddings).
- Database: ChromaDB.
- Media: FFmpeg, yt-dlp.
- Frontend: React, Tailwind CSS.
- Backend: Flask.
MIT
