A personal semantic search engine for your documents and knowledge base
Features • Quick Start • Usage • Architecture • API Reference • Configuration • MCP Integration • Troubleshooting • Full Technical Writeup
Vector Knowledge Base is a vector database application that transforms your documents into a searchable knowledge base using semantic search. Upload PDFs, Word documents, PowerPoint, Excel, images (with OCR), and code files, then search using natural language to find exactly what you need.
- Semantic Search - Find documents by meaning, not just keywords
- Auto-Clustering - Automatically organize documents into semantic clusters using HDBSCAN (density-based clustering)
- Semantic Cluster Naming - Clusters are automatically named using TF-IDF keyword extraction (e.g., "Shakespeare & Drama", "Python & Programming")
- Cluster-Based Filtering - Filter search results by document clusters for more focused searches
- Batch Upload & Folder Preservation - Drag and drop entire folders to upload, automatically preserving folder structure in your knowledge base
- 3D Embedding Visualization - Interactive 3D visualization of your document embeddings using Three.js
- Multi-Format Support - PDF, DOCX, PPTX, XLSX, CSV, images (OCR), TXT, Markdown, and code files (Python, JavaScript, C#, etc.)
- Intelligent Chunking - AST-aware parsing for code, sentence-boundary awareness for prose
- Folder Organization - Drag-and-drop file management with custom folder hierarchy
- File Viewer - Double-click any file to preview it directly in the browser
- Multi-Page Navigation - Dedicated pages for search, documents, and file management
- Data Management - Export all data as ZIP or reset the entire database with one click
- Modern UI - Clean, responsive interface with dark mode and modular CSS architecture
- Vector Embeddings - Powered by SentenceTransformers (all-mpnet-base-v2, 768-dimensional embeddings)
- High-Performance Search - Qdrant vector database for sub-50ms search queries
- O(1) Document Listing - JSON-based document registry for instant document listing at any scale
- AI Agent Integration (MCP) - Connect Claude Desktop or other AI agents to search, create, and manage documents via Model Context Protocol
Clean, modern dark-mode interface with semantic search and filtering options
- Docker and Docker Compose (recommended)
- OR Python 3.11+ and Docker (for Performance Mode or Manual Installation)
The easiest way to run the entire application:
1. Clone the repository

   ```bash
   git clone https://github.com/i3T4AN/Vector-Knowledge-Base.git
   cd Vector-Knowledge-Base
   ```

2. Start all services with Docker Compose

   ```bash
   docker-compose up -d
   ```

3. Open your browser

   Navigate to `http://localhost:8001/index.html`
That's it! Docker Compose will automatically:
- Start Qdrant vector database
- Build and start the backend API
- Start the frontend server with Nginx
> [!TIP]
> On first run, the embedding model (~400MB) will be downloaded automatically. This may take a few minutes.
Managing the application:
```bash
# View logs
docker-compose logs -f

# Stop all services
docker-compose down

# Rebuild after code changes
docker-compose up -d --build
```

For significantly faster embedding generation, run the backend natively with GPU support:
| Mode | Embedding Speed | Best For |
|---|---|---|
| Docker (CPU) | ~18s per batch | Cross-platform compatibility |
| Native (Apple M1/M2/M3) | ~3s per batch (6x faster) | Mac with Apple Silicon |
| Native (NVIDIA CUDA) | ~1s per batch (18x faster) | Windows/Linux with NVIDIA GPU |
Setup:
1. Start Qdrant and Frontend in Docker

   ```bash
   docker-compose -f docker-compose.native.yml up -d
   # Or simply: docker-compose up -d qdrant frontend
   ```

2. Run the backend natively

   macOS/Linux:

   ```bash
   ./scripts/start-backend-native.sh
   ```

   Windows:

   ```bat
   scripts\start-backend-native.bat
   ```
The script will:
- Create a virtual environment
- Install dependencies
- Auto-detect your GPU (MPS for Apple Silicon, CUDA for NVIDIA)
- Start the backend with GPU acceleration
> [!NOTE]
> GPU acceleration requires PyTorch with MPS support (macOS 12.3+) or the CUDA toolkit (Windows/Linux with NVIDIA).
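The `DEVICE=auto` setting described in the configuration section follows the precedence MPS > CUDA > CPU. As pure logic, that selection looks like this (a dependency-free sketch: availability flags are passed in as booleans, whereas the real backend would query PyTorch):

```python
def pick_device(mps_available: bool, cuda_available: bool) -> str:
    """Return the best compute device, mirroring the documented
    auto-detect order: MPS > CUDA > CPU.

    The actual backend would call torch.backends.mps.is_available()
    and torch.cuda.is_available(); booleans are taken as parameters
    here so the sketch stays dependency-free.
    """
    if mps_available:
        return "mps"   # Apple Silicon GPU
    if cuda_available:
        return "cuda"  # NVIDIA GPU
    return "cpu"       # Fallback

print(pick_device(True, False))   # mps
print(pick_device(False, True))   # cuda
print(pick_device(False, False))  # cpu
```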
Deployment Options Summary:
| Mode | Command | GPU | Speed | Use Case |
|---|---|---|---|---|
| Full Docker | `docker-compose up -d` | ❌ | ~18s/batch | Production, cross-platform |
| Native (Mac/Linux) | `./scripts/start-backend-native.sh` | ✅ | ~1-3s/batch | Development, large uploads |
| Native (Windows) | `scripts\start-backend-native.bat` | ✅ | ~1-3s/batch | Development, large uploads |
For development or if you prefer not to use Docker for the backend:
1. Clone the repository

   ```bash
   git clone https://github.com/i3T4AN/Vector-Knowledge-Base.git
   cd Vector-Knowledge-Base
   ```

2. Start Qdrant with Docker

   ```bash
   docker run -d -p 6333:6333 -v ./qdrant_storage:/qdrant/storage:z qdrant/qdrant
   ```

3. Set up Python environment

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   python -m pip install -r requirements.txt
   ```

4. Start the backend server

   ```bash
   cd backend
   python -m uvicorn main:app --reload --port 8000 --host 0.0.0.0
   ```

5. Start the frontend server

   ```bash
   cd frontend
   python -m http.server 8001
   ```

   > [!NOTE]
   > On Mac, use `python3` instead of `python` if the command is not found.

6. Open your browser

   Navigate to `http://localhost:8001/index.html`
> [!TIP]
> On first run, the embedding model (~400MB) will be downloaded automatically. This may take a few minutes.
1. Navigate to the My Documents page (`documents.html`)
2. Drag and drop files or click to browse
   - Batch Upload: Drop entire folders to upload multiple files at once
   - Folder Preservation: Folder structure is automatically maintained in the "Files" tab
3. Add metadata (course name, document type, tags)
4. Click Upload
5. Monitor progress in the Queue card for batch uploads
The backend will:
- Extract text from your files
- Split content into intelligent chunks
- Generate vector embeddings
- Store in Qdrant for fast retrieval
- Organize files in folders matching your source structure
Upload interface with drag-and-drop support, batch queue, and document management
1. Navigate to the Search page (`index.html`)
2. Enter your query in natural language
3. Optionally filter by:
   - Cluster - Filter results by document cluster (requires clustering first)
   - Date range - Filter by upload date
   - Result limit - Number of results to display (5, 10, or 20)
4. Click Search to see ranked results with similarity scores
Semantic search results showing similarity scores and relevant text snippets
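Under the hood, results are ranked by a similarity score between the query embedding and each chunk embedding (typically cosine similarity for SentenceTransformers models; the exact metric is configured in Qdrant). A dependency-free sketch of that ranking, with toy 3-dimensional vectors standing in for the real 768-dimensional embeddings and illustrative function names:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_chunks(query_vec, chunks):
    """Return (chunk_id, score) pairs sorted by descending similarity.
    `chunks` maps chunk id -> embedding vector."""
    scored = [(cid, cosine_similarity(query_vec, vec))
              for cid, vec in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings; the real index stores one vector per chunk in Qdrant.
chunks = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.9, 0.1, 0.0],
    "chunk-c": [0.0, 1.0, 0.0],
}
print(rank_chunks([1.0, 0.0, 0.0], chunks)[0][0])  # chunk-a
```

In practice Qdrant performs this ranking with an approximate nearest-neighbor index, which is what keeps queries fast at scale.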
1. Navigate to the Search page (`index.html`)
2. Upload several documents first (clustering works best with 5+ documents)
3. Click Auto-Cluster Documents
4. The system will:
   - Automatically determine the optimal number of clusters using HDBSCAN
   - Group similar documents together using density-based clustering
   - Generate semantic names for each cluster (e.g., "Python & Programming")
   - Update document metadata with cluster assignments and names
5. Use the Cluster filter to search within specific document groups (shown as "ID: Cluster Name")
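The TF-IDF naming step can be sketched in plain Python. This is a simplified stand-in, not the project's actual code: the real implementation likely uses scikit-learn's `TfidfVectorizer` plus stop-word filtering, and the `name_clusters` helper below is hypothetical.

```python
import math
import re
from collections import Counter

def name_clusters(cluster_docs, top_k=2):
    """Name each cluster from its top TF-IDF keywords, roughly how
    names like "Shakespeare & Drama" are derived.
    `cluster_docs` maps cluster id -> list of document texts."""
    # Term frequency: treat each cluster as one big document.
    tf = {cid: Counter(re.findall(r"[a-z]+", " ".join(docs).lower()))
          for cid, docs in cluster_docs.items()}
    n = len(tf)
    # Document frequency: in how many clusters does each word appear?
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())
    names = {}
    for cid, counts in tf.items():
        # Smoothed IDF downweights words common to every cluster.
        scores = {w: c * math.log((1 + n) / (1 + df[w]))
                  for w, c in counts.items()}
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        names[cid] = " & ".join(w.capitalize() for w in top)
    return names

docs = {
    0: ["python code functions", "python modules and code"],
    1: ["shakespeare drama plays", "shakespeare tragedy drama"],
}
print(name_clusters(docs))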
Interactive 3D embedding space showing document clusters and search results with cluster information
Use the Files page (`files.html`) to:
- Create custom folders
- Drag files between folders
- View unsorted files in the sidebar
- Navigate with breadcrumb navigation
- Double-click any file to open it in the built-in file viewer
File management interface with folder hierarchy and drag-and-drop organization
In the My Documents tab, you can:
- Export Data - Download all uploaded files as a ZIP archive for backup
- Delete Data - Reset the entire database (requires confirmation)
- Clears all vector embeddings from Qdrant
- Removes all folder organization
- Deletes all uploaded files
- This action is irreversible
- Navigate to the Search page (`index.html`)
- Click Show 3D Embedding Space to reveal the interactive visualization
- Explore your document corpus in 3D space
- Enter a search query to see:
- Your query point highlighted in gold
- Top matching documents connected with colored lines
- Line colors indicating similarity (green = high, red = low)
- Hover over points to see document details
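The green-to-red line coloring described above amounts to a linear interpolation on the similarity score. The actual frontend does this in Three.js, so the Python version below is purely illustrative:

```python
def score_to_rgb(score: float) -> tuple:
    """Map a similarity score in [0, 1] to an RGB color on the
    red -> green axis (green = high similarity, red = low)."""
    s = max(0.0, min(1.0, score))  # clamp out-of-range scores
    return (round(255 * (1 - s)), round(255 * s), 0)

print(score_to_rgb(1.0))  # (0, 255, 0)  pure green
print(score_to_rgb(0.0))  # (255, 0, 0)  pure red
print(score_to_rgb(0.5))  # olive/yellow midpoint
```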
```
┌─────────────┐
│  Frontend   │   Multi-Page Application
│ (Port 8001) │   index.html, documents.html, files.html
└──────┬──────┘
       │ HTTP
       ▼
┌─────────────┐       ┌─────────────┐
│   Backend   │ <───> │  MCP Server │   AI Agent Integration
│ (Port 8000) │       │   (/mcp)    │   (Claude Desktop, etc.)
└──────┬──────┘       └─────────────┘
       │
  ┌────┴─────┬───────────┐
  ▼          ▼           ▼
┌──────┐  ┌──────┐  ┌──────────┐
│SQLite│  │Qdrant│  │Sentence  │
│(Meta)│  │(Vec) │  │Transform │
└──────┘  └──────┘  └──────────┘
          Port 6333
```

```
┌──────────┐    ┌───────────┐    ┌─────────┐    ┌──────────┐    ┌────────┐
│  Upload  │ -> │ Extractor │ -> │ Chunker │ -> │ Embedder │ -> │ Qdrant │
│  (File)  │    │  (Text)   │    │ (Chunks)│    │ (Vectors)│    │ (Store)│
└──────────┘    └───────────┘    └─────────┘    └──────────┘    └────────┘
```
How Chunks Relate to Documents:
- Each uploaded file is processed by the appropriate Extractor to extract raw text
- The Chunker splits the text into smaller pieces (default: 500 tokens with 50-token overlap)
- Each chunk is converted to a 768-dimensional vector by the Embedder (SentenceTransformers)
- Chunks are stored in Qdrant with metadata linking them back to the original document
- A single document may produce 10-100+ chunks depending on its length
- Search queries match against individual chunks, but results show which document they came from
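The sliding-window chunking step can be sketched in a few lines. This character-based version is a simplification: the real Chunker also respects sentence boundaries for prose and AST structure for code, and `chunk_text` is an illustrative name rather than the actual `chunker.py` API.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    """Split text into overlapping chunks so that context spanning a
    chunk boundary still appears intact in at least one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap  # advance by size minus overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
    return chunks

doc = "x" * 1200
pieces = chunk_text(doc)
print(len(pieces))      # 3  (spans 0-500, 450-950, 900-1200)
print(len(pieces[1]))   # 500
```

Each of these chunks would then be embedded separately and stored in Qdrant with metadata pointing back to the source document.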
Multi-Page Application (MPA):
- `index.html` - Search interface with 3D visualization
- `documents.html` - Document upload and management
- `files.html` - File organization with drag-and-drop
Pages communicate with the backend API and share a modular CSS architecture.
Backend:
- FastAPI - Modern async web framework
- Qdrant - High-performance vector database (Dockerized)
- SentenceTransformers - State-of-the-art embeddings
- SQLite - Lightweight metadata storage
Frontend:
- Vanilla JavaScript (ES6+ modules)
- Modular CSS architecture (7 organized stylesheets)
- Three.js for 3D embedding visualization
- Fetch API for backend communication
Extractor Architecture:
The application uses a factory pattern for modular file processing:
- ExtractorFactory - Routes files to appropriate extractors based on file extension
- BaseExtractor - Interface that all extractors implement with an `extract(file_path) → str` method
Specialized Extractors:
- PDFExtractor - Uses `pypdf` for PDF text extraction
- DocxExtractor - Uses `docx2txt` for Word document parsing
- PptxExtractor - Uses `python-pptx` for PowerPoint presentations
- XlsxExtractor - Uses `openpyxl` for Excel spreadsheets with multi-sheet support
- CsvExtractor - Uses `pandas` for CSV file processing with configurable delimiters
- ImageExtractor - Uses `pytesseract` + `PIL` for OCR on images (.jpg, .jpeg, .png, .webp)
- TextExtractor - Handles plain text and Markdown files (.txt, .md)
- CodeExtractor - AST-aware parsing for Python code with function/class extraction
- CsExtractor - Dedicated C# file parsing with namespace and method detection
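The factory pattern described above can be sketched as follows. Beyond the names `BaseExtractor`, `ExtractorFactory`, and `extract`, everything here (the registry mechanism, `register`, `for_file`) is illustrative rather than the actual `factory.py` code:

```python
from abc import ABC, abstractmethod
from pathlib import Path

class BaseExtractor(ABC):
    """Interface every extractor implements: extract(file_path) -> str."""
    @abstractmethod
    def extract(self, file_path: str) -> str: ...

class TextExtractor(BaseExtractor):
    """Trivial extractor for plain text and Markdown files."""
    def extract(self, file_path: str) -> str:
        return Path(file_path).read_text(encoding="utf-8")

class ExtractorFactory:
    """Routes files to extractors by extension (registry pattern)."""
    _registry = {}

    @classmethod
    def register(cls, extensions, extractor_cls):
        for ext in extensions:
            cls._registry[ext] = extractor_cls

    @classmethod
    def for_file(cls, file_path: str) -> BaseExtractor:
        ext = Path(file_path).suffix.lower()
        try:
            return cls._registry[ext]()
        except KeyError:
            raise ValueError(f"Unsupported file type: {ext}")

ExtractorFactory.register([".txt", ".md"], TextExtractor)
print(type(ExtractorFactory.for_file("notes.md")).__name__)  # TextExtractor
```

The benefit of the registry approach is that adding a new format means writing one extractor class and one `register` call, with no changes to the ingestion pipeline.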
The frontend uses:
- base.css - CSS variables, reset, body, container
- animations.css - Keyframe animations and transitions
- components.css - Buttons, cards, forms, tables
- layout.css - Page-specific layouts
- filesystem.css - File manager UI
- batch-upload.css - Batch upload queue card and status indicators
- modals.css - Modal overlays and notifications
```
POST /upload
Content-Type: multipart/form-data

Parameters:
- file: File (required)
- category: string (required)
- tags: string[] (optional)
- relative_path: string (optional) - Folder path for batch uploads (e.g., "projects/homework")

Response: {
  "filename": "doc.pdf",
  "chunks_count": 42,
  "document_id": "uuid"
}
```

```
POST /search
Content-Type: application/json

Body: {
  "query": "What is semantic search?",
  "extension": ".pdf",
  "start_date": "2024-01-01",
  "end_date": "2024-12-31",
  "limit": 10,
  "cluster_filter": "0"   // Optional: filter by cluster ID
}

Response: {
  "results": [
    {
      "text": "chunk content",
      "score": 0.89,
      "metadata": {
        "cluster": 0,
        ...
      }
    }
  ]
}
```

```
GET /documents

Response: [
  {
    "filename": "doc.pdf",
    "category": "CS101",
    "upload_date": 1705320000.0
  }
]
```

```
DELETE /documents/(unknown)

Response: {
  "message": "Document deleted successfully"
}
```

- `GET /folders` - List all folders
- `POST /folders` - Create folder
- `PUT /folders/{id}` - Update folder
- `DELETE /folders/{id}` - Delete empty folder
- `POST /files/move` - Move file to folder
- `GET /files/unsorted` - List unsorted files
- `GET /files/in_folders` - Get file-to-folder mappings
- `GET /files/content/(unknown)` - Retrieve file content for viewing
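The `/search` endpoint can be driven from a short Python client. A stdlib-only sketch (the function names and the `base_url` default are illustrative, not part of the project):

```python
import json
import urllib.request

def build_search_body(query, limit=10, cluster_filter=None,
                      start_date=None, end_date=None, extension=None):
    """Build the JSON body for POST /search, including only the
    optional filters that are actually set."""
    body = {"query": query, "limit": limit}
    for key, value in [("cluster_filter", cluster_filter),
                       ("start_date", start_date),
                       ("end_date", end_date),
                       ("extension", extension)]:
        if value is not None:
            body[key] = value
    return body

def search(query, base_url="http://localhost:8000", **filters):
    """POST the query to a running backend and return the results list.
    (Requires the backend to be up, so it is not executed here.)"""
    req = urllib.request.Request(
        f"{base_url}/search",
        data=json.dumps(build_search_body(query, **filters)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]

print(build_search_body("semantic search", limit=5, cluster_filter="0"))
```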
```
POST /api/cluster

Response: {
  "message": "Clustering complete",
  "total_documents": 150,
  "clusters": 5
}

# Automatically clusters all documents in the database
# Automatically determines optimal number of clusters using HDBSCAN density-based algorithm
```

```
GET /api/clusters

Response: {
  "clusters": [0, 1, 2, 3, 4]
}

# Returns list of all cluster IDs currently assigned to documents
```

```
GET /api/embeddings/3d

Response: {
  "coords": [[x, y, z], ...],   // PCA-reduced 3D coordinates
  "point_ids": ["uuid1", ...],
  "metadata": [{"filename": "doc.pdf", ...}, ...]
}

# Returns 3D coordinates for all document chunks (cached for performance)
```

```
POST /api/embeddings/3d/query
Content-Type: application/json

Body: {
  "query": "machine learning",
  "k": 5   // Number of nearest neighbors
}

Response: {
  "query_coords": [x, y, z],
  "neighbors": [{"id": "uuid", "coords": [x, y, z], "score": 0.89}, ...]
}

# Transforms a search query to 3D space and finds nearest neighbors
```

```
POST /upload-batch
Content-Type: multipart/form-data

Parameters:
- files: File[] (required) - Multiple files to upload
- category: string (required)
- tags: string[] (optional)
- relative_path: string (optional) - Shared folder path for all files

Response: {
  "results": [...],   // Array of upload results
  "total": 10,
  "successful": 10,
  "failed": 0
}

# Optimized batch upload for files sharing the same folder
```

```
GET /api/jobs

Response: {
  "jobs": [
    {"id": "uuid", "type": "clustering", "status": "completed", "progress": 100}
  ]
}

# List all background jobs (clustering, etc.)
```

```
GET /api/jobs/{job_id}

Response: {
  "id": "uuid",
  "type": "clustering",
  "status": "running",
  "progress": 45,
  "created_at": "2024-01-15T10:30:00",
  "message": "Processing..."
}

# Get status of a specific background job
```

```
GET /export

Response: application/zip

# Downloads a ZIP archive of all uploaded files
```

```
DELETE /reset

Response: {
  "status": "success",
  "message": "All data has been reset"
}

# WARNING: Irreversibly deletes all data
```

Create a `.env` file in the project root directory (copy from `.env.example`):
```bash
# Qdrant Configuration
QDRANT_HOST=localhost              # Default: "localhost". Use "qdrant" when running in Docker Compose
QDRANT_PORT=6333                   # Default: 6333
QDRANT_COLLECTION=vector_db        # Default: "vector_db"

# File Upload Settings
UPLOAD_DIR=uploads                 # Default: "uploads" (relative to backend directory)
MAX_FILE_SIZE=52428800             # Default: 50MB (50 * 1024 * 1024 bytes)

# Embedding Model
EMBEDDING_MODEL=all-mpnet-base-v2  # Default: "all-mpnet-base-v2" (768-dimensional)

# Compute Device (for native mode)
DEVICE=auto                        # Options: "auto", "cpu", "cuda", "mps"
                                   # auto = detect best available (MPS > CUDA > CPU)

# Chunking Settings
CHUNK_SIZE=500                     # Default: 500 characters per chunk
CHUNK_OVERLAP=50                   # Default: 50 characters overlap between chunks

# Security
ADMIN_KEY=                         # Optional: protects /reset endpoint. Leave empty to disable.

# Rate Limiting (high defaults for personal use)
RATE_LIMIT_UPLOAD=1000/minute      # Default: 1000/minute (won't affect normal use)
RATE_LIMIT_SEARCH=1000/minute      # Default: 1000/minute
RATE_LIMIT_RESET=60/minute         # Default: 60/minute (stricter for destructive ops)
```

> [!NOTE]
> When using Docker Compose, `QDRANT_HOST` is automatically set to `qdrant` (the service name) in `docker-compose.yml`. You only need a `.env` file for manual installations or to override defaults.
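A sketch of how these variables might be read with fallbacks to the documented defaults. This is an assumption about `config.py` (it may well use pydantic-settings instead), and `load_settings` is an illustrative name:

```python
import os

def load_settings(env=os.environ):
    """Read configuration from environment variables, falling back to
    the defaults documented in the .env example above."""
    return {
        "qdrant_host": env.get("QDRANT_HOST", "localhost"),
        "qdrant_port": int(env.get("QDRANT_PORT", "6333")),
        "max_file_size": int(env.get("MAX_FILE_SIZE", str(50 * 1024 * 1024))),
        "chunk_size": int(env.get("CHUNK_SIZE", "500")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "50")),
        "device": env.get("DEVICE", "auto"),
    }

print(load_settings({}))  # all defaults, e.g. when no .env is present
print(load_settings({"QDRANT_HOST": "qdrant"})["qdrant_host"])  # prints "qdrant"
```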
The Vector Knowledge Base includes built-in support for the Model Context Protocol (MCP), allowing AI agents like Claude Desktop to interact with your knowledge base directly.
- Node.js 18+ - Required for the MCP bridge
  - Download from nodejs.org (LTS recommended)
  - Or on macOS: `brew install node`
1. Ensure the backend is running

   ```bash
   # Docker mode
   docker-compose up -d

   # OR Native mode - macOS/Linux
   ./scripts/start-backend-native.sh

   # OR Native mode - Windows
   scripts\start-backend-native.bat
   ```

2. Locate the Claude Desktop config file

   - macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
   - Windows: `%APPDATA%\Claude\claude_desktop_config.json`

   > [!TIP]
   > In Claude Desktop: Claude → Settings → Developer → Edit Config

3. Add the MCP server configuration

   Edit `claude_desktop_config.json`:

   ```json
   {
     "mcpServers": {
       "vector-knowledge-base": {
         "command": "npx",
         "args": ["-y", "mcp-remote", "http://localhost:8000/mcp"]
       }
     }
   }
   ```

   > [!NOTE]
   > On macOS, if `npx` is not in PATH, use the full path: `/usr/local/bin/npx`

4. Restart Claude Desktop

   - Fully quit (Cmd+Q / Alt+F4), don't just close the window
   - Reopen Claude Desktop
   - The MCP tools should now be available
Once connected, just ask Claude naturally:
| Example Prompt | Action |
|---|---|
| "Search my knowledge base for machine learning" | Semantic search |
| "List all documents in my knowledge base" | List documents |
| "Show me the document clusters" | Get clusters |
| "Run auto-clustering on my documents" | Cluster documents |
| "Check if my vector database is healthy" | Health check |
| "Get 3D embedding data for cluster 1" | Visualization data |
| "Create a summary document with my notes" | Create text document |
Claude Desktop searching and listing documents via MCP integration
> [!IMPORTANT]
> Claude Desktop has limitations when interacting with the knowledge base via MCP.

What Claude CAN do:

- ✅ Search documents semantically
- ✅ List all documents and folders
- ✅ Delete documents by filename
- ✅ Run clustering and get cluster info
- ✅ Get 3D embedding coordinates for visualization
- ✅ Check system health
- ✅ Create text documents (.txt, .md, .json) - Claude can generate content and save it to your knowledge base

What Claude CANNOT do:

- ❌ Upload binary files - PDFs, Word docs, and images require multipart uploads, which MCP cannot provide (at least in my testing with Claude Desktop)
- ❌ Access your filesystem - Claude cannot read files from paths like `/Users/.../file.pdf`
To upload files, use one of these methods instead:

- Web interface at `http://localhost:8001/documents.html`
- curl command:

  ```bash
  curl -X POST http://localhost:8000/upload \
    -F "file=@/path/to/document.pdf" \
    -F "category=my-category"
  ```
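If you prefer Python and want to stay stdlib-only, the multipart body the curl command sends can be built by hand (most clients would simply use the third-party `requests` package; the helper name below is illustrative):

```python
import uuid

def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="application/octet-stream"):
    """Build a multipart/form-data body and matching Content-Type
    header, equivalent to what curl -F produces."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields (e.g. category, tags)
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The file part, with filename and content type
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode()
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return b"".join(parts), headers

body, headers = build_multipart(
    {"category": "my-category"}, "file", "document.pdf", b"%PDF-1.4 ..."
)
# POST `body` with `headers` to http://localhost:8000/upload
print(headers["Content-Type"].startswith("multipart/form-data"))  # True
```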
| Tool | Description |
|---|---|
| `health_check` | Check if the API is running |
| `get_allowed_extensions` | Get list of supported file types |
| `search_documents` | Semantic search across all documents |
| `list_documents` | List all uploaded documents |
| `delete_document` | Delete a document by filename |
| `get_folders` | List folder structure |
| `create_folder` | Create a new folder |
| `update_folder` | Rename or move a folder |
| `delete_folder` | Delete an empty folder |
| `move_file` | Move file to folder |
| `get_unsorted_files` | List files not in any folder |
| `get_files_in_folders` | Get file-to-folder mappings |
| `cluster_documents` | Run auto-clustering |
| `get_clusters` | Get cluster information |
| `get_embeddings_3d` | Get 3D visualization coordinates |
| `transform_query_3d` | Project query into 3D space |
| `get_job_status` | Check background job progress |
| `mcp_create_document` | Create text documents (.txt, .md, .json) |
> [!TIP]
> Claude can create searchable text documents using `mcp_create_document`. Ask it to "create a summary", "write notes", or "save a document" and it will add the content to your knowledge base.
MCP settings are configured in `config.py` (not in `.env`):

```
MCP_ENABLED=true                # Enable/disable MCP endpoint
MCP_PATH=/mcp                   # URL path for MCP server
MCP_NAME=Vector Knowledge Base  # Display name
MCP_AUTH_ENABLED=false          # Enable OAuth (production)
```

"Server disconnected" error in Claude Desktop:
- Ensure the backend is running: `curl http://localhost:8000/health`
- Check that Node.js is installed: `node --version`
- Try the full path to npx: `/usr/local/bin/npx`
MCP tools not appearing:
- Fully quit and reopen Claude Desktop
- Check the Claude Desktop logs for errors
- Verify the config JSON is valid (no trailing commas)
> [!CAUTION]
> MCP provides AI agents with full access to your knowledge base. In production environments, enable `MCP_AUTH_ENABLED=true` for OAuth protection.
If you see "Connection refused" errors:

```bash
# Check if Qdrant is running
docker ps

# Restart Qdrant container
docker restart <container-id>

# Or start a new container
docker run -d -p 6333:6333 -v ./qdrant_storage:/qdrant/storage:z qdrant/qdrant
```

> [!WARNING]
> Dependency conflicts between sentence-transformers and huggingface-hub can cause startup failures.

Solution:

```bash
pip install --upgrade sentence-transformers huggingface-hub
```

Check supported file types:
- Documents: `.pdf`, `.docx`, `.pptx`, `.ppt`, `.xlsx`, `.csv`, `.txt`, `.md`
- Images: `.jpg`, `.jpeg`, `.png`, `.webp` (OCR-processed)
- Code: `.py`, `.js`, `.java`, `.cpp`, `.html`, `.css`, `.json`, `.xml`, `.yaml`, `.yml`, `.cs`
Maximum file size: 50MB (configurable)
If you see "Failed to fetch" errors in the browser console:
-
Verify backend is running on port 8000:
curl http://127.0.0.1:8000/health
-
Check
frontend/config.jsuses127.0.0.1(notlocalhost):const API_BASE_URL = 'http://127.0.0.1:8000';
This avoids IPv6/IPv4 resolution issues on some systems.
```
Vector-Knowledge-Base/
├── backend/
│   ├── extractors/
│   │   ├── __init__.py
│   │   ├── base.py
│   │   ├── factory.py
│   │   ├── pdf_extractor.py
│   │   ├── docx_extractor.py
│   │   ├── pptx_extractor.py
│   │   ├── xlsx_extractor.py
│   │   ├── csv_extractor.py
│   │   ├── text_extractor.py
│   │   ├── code_extractor.py
│   │   ├── cs_extractor.py
│   │   └── image_extractor.py
│   ├── uploads/                    # Uploaded files (gitignored except .gitkeep)
│   │   └── .gitkeep
│   ├── data/                       # Runtime data (auto-created)
│   │   └── documents.json          # Document registry for O(1) listing
│   ├── main.py
│   ├── vector_db.py
│   ├── embedding_service.py
│   ├── ingestion.py
│   ├── chunker.py
│   ├── clustering.py
│   ├── filesystem_db.py
│   ├── document_registry.py        # O(1) document listing registry
│   ├── dimensionality_reduction.py
│   ├── jobs.py                     # Background task tracking
│   ├── config.py
│   ├── constants.py                # Shared constants
│   ├── mcp_server.py               # MCP server integration
│   └── exceptions.py
├── frontend/
│   ├── css/
│   │   ├── base.css
│   │   ├── animations.css
│   │   ├── components.css
│   │   ├── layout.css
│   │   ├── filesystem.css
│   │   ├── batch-upload.css
│   │   └── modals.css
│   ├── js/
│   │   └── embedding-visualizer.js
│   ├── index.html
│   ├── documents.html
│   ├── files.html
│   ├── config.js
│   ├── constants.js
│   ├── search.js
│   ├── upload.js
│   ├── documents.js
│   ├── filesystem.js
│   └── favicon.ico
├── scripts/
│   ├── start-backend-native.sh     # GPU mode startup (Unix)
│   └── start-backend-native.bat    # GPU mode startup (Windows)
├── screenshots/
├── Docs/
│   └── Vector_Knowledge_Base_Technical_Report.pdf  # Full technical documentation
├── qdrant_storage/                 # Created at runtime (gitignored)
├── uploads/                        # Created at runtime by Docker (gitignored)
├── backend_db/                     # Created at runtime (gitignored)
├── Dockerfile
├── docker-compose.yml              # Full Docker deployment
├── docker-compose.native.yml       # Native backend mode
├── nginx.conf
├── requirements.txt
├── requirements.in
├── LICENSE
└── README.md
```
- Upload: ~2-5 seconds for typical PDF
- Search: 100-500ms depending on corpus size (sub-50ms for <10k vectors)
- Embedding: ~50-100ms per chunk
- Capacity: Scales to 100k+ documents with Qdrant
Built with ❤️ using FastAPI, Qdrant, and SentenceTransformers