feat: brain service — self-contained second brain knowledge manager

Full backend service with:
- FastAPI REST API with CRUD, search, reprocess endpoints
- PostgreSQL + pgvector for items and semantic search
- Redis + RQ for background job processing
- Meilisearch for fast keyword/filter search
- Browserless/Chrome for JS rendering and screenshots
- OpenAI structured output for AI classification
- Local file storage with S3-ready abstraction
- Gateway auth via X-Gateway-User-Id header
- Own docker-compose stack (6 containers)

Classification: fixed folders (Home/Family/Work/Travel/Knowledge/Faith/Projects)
and fixed tags (28 predefined). AI assigns exactly 1 folder, 2-3 tags, title,
summary, and confidence score per item.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Yusuf Suleman
Date: 2026-04-01 11:48:29 -05:00
Parent: 51a8157fd4
Commit: 8275f3a71b
73 changed files with 24081 additions and 4209 deletions

services/brain/README.md (new file, 76 lines)
# Second Brain Service
A "save everything" knowledge management backend. Captures links, notes, PDFs, images, and documents. AI classifies everything automatically. Supports keyword, semantic, and hybrid search.
## Architecture
```
brain-api → FastAPI REST API (port 8200)
brain-worker → RQ background processor
brain-db → PostgreSQL 16 + pgvector
brain-redis → Redis 7 (job queue)
brain-meili → Meilisearch (keyword search)
brain-browserless → Headless Chrome (JS rendering + screenshots)
```
## Quick Start
```bash
cd services/brain
# Copy and edit env
cp .env.example .env
# Add your OPENAI_API_KEY
# Start the stack
docker compose up -d
# Check health
curl http://localhost:8200/api/health
```
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/config` | List folders/tags |
| POST | `/api/items` | Create item (link/note) |
| POST | `/api/items/upload` | Upload file |
| GET | `/api/items` | List items (with filters) |
| GET | `/api/items/{id}` | Get item by ID |
| PATCH | `/api/items/{id}` | Update item |
| DELETE | `/api/items/{id}` | Delete item |
| POST | `/api/items/{id}/reprocess` | Re-run AI classification |
| POST | `/api/search` | Keyword search (Meilisearch) |
| POST | `/api/search/semantic` | Semantic search (pgvector) |
| POST | `/api/search/hybrid` | Combined keyword + semantic |
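The hybrid endpoint combines the keyword and semantic result lists. The actual merging strategy lives in the service code; as an illustrative sketch (not the service's implementation), one common approach is reciprocal rank fusion:

```python
def rrf_merge(keyword_ids: list[str], semantic_ids: list[str], k: int = 60) -> list[str]:
    """Merge two ranked ID lists with reciprocal rank fusion (RRF).

    Each list contributes 1 / (k + rank) per item; items ranked well in
    both lists accumulate the highest total. Illustrative sketch only --
    the service may weight or merge results differently.
    """
    scores: dict[str, float] = {}
    for ranking in (keyword_ids, semantic_ids):
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

An item that appears near the top of both lists (like `"b"` below) outranks one that appears in only a single list.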
## Gateway Integration
The platform gateway proxies `/api/brain/*` to `brain-api:8200/api/*`.
Auth is handled by the gateway, which injects the `X-Gateway-User-Id` header on every proxied request.
The brain-api container joins the `pangolin` Docker network so the gateway can reach it.
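In sketch form, resolving the caller from the injected header might look like the helper below. This is an illustrative stand-in, not the service's actual dependency: a plain headers dict substitutes for the FastAPI request object, and the helper name is hypothetical.

```python
def resolve_gateway_user(headers: dict[str, str]) -> str:
    """Return the user ID injected by the gateway, or fail closed.

    Illustrative sketch: the real service would read this header inside
    a FastAPI dependency and return an HTTP 401/403 instead of raising.
    """
    user_id = headers.get("X-Gateway-User-Id", "").strip()
    if not user_id:
        # No header means the request did not come through the gateway.
        raise PermissionError("missing X-Gateway-User-Id header")
    return user_id
```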
## Processing Flow
1. User submits URL/note/file → stored immediately as `pending`
2. RQ worker picks it up → status becomes `processing`
3. Worker: fetches content, takes screenshot, extracts text
4. Worker: calls OpenAI for classification (folder, tags, title, summary)
5. Worker: generates embedding via OpenAI
6. Worker: indexes in Meilisearch
7. Status becomes `ready` (or `failed` on error)
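The status transitions above can be sketched as a single driver function. The step names, signatures, and dict-based item are illustrative stand-ins for the worker's real code:

```python
def process_item(item: dict, steps: dict) -> dict:
    """Drive one item through the pipeline, mirroring the flow above.

    `steps` maps step names to callables (fetch, classify, embed, index);
    names and signatures here are hypothetical sketches, not the worker's API.
    """
    item["status"] = "processing"
    try:
        item["content"] = steps["fetch"](item)          # fetch + extract text
        item["classification"] = steps["classify"](item)  # folder/tags/title/summary
        item["embedding"] = steps["embed"](item)        # OpenAI embedding
        steps["index"](item)                            # push to Meilisearch
        item["status"] = "ready"
    except Exception as exc:
        # Any step failing marks the item failed rather than crashing the worker.
        item["status"] = "failed"
        item["error"] = str(exc)
    return item
```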
## Storage
Files are stored locally under `./storage/`. Each item gets its own subdirectory:
```
storage/{item_id}/screenshot/screenshot.png
storage/{item_id}/archived_html/page.html
storage/{item_id}/original_upload/filename.pdf
```
S3-compatible storage can be added by implementing `S3Storage` in `app/services/storage.py`.
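The layout above can be captured in a small path helper. The function name is a hypothetical sketch (the real logic lives in `app/services/storage.py`); the three artifact kinds come from the directory listing shown:

```python
from pathlib import Path

def item_file_path(root: str, item_id: str, kind: str, filename: str) -> Path:
    """Build storage/{item_id}/{kind}/{filename} per the layout above.

    `kind` must be one of the subdirectories shown (screenshot,
    archived_html, original_upload). Illustrative helper, not the
    service's actual storage API.
    """
    allowed = {"screenshot", "archived_html", "original_upload"}
    if kind not in allowed:
        raise ValueError(f"unknown artifact kind: {kind}")
    return Path(root) / item_id / kind / filename
```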