feat: brain service — self-contained second brain knowledge manager

Full backend service with:
- FastAPI REST API with CRUD, search, reprocess endpoints
- PostgreSQL + pgvector for items and semantic search
- Redis + RQ for background job processing
- Meilisearch for fast keyword/filter search
- Browserless/Chrome for JS rendering and screenshots
- OpenAI structured output for AI classification
- Local file storage with S3-ready abstraction
- Gateway auth via X-Gateway-User-Id header
- Own docker-compose stack (6 containers)

Classification: fixed folders (Home/Family/Work/Travel/Knowledge/Faith/Projects)
and fixed tags (28 predefined). AI assigns exactly 1 folder, 2-3 tags, title,
summary, and confidence score per item.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Yusuf Suleman
Date: 2026-04-01 11:48:29 -05:00
Parent: 51a8157fd4
Commit: 8275f3a71b
73 changed files with 24081 additions and 4209 deletions

services/brain/README.md (new file, 76 lines)
# Second Brain Service
A "save everything" knowledge management backend. Captures links, notes, PDFs, images, and documents. AI classifies everything automatically. Supports keyword, semantic, and hybrid search.
## Architecture
```
brain-api → FastAPI REST API (port 8200)
brain-worker → RQ background processor
brain-db → PostgreSQL 16 + pgvector
brain-redis → Redis 7 (job queue)
brain-meili → Meilisearch (keyword search)
brain-browserless → Headless Chrome (JS rendering + screenshots)
```
## Quick Start
```bash
cd services/brain
# Copy and edit env
cp .env.example .env
# Add your OPENAI_API_KEY
# Start the stack
docker compose up -d
# Check health
curl http://localhost:8200/api/health
```
## API Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/health` | Health check |
| GET | `/api/config` | List folders/tags |
| POST | `/api/items` | Create item (link/note) |
| POST | `/api/items/upload` | Upload file |
| GET | `/api/items` | List items (with filters) |
| GET | `/api/items/{id}` | Get item by ID |
| PATCH | `/api/items/{id}` | Update item |
| DELETE | `/api/items/{id}` | Delete item |
| POST | `/api/items/{id}/reprocess` | Re-run AI classification |
| POST | `/api/search` | Keyword search (Meilisearch) |
| POST | `/api/search/semantic` | Semantic search (pgvector) |
| POST | `/api/search/hybrid` | Combined keyword + semantic |
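The hybrid endpoint combines the keyword and semantic result lists. The actual merging strategy lives in the service code; as an illustrative sketch (not the service's implementation), one common approach is reciprocal rank fusion:

```python
def rrf_merge(keyword_ids: list[str], semantic_ids: list[str], k: int = 60) -> list[str]:
    """Merge two ranked ID lists with reciprocal rank fusion (RRF).

    Each list contributes 1 / (k + rank) per item; items ranked well in
    both lists accumulate the highest total. Illustrative sketch only --
    the service may weight or merge results differently.
    """
    scores: dict[str, float] = {}
    for ranking in (keyword_ids, semantic_ids):
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

An item that appears near the top of both lists (like `"b"` below) outranks one that appears in only a single list.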
## Gateway Integration
The platform gateway proxies `/api/brain/*` to `brain-api:8200/api/*`.
Auth is handled by the gateway, which injects the `X-Gateway-User-Id` header on every proxied request.
The brain-api container joins the `pangolin` Docker network so the gateway can reach it.
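In sketch form, resolving the caller from the injected header might look like the helper below. This is an illustrative stand-in, not the service's actual dependency: a plain headers dict substitutes for the FastAPI request object, and the helper name is hypothetical.

```python
def resolve_gateway_user(headers: dict[str, str]) -> str:
    """Return the user ID injected by the gateway, or fail closed.

    Illustrative sketch: the real service would read this header inside
    a FastAPI dependency and return an HTTP 401/403 instead of raising.
    """
    user_id = headers.get("X-Gateway-User-Id", "").strip()
    if not user_id:
        # No header means the request did not come through the gateway.
        raise PermissionError("missing X-Gateway-User-Id header")
    return user_id
```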
## Processing Flow
1. User submits URL/note/file → stored immediately as `pending`
2. RQ worker picks it up → status becomes `processing`
3. Worker: fetches content, takes screenshot, extracts text
4. Worker: calls OpenAI for classification (folder, tags, title, summary)
5. Worker: generates embedding via OpenAI
6. Worker: indexes in Meilisearch
7. Status becomes `ready` (or `failed` on error)
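The status transitions above can be sketched as a single driver function. The step names, signatures, and dict-based item are illustrative stand-ins for the worker's real code:

```python
def process_item(item: dict, steps: dict) -> dict:
    """Drive one item through the pipeline, mirroring the flow above.

    `steps` maps step names to callables (fetch, classify, embed, index);
    names and signatures here are hypothetical sketches, not the worker's API.
    """
    item["status"] = "processing"
    try:
        item["content"] = steps["fetch"](item)          # fetch + extract text
        item["classification"] = steps["classify"](item)  # folder/tags/title/summary
        item["embedding"] = steps["embed"](item)        # OpenAI embedding
        steps["index"](item)                            # push to Meilisearch
        item["status"] = "ready"
    except Exception as exc:
        # Any step failing marks the item failed rather than crashing the worker.
        item["status"] = "failed"
        item["error"] = str(exc)
    return item
```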
## Storage
Files are stored locally under `./storage/`. Each item gets its own subdirectory:
```
storage/{item_id}/screenshot/screenshot.png
storage/{item_id}/archived_html/page.html
storage/{item_id}/original_upload/filename.pdf
```
S3-compatible storage can be added by implementing `S3Storage` in `app/services/storage.py`.
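The layout above can be captured in a small path helper. The function name is a hypothetical sketch (the real logic lives in `app/services/storage.py`); the three artifact kinds come from the directory listing shown:

```python
from pathlib import Path

def item_file_path(root: str, item_id: str, kind: str, filename: str) -> Path:
    """Build storage/{item_id}/{kind}/{filename} per the layout above.

    `kind` must be one of the subdirectories shown (screenshot,
    archived_html, original_upload). Illustrative helper, not the
    service's actual storage API.
    """
    allowed = {"screenshot", "archived_html", "original_upload"}
    if kind not in allowed:
        raise ValueError(f"unknown artifact kind: {kind}")
    return Path(root) / item_id / kind / filename
```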