feat: brain service — self-contained second brain knowledge manager

Full backend service with:
- FastAPI REST API with CRUD, search, reprocess endpoints
- PostgreSQL + pgvector for items and semantic search
- Redis + RQ for background job processing
- Meilisearch for fast keyword/filter search
- Browserless/Chrome for JS rendering and screenshots
- OpenAI structured output for AI classification
- Local file storage with S3-ready abstraction
- Gateway auth via X-Gateway-User-Id header
- Own docker-compose stack (6 containers)

Classification: fixed folders (Home/Family/Work/Travel/Knowledge/Faith/Projects)
and fixed tags (28 predefined). AI assigns exactly 1 folder, 2-3 tags, title,
summary, and confidence score per item.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Yusuf Suleman
2026-04-01 11:48:29 -05:00
parent 51a8157fd4
commit 8275f3a71b
73 changed files with 24081 additions and 4209 deletions


@@ -0,0 +1,109 @@
"""Pydantic schemas for API request/response."""
from __future__ import annotations

from datetime import datetime
from typing import Optional

from pydantic import BaseModel, Field


# ── Request schemas ──

class ItemCreate(BaseModel):
    type: str = "link"
    url: Optional[str] = None
    raw_content: Optional[str] = None
    title: Optional[str] = None
    folder: Optional[str] = None
    tags: Optional[list[str]] = None


class ItemUpdate(BaseModel):
    title: Optional[str] = None
    folder: Optional[str] = None
    tags: Optional[list[str]] = None
    raw_content: Optional[str] = None


class SearchQuery(BaseModel):
    q: str
    folder: Optional[str] = None
    tags: Optional[list[str]] = None
    type: Optional[str] = None
    limit: int = Field(default=20, le=100)
    offset: int = 0


class SemanticSearchQuery(BaseModel):
    q: str
    folder: Optional[str] = None
    type: Optional[str] = None
    limit: int = Field(default=20, le=100)


class HybridSearchQuery(BaseModel):
    q: str
    folder: Optional[str] = None
    tags: Optional[list[str]] = None
    type: Optional[str] = None
    limit: int = Field(default=20, le=100)


# ── Response schemas ──

class AssetOut(BaseModel):
    id: str
    asset_type: str
    filename: str
    content_type: Optional[str] = None
    size_bytes: Optional[int] = None
    created_at: datetime

    model_config = {"from_attributes": True}


class ItemOut(BaseModel):
    id: str
    type: str
    title: Optional[str] = None
    url: Optional[str] = None
    folder: Optional[str] = None
    tags: Optional[list[str]] = None
    summary: Optional[str] = None
    confidence: Optional[float] = None
    processing_status: str
    processing_error: Optional[str] = None
    metadata_json: Optional[dict] = None
    created_at: datetime
    updated_at: datetime
    assets: list[AssetOut] = []

    model_config = {"from_attributes": True}


class ItemList(BaseModel):
    items: list[ItemOut]
    total: int


class SearchResult(BaseModel):
    items: list[ItemOut]
    total: int
    query: str


class ConfigOut(BaseModel):
    folders: list[str]
    tags: list[str]


# ── OpenAI classification schema ──

class ClassificationResult(BaseModel):
    """What the AI returns for each item."""
    folder: str
    tags: list[str] = Field(min_length=2, max_length=3)
    title: str
    summary: str
    confidence: float = Field(ge=0.0, le=1.0)
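
Note on the classification schema: the commit message describes fixed folders and fixed tags, but ClassificationResult types `folder` as a plain `str` and only bounds the number of tags. The sketch below is a hypothetical, standard-library-only check (not part of this commit) of how a parsed result might be tested against those fixed sets. The folder names come from the commit message; the 28 tags are not listed there, so the allowed tag set is left as a caller-supplied argument. The helper name `check_classification` is an assumption made for illustration.

```python
from __future__ import annotations

# Folder names as listed in the commit message.
FOLDERS = ("Home", "Family", "Work", "Travel", "Knowledge", "Faith", "Projects")


def check_classification(result: dict, allowed_tags: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the result passed.

    Hypothetical helper: mirrors the constraints stated in the commit
    message (1 known folder, 2-3 known tags, confidence in [0, 1]).
    """
    problems: list[str] = []

    # Exactly one folder, and it must be one of the fixed folders.
    if result.get("folder") not in FOLDERS:
        problems.append(f"unknown folder: {result.get('folder')!r}")

    # Between two and three tags, all drawn from the predefined set.
    tags = result.get("tags") or []
    if not 2 <= len(tags) <= 3:
        problems.append(f"expected 2-3 tags, got {len(tags)}")
    unknown = [t for t in tags if t not in allowed_tags]
    if unknown:
        problems.append(f"unknown tags: {unknown}")

    # Confidence must be a number between 0.0 and 1.0 inclusive.
    confidence = result.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence out of range: {confidence!r}")

    return problems
```

If a check like this were wanted inside the schema itself, the same logic could plausibly move into a Pydantic validator on ClassificationResult; whether the service already enforces the fixed sets elsewhere is not visible in this file.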