b179386a57118f00a2ec3a2442fc29665bf81675
- PDF: extracts selectable text via pymupdf, falls back to Tesseract OCR for scanned docs - PDF: renders first page as screenshot thumbnail - Images: Tesseract OCR for text extraction, OpenAI vision API fallback for photos - Plain text files: direct decode - All extracted text stored in extracted_text field for search/embedding - Tested: PDF upload → text extracted → AI classified → searchable New deps: pymupdf, pytesseract, Pillow System dep: tesseract-ocr added to both Dockerfiles Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gitea CI Workflows
security.yml
Runs on push/PR to master. Three jobs:
- dependency-audit —
npm audit --audit-level=highfor budget and frontend - secret-scanning — checks for tracked .env/.db files and hardcoded secret patterns
- dockerfile-lint — verifies all Dockerfiles have
USER(non-root) andHEALTHCHECK
Runner Setup
The runner is configured in the Gitea docker-compose at /media/yusiboyz/Media/Scripts/gitea/docker-compose.yml.
What was done:
- Added
[actions] ENABLED = trueto Gitea'sapp.ini - Added
runnerservice (gitea/act_runner) to Gitea's docker-compose - Generated runner token via
docker exec -u git gitea gitea actions generate-runner-token - Token stored in
/media/yusiboyz/Media/Scripts/gitea/.envasRUNNER_TOKEN - Runner registered as
platform-runnerwith labels: ubuntu-latest, ubuntu-24.04, ubuntu-22.04
To regenerate token (if needed):
cd /media/yusiboyz/Media/Scripts/gitea
docker exec -u git gitea gitea actions generate-runner-token
# Update .env with new RUNNER_TOKEN value
docker compose up -d runner
To check runner status:
docker logs gitea-runner
Description
Languages
Svelte
51.2%
Python
24.2%
Swift
13.5%
JavaScript
5.4%
TypeScript
3.3%
Other
2.4%