AI engineering · Portfolio
Jurist | RAG pipeline, with citations you can trust.
A multi-agent legal Q&A system over Dutch tenancy law that grounds every claim in real Civil Code articles and ECLI case citations.
- Year: 2026
- Role: Solo build
- Status: Portfolio
The problem
Legal AI hallucinates. It invents statute numbers, paraphrases rulings so that they point the wrong way, and cites cases that don't exist. For tenancy questions (eviction, deposits, repairs), that's the kind of error that costs people money or homes. A generic LLM can't be trusted on its own, and verifying every output by hand defeats the purpose.
What I built
- 01 Five sequential async-generator agents (decomposer, statute retriever, case retriever, synthesizer, validator), orchestrated without framework dependencies
- 02 A 218-node knowledge graph of the Dutch Civil Code traversed by a Sonnet agent for statute retrieval
- 03 Local bge-m3 embeddings over ~49K case-law chunks in LanceDB, with Haiku reranking the top candidates
- 04 Synthesis with enum-constrained citation tokens: every claim must reference a real article ID or ECLI
- 05 Streaming pipeline: typed TraceEvents flow over SSE into a React UI that animates the knowledge graph and tokens in real time
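The orchestration pattern in the list above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `TraceEvent` fields, the shared `state` dict, the stub agent bodies, and the helper names `run_pipeline`, `to_sse`, and `collect_frames` are all assumptions, and only two of the five agents are stubbed out.

```python
import asyncio
import json
from dataclasses import asdict, dataclass, field
from typing import AsyncIterator, Callable


@dataclass
class TraceEvent:
    """One typed event emitted by an agent (field names are assumptions)."""
    agent: str
    kind: str  # e.g. "start" or "result"
    payload: dict = field(default_factory=dict)


# Each agent is an async generator: it reads and updates shared state
# and yields TraceEvents as it goes.
Agent = Callable[[dict], AsyncIterator[TraceEvent]]


async def decomposer(state: dict) -> AsyncIterator[TraceEvent]:
    yield TraceEvent("decomposer", "start")
    # Placeholder for an LLM call that splits the question into sub-questions.
    state["sub_questions"] = [state["question"]]
    yield TraceEvent("decomposer", "result", {"count": len(state["sub_questions"])})


async def synthesizer(state: dict) -> AsyncIterator[TraceEvent]:
    yield TraceEvent("synthesizer", "start")
    await asyncio.sleep(0)  # stands in for awaiting a model response
    state["answer"] = f"Draft answer covering {len(state['sub_questions'])} sub-question(s)"
    yield TraceEvent("synthesizer", "result", {"answer": state["answer"]})


async def run_pipeline(question: str, agents: list[Agent]) -> AsyncIterator[TraceEvent]:
    """Run agents strictly in order, forwarding every event as it is produced."""
    state: dict = {"question": question}
    for agent in agents:
        async for event in agent(state):
            yield event


def to_sse(event: TraceEvent) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    return f"event: {event.kind}\ndata: {json.dumps(asdict(event))}\n\n"


async def collect_frames(question: str) -> list[str]:
    """Drain the pipeline into a list of SSE frames (handy for tests)."""
    return [to_sse(e) async for e in run_pipeline(question, [decomposer, synthesizer])]
```

Because `run_pipeline` is itself an async generator, a web handler could map it through `to_sse` and stream the frames with a `text/event-stream` content type (for example via FastAPI's `StreamingResponse`), so the UI sees each agent's events as they happen rather than after the whole run finishes.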
Tech stack
- Python
- FastAPI
- React
- Anthropic
- LanceDB
- Multi-agent pipeline
What I learned
Citation accuracy is an architecture problem, not a prompt problem. Splitting the work across small, single-purpose agents and making each one's output structurally verifiable gets you further than one large prompt with stricter instructions.
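As an illustration of what "structurally verifiable" can mean for citations, here is a minimal sketch under stated assumptions. The `Claim` type, the function names, and the placeholder IDs are hypothetical and not taken from the project. The idea is to build the set of allowed citation IDs from what the retrievers returned, constrain the synthesizer's output schema to that set with an enum, and have a validator re-check every claim against the same set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A single synthesized statement and the source it cites (hypothetical shape)."""
    text: str
    citation: str


def citation_schema(allowed_ids: list[str]) -> dict:
    """Build a JSON Schema whose citation field only accepts retrieved IDs."""
    return {
        "type": "object",
        "properties": {
            "claims": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        # The enum is the constraint: an ID that was never
                        # retrieved is not a valid value for this field.
                        "citation": {"type": "string", "enum": sorted(allowed_ids)},
                    },
                    "required": ["text", "citation"],
                },
            }
        },
        "required": ["claims"],
    }


def invalid_claims(claims: list[Claim], allowed_ids: list[str]) -> list[Claim]:
    """Validator pass: return every claim whose citation was not retrieved."""
    allowed = set(allowed_ids)
    return [claim for claim in claims if claim.citation not in allowed]


# Placeholder IDs for illustration only; these are not real articles or rulings.
retrieved = ["article:example-1", "ECLI:NL:EXAMPLE:0000:1"]
draft = [
    Claim("Statement backed by a retrieved article.", "article:example-1"),
    Claim("Statement citing something never retrieved.", "ECLI:NL:MADEUP:9999:9"),
]
rejected = invalid_claims(draft, retrieved)
```

The schema narrows what the model is able to emit, and the validator catches anything that slips through. Neither check depends on how the prompt is worded, which is the architectural point.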