AI engineering · Portfolio

Jurist | RAG pipeline, with citations you can trust.

A multi-agent legal Q&A system over Dutch tenancy law that grounds every claim in real Civil Code articles and ECLI case citations.

Year: 2026
Role: Solo build
Status: Portfolio

Repo ↗

The problem

Legal AI hallucinates. It invents statute numbers, paraphrases rulings so they point the wrong way, and cites cases that don't exist. For tenancy questions (eviction, deposits, repairs), that's the kind of error that costs people money or homes. A generic LLM can't be trusted, and verifying every output by hand defeats the purpose.

What I built

01 Five sequential async-generator agents (decomposer, statute retriever, case retriever, synthesizer, validator) orchestrated without framework dependencies
02 A 218-node knowledge graph of the Dutch Civil Code traversed by a Sonnet agent for statute retrieval
03 Local bge-m3 embeddings over ~49K case-law chunks in LanceDB, with Haiku reranking the top candidates
04 Synthesis with enum-constrained citation tokens: every claim must reference a real article ID or ECLI
05 Streaming pipeline: typed TraceEvents flow over SSE into a React UI that animates the knowledge graph and tokens in real time
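
To make the orchestration concrete, here is a minimal sketch of how sequential async-generator agents can be chained without a framework and serialized as SSE frames. It is an illustration, not the project's actual code: the `TraceEvent` fields, `stub_agent`, `run_pipeline`, and `to_sse` are all assumed names, and the agents are stubs that make no LLM or database calls.

```python
import asyncio
import json
from dataclasses import asdict, dataclass, field
from typing import AsyncIterator, Callable


@dataclass
class TraceEvent:
    """Typed event emitted by an agent (field names are assumptions)."""
    agent: str
    kind: str
    payload: dict = field(default_factory=dict)


# An agent receives the shared state and yields events while it works.
Agent = Callable[[dict], AsyncIterator[TraceEvent]]

AGENT_NAMES = [
    "decomposer",
    "statute_retriever",
    "case_retriever",
    "synthesizer",
    "validator",
]


def stub_agent(name: str) -> Agent:
    """Hypothetical stand-in for a real agent; it only records its visit."""
    async def run(state: dict) -> AsyncIterator[TraceEvent]:
        yield TraceEvent(name, "start")
        await asyncio.sleep(0)  # placeholder for model or retrieval I/O
        state.setdefault("visited", []).append(name)
        yield TraceEvent(name, "done", {"step": len(state["visited"])})
    return run


async def run_pipeline(agents: list, state: dict) -> AsyncIterator[TraceEvent]:
    """Run agents strictly in order, re-yielding each event as it happens."""
    for agent in agents:
        async for event in agent(state):
            yield event


def to_sse(event: TraceEvent) -> str:
    """Format one event as a Server-Sent Events frame."""
    return f"event: {event.kind}\ndata: {json.dumps(asdict(event))}\n\n"


async def collect(question: str) -> list:
    state = {"question": question}
    agents = [stub_agent(name) for name in AGENT_NAMES]
    return [to_sse(event) async for event in run_pipeline(agents, state)]


frames = asyncio.run(collect("Can my landlord keep my deposit?"))
```

In a FastAPI service, a generator like `run_pipeline` could be wrapped in a `StreamingResponse` with media type `text/event-stream`, so the UI receives each frame as soon as the agent yields it.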

Tech stack

Python
FastAPI
React
Anthropic
LanceDB
Multi-agent pipeline

What I learned

Citation accuracy is an architecture problem, not a prompt problem. Splitting the work across small, single-purpose agents and making each one's output structurally verifiable gets you further than one large prompt with stricter instructions.
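
One way to read "structurally verifiable" is that a validator can check citations mechanically against the set of IDs the retrievers actually returned. The sketch below assumes a `[[ID]]` citation-token syntax and uses placeholder IDs; both are invented for illustration, as are the function names, and none of it is taken from the project.

```python
import re
from enum import Enum

# Assumed token syntax: citations appear inline as [[ID]].
CITE_PATTERN = re.compile(r"\[\[(.+?)\]\]")


def build_citation_enum(article_ids: list, eclis: list):
    """Build an Enum whose values are exactly the retrieved IDs."""
    members = {f"C{i}": value for i, value in enumerate([*article_ids, *eclis])}
    return Enum("Citation", members)


def validate_claims(claims: list, citation_enum) -> list:
    """Return (claim_index, problem) pairs; an empty list means all claims pass."""
    allowed = {member.value for member in citation_enum}
    problems = []
    for index, claim in enumerate(claims):
        cited = CITE_PATTERN.findall(claim)
        if not cited:
            problems.append((index, "no citation"))
        for citation in cited:
            if citation not in allowed:
                problems.append((index, f"unknown citation: {citation}"))
    return problems


# Placeholder IDs, not real references.
Citation = build_citation_enum(["BW-7:206"], ["ECLI:NL:XX:0000:0"])

claims = [
    "A claim about repairs [[BW-7:206]].",
    "A claim citing a case that was never retrieved [[ECLI:NL:XX:0000:1]].",
    "A claim with no citation at all.",
]

problems = validate_claims(claims, Citation)
```

Because the allowed set is derived from retrieval output rather than from the prompt, a fabricated ID fails the check no matter how plausible it looks.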