AI engineering · Portfolio

Jurist | RAG pipeline, with citations you can trust.

A multi-agent legal Q&A system over Dutch tenancy law that grounds every claim in real Civil Code articles and ECLI case citations.

Year: 2026
Role: Solo build
Status: Portfolio

The problem

Legal AI hallucinates. It invents statute numbers, paraphrases rulings until they point the wrong way, and cites cases that don't exist. For tenancy questions (eviction, deposits, repairs), that is the kind of error that costs people money or their homes. A generic LLM can't be trusted on its own, and verifying every output by hand defeats the purpose of asking one.

What I built

  1. Five sequential async-generator agents (decomposer, statute retriever, case retriever, synthesizer, validator), orchestrated without framework dependencies
  2. A 218-node knowledge graph of the Dutch Civil Code, traversed by a Sonnet agent for statute retrieval
  3. Local bge-m3 embeddings over ~49K case-law chunks in LanceDB, with Haiku reranking the top candidates
  4. Synthesis with enum-constrained citation tokens: every claim must reference a real article ID or ECLI
  5. Streaming pipeline: typed TraceEvents flow over SSE into a React UI that animates the knowledge graph and tokens in real time
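
The orchestration described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the TraceEvent fields, the stub_agent helper, the state keys, and the placeholder payloads are all assumptions, and each stub stands in for a real model or retrieval call.

```python
import asyncio
import json
from dataclasses import asdict, dataclass
from typing import Any, AsyncIterator, Callable, Dict, List


@dataclass(frozen=True)
class TraceEvent:
    """Typed event streamed to the UI (field names are hypothetical)."""
    agent: str
    kind: str  # "started" or "finished"
    payload: Any = None


def to_sse(event: TraceEvent) -> str:
    """Serialize one event as a Server-Sent Events frame."""
    return f"event: {event.kind}\ndata: {json.dumps(asdict(event))}\n\n"


Agent = Callable[[Dict[str, Any]], AsyncIterator[TraceEvent]]


def stub_agent(name: str, key: str, produce: Callable[[Dict[str, Any]], Any]) -> Agent:
    """Build a placeholder agent; a real one would call an LLM or a retriever."""
    async def agent(state: Dict[str, Any]) -> AsyncIterator[TraceEvent]:
        yield TraceEvent(name, "started")
        await asyncio.sleep(0)  # stands in for awaited I/O
        state[key] = produce(state)
        yield TraceEvent(name, "finished", state[key])
    return agent


# The five stages, run strictly in order; each reads and extends shared state.
PIPELINE: List[Agent] = [
    stub_agent("decomposer", "sub_questions", lambda s: [s["question"]]),
    stub_agent("statute_retriever", "articles", lambda s: ["ARTICLE_ID_PLACEHOLDER"]),
    stub_agent("case_retriever", "cases", lambda s: ["ECLI_PLACEHOLDER"]),
    stub_agent("synthesizer", "draft", lambda s: {"cited": s["articles"] + s["cases"]}),
    stub_agent("validator", "valid", lambda s: bool(s["draft"]["cited"])),
]


async def run_pipeline(question: str) -> AsyncIterator[TraceEvent]:
    """Chain the agents with plain async iteration, no framework required."""
    state: Dict[str, Any] = {"question": question}
    for agent in PIPELINE:
        async for event in agent(state):
            yield event


async def collect(question: str) -> List[TraceEvent]:
    """Drain the pipeline into a list (handy for tests)."""
    return [event async for event in run_pipeline(question)]
```

In a web endpoint, each event from run_pipeline would be passed through to_sse and written to a streaming response, so the UI can react as every agent starts and finishes.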

Tech stack

  • Python
  • FastAPI
  • React
  • Anthropic
  • LanceDB
  • Multi-agent pipeline

What I learned

Citation accuracy is an architecture problem, not a prompt problem. Splitting the work across small, single-purpose agents and making each one's output structurally verifiable gets you further than one large prompt with stricter instructions.
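
To make "structurally verifiable" concrete, here is a hedged sketch of enum-constrained citations. It assumes the synthesizer returns a JSON object with a list of claims, each carrying one citation ID; the schema shape, the function names, and the example IDs are illustrative assumptions rather than the project's real format.

```python
from typing import Any, Dict, List, Set


def citation_schema(allowed_ids: Set[str]) -> Dict[str, Any]:
    """Build a JSON Schema whose citation field only admits retrieved IDs."""
    return {
        "type": "object",
        "properties": {
            "claims": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        # The enum is rebuilt per question from what was retrieved.
                        "citation": {"type": "string", "enum": sorted(allowed_ids)},
                    },
                    "required": ["text", "citation"],
                },
            }
        },
        "required": ["claims"],
    }


def invalid_claims(answer: Dict[str, Any], allowed_ids: Set[str]) -> List[Dict[str, Any]]:
    """Return every claim whose citation is missing or not in the allowed set."""
    return [
        claim
        for claim in answer.get("claims", [])
        if claim.get("citation") not in allowed_ids
    ]


# Hypothetical IDs standing in for retrieved article IDs and ECLI numbers.
allowed = {"ARTICLE_A", "ECLI_B"}
answer = {
    "claims": [
        {"text": "First claim.", "citation": "ARTICLE_A"},
        {"text": "Second claim.", "citation": "ECLI_MADE_UP"},
    ]
}
rejected = invalid_claims(answer, allowed)
```

The schema narrows what the model is allowed to emit, and the validator checks the same constraint again after the fact, so a fabricated citation fails a set-membership test instead of relying on prompt wording.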