DDD + Category Theory for Healthcare

Healthcare provider directories suffer from well-documented data quality issues. Studies report inaccuracy rates of 40% or more in provider data (wrong addresses, stale credentials, phantom networks), leading to denied claims, patient frustration, and regulatory exposure.

A key contributing factor is that provider data is scattered across multiple bounded contexts (EHR systems, credentialing databases, contracting platforms, public directories) with limited tooling for principled merging, synchronization, or querying.

This project explores one possible approach: applying category theory as a formal foundation for reasoning about these integration challenges. It is not the only way to tackle these problems, and it comes with its own trade-offs (see Limitations), but it offers structural guarantees that ad-hoc approaches typically lack.

Five Structural Results

We show that several well-known categorical constructions map naturally to concrete infrastructure problems in this domain:

#	Problem	Categorical Tool	Module
1	Entity Resolution — merging partial, overlapping records into a single golden record	Colimit in	`fragment.ts`
2	CRDT Merge — reconciling concurrent updates without coordination	Join in a semilattice (not a colimit in )	`crdt.ts`, `semilattice.ts`
3	Schema Translation — safely moving data between different schemas	Adjoint triple	`schema.ts`
4	Event Sourcing — reconstructing state at any point in time	Presheaf over a time poset	`temporal.ts`, `snapshot.ts`
5	Consistency (Sheaf Condition) — guaranteeing convergence across replicas	Sheaf gluing axiom	Verified via chaos tests
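
To make the CRDT Merge row more concrete, here is a minimal sketch of a join on a semilattice of provider records. It is an illustration under stated assumptions, not the API of `crdt.ts` or `semilattice.ts`: the names `JoinSemilattice`, `Lww`, `ProviderRecord`, `lwwJoin`, and `recordJoin` are hypothetical, the sample data is invented, and the sketch uses plain TypeScript rather than fp-ts. It also assumes that each write carries a unique (timestamp, replica) pair, which is what makes the join commutative.

```typescript
// Hypothetical interface for illustration; the project's own types may differ.
interface JoinSemilattice<A> {
  join: (x: A, y: A) => A;
}

// A last-writer-wins register for a single field.
interface Lww<T> {
  value: T;
  timestamp: number;
  replica: string;
}

// Join keeps the entry with the larger timestamp and breaks ties by replica id.
// Assumption: (timestamp, replica) identifies a write uniquely, so two entries
// that tie on both are the same write and the choice between them is harmless.
const lwwJoin = <T>(): JoinSemilattice<Lww<T>> => ({
  join: (x, y) => {
    if (x.timestamp !== y.timestamp) return x.timestamp > y.timestamp ? x : y;
    return x.replica >= y.replica ? x : y;
  },
});

// A provider record is a map from field name to an LWW register.
type ProviderRecord = Record<string, Lww<string>>;

const lww = lwwJoin<string>();

// Field-wise join: fields present on one side are kept, shared fields are joined.
const recordJoin: JoinSemilattice<ProviderRecord> = {
  join: (a, b) => {
    const out: ProviderRecord = { ...a };
    for (const [field, entry] of Object.entries(b)) {
      const existing = out[field];
      out[field] = existing === undefined ? entry : lww.join(existing, entry);
    }
    return out;
  },
};

// Invented sample data: two replicas hold overlapping, partly stale views.
const fromEhr: ProviderRecord = {
  address: { value: "12 Elm St", timestamp: 100, replica: "ehr" },
  phone: { value: "555-0100", timestamp: 90, replica: "ehr" },
};
const fromCredentialing: ProviderRecord = {
  address: { value: "40 Oak Ave", timestamp: 120, replica: "cred" },
  license: { value: "MD-12345", timestamp: 80, replica: "cred" },
};

// The merged record takes the newer address and keeps both one-sided fields.
const merged = recordJoin.join(fromEhr, fromCredentialing);
```

Because the join is idempotent, commutative, and associative under the uniqueness assumption above, replicas can apply merges in any order, and more than once, and still arrive at the same record. That is the property the table refers to as reconciling concurrent updates without coordination.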

Project Components

packages/implementation — TypeScript + fp-ts library implementing all five results with full test coverage
packages/pre-print — LaTeX manuscript with formal proofs and categorical diagrams
packages/docs — This documentation site

Limitations

Learning curve — Category theory introduces unfamiliar abstractions; teams without prior exposure will need ramp-up time.
Scope — The five results address structural integration problems (merging, translation, temporal consistency). They do not cover data entry errors at the source, organizational process failures, or incentive misalignment.
Validation — The implementation is a proof-of-concept project, not a production-hardened system. Real-world adoption would require significant engineering beyond what is shown here.
Alternatives exist — Master Data Management (MDM) platforms, probabilistic record linkage, and event-driven architectures address overlapping concerns with different trade-offs.

Quick Links

Installation — Get up and running
Project Structure — Navigate the monorepo
DDD Primer — Domain-Driven Design in context
Category Theory Primer — Accessible definitions
API Reference — Module-by-module exports
Worked Example — The Dr. Jane Doe walkthrough
