Founder-led builds. You talk to the person writing the code.
Super Genius Labs is one engineer + the agent stack he wrote. Discovery → POV → ship is one conversation, not a relay race.
Methodology
Discover
Two-week paid POV. We learn the workflow + the regulatory ceiling.
e.g. Genius Care BAA + receptionist clinic profileSpec
Acceptance criteria written down. No prototype until they're locked.
e.g. agent_demo.py turn-by-turn evalsPrototype
Smallest agent that can do the wedge. Evals fail before features land.
e.g. LiveKit voice receptionist v0Harden
Production failure modes. PHI minimization. Fallback to human.
e.g. mic-prompt timeout → phone handoffShip
Live in clinic. Telemetry. Weekly review. You own the code.
e.g. genius-care in productionValues
The first version is the smallest agent that creates one undeniable outcome — not a platform.
Every prompt change runs against a fixed eval set before it ships. Regressions block merge.
When an EMR has no API tier we can sign for, the agent operates the browser. Coverage > coverage rationalization.
PHI minimization, audit trail, BAAs — chosen before architecture, not retrofitted.
