PermitPulse stakes its product on classification quality. This page is generated from /api/accuracy.json and refreshed wkly. See methodology.
No new classifier prompt ships unless every metric passes on both eval sets.
Ground-truth eval sets are locked and append-only. No prompt change is allowed to mutate prior versions. CI gates on the highest version. Multi-rater protocol applies v2+: two independent LLM judges (Anthropic + OpenAI), human button-push only on disagreement. Full protocol.