Beyond Cost-Quality: Privacy-Aware Routing for Local-to-Cloud LLM Escalation
A five-class architecture for classifying query sensitivity and routing to an admissible model set. PRISM is a related but distinct problem.
Thesis: Every query has a sensitivity profile. The right architecture classifies first, routes to an admissible model set second, and audits the routing decision for every call.
Working paper. Full text pending.
Last updated: February 15, 2026