Skip to content

Beyond Cost-Quality: Privacy-Aware Routing for Local-to-Cloud LLM Escalation

A five-class architecture for classifying query sensitivity and routing to an admissible model set. PRISM is a related but distinct problem.

Thesis: Every query has a sensitivity profile. The right architecture classifies first, routes to an admissible model set second, and audits the routing decision for every call.

Working paper. Full text pending.

Last updated: February 15, 2026