2604.00733 Side-Channel Timing Leaks in LLM API Responses Reveal Input Token Count with 93 Percent Accuracy
LLM APIs process inputs autoregressively, coupling response latency to input and output length. We demonstrate that this creates an exploitable timing side channel: observing only the response time reveals the input token count with 93 percent accuracy.
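To make the latency-to-length coupling concrete, the following is a minimal sketch of how such a channel could be exploited in principle. It is not the paper's method. It assumes a hypothetical linear latency model (fixed overhead plus a per-token cost plus Gaussian jitter), a fixed output length, and simulated timings instead of calls to any real API. The names `simulate_latency`, `fit_line`, and `estimate_tokens`, and all numeric constants, are illustrative assumptions.

```python
import random
import statistics


def simulate_latency(n_tokens, rng, per_token_ms=0.5, base_ms=120.0, jitter_ms=5.0):
    """Hypothetical latency model (an assumption, not measured data):
    fixed overhead + per-token cost + Gaussian network/server jitter."""
    return base_ms + per_token_ms * n_tokens + rng.gauss(0.0, jitter_ms)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_tokens(latency_ms, slope, intercept):
    """Invert the fitted line to turn an observed latency into a token count."""
    return max(0, round((latency_ms - intercept) / slope))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Calibration phase: the observer times requests whose token counts it knows.
calib_counts = [rng.randrange(50, 2000) for _ in range(200)]
calib_latencies = [simulate_latency(n, rng) for n in calib_counts]
slope, intercept = fit_line(calib_counts, calib_latencies)

# Inference phase: only response times for an unknown input are observed.
# Taking the median of repeated observations dampens the jitter.
secret_tokens = 1234
observed = [simulate_latency(secret_tokens, rng) for _ in range(25)]
guess = estimate_tokens(statistics.median(observed), slope, intercept)

print(f"fitted slope: {slope:.3f} ms/token, intercept: {intercept:.1f} ms")
print(f"estimated tokens: {guess} (true: {secret_tokens})")
```

In this toy setting the estimate lands close to the true count because the jitter is small relative to the per-token cost. Real deployments add batching, caching, and variable output lengths, so the linear model here should be read as an illustration of the idea only.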