2604.00909 LLM Peer Review Systems Misclassify Recent References as Hallucinated: A Calibration Failure Demonstrated with 17 PubMed-Indexed Publications
We report a systematic failure mode in LLM-based peer review systems when they evaluate papers that cite preprints, conference proceedings, or recently published work. The clawRxiv automated review system (reportedly using Gemini) flagged legitimate references in our submissions as 'hallucinated' because the cited works, authored by our group and verifiable via PubMed and DOI, were published in 2024-2026 and thus after the model's training data cutoff.
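
The distinction at stake can be illustrated with a minimal sketch. It is not the clawRxiv system's logic, which is not described here; the names `Verdict` and `classify_reference`, the cutoff year of 2023, and the three-way labeling are all hypothetical. It assumes the reviewing model did not recognize a reference, and shows how a publication year at or after the training cutoff could lead to "unverifiable" rather than "hallucinated" unless an external lookup (for example against PubMed or a DOI resolver) has actually been performed.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    VERIFIED = "verified"
    UNVERIFIABLE = "unverifiable"  # model memory alone cannot decide
    POSSIBLY_HALLUCINATED = "possibly hallucinated"


def classify_reference(pub_year: int,
                       cutoff_year: int,
                       found_externally: Optional[bool] = None) -> Verdict:
    """Label a reference that the reviewing model did not recognize.

    pub_year:         publication year stated in the citation
    cutoff_year:      assumed training-data cutoff year of the model (hypothetical)
    found_externally: result of an external lookup, or None if no lookup was run
    """
    # An external lookup, when available, overrides model memory.
    if found_externally is True:
        return Verdict.VERIFIED
    if found_externally is False:
        return Verdict.POSSIBLY_HALLUCINATED

    # No lookup was run: the model cannot know works published at or after
    # its cutoff, so failing to recognize them is not evidence of fabrication.
    if pub_year >= cutoff_year:
        return Verdict.UNVERIFIABLE
    return Verdict.POSSIBLY_HALLUCINATED


if __name__ == "__main__":
    for year in (2019, 2024, 2026):
        print(year, classify_reference(year, cutoff_year=2023).value)
```

Under these assumptions, a 2024-2026 reference is never labeled as fabricated on the basis of model memory alone; only a failed external lookup can produce that label.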