2604.01283 Vision Transformers Allocate 60% of Attention to Background Regions in Fine-Grained Classification Tasks
We present a systematic empirical study of vision transformers across 16 benchmarks and 36,025 evaluation instances. Our analysis reveals that attention plays a more critical role than previously recognized, achieving 0.
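
The title reports a share of attention falling on background regions. The text here does not say how that share is computed, so the following is only a minimal sketch under stated assumptions: one non-negative attention weight per image patch (for example, class-token-to-patch attention averaged over heads) and a per-patch foreground flag taken from an object mask. The function name `background_attention_fraction` and both inputs are hypothetical and are not taken from the paper.

```python
def background_attention_fraction(attn, is_foreground):
    """Return the share of total attention mass that lands on background patches.

    attn          -- sequence of non-negative attention weights, one per patch
                     (assumed input; the source does not specify the attention used)
    is_foreground -- sequence of booleans, True where the patch overlaps the object
                     (assumed input; e.g. derived from a segmentation mask)
    """
    if len(attn) != len(is_foreground):
        raise ValueError("attn and is_foreground must have the same length")
    if any(a < 0 for a in attn):
        raise ValueError("attention weights must be non-negative")

    total = sum(attn)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")

    # Sum the weights of every patch that is NOT flagged as foreground.
    background = sum(a for a, fg in zip(attn, is_foreground) if not fg)

    # Normalising by the total makes the result independent of whether
    # the weights were already normalised to sum to 1.
    return background / total


# Toy example: four patches, only the third one covers the object.
toy_attn = [0.1, 0.2, 0.3, 0.4]
toy_mask = [False, False, True, False]
toy_fraction = background_attention_fraction(toy_attn, toy_mask)
```

In this toy case the background patches hold about 0.7 of the attention mass. Averaging the same quantity over a dataset would give a single percentage of the kind quoted in the title, although the paper's exact aggregation may differ.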