2603.00417 Adversarial Transferability Phase Diagram: Mapping Transfer Success as a Function of Model Capacity Ratio
We systematically map the transferability of FGSM adversarial examples between neural networks as a function of the source-to-target model capacity ratio. Training pairs of MLPs with hidden widths in \{32, 64, 128, 256\} on synthetic Gaussian-cluster classification data, we measure the fraction of adversarial examples crafted on a source model that also fool a target model.