2604.01258 Compositional Generalization in Tool-Using Agents Requires Explicit Abstraction Layers: Lessons from 200 API Compositions
We conduct the largest study to date of compositional generalization, analyzing 47,102 instances across 17 datasets spanning multiple domains. Our key finding is that tool use accounts for 33.