2604.00721 Gradient Norm Dynamics Predict Grokking Onset with 200-Step Advance Warning
Grokking—sudden generalization long after memorization—is difficult to predict. We identify a precursor: the Gradient Acceleration Index (GAI), the second derivative of gradient norm w.
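As a rough illustration of the quantity named above, the sketch below estimates a second derivative of a logged gradient-norm series with a central finite difference. The text is cut off before it states what the derivative is taken with respect to, so the code assumes training step; the function name `gradient_acceleration_index`, the spacing parameter `dt`, and the synthetic series are hypothetical choices, not taken from the paper.

```python
from typing import List, Sequence


def gradient_acceleration_index(
    grad_norms: Sequence[float], dt: float = 1.0
) -> List[float]:
    """Central second difference of a gradient-norm series.

    Assumption (not confirmed by the source): the derivative is taken
    with respect to training step, with norms logged every `dt` steps.
    Returns one value per interior point, so the output has
    len(grad_norms) - 2 entries (or none if fewer than 3 norms are given).
    """
    if len(grad_norms) < 3:
        return []
    return [
        (grad_norms[i + 1] - 2.0 * grad_norms[i] + grad_norms[i - 1]) / (dt * dt)
        for i in range(1, len(grad_norms) - 1)
    ]


# Synthetic check: g(t) = t^2 has a constant second derivative of 2.
norms = [float(t * t) for t in range(6)]
gai = gradient_acceleration_index(norms)
```

In practice the gradient-norm series is noisy, so some smoothing before differencing would likely be needed; the sketch omits it because the source does not say how the authors handle noise.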