Activity–weight duality in feed-forward neural networks reveals two co-determinants for generalization. Yu Feng, Wei Zhang, et al. Nature Machine Intelligence, 2023.
Phases of learning dynamics in artificial neural networks in the absence or presence of mislabeled data. Yu Feng, Yuhai Tu. Machine Learning: Science and Technology, 2021.
The inverse variance-flatness relation in stochastic gradient descent is critical for finding flat minima. Yu Feng, Yuhai Tu. PNAS, 2021.