Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining. Daouda A. Sow, Herbert Woisetschläger, et al. ICLR 2025.
Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis. Hongru Yang, Yingbin Liang, et al. JMLR, 2025.