CharED: Character-wise Ensemble Decoding for Large Language Models. Kevin Gu, Eva Tuecke, et al. 2024. ICML 2024.
Humans Linguistically Align to their Conversational Partners, and Language Models Should Too. Rachel Ostrand, Sara Berger. 2024. ICML 2024.
How Do Nonlinear Transformers Acquire Generalization-Guaranteed CoT Ability? Hongkang Li, Meng Wang, et al. 2024. ICML 2024.