Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models. Chia-yi Hsu, Yu-Lin Tsai, et al. NeurIPS 2024.
Score Distillation via Reparametrized DDIM. Artem Lukoianov, Haitz Saez De Ocariz Borde, et al. NeurIPS 2024.
Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training. Kristjan Greenewald, Yuancheng Yu, et al. NeurIPS 2024.
Distributional Preference Alignment of LLMs via Optimal Transport. Igor Melnyk, Youssef Mroueh, et al. NeurIPS 2024.
Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference. Jiabao Ji, Yujian Liu, et al. NeurIPS 2024.
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents. Ivoline Ngong, Swanand Ravindra Kadhe, et al. NeurIPS 2024.
Towards Using Large Language Models and Deep Reinforcement Learning for Inertial Fusion Energy. Vadim Elisseev, Max Esposito, et al. NeurIPS 2024.
Interleaving Text and Number Embeddings to Solve Mathematics Problems. Marvin Alberts, Gianmarco Gabrieli, et al. NeurIPS 2024.
Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs. Megh Thakkar, Yash More, et al. NeurIPS 2024.
Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications. Bo Wen, Xin Zhang. NeurIPS 2024.