Weakly Supervised Detection of Hallucinations in LLM Activations. Miriam Rateike, Celia Cintas, et al. NeurIPS 2023.
TRAD: Task-agnostic Representation of the Activation Space in Deep Neural Networks. Tanya Leah Akumu, Celia Cintas, et al. IJCAI 2023.
Detecting Systematic Deviations in Data and Models. Skyler Speakman, Girmaw Abebe Tadesse, et al. IEEE Computer, 2023.