Linear Upper Confident Bound with Missing Reward: Online Learning with Less Data. Djallel Bouneffouf, Sohini Upadhyay, et al. IJCNN 2022.
Double-linear Thompson sampling for context-attentive bandits. Djallel Bouneffouf, Raphael Feraud, et al. ICASSP 2021.