Contextual bandit algorithms with supervised learning guarantees. Alina Beygelzimer, John Langford, et al. 2011. JMLR.