Jorge Luis Guevara Diaz, Maria Julia De Castro Villafranca Garcia, et al.
AGU Fall 2022
We prove a quantitative connection between the expected sum of rewards of a policy and binary classification performance on created subproblems. This connection holds without any unobservable assumptions (no assumption of independence, small mixing time, fully observable states, or even hidden states), and the resulting statement is independent of the number of states or actions. The statement depends critically on the size of the rewards and on the prediction performance of the created classifiers. We also provide general guidelines for obtaining good classification performance on the created subproblems. In particular, we discuss possible methods for generating training examples for a classifier learning algorithm.
Mohamed Akram Zaytar, Bianca Zadrozny, et al.
EGU 2022
Edwin Pednault, Naoki Abe, et al.
KDD 2002
Tadeusz Pietraszek
ICML 2005