We consider online learning in Markov decision processes with adversarial reward functions. Depending on the information available to the decision maker, we analyze two scenarios:...
We consider online learning in finite stochastic Markovian environments where in each time step a new reward function is chosen by an oblivious adversary. The goal of the learning...
We propose a version of the follow-the-perturbed-leader online prediction algorithm in which the cumulative losses are perturbed by independent symmetric random walks. The forecaster...
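The perturbation scheme in the last abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fpl_random_walk`, the step size of ±1/2, the full-information feedback, and the `seed` parameter are all assumptions made for illustration. In each round, every expert's random walk takes one symmetric step, and the forecaster follows the expert whose perturbed cumulative loss is smallest.

```python
import random


def fpl_random_walk(losses, seed=0):
    """Follow-the-perturbed-leader with random-walk perturbations (sketch).

    losses: list of rounds, each a list with one loss per expert.
    Returns the index of the expert chosen in each round.
    The +/- 1/2 step size is an assumption, not taken from the abstract.
    """
    rng = random.Random(seed)
    n_experts = len(losses[0])
    cum_loss = [0.0] * n_experts  # cumulative loss observed so far
    walk = [0.0] * n_experts      # one independent random walk per expert
    picks = []
    for round_losses in losses:
        # Advance each expert's symmetric random walk by one step.
        for i in range(n_experts):
            walk[i] += rng.choice((-0.5, 0.5))
        # Follow the leader on the perturbed cumulative losses.
        leader = min(range(n_experts), key=lambda i: cum_loss[i] + walk[i])
        picks.append(leader)
        # Full-information feedback: all losses are revealed after the choice.
        for i in range(n_experts):
            cum_loss[i] += round_losses[i]
    return picks
```

For example, `fpl_random_walk([[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])` returns one expert index per round. Because the perturbation is a running sum rather than a fresh draw each round, consecutive perturbed scores differ by only one small step, so the leader tends to change less often than with independently redrawn noise.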