
Interactively Optimizing Information Retrieval Systems as a Dueling Bandits Problem

We present an online learning framework tailored to real-time learning from observed user behavior in search engines and other information retrieval systems. In particular, the framework requires only pairwise comparisons, which have been shown to be reliably inferred from implicit feedback. We give an algorithm with theoretical guarantees and report simulation results.
