Discrete Choice Multi-Armed Bandits

Kavli Affiliate: David Muller

| First 5 Authors: Emerson Melo, David Müller

| Summary:

This paper establishes a connection between a class of discrete choice
models and online learning and multi-armed bandit algorithms. Our
contributions are twofold. First, we provide sublinear regret bounds for a
broad family of algorithms that includes the Exp3 algorithm as a special
case. Second, we introduce a novel family of adversarial multi-armed bandit
algorithms inspired by the generalized nested logit models of Wen and
Koppelman (2001). These algorithms give users extensive flexibility to
fine-tune the model and can be implemented efficiently because their
sampling probabilities are available in closed form. To demonstrate the
practical implementation of our algorithms, we present numerical
experiments, focusing on the stochastic bandit case.
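The summary notes that the Exp3 algorithm arises as a special case of the family of algorithms analyzed in the paper. As a point of reference, below is a minimal sketch of the standard Exp3 update (exponential weights with uniform exploration and importance-weighted reward estimates), not the paper's generalized nested logit construction; the function name, `reward_fn`, `gamma`, and the toy Bernoulli environment are illustrative assumptions.

```python
import numpy as np

def exp3(n_arms, horizon, reward_fn, gamma=0.1, rng=None):
    """Minimal Exp3 sketch: exponential weights with uniform exploration
    and importance-weighted reward estimates (rewards assumed in [0, 1])."""
    rng = rng or np.random.default_rng()
    weights = np.ones(n_arms)
    for t in range(horizon):
        # Mix the weight-based distribution with uniform exploration.
        probs = (1 - gamma) * weights / weights.sum() + gamma / n_arms
        arm = rng.choice(n_arms, p=probs)
        reward = reward_fn(arm, t)           # environment/adversary feedback
        estimate = reward / probs[arm]       # unbiased importance-weighted estimate
        weights[arm] *= np.exp(gamma * estimate / n_arms)
        weights /= weights.max()             # rescale for numerical stability
    return weights / weights.sum()

# Example usage with a toy stochastic environment (Bernoulli arms).
if __name__ == "__main__":
    means = np.array([0.2, 0.5, 0.8])
    env_rng = np.random.default_rng(0)
    final = exp3(3, 5000, lambda a, t: env_rng.binomial(1, means[a]), gamma=0.05)
    print(final)  # probability mass should concentrate on the best arm (index 2)
```

The paper's contribution generalizes this exponential-weights sampling step: the GNL-based algorithms replace the softmax-style distribution with richer closed-form sampling probabilities while retaining sublinear regret guarantees.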

| Search Query: ArXiv Query: search_query=au:"David Muller"&id_list=&start=0&max_results=3
