Bandit Optimisation with Large Strategy Sets: Theory and Applications
Seminar Room 1, Newton Institute
Abstract
In this talk, we report recent results on bandit optimisation problems with large strategy (or decision) sets. These problems arise naturally in many contemporary applications in communication networks, e-commerce, and recommendation systems. We address both the stochastic and the adversarial settings, which differ in how the rewards obtained under the various strategies are generated. We derive lower bounds on regret, which give fundamental performance limits that no online algorithm can beat, and develop algorithms that approach these limits. The results are applied to resource allocation in wireless networks and to recommendation systems.