Constant Regret, Generalized Mixability, and Mirror Descent

Abstract

We consider the setting of prediction with expert advice, in which a learner makes predictions by aggregating those of a group of experts. In this setting, and for the right choice of loss function and mixing algorithm, the learner can achieve a constant regret regardless of the number of prediction rounds. For example, a constant regret is achievable for mixable losses using the aggregating algorithm. The Generalized Aggregating Algorithm (GAA) is a family of algorithms parameterized by convex functions on the simplex (entropies), which reduces to the aggregating algorithm when the Shannon entropy $\mathrm{S}$ is used. For a given entropy $\Phi$, losses for which a constant regret is achievable using the GAA are called $\Phi$-mixable. Which losses are $\Phi$-mixable was previously left as an open question. We fully characterize $\Phi$-mixability and answer other open questions posed by Reid et al. (2015). We show that the Shannon entropy $\mathrm{S}$ is fundamental to mixability: any $\Phi$-mixable loss is necessarily $\mathrm{S}$-mixable, and the lowest worst-case regret of the GAA is achieved with the Shannon entropy. Finally, by leveraging the connection between the mirror descent algorithm and the update step of the GAA, we propose a new adaptive generalized aggregating algorithm and analyze its regret bound.
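
As a rough illustration of how a constant regret arises for a mixable loss, the following is a minimal Python sketch (not from the paper) of the classical aggregating algorithm for the binary log loss, which is mixable with learning rate $\eta = 1$: for every weighting $q$ over the experts' predictions $a_1, \dots, a_N$ there is a single prediction $a^\ast$ with $\ell(a^\ast, y) \le -\eta^{-1} \log \sum_i q_i e^{-\eta \ell(a_i, y)}$ for all outcomes $y$, which gives a regret of at most $\ln N$ regardless of the number of rounds $T$. The function names and simulated data below are illustrative assumptions.

```python
# A minimal sketch of the classical aggregating algorithm for the binary log
# loss, which is mixable with learning rate eta = 1. All names and the
# simulated data are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

def log_loss(p, y):
    """Log loss of a probability forecast p in (0, 1) for a binary outcome y."""
    return -np.log(p) if y == 1 else -np.log(1.0 - p)

def aggregating_algorithm(expert_preds, outcomes, eta=1.0):
    """Aggregating algorithm with exponential-weights updates.

    expert_preds: (T, N) array of expert probability forecasts for y = 1.
    outcomes:     (T,) array of binary outcomes in {0, 1}.
    Returns the learner's cumulative loss and each expert's cumulative loss.
    """
    T, N = expert_preds.shape
    log_w = np.zeros(N)          # uniform prior over the N experts
    learner_loss, expert_loss = 0.0, np.zeros(N)
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        w /= w.sum()             # normalized weights
        # For the log loss with eta = 1, the substitution step reduces to
        # predicting with the weighted average of the experts' forecasts.
        p = float(w @ expert_preds[t])
        learner_loss += log_loss(p, outcomes[t])
        losses = np.array([log_loss(q, outcomes[t]) for q in expert_preds[t]])
        expert_loss += losses
        log_w -= eta * losses    # exponential-weights (Bayesian) update
    return learner_loss, expert_loss

# Illustrative run: the learner's regret stays below ln(N) no matter how large T is.
T, N = 2000, 5
expert_preds = rng.uniform(0.05, 0.95, size=(T, N))
outcomes = rng.integers(0, 2, size=T)
learner, experts = aggregating_algorithm(expert_preds, outcomes)
print(f"regret = {learner - experts.min():.4f}  <=  ln(N) = {np.log(N):.4f}")
```

The GAA described in the abstract replaces this exponential-weights (Shannon-entropy) update with a mirror-descent-style update induced by a general entropy $\Phi$; the paper's results show that the Shannon entropy remains the best choice in terms of worst-case regret.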

Publication
Advances in Neural Information Processing Systems 31
Zak Mhammedi