<h1>How do we make stochastic bandits fair?</h1>

<p>Fairness is a key requirement in many decision-making settings. This work focuses on achieving fairness in online learning algorithms. In particular, we study a variant of the stochastic multi-armed bandit (MAB) problem that we call &#8220;Fair-MAB,&#8221; in which, in addition to minimizing expected regret, the algorithm must ensure that each arm is pulled for at least a pre-specified fraction of the time-steps. The performance of the algorithm is evaluated on both its regret and its fairness guarantee. Our algorithm achieves O(log T) regret. Remarkably, the proposed class of algorithms also provides a strong deterministic fairness guarantee that holds uniformly over the time horizon. Both the regret and the fairness guarantees of our algorithm improve upon the best-known results in the literature.</p>

<p><strong>Reference:</strong></p>

<p>Vishakha Patil, Ganesh Ghalme, Vineet Nair, Y. Narahari. Achieving Fairness in the Stochastic Multi-armed Bandit Problem. <strong>Journal of Machine Learning Research (JMLR)</strong>, Volume 22, 2021, pp. 1-21.</p>

<p>The above paper is an expanded and enhanced version of the following paper, which contained the preliminary results:</p>

<p>Vishakha Patil, Ganesh Ghalme, Vineet Nair, Y. Narahari. Achieving Fairness in the Stochastic Multi-armed Bandit Problem. <strong>AAAI-2020</strong>, pp. 5379-5386.</p>
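<p>To make the fairness constraint concrete, here is a minimal, hypothetical Python sketch of a quota-plus-UCB1 bandit loop in the spirit of the Fair-MAB setting. It is not the paper's exact algorithm: the arm names, the priority rule, and the use of plain UCB1 are illustrative assumptions. Any arm whose pull count has fallen below its pre-specified fraction of the elapsed time-steps is played first; otherwise a standard UCB1 index picks the arm.</p>

```python
import math
import random

def fair_ucb(means, fractions, horizon, seed=0):
    """Illustrative fairness-constrained bandit loop (not the paper's exact
    algorithm). `fractions[i]` is the minimum fraction of time-steps arm i
    should receive; the fractions must sum to at most 1 so that slack
    remains for regret minimization. Returns the per-arm pull counts."""
    rng = random.Random(seed)          # simulated Bernoulli rewards
    k = len(means)
    pulls = [0] * k
    rewards = [0.0] * k
    for t in range(1, horizon + 1):
        # Arms whose empirical pull fraction lags their quota get priority;
        # among them, serve the one with the largest deficit.
        starved = [i for i in range(k) if pulls[i] < fractions[i] * t]
        if starved:
            arm = min(starved, key=lambda i: pulls[i] - fractions[i] * t)
        elif 0 in pulls:
            arm = pulls.index(0)       # play every arm once before UCB
        else:
            # Standard UCB1 index: empirical mean + exploration bonus.
            arm = max(range(k), key=lambda i: rewards[i] / pulls[i]
                      + math.sqrt(2 * math.log(t) / pulls[i]))
        pulls[arm] += 1
        rewards[arm] += 1.0 if rng.random() < means[arm] else 0.0
    return pulls
```

<p>Because the quota check runs at every step, each arm's pull count can lag its quota by at most a constant, which is the flavor of a deterministic fairness guarantee that holds uniformly over the horizon; the leftover steps go to the UCB1 choice, which concentrates on the best arm.</p>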