## Figures and Tables from this paper

- Figure 1
- Table 1
- Table 2
- Figure 4

## Topics

- Gang Of Bandits
- Linear Bandit Algorithm
- Bandit Algorithms
- Recommendation Systems
- Contextual Bandits
- Network Structure
- Clusters
- Real-world Datasets

## 151 Citations

- Qingyun Wu, Huazheng Wang, Quanquan Gu, Hongning Wang
- 2016

Computer Science

SIGIR

This paper develops a collaborative contextual bandit algorithm in which the adjacency graph among users is leveraged to share context and payoffs among neighboring users during online updates, and rigorously proves an improved upper regret bound.

- 105
- Highly Influenced

- Meng Fang, D. Tao
- 2014

Computer Science, Mathematics

KDD

This paper formalizes the networked bandit problem and proposes an algorithm that considers not only the selected arm, but also the relationships between arms, in that it decides an arm depending on integrated confidence sets constructed from historical data.

- 27

- Bryce BernDawson D'almeidaWill KnospePaul Reich
- 2020

Computer Science

This paper explores the process of replicating the experiments and results from "A Gang of Bandits" in order to validate that work, which formalizes recommendation-system questions as a multi-armed bandit problem.

- Highly Influenced

- Xiaotong Cheng, Cheng Pan, S. Maghsudi
- 2023

Computer Science

ICML

This work proposes CLUB-HG, a novel algorithm that integrates a game-theoretic approach into clustering inference and discovers the underlying user clusters in contextual bandit algorithms.

- Sharan Vaswani
- 2018

Computer Science, Mathematics

This thesis goes beyond the well-studied multi-armed bandit model to consider structured bandit settings and their applications, and proposes a bootstrapping approach and establishes theoretical regret bounds for it.

- Highly Influenced

- Liu Yang, Bo Liu, Leyu Lin, Feng Xia, Kai Chen, Qiang Yang
- 2020

Computer Science

RecSys

The proposed ClexB policy for online recommender systems explores knowledge transfer, further aiding inference about user interests and estimating user clusters more accurately and with less uncertainty via explorable clustering.

- 16

- M. Herbster, Stephen Pasteris, Fabio Vitale, M. Pontil
- 2021

Computer Science

NeurIPS

Two learning algorithms are presented, GABA-I and GABA-II, which exploit the network structure to bias towards functions of low Ψ values and highlight improvements of both algorithms over running independent standard MABs across users.

- 9

- A. Carpentier, Michal Valko
- 2016

Computer Science

AISTATS

BARE is proposed, a bandit strategy for which a regret guarantee is proved that scales with the detectable dimension, a problem dependent quantity that is often much smaller than the number of nodes.

- 38

- A. Ghosh, Abishek Sankararaman, K. Ramchandran
- 2022

Computer Science

ECML/PKDD

This paper seeks to theoretically understand the problem of minimizing regret in an N-user heterogeneous stochastic linear bandits framework.


- Trong-The Nguyen, Hady W. Lauw
- 2014

Computer Science

CIKM

This work proposes an algorithm that divides the population of users into multiple clusters and customizes the bandits to each cluster; the clustering is dynamic, i.e., users can switch from one cluster to another as their preferences change.

- 52
- Highly Influenced

...

...

## 25 References

- Balázs Szörényi, R. Busa-Fekete, István Hegedüs, Róbert Ormándi, Márk Jelasity, B. Kégl
- 2013

Computer Science

ICML

This work shows that, for iteration t = Ω(log N), the probability of playing a suboptimal arm at a peer is proportional to 1/(Nt), where N denotes the number of peers participating in the network.

- 102

- Aleksandrs Slivkins
- 2011

Mathematics, Computer Science

COLT

This work considers similarity information in the setting of contextual bandits, a natural extension of the basic MAB problem, and presents algorithms that are based on adaptive partitions, and take advantage of "benign" payoffs and context arrivals without sacrificing the worst-case performance.

- 437

- S. Caron, B. Kveton, M. Lelarge, Smriti Bhagat
- 2012

Computer Science

UAI

This paper considers stochastic bandits with side observations, a model that accounts for both the exploration/exploitation dilemma and relationships between arms, and provides efficient algorithms based on upper confidence bounds that leverage this additional information, deriving new bounds that improve on standard regret guarantees.

- 105

- Swapna Buccapatnam, A. Eryilmaz, N. Shroff
- 2013

Computer Science

52nd IEEE Conference on Decision and Control

This work reveals the significant gains that can be obtained even through static network-aware policies, and proposes a randomized policy that explores actions for each user at a rate that is a function of her network position.

- 38

- S. Kar, H. Poor, Shuguang Cui
- 2011

Computer Science, Mathematics

IEEE Conference on Decision and Control and…

A collaborative and adaptive distributed allocation rule DA is proposed and is shown to achieve the lower bound on the expected average regret for a connected inter-bandit communication network.

- 32

- Lihong Li, Wei Chu, J. Langford, R. Schapire
- 2010

Computer Science

WWW '10

This work models personalized recommendation of news articles as a contextual bandit problem: a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks.

- 2,645
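The contextual bandit approach described in this entry is commonly realized as the disjoint LinUCB policy; the following is a minimal sketch of that idea (class name, dimensions, and the α parameter are illustrative, not taken from the cited paper):

```python
import numpy as np

class DisjointLinUCB:
    """Minimal sketch of a disjoint LinUCB policy: each arm keeps
    its own ridge-regression state (A, b) and is scored by the
    estimated payoff plus an upper-confidence width."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # design matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # reward-weighted feature sums

    def choose(self, contexts):
        """contexts: one d-dimensional feature vector per arm."""
        scores = []
        for a, x in enumerate(contexts):
            A_inv = np.linalg.inv(self.A[a])
            theta = A_inv @ self.b[a]          # per-arm coefficient estimate
            width = self.alpha * np.sqrt(x @ A_inv @ x)  # confidence width
            scores.append(theta @ x + width)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Fold the observed click/payoff back into the chosen arm's state."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

In use, the policy alternates `choose` (serve an article) and `update` (record the click feedback), which is the select-then-adapt loop the abstract describes.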

- Wei Chu, Lihong Li, L. Reyzin, R. Schapire
- 2011

Computer Science, Mathematics

AISTATS

An O(√(Td ln³(KT ln(T)/δ))) regret bound is proved that holds with probability 1 − δ for the simplest known upper confidence bound algorithm for this problem.

- 936
- Highly Influential
- PDF

- Yasin Abbasi-Yadkori, D. Pál, Csaba Szepesvari
- 2011

Computer Science, Mathematics

NIPS

A simple modification of Auer's UCB algorithm is shown to achieve constant regret with high probability; the regret bound improves only by a logarithmic factor, though experiments show a vast improvement.

- 1,540

- Varsha Dani, Thomas P. Hayes, S. Kakade
- 2008

Mathematics, Computer Science

COLT

A nearly complete characterization of the classical stochastic k-armed bandit problem in terms of both upper and lower bounds for the regret is given, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.

- 829
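The "upper confidence bounds" idea this entry refers to can be sketched in its classical k-armed form as a UCB1-style index (a minimal sketch with the standard exploration constant; the function name and constant are illustrative, not taken from the cited paper):

```python
import math

def ucb_choose(counts, means, t):
    """UCB1-style index: empirical mean plus a sqrt(2 ln t / n)
    exploration bonus; unplayed arms are tried first.

    counts: number of pulls per arm
    means:  empirical mean reward per arm
    t:      current round (1-indexed)
    """
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # play every arm once before using the index
    indices = [m + math.sqrt(2.0 * math.log(t) / n)
               for m, n in zip(means, counts)]
    return max(range(len(counts)), key=indices.__getitem__)
```

The index trades off exploitation (the empirical mean) against exploration (the bonus, which shrinks as an arm is pulled more often); lower-bound arguments of the kind this entry mentions show such index policies are near-optimal for the stochastic problem.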

- Shie Mannor, Ohad Shamir
- 2011

Computer Science, Mathematics

NIPS

Practical algorithms with provable regret guarantees are developed, which depend on non-trivial graph-theoretic properties of the information feedback structure and partially-matching lower bounds are provided.

- 204

...

...
