Improved Regret Guarantees for Online Smooth by Ankan @VideoLectures


Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback

The study of online convex optimization in the bandit setting was initiated by Kleinberg (2004) and Flaxman et al. (2005). Such a setting models a decision maker who has to make decisions in the face of adversarially chosen convex loss functions. Moreover, the only information the decision maker receives is the losses incurred; the identities of the loss functions themselves are not revealed. In this setting, we reduce the gap between the best known lower and upper bounds for the class of smooth convex functions, i.e. convex functions with a Lipschitz continuous gradient. Building upon existing work on self-concordant regularizers and one-point gradient estimation, we give the first algorithm whose expected regret is $O(T^{2/3})$, ignoring constant and logarithmic factors.
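To make the phrase "one-point gradient estimation" concrete, here is a minimal Python sketch of the spherical one-point estimator in the style of Flaxman et al. (2005). It is not the algorithm of this talk, which also relies on self-concordant regularizers; the function names, the smoothing radius `delta`, and the linear test loss are assumptions made for illustration only.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def one_point_gradient_estimate(loss, x, delta, rng):
    """Estimate a gradient from a single loss evaluation (bandit feedback).

    The learner plays the perturbed point x + delta * u and observes only
    the scalar loss there. The vector (dim / delta) * loss * u is an unbiased
    estimate of the gradient of a smoothed version of the loss.
    """
    dim = len(x)
    u = random_unit_vector(dim, rng)
    played_point = [xi + delta * ui for xi, ui in zip(x, u)]
    observed_loss = loss(played_point)  # the only feedback received
    scale = dim / delta * observed_loss
    return [scale * ui for ui in u]


def average_estimate(loss, x, delta, n_samples, seed=0):
    """Average many independent one-point estimates at the same point x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        g = one_point_gradient_estimate(loss, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]


if __name__ == "__main__":
    a = [1.0, -2.0, 0.5]  # hypothetical linear loss f(p) = <a, p>
    linear_loss = lambda p: sum(ai * pi for ai, pi in zip(a, p))
    print(average_estimate(linear_loss, [0.0, 0.0, 0.0], 0.1, 50000))
```

For a linear loss the smoothed function has the same gradient as the loss itself, so the averaged estimate should approach `a`. A single estimate is very noisy, with magnitude scaling like `dim / delta`, and this variance is one reason regret bounds under bandit feedback are weaker than under full gradient feedback.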

Attribution: The Open Education Consortium
http://www.ocwconsortium.org/courses/view/b5dc51ddb96575d9b4bfa5cae1095d86/
Course Home http://videolectures.net/aistats2011_saha_guarantees/
