A Tractable Algorithm for Finite-Horizon Continuous Reinforcement Learning


dc.contributor.author Gampa, P.
dc.contributor.author Kondamudi, S.S.
dc.contributor.author Kailasam, L.
dc.date.accessioned 2021-01-05T05:22:41Z
dc.date.available 2021-01-05T05:22:41Z
dc.date.issued 2019-02
dc.identifier.issn 978-172812662-3
dc.identifier.uri http://localhost:8080/xmlui/handle/123456789/1234
dc.description.abstract We consider the finite-horizon continuous reinforcement learning problem. Our contribution is three-fold. First, we give a tractable algorithm based on optimistic value iteration for the problem. Next, we give a lower bound on regret of order Ω(T^{2/3}) for any algorithm that discretizes the state space, improving on the previous bound of Ω(T^{1/2}) of Ortner and Ryabko [1] for the same problem. Next, under the assumption that the rewards and transitions are Hölder continuous, we show that the upper bound on the discretization error is const. L n^{-α} T. Finally, we give some simple experiments to validate our propositions. © 2019 IEEE. en_US
dc.language.iso en_US en_US
dc.publisher Institute of Electrical and Electronics Engineers Inc. en_US
dc.relation.ispartofseries Proceedings - 2019 2nd International Conference on Intelligent Autonomous Systems, ICoIAS 2019;
dc.subject Reinforcement Learning en_US
dc.subject Markov Decision Process (MDP) en_US
dc.subject Regret en_US
dc.subject Continuous State Space en_US
dc.subject Bonus en_US
dc.subject Finite Horizon en_US
dc.title A Tractable Algorithm for Finite-Horizon Continuous Reinforcement Learning en_US
dc.type Article en_US
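
Illustrative note: the abstract above refers to a tractable algorithm based on optimistic value iteration over a discretized continuous state space. The sketch below is a minimal, hedged illustration of that general idea, not the authors' algorithm; the uniform discretization, the UCB-style visit-count bonus, and all function and variable names are assumptions introduced here purely for exposition.

# Minimal sketch: finite-horizon optimistic value iteration on a discretized state space.
# NOT the paper's exact algorithm; bonus form, environment, and names are assumptions.
import numpy as np

def optimistic_value_iteration(P_hat, R_hat, counts, H, bonus_scale=1.0):
    """One round of optimistic backward induction.

    P_hat : (S, A, S) empirical transition probabilities over discretized states
    R_hat : (S, A)    empirical mean rewards (assumed bounded in [0, 1])
    counts: (S, A)    visit counts used to form the exploration bonus
    H     : horizon length
    """
    S, A = R_hat.shape
    Q = np.zeros((H + 1, S, A))
    V = np.zeros((H + 1, S))
    bonus = bonus_scale / np.sqrt(np.maximum(counts, 1))  # UCB-style bonus (assumption)
    for h in range(H - 1, -1, -1):
        # Optimistic Q-values: empirical reward + bonus + expected next-stage value
        Q[h] = R_hat + bonus + P_hat @ V[h + 1]
        Q[h] = np.minimum(Q[h], H - h)          # clip to the trivial upper bound
        V[h] = Q[h].max(axis=1)
    return Q, V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, A, H = 8, 2, 5                            # n discretization intervals, A actions
    counts = rng.integers(1, 20, size=(n, A))
    R_hat = rng.random((n, A))
    P_hat = rng.random((n, A, n))
    P_hat /= P_hat.sum(axis=-1, keepdims=True)   # normalize to valid distributions
    Q, V = optimistic_value_iteration(P_hat, R_hat, counts, H)
    print("Greedy first-stage actions:", Q[0].argmax(axis=1))

Backward induction over the horizon H keeps the per-episode planning cost at roughly O(H·S²·A) for S discretized states and A actions, which is what makes this style of algorithm tractable on a fixed grid.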

