A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Honghao Wei, Xin Liu, Lei Ying

[AAAI-22] Main Track
Abstract: Model-free reinforcement learning (RL) algorithms are known to be memory- and computation-efficient compared with model-based approaches. This paper presents a model-free RL algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs), which achieves sub-linear regret and zero constraint violation. To the best of our knowledge, this is the first model-free algorithm for general CMDPs in the infinite-horizon average-reward setting with provable guarantees.
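To make the problem setting concrete, the following is a minimal sketch of an infinite-horizon average-reward CMDP interaction loop. It is an illustration of the setting only, not the paper's algorithm: the two-state dynamics, rewards, constraint costs, and the stationary policy below are all made-up toy values.

```python
import random

def step(state, action):
    """Toy 2-state CMDP transition (hypothetical values): returns (next_state, reward, cost)."""
    if action == 1:                # "risky" action: higher reward, higher constraint cost
        return (1 - state, 1.0, 0.8)
    return (state, 0.4, 0.1)       # "safe" action: lower reward, lower constraint cost

def run(policy, horizon=10000, seed=0):
    """Long-run average reward and average constraint cost of a stationary policy.

    policy[s] is the probability of taking the risky action in state s.
    """
    rng = random.Random(seed)
    state, total_r, total_c = 0, 0.0, 0.0
    for _ in range(horizon):
        action = 1 if rng.random() < policy[state] else 0
        state, r, c = step(state, action)
        total_r += r
        total_c += c
    return total_r / horizon, total_c / horizon

# A CMDP asks for the policy maximizing average reward subject to
# the average constraint cost staying below a given threshold.
avg_r, avg_c = run(policy={0: 0.5, 1: 0.5})
```

Here the learner sees only sampled transitions, never the transition kernel; a model-free algorithm in this setting, as in the paper, must control both the regret in average reward and the cumulative constraint violation from such samples alone.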


Sessions where this paper appears

  • Poster Session 1

    Thu, February 24 4:45 PM - 6:30 PM (+00:00)
    Blue 1

  • Poster Session 8

    Sun, February 27 12:45 AM - 2:30 AM (+00:00)
    Blue 1