Papers

State Deviation Correction for Offline Reinforcement Learning
Hongchang Zhang, Jianzhun Shao, Yuhang Jiang, Shuncheng He, Guanwen Zhang, Xiangyang Ji
[AAAI-22] Main Track
Robust Action Gap Increasing with Clipped Advantage Learning
Zhe Zhang, Yaozhong Gan, Xiaoyang Tan
[AAAI-22] Main Track
Controlling Underestimation Bias in Reinforcement Learning via Quasi-Median Operation
Wei Wei, Yujia Zhang, Jiye Liang, Lin Li, Yuze Li
[AAAI-22] Main Track
Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning
Jinxin Liu, Donglin Wang, Qiangxing Tian, Zhengyu Chen
[AAAI-22] Main Track
Generalizing Reinforcement Learning through Fusing Self-Supervised Learning into Intrinsic Motivation
Keyu Wu, Min Wu, Zhenghua Chen, Yuecong Xu, Xiaoli Li
[AAAI-22] Main Track
Stackelberg Actor-Critic: Game-Theoretic Reinforcement Learning Algorithms
Liyuan Zheng, Tanner Fiez, Zane Alumbaugh, Benjamin Chasnov, Lillian J. Ratliff
[AAAI-22] Main Track
Programmatic Modeling and Generation of Real-Time Strategic Soccer Environments for Reinforcement Learning
Abdus Salam Azad, Edward Kim, Mark Wu, Kimin Lee, Ion Stoica, Pieter Abbeel, Alberto Sangiovanni-Vincentelli, Sanjit Seshia
[AAAI-22] Main Track
Efficient Continuous Control with Double Actors and Regularized Critics
Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Xiu Li
[AAAI-22] Main Track
Wasserstein Unsupervised Reinforcement Learning
Shuncheng He, Yuhang Jiang, Hongchang Zhang, Jianzhun Shao, Xiangyang Ji
[AAAI-22] Main Track
Numerical Approximations of Log Gaussian Cox Process (Student Abstract)
Francois Buet-Golfouse, Hans Roggeman
[AAAI-22] Student Abstract and Poster Program
Policy Learning for Robust Markov Decision Process with a Mismatched Generative Model
Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
[AAAI-22] Main Track
Partial Wasserstein Covering
Keisuke Kawano, Satoshi Koide, Keisuke Otaki
[AAAI-22] Main Track
Adaptive Pairwise Weights for Temporal Credit Assignment
Zeyu Zheng, Risto Vuorio, Richard Lewis, Satinder Singh
[AAAI-22] Main Track
Learning to Identify Top Elo Ratings with A Dueling Bandits Approach
Xue Yan, Yali Du, Binxin Ru, Jun Wang, Haifeng Zhang, Xu Chen
[AAAI-22] Main Track
Smoothing Advantage Learning
Yaozhong Gan, Zhe Zhang, Xiaoyang Tan
[AAAI-22] Main Track
Episodic Policy Gradient Training
Hung Le, Majid Abdolshah, Thommen K. George, Kien Do, Dung Nguyen, Svetha Venkatesh
[AAAI-22] Main Track
Policy Optimization with Stochastic Mirror Descent
Long Yang, Yu Zhang, Gang Zheng, Qian Zheng, Pengfei Li, Jianhang Huang, Gang Pan
[AAAI-22] Main Track
Training a Resilient Q-Network against Observational Interference
Chao-Han Huck Yang, I-Te Danny Hung, Yi Ouyang, Pin-Yu Chen
[AAAI-22] Main Track
Controlling the Spread of Two Secrets in Diverse Social Networks (Student Abstract)
Václav Blažej, Dušan Knop, Šimon Schierreich
[AAAI-22] Student Abstract and Poster Program
Optimistic Initialization for Exploration in Continuous Control
Samuel Lobel, Omer Gottesman, Cam Allen, Akhil Bagaria, George Konidaris
[AAAI-22] Main Track
Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy
Fan-Ming Luo, Shengyi Jiang, Yang Yu, Zongzhang Zhang, Yi-Feng Zhang
[AAAI-22] Main Track