[Reinforcement Learning Paper Collection] ICML 2021 Reinforcement Learning Papers
创始人
2024-02-16 11:52:56

Reinforcement Learning (RL) is a machine learning paradigm and methodology for describing and solving problems in which an agent learns a policy through interaction with its environment, so as to maximize cumulative reward or achieve a specific goal.
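The interaction loop described above — an agent repeatedly acting, observing reward, and updating its policy to maximize cumulative return — can be sketched with tabular Q-learning on a toy chain environment. The environment, states, and hyperparameters below are illustrative assumptions for this sketch only; they are not drawn from any paper listed in this post.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy chain: the agent starts at state 0;
    action 1 moves right, action 0 moves left; reaching the last state
    ends the episode with reward 1, all other transitions give reward 0."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]

    def greedy(q_s):
        # Greedy action with random tie-breaking.
        m = max(q_s)
        return rng.choice([a for a, v in enumerate(q_s) if v == m])

    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy exploration.
            a = rng.randrange(2) if rng.random() < eps else greedy(Q[s])
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
# After training, moving right should dominate in every non-terminal state.
policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(4)]
```

Because the chain is deterministic, the learned values settle near Q(s, right) = gamma^(distance to goal - 1), so the greedy policy moves right everywhere — the "maximize cumulative reward" behavior the definition describes.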
This column collects papers on Reinforcement Learning (RL) from top international conferences in recent years, including but not limited to ICML, AAAI, IJCAI, NIPS, ICLR, AAMAS, CVPR, and ICRA.


Today we share the papers on the topic of reinforcement learning from the 2021 International Conference on Machine Learning (ICML). ICML has grown into a top annual international machine learning conference organized by the International Machine Learning Society (IMLS).

  • [1]. Safe Reinforcement Learning with Linear Function Approximation.
  • [2]. Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees.
  • [3]. Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision.
  • [4]. Reinforcement Learning of Implicit and Explicit Control Flow Instructions.
  • [5]. Learning Routines for Effective Off-Policy Reinforcement Learning.
  • [6]. Goal-Conditioned Reinforcement Learning with Imagined Subgoals.
  • [7]. Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment.
  • [8]. Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning.
  • [9]. Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills.
  • [10]. Improved Corruption Robust Algorithms for Episodic Reinforcement Learning.
  • [11]. Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning.
  • [12]. Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing.
  • [13]. Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning.
  • [14]. Offline Reinforcement Learning with Pseudometric Learning.
  • [15]. Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation.
  • [16]. SAINT-ACC: Safety-Aware Intelligent Adaptive Cruise Control for Autonomous Vehicles Using Deep Reinforcement Learning.
  • [17]. Kernel-Based Reinforcement Learning: A Finite-Time Analysis.
  • [18]. Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning.
  • [19]. Reinforcement Learning Under Moral Uncertainty.
  • [20]. Self-Paced Context Evaluation for Contextual Reinforcement Learning.
  • [21]. Model-based Reinforcement Learning for Continuous Control with Posterior Sampling.
  • [22]. Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach.
  • [23]. PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning.
  • [24]. A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation.
  • [25]. Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning.
  • [26]. Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective.
  • [27]. Detecting Rewards Deterioration in Episodic Reinforcement Learning.
  • [28]. UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning.
  • [29]. Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning.
  • [30]. Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient.
  • [31]. Logarithmic Regret for Reinforcement Learning with Linear Function Approximation.
  • [32]. Generalizable Episodic Memory for Deep Reinforcement Learning.
  • [33]. Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning.
  • [34]. Randomized Exploration in Reinforcement Learning with General Value Function Approximation.
  • [35]. Emphatic Algorithms for Deep Reinforcement Learning.
  • [36]. Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations.
  • [37]. Reward Identification in Inverse Reinforcement Learning.
  • [38]. A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning.
  • [39]. A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning.
  • [40]. High Confidence Generalization for Reinforcement Learning.
  • [41]. Offline Reinforcement Learning with Fisher Divergence Critic Regularization.
  • [42]. Revisiting Peng’s Q(λ) for Modern Reinforcement Learning.
  • [43]. SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning.
  • [44]. PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training.
  • [45]. Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot.
  • [46]. MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning.
  • [47]. Parallel Droplet Control in MEDA Biochips using Multi-Agent Reinforcement Learning.
  • [48]. Cooperative Exploration for Multi-Agent Deep Reinforcement Learning.
  • [49]. Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition.
  • [50]. Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices.
  • [51]. A Sharp Analysis of Model-based Reinforcement Learning with Self-Play.
  • [52]. Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning.
  • [53]. Inverse Constrained Reinforcement Learning.
  • [54]. Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity.
  • [55]. Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs.
  • [56]. Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks.
  • [57]. Counterfactual Credit Assignment in Model-Free Reinforcement Learning.
  • [58]. Offline Meta-Reinforcement Learning with Advantage Weighting.
  • [59]. Emergent Social Learning via Multi-agent Reinforcement Learning.
  • [60]. Density Constrained Reinforcement Learning.
  • [61]. Decoupling Value and Policy for Generalization in Reinforcement Learning.
  • [62]. Model-Based Reinforcement Learning via Latent-Space Collocation.
  • [63]. Recomposing the Reinforcement Learning Building Blocks with Hypernetworks.
  • [64]. RRL: Resnet as representation for Reinforcement Learning.
  • [65]. Structured World Belief for Reinforcement Learning in POMDP.
  • [66]. Multi-Task Reinforcement Learning with Context-based Representations.
  • [67]. Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks.
  • [68]. PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration.
  • [69]. Decoupling Representation Learning from Reinforcement Learning.
  • [70]. Reinforcement Learning for Cost-Aware Markov Decision Processes.
  • [71]. REPAINT: Knowledge Transfer in Deep Reinforcement Learning.
  • [72]. Safe Reinforcement Learning Using Advantage-Based Intervention.
  • [73]. Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing.
  • [74]. On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP.
  • [75]. Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning.
  • [76]. Deep Reinforcement Learning amidst Continual Structured Non-Stationarity.
  • [77]. CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee.
  • [78]. Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies.
  • [79]. Reinforcement Learning with Prototypical Representations.
  • [80]. Continuous-time Model-based Reinforcement Learning.
  • [81]. Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL.
  • [82]. DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning.
  • [83]. Near Optimal Reward-Free Reinforcement Learning.
  • [84]. FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning.
  • [85]. On-Policy Deep Reinforcement Learning for the Average-Reward Criterion.
  • [86]. MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration.
  • [87]. Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity.
  • [88]. Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping.
  • [89]. Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning.
  • [90]. Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning.
