
Ray
 Referenced in 7 articles
[sw28740]
 will continuously interact with the environment and learn from these interactions. These applications impose ... consider these requirements and present Ray, a distributed system to address them. Ray implements ... performance requirements, Ray employs a distributed scheduler and a distributed and fault-tolerant store ... existing specialized systems for several challenging reinforcement learning applications...

SURREAL
 Referenced in 1 article
[sw31156]
 that runs state-of-the-art distributed reinforcement learning (RL) algorithms...

PyTorchRL
 Referenced in 1 article
[sw41243]
 PyTorchRL: Modular and Distributed Reinforcement Learning in PyTorch. Deep reinforcement learning (RL) has proved successful ... modules. Additionally, PyTorchRL permits the definition of distributed training architectures with flexibility and independence...

IMPALA
 Referenced in 10 articles
[sw41064]
 IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. In this work ... collection of tasks using a single reinforcement learning agent with a single set of parameters ... training time. We have developed a new distributed agent IMPALA (Importance Weighted Actor-Learner Architecture ... achieve stable learning at high throughput by combining decoupled acting and learning with a novel...

RLgraph
 Referenced in 2 articles
[sw31155]
 Computation Graphs for Deep Reinforcement Learning. Reinforcement learning (RL) tasks are challenging to implement, execute ... algorithmic instability, hyperparameter sensitivity, and heterogeneous distributed communication patterns. We argue for the separation ... library for designing and executing reinforcement learning tasks in both static graph and define ... high performance across different deep learning frameworks and distributed backends...

ART 3
 Referenced in 27 articles
[sw08755]
 implement parallel search of compressed or distributed pattern recognition codes in a neural network hierarchy ... functions well with either fast learning or slow learning, and can robustly cope with sequences ... memory representation of a pattern recognition code. Reinforcement feedback can modulate the search process...

Horizon
 Referenced in 6 articles
[sw31157]
 Horizon, Facebook’s open source applied reinforcement learning (RL) platform. Horizon ... algorithms and includes data preprocessing, feature transformation, distributed training, counterfactual policy evaluation, optimized serving ... showcase and describe real examples where reinforcement learning models trained with Horizon significantly outperformed...

Catalyst.RL
 Referenced in 3 articles
[sw31154]
 Distributed Framework for Reproducible RL Research. Despite the recent progress in deep reinforcement learning field ... library include large-scale asynchronous distributed training, easy-to-use configuration files with the complete...

DualDICE
 Referenced in 1 article
[sw40535]
 Discounted Stationary Distribution Corrections. In many real-world reinforcement learning applications, access to the environment ... policy, accurate estimates of discounted stationary distribution ratios, correction terms which quantify the likelihood that...

JRLF
 Referenced in 1 article
[sw14316]
 testing reinforcement learning algorithms in a variety of environments. The current distribution contains implementation...

Seq2SQL
 Referenced in 3 articles
[sw27204]
 Structured Queries from Natural Language using Reinforcement Learning. A significant amount of the world ... loop query execution over the database to learn a policy to generate unordered parts ... annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia. This dataset ... comparable datasets. By applying policy-based reinforcement learning with a query execution environment to WikiSQL...

RLDDE
 Referenced in 3 articles
[sw35866]
 time delay. A novel method, called reinforcement learning-based dimension and delay estimator (RLDDE ... learn the selection policy of the dimension and delay under different distribution of the data...

SNAS
 Referenced in 3 articles
[sw42518]
 optimization problem on parameters of a joint distribution for the search space in a cell ... gradient optimizes the same objective as reinforcement-learning-based NAS, but assigns credits to structural...

SAMBA
 Referenced in 1 article
[sw42123]
 armed bandit is a reinforcement learning model where a learning agent repeatedly chooses an action ... stochastic outcome (reward) coming from an unknown distribution associated with the chosen arm. Bandits have...

LibPGRL
 Referenced in 1 article
[sw14310]
 high-performance policy-gradient reinforcement learning library. Since the first version it has been extended ... iteration. It has been designed with large distributed RL systems in mind...

d3rlpy
 Referenced in 1 article
[sw40533]
 d3rlpy, an open-sourced offline deep reinforcement learning (RL) library for Python. d3rlpy supports ... exporting policies for deployment, preprocessing and postprocessing, distributional Q-functions, multi-step learning...

MIMOSA
 Referenced in 1 article
[sw41883]
 input molecule. Existing generative models and reinforcement learning approaches made initial success, but still face ... guess and sample molecules from the target distribution. MIMOSA first pretrains two property agnostic graph...

BaRC
 Referenced in 3 articles
[sw40037]
 Learning. Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies ... amount of exploration required to obtain a learning signal from the initial state ... task, and expands the initial state distribution backwards in a dynamically-consistent manner once...

pymgrid
 Referenced in 1 article
[sw36564]
 increasing infrastructure resiliency. Due to their distributed nature, microgrids are often idiosyncratic; as a result ... pymgrid is built to be a reinforcement learning (RL) platform, and includes the ability...

EAQR
 Referenced in 1 article
[sw27641]
 design a multi-agent reinforcement learning algorithm for cooperative tasks where multiple agents need to coordinate ... pushing, and the other is the distributed sensor network problem. Experimental results show that EAQR...