Performance, cost, and risk
What are the performance, cost, and risk impacts of implementing this product?
- Performance: Improves the effectiveness and efficiency of model training by preventing the model from choosing unsafe actions that lead to adverse outcomes.
- Cost: Implementing these safety procedures may be costly, but it reduces the cost and time of model training because no time is spent learning unsafe actions.
- Risk: Under certain circumstances, training speed and model performance may not improve.
Implementation requirements
What capabilities would a business/organization/institution need to have to implement this product?
- Processes: The toolkit is most effective when reinforcement learning model development processes are just starting up or still in their infancy.
- Resources: System data to train and validate RL models, computational infrastructure to facilitate training and validation, and data scientists to oversee training and validation.
- Competences: Knowledge of reinforcement learning to support the use of the toolkit, and a safety focus in validation procedures for asset management to stimulate adoption.
- Technologies: Reinforcement learning training and validation applications, and deep learning frameworks used for RL (e.g., TensorFlow, PyTorch).
Related works
- Carr et al. (2023). Safe Reinforcement Learning via Shielding under Partial Observability.
- Gross et al. (2023). Targeted Adversarial Attacks on Deep Reinforcement Learning Policies via Model Checking.
- Hogewind et al. (2023). Safe Reinforcement Learning from Pixels Using a Stochastic Latent Representation.
- Simão et al. (2021). AlwaysSafe: Reinforcement Learning without Safety Constraint Violations during Training.
- Yang et al. (2022). Safety-Constrained Reinforcement Learning with a Distributional Safety Critic.
- Yang et al. (2023). Reinforcement Learning by Guided Safe Exploration.
- Wienhöft et al. (2023). More for Less: Safe Policy Improvement with Stronger Performance Guarantees.
- Castellini et al. (2023). Scalable Safe Policy Improvement via Monte Carlo Tree Search.
- Krale et al. (2023). Act-Then-Measure: Reinforcement Learning for Partially Observable Environments with Active Measuring.
- Koprulu et al. (2023). Risk-Aware Curriculum Generation for Heavy-Tailed Task Distributions.
Contact information
For further inquiries regarding this product, feel free to get in touch with:
- Nils Jansen, Radboud University. nils [dot] jansen [at] ru [dot] nl
- Thiago Simão, Eindhoven University of Technology. t [dot] simao [at] tue [dot] nl






