DeepMind releases Acme, a distributed framework for reinforcement learning algorithm development

Source: venturebeat.com

DeepMind this week released Acme, a framework intended to simplify the development of reinforcement learning algorithms by enabling AI-driven agents to run at various scales of execution. According to the engineers and researchers behind Acme, who coauthored a technical paper on the work, it can be used to create agents with greater parallelization than in previous approaches.

Reinforcement learning involves agents that interact with an environment to generate their own training data, and it has led to breakthroughs in fields ranging from video games and robotics to self-driving vehicles. Recent advances are partly attributable to increases in the amount of training data used, which has motivated the design of systems in which agents interact with multiple instances of an environment to accumulate experience quickly. Scaling from single-process prototypes to distributed systems often requires reimplementing the agents in question, DeepMind asserts, which is where the Acme framework comes in.

Acme is a development suite for training reinforcement learning agents that attempts to address the issues of both complexity and scale, with components for constructing agents at various levels of abstraction, from algorithms and policies to learners. The idea is that this enables swift iteration on ideas and evaluation of those ideas in production, chiefly through training loops, extensive logging, and checkpointing.
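To make the idea concrete, here is a minimal sketch of the kind of training loop such a framework automates. All names (`run_training_loop`, `CountingLogger`, the toy environment) are illustrative assumptions, not Acme's actual API:

```python
class CountingLogger:
    """Illustrative logger that records per-episode metrics."""
    def __init__(self):
        self.records = []

    def write(self, metrics):
        self.records.append(metrics)


def run_training_loop(num_episodes, env_step, select_action, logger):
    """Run episodes against a toy environment, logging each episode's return."""
    for episode in range(num_episodes):
        total_reward, done = 0.0, False
        state = 0  # illustrative initial observation
        while not done:
            action = select_action(state)
            state, reward, done = env_step(state, action)
            total_reward += reward
        logger.write({"episode": episode, "return": total_reward})
    return logger


# A toy environment: every step yields reward 1 and terminates at state 5.
def env_step(state, action):
    next_state = state + 1
    return next_state, 1.0, next_state >= 5


logger = run_training_loop(3, env_step, lambda s: 0, CountingLogger())
```

In a framework like Acme, the loop, logging, and checkpointing are provided, so researchers only supply the agent and environment.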

Within Acme, actors interact closely with an environment, taking in observations the environment produces and selecting actions that are fed back into it. After observing the resulting transition, actors are given an opportunity to update their state; this most often relates to their action-selection policies, which determine which actions they take in response to the environment. A special type of Acme actor, referred to as an “agent,” comprises both acting and learning components, with state updates triggered by some number of steps within the learner component. That said, agents for the most part defer their action selection to their own acting component.
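The actor/agent split described above can be sketched as follows. This is a simplified stand-in under assumed names, not Acme's real class hierarchy:

```python
class Actor:
    """Illustrative actor: selects actions and observes transitions."""
    def __init__(self, policy):
        self.policy = policy

    def select_action(self, observation):
        return self.policy(observation)

    def observe(self, observation, action, reward, next_observation):
        pass  # a pure actor keeps no learning state


class Learner:
    """Illustrative learner: updates parameters from batches of transitions."""
    def __init__(self):
        self.updates = 0

    def step(self, batch):
        self.updates += 1  # stand-in for one gradient step


class Agent(Actor):
    """An agent is an actor that also owns a learner: it defers action
    selection to its acting component and periodically runs learner steps."""
    def __init__(self, policy, learner, update_every=2):
        super().__init__(policy)
        self.learner = learner
        self.buffer = []
        self.update_every = update_every

    def observe(self, observation, action, reward, next_observation):
        self.buffer.append((observation, action, reward, next_observation))
        # Trigger a learner step after every `update_every` transitions.
        if len(self.buffer) % self.update_every == 0:
            self.learner.step(self.buffer[-self.update_every:])


agent = Agent(policy=lambda obs: obs * 2, learner=Learner())
for t in range(4):
    a = agent.select_action(t)
    agent.observe(t, a, 1.0, t + 1)
```

Note how `Agent.select_action` is inherited unchanged from `Actor`, mirroring the point that agents defer action selection to their acting component.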

Acme provides a data set module that sits between the actor and learner components, backed by a low-level storage system called Reverb that DeepMind also released this week. In addition, the framework establishes a common interface for insertion into Reverb, enabling different styles of preprocessing and the ongoing aggregation of observational data.
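Reverb itself is a full client-server storage system; a minimal in-process stand-in for the dataset module that sits between actor and learner might look like the following. The class and method names here are assumptions for illustration, not Reverb's API:

```python
import random
from collections import deque

class ReplayDataset:
    """Illustrative stand-in for Acme's dataset module: actors insert
    transitions, learners sample batches from accumulated experience."""
    def __init__(self, max_size=10_000, seed=0):
        # Bounded storage: oldest transitions are evicted first.
        self.storage = deque(maxlen=max_size)
        self.rng = random.Random(seed)

    def insert(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        # Uniform sampling with replacement, for simplicity.
        return [self.rng.choice(self.storage) for _ in range(batch_size)]


replay = ReplayDataset(max_size=100)
for t in range(10):
    replay.insert((t, t + 1, 1.0))  # (observation, next_observation, reward)
batch = replay.sample(4)
```

A common interface like `insert` is the hook where different styles of preprocessing (stacking, prioritization, n-step aggregation) could be applied before data reaches storage.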

Acting, learning, and storage components are split across different threads or processes within Acme, which confers two benefits: environment interactions occur asynchronously with the learning process, and data generation accelerates. Elsewhere, Acme’s rate limiter enforces a desired relative rate between learning and acting, allowing both processes to run unblocked so long as they remain within a defined tolerance of each other. For instance, if one process starts lagging behind the other due to network issues or insufficient resources, the rate limiter blocks the laggard while the other catches up.
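The rate-limiting idea can be illustrated with a toy samples-per-insert limiter. This is a sketch of the concept under assumed names and a simplified policy, not Reverb's actual rate limiters:

```python
class SamplesPerInsertLimiter:
    """Illustrative rate limiter: allow learning (sampling) and acting
    (inserting) to proceed only while their ratio stays inside a
    tolerance band around a target samples-per-insert value."""
    def __init__(self, samples_per_insert=2.0, tolerance=4.0):
        self.spi = samples_per_insert
        self.tolerance = tolerance
        self.inserts = 0
        self.samples = 0

    def can_sample(self):
        # Sampling is allowed until it runs too far ahead of inserts.
        return self.samples < self.inserts * self.spi + self.tolerance

    def can_insert(self):
        # Inserting is allowed until it runs too far ahead of sampling.
        return self.inserts * self.spi < self.samples + self.tolerance

    def record_insert(self):
        self.inserts += 1

    def record_sample(self):
        self.samples += 1


limiter = SamplesPerInsertLimiter(samples_per_insert=2.0, tolerance=4.0)
limiter.record_insert()  # one insert permits ~2 samples, plus slack
allowed = 0
while limiter.can_sample():
    limiter.record_sample()
    allowed += 1
```

In a real distributed setup the "block" would be an actual wait on the lagging process rather than a boolean check, but the accounting is the same.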

In addition to these tools and resources, Acme ships with a set of example agents meant to serve as reference implementations of their respective reinforcement learning algorithms as well as strong research baselines. More might become available in the future, DeepMind says. “By providing these … we hope that Acme will help improve the status of reproducibility in [reinforcement learning], and empower the academic research community with simple building blocks to create new agents,” wrote the researchers. “Additionally, our baselines should provide additional yardsticks to measure progress in the field.”
