
Uber Open-Sources Plug-and-Play Language Model for Controlling AI-Generated Text

Source: infoq.com

Uber AI has open-sourced its plug-and-play language model (PPLM), which can control the topic and sentiment of AI-generated text. Human judges rated the model's output as achieving 36% better topic accuracy than the baseline GPT-2 model.

The team provided a full description of the system and experiments in a paper published on arXiv. PPLM starts with a pre-trained language model (LM), such as GPT-2. These LMs can produce complex output which approaches human fluency, but it is difficult to control the specific properties of the generated text. Instead of “fine-tuning” the LM with additional training data, PPLM uses a separate attribute model that can evaluate the LM’s output for sentiment or topic; this model is used to control the text produced by the LM. A strength parameter can tune how much the attribute model adjusts the LM output. According to Uber’s researchers,

PPLM allows a user to flexibly plug in one or more simple attribute models representing the desired control objective into a large, unconditional LM. The method has the key property that it uses the LM as is—no training or fine-tuning is required—which enables researchers to leverage best-in-class LMs even if they do not have the extensive hardware required to train them.

Recent state-of-the-art NLP research has focused on creating pre-trained models based on the transformer architecture. These models are large, containing hundreds of millions of parameters, and are trained on large datasets containing millions of words; the training may take several days of runtime on expensive GPU hardware. Researchers without the resources to train their own state-of-the-art models must often choose between using a publicly available model that isn't quite suited to their task or building a smaller, less accurate model of their own. Another alternative is to fine-tune a pre-trained model, but that presents the risk of catastrophic forgetting.

The key to PPLM is an additional, simpler model, the attribute model (AM), that scores the output of the LM: it calculates the probability that the LM's output text has some attribute (for example, that the text has positive sentiment, or is about politics). The AM can also calculate the gradient of that probability, which is used to "steer" the LM. Transformer-based LMs are "autoregressive," meaning that as they generate a sequence of words, each previously generated word becomes an input to the system for creating the next word. In PPLM, the gradient of the AM also shapes the generation of the next word, making it more likely to carry the desired attribute.
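The steering idea can be illustrated with a toy sketch. The three-word vocabulary, the hand-made logits, and the attribute lookup table below are all invented for illustration; the real PPLM backpropagates through a classifier over GPT-2's hidden activations, whereas this sketch uses a finite-difference gradient over next-word logits to keep the example self-contained.

```python
import math

# Invented vocabulary and "language model" logits for the next word.
VOCAB = ["delicious", "terrible", "bland"]
lm_logits = [2.0, 1.9, 0.5]

# Invented attribute model: probability that each word expresses
# positive sentiment (in PPLM this is a learned classifier).
ATTRIBUTE = {"delicious": 0.9, "terrible": 0.05, "bland": 0.3}

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def attribute_log_prob(probs):
    # log p(attribute) under the current next-word distribution
    return math.log(sum(p * ATTRIBUTE[w] for w, p in zip(VOCAB, probs)))

def steer(logits, strength=1.0, steps=10, lr=0.5):
    """Nudge logits along the gradient of the attribute log-probability.

    `strength` plays the role of PPLM's strength parameter: it scales
    how hard the attribute model pushes on the LM's distribution.
    """
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        base = attribute_log_prob(probs)
        eps = 1e-4
        grad = []
        for i in range(len(logits)):
            bumped = list(logits)
            bumped[i] += eps
            grad.append((attribute_log_prob(softmax(bumped)) - base) / eps)
        logits = [x + strength * lr * g for x, g in zip(logits, grad)]
    return logits

before = softmax(lm_logits)
after = softmax(steer(lm_logits, strength=1.0))
```

After steering, probability mass shifts toward "delicious" and away from "terrible," so sampling the next word is more likely to produce positive-sentiment text; setting `strength=0` leaves the LM's distribution unchanged, mirroring how the strength parameter tunes the degree of control.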

Uber contrasted the "pluggable" nature of PPLM with other techniques that require training and fine-tuning the full model. For example, a team from Google Brain presented a paper at last year's NeurIPS conference that uses a generative-adversarial technique made popular by deep-learning "style-transfer" image processing systems. OpenAI created a system that uses reinforcement learning (RL) to incorporate human feedback in fine-tuning a GPT-2 LM. On Hacker News, user Gwern Branwen wrote:

What’s particularly nice [about PPLM] is if you can plug in a classifier for things like esthetics based on human ratings, along the lines of [OpenAI’s system] but better – why spend the enormous effort running [RL] to brute force the classifier to obtain desired text or image output, when you can just backprop through it and let the classifier itself tell you how exactly to improve the inputs?

PPLM source code is available on GitHub. A demo is also available on NLP research site HuggingFace and via a Google Colab notebook.
