The Computational Limits of Deep Learning Are Closer Than You Think
Deep in the bowels of the Smithsonian National Museum of American History in Washington DC sits a large metal cabinet the size of a walk-in wardrobe. The cabinet houses a remarkable computer – the front is covered in dials, switches and gauges and inside it is filled with potentiometers controlled by small electric motors. Behind one of the cabinet doors is a 20 x 20 array of light sensitive cells, a kind of artificial eye.
This is the Perceptron Mark I, a simplified electronic version of a biological neuron. It was designed in the late 1950s by the American psychologist Frank Rosenblatt at Cornell University, who taught it to recognize simple shapes such as triangles.
Rosenblatt’s work is now widely recognized as the foundation of modern artificial intelligence, but at the time it was controversial. Despite this early success, researchers were unable to build on it, not least because more complex pattern recognition required vastly more computational power than was available at the time. This insatiable appetite prevented further study of artificial neurons and the networks they create.
Today’s deep learning machines also eat power, lots of it. And that raises an interesting question about how much they will need in future. Is this appetite sustainable as the goals of AI become more ambitious?
Today we get an answer thanks to the work of Neil Thompson at the Massachusetts Institute of Technology in Cambridge and several colleagues. This team has measured the improved performance of deep learning systems in recent years and shown how it depends on increases in computing power.
By extrapolating this trend, they say that future advances will soon become unfeasible. “Progress along current lines is rapidly becoming economically, technically, and environmentally unsustainable,” say Thompson and colleagues, echoing the problems that emerged for Rosenblatt in the 1960s.
The team’s approach is relatively straightforward. They analyzed over 1000 papers on deep learning to understand how learning performance scales with computational power. The answer is that the correlation is clear and dramatic.
In 2009, for example, deep learning was too demanding for the computer processors of the time. “The turning point seems to have been when deep learning was ported to GPUs, initially yielding a 5 − 15× speed-up,” they say.
This provided the horsepower for a neural network called AlexNet, which famously wiped out the opposition in a 2012 image recognition challenge. The victory created huge and sustained interest in deep neural networks that continues to this day.
But while deep learning performance increased by 35x between 2012 and 2019, the computational power behind it increased by an order of magnitude each year. Indeed, Thompson and co say this and other evidence suggest the computational power for deep learning has increased 9 orders of magnitude faster than the performance.
So how much computational power will be required in future? Thompson and co say that the error rate for image recognition is currently 11.5 percent using 10^14 gigaflops of computational power at a cost of millions of dollars (i.e. 10^6 dollars).
They say achieving an error rate of just 1 percent will require 10^28 gigaflops. And extrapolating at the current rate, this will cost 10^20 dollars. By comparison, the total amount of money in the world right now is measured in trillions, i.e. 10^12 dollars.
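The arithmetic behind these figures can be checked with a short sketch. It assumes that cost grows in direct proportion to compute, which is one simple reading of "extrapolating at the current rate" and not necessarily the method Thompson and colleagues used. The function and variable names are illustrative only.

```python
import math

# Figures quoted in the article, as powers of ten.
current_compute_gflops = 10**14   # compute behind today's 11.5 percent error rate
current_cost_dollars = 10**6      # "millions of dollars"
target_compute_gflops = 10**28    # compute said to be needed for 1 percent error
world_money_dollars = 10**12      # "measured in trillions"


def extrapolate_cost(current_cost, current_compute, target_compute):
    """Scale cost linearly with compute (an assumption, not the paper's model)."""
    return current_cost * (target_compute // current_compute)


def orders_of_magnitude(ratio):
    """Return the base-10 order of magnitude of a positive ratio."""
    return round(math.log10(ratio))


target_cost_dollars = extrapolate_cost(
    current_cost_dollars, current_compute_gflops, target_compute_gflops
)

# How many orders of magnitude more compute the 1 percent target needs.
compute_growth_orders = orders_of_magnitude(
    target_compute_gflops // current_compute_gflops
)

# How many orders of magnitude the projected cost exceeds the world's money.
shortfall_orders = orders_of_magnitude(target_cost_dollars // world_money_dollars)

print(f"Extra compute needed: 10^{compute_growth_orders}")
print(f"Projected cost: 10^{orders_of_magnitude(target_cost_dollars)} dollars")
print(f"Exceeds the world's money by: 10^{shortfall_orders}")
```

Under that linear assumption, the 14 orders of magnitude of extra compute translate directly into 14 orders of magnitude of extra cost, which is how 10^6 dollars becomes 10^20 dollars.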
What’s more, the environmental cost of such a calculation would be enormous: the amount of carbon produced would increase by 14 orders of magnitude. “Progress along current lines is rapidly becoming economically, technically, and environmentally unsustainable,” conclude Thompson and colleagues.
The future isn’t entirely bleak, however. Thompson and co’s extrapolations assume that future deep learning systems will use the same kinds of computers that are available today.
But various new approaches offer much more efficient computation. For example, in some tasks the human brain can outperform the best supercomputers while running on little more than a bowl of porridge. Neuromorphic computing attempts to copy this. And quantum computing promises orders of magnitude more computing power with relatively little increase in power consumption.
Another option is to abandon deep learning entirely and concentrate on other forms of machine learning that are less power hungry.
Of course, there is no guarantee that these new techniques and technologies will work. But if they don’t, it’s hard to see how artificial intelligence will get much better than it is now.
Curiously, something like this happened after the Perceptron Mark I first appeared: progress stalled for a period that lasted decades and is now known as the AI winter. The Smithsonian doesn’t currently have the machine on display, but it surely marks a lesson worth remembering.