Humans help train their robot replacements
Source – autonews.com
Taking a human out of the driver’s seat requires putting a lot of humans to work behind the scenes.
The artificial intelligence in computers that operate self-driving vehicles is developed using vast amounts of data collected from public road tests. But to be useful, the data must be extensively labeled — a process known as annotation that can require hundreds of man-hours for a single hour of data collected.
“Data annotation is super labor-intensive,” said Sameep Tandon, CEO of self-driving startup Drive.ai in Mountain View, Calif. “Each hour of data collected takes almost 800 human hours to annotate. How are you going to scale that?”
To meet aggressive timelines for autonomous vehicle deployment — some manufacturers expect to introduce a Level 4 autonomous vehicle, which can operate with no human interaction under defined conditions, within five years — companies working with artificial intelligence must find a way to speed annotation without sacrificing quality.
Options companies are pursuing include crowdsourcing annotation to smartphone users and repurposing the very artificial intelligence technology they’re developing in the first place. Even when self-driving vehicles reach production, these processes will be valuable, teaching vehicles how to safely interact in a rapidly changing transportation environment.
“We’re very likely going to need some form of data annotation in the long term,” Tandon said. “New situations will come up in the future that cars today would not regularly see.”
In the past decade, deep-learning algorithms, mobile devices, powerful sensors and graphics processing units — which can evaluate tens of millions of operations in a second — have converged to create a highly capable driving “brain,” said Premkumar Natarajan, a computer vision expert at the University of Southern California’s Information Sciences Institute.
Though capable, the systems need to learn many situational details, such as driving faster in the left lane or reading pedestrians’ body language to know when they are likely to walk into the street.
Engineers “teach” these situations by feeding the computer thousands of images, typically collected via cameras on research vehicles. But these images are relatively meaningless to the computer if objects aren’t marked and labeled. The labels help the system differentiate obstacles along a route.
“Ultimately, the computer is acting on what you’ve fed it,” said Daryn Nakhuda, CEO of artificial intelligence training startup Mighty AI in Seattle. “It needs enough of what’s right and what’s wrong to really understand what it’s looking at.”
Though the computers can process information many times faster than humans, learning is still gradual and can be thrown off by incorrect or imprecise labels.
“The system slowly learns like the human brain,” said Bence Varga, head of European sales at AImotive, a Hungarian startup developing artificial intelligence software for self-driving vehicles. “Everything it sees has to be correctly labeled.”
Varga estimated it takes about 100,000 images and a week of teaching for a computer to safely learn a traffic situation. Robust, comprehensive training includes images from around the world and at different times of day, where traffic rules and situations vary widely.
City driving is more labor-intensive than highway driving, Varga said, because of pedestrians and other less-predictable variables.
With the high number of images required for training, and the need for accuracy, detailed data annotation is crucial to ensuring that autonomous vehicles are safe for use in public.
To ensure data are labeled correctly, some companies are doing it the old-fashioned way — by hand, which requires massive amounts of human labor.
Although Mighty AI has only about 50 full-time employees, it meets the high labor demand by outsourcing data annotation to anyone with a smartphone.
“We have a community where we can distribute the load,” Nakhuda said. “We try to break it up into bite-size pieces.”
The startup achieves this distribution through its Spare5 app, which anyone can download. The app presents data annotation as a series of tasks: users are given an image and perform labeling activities, such as drawing boxes around pedestrians, outlining vehicles or something as detailed as labeling individual pixels. Users must qualify to do these tasks by completing training exercises and are paid around 10 cents for each task — which typically entails labeling one type of object, such as pedestrians or cars, in an image.
App users who have proved to be skilled annotators review other users’ work before it is returned to Mighty AI.
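A task of the kind the article describes — one image, one object class, a set of boxes, then a peer review — might be modeled along these lines. This is a hypothetical schema for illustration only; it is not Mighty AI’s or Spare5’s actual data format, and all field names are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class BoundingBox:
    label: str      # e.g. "pedestrian" or "car"
    x: int          # top-left corner, in pixels
    y: int
    width: int
    height: int

@dataclass
class AnnotationTask:
    image_id: str
    object_type: str              # each task covers one type of object
    boxes: list = field(default_factory=list)
    reviewed: bool = False        # set once a skilled annotator checks it

    def add_box(self, box: BoundingBox) -> None:
        # Per the article, a single task labels only one class of object.
        if box.label != self.object_type:
            raise ValueError("box label does not match this task's object type")
        self.boxes.append(box)

# One worker labels the pedestrians in one image...
task = AnnotationTask(image_id="frame_0001.jpg", object_type="pedestrian")
task.add_box(BoundingBox("pedestrian", x=120, y=80, width=40, height=110))

# ...and a proven annotator reviews the work before it is returned.
task.reviewed = True
```

Splitting work into such single-class, single-image tasks is what lets the load be broken into the “bite-size pieces” Nakhuda describes.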
“For me, it’s kind of relaxing,” said Iris Hanlon, a Spare5 user in Summerdale, Ala. Hanlon, 52, is a disabled, single parent who came across the app while looking for extra sources of income.
The app doesn’t bring in a significant amount of money, but it has become an easy way for her to unwind. Hanlon said the purpose of the data is of little importance to her.
“I paint pixels so the computer can recognize objects,” she said. “But to be honest, I’m kind of nervous about self-driving cars.”
Though Mighty AI’s approach speeds data annotation while keeping a human in the loop, other companies are looking at ways to automate the process.
Varga said AImotive has developed a semiautomated method, labeling videos instead of individual images. The system also suggests annotations before the labeler draws them, which saves time.
“We used our software engineer skill set to build an annotation tool that speeds up annotation by 50 times,” he said, adding that the speed must further improve to scale operations.
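One way such a semiautomated video tool can save time — sketched here under assumed details, since AImotive’s actual method isn’t described — is to carry each frame’s boxes forward as suggestions for the next frame, so the annotator adjusts pre-drawn labels instead of drawing every box from scratch.

```python
def suggest_next_frame(prev_boxes, dx=0, dy=0):
    """Propose annotations for the next video frame by carrying forward
    the previous frame's boxes, shifted by an estimated motion (dx, dy).
    The annotator then corrects these suggestions rather than labeling
    from a blank frame. Illustrative sketch only."""
    return [
        {**box, "x": box["x"] + dx, "y": box["y"] + dy, "suggested": True}
        for box in prev_boxes
    ]

# A car labeled in frame t becomes a pre-filled suggestion in frame t+1,
# nudged right by an assumed 5-pixel motion estimate.
frame_t = [{"label": "car", "x": 200, "y": 150, "w": 60, "h": 40}]
frame_t1 = suggest_next_frame(frame_t, dx=5)
```

Because objects move only slightly between consecutive frames, most suggestions need minor adjustment or none, which is how labeling video rather than isolated images multiplies annotator throughput.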
Video games, which have proved useful as simulation platforms for autonomous vehicles, come with labeled data, said Drive.ai’s Tandon, making them a more attractive tool.
Drive.ai is working to automate its labeling process as well, applying artificial-intelligence technology to its annotation system. The system observes humans annotating data, then tries the task itself.
A human reviewer corrects its mistakes, enabling it to learn and improve.
“It’s like training wheels,” Tandon said. “Humans don’t have to do as much work as before. Instead of labeling every car, they’re correcting mistakes of the algorithm. The more it becomes automated, the less humans have to do.”
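The saving Tandon describes can be made concrete with a toy metric: once the algorithm proposes labels, the human workload shrinks from labeling everything to fixing only the mistakes. The function and data below are illustrative assumptions, not Drive.ai’s system.

```python
def correction_workload(predictions, human_labels):
    """Fraction of items a human reviewer must fix when correcting the
    algorithm's output, versus labeling all of them from scratch (a
    workload of 1.0). Hypothetical metric, for illustration."""
    wrong = sum(p != t for p, t in zip(predictions, human_labels))
    return wrong / len(human_labels)

# Toy example: the algorithm labels 10 objects and gets one wrong,
# so the reviewer touches 10% of the data instead of 100%.
preds = ["car"] * 8 + ["pedestrian"] * 2
truth = ["car"] * 9 + ["pedestrian"] * 1
workload = correction_workload(preds, truth)
```

Each correction also becomes fresh training data, so as the model improves, the human share of the work keeps falling — the “training wheels” gradually come off.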
Experts envision an even more efficient future, in which vehicles can learn and adapt to new situations on their own, cutting humans out of the loop.
“That’s the holy grail — unsupervised learning,” USC’s Natarajan said. “But who knows when that will be a reality?”