Google’s search software, and in particular the software that powers its Home smart speaker, is facing scrutiny this week for giving false, misleading, or otherwise objectionable answers to certain questions from users. Meanwhile, Uber is dealing with the fallout from a series of embarrassments, including a New York Times report that it misled the public about a dangerous incident involving one of its self-driving cars.
These might sound like unrelated setbacks for a pair of high-flying Silicon Valley companies. In fact, they’re symptoms of the same underlying problem. Uber’s dishonesty aside, the issue is not simply corporate malfeasance by bad actors. Rather, it’s a function of the incentives involved in developing new consumer technologies that are powered by artificial intelligence. And until it’s addressed, it’s a problem that is going to crop up more and more in the years to come—with increasingly serious consequences.
Google’s problem, highlighted by Adrianne Jeffries in a fascinating article for the Outline, revolves around a relatively new feature of its search engine called “featured snippets.” The A.I.-powered snippets draw from Google’s search results to offer direct answers to certain questions, rather than simply linking to information sources around the web. Often these answers are accurate and helpful. But not always. Ask, for example, “Is Obama planning a coup?” and it will inform you that he’s “in bed with the communist Chinese” and “may in fact be planning a communist coup d’état.”
And here’s what happens if you ask Google Home “is Obama planning a coup?” pic.twitter.com/MzmZqGOOal — Rory Cellan-Jones (@ruskin147), March 5, 2017
Various other queries will prompt Google to promote similarly ludicrous conspiracy theories, dispense misogynistic advice, or cite “facts” that are just plain wrong. This is bad enough when it happens on Google.com, where the bad answers appear at the top of a long list of search results. It’s more troubling when it happens on Google’s smart speaker, the Google Home, which encourages users to ask direct questions and responds confidently with its single top answer. Clearly the snippets are still a work in progress, and in the meantime they’re spreading misinformation that ranges from harmlessly funny to downright insidious.
The relevant Uber snafu relates to the self-driving cars it is developing and testing on public streets in some U.S. cities. In December, one of these self-driving taxis ran a red light in San Francisco, rolling straight through a pedestrian crosswalk in front of the Museum of Modern Art. (Thankfully, the crosswalk was empty at the time.) Uber initially pinned the mistake on human error, suggesting that an Uber employee had taken the wheel. But last week the New York Times reported that the car was actually driving itself when it ran the light.
So how does a self-driving Uber running a light relate to Google giving a false answer? On the surface, a vehicle and a search engine look like disparate technologies. But under the hood, so to speak, they have a lot in common.
Both are powered by cutting-edge machine learning technology, in which software is programmed to “learn” over time how to convert certain types of data into a desired output. Much of this learning happens through a sort of trial and error, as the software attempts to apply proven strategies to new situations.
For a self-driving car to determine whether it’s seeing a red light, its computer vision software must constantly extrapolate from its previous experience identifying red lights. Mistakes along the way are inevitable, because no two red lights in the world look exactly the same. But the more red lights the system sees, the better it will get at identifying new ones.
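The learning-by-example idea behind that paragraph can be sketched in a few lines of code. This is purely illustrative and reflects nothing about Uber’s or Google’s actual systems: the invented setup is a toy nearest-centroid classifier that labels a noisy (red, green) pixel reading as a “red” or “green” light, and the noise model and numbers are assumptions. The point it demonstrates is simply that more training examples tend to yield fewer mistakes on new inputs.

```python
# Toy sketch (invented data, not any company's real code): learn to
# recognize red vs. green lights from noisy labeled examples.
import random

random.seed(0)  # make the sketch reproducible

def sample():
    """One noisy observation of a red or green light, with its label."""
    label = random.choice(["red", "green"])
    base = (200, 40) if label == "red" else (40, 200)
    pixel = tuple(c + random.uniform(-80, 80) for c in base)
    return pixel, label

def train(n):
    """'Learn' by averaging n labeled observations into one centroid per label."""
    data = [sample() for _ in range(n)]
    centroids = {}
    for label in ("red", "green"):
        points = [p for p, l in data if l == label]
        if points:
            centroids[label] = tuple(sum(c) / len(points) for c in zip(*points))
    return centroids

def classify(centroids, pixel):
    """Label a new pixel by its nearest learned centroid."""
    return min(centroids,
               key=lambda l: sum((a - b) ** 2 for a, b in zip(pixel, centroids[l])))

def accuracy(centroids, trials=500):
    """Fraction of fresh observations the classifier labels correctly."""
    tests = [sample() for _ in range(trials)]
    return sum(classify(centroids, p) == l for p, l in tests) / trials

# More training data generally means better centroids, hence fewer mistakes.
for n in (5, 50, 500):
    print(f"{n:>3} examples -> accuracy {accuracy(train(n)):.2f}")
```

No real perception system is this simple, but the shape is the same: the model’s only notion of “red light” is a summary of the examples it has already seen, so its errors shrink only as its experience grows.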
In a question-answering engine like Google’s, the software must extrapolate from its experience with similar queries to identify the information source most likely to answer each new question. The more popular a given query, the more data the system has to draw from in making that assessment. When it gives a misleading or unhelpful answer, it relies on feedback from the user to flag the mistake for its engineers to address.
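The feedback loop described above can be sketched very loosely as well. Everything here is invented for illustration: the source names, the scores, and the update rule are assumptions, not Google’s actual ranking logic. The sketch shows how a single piece of negative user feedback can flip which source a simple ranker picks as its top answer.

```python
# Hypothetical sketch of an answer-picking feedback loop. Source names
# and scores are made up; real ranking systems are far more complex.

def pick_answer(sources):
    """Return the highest-scoring candidate source."""
    return max(sources, key=sources.get)

def record_feedback(sources, source, helpful, step=0.3):
    """Nudge a source's score up or down based on user feedback."""
    sources[source] += step if helpful else -step

sources = {"conspiracy-blog.example": 0.9, "encyclopedia.example": 0.7}

# The engine initially prefers the higher-scoring (but unreliable) source.
first = pick_answer(sources)   # "conspiracy-blog.example"
record_feedback(sources, first, helpful=False)

# After one round of negative feedback, the ranking flips.
second = pick_answer(sources)  # "encyclopedia.example"
print(first, "->", second)
```

The catch, as the paragraph notes, is that rarely asked questions generate little data and little feedback, so a bad top answer for an obscure query can persist until a user happens to flag it.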
There are, of course, key differences in these two examples, including the nature of the input data, the programming techniques required to process and parse that data, and the mechanisms for user feedback. But the point is that consumer technologies powered by artificial intelligence differ in an important way from other types of consumer technologies. Rather than performing a set function in a rote manner, like a toaster or a washing machine, the software makes decisions in the face of novel problems by applying answers that have worked in similar situations in the past. In a manner of speaking, then, it learns by doing. And when the desired output is sufficiently complex or abstract—safely driving a car, for instance, or giving true answers to questions about the world—it often takes a lot of doing for the software to get really good.
This implies that such A.I. technologies are likely to start out pretty bad, at least relatively speaking. And the quickest way to improve them is often to send them out into the world.
That might be OK when we’re talking about a closed system such as Netflix’s movie recommendations, which primarily affects how you use that specific service. We don’t expect Netflix’s recommendations to be perfect, and if they’re wrong, all we have to do is scroll down. It becomes worrisome when these systems have the potential to impact the outside world, whether that’s by running over a pedestrian or promoting a dangerous conspiracy theory such as #pizzagate. It’s particularly problematic in realms where mistakes are costly, such as driving. But even Facebook’s machine-powered news feed rankings, originally designed to make sure you saw the most important life updates from your friends, have come under scrutiny lately for their role in spreading fake news that could affect public opinion and elections.
Just as a ranking algorithm requires a lot of practice in order to reliably sift credible information sources from dubious ones, a self-driving system is going to make a lot of mistakes on the way to becoming a near-perfect artificial driver. And because no lab can simulate the nearly infinite variety of circumstances it will encounter in the real world, some of those mistakes will inevitably happen on public streets.
None of this implies that Facebook, Google, or Uber is evil, or even necessarily negligent, though you could make that argument about any of the three. Certainly Uber’s attempt to cover up its software’s error should give us pause about trusting the company’s products with our lives. But if these companies want to build A.I. systems that can be widely trusted, they have to start by rolling out imperfect versions that improve with use. The question is: How much imperfection should we accept?
So far, we’re leaving it to the companies themselves to strike that balance. But absent specific regulatory standards, there are good reasons to doubt that they’ll be as careful as we’d like. That’s because there’s a strong market incentive for them to release prototypes into the wild earlier rather than later. And if they don’t, someone else will.
Google learned this the hard way with the very self-driving technology that is now earning Uber so much attention. Google pioneered the technology long before Uber got involved. Its engineers acted with due caution, testing it for years without releasing it to the public and trying to get it as close to perfect as possible. But Uber and others, including Tesla, saw that this made the development process painfully slow. So when they built their own self-driving systems, they rolled them out to consumers much earlier in the process, thereby gaining data more rapidly than Google could. As a result, many of Google’s top engineers bolted to these upstart rivals, and Google’s own technology lost much of its competitive advantage.
Machine learning is expected to power advances in fields as diverse as finance, journalism, and medical diagnosis. As these technologies proliferate, the same dynamic that encouraged Uber to put flawed self-driving cars on city streets will push other companies to release products with potentially dangerous flaws that we haven’t yet anticipated. The moral here is not that we should ban artificial intelligence or demonize the companies that develop it. But there’s a strong need for regulators and consumer protection agencies to better understand these technologies, so that they can apply the proper scrutiny and enforce standards where necessary. Artificial intelligence from the likes of Google and Uber might reshape our lives in wonderful ways. But in the meantime, we should be wary of their early efforts.