In 2017, a Palestinian construction worker in the West Bank settlement of Beitar Illit, near Jerusalem, posted a picture of himself on Facebook in which he was leaning against a bulldozer. Shortly after, Israeli police arrested him on suspicion that he was planning an attack, because the caption of his post read "attack them."
Except that it didn't. The real caption of the post was "good morning" in Arabic. But for some unknown reason, Facebook's artificial intelligence–powered translation service translated the text to "hurt them" in English or "attack them" in Hebrew. The Israeli Defense Force uses Facebook's automated translation to monitor the accounts of Palestinian users for possible threats. In this case, they trusted Facebook's AI enough not to have the post checked by an Arabic-speaking officer before making the arrest.
The Palestinian worker was eventually released after the mistake came to light—but not before he underwent hours of questioning. Facebook apologized for the mistake and said that it took steps to correct it.
Advances in deep learning and neural networks have improved the precision of AI algorithms and enabled the automation of tasks that were previously thought to be the exclusive domain of human intelligence. But the precision in performance comes at a cost to transparency. Unlike with traditional software, we don't always have an exact idea of how deep-learning algorithms work. Troubleshooting them is very difficult, and they often fail in unanticipated and unexplainable ways. Even the creators of deep-learning algorithms are often hard-pressed to investigate and interpret the logic behind their decisions.
The failure of Facebook's machine-translation system is just one of many cases in which the opacity of deep-learning algorithms has caused serious trouble.
What's widely known as the AI "black box" problem has become the focus of academic institutions, government agencies, and tech companies that are researching methods to explain AI decisions or to create AI that is more transparent and open to investigation.
Their efforts will be crucial to the development of the AI industry—especially as deep learning finds its way into critical domains where mistakes can have life-changing consequences.
The Rise of Deep Learning
In classical approaches to creating software, developers meticulously specify the rules that define the behavior of a system. In contrast, deep-learning algorithms develop their behavior by examining and comparing numerous examples. The concept and science behind deep learning have existed for decades, but only in recent years has the abundance of data and compute resources pushed it from research labs and academic papers into practical domains. And with its rise in popularity, deep learning has introduced changes in the way developers create software.
For Kate Saenko, who has been involved in computer vision since the early 2000s, those changes are very tangible. Computer vision is a field of artificial intelligence that enables computers to process and understand the context and content of digital images and videos. It is used in a wide range of applications, including image classification, facial recognition, and the automated diagnosis of MRI and X-ray images. It's one of the fields where rules-based programming has historically struggled, because the number of rules developers have to write down is virtually endless.
"Back in those days, we had a very different approach, where first you designed your features, and a lot of thought and design process went into that," said Saenko, an associate professor at the Department of Computer Science at Boston University.
For instance, if developers wanted to detect cats, they had to write code manually that could probe pictures for cat features such as heads or tails. "You designed these features first, and then you designed methods to extract those features. And then you would do machine learning on top of the features," Saenko said.
The process was arduous and lengthy because each of those features can vary in shape and size, depending on the species of the animal and the angle at which the picture was taken.
In contrast, a deep-learning algorithm that is meant to classify pictures as "cat" or "not cat" only needs to be given many labeled example pictures. It creates its own rules for detecting cats in pictures and performs much better than previous methods that involved a lot of manually written features. In 2012, researchers from the University of Toronto used deep learning for the first time to win a famous computer-vision competition and improve the field by a large margin. Deep learning has since found its way into many other fields, including voice recognition, natural language processing, fraud detection, and the arts.
"The reason deep learning is so successful is because there's very little design that goes into neural networks," said Saenko. "We just let the machine discover the most useful pattern from raw data. We're not going to tell it what to look for. We're not going to tell it any high-level features. We let it search through all of its training data and find those patterns that lead to the highest accuracy in solving the problem."
The Challenges of Debugging Deep-Learning Software
The benefits in accuracy that deep learning provides are not without their trade-offs.
"In classical computer programming, you have precision with the algorithm. You know exactly in mathematical terms what you are doing," said Sheldon Fernandez, CEO of DarwinAI, an Ontario-based AI company. "With deep learning, the behavior is data driven. You are not prescribing behavior to the system. You are saying, 'here's the data, figure out what the behavior is.' That is an inherently fuzzy and statistical approach."
This means that when you let a neural network develop its own behavioral model, you lose visibility into its reasoning process. In most cases, the internal parameters and connections that neural networks develop are so numerous and complex that they become too difficult for humans to understand.
A simplified view of how data flows in neural networks (Source: Wikipedia)
As Saenko explained, when using deep learning, engineers must choose "between how much human-imposed, top-down design you put into something to make it more interpretable versus how much performance you lose as a result of that."
Also, the reasoning that a neural network develops does not necessarily reflect that of humans, even though it produces accurate results most of the time.
"The real challenge of deep learning is that it's not modeling, necessarily, the world around it. It's modeling the data it's getting," Fernandez said. "And that modeling often includes bias and problematic correlations. It can include nonsensical correlations. And all those things can find [their] way into the behavior of the system."
A while ago, Saenko developed a deep-learning algorithm that captioned images and videos with impressive accuracy. The problem was that her captioning application had developed a bias toward certain types of decisions, a problem that is common in deep-learning algorithms. For instance, in cooking videos, it often captioned kitchen workers as women, even when they were men. On the other hand, in science videos, the algorithm was more inclined to label scientists as men. But she couldn't determine for certain why the network was making the mistakes. And without being able to find the reasons for those errors, she couldn't fix them.
In some cases, the opacity of AI algorithms can cause frustration. But in other cases, not being able to explain the reasoning behind AI decisions can have more serious consequences.
In 2017, Fernandez, then a computer scientist at Avanade, an IT consulting company, was using deep learning to help a bank in the UK detect fraudulent transactions. His team trained the deep neural network on all the historical data of the bank and let it figure out for itself the patterns that defined fraudulent transactions.
Their algorithm was able to detect fraud 3 or 4 percent better than the client's best-in-class system. The problem was that they had no idea why it was performing better. "We had no insight into what data the neural network was triggering off in order to make better predictions," Fernandez said.
Naturally, the client could not entrust sensitive financial decision-making to an automated system if they couldn't understand the logic behind its decisions.
The financial industry is one of several domains where interpretability has become a requirement for the use of AI algorithms in critical decisions. Other fields where the opacity of deep learning has become a hurdle include health care and medicine, hiring and human resources, criminal justice, and the military. In all these domains, a bad decision can have a negative and irreversible effect on the career, health, or life of one or many humans and can have severe legal consequences for the person who makes those decisions. That's why experts are generally skeptical about trusting an automated system to make decisions on their behalf.
Moreover, the European Union's General Data Protection Regulation (GDPR), which went into effect in May 2018, requires organizations that use automated decision-making to provide meaningful information about the data and logic involved in those decisions when users or customers demand it. The GDPR, which is legally binding for any company or organization that does business in the EU, is considered a de facto gold standard for all tech companies handling personal information.
"One of the real powers of explainable AI is to illustrate how the AI is triggering data points to reach a decision, and surfacing those data points to a human for verification," Fernandez said.
Investigating the AI Black Box
There are generally two pathways toward making decisions made by neural networks interpretable. The first, called "local explanations," tries to understand the motives and parameters behind individual decisions made by an AI algorithm. The second, called "global explanations," tries to describe the general reasoning logic of an AI model.
After her neural networks failed to reveal the reasons they were mislabelling videos and pictures, Saenko and a team of researchers at Boston University engaged in a project to find the parameters that influenced those decisions.
What came out of the effort was RISE, a method that tries to explain decisions made by AI algorithms. Short for "randomized input sampling for explanation of black-box models," RISE is a local explanation model.
When you provide an image-classification network with an image input, what it returns is a set of classes, each associated with a probability. Normally, you'd have no insight into how the AI reached that decision. But RISE provides you with a heatmap that describes which parts of the image are contributing to each of those output classes.
For instance, in the above image, it's clear that the network in question is mistaking brown sheep for cows, which might mean that it hasn't been trained on enough examples of brown sheep. This type of problem happens often. Using the RISE method, Saenko was able to discover that her neural networks were inferring the gender of the people in the cooking videos from pots and pans and other objects that appeared in the background instead of examining their facial and physical features.
The idea behind RISE is to randomly obscure parts of the input image and run it through the neural network to observe how the changes affect the output probabilities. By repeating the masking process many times, RISE is able to discern which parts of the image are more important to each output class.
Since RISE works by manipulating inputs, it is a "black box" explanation method, which means it is model-agnostic: It can work with any AI model, without the need to access its inner workings or its training examples.
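The masking idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the researchers' implementation: the names `rise_saliency` and `toy_model` are hypothetical, the "image" is a small grid of brightness values, the masks are plain random grids of hidden and visible cells, and the published method includes refinements that are omitted here. The "model" is a stand-in function that is only ever called, never inspected, which is what makes the approach black-box.

```python
import random

def rise_saliency(model, image, n_masks=2000, cell=2, keep_prob=0.5, seed=0):
    """Estimate per-pixel importance for a black-box `model` (image -> score)."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    gh, gw = (h + cell - 1) // cell, (w + cell - 1) // cell
    saliency = [[0.0] * w for _ in range(h)]
    for _ in range(n_masks):
        # Coarse random grid: 1 keeps a cell visible, 0 hides it.
        grid = [[1 if rng.random() < keep_prob else 0 for _ in range(gw)]
                for _ in range(gh)]
        mask = [[grid[y // cell][x // cell] for x in range(w)] for y in range(h)]
        masked = [[image[y][x] * mask[y][x] for x in range(w)] for y in range(h)]
        score = model(masked)  # the only interaction with the model
        # Pixels that were visible when the score was high accumulate credit.
        for y in range(h):
            for x in range(w):
                saliency[y][x] += score * mask[y][x]
    norm = n_masks * keep_prob
    return [[value / norm for value in row] for row in saliency]

def toy_model(img):
    # Hypothetical black box: its "confidence" is the mean brightness
    # of the top-left 2x2 patch, and it ignores everything else.
    return (img[0][0] + img[0][1] + img[1][0] + img[1][1]) / 4.0

image = [[1.0] * 4 for _ in range(4)]
heatmap = rise_saliency(toy_model, image)
for row in heatmap:
    print(" ".join(f"{value:.2f}" for value in row))
```

In this toy setup the top-left patch should receive visibly higher values than the rest of the grid, which mirrors how a heatmap reveals the region a classifier is actually relying on.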
Methods such as RISE can also help build trust with the end users of AI algorithms in fields such as radiology. "When you give a doctor an AI image model that can look at a medical image or an MRI and detect cancer with very high accuracy, they often still don't trust it because they don't know why it's making that decision," Saenko said. RISE can clarify why an AI is making a diagnosis by pointing out which parts of the image it is considering relevant to the symptoms it is reporting.
Looking for What Isn't There
Most AI explanation methods focus on what's present in the input. But sometimes, focusing on what's missing can provide a better picture of the reasoning behind AI decisions.
"If you want to describe a colleague to me, a very natural kind of explanation you might use is 'He has long hair and is tall, but he doesn't wear glasses,'" said Amit Dhurandhar, scientist at IBM Research. "However, none of the methods that do local explanations of AI models explicitly capture this idea."
The Contrastive Explanations Method (CEM), a joint project by researchers at IBM and the University of Michigan, tries to describe decisions made by neural networks by pointing out what the network is not seeing in the input. Like RISE, CEM is a local explanation method, which means it tries to interpret individual decisions made by an AI algorithm.
Basically, like other local explanation methods, CEM tries to tell you why a certain neural network has classified your input in a particular way. But it also tells you what could be added to the input to change its class. For instance, the image below was extracted from a classifier for digits that was run through the CEM probe. On the left is the original input image and the original prediction of the neural network. The middle images highlight in cyan which parts of the image contributed to the original prediction. On the right, the pink highlights show the minimal additions that could lead to a change in prediction.
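The notion of a "minimal addition that changes the prediction" can be illustrated with a toy example. The sketch below is not the CEM algorithm itself; it is a brute-force stand-in that works only for tiny inputs. The classifier, its weights, and the function names `predict` and `pertinent_negative` are all hypothetical and chosen for illustration.

```python
from itertools import combinations

def predict(weights, bias, x):
    """Hypothetical linear classifier: 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if total > 0 else 0

def pertinent_negative(classify, x):
    """Smallest set of absent (0) features that, if switched on, changes the class."""
    original = classify(x)
    absent = [i for i, v in enumerate(x) if v == 0]
    # Try single features first, then pairs, and so on, so the first
    # hit is guaranteed to be a smallest possible addition.
    for size in range(1, len(absent) + 1):
        for subset in combinations(absent, size):
            candidate = list(x)
            for i in subset:
                candidate[i] = 1
            if classify(candidate) != original:
                return subset
    return None  # no addition changes the prediction

# Assumed toy weights: feature 0 argues for class 1, features 1 and 2 against it.
weights, bias = [2.0, -1.0, -1.0, 0.5], -0.5
classify = lambda x: predict(weights, bias, x)

x = [1, 0, 0, 0]
print(classify(x))                      # 1
print(pertinent_negative(classify, x))  # (1, 2)
```

Read as an explanation, the result says: "this input is class 1 because feature 0 is present and features 1 and 2 are both absent." An exhaustive search like this grows exponentially with input size, so it is only meant to convey the idea.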
As Dhurandhar explained, medical diagnosis is one of the fields that stands to benefit greatly from this explanation method, because doctors reach conclusions not only by looking for the symptoms that are present but also by looking for those that are absent.
"If you go to a doctor, they will register facts such as whether your heart rate was normal. But they will also write things like arrhythmia was absent and a bunch of things that were not present," Dhurandhar said. "The reason is that in your next checkup, if you have an issue, the doctor will know what you were checked for. Also, if you switch a doctor, it's easy for the other person to know your diagnosis process."
Therefore, with methods like CEM, a doctor will be better positioned to probe an automated decision both for the positive and negative contributing factors.
Understanding the General Behavior of AI Models
While local models are helpful in investigating individual AI decisions, some domains require full transparency of the behavioral model of the software they use.
A few years ago, Dhurandhar developed a deep-learning model that helped a semiconductor-chip-manufacturing company predict which chips would likely become defective further down the production line. The model performed much better than the company's previous prediction software and enabled it to discard or fix chips at early production stages and improve its yield by several percent, which translated to millions of dollars in cost savings per year.
But the engineers controlling the system, whose jobs were on the line, weren't willing to let the AI make decisions without knowing exactly how it worked. What they wanted was to improve their original software, not to replace it with a black box that, albeit more accurate, would not provide them with insights on how it worked.
"Since in many domains, there's a human making the final decision—even if you have a higher performing model, if the person doesn't understand, the overall performance of the system might be lower than a lower-performing model that the person is able to understand," Dhurandhar said.
Improving Simple Models with Confidence Profiles, another AI-explanation method Dhurandhar helped develop with other researchers at IBM, addresses this issue by trying to transfer the behavior of neural networks to interpretable software structures. This is a global explanation model, which means instead of trying to interpret individual decisions, it tries to paint a general picture of how an AI model works.
Dhurandhar describes the "improving simple models" method as trying to achieve "best of both worlds," which means to benefit from the improvements that a neural network provides while adhering to other constraints that domain experts impose.
The method involves inserting software probes in the various layers of a neural network and monitoring its behavior as it trains on examples and evolves. In later stages, those probes try to replicate the observed behavior of the network in a decision tree, rule-based structure, or another model that is interpretable. In the case of the semiconductor company, Dhurandhar was able to map the behavior of the neural network onto the software structure that the company already used.
The resulting model did not perform as well as the neural network but managed to improve the performance of the company's original software considerably while also maintaining its interpretability. Effectively, the engineers were willing to trade away some of the accuracy of the neural network; in exchange, they gained full visibility into and control over how their prediction software worked.
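A much simpler relative of this idea is the "surrogate model": fit an interpretable model to the predictions of an opaque one and measure how closely it agrees. The sketch below does not use probes inside the network and is not the Confidence Profiles method; it treats the complex model as a function to be imitated. The `black_box` function, its two inputs, and `fit_stump` are hypothetical names chosen for illustration.

```python
def black_box(row):
    # Hypothetical stand-in for an opaque model: flags a chip as likely
    # defective (1) from two made-up sensor readings.
    temperature, pressure = row
    return 1 if 9 * temperature + pressure > 500 else 0

def fit_stump(rows, labels):
    """Find the single 'feature > threshold' rule that best reproduces `labels`."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            preds = [1 if row[feature] > threshold else 0 for row in rows]
            agree = sum(p == y for p, y in zip(preds, labels))
            if best is None or agree > best[0]:
                best = (agree, feature, threshold)
    agree, feature, threshold = best
    # Fidelity: fraction of inputs on which the rule matches the black box.
    return feature, threshold, agree / len(rows)

# Query the black box on a grid of inputs, then imitate its answers.
rows = [(t, p) for t in range(0, 101, 10) for p in range(0, 101, 25)]
labels = [black_box(row) for row in rows]
feature, threshold, fidelity = fit_stump(rows, labels)
print(f"rule: feature {feature} > {threshold}, fidelity {fidelity:.2f}")
```

The one-line rule agrees with the black box on most, but not all, inputs. That gap is the trade-off the engineers accepted: some accuracy given up in return for a model they can read.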
Using AI to Understand AI
Fernandez, who co-founded DarwinAI with University of Waterloo professor Alex Wong, arrived at AI explainability through a different approach. As an academic, Wong, who had years of experience in computer vision, had worked on a technique called evolutionary synthesis (it's where the name DarwinAI comes from). Evolutionary synthesis is meant to make neural networks more efficient by treating them like organisms that evolve over time and shed their redundant components.
At DarwinAI, Wong helped develop Generative Synthesis, a new technology that builds on the ideas of evolutionary synthesis and takes them a step further.
"The idea behind Generative Synthesis is to take artificial intelligence itself and see if we can better understand and develop neural networks," Fernandez said.
Generative Synthesis uses machine learning to probe and understand neural networks in a fundamental way. It then develops a complex mathematical representation of the model, which it uses to generate a second neural network that is just as accurate as the first one but is also more compact and faster. Making neural networks smaller makes them deployable in UAVs (unmanned aerial vehicles), driverless cars, and other edge environments that are resource-constrained or need real-time access to AI functionality.
But a byproduct of this approach is a thorough understanding of the way the neural network operates. By having monitored and documented the entire evolution of a neural network, DarwinAI's Generative Synthesis approach was able to point out the factors and data points that influenced each of the decisions its neural networks made.
"We had a kind of roundabout way of getting to the technology, but it's really powerful in trying to understand how these neural networks are making decisions," Fernandez said.
Beyond Finding Mistakes
"There are correlations that are demonstrably bad, that just shouldn't happen, such as bias. We need to recognize it in the system and eradicate it," Fernandez said. In the future, explainability methods can help find and fix those errors before they lead to an unjustified arrest or an unfairly declined loan.
But the benefits of interpreting deep-learning models expand beyond troubleshooting and fixing errors. In some cases, they can help shed light on previously unknown aspects of the domains they're deployed in.
"Explainability can also work in another direction. It can also give you insights into correlations that you didn't know existed," Fernandez said. During his work on applying deep learning to the banking sector, Fernandez's exploration of interpretable networks helped uncover new insights into the characteristics of fraudulent transactions.
For example, thanks to explainable AI, they discovered that if a person is using the Chrome browser, the chances of a transaction being fraudulent are higher than if they're using Internet Explorer or Safari. That's because, as technical people, cybercriminals are much more likely to use Chrome than their operating system's preinstalled browser.
In another case, a travel agency was able to discover that some people were interested in hotels located on street corners. They later added this as an option for their clients.
"Getting these insights is just as important as eradicating bias, because these insights are valuable to business," Fernandez said.