In what can be described as a kind of arms race, tech companies have been amassing large volumes of user data to hone the artificial intelligence algorithms that power their applications and platforms. Thus far, they've mostly been able to evade accountability when their practices have pushed them into legally and ethically gray areas.
But that might change on May 25, when the European Union's General Data Protection Regulation (GDPR) comes into effect. The GDPR will impose unprecedented restrictions on the collection and handling of user data in the EU region and slap heavy penalties on companies that fail to comply.
That might sound like bad news for companies that use AI algorithms, which have benefited from lax data-collection regulations (and lengthy, boring, and ambiguous terms of service documents). Some fear that stricter rules will hamper innovation and deployment of artificial intelligence in many applications and domains. Others believe the new regulation will create a foundation on which AI applications can become more reliable and trustworthy.
Whatever the case, the AI industry is in for a big shift in the GDPR era.
Data Ownership and Privacy
"GDPR is a big deal for AI, because it necessitates that we think differently about how we collect and use data," says Tim Estes, founder and president of Digital Reasoning. "For too long, tech companies have insisted that in order to receive value from their products and services, you had to give up your data."
Previously, companies needed only vague consent from users to collect all sorts of data. "AI has helped to keep the big-data hype alive—providing yet another reason why businesses should collect and mine all available data," Estes says. "Many companies have started to implement AI to analyze their data just because they think they should—without thought to the impact on user privacy or data ownership."
The most basic premise of GDPR is that data belongs to users. Under GDPR, companies will have to reveal the full scope of the information they collect, how they use it, and how they'll protect it from unauthorized access. The new rules will force AI companies to be more meticulous about the data they collect rather than engaging in wanton hoarding, processing, and sharing of user information.
The Right to Be Forgotten
GDPR gives users the power to demand that a company erase all their data from its servers. This won't sit well with AI companies, which have a vested interest in keeping as much user data as possible to perform tasks such as predicting trends and user behavior.
"At the end of the day, GDPR is all about how you collect and manage the data and not necessarily about how much data you have," says Maryna Burushkina, CEO of GrowthChannel. "The main difficulty that most companies will face is not so much about getting opt-ins, but more about managing the data, communicating the use of data to users, and giving users the opportunity to delete it."
AI companies will have to take extra steps to anonymize their data if they still want access to these insights. But other challenges face companies that already have large stores of user data.
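As a rough illustration of what that anonymization step can look like, the hypothetical sketch below replaces direct identifiers with a salted hash so aggregate analysis still works while records no longer point to a named person. The field names, salt handling, and `pseudonymize` function are invented for illustration; note that under GDPR this kind of pseudonymization is weaker than true anonymization, since the salted hashes could in principle be re-linked.

```python
import hashlib

# Assumed salt for illustration; in practice it would be stored
# securely and rotated, not hard-coded.
SALT = b"example-rotation-salt"

def pseudonymize(record: dict) -> dict:
    """Return a copy of the record with PII fields replaced by opaque tokens."""
    pii_fields = {"name", "email", "credit_card_number"}
    out = {}
    for key, value in record.items():
        if key in pii_fields:
            digest = hashlib.sha256(SALT + str(value).encode()).hexdigest()
            out[key] = digest[:16]  # truncated hash as an opaque token
        else:
            out[key] = value  # non-PII fields survive unchanged
    return out

record = {"name": "Jane Doe", "email": "jane@example.com", "purchases": 7}
anon = pseudonymize(record)
print(anon["purchases"])        # analytics-relevant fields are intact
print(anon["name"] != "Jane Doe")
```

Because the hashing is deterministic, the same user maps to the same token across records, so trend analysis over pseudonymized data remains possible.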
"Under GDPR, if a company wants to erase a specific PII [Personally Identifiable Information], then they have to ensure it's erased everywhere," says Amnon Drori, CEO of Octopai. This can be a daunting task to perform manually when your data is scattered across different servers and stored in different structured and unstructured formats.
For instance, when deleting a user's credit card number (certainly a sensitive piece of information), companies will have to look into every report, database, database object, and ETL where the information is stored. "Sometimes we see different metadata names for the same item: for example, 'credit card number', 'cc number', 'credit c number', 'card number', 'credit card no.'...the list goes on and on," says Drori. It is often impossible to know where to look, the process can take weeks or even months, and, like many manual processes, it is prone to human error, Drori says.
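A minimal sketch of how variant metadata names like Drori's examples might be caught programmatically: tokenize each column name and check it against sets of "card"-like and "number"-like keywords. The token sets and matching rules here are assumptions for illustration, not a description of Octopai's product.

```python
import re

# Assumed keyword sets; a real tool would maintain much richer dictionaries.
CARD_TOKENS = {"credit", "card", "cc", "c"}
NUMBER_TOKENS = {"number", "no", "num"}

def looks_like_card_number(name: str) -> bool:
    """Heuristically flag a metadata name as a likely credit-card field."""
    # Split on whitespace, underscores, and hyphens; drop trailing periods.
    tokens = {t.strip(".") for t in re.split(r"[\s_\-]+", name.lower()) if t}
    return bool(tokens & CARD_TOKENS) and bool(tokens & NUMBER_TOKENS)

variants = ["credit card number", "cc number", "credit c number",
            "card number", "credit card no."]
print(all(looks_like_card_number(v) for v in variants))   # all five match
print(looks_like_card_number("order number"))             # but not this one
```

Even a crude matcher like this shows why automated metadata scanning beats manually grepping every report and ETL for five spellings of the same field.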
GDPR will also raise the cost of human mistakes in handling data. "This is why so many companies today are seeking an automated solution to accurately manage their metadata," Drori says. Perhaps ironically, AI itself can be a solution in this regard. AI-powered metadata management tools can scan all data sources within an organization and consolidate relationships between different tools and data sources.
The Right to Explanation
One of the most significant parts of GDPR regarding AI is what has become known as the "right to explanation." The regulation states that companies must notify users about "the existence of automated decision-making" and provide them with "meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject."
This basically means that users must know when they're being directly or indirectly subject to AI algorithms and should be able to challenge the decisions those algorithms make and request proof of how the conclusion was derived. This will be one of the biggest challenges that the AI industry will face.
Deep neural networks, the main technology behind contemporary AI algorithms, are complex software structures that create their own functionality rules by analyzing large sets of data and finding correlations and patterns. As neural networks grow more complex, their behavior becomes increasingly hard to decompose. Often, not even engineers can explain the reasons behind the decisions their AI algorithms make.
Called the "black box" problem, the inexplicability of AI algorithms has made it hard to implement them in courtroom decisions, law enforcement, loan and credit applications, recruiting, healthcare, and other critical domains. But without any legal leverage, AI companies had little incentive to make their AI algorithms more transparent, especially when they were closely tied to their trade secrets.
Now, GDPR will hold AI companies to account for the decisions their algorithms make.
"As part of GDPR, organizations are responsible to clearly describe the processing method in human language while requesting consent from the subject," says Pascal Geenens, security researcher at Radware. "As deep learning evolves, and data scientists are unable to characterize a deterministic nature behind the neural network's reasoning, this description might become more complex and difficult to explain."
Basically, GDPR is about making the humans who process the data accountable, Geenens says. So if you're using machine learning algorithms to do the processing, you must design them in a way that will enable you to explain the decisions they make on your behalf.
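One way to design for explainability, as Geenens suggests, is to prefer decision logic whose output can be decomposed into per-feature contributions that accompany every decision. The sketch below uses a simple linear scorer; the feature names, weights, and threshold are invented examples, and a real credit model would be far more involved.

```python
# Hypothetical weights for a loan decision; positive values favor approval.
WEIGHTS = {"income": 0.4, "debt": -0.6, "years_employed": 0.2}
THRESHOLD = 0.5

def decide(applicant: dict):
    """Return (approved, contributions), where contributions break the
    score down per feature -- the raw material for a human explanation."""
    contributions = {f: WEIGHTS[f] * applicant[f] for f in WEIGHTS}
    score = sum(contributions.values())
    return score >= THRESHOLD, contributions

approved, why = decide({"income": 3.0, "debt": 1.0, "years_employed": 2.0})
# score = 0.4*3.0 - 0.6*1.0 + 0.2*2.0 = 1.0, so the applicant is approved
print(approved)
print(why)
```

Because every decision ships with its contribution breakdown, the company can tell a data subject not just *that* an algorithm decided, but *which* factors drove the outcome and by how much.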
A handful of organizations are trying to develop technologies to make AI more transparent. Prominent among them is DARPA's Explainable AI (XAI), a research project that aims to make AI-based decisions understandable.
GDPR will also affect organizations that make their data available to third parties. A prominent example of such a company is Facebook; in its Cambridge Analytica scandal, the social media giant failed to prevent the data mining firm from collecting and abusing the data of 87 million users. But GDPR will also have implications for companies that outsource their AI functionalities and make their data available to AI providers.
"While many assume that AI providers are like other service providers—simply offering their tech in exchange for monetary compensation—the truth is, AI providers also enter business partnerships as a way to build upon and evolve their technology," says Estes, the chief executive from Digital Reasoning. This means that an AI provider might want to hold on to a client's data to further train its own algorithms and use it in other domains.
For instance, an AI provider that's helping a healthcare organization find patterns in symptoms and improve diagnoses might come across a data set that improves its proprietary algorithms. The AI company might then want to leverage the data to improve its algorithms for other types of patient care, to evolve its capabilities to help other healthcare providers. Under GDPR, the healthcare organization in charge of the data will be held to account for any unethical usage by the AI provider. The key, Estes believes, is for businesses to seek out AI providers that believe in owning the algorithms, not the data.
"GDPR will force businesses to pay closer attention to how and when their data is used, where it is stored, and what happens to it after a project is completed," Estes says. "This means working with AI providers that help define the lines of data ownership and implement strategies that protect user information, while not hindering the ways it is able to help evolve the success of AI algorithms."
Will GDPR Hamper AI Innovation?
Experts we spoke to believe that although the new regulation will challenge the practices and habits AI companies have adopted, it will also force them to find new ways to innovate while maintaining respect for privacy and ethical standards.
"With GDPR regulation coming into force, all the major software companies are taking necessary measures to not just ensure compliance but also to innovate and think out of the box to find new opportunities in the market," says GrowthChannel's Burushkina.
"Innovation will not be hampered—but directed and motivated—by GDPR," says Octopai's Drori. Meanwhile, GDPR will also give rise to new businesses and technologies that help organizations achieve and maintain compliance.
The standards set by GDPR might in fact help bridge the widening trust gap between the providers and users of AI-powered services. Estes believes GDPR will make both AI providers and those implementing their technology more responsible for how and where they use data resources and push them to put users before profits. "At the end of the day," he says, "AI providers should only need to own the algorithms—not the data—to innovate their capabilities and solutions."