By John Walubengo
This week being International Data Protection Week, it is perhaps timely to explore the big debate about whether data protection laws curtail AI developments.
At face value, one gets the feeling that, indeed, that may be the case. There are several Data Protection Principles we could go through to support this view. These include purpose limitation, data minimization, transparency, and data security, amongst others.
Purpose and Data Minimization
The purpose limitation principle dictates that before personal data is collected, one must define the purpose for that data collection. If one is a hospital, a school, or an online platform, the purpose may be for registration, admission, subscription, or otherwise.
Whatever the case, the entity is not authorized to repurpose those data sets without obtaining secondary consent from the customers. Yet AI algorithms routinely mine these data sets for additional purposes, such as predictions, which strictly speaking go over and above the original purpose of registration.
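For the technically minded, a purpose check of this kind can be enforced in code. Here is a minimal sketch; the record layout, purpose tags, and function names are illustrative assumptions, not drawn from any particular system or statute:

```python
from enum import Enum

class Purpose(Enum):
    REGISTRATION = "registration"
    AI_PREDICTION = "ai_prediction"

# Hypothetical customer record: the personal data is tagged with the
# purposes the customer consented to when it was collected.
record = {
    "name": "Jane Doe",
    "consented_purposes": {Purpose.REGISTRATION},
}

def use_for(record: dict, purpose: Purpose) -> None:
    """Block any processing that falls outside the consented purposes."""
    if purpose not in record["consented_purposes"]:
        raise PermissionError(
            f"No consent recorded for '{purpose.value}'; "
            "secondary consent is required before repurposing."
        )
    # ...proceed with the approved processing here...

use_for(record, Purpose.REGISTRATION)       # allowed: the original purpose

try:
    use_for(record, Purpose.AI_PREDICTION)  # repurposing without fresh consent
except PermissionError as err:
    print(err)
```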
The data minimization principle likewise states that the amount of data an organization collects should be tied to its original purpose. So if one is collecting data for registration purposes, the specific questions the customer is asked to answer on the data collection forms should be tied to that purpose.
One should not be asked to state, for example, their educational or financial background when all they want is to book a flight. For flight bookings, one would need, at a minimum, only their name, ID, and payment method to complete the process. Any data collected beyond that would be considered a violation of the data minimization principle.
However, from an AI perspective, the more data, the merrier. Most AI algorithms perform better predictions with more rather than fewer data points about an individual.
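To make this tension concrete, here is a minimal sketch of a booking form that enforces data minimization by rejecting any field beyond the declared purpose; the class and field names are illustrative assumptions:

```python
from dataclasses import dataclass, fields

# Hypothetical minimal booking record: only the fields that the stated
# purpose (completing a flight booking) actually requires.
@dataclass
class FlightBooking:
    full_name: str
    national_id: str
    payment_method: str

def validate_submission(form_data: dict) -> FlightBooking:
    """Reject any submitted field that goes beyond the declared purpose."""
    allowed = {f.name for f in fields(FlightBooking)}
    extra = set(form_data) - allowed
    if extra:
        raise ValueError(f"Fields exceed the stated purpose: {sorted(extra)}")
    return FlightBooking(**form_data)

# A form that sneaks in, say, "financial_background" would be rejected;
# this minimal submission passes.
booking = validate_submission({
    "full_name": "Jane Doe",
    "national_id": "12345678",
    "payment_method": "card",
})
```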
Transparency
The principle of transparency has two components. One is that the customer should be made aware of what personal data the organization is collecting, how they will use it, and whom they are likely to share it with.
The second component is that decisions made by processing customer data should be explainable; that is, even a decision that has a negative and significant impact on the customer, e.g., denial of credit, denial of a visa, or denial of college admission, amongst others, should be transparently understood by the affected customer.
With the increasing use of complex AI mechanisms, such as neural networks, AI decisions are becoming increasingly opaque, even to the technical teams who build machine learning models.
A case in point: a husband and wife who share a joint bank account both apply for a credit facility, and the AI logic denies the wife the credit while extending the facility to the husband.
At face value, it looks like a clear case of gender discrimination, yet the ML engineers who built the system could not explicitly trace and pinpoint the source of the bias in order to correct it.
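One way engineers attempt such tracing is with post-hoc tools like permutation importance, which only gives a global hint of which inputs matter, not a reason for any single denial. A minimal sketch on synthetic data (the data set and model here are illustrative assumptions, not any actual credit system):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in for a credit-scoring data set (illustrative only).
X, y = make_classification(n_samples=500, n_features=6, random_state=0)

# A small neural network: often accurate, but with no human-readable rules.
model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000,
                      random_state=0).fit(X, y)

# Permutation importance shuffles one feature at a time and measures how
# much the model's score drops. It hints at which inputs matter overall,
# but it cannot explain why one specific applicant was denied.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature_{i}: score drop of about {score:.3f}")
```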
Algorithmic transparency requirements will therefore remain a challenge as AI technologies continue to become more sophisticated and opaque.
Data Security
Finally, the principle of data security provides that organizations must secure their online and physical filing systems from unauthorized access, unauthorized changes, and unscheduled downtimes.
In data security lingo, these three properties are commonly known as the CIA triad: ensuring the confidentiality, integrity, and availability of the data collected.
From the recent ChatGPT experiences, AI systems are beginning to get hacked to the point where they unexpectedly reveal their raw training data, which can contain confidential and often sensitive personal data. Additionally, there are cases where hackers illegally tweak the parameters of AI models, forcing them to deliberately provide wrong answers to prompts.
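On the integrity leg of the triad, one simple defence is to fingerprint a model's parameters at release time and verify that fingerprint before serving the model. A minimal sketch, assuming the model is stored as a single serialized file (the file name is a hypothetical placeholder):

```python
import hashlib
from pathlib import Path

def fingerprint(model_path: str) -> str:
    """Return the SHA-256 checksum of a serialized model file."""
    return hashlib.sha256(Path(model_path).read_bytes()).hexdigest()

# Record the checksum when the model is approved for release...
TRUSTED_DIGEST = fingerprint("credit_model.bin")  # hypothetical file name

# ...and verify it before serving, refusing any tampered parameters.
def load_if_untampered(model_path: str, expected_digest: str) -> bytes:
    if fingerprint(model_path) != expected_digest:
        raise RuntimeError("Model parameters changed outside the release process.")
    return Path(model_path).read_bytes()

model_bytes = load_if_untampered("credit_model.bin", TRUSTED_DIGEST)
```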
So the next time you get an AI solution, consider it a suggestion rather than a final, authoritative answer to your query.
So, back to the original question. Do data protection laws restrain AI developments?
The answer is yes, but for a good reason. Innovation must be balanced against responsibility. There is a growing demand for responsible AI solutions, and these tend to be achieved under rigorous privacy laws.
John Walubengo is an ICT Lecturer and Consultant. @jwalu.