How Does AI Invade Privacy?
- Ian Cohen
Privacy and AI
It’s easy to compartmentalize the idea of digital privacy. “Whatever,” we may think. “No one is interested in my life.”
Then we watch something like The Social Dilemma, which explains how companies (and Russian hackers) use AI to plumb the depths of the internet, registering our every move as fodder for their neuroscience-based behavior-modification triggers. Maybe they’re just trying to get us to re-engage with Twitter, or buy that new mattress we’ve had our eye on. Maybe they’re using that AI-collected information to push our outrage buttons and sow discord. Either way, even though they don’t know us, they’re crawling around in our data finding ways to direct our behavior…and that’s just creepy.
Data Collection
That’s the thing about AI when used for data collection: by itself, it’s neither good nor bad. The real question surrounds how the data is used.
Data collection for utilitarian purposes is incredibly helpful for lots of things like allowing you to log in faster or finding the information you’re looking for more quickly. It’s the nefarious purposes that are worrisome.
For example, in 2020, a lot of articles sounded the alarm about an artificial intelligence app called Clearview AI that can identify someone from a photo by scanning internet images for a match. While a tool like that could be useful for learning about someone you might want to date or, as some police departments found, identifying suspects, it could also be used by marketers to target advertising. In far worse examples, these tools can be used by stalkers and kidnappers.
Artificial intelligence is often used to invade privacy because it is so good at quickly gathering and comparing data. And it’s just becoming more efficient.
In 2019, leading AI and machine learning hardware could perform 38.7 quadrillion operations per second, and the fastest systems of 2021 had already left that figure behind. AI can quickly build a picture of someone by scanning public records, social media, the apps people use, the websites they frequent, the items they’ve purchased, and more.
How AI gets to your data
Every time we use an app or a website and “accept cookies,” we’re essentially placing little trackers on our computer or phone that let companies follow us around and compile data on us, data they may then share with somebody else. Google, for example, reportedly can track about 70 percent of credit card purchases made online.
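For the technically curious, here’s a simplified Python sketch of how that following-around works. The tracker domain and page names are made up, but the mechanic, one cookie ID quietly linking your visits across unrelated sites, is real:

```python
# Minimal simulation of third-party cookie tracking (illustrative only).
# "tracker.example" is a made-up ad-network domain; real trackers work
# the same way at a much larger scale.
import uuid

PROFILES = {}  # tracker-side store: cookie id -> list of pages visited

def tracker_response(request_cookies, visited_page):
    """What an embedded ad/pixel server does on every page that includes it."""
    uid = request_cookies.get("uid")
    if uid is None:
        # First sighting of this browser: mint an ID and hand it back.
        uid = str(uuid.uuid4())
        set_cookie_header = f"Set-Cookie: uid={uid}; Domain=tracker.example"
    else:
        set_cookie_header = None  # browser already carries the ID
    PROFILES.setdefault(uid, []).append(visited_page)
    return uid, set_cookie_header

# The same browser visits three unrelated sites that all embed the tracker.
cookies = {}
for page in ["outdoor-gear.example/tents", "news.example/politics",
             "mattress-shop.example/checkout"]:
    uid, header = tracker_response(cookies, page)
    if header:  # the browser stores the cookie it was handed
        cookies["uid"] = uid

print(PROFILES)  # one ID now links browsing across all three sites
```

Delete the cookie and the tracker has to mint a new ID, which is why clearing cookies (or refusing them, as discussed below) actually helps.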
As we said earlier, data mining using AI is often used for fairly innocuous things like selling stuff. If we visit an outdoor sporting goods site, for example, that company might let its partners know we were there, so the partners can offer us a deal on an outdoorsy magazine subscription or a wilderness vacation. That’s a little unnerving, but potentially also nice if you’re interested in the offer.
But some of this tracking is used for things we might not like: employers scoping us out before making a hiring decision, or hackers pushing our emotional buttons to create social chaos. They can do that because AI collects the data that tells them which buttons to push.
The boiling frog analogy
As AI improves over time, we’re getting more and more accustomed to the idea that trading our personal data for convenience is an acceptable exchange. It’s a little bit like the boiling frog story. Slowly, with each new minor convenience, we’ve acclimated ourselves to the idea that our privacy is worth giving up. In time, we become so entangled in this idea that taking back control of our private data seems too difficult an undertaking.
From time to time, something like Clearview AI makes us pull up the covers, but then we get busy and shrug our shoulders about the data being collected on us. The problem is that when we do mind, it’s hard to control who has access to our data.
And it’s a very lucrative business. Global spending on AI systems is projected to reach $110 billion in 2024, and the broader AI market for hardware, software, and support services is expected to explode to $500 billion by that same year. Getting companies to self-impose ethical restrictions when that kind of money is at stake is tough, especially in the United States.
So exactly what’s going on with privacy and AI?
AI makes a picture of you
For a long time, people thought that Facebook was listening to their private conversations. How else could you mention buying a drone or a pair of boots to a friend, only to see that very thing show up in your ad feed the next time you logged on? AI is how.
Facebook is not listening to your conversations. Aside from being illegal, listening to that many conversations would be a logistical nightmare, especially since most conversations contain nothing useful to anyone who would buy the data. AI is a simpler, cleaner, and mostly unregulated way to find out what you’re thinking and what you might buy. By pulling together lots of other information about who you are, like where you live, where you shop, and who your friends are, companies like Facebook can build a composite picture of you and make valuable predictions about your future behavior.
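Here’s a deliberately simplified sketch of that prediction step. The data is entirely synthetic and the features are invented; real platforms combine thousands of such signals, but the principle of stacking weak clues into a confident guess is the same:

```python
# Illustrative sketch: combining weak signals into a purchase prediction.
# All data here is synthetic; the features are invented for the example.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Invented binary features per user: [lives_near_mountains,
# visited_gear_site, friends_who_hike, searched_boots_recently]
X = rng.integers(0, 2, size=(500, 4))
# Synthetic ground truth: users with more of these signals tend to buy.
y = (X.sum(axis=1) + rng.normal(0, 0.8, 500) > 2.5).astype(int)

model = LogisticRegression().fit(X, y)

# A new user the platform never asked anything directly:
new_user = np.array([[1, 1, 1, 0]])
print(f"Predicted chance of buying hiking boots: "
      f"{model.predict_proba(new_user)[0, 1]:.0%}")
```

No microphone required: the composite picture does the work.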
A warped picture
The picture AI creates of us isn’t always accurate. It gets things wrong. This is one reason self-driving cars aren’t the norm … yet.
As an example, photos AI creates from pixelated images may only resemble the person’s real face, not replicate it. AI often mistakes one object for another, one emotion for another, and it may not register people of color at all. But it’s getting better all the time. And it can sometimes find very specific and uncomfortable information about us.
One study showed that attackers with access to encrypted web-browsing data in transit can sometimes use machine learning to spot patterns that reveal which website, or even which page, someone is visiting. The technique, known as website fingerprinting, could identify a website from 95 possibilities with up to 98 percent accuracy.
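To make the idea concrete, here’s a toy version in Python. The “traffic” is synthetic and the features are far simpler than a real attack would use, but it shows how a classifier can name a site from nothing more than packet-size patterns, without decrypting a single byte:

```python
# Toy illustration of website fingerprinting: even when traffic is
# encrypted, the *pattern* of packet sizes can betray which site is
# being loaded. Traffic here is synthetic; real attacks use far richer
# size and timing features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

def fake_visit(site_id):
    """Each 'site' gets its own characteristic packet-size distribution."""
    mean_size = 400 + 120 * site_id          # invented per-site signature
    sizes = rng.normal(mean_size, 150, size=60)
    # Features an eavesdropper sees without decrypting anything:
    return [sizes.mean(), sizes.std(), sizes.max(), sizes.sum()]

X = np.array([fake_visit(site) for site in range(5) for _ in range(200)])
y = np.repeat(np.arange(5), 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
clf = RandomForestClassifier(random_state=1).fit(X_tr, y_tr)
print(f"Guessed the site correctly {clf.score(X_te, y_te):.0%} of the time")
```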
AI can even use a handful of identifying data points, like your geolocation, to de-anonymize so-called “anonymous” data.
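Here’s a minimal illustration of that kind of linkage attack, with invented records. The trick is simply to join “anonymous” data with public data on a few shared attributes:

```python
# Sketch of a linkage attack: an "anonymized" dataset is re-identified
# by joining it with public data on a few quasi-identifiers.
# All names and records are invented.
anonymized_health = [
    {"zip": "90210", "birth_year": 1985, "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1990, "diagnosis": "diabetes"},
]
public_voter_roll = [
    {"name": "A. Smith", "zip": "90210", "birth_year": 1985},
    {"name": "B. Jones", "zip": "10001", "birth_year": 1990},
]

for record in anonymized_health:
    matches = [p["name"] for p in public_voter_roll
               if (p["zip"], p["birth_year"])
               == (record["zip"], record["birth_year"])]
    if len(matches) == 1:  # a unique match means identity recovered
        print(f"{matches[0]} -> {record['diagnosis']}")
```

Two innocuous-looking fields were enough; geolocation trails are far more identifying than that.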
No governing body
At present, there’s very little to protect consumers from this technology or the people and organizations who use it. Europe enacted its General Data Protection Regulation (GDPR), which, among other things, requires websites to ask consumers for permission to collect their data and to turn over any collected data at the consumer’s request.
The California Consumer Privacy Act is another. And there are various international efforts to create laws and a set of ethics around the use and deployment of AI. But the technology is relatively new, and the people who make and enforce policy are so unfamiliar with it that it’s a bit of a wild west with no one really in charge.
Some people have proposed compensating consumers for sharing their data, a data bank if you will, so that every time your data is shared in a way that economically benefits a company, you get paid. Nothing like that exists yet, though.
Emerging technologies may solve the problem for a while. Two examples: differential privacy systems, which introduce randomness into user data to thwart de-anonymization tactics, and homomorphic encryption, which lets machine learning algorithms operate on data without decrypting it. Technologies like these could help…until someone finds a workaround, which hackers usually do.
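To give a flavor of the differential-privacy idea, here’s a small sketch of the classic randomized-response trick: every individual answer is noisy and deniable, yet the overall rate can still be estimated accurately:

```python
# Minimal sketch of differential privacy via randomized response: each
# user adds coin-flip noise before answering, so no single record can be
# trusted, yet the aggregate can still be estimated and unbiased.
import random

def noisy_answer(truth: bool) -> bool:
    if random.random() < 0.5:
        return truth                  # answer honestly half the time
    return random.random() < 0.5      # otherwise answer at random

true_rate = 0.30                      # fraction who really visited the site
answers = [noisy_answer(random.random() < true_rate)
           for _ in range(100_000)]

# Unbias the aggregate: observed = 0.5 * true_rate + 0.5 * 0.5
observed = sum(answers) / len(answers)
estimate = (observed - 0.25) / 0.5
print(f"Estimated rate: {estimate:.3f} (true rate {true_rate})")
```

Any one person’s “yes” might be a coin flip, which is exactly the point.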
It may turn out that market forces drive privacy protection. In April 2021, Apple updated iOS with App Tracking Transparency, letting users decide whether apps can track them. Facebook criticized Apple for the move, knowing that losing access to users’ data would hit its bottom line hard. If the move pays off for Apple, though, other companies may follow, choosing to become brands known for doing all they can to protect their users’ information.
Protecting your own privacy
If you’re concerned about keeping your data as private as possible, there are some strategies for doing so.
These include:
- Use a VPN to mask your activity online
- Open your browser in incognito or private mode
- Utilize open-source browsers and operating systems
- Don’t accept cookies or take the extra step to only accept cookies that are essential for a website to function
- Play the adversary and send false identity signals
This last suggestion means deploying “adversarial examples,” false trails for the AI to follow. Researchers at Duke University found that AI and machine learning can infer a person’s gender from their ratings of particular movies. But by deliberately tweaking your behavior by a few data points, like using simple product-rating apps to create a false sense of where or who you are, you can throw the AI off. Of course, it’s machine learning, so it will be back on your trail in no time.
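Here’s a toy illustration of the decoy idea, with invented rating profiles (this is not the Duke study’s actual model): a naive classifier guesses which group a user belongs to from movie ratings, and a few counter-stereotypical ratings flip its guess:

```python
# Toy version of the decoy-rating idea: a naive nearest-centroid model
# guesses a trait from movie ratings, and a few injected "false signal"
# ratings flip it. All movies, ratings, and group profiles are invented.
import numpy as np

movies = ["action_a", "action_b", "romcom_a", "romcom_b"]
# Invented average rating profiles the model "learned" for two groups:
centroid_group1 = np.array([4.5, 4.2, 2.0, 2.3])
centroid_group2 = np.array([2.1, 2.4, 4.4, 4.6])

def guess_group(ratings):
    d1 = np.linalg.norm(ratings - centroid_group1)
    d2 = np.linalg.norm(ratings - centroid_group2)
    return "group 1" if d1 < d2 else "group 2"

honest_profile = np.array([4.8, 4.0, 1.5, 2.0])
print("Honest ratings  ->", guess_group(honest_profile))

# Adversarial move: rate a few counter-stereotypical titles highly.
decoyed_profile = honest_profile.copy()
decoyed_profile[2:] = [5.0, 5.0]
print("Decoyed ratings ->", guess_group(decoyed_profile))
```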
Most of the time, people don’t much care that their data is being collected. But when a data breach threatens their money, their safety, their relationships, their future, or similar fundamentals of a free society and a happy life, privacy and AI suddenly become a very relevant issue.
Whatever rules are put in place must be international, since the digital world has no borders. Ultimately, the push for a shared set of ethics and laws will likely come from governments, consumers, and the industry working together for the common good.
Lokker is here to help you figure out cybersecurity and privacy solutions for your employees and customers that will comport with legislation and prepare you for the future.