Data is a crucial resource for businesses today, but using data legally and ethically often requires data anonymization. Laws like the GDPR in Europe require companies to ensure that personal data is kept private, limiting what companies can do with personal data. Data anonymization allows companies to perform critical operations, like forecasting, with data that preserves the original’s characteristics but lacks the personally identifying data points that could harm its users if leaked or misused.
6 COMMON DATA ANONYMIZATION MISTAKES BUSINESSES MAKE EVERY DAY
DATA ANONYMIZATION MISTAKES Despite the importance of data anonymization, companies regularly make mistakes when performing it. These errors are not only dangerous to users but could also expose the companies to regulatory action in a growing number of countries. Here are six of the most common data anonymization mistakes that you should avoid.
ONLY CHANGING OBVIOUS PERSONAL IDENTIFICATION INDICATORS One of the trickiest parts of anonymizing a dataset is determining what is or isn’t Personally Identifiable Information (PII), the kind of information you want to ensure is kept safe. Individual details like the date of purchase or the amount paid may not be personal information, but a credit card number or a name would be. Of course, you could go through the dataset by hand and check that all relevant data types are anonymized, but there’s still a chance that something slips through the cracks.
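One way to complement a manual review is an automated scan that looks at values rather than column names, so PII hiding in a free-text field is still flagged. The sketch below is only an illustration: the two patterns, the function name flag_pii_columns, and the sample rows are all assumptions, and real PII detection needs far broader coverage than two regular expressions.

```python
import re

# Hypothetical patterns for illustration; not an exhaustive PII definition.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def flag_pii_columns(rows):
    """Return {column: set of pattern names} for columns holding a matching value."""
    flagged = {}
    for row in rows:
        for column, value in row.items():
            for name, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    flagged.setdefault(column, set()).add(name)
    return flagged

# Made-up sample data: the card number sits in a free-text "notes" column.
rows = [
    {"purchase_date": "2021-03-04", "amount": "19.99",
     "notes": "card 4111 1111 1111 1111", "contact": "ana@example.com"},
    {"purchase_date": "2021-03-05", "amount": "5.00",
     "notes": "gift wrap", "contact": "bo@example.com"},
]
flagged = flag_pii_columns(rows)
```

Here the scan flags the notes and contact columns while leaving the purchase date and amount alone. A scan like this only narrows down where to look; it does not replace a human decision about what counts as PII in context.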
CONFUSING ANONYMIZATION WITH PSEUDONYMIZATION According to the EU’s GDPR, data is anonymized when it can no longer be reverse-engineered to reveal the original PII. Pseudonymization, in comparison, replaces PII with different information of the same type. Pseudonymization doesn’t guarantee that the dataset cannot be reverse-engineered if another dataset is brought in to fill in the blanks. Companies that don’t correctly categorize their data into one bucket or another could face heavy regulatory action for violating the GDPR or other data laws worldwide.
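To make the "fill in the blanks" risk concrete, here is a minimal sketch of how a pseudonymized dataset might be re-identified by matching it against a second dataset. Everything in it is hypothetical: the field names, the sample records, and the link_records helper are assumptions chosen for illustration, not a description of any real attack tool.

```python
# Pseudonymized release: names replaced with tokens, other fields left intact.
pseudonymized = [
    {"token": "u-001", "zip": "10001", "birth_year": 1984, "amount_paid": 42.50},
    {"token": "u-002", "zip": "94107", "birth_year": 1990, "amount_paid": 13.00},
]

# Hypothetical second dataset that still contains names.
auxiliary = [
    {"name": "Alice Example", "zip": "10001", "birth_year": 1984},
    {"name": "Bob Example", "zip": "94107", "birth_year": 1990},
    {"name": "Cara Example", "zip": "60601", "birth_year": 1975},
]

def link_records(released, known, keys):
    """Match released rows to known identities on the shared fields in `keys`."""
    index = {}
    for person in known:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = {}
    for row in released:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches[row["token"]] = candidates[0]
    return matches

reidentified = link_records(pseudonymized, auxiliary, keys=("zip", "birth_year"))
```

In this toy example both tokens map back to a name, even though no name appears in the released data. That is the practical difference between the two categories: the tokens hide the identity only until a matching dataset turns up.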
ONLY ANONYMIZING ONE DATA SET A recurring threat we’ve covered so far is personal information being reconstructed when a non-anonymized database is brought into the mix. There’s a straightforward solution to that problem: instead of anonymizing only one dataset, why not anonymize all of the ones that share data? That way, it becomes much harder to reconstruct the original data. To do this, you have to consider the variety of interconnections between databases, and that may mean that, to be safe, you need to anonymize data you don’t release.
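Finding those interconnections can start with something as simple as checking which datasets share column names, directly or through an intermediate dataset. The sketch below assumes that a shared column name signals shared data, which is a simplification; the schemas and the linked_datasets function are made up for illustration.

```python
from collections import deque

def linked_datasets(schemas, start):
    """Return datasets reachable from `start` through shared column names."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for name, columns in schemas.items():
            # Two datasets are treated as linked if they share any column name.
            if name not in seen and schemas[current] & columns:
                seen.add(name)
                queue.append(name)
    seen.discard(start)
    return seen

# Hypothetical schemas: dataset name -> set of column names.
schemas = {
    "orders": {"order_id", "customer_id", "amount"},
    "customers": {"customer_id", "email"},
    "newsletter": {"email", "signup_date"},
    "inventory": {"sku", "stock"},
}
to_review = linked_datasets(schemas, "orders")
```

Releasing the orders data in this example puts customers and newsletter on the review list, the second one only because it connects through customers. The inventory data shares nothing and is left out.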
ANONYMIZING DATA, BUT ALSO DESTROYING IT Data becomes far less valuable if the connections between its points become corrupted or weakened. A poorly executed anonymization process can lead to data that has no value whatsoever. Of course, it’s not always obvious that this is the case. A casual examination might not reveal anything wrong, leading companies to draw false conclusions from their data analysis. That means that a good anonymization process should protect user data while leaving you confident that the final results will be what you need.
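Because a casual look won't reveal the damage, it can help to compare a statistic you care about before and after anonymization. The following sketch checks whether the correlation between two columns survives. The sample values, the 0.1 tolerance, and the helper names are assumptions for illustration; which statistics matter depends on what you plan to do with the data.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def utility_preserved(original, anonymized, tolerance=0.1):
    """True if the correlation between the two columns shifts by at most `tolerance`."""
    before = pearson(*original)
    after = pearson(*anonymized)
    return abs(before - after) <= tolerance

# Made-up sample columns.
ages = [23, 35, 41, 52, 60]
spend = [120, 180, 210, 260, 300]

# Rounding ages down to the decade keeps the relationship with spend.
banded_ages = [20, 30, 40, 50, 60]
# Shuffling ages independently of spend breaks it.
shuffled_ages = [52, 23, 60, 35, 41]

ok_banded = utility_preserved((ages, spend), (banded_ages, spend))
ok_shuffled = utility_preserved((ages, spend), (shuffled_ages, spend))
```

Both transformed columns look plausible at a glance, since each still contains reasonable ages, but only the banded version passes the check. The shuffled version would quietly lead an analyst to conclude that age and spend are unrelated.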
APPLYING THE SAME ANONYMIZATION TECHNIQUE TO ALL PROBLEMS Sometimes when we have a problem, our natural reaction is to use a solution that worked in the past for a similar problem. However, as you can see from all the examples we’ve explored, the right solution for securing data varies greatly based on what you’re securing, why you’re securing it, and your ultimate plans for that data. Using the same technique repeatedly can leave you more vulnerable to reverse engineering. Worse, it means that you’re not maximizing the value of each dataset and are possibly over- or under-securing much of your data.
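One way to avoid a one-size-fits-all approach is to decide on a technique per column and write that decision down as an explicit policy. The sketch below is an assumption-laden illustration: the column names, the three techniques, and the policy itself are hypothetical, and whether any of them is sufficient depends on the dataset and its intended use.

```python
DROP = object()  # sentinel meaning "remove this column from the output"

def drop(value):
    return DROP

def band_age(value):
    """Generalize an exact age into a ten-year band such as '30-39'."""
    low = (value // 10) * 10
    return f"{low}-{low + 9}"

def truncate_zip(value):
    """Keep only the first three characters of a postal code."""
    return value[:3] + "*" * (len(value) - 3)

def keep(value):
    return value

# Hypothetical policy: each column gets the technique judged right for it.
POLICY = {"name": drop, "age": band_age, "zip": truncate_zip, "amount": keep}

def apply_policy(row, policy):
    """Apply the per-column technique and omit dropped columns."""
    out = {}
    for column, value in row.items():
        result = policy[column](value)
        if result is not DROP:
            out[column] = result
    return out

row = {"name": "Alice Example", "age": 34, "zip": "94107", "amount": 19.99}
anonymized_row = apply_policy(row, POLICY)
```

Keeping the policy in one place makes over- and under-securing visible: a reviewer can see at a glance that amounts are untouched while names are removed entirely, and can challenge either choice for a given dataset.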