Limiting Data Collection = Limiting Risk

May 24, 2021 Tom Madsen

In recent months, we have seen a marked increase in cyber-attacks that leak company data. This data is often critical to the company’s operations and just as important to the people whose data is now floating around on the Internet. Ransomware incidents that steal data as part of the attack are becoming the norm rather than the exception and, depending on its sensitivity, the stolen data can be turned into an additional source of income for the attackers. Recently, an American police department was attacked, and the stolen data included extremely sensitive information about the police officers themselves as well as various police informants.

By stealing data as well as encrypting it during a ransomware attack, attackers gain additional leverage to make victims pay. If the victim does not pay to get the data back, for instance because the victim has good backups, the attacker can threaten to release the data instead. Articles on how to defend against and recover from a ransomware attack have been featured previously on this site. This article, however, makes a different argument for minimizing the significant risks that a successful ransomware attack poses to our customers and companies: limit the data we collect on customers and employees.

Limit Data

The EU’s GDPR restricts the kinds of data we can store and process. Specifically, Article 9 of the GDPR defines special categories of data with concrete limits on storage and processing. These include, among others, health, religion, political views, and sexuality. More detailed information and guidance can be found in the text of Article 9 itself.

So there are already limits on what kinds of data we can store long term and what kinds of processing we are allowed to do with that data. Before moving on, we should define what kinds of data this article proposes to limit. Production data, or data collected from industrial control systems (ICS) for reporting or trend analysis, is not included. That is not to say this kind of data cannot be sensitive or valuable to competitors, but the data that can embarrass a company or tarnish its brand is mostly data on customers or employees, that is, privacy-related data.

Minimizing Data Collection

In any scenario there is, of course, a minimum amount of data needed to, for instance, uniquely identify a customer, and different scenarios require different kinds of data collection. This makes the proposal somewhat difficult to implement, but there are nevertheless some general suggestions for tackling the issue.

There will certainly be cases where the following suggestions do not work, because the data in question genuinely needs to be collected. The suggestions are meant to make us examine the information we store and ask ourselves: do we really need it?

Suggestion 1: Don’t store data for one-time use

In Denmark there is a national ID number called the CPR number. It is unique to an individual and is therefore PII (Personally Identifiable Information). This ID is used in many national systems, such as the driver’s license and passport systems. Your own country most likely has something similar in use.

Unless we need to query public registers, there is no need to collect or store such information. Even when the number is needed, we could use the same approach that credit card providers offer their clients. For credit cards, we can ask for the card number in an application but route it directly to the credit card provider for verification, without ever storing it. Why not use the same trick with national ID numbers when they are needed?
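The pass-through idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CustomerRecord`, `register_customer`, and `fake_verifier` are hypothetical names, and `fake_verifier` is a stand-in for whatever external verification service would actually be used (here it only checks for ten digits). The point is that the national ID is handed to the verifier once and only the yes/no result is kept.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CustomerRecord:
    """What gets persisted: there is no national ID field at all."""
    name: str
    identity_verified: bool


def register_customer(
    name: str,
    national_id: str,
    verify: Callable[[str], bool],
) -> CustomerRecord:
    """Pass the ID to the verifier once and keep only the boolean result."""
    verified = verify(national_id)
    # The ID is not copied into the record, a log line, or any other store.
    return CustomerRecord(name=name, identity_verified=verified)


def fake_verifier(national_id: str) -> bool:
    """Hypothetical stand-in for a real verification service (assumption)."""
    return len(national_id) == 10 and national_id.isdigit()


record = register_customer("Jane Doe", "0101901234", fake_verifier)
print(record)
```

If this record is later stolen, the attacker learns that the identity was verified, but not the number itself.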

Aside from this, do not use a national ID as a key in a database! Although it is inherently unique, the number is PII, and a key tends to be copied into every related table, index, log, and export. If that data is stolen, you expose the affected individuals to identity theft.
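One common alternative to a national ID as a key is a random surrogate key that carries no meaning outside the system. The sketch below uses Python’s standard `sqlite3` and `uuid` modules; the table layout and the `add_customer` helper are assumptions made for the example, not a description of any particular system.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " customer_id TEXT PRIMARY KEY,"  # random surrogate key, reveals nothing
    " name TEXT NOT NULL)"            # note: no national ID column exists
)


def add_customer(name: str) -> str:
    """Insert a customer under a freshly generated random key."""
    customer_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO customers (customer_id, name) VALUES (?, ?)",
        (customer_id, name),
    )
    return customer_id


cid = add_customer("Jane Doe")
row = conn.execute(
    "SELECT name FROM customers WHERE customer_id = ?", (cid,)
).fetchone()
columns = [r[1] for r in conn.execute("PRAGMA table_info(customers)")]
print(row[0], columns)
```

Other tables can reference `customer_id` freely, because a leaked key of this kind identifies nobody outside the database.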

Suggestion 2: Confirm the purpose for data processing

Let us review our customer applications and the data they store. How much of this information is merely ‘nice to have’ and how much is ‘must have’? Look first at the ‘must have’ data and develop a protection strategy for the parts that can be used to identify individuals, such as phone numbers or addresses. For the ‘nice to have’ data, what exactly is the requirement for collecting it? Is it actively used for something, or is it collected just in case it might be needed at some point? If the data is not needed, we should not be storing it.
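One way to make the ‘must have’ review concrete is an allowlist in which every stored field has to name its purpose, and anything without a purpose is dropped before it reaches storage. The following is a minimal sketch; the field names, the purposes, and the `minimize` helper are hypothetical and would differ for every business.

```python
from typing import Dict

# Every field we store must state why it is needed (illustrative values).
REQUIRED_FIELDS: Dict[str, str] = {
    "name": "identify the customer",
    "email": "send order confirmations",
    "delivery_address": "ship the order",
}


def minimize(submitted: Dict[str, str]) -> Dict[str, str]:
    """Keep only the fields that have a documented purpose."""
    return {k: v for k, v in submitted.items() if k in REQUIRED_FIELDS}


form = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "delivery_address": "1 Main St",
    "date_of_birth": "1990-01-01",  # 'nice to have', no stated purpose
    "religion": "unspecified",      # special category, not needed here
}

stored = minimize(form)
dropped = sorted(set(form) - set(stored))
print(stored)
print(dropped)
```

Adding a new field then requires writing down its purpose first, which forces the ‘do we really need it?’ question at the moment the form is designed.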

An example could be an airline that asks passengers for their dietary requirements. Should it ask about religion, or just about the diet? Most of the bigger airlines ask only for dietary requirements, not religious affiliation, on their registration forms. Still, there are cases where you are asked which demographic you identify with: European, Asian, African, and so on. Why is demographic or race important? Or sexuality, for that matter?

Unlike big ad companies such as Facebook or Google, which collect everything that is not nailed down, we should limit our collection and storage of personal data to the absolute minimum needed to conduct business with customers and clients.

Finishing Thoughts

Businesses have for many years collected data on everybody and everything because they could. With the marked increase in data breaches and stricter regulation, we absolutely must look at the data we store in our systems and assess both its value and the risk associated with storing it. Unless an attacker is abusing our hardware for coin mining, the only reason we are being attacked is our data, and if we do not store a specific piece of data in the first place, we cannot lose it during an attack. Additionally, our customers and clients will surely appreciate the shorter forms they are asked to fill in.

Tom Madsen

Senior Security Consulting Manager at Accenture

Tom Madsen has been active in the cybersecurity industry for more than 20 years. Tom graduated from the University of Aalborg and has held several technical roles in security during his professional career. He is certified as CISSP, CISA, CISM, CGEIT, CRISC, CCSP, CDPSE and CSSLP, and has published the book "The Art of War for Cybersecurity". He is currently writing a book, "Security Architecture - How & Why".
