One of the most common questions from businesses operating under GDPR, LGPD or similar data regulations is: how long should you keep data?

As answers to this question seem to vary widely, we’ve gathered insights from business leaders and specialists across a variety of industries to clear up the confusion and shed light on reasonable timeframes for keeping data, whether financial, employee or other potentially sensitive data.

  1. Context

“GDPR is focused on personal data, and the GDPR regulation is relatively vague on the topic,” says Michael Puldy.

“In short, the regulation says data cannot be kept longer than you need it, meaning there’s no specific time requirement (with exceptions, e.g. for statistical purposes).”

“This is different from, say, how long you need to keep data for the IRS (3 to 7 years), which is more specific, although the answer depends on the filing.”

“Even though there’s no specified retention time limit under GDPR, or in general, companies are required to, and should, implement policies that document and demonstrate their retention practices.”

Michael also helpfully expanded on a number of additional criteria that are worth considering when handling data, including:

Legal: “The longer you keep data, the more it could become either a liability or an asset. Does this data help you in case of litigation, or will it hurt you?”

“Email is a perfect example of this: many companies force email deletions after a period of 12 months, while others may keep emails forever.”

Client Satisfaction: “How far back do I need to go to access a client transaction record? Bank charges, credit card statements, or anything related to finance tend to have long retention periods, due to concerns of litigation and because of federal and state regulations.”

“If you’re a transportation company (like FedEx or UPS), once a package is confirmed to be delivered, is anyone really going to remember, care or ask me if Grandma’s package sent in 2009 to Miami was delivered?  In this case, a short-term data retention requirement would suffice.”

IT Disaster Recovery: “Responsible companies back up data in case of IT failure. Now, the longer you retain data, the more backup storage is required.”

“One petabyte of primary storage is great, but if the retention requirement is to back up that petabyte of storage every week for 52 weeks, then my 1 petabyte of data becomes 53 petabytes. In this simplified scenario, my ‘inexpensive storage’ is now extremely expensive.”
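To make the arithmetic in this example explicit, here is a minimal Python sketch. It assumes, as the simplified scenario does, that every retained backup is a full copy with no compression, deduplication or incremental backups; the function name `total_storage_pb` is our own and purely illustrative.

```python
def total_storage_pb(primary_pb, backups_retained):
    """Total storage needed: the primary copy plus one full copy per retained backup.

    Assumption (matches the simplified scenario only): each backup is a full,
    uncompressed, non-deduplicated copy of the primary data.
    """
    return primary_pb * (1 + backups_retained)


# 1 PB of primary data, backed up weekly and kept for 52 weeks
print(total_storage_pb(1, 52))  # 53
```

Halving the retention period to 26 weekly backups would bring the total down to 27 petabytes under the same assumptions, which is why the retention period is often the first thing to revisit when storage costs grow.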

Technology Changes: “Disk and tape storage devices sold in 2010 are no longer being sold, but many companies store their data on these old storage devices.”

“At some point, the vendors stop supporting these older technologies and companies need to migrate this data – stored on old technology – to new storage platforms.”

  2. How Sensitive Is This Data?

Thierry Tremblay of database builder Kohezion states, “Like GDPR, under HIPAA, no rules limit the period for which you have to keep ePHI.”

“Healthcare organizations have to use their judgment to decide whether it is necessary to keep patient data for a specified period. However, policies, documentation, risk assessments and audits have to be retained for at least six years.”

“The most sensitive topic is personal data (including ePHI under HIPAA or any personal data under GDPR).”

“Businesses need to define and implement rigid data retention and disposal policies that will minimize security exposures. Keeping personal data for too long is like playing a cybersecurity roulette.”

“The only way to reduce risks is to define a list of substantial legal and business reasons for which the data should be kept longer than needed or to delete the data permanently if its retention is not justified.”

  3. When Handling Large Amounts of Complex Data

“In our case, we mostly provide recent market data, but we do also provide historical sales data, so we keep every vehicle record indefinitely,” says William Young of Competitive Intelligence Solutions.

“We don't collect the personal information of the people involved in the transaction, however, so we don't need to worry about mishandling buyers' PII.”

“As we track 70% of all vehicle dealership retail transactions in real time and host data on 40k dealers and over 453M vehicles, the size of our dataset has given us many challenges over time.”

“For the best results, our data needs to be updated daily, so every day at midnight a new race to get all the processing done starts.”

“As our dataset grew, we've had to change our data pipeline many times to improve its efficiency and its ability to keep up with our strict deadlines and ever-increasing reporting demands.”

“We also have much shorter-lived data such as individual reports or application logs that may only exist for a few hours or days. The size of this data can also be significant, or the data may only be valuable for a limited period of time, so storing it indefinitely like the vehicle sales data would neither make sense nor be practical.”

“We use judicious logging of critical components throughout our data pipeline to identify components that aren't keeping up. This firehose produces a lot of data that has a limited shelf life, so we automatically purge it after a few days. If left unchecked these logs would quickly eat up a huge amount of storage space.”
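An automatic purge like the one described here can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not Competitive Intelligence Solutions' actual pipeline: log entries are modelled as hypothetical `(timestamp, message)` tuples, and the three-day window is a stand-in for "a few days".

```python
from datetime import datetime, timedelta, timezone


def split_expired(records, max_age, now=None):
    """Split (timestamp, payload) records into (kept, purged) lists.

    A record is purged when its timestamp is older than `now - max_age`.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    kept = [r for r in records if r[0] >= cutoff]
    purged = [r for r in records if r[0] < cutoff]
    return kept, purged


# Hypothetical log entries, for illustration only
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
logs = [
    (datetime(2024, 1, 9, tzinfo=timezone.utc), "pipeline stage finished"),
    (datetime(2024, 1, 1, tzinfo=timezone.utc), "pipeline stage started"),
]

kept, purged = split_expired(logs, timedelta(days=3), now=now)
print(len(kept), len(purged))  # 1 1
```

Passing `now` explicitly keeps the function easy to test; a scheduled job would normally call it with the default and then delete whatever lands in `purged`.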

“Our reports are also regularly updated and overwritten. Keeping old versions of reports around could cause confusion since customers and their users expect data to be as up to date as possible when they query our API. If someone does want an old report we could just regenerate it as needed, so we don't need to retain originals for very long.”

  4. Employee & Job Applicant Records

“The IRS suggests that you keep employee payroll tax records for four years after the date those taxes were due or were paid,” says Azza Shahid of Heart Water.

“These records often include data such as the employer identification number, amounts and dates of wage payments, and tax deposits.”

“Where possible, it is advised that job applicant files be kept for at least three years, regardless of whether you hire the applicant.”

Dennis Bell of Byblos Coffee adds, “Once an employee leaves, we keep their payroll details for six years. In the event that there is a revenue audit, we still have the records and calculations of the benefits given to an employee.”
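Retention periods like these are easier to apply consistently when the end date is calculated rather than remembered. The Python sketch below is a hedged illustration: the function names are hypothetical, and counting the four-year payroll period from whichever of the due date or payment date is later is our assumption for the example, so check the rules that apply to your own filings.

```python
from datetime import date


def retention_end(start, years):
    """Return the date `years` after `start`, mapping 29 Feb to 28 Feb if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is 29 February and the target year is not a leap year
        return start.replace(year=start.year + years, day=28)


def payroll_retention_end(tax_due, tax_paid, years=4):
    """Assumption: the period runs from the later of the due and paid dates."""
    return retention_end(max(tax_due, tax_paid), years)


# Taxes due on 15 April 2021 but paid on 1 May 2021
print(payroll_retention_end(date(2021, 4, 15), date(2021, 5, 1)))  # 2025-05-01
```

The same `retention_end` helper could be reused for the three-year applicant file period or the six-year payroll period mentioned above by changing the `years` argument.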

In Summary

Whether you need to retain data for a few days, weeks, months or years, we have the perfect solution for you at Logit.io.

However long you retain your data, you’ll be able to analyse and manage it with our high-quality ELK as a service platform. It makes the best of the open-source tools Logstash, Elasticsearch and Kibana to deliver the best data analysis experience possible, helping you meet compliance requirements under GDPR, HIPAA and SOC 2.