Start thinking about data archiving early

By | January 15, 2014

Dealing with data growth is a big challenge for DBA’s and sysadmins. The main problems to deal with are  storage capacity (server storage is not cheap!), backups / index maintenance windows and query performance. Implementing data retention helps dealing with these issues and I often make a checklist:

  • What kind of data are we dealing with?
  • What is the expected data growth?
  • Which data can be deleted and which data must be kept?
  • What retention time to use? How many days, months, years must the data be available? (What are the requirements from a business perspective? Do we have to deal with government regulations for retaining data for a certain amount of time?)
  • Should we offload older data to another (cheaper) storage system?
  • Is it allowed to aggregate older data and lose some of the detailed information?
  • Should we use SQL Server Enterprise partitioning for easier data management?
  • Should we use Enterprise compression features?

A lot of these decisions do impact the business.  A customer might not see all his historical data. Big organisations like banks in the Netherlands for example only store the last 2 years of your banking account transaction history.

In the development process of an application these questions often remain  unanswered. Then problems start to arise in the production environment. Things get bad if the contracts fail to mention anything about retention. A customer might not agree to archive his data.

Developers often don’t have a focus on archiving  because  it’s not a critical issue in the development process. I used to work as a developer and made the same mistakes ignoring the archiving needs for certain data. That’s why DBA’s and sysadmin’s always need to be involved in the development process. Also business people need to provide as much information as possible about the expected number of customers and transactions so DBA’s can do forecasting and size storage and servers.

So to prevent problems in production it’s a good idea to start thinking about data archiving as early as possible.

Leave a Reply

Your email address will not be published. Required fields are marked *