SQL Server: utilize partitioning to purge large tables

Forrards
March 18, 2023

Sooner or later every DBA needs to purge old data from database tables. And it often happens that such tables are large enough. This can be done using table partitioning. Such approach is used rarely but in case of really large table it is very useful. In this article we`ll show an example of data purging based on partitioning from real life experience.

So we had a table with application logs in one of our client`s production databases. It was about 80 millions rows in it. We needed to leave only actual data in this table and moreover this table had to be available for inserts during purging. Size of actual data was 10 millions rows. Of course we could have setup a job which would delete rows from the table in small chunks in a loop. But it can be time consuming and would place locks on the table. So we decided to go with partitioning.

Here is a high level plan:

1. Rename current logs table.

2. Create a new table named like the old one.

3. Create partitions with old data and actual data in renamed table.

4. Add partition with actual data to the log table.

Here are some challenges we had to overcome:

  • You can add partitions to the table only if the table is empty.
  • Creating partitions on the table with millions of rows may take some time and table may get filled.

Here is the workaround. We renamed log table two times. We did it second time after creating the partitions. So here is how it looks like.

  • Log table is renamed
  • New log table is created
  • Partitions are added to the table
  • The data is copied from second table which was renamed to the new table. There were collected a few thousands records in it during the time of partition creation.

Let`s get to the code.

Assume we have a table listed below

1.   Rename the table

2. Create new table with the same structure

3. Create partition function, partition schema and partitions.

4. Add partition to the new logs table.

In current code sample we omit the steps of second rename of the table and recreation of the log table. They are similar to points 1 and 2. Also it is interesting to mention that there were transactions waiting to be inserted to the logs table at that moment. After creation of the tables there were about 100 rows of new data. Creation and adding of partition takes a few milliseconds.

Here is how to check which partition stores the data

Using this approach we get following benefits

- Table was available, no down time..
- Data is separated
- Time taken is not much

A few notes

- Partitioning is available only in Enterprise edition of SQL Server

- At the moment of creation of a new table and adding a partition there can data insertion running on the table. In such case it will not be possible to add a partition. You will have to repeat this step again.

- This approach may not succeed in case of some specific data schema.

Good luck with your table partitioning.

Forrards
March 18, 2023
Share
Recent articles:
Database consultancy: SaaS database considerations
March 31, 2023

Utilizing given approach each client get his own database. In such case every client`s data stays isolated from others. It simplifies backing up the data, installing new clients but causes higher cost of maintenance.

More
Database consultancy: tree-like structure
March 31, 2023

Subject recursion is well covered in the literature, but, nevertheless, the problem of output “tree” does not mean the client and SQL Server many baffled. So, put the problem: there is a table with the name and record id field indicating the parent identifier. Immediately fill in this table, some sort of test data:

More
Database consultancy: Hierarchy in MS SQL Server
March 31, 2023

In our time of greatest prevalence of databases were relational databases, which are the main targets of the tables and the relationships between them. Tables can solve most problems for data storage and manipulation. But in the real world entity requiring storage is not always presented in a tabular form. One of these very common types of data structures other than the table is a tree structure, where each data element is the parent and the offspring. An example of such a structure may be the structure of state enterprises, which is headed by the director (the root of the tree), his deputies, heads of departments from which are subject to certain deputies, employees of departments, which are subject to the rulers.

More

Subscribe to our blog

    

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Database consultancy: SaaS database considerations

Utilizing given approach each client get his own database. In such case every client`s data stays isolated from others. It simplifies backing up the data, installing new clients but causes higher cost of maintenance.

March 31, 2023
Database consultancy: tree-like structure

Subject recursion is well covered in the literature, but, nevertheless, the problem of output “tree” does not mean the client and SQL Server many baffled. So, put the problem: there is a table with the name and record id field indicating the parent identifier. Immediately fill in this table, some sort of test data:

March 31, 2023

Discuss your project with us

    

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.