Mastering Paginated Deletions in MSSQL Tables

Discover the art of efficiently removing vast amounts of data from MSSQL tables using paginated deletions and understand the nuances of this technique (code included!)

Managing large datasets in MSSQL tables often comes with the challenge of efficiently deleting data.

Traditional deletion techniques can be resource-intensive, leading to bottlenecks and performance issues.

Enter paginated deletions: a technique that splits the data into manageable chunks, ensuring smooth deletions and optimum performance.

Why Paginated Deletions?

When deleting a large amount of data, it's tempting to go for a one-shot delete command. 

However, doing so can balloon the SQL Transaction Log file, risking disk space exhaustion and system slowdowns.

Paginated deletions, by contrast, systematically remove data in batches, reducing system strain and preventing unforeseen issues.

Dive into the Paginated Deletion Technique

1. Creating a Sample Table

Let’s initiate with a basic table named Logs:

CREATE TABLE [Logs] (
    [LogId] INTEGER NOT NULL IDENTITY(1, 1), 
    [Text] VARCHAR(MAX) NULL,
    [Severity] VARCHAR(7) NULL,
    [Created] DATETIME,
    PRIMARY KEY ([LogId])
);

And then, inject it with some mock records:

INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 1','Error',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 2','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 3','Info',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 4','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 5','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 6','Error',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 7','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 8','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 9','Info',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 10','Info',GETDATE())

2. Prepping for Deletion

Before diving into the deletion, some groundwork is essential:

  • Batch Size: Define the number of records you want to process in one go. This is essential for paginated deletion. For demonstration, we'll use a batch size of 2.
DECLARE @pageSize INTEGER = 2;

Temporary Tracking Table: This is used to track record IDs and their associated page index

DECLARE @tempLogIdWithPageIndexTable TABLE
(
    LogId INT,
    PageIndex INT
);

3. Deletion Logic

Now, for the core logic:

  • Assign Pages: With our batch size in place, every record in the Logs table is associated with a page index.
INSERT INTO @tempLogIdWithPageIndexTable (LogId, PageIndex) (
	SELECT  LogId, 
	CEILING(CAST(ROW_NUMBER() OVER(ORDER BY LogId) AS DECIMAL)/CAST(@pageSize AS DECIMAL)) AS PageIndex
	FROM [Logs] -- You can add WHERE conditions here
)

Calculate Total Pages:

SELECT @totalCount = count(1) FROM @tempLogIdWithPageIndexTable
​
DECLARE @pageCount INTEGER
SET @pageCount = CEILING(CAST(@totalCount as DECIMAL) / CAST(@pageSize as DECIMAL))
​
DECLARE @pageIndex INTEGER = 1

Paginated Deletion Loop: The paginated deletion is performed page-by-page using a loop.

WHILE(@pageIndex <= @pageCount)
BEGIN	
	-- you can delete related entries first using JOIN here
	DELETE [dbo].Logs where LogId in 
	(SELECT LogId FROM @tempLogIdWithPageIndexTable WHERE PageIndex = @pageIndex)
​
	PRINT 'Processed page ' + CAST(@pageIndex as VARCHAR(MAX)) + '/' + CAST(@pageCount as VARCHAR(MAX))
	SET @pageIndex = @pageIndex + 1
END

4. Post-Deletion Analysis

After the deletion, it's good to have an analysis of what transpired:

PRINT 'Script finished at : ' + CONVERT(varchar(25), getdate(), 120) 

SET @totalCount = (SELECT count(1) FROM dbo.Logs)

-- print the report when the job is done
PRINT 'Number of records after deletion : ' + CAST(@totalCount as VARCHAR(MAX))
Total number of records to remove: 10
Page size: 2
Total pages: 5
Script started at : 2020-09-05 15:36:21

(2 rows affected)
Processed page 1/5

(2 rows affected)
Processed page 2/5

(2 rows affected)
Processed page 3/5

(2 rows affected)
Processed page 4/5

(2 rows affected)
Processed page 5/5
Script finished at : 2020-09-05 15:36:21
Number of records after deletion : 0

Full code

use [dbName]

/* Removing data from large table in MSSQL using batch */

CREATE TABLE [Logs] (
    [LogId] INTEGER NOT NULL IDENTITY(1, 1), 
    [Text] VARCHAR(MAX) NULL,
    [Severity] VARCHAR(7) NULL,
    [Created] DATETIME,
    PRIMARY KEY ([LogId])
);
GO

INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 1','Error',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 2','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 3','Info',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 4','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 5','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 6','Error',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 7','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 8','Warning',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 9','Info',GETDATE())
INSERT INTO [Logs]([Text],[Severity],[Created]) VALUES('Lorem ipsum 10','Info',GETDATE())

SELECT count(1) FROM dbo.Logs;

DECLARE @tempLogIdWithPageIndexTable TABLE
(
	LogId INT,
	PageIndex INT
)

DECLARE @pageSize INTEGER = 2;
DECLARE @totalCount INTEGER
​
INSERT INTO @tempLogIdWithPageIndexTable (LogId, PageIndex) (
	SELECT  LogId, 
	CEILING(CAST(ROW_NUMBER() OVER(ORDER BY LogId) AS DECIMAL)/CAST(@pageSize AS DECIMAL)) AS PageIndex
	FROM [Logs] -- You can add WHERE conditions here
)

-- Get total counts of records to remove

SELECT @totalCount = count(1) FROM @tempLogIdWithPageIndexTable
​
DECLARE @pageCount INTEGER
SET @pageCount = CEILING(CAST(@totalCount as DECIMAL) / CAST(@pageSize as DECIMAL))
​
DECLARE @pageIndex INTEGER = 1
​
-- print the report with metrics before start

PRINT 'Total number of records to remove : ' + CAST(@totalCount as VARCHAR(MAX))
PRINT 'Page size : ' + CAST(@pageSize as VARCHAR(MAX))
PRINT 'Total pages : ' + CAST(@pageCount as VARCHAR(MAX))
PRINT 'Script started at : ' + CONVERT(varchar(25), getdate(), 120) 

-- delete records in a loop by paging

WHILE(@pageIndex <= @pageCount)
BEGIN	
	-- you can delete related entries first using JOIN here
	DELETE [dbo].Logs where LogId in 
	(SELECT LogId FROM @tempLogIdWithPageIndexTable WHERE PageIndex = @pageIndex)
​
	PRINT 'Processed page ' + CAST(@pageIndex as VARCHAR(MAX)) + '/' + CAST(@pageCount as VARCHAR(MAX))
	SET @pageIndex = @pageIndex + 1
END

PRINT 'Script finished at : ' + CONVERT(varchar(25), getdate(), 120) 

SET @totalCount = (SELECT count(1) FROM dbo.Logs)

-- print the report when the job is done
PRINT 'Number of records after deletion : ' + CAST(@totalCount as VARCHAR(MAX))

SELECT count(1) FROM dbo.Logs;

Bonus: Efficiently Clearing All Records from an MSSQL Table

Desiring a swift cleanse of your entire MSSQL table?

Here's how:

1. Utilizing TRUNCATE: The Speedster Command 🚀

The most straightforward method to obliterate all rows from a table is wielding the TRUNCATE command.

TRUNCATE TABLE table_name;

2. Foreign Key Constraints: A Note of Caution ⚠️

It's crucial to remember: TRUNCATE doesn't play well if your table has foreign key constraints. But, fear not; there's a handy detour.

Action Steps:

  • Drop the FK Constraints: This temporary removal ensures a smooth truncation.

  • Execute TRUNCATE Command: As demonstrated above.

  • Rebuild the FK Constraints: Reinstate them to ensure your data relationships remain intact.

3. Resetting Identity Post-Cleanse 🔢

After waving goodbye to all rows, you might want to reset the table's identity value.

Here’s how:

DBCC CHECKIDENT ('table_name', RESEED, 0)

You've now efficiently cleared your table while ensuring database integrity remains unscathed.

Key Takeaways and Concluding Thoughts

The paginated deletion technique is invaluable when managing large datasets.

It offers a balance between efficiency and resource consumption.

This method ensures data integrity 🔒, avoids system strain💥, and guarantees optimum performance.

📁 Fetch the complete source code for this guide over on GitHub repository

↑ Top ↑