Temporal Tables - Tales from the Field
- Posted in:
- SQL Server
- Database Administration
My organization has a few different approaches for maintaining and querying old "versions" of data: the best of them is a traditional design that uses triggers and a secondary history/audit table. I'm generally OK with that design. However, most of the other approaches I've seen are not very good, in my opinion. Because of that, I've wanted to use SQL Server temporal tables for quite some time. I finally got an opportunity to use one in production. Almost everything went smoothly, but there were a couple of unexpected side effects I wanted to share.
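For context, the trigger-based design I mentioned typically looks something like the following sketch. The table and column names here are assumptions for illustration, not our actual schema:

```sql
--Hypothetical trigger-based history design (names are illustrative).
--Before a row changes or is deleted, the prior version is copied to an audit table.
CREATE TRIGGER dbo.TR_SampleTable_Audit
ON dbo.SampleTable
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    --The "deleted" pseudo-table holds the pre-change version of each row.
    INSERT INTO audit.SampleTableHistory (Id, SomeColumn, AuditDate)
    SELECT d.Id, d.SomeColumn, SYSUTCDATETIME()
    FROM deleted AS d;
END
```

It works, but every table needs its own trigger and audit table, which is exactly the kind of plumbing temporal tables handle for you.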
The source table that was chosen used a design that periodically inserted the entire contents of the table into a history/audit table throughout the day. A secondary process would delete rows more than 30 days old. Periodic "snapshots" of the data are better than nothing, I suppose. But the cost was fairly high: the source table had about 10 million rows, and the history/audit table grew by 50 million rows per day. A month of data resulted in about 1.5 billion rows. Unfortunately, the source table was rather "wide", with data types that consumed a lot of space. Even with PAGE compression and no nonclustered indexes, the table consumed 480 GB of space on disk.
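Conceptually, the old snapshot process was something like this sketch (the object names, snapshot column, and schedule are assumptions on my part):

```sql
--Hypothetical "snapshot" job, run several times per day:
--copy every row in the source table to the audit table, stamped with a date.
INSERT INTO dbo.SampleTableAudit (SnapshotDate, Id, SomeColumn)
SELECT SYSUTCDATETIME(), st.Id, st.SomeColumn
FROM dbo.SampleTable AS st;

--Hypothetical cleanup job: purge snapshots older than 30 days.
DELETE FROM dbo.SampleTableAudit
WHERE SnapshotDate < DATEADD(DAY, -30, SYSUTCDATETIME());
```

With 10 million rows per snapshot and multiple snapshots per day, the 50-million-rows-per-day growth rate follows directly.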
This post isn't intended to be a "how to" guide for temporal tables, but I have included the script/code that was used. Note that the version used was SQL Server 2017, the history table was created in a "temporal" schema, row start/row end columns were hidden, and a history retention period of one month was specified.
```sql
USE SampleDatabase;
GO

--Enable TEMPORAL_HISTORY_RETENTION as needed.
IF EXISTS (
    SELECT *
    FROM sys.databases d
    WHERE d.name = DB_NAME()
    AND d.is_temporal_history_retention_enabled = 0
)
BEGIN
    ALTER DATABASE SampleDatabase SET TEMPORAL_HISTORY_RETENTION ON;
END
GO

IF SCHEMA_ID('temporal') IS NULL
BEGIN
    EXEC('CREATE SCHEMA temporal AUTHORIZATION dbo;');
END
GO

/*
--Add the Period columns.
NOTE: ALTER TABLE only allows columns to be added that can contain nulls,
or have a DEFAULT definition specified, or the column being added is an
identity or timestamp column, or alternatively if none of the previous
conditions are satisfied the table must be empty to allow addition of this
column. Column 'ValidFrom' cannot be added to non-empty table 'SampleTable'
because it does not satisfy these conditions.
*/
ALTER TABLE dbo.SampleTable ADD
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL
        CONSTRAINT DF_SampleTable_ValidFrom DEFAULT (CAST(0 AS DATETIME)),
    --HISTORY_RETENTION_PERIOD is based on the "ROW END" column.
    ValidTo DATETIME2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL
        CONSTRAINT DF_SampleTable_ValidTo DEFAULT ('9999-12-31 23:59:59.9999999'),
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo);
GO

ALTER TABLE dbo.SampleTable SET
(
    SYSTEM_VERSIONING = ON
    (
        HISTORY_TABLE = temporal.SampleTableHistory,
        DATA_CONSISTENCY_CHECK = ON,
        HISTORY_RETENTION_PERIOD = 1 MONTH --SQL2017+
    )
);
GO
```
Our workload generated 20-25 million rows per day in the history table--roughly half as many rows as the old design. I was hoping the number of rows would be significantly less than this. On the other hand, the new history table is only 182 GB, vs. 480 GB for the old table. I'm happy with the overall reduction, especially the reduced size of the table on disk. The source table is highly transactional, even though it's wide and not designed well for a high transaction volume. I discovered a single row might be updated as many as five times per second by our DML activity. I was hoping I might be able to streamline the code and reduce the number of updates per row, but it's complicated--I decided to leave it alone.
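One nice payoff of the migration: instead of joining to the audit table and picking the right snapshot, point-in-time queries become a `FOR SYSTEM_TIME` clause. A minimal example (the key value and timestamp are made up; note the hidden period columns must be named explicitly to appear in the results):

```sql
--Retrieve a row as it existed at a specific moment.
--SQL Server transparently unions the current and history tables as needed.
SELECT st.Id, st.SomeColumn, st.ValidFrom, st.ValidTo
FROM dbo.SampleTable FOR SYSTEM_TIME AS OF '2019-06-01T00:00:00' AS st
WHERE st.Id = 42;
```

Other sub-clauses (`FROM ... TO`, `BETWEEN ... AND`, `CONTAINED IN`, `ALL`) cover ranges and full-history queries.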
One stored procedure experienced serious performance degradation. A query on the source table that normally ran in a few hundred milliseconds began taking up to five minutes. While running sp_WhoIsActive, I could see the wait type. It was one of the HT* wait types, but I forget which one. (I think it was HTREPARTITION.) A quick search revealed those wait types are associated with parallel batch-mode plans. So I checked Query Store: prior to implementing the temporal table, the execution plan was single threaded. Afterwards, average DOP was eight. Although I wasn't crazy about doing it, I added a query hint to force DOP of one, which got performance back to where it had been previously.
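The fix itself was simple. The query below is just an illustrative shape, not the actual procedure, but the hint is the one I used:

```sql
--Hypothetical query shape; the columns and predicate are assumptions.
--OPTION (MAXDOP 1) forces a serial plan, which eliminated the
--parallel batch-mode HT* waits in my case.
SELECT st.Id, st.SomeColumn
FROM dbo.SampleTable AS st
WHERE st.SomeColumn = @SomeValue
OPTION (MAXDOP 1);
```

Forcing DOP is a blunt instrument, and it pins the plan's behavior going forward, which is why I wasn't crazy about it. But it was a fast, low-risk way to restore the previous performance.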
There is a clustered index on the history table (ValidFrom, ValidTo), which experienced some fragmentation over time. I run Ola Hallengren's maintenance solution, and much to my surprise, an INDEX REORGANIZE operation caused some blocking. The blocking victim? It was another SQL Agent job step that uses Ola's Maintenance Solution to perform statistics updates. The blocking began when the stats update routine got to the history table. Ola gives us numerous ways to address that situation, so I'm not too concerned. But the big takeaway was the high amount of fragmentation on the history table.
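As one example of the knobs Ola's solution provides, the statistics-update job could exclude the history table so it no longer collides with the REORGANIZE, and the history table could be handled on its own schedule. The database and table names below match my earlier script but are otherwise an assumption about how you'd wire it up:

```sql
--Update statistics for everything EXCEPT the temporal history table.
--The leading minus sign in @Indexes excludes an object.
EXECUTE dbo.IndexOptimize
    @Databases = 'SampleDatabase',
    @FragmentationLow = NULL,
    @FragmentationMedium = NULL,
    @FragmentationHigh = NULL,
    @UpdateStatistics = 'ALL',
    @Indexes = 'ALL_INDEXES, -SampleDatabase.temporal.SampleTableHistory';
```

Whether you exclude the table, stagger the job schedules, or adjust fragmentation thresholds, the point is the same: a temporal history table this hot deserves its own maintenance strategy.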