I’ve put together a number of posts that will hopefully both inform and put to bed some misconceptions around the use of a Transaction Log within a database context and how SQL Server makes use of it across a number of features such as normal day-to-day transactions, Backups, Checkpoints/Lazywriter/Eagerwriter, Mirroring, Internal structure of the transaction log, DELAYED_DURABILITY feature and In-Memory tables.
Part 1: ACID, Write Ahead Logging, Structure of the Log (VLF’s) and the Checkpoint/Lazywriter/Eagerwriter process.
Part 2: Mirroring/Availability Groups and Full Database Backups
Part 3: Database DELAYED_DURABILITY feature (in the planning)
Part 4: Why my log won’t shrink – reasons to understand Parts 1 to 3 (in the planning)
Part 5: In-Memory Tables
ACID (Atomicity, Consistency, Isolation, Durability)
Databases store data, applications update data. What happens if in the middle of your update the server or your application crashes – what is your expected result when you revisit your data?
I would hope the answer to that question would be – I’d expect my data to be as it was before I started my update i.e. in a known state rather than either part updated or left in a corrupted state.
A database itself is not ACID compliant, data changes are applied inside Transactions – a database product will provide a means of applying the ACID rules through a Transaction. SQL Server by default provides transactions that satisfy the ACID rules but it also supplies a means of satisfying just the ACI rules – without Durability e.g. DELAYED_DURABILITY and SCHEMA_ONLY option for in-memory tables.
The transaction log is used to apply the Atomicity (all or nothing) and Durability (when it’s written it’s definitely written) rules in ACID, the next section on Write Ahead Logging (WAL) explains how.
For more information on ACID see: https://en.wikipedia.org/wiki/ACID
Write Ahead Logging (WAL)
As a programming task how would resolve the following requirement: you have 50GB of data, 1GB of memory and your update will cause all the 50GB to change, if the update should fail then according to the AD rules in ACID the original data (what is termed the known state) is returned, e.g. 50% of the 50GB has been updated but the program crashes – how do you get back to the known state?
The transaction log is a file separate to the main data, it contains a “log” of updates, in SQL Server anytime you cause a page to be modified data is written to the transaction log. Each database has it’s own transaction log file, it can have multiple log files however they are written to one at a time unlike data files which stream across using the proportional fill algorithm. The other thing to note is when a transaction log is first built or auto-grows the newly created piece of file needs to be zero initialised i.e. unlike data files which can be initialised using “instant file initialisation” the entire new piece of the file needs to be initialised on the file system which may take sometime.
What do we mean by Write Ahead Logging?
Say you execute the command UPDATE mytable SET x = 10 WHERE key = 1234 what happens in SQL Server when applying the AD rules in ACID?
The page containing the existing data row is fetched into the Buffer Pool, a transaction start marker is written to the transaction log to indicate a transaction for this session has begun; the data is then modified in the Buffer Pool, the modified page is written to the transaction log and on commit a transaction end marker placed in the log. At this point there has been no writes to the data file, the modified data is physically on storage in the transaction log file and in memory in the Buffer Pool.
What happens if SQL Server now fails i.e. before the committed data has had a chance to be written to the actual data files? When SQL Server starts up and the Database goes through the recovery process the transaction log is read sequentially bringing the data files up-to-date with any uncommitted transactions rolled back and any committed transactions rolled forward, the database is now in a consistent state.
Structure of the Transaction Log (VLF’s)
When creating a database the log file is specified using the LOG ON section of CREATE DATABASE, when adding files it’s specified in the ADD LOG FILE section of ALTER DATABASE.
The physical log file is segmented into “Virtual Log Files”, the size of the VLF’s are dependant on the initial and auto-growth increment sizes, small VLF’s can impact database recovery as well as backup procedures so make sure you size them properly. VLF’s are important because it is at that level the active portion of the log is defined, VLF’s in the active portion of the log cannot be truncated so as your database receives more transactions the log will continue to grow in size until the VLF’s in the active portion of the log are dealt with by moving forward the Minimum LSN (Log Sequence Number).
We now have a container to place our transactions; a transaction is made of one or more operations, each transaction has a unique number and so does each transaction log entry, the latter we call “Log Sequence Number”. The LSN is a key piece of information and critical to the consistency of data within the Database, you will even find a LSN in the page header records of data.
Remember the purpose of the transaction log is provide a write ahead log so the rules of A (Atomicity) D (Durability) of ACID can be maintained. The Atomicity rule means for UPDATE’s the old and new rows both need to be logged because on recovery it may be required to roll the transaction back (use the old version) or roll the transaction forward (use the new version).
For more information see: TechNet – SQL Server Transaction Log Architecture and Management
I said earlier that when you are doing an operation that modifies a page e.g. an INSERT, the modification is performed in the Buffer Pool (creating or modifying an existing dirty page) and with a corresponding physical write to storage but the Transaction Log (VLF) rather than the Data file itself.
On database recovery, because we have the initial data files that have not been modified yet and every page modification done against the database resides in the transaction log we can bring the database back to a consistent state. This has a number of downsides, if you run out of free pages in the Buffer Pool because they are all “dirty” and you aren’t modifying existing “dirty pages” then you can’t do any more page modifications, also, the transaction log can become extremely large meaning you may face the possibility of running out of storage or the recovery time will also suffer because when the database starts up it has to read a very large transaction log during the “recovery” process.
There is a restriction: the transaction log cannot be reduced in size i.e. VLF’s freed up until dirty pages are written to Data files on storage. You can backup the transaction log until you are blue in the face, the backup can use the LSN to determine what it last backed up so that it doesn’t back up the same part of the log, for example if you have a 50GB used portion of the log and have not done a log backup (assumes Full recovery here) then the first log backup will be 50GB i.e. the part of the log never backed up, the second log backup will only backup what’s new so if no new modifications then it will be very small – to be clear, if your used portion of the transaction log is consistently 50GB over 5 transaction log backups it is not the case that each back up file will be 50GB, 50GB, 50GB, 50GB and 50GB, instead it will be more like 50GB, 10KB, 10KB, 10KB and 10KB for instance (note 10KB is just to illustrate and is not an actual determined value in SQL Server).
Dirty Pages are linked to the transaction log through the Page header, each page header contains a min_lsn number which reflects the most recent LSN where the data on that page exists within the transaction log (remember – WAL). That gives us a number we can use to determine which pages have been written to the data files and as such we have the minimum LSN in the transaction log that needs to be active in order to provide database recovery. For example, we do a modification that changes pages 1 – 128, remember WAL requires that those modifications be written to the transaction log, so we now have 128 dirty pages for example 1 – 128 in the Buffer Pool and for the purpose of this example 128 operations on the transaction log for example LSN 10000 – 10128. To provide database recovery we need LSN 10001 – 10128 because none of the dirty pages have been written to the Data file storage – say we now write pages 1 – 64 to the Data file, the minimum LSN can now move to 65 so we need to keep LSN 10065 – 10128 to provide database recovery because our Data files contain the data that has log operations at LSN 10001 – 10064, the VLF’s holding LSN 10001 – 10064 can be marked inactive if they don’t contain operations for LSN above 10064, the inactive VLF’s can now be purged for reuse (taking requirements for backups aside – will cover that in the next post).
The checkpoint process looks for dirty pages in the Buffer Pool, each page header contains the min_LSN, if the page min_LSN is greater than the last checkpoint LSN (stored in the database boot page) then the page should be written to the Data File storage and the page marked “clean”. The pages are left in the buffer pool, it’s job is really to cut down on the time taken in case the database requires recovery – the more recent the checkpoint the less operations in the log to roll forward or roll backwards.
The lazywriter also writes dirty pages to storage however unlike the Checkpoint process it removes the pages, it focuses on older less used pages with the goal to keep a reasonable number of free pages in the buffer cache to pull data from storage/create new data. The checkpoint is more aggressive at bulking up dirty pages and writing them out to storage although in more recent versions it is less problematic and doesn’t entirely swamp the IO subsystem.
For minimally logged operations for example BULK INSERT the eagerwriter writes dirty pages to storage without waiting for the operations to complete which might starve the available buffers.
Write ahead logging provides a method of applying the Atomicity and Durability rules of ACID. Data is written in the Buffer Pool then to the transaction log, the transaction log contains details of every operation that modifies database pages (and thus anything in the database), the transaction log can be used to recover the Data Files to a known state – rolling forward transactions or rolling them back.