Data structures and transaction logging
We usually think of a database as a physical structure consisting of tables containing columns and rows of data, plus indexes. However, this is just a human point of view. From SQL Server's perspective, a database is a set of precisely structured files described by metadata that is itself stored within the database. We are starting this chapter with an explanation of storage internals because a conceptual picture of how every database works is very helpful when the database needs to be backed up correctly.
How data is stored
Every database on SQL Server must have at least two files:
- The primary data file, with the usual suffix .mdf
- The transaction log file, with the usual suffix .ldf
For most databases, this minimal set of files is enough. However, when the database contains large amounts of data, or when it faces heavy contention, such as systems handling thousands of transactions per second, it's good practice to design more data files. Another situation where the basic set of files is not enough arises when documents or pictures are saved along with relational data. SQL Server can still store all our data in the basic file set, but this can lead to performance bottlenecks and management issues. That's why we need to know about all the possible storage types that are useful in different deployment scenarios. The complete structure of files is depicted in the following diagram:
For administrators who have not focused on the database structure of SQL Server before, this tree of objects may be unclear and confusing. To make sure there's no confusion, let's explore every single node illustrated in the preceding diagram.
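Before exploring the nodes one by one, here is a minimal sketch of the multi-file design mentioned earlier: a database created with a primary data file, a secondary data file placed on a different drive, and a transaction log file. All the names and paths (BigData, D:\SQLData\, and so on) are examples only:

```sql
CREATE DATABASE BigData
ON PRIMARY
(
    NAME = BigData_Primary,
    FILENAME = 'D:\SQLData\BigData.mdf',   -- primary data file (.mdf)
    SIZE = 512MB,
    FILEGROWTH = 128MB
),
(
    NAME = BigData_Data01,
    FILENAME = 'E:\SQLData\BigData_Data01.ndf',  -- secondary data file (.ndf)
    SIZE = 1024MB,
    FILEGROWTH = 256MB
)
LOG ON
(
    NAME = BigData_Log,
    FILENAME = 'F:\SQLLog\BigData.ldf',    -- transaction log file (.ldf)
    SIZE = 256MB,
    FILEGROWTH = 128MB
);
```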
Database types
A relational database is defined as a complex data type. This complex type consists of tables with a given number of columns. Each column has a domain, which is actually a data type (such as an integer or a date) that's optionally complemented by some constraints.
SQL Server sees a database as a record written in metadata, containing the name of the database, the properties of the database, and the names and locations of all the files or folders that represent storage for the database. This is the same for user databases as well as for system databases.
System databases are created automatically during SQL Server installation, and they are crucial for running SQL Server correctly. There are five system databases. Let's take a look at them now.
The master database
The master database is the basis for correctly running the SQL Server service. Logins, all databases and their files, instance configurations, linked server definitions, and lists of error messages are all examples of data stored in the master database. SQL Server finds this database at startup through two startup parameters, -d and -l, followed by the paths to the master database's .mdf and .ldf files. These parameters are very important when the administrator wants to move the master database's files to a different location. Their values can be changed in SQL Server Configuration Manager by selecting the Startup Parameters tab in the Properties dialog of the SQL Server service. If the master database is not found or is corrupted, the SQL Server instance cannot start.
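For illustration, the two parameters look as follows on a default SQL Server 2019 instance (the paths shown are the setup defaults; they will differ if the files have been moved):

```
-dC:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\DATA\master.mdf
-lC:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\DATA\mastlog.ldf
```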
The msdb database
The msdb database serves as the storage for SQL Server Agent objects, Database Mail definitions, Service Broker configurations, Integration Services objects, and so on. This data is used mostly for SQL Server automation, such as SQL Server Agent jobs, or for diagnostics, such as SSIS logs or Database Mail logs. The msdb database also stores the history of backup and restore events for each database. If this database is corrupted or missing, SQL Server Agent cannot start, and many other features, such as Service Broker or Database Mail, won't be accessible.
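As a quick sketch of how this history can be used, the following query against the msdb.dbo.backupset and msdb.dbo.backupmediafamily tables lists the most recent backup events recorded on the instance:

```sql
SELECT TOP (20)
    bs.database_name,
    bs.type,                      -- D = full, I = differential, L = log
    bs.backup_finish_date,
    bmf.physical_device_name      -- where the backup was written
FROM msdb.dbo.backupset AS bs
JOIN msdb.dbo.backupmediafamily AS bmf
    ON bmf.media_set_id = bs.media_set_id
ORDER BY bs.backup_finish_date DESC;
```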
The model database
The model database serves as a template for every new database as it is being created. During a database's creation (see the CREATE DATABASE statement on MSDN), files are created on the defined paths, and all the objects, data, and properties of the model database are copied into the new database. This database must always exist on the instance, because when it is corrupted, the tempdb database cannot be created at instance startup!
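A small, hypothetical demonstration of this template behavior follows; the dbo.AuditLog table name is an example only. Keep in mind that because tempdb is also created from the model database, anything placed in model will reappear in tempdb after every restart:

```sql
USE model;
GO
-- Example object only: every database created from now on will contain it
CREATE TABLE dbo.AuditLog
(
    Id int IDENTITY(1,1) PRIMARY KEY,
    EventTime datetime2 NOT NULL DEFAULT SYSDATETIME()
);
GO
CREATE DATABASE NewDatabase;
GO
-- NewDatabase now already contains the dbo.AuditLog table
```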
The tempdb database
Even though the tempdb database seems to be a regular database like many others, it plays a very special role in every SQL Server instance. This database is used by SQL Server, as well as by developers, to save temporary data such as temporary tables, table variables, and static cursors (although this is not the best practice). As this database is intended only for objects with a short lifespan (temporary data that lives at most for the duration of a stored procedure's execution or until the session disconnects), SQL Server clears this database by dropping and recreating it every time the service starts.
As the tempdb database never contains durable data, SQL Server applies some special internal behavior to it. This is the reason why accessing data in this database is several times faster than accessing durable data in other databases. If this database is corrupted, simply restart SQL Server; tempdb is recreated at every startup.
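The following short sketch shows the typical temporary objects mentioned above; both are physically stored in tempdb, regardless of which database the session is currently using:

```sql
-- Local temporary table: visible to this session only,
-- dropped automatically when the session disconnects
CREATE TABLE #Orders
(
    OrderId int NOT NULL,
    Amount money NOT NULL
);

-- Table variable: also backed by tempdb, scoped to the batch
DECLARE @TopOrders TABLE (OrderId int NOT NULL);

INSERT INTO #Orders (OrderId, Amount) VALUES (1, 100), (2, 250);
```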
The resourcedb database
The resourcedb database is the fifth in our enumeration and contains the definitions of all the system objects of SQL Server, for example, sys.objects. This database is hidden and read-only, so we rarely need to manage it directly.
It is not configurable, and we do not use regular backup strategies for it. It is always placed in the installation path of SQL Server (in the Binn directory), so it is backed up as part of the filesystem backup. In the case of an accident, it is recovered as part of the filesystem as well.
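Although the resourcedb database itself is hidden, its version and the time it was last updated can be checked with the documented SERVERPROPERTY function:

```sql
SELECT
    SERVERPROPERTY('ResourceVersion')            AS ResourceDbVersion,
    SERVERPROPERTY('ResourceLastUpdateDateTime') AS ResourceDbLastUpdate;
```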
We will now explore the filegroup node.
Filegroup
A filegroup is an organizational metadata object that contains one or more data files. A filegroup does not have its own representation in the filesystem; it's just a named group of files (or folders). When any database is created, a filegroup called PRIMARY is always created, and this primary filegroup always contains the primary data file.
Filegroups can be divided into the following:
- Row storage filegroups: These filegroups can contain data files (.mdf or .ndf).
- Filestream filegroups: This kind of filegroup contains not files but folders, used to store binary data such as scanned documents or pictures. Using filestream filegroups for BLOB data brings better manipulation performance, because the BLOB's byte stream is stored in a secured folder on disk rather than in row storage data pages, which allows for more efficient read and write operations.
- In-memory filegroup: Only one instance of this kind of filegroup can be created in a database. Internally, it is a special case of a filestream filegroup and it is used by SQL Server to persist data from in-memory tables.
Every filegroup has four simple properties:
- Name: This is the descriptive name of the filegroup. The name must fulfill the naming convention criteria.
- Default: Within a set of filegroups of the same type, exactly one has this option set to on. This means that when a new table or index is created without being assigned to a specific filegroup, the default filegroup is used to store its data. Initially, the primary filegroup is the default. The in-memory filegroup does not have this property, because a database cannot have two or more in-memory filegroups.
- Read-only: Every filegroup, except the primary and in-memory filegroups, can be set to read-only. Let's say that a filegroup is created for last year's historical data. When the data is moved from the current period into the tables created in this historical filegroup, the filegroup can be set to read-only. Marking the filegroup as read-only prevents users from making any changes to the data placed in it. For administrators, read-only filegroups can reduce backup times, as the read-only setting ensures that the data in the filegroup cannot change.
- Autogrow All Files: This property of row storage filegroups was introduced in SQL Server 2016. When a filegroup contains more than one file, SQL Server distributes data across all the files using the proportional fill algorithm (PFA), which writes more data to the file with more free space. If only one of the files grows automatically, that file then receives most of the new data, which leads to an uneven distribution of data. Turning on the Autogrow All Files property ensures that when an autogrow event occurs, SQL Server grows all the files in the filegroup together, keeping their sizes the same. A sketch of these filegroup operations follows this list.
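The following sketch brings the properties above together. It reuses the illustrative BigData database from the earlier listing, adds a historical filegroup, and sets the autogrow, read-only, and default options (all names and paths are examples only; note that the default filegroup itself cannot be made read-only):

```sql
ALTER DATABASE BigData ADD FILEGROUP History;

ALTER DATABASE BigData ADD FILE
(
    NAME = BigData_History01,
    FILENAME = 'E:\SQLData\BigData_History01.ndf',
    SIZE = 512MB
) TO FILEGROUP History;

-- Keep all files in the filegroup evenly sized when an autogrow event occurs
ALTER DATABASE BigData MODIFY FILEGROUP History AUTOGROW_ALL_FILES;

-- Once last year's data has been loaded, protect it against changes
ALTER DATABASE BigData MODIFY FILEGROUP History READ_ONLY;

-- A different (writable) filegroup can be made the default,
-- so new tables and indexes land there instead of in PRIMARY
ALTER DATABASE BigData ADD FILEGROUP CurrentData;
ALTER DATABASE BigData ADD FILE
(
    NAME = BigData_Current01,
    FILENAME = 'E:\SQLData\BigData_Current01.ndf',
    SIZE = 512MB
) TO FILEGROUP CurrentData;
ALTER DATABASE BigData MODIFY FILEGROUP CurrentData DEFAULT;
```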
Tip
It is a good approach to divide a database into smaller parts known as filegroups. This helps distribute data across more physical storage and makes the database more manageable; backups can be done part by part in shorter times, which fits better into a service window.
Data files
Every database must have at least one data file, called the primary data file. This file is always bound to the primary filegroup. It stores all the metadata of the database, such as structure descriptions (which can be seen through views such as sys.objects, sys.columns, and others), users, and so on. If the database does not have other data files (in the same or other filegroups), all user data is also stored in this file; this approach is good enough only for smaller databases.
Considering how the volume of data in a database grows over time, it is good practice to add more data files. These files are called secondary data files. Secondary data files are optional and contain user data only.
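To see which data files a database currently has, we can query the sys.database_files catalog view. The size column counts 8 KB pages, so multiplying by 8 and dividing by 1,024 converts it into megabytes:

```sql
SELECT
    file_id,
    type_desc,                     -- ROWS, LOG, FILESTREAM, ...
    name,                          -- logical file name
    physical_name,                 -- path on disk
    size * 8 / 1024 AS size_mb
FROM sys.database_files;
```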
Both types of data files have the same internal structure: every file is divided into small 8 KB units called database pages. SQL Server maintains several types of database pages, such as data pages and index pages, Index Allocation Maps (IAMs) to locate the pages belonging to tables or indexes, and Global Allocation Maps (GAMs) and Shared Global Allocation Maps (SGAMs) to track allocations within the database. Regardless of its type, a database page is the smallest unit of logical I/O operations that SQL Server works with in memory. Let's describe some common properties of a database page:
- A data page never contains the data of several objects.
- Data pages do not know each other (and that's why SQL Server uses IAMs to allocate all the pages of an object).
- Data pages do not have any special physical ordering.
- A data row must always fit into a data page (this rule is not absolute, as SQL Server uses row overflow data to store data that exceeds the 8,060-byte limit for a record).
These properties may seem unimportant, but knowing them helps us optimize and manage our databases better.
Did you know that a data page is the smallest storage unit that can be restored from a backup?
As a data page is quite a small storage unit, SQL Server groups data pages into bigger, logical units called extents. An extent is a logical allocation unit containing eight physically contiguous data pages, 64 KB in total. When SQL Server requests data from disk, whole extents are read into memory. This is why 64 KB NTFS clusters are recommended when formatting disk volumes for data files. Extents can be uniform or mixed: a uniform extent contains data pages belonging to one object only, while a mixed extent contains the data pages of several objects.
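Starting with SQL Server 2019, the header of any single page can be examined with the sys.dm_db_page_info dynamic management function. The page address used here (file 1, page 1) is just an example; any valid page in the current database works:

```sql
-- Shows, among other things, the page type, the owning object,
-- and how many bytes on the page are still free
SELECT page_type_desc, object_id, index_id, free_bytes
FROM sys.dm_db_page_info(DB_ID(), 1, 1, 'DETAILED');
```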
Transaction log
When SQL Server processes any transaction, it works in a way called a two-phase commit. When a client starts a transaction by sending a single Data Manipulation Language (DML) request or by calling the BEGIN TRAN command, SQL Server reads the affected data pages from disk into the buffer cache and makes the requested changes to these pages in memory. When the DML request is executed or the COMMIT command comes from the client, the first phase of the commit is completed, but the data pages in memory now differ from their original versions in the data file on disk. Such a changed page in memory is said to be dirty.
While a transaction runs, SQL Server writes a very detailed, chronological description of every single action performed during the transaction to the transaction log file. This technique of recording every change in the log before it reaches the data file is called write-ahead logging (WAL), and it is one of the oldest mechanisms in SQL Server.
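To see these chronological records, the undocumented (but widely used) fn_dblog function can be queried; it returns the active portion of the current database's transaction log. This is for illustration and troubleshooting only, not for production code:

```sql
SELECT TOP (10)
    [Current LSN],      -- log sequence number: the chronological ordering
    Operation,          -- e.g. LOP_BEGIN_XACT, LOP_MODIFY_ROW, LOP_COMMIT_XACT
    [Transaction ID]
FROM fn_dblog(NULL, NULL);
```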
The second phase of the commit usually does not depend on the client's request; it is an internal process called a checkpoint. A checkpoint is a periodic action that does the following:
- Searches for dirty pages in the buffer cache
- Saves dirty pages to their original data file location
- Marks these data pages as clean in the buffer cache (or drops them out of memory to free memory space)
- Marks these transactions as checkpointed (inactive) in the transaction log
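A checkpoint can also be requested manually with the CHECKPOINT command, whose optional argument is the desired duration of the operation in seconds:

```sql
USE BigData;   -- the example database from the previous listings
CHECKPOINT;    -- or CHECKPOINT 10; to spread the work over roughly 10 seconds
```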
SQL Server needs the write-ahead log during the recovery process. The recovery process runs on every database each time the SQL Server service starts. When the SQL Server service stops, some pages could remain in a dirty state and their contents are lost from memory. This can lead to two possible situations:
- The transaction is completely described in the transaction log (from BEGIN TRAN to COMMIT), the new content of the data page is lost from memory, and the data pages are not changed in the data file.
- The transaction is not completed when SQL Server stops, so the transaction is not completely described in the transaction log. Data pages in memory are not in a stable state (because the transaction did not finish and SQL Server cannot know if COMMIT or ROLLBACK will occur), and the original versions of the data pages in the data files are intact.
SQL Server decides between these two situations as it starts. If a transaction is completely recorded in the transaction log but was not checkpointed, SQL Server executes the transaction again, completing both phases of the commit. If the transaction was not complete in the transaction log when SQL Server stopped, SQL Server can never know what the user's intention for the transaction was, so the incomplete transaction is erased from the transaction log as if it had never started.
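The outcome of the recovery process for each database is written to the SQL Server error log at startup. Using the undocumented but commonly used xp_readerrorlog procedure, the relevant messages can be found like this:

```sql
-- Parameters: error log number (0 = current), log type (1 = SQL Server),
-- and a search string; matching lines report how many transactions were
-- rolled forward (redone) and rolled back per database during recovery
EXEC xp_readerrorlog 0, 1, N'transactions rolled';
```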