Microsoft SQL Server 2012 Internals

Kalen Delaney, Bob Beauchemin, Conor Cunningham, Jonathan Kehayias, Benjamin Nevarez, Paul S. Randal

Mentioned 2

Dive deep inside the architecture of SQL Server 2012 Explore the core engine of Microsoft SQL Server 2012--and put that practical knowledge to work. Led by a team of SQL Server experts, you'll learn the skills you need to exploit key architectural features. Go behind the scenes to understand internal operations for creating, expanding, shrinking, and moving databases--whether you're a database developer, architect, or administrator. Discover how to: Dig into SQL Server 2012 architecture and configuration Use the right recovery model and control transaction logging Reduce query execution time through proper index design Track events, from triggers to the Extended Event Engine Examine internal structures with database console commands Transcend row-size limitations with special storage capabilities Choose the right transaction isolation level and concurrency model Take control over query plan caching and reuse

More on Amazon.com

Mentioned in questions and answers.

I'm reading the book Microsoft SQL Server 2012 Internals and try to understand how SQL Server stores data internally. Below is an excerpt from the book.

The structure of data rows
==========================

A tables data rows have the general structure shown in the figure below (as long as the data
is stored in uncompressed form). This format is called the FixedVar format because the data
for all fixed-length columns is stored first, followed by the data for all variable-length columns. 

+-----------------------------------------------------------------------------------------------
|1 byte  |1 byte  |2 bytes |n bytes |2 bytes |NULL bitmap|2 bytes |Column offset array|n bytes |
+---^--------^-------^----------^-------^----------^----------^------------^---------------^----
    |        |       |          |       |          |          |            |               |
    |        |       |          |       |          |          |            |               +
    |        |       |          |       |          |          |            |      data for
    |        |       |          |       |          |          |            |      variable-length columns
    |        |       |          |       |          |          |            |
    |        |       |          |       |          |          |            |
    |        |       |          |       |          |          |            +
    |        |       |          |       |          |          |    2x # varlength columns
    |        |       |          |       |          |          +
    |        |       |          |       |          |     Number of variable-length
    |        |       |          |       |          |     columns
    |        |       |          |       |          +
    |        |       |          |       |      NULL bit map, one bit
    |        |       |          |       |      for each column
    |        |       |          |       +
    |        |       |          |     Number of fixed-length
    |        |       |          |     columns
    |        |       |          +
    |        |       |       Fixed length data
    |        |       +
    |        |      Length of fixed-length portion of row,
    |        |      not including the 2 bytes for the number
    |        |      of columns and the NULL bitmap
    |        +
    |      Status bits B
    +
 Status bits A

My testing is as follows.

USE test;
GO

IF OBJECT_ID(N'dbo.t1', N'U') IS NOT NULL
    DROP TABLE t1;
GO

-- Create a sample table t1
CREATE TABLE t1
(
 id NVARCHAR(20) NOT NULL DEFAULT '',
 c1 CHAR(100) DEFAULT (REPLICATE('X', 100)),
 c2 NVARCHAR(20) NULL DEFAULT NULL,
 c3 CHAR(100) DEFAULT (REPLICATE('Y', 100)),
);
GO

INSERT INTO t1 VALUES(N'AAAA', REPLICATE('C', 100), N'BBBB', REPLICATE('D', 100));
INSERT INTO t1 DEFAULT VALUES;
INSERT INTO t1 VALUES(N'PPPP', REPLICATE('Q', 100), N'RRRR', REPLICATE('S', 100));
GO

-- Check the data
SELECT * FROM t1;

-- How many pages we have in the table
SELECT  t.name AS TableName,
        p.rows AS RowCounts,
        SUM(a.total_pages) AS TotalPages,
        SUM(a.used_pages) AS UsedPages,
        (SUM(a.total_pages) - SUM(a.used_pages)) AS UnusedPages
FROM    sys.tables t
        INNER JOIN sys.indexes i
            ON t.object_id = i.object_id
        INNER JOIN sys.partitions p
            ON i.object_id = p.object_id
               AND i.index_id = p.index_id
        INNER JOIN sys.allocation_units a
            ON p.partition_id = a.container_id
WHERE   t.name NOT LIKE 'dt%'
        AND t.is_ms_shipped = 0
        AND i.object_id > 255
        AND t.name = 't1'
GROUP BY t.name,
        p.rows
ORDER BY t.name;


-- Create a table to store the DBCC IND result
IF EXISTS ( SELECT  *
            FROM    sys.tables
            WHERE   name = 'sp_dbcc_ind' )
    DROP TABLE sp_dbcc_ind;
GO
CREATE TABLE sp_dbcc_ind
(
 PageFID TINYINT,
 PagePID INT,
 IAMFID TINYINT,
 IAMPID INT,
 ObjectID INT,
 IndexID TINYINT,
 PartitionNumber TINYINT,
 PartitionID BIGINT,
 iam_chain_type VARCHAR(30),
 PageType TINYINT,
 IndexLevel TINYINT,
 NextPageFID TINYINT,
 NextPagePID INT,
 PrevPageFID TINYINT,
 PrevPagePID INT,
 PRIMARY KEY (PageFID, PagePID)
);

INSERT  INTO sp_dbcc_ind
EXECUTE ('DBCC IND(''test'', ''test.dbo.t1'', 1)');

-- Generate DBCC PAGE command
SELECT  'DBCC PAGE ([test], ' + CAST(PageFID AS NVARCHAR(MAX)) + ', ' +
        CAST (PagePID AS NVARCHAR(MAX)) + ', 3) --WITH TABLERESULTS;' DBCC_PAGE,
        *
FROM    sp_dbcc_ind
WHERE PageType = 1;

-- Enable output
DBCC TRACEON(3604);
GO

-- Pasted from the first column of the last query
DBCC PAGE ([test], 1, 552, 3) WITH TABLERESULTS;

-- DBCC PAGE output    
/*
PAGE: (1:552)


BUFFER:


BUF @0x00000045BEE87500

bpage = 0x00000045929D6000          bhash = 0x0000000000000000          bpageno = (1:552)
bdbid = 8                           breferences = 0                     bcputicks = 380
bsampleCount = 1                    bUse1 = 38513                       bstat = 0x10b
blog = 0x212121cc                   bnext = 0x0000000000000000          bDirtyContext = 0x00000045941E04D0
bstat2 = 0x0                        

PAGE HEADER:


Page @0x00000045929D6000

m_pageId = (1:552)                  m_headerVersion = 1                 m_type = 1
m_typeFlagBits = 0x0                m_level = 0                         m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 173   m_indexId (AllocUnitId.idInd) = 256 
Metadata: AllocUnitId = 72057594049265664                                
Metadata: PartitionId = 72057594042974208                                Metadata: IndexId = 0
Metadata: ObjectId = 1909581841     m_prevPage = (0:0)                  m_nextPage = (0:0)
pminlen = 204                       m_slotCnt = 3                       m_freeCnt = 7425
m_freeData = 761                    m_reservedCnt = 0                   m_lsn = (34:2220:2)
m_xactReserved = 0                  m_xdesId = (0:0)                    m_ghostRecCnt = 0
m_tornBits = 0                      DB Frag ID = 1                      

Allocation Status

GAM (1:2) = ALLOCATED               SGAM (1:3) = NOT ALLOCATED          PFS (1:1) = 0x41 ALLOCATED  50_PCT_FULL
DIFF (1:6) = CHANGED                ML (1:7) = NOT MIN_LOGGED           

Slot 0 Offset 0x60 Length 229

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS
Record Size = 229                   
Memory Dump @0x00000047E94BA060

0000000000000000:   3000cc00 43434343 43434343 43434343 43434343  0.Ì.CCCCCCCCCCCCCCCC
0000000000000014:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
0000000000000028:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
000000000000003C:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
0000000000000050:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
0000000000000064:   43434343 44444444 44444444 44444444 44444444  CCCCDDDDDDDDDDDDDDDD
0000000000000078:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
000000000000008C:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
00000000000000A0:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
00000000000000B4:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
00000000000000C8:   44444444 04000002 00dd00e5 00410041 00410041  DDDD.....Ý.å.A.A.A.A
00000000000000DC:   00420042 00420042 00                          .B.B.B.B.

Slot 0 Column 1 Offset 0xd5 Length 8 Length (physical) 8

id = AAAA                           

Slot 0 Column 2 Offset 0x4 Length 100 Length (physical) 100

c1 = CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

Slot 0 Column 3 Offset 0xdd Length 8 Length (physical) 8

c2 = BBBB                           

Slot 0 Column 4 Offset 0x68 Length 100 Length (physical) 100

c3 = DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD

Slot 1 Offset 0x145 Length 207

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 207

Memory Dump @0x00000047E94BA145

0000000000000000:   1000cc00 58585858 58585858 58585858 58585858  ..Ì.XXXXXXXXXXXXXXXX
0000000000000014:   58585858 58585858 58585858 58585858 58585858  XXXXXXXXXXXXXXXXXXXX
0000000000000028:   58585858 58585858 58585858 58585858 58585858  XXXXXXXXXXXXXXXXXXXX
000000000000003C:   58585858 58585858 58585858 58585858 58585858  XXXXXXXXXXXXXXXXXXXX
0000000000000050:   58585858 58585858 58585858 58585858 58585858  XXXXXXXXXXXXXXXXXXXX
0000000000000064:   58585858 59595959 59595959 59595959 59595959  XXXXYYYYYYYYYYYYYYYY
0000000000000078:   59595959 59595959 59595959 59595959 59595959  YYYYYYYYYYYYYYYYYYYY
000000000000008C:   59595959 59595959 59595959 59595959 59595959  YYYYYYYYYYYYYYYYYYYY
00000000000000A0:   59595959 59595959 59595959 59595959 59595959  YYYYYYYYYYYYYYYYYYYY
00000000000000B4:   59595959 59595959 59595959 59595959 59595959  YYYYYYYYYYYYYYYYYYYY
00000000000000C8:   59595959 040004                               YYYY...

Slot 1 Column 1 Offset 0x0 Length 0 Length (physical) 0

id =                                

Slot 1 Column 2 Offset 0x4 Length 100 Length (physical) 100

c1 = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Slot 1 Column 3 Offset 0x0 Length 0 Length (physical) 0

c2 = [NULL]                         

Slot 1 Column 4 Offset 0x68 Length 100 Length (physical) 100

c3 = YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

Slot 2 Offset 0x214 Length 229

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS
Record Size = 229                   
Memory Dump @0x00000047E94BA214

0000000000000000:   3000cc00 51515151 51515151 51515151 51515151  0.Ì.QQQQQQQQQQQQQQQQ
0000000000000014:   51515151 51515151 51515151 51515151 51515151  QQQQQQQQQQQQQQQQQQQQ
0000000000000028:   51515151 51515151 51515151 51515151 51515151  QQQQQQQQQQQQQQQQQQQQ
000000000000003C:   51515151 51515151 51515151 51515151 51515151  QQQQQQQQQQQQQQQQQQQQ
0000000000000050:   51515151 51515151 51515151 51515151 51515151  QQQQQQQQQQQQQQQQQQQQ
0000000000000064:   51515151 53535353 53535353 53535353 53535353  QQQQSSSSSSSSSSSSSSSS
0000000000000078:   53535353 53535353 53535353 53535353 53535353  SSSSSSSSSSSSSSSSSSSS
000000000000008C:   53535353 53535353 53535353 53535353 53535353  SSSSSSSSSSSSSSSSSSSS
00000000000000A0:   53535353 53535353 53535353 53535353 53535353  SSSSSSSSSSSSSSSSSSSS
00000000000000B4:   53535353 53535353 53535353 53535353 53535353  SSSSSSSSSSSSSSSSSSSS
00000000000000C8:   53535353 04000002 00dd00e5 00500050 00500050  SSSS.....Ý.å.P.P.P.P
00000000000000DC:   00520052 00520052 00                          .R.R.R.R.

Slot 2 Column 1 Offset 0xd5 Length 8 Length (physical) 8

id = PPPP                           

Slot 2 Column 2 Offset 0x4 Length 100 Length (physical) 100

c1 = QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ

Slot 2 Column 3 Offset 0xdd Length 8 Length (physical) 8

c2 = RRRR                           

Slot 2 Column 4 Offset 0x68 Length 100 Length (physical) 100

c3 = SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS


DBCC execution completed. If DBCC printed error messages, contact your system administrator.

*/

Take the first record as an example. The dumped raw data for it is:

0000000000000000:   3000cc00 43434343 43434343 43434343 43434343  0.Ì.CCCCCCCCCCCCCCCC
0000000000000014:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
0000000000000028:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
000000000000003C:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
0000000000000050:   43434343 43434343 43434343 43434343 43434343  CCCCCCCCCCCCCCCCCCCC
0000000000000064:   43434343 44444444 44444444 44444444 44444444  CCCCDDDDDDDDDDDDDDDD
0000000000000078:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
000000000000008C:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
00000000000000A0:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
00000000000000B4:   44444444 44444444 44444444 44444444 44444444  DDDDDDDDDDDDDDDDDDDD
00000000000000C8:   44444444 04000002 00dd00e5 00410041 00410041  DDDD.....Ý.å.A.A.A.A
00000000000000DC:   00420042 00420042 00   

                   .B.B.B.B.

The table t1 has 4 columns, two fixed-length (c1 100 bytes, c3 100 bytes) and two variable-length (id and c2). Based on the theory of the book, the row structure will store the fixed-length columns first. So the first byte is 30, which is a status bits and is 00110000 when converted into binary form. The first 1 means we have variable-length columns in this row. The second 1 means we have NULL bitmap in this row (actually this is always 1). The second byte is 00, which is another status bits and has no special meaning here. The next two bytes is cc00, which is 204 when converted into decimal. Based on the book, this means the length of the fixed-length columns. But as we know, there are two columns and both of them are of type char(100). So the acutal length of them is 200 not 204. And based on my testing, the data reported by the "length of fixed-length portion of row" is always 4 bytes bigger than the actual fixed-length data size. Is it because of uniquifier? Does all heap rows have uniquifier?

No there is no uniqueifier involved.

Length of fixed-length portion of row, not including the 2 bytes for the number of columns and the NULL bitmap.

But this does include the first three elements in your question |1 byte |1 byte |2 bytes.

0x3000CC00 + the two 100 byte fixed length columns makes 204 bytes.

Another way of looking at this value is that gives you the offset of the fixed length column count section counting from the start of the row. SQL Server internals viewer uses that terminology as below. Alternatively you could view it as the offset to the null bitmap from the current point in the row (as Paul Randal does here)

enter image description here

This question contains complete learning curve for Cpp from Beginner,*Intermediate* and advance which is very helpful.

Please suggest me same for Databases,to become a good Database Expert or a Good DBA. As there are may books and tutorials which to choose.

I found following books written by Itzik Ben-Gan very helpful:

Beginning:

  1. Microsoft SQL Server 2012 T-SQL Fundamentals http://www.amazon.com/Microsoft-Server-2012-T-SQL-Fundamentals/dp/0735658145

Advanced:

  1. Training Kit (Exam 70-461): Querying Microsoft SQL Server 2012 (MSPress, December 2012) http://www.amazon.com/Training-Kit-Exam-70-461-Microsoft/dp/0735666059

  2. Inside Microsoft SQL Server 2008: T-SQL Programming (MSPress, 2009) http://www.amazon.com/Inside-Microsoft%C2%AE-Server%C2%AE-2008-Pro-Developer/dp/0735626022/ref=pd_sim_b_1?ie=UTF8&refRID=1ZFP1YGEB2QPZBZCBFZS

I'd say: learn the internals

http://www.amazon.com/Microsoft-SQL-Server-2012-Internals/dp/0735658560

From there you have a solid knowledge of querytuning, how the optmizer works and solid knowledge of how SQl Server works. Something every dba should know.

And the books of Iztik Ben Gan: http://www.amazon.com/Inside-Microsoft%C2%AE-SQL-Server%C2%AE-2008/dp/0735626030/ref=sr_1_sc_2?s=books&ie=UTF8&qid=1393572731&sr=1-2-spell&keywords=iztik+ben+gan