Microsoft KB Archive/40053

From BetaArchive Wiki

INF: SQL Server Disk-Space Management, Part 3 ID Number: Q40053

1.00 1.10 1.11 4.20 OS/2

Summary:

This is part 3 of the SQL Server disk-space management series. It covers space management within a database. Part 1 covered space management at the server level, not at the database level. Part 2 covered the segmentation of global page space.

More Information:

Space within each database is considered to be an array of fixed length: 2K pages. This is called local (or logical) page space, because the pages for each database are numbered consecutively beginning with zero. These page numbers are mapped to actual byte positions in one or more physical files by the information in SYSDEVICES and SYSSEGMENTS. This insulates each database from the complexities of managing discontinuous segments of physical page space spread over one or more physical files, which may be shared among other databases.

Space within each database is managed by allocation pages that occur every 256 pages. Each allocation page controls the next 255 pages. Pages are not controlled individually, but instead, in blocks of 8 contiguous pages. This approach requires fewer entries (32 versus 255) than if one entry were used for each page, and allows each entry to be longer (16 bytes versus 8 bytes) and still fit into a single allocation page.

The important items in an allocation entry are object ID, object type, and a bitmap showing which of the 8 pages in the block are actually in use. Allocation entries for the same object are chained together in a circular doubly-linked list of block-starting-page-number.

This approach has certain consequences. An 8-page block can contain data for only one object. Each index on an object is a different type object and thus requires a separate block of 8 pages. If a table containing a single byte is created, an entire 8-page block (32K) is allocated. If an index is created on that 1-byte table, another 8-page block must be allocated. No further allocations are required until either of the 8-page blocks is filled. An advantage of this approach is that data is physically clustered by object rather than spread out randomly.

The 8-page allocation granularity is the reason for the two values returned by the space-used commands: space allocated and space actually used.

Space within a page is managed by a free-space pointer, which is part of the page header. Also in the page header are logical page number, next page in chain, previous page in chain, and object ID of this object. Pages that contain data for the same object are linked together using the next and previous page numbers in a doubly-linked NULL terminated list. Data is close-packed within a page and the free-space pointer points to the first free byte in the page.

New items are added to the end of existing items if sufficient space is available in the page. If not, a new page must be spliced into the chain. Items cannot span pages (except for TEXT/IMAGE).

If a free page is available in the same 8-page block, it will be used; otherwise, a new 8-page block will have to be allocated to the object.

Space from deleted items is immediately available within the page, but a page remains allocated to the object even if it is empty, until all pages in the 8-page block are freed.

Updates that increase the length of variable-length items are handled by deleting the item, sliding the rest of the data down, and adding the longer item to the end. In data pages, logical sequence is maintained by a list of row pointers at the end of the page. The data in a page grows down toward the pointer list, and the pointer list grows up toward the data. If they meet, the page is full. When items are slid down to reclaim space freed by deletions, all of the pointers at the end of the page must also be updated. To avoid having to update indexes that may reference those pointers, pointers to deleted items are zeroed rather than removed.

Index pages do not have row pointers at the end of the page to maintain sequence (though an offset table is sometimes built on the fly if there is enough room at the end of the page. This helps scan performance). Space is opened up for insertions (and for updates that increase item length) by sliding the following items toward the end of the page. Deletions are handled by reversing the process.

All pages (normal data pages, index pages, text and image data pages, transaction log pages, and allocation pages) are controlled by the entry in the allocation page for that block of 8 pages. All pages for the same object (except allocation pages) are chained together by forward and back pointers in the page header. Free space in all pages (except allocation pages) is collected at the end of the page and controlled by a free-space pointer in the page header. Pages that contain normal table data (as opposed to index, text/image, log, or allocation data) have a list of pointers to each row in the page. This list is used to maintain the logical sequence of the data and to eliminate the updating of pointers in corresponding index pages when data rows are shifted around within a page.