Q2.2: FAQ on partitioning


Index of Sections

What Is Table Partitioning?

Table partitioning is a procedure that creates multiple page chains for a single table.

The primary purpose of table partitioning is to improve the performance of concurrent inserts to a table by reducing contention for the last page of a page chain.

Partitioning can also potentially improve performance by making it possible to distribute a table's I/O over multiple database devices.

Page Contention for Inserts
By default, SQL Server stores a table's data in one double-linked set of pages called a page chain. If the table does not have a clustered index, SQL Server makes all inserts to the table in the last page of the page chain.

When a transaction inserts a row into a table, SQL Server holds an exclusive page lock on the last page while it inserts the row. If the current last page becomes full, SQL Server allocates and links a new last page.

As multiple transactions attempt to insert data into the table at the same time, performance problems can occur. Only one transaction at a time can obtain an exclusive lock on the last page, so other concurrent insert transactions block each other.

Partitioning a table creates multiple page chains (partitions) for the table and, therefore, multiple last pages for insert operations. A partitioned table has as many page chains and last pages as it has partitions.

I/O Contention
Partitioning a table can improve I/O contention when SQL Server writes information in the cache to disk. If a table's segment spans several physical disks, SQL Server distributes the table's partitions across fragments on those disks when you create the partitions.

A fragment is a piece of disk on which a particular database is assigned space. Multiple fragments can sit on one disk or be spread across multiple disks.

When SQL Server flushes pages to disk and your fragments are spread across different disks, I/Os assigned to different physical disks can occur in parallel.

To improve I/O performance for partitioned tables, you must ensure that the segment containing the partitioned table is composed of fragments spread across multiple physical devices.

Caveats Regarding I/O Contention
Be aware that when you use partitioning to balance I/O you run the risk of disrupting load balancing even as you are trying to achieve it. The following scenarios can keep you from gaining the load balancing benefits you want:

Can I Partition Any Table?

No. You cannot partition the following kinds of tables:
  1. Tables with clustered indexes
  2. SQL Server system tables
  3. Work tables
  4. Temporary tables
  5. Tables that are already partitioned. However, you can unpartition and then re-partition tables to change the number of partitions.
How Do I Choose Which Tables To Partition?
You should partition heap tables that have large amounts of concurrent insert activity. (A heap table is a table with no clustered index.) Here are some examples:
  1. An "append-only" table to which every transaction must write
  2. Tables that provide a history or audit list of activities
  3. A new table into which you load data with bcp in. Once the data is loaded in, you can unpartition the table. This enables you to create a clustered index on the table, or issue other commands not permitted on a partition table.

Does Table Partitioning Require User-Defined Segments?

No. By design, each table is intrinsically assigned to one segment, called the default segment. When a table is partitioned, any partitions on that table are distributed among the devices assigned to the default segment.

In the example under "How Do I Create A Partitioned Table That Spans Multiple Devices?", the table sits on a user-defined segment that spans three devices.

Can I Run Any Transact-SQL Command on a Partitioned Table?

No. Once you have partitioned a table, you cannot use any of the following Transact-SQL commands on the table until you unpartition it:
  1. create clustered index
  2. drop table
  3. sp_placeobject
  4. truncate table
  5. alter table table_name partition n

How Does Partition Assignment Relate to Transactions?

A user is assigned to a partition for the duration of a transaction. Assignment of partitions resumes with the first insert in a new transaction. The user holds the lock, and therefore partition, until the transaction ends.

For this reason, if you are inserting a great deal of data, you should batch it into separate jobs, each within its own transaction. See "How Do I Take Advantage of Table Partitioning with bcp in?", for details.

Can Two Tasks Be Assigned to the Same Partition?

Yes. SQL Server randomly assigns partitions. This means there is always a chance that two users will vie for the same partition when attempting to insert and one would lock the other out.

The more partitions a table has, the lower the probability of users trying to write to the same partition at the same time.

Must I Use Multiple Devices to Take Advantage of Partitions?

It depends on which type of performance improvement you want.

Table partitioning improves performance in two ways: primarily, by decreasing page contention for inserts and, secondarily, by decreasing i/o contention. "What Is Table Partitioning?" explains each in detail.

If you want to decrease page contention you do not need multiple devices. If you want to decrease i/o contention, you must use multiple devices.

How Do I Create A Partitioned Table That Spans Multiple Devices?

Creating a partitioned table that spans multiple devices is a multi-step procedure. In this example, we assume the following: Here are the steps:
  1. Define a segment:
    sp_addsegment newsegment, my_database,data_dev1
  2. Extend the segment across all three devices:
    sp_extendsegment newsegment, my_database, data_dev2
    sp_extendsegment newsegment, my_database, data_dev3
  3. Create the table on the segment:
    create table my_table
    (names, varchar(80) not null)
    on newsegment
  4. Partition the table:
    alter table my_table partition 30

How Do I Take Advantage of Table Partitioning with bcp in?

You can take advantage of table partitioning with bcp in by following these guidelines:
  1. Break up the data file into multiple files and simultaneously run each of these files as a separate bcp job against one table.

    Running simultaneous jobs increases throughput.

  2. Choose a number of partitions greater than the number of bcp jobs.

    Having more partitions than processes (jobs) decreases the probability of page lock contention.

  3. Use the batch option of bcp in. For example, after every 100 rows, force a commit. Here is the syntax of this command:
    bcp table_name in filename -b100
    Each time a transaction commits, SQL Server randomly assigns a new partition for the next insert. This, in turn, reduces the probability of page lock contention.

Getting More Information on Table Partitioning

For more information on table partitioning, see the chapter on controlling physical data placement in the SQL Server Performance and Tuning Guide.