The purpose of this entry is to demonstrate an important RDBMS concept and how understanding the fundamental concept can be applied to different systems.
Background: For the past 3 to 4 years I have used the PostgreSQL ORDMS extensively in web application development and consider myself an expert. I have studied database concepts extensively and used them in real world applications. Because I was free to choose over the last few years I chose to use PostgreSQL. One of the main reasons being cost, it's free to use. However I wish to illustrate that just because I used the particular system my skills in other systems, particularly MS SQL Server have not been diminished, but actually enhanced from years of practical applied database theory, and those theories can be quickly transferred between systems.
The concept: Partitioning is a global database strategy for dealing with large unwieldy tables. Over time a heavily inserted table can grow so massive that working with it just becomes slow due to size despite correctly written queries. What I intend to show is how I understand partitioning and how I have used it in the past with Pg, and then show how in less than 6 hours I was able to translate that understanding into a working implementation using MS SQL Server 2008.
Skills demonstrated: Understanding of database concepts, virtual machine proficiency, software installation of Windows Server 2003, software installation of MS SQL Server 2008, research skill, coding proficiency in Transact SQL and PL/PgSQL and technical writing.
The demonstration background: One of my personal projects is a VOIP company: www.tonerouter.com, as well as in my last position our central technology was processing phone calls. All in all I have dealt with hundreds of millions if not billions of records representing phone calls. The application environment is high insert and typically summary reads over a known time period. Meaning when pulling data from this table it is typically for reporting on say the current months worth of calls. Over time we saw that older calls were hardly ever accessed except for large group reporting for example to see call trends over a past year. This scenario is perfect for partitioning.
Pg implementation: In Pg partitioning is accomplished through inheritance and use of the rule and or trigger systems. A parent table is created and then rules or triggers redirect inserts destined to the parent table into the proper child table based on a column criteria. Integrity is maintained through check constraints. Queries need only reference the parent table name and the constraint exclusion feature allows the planner to choose the proper child table based on the where clause. That's the theory anyway. In practical application I have found that Pg does not handle complex or "sliding" partition parameters very well, such as dates and time. For simple comparisons though it works great. To work around the limitation I developed the following strategy: I chose to create partition column that is a computed column based on the month and year of the record being inserted. The data type is varchar and the values are simply strings in the format "month_year". A string constant lends itself well to Pg's constraint checking. I used the trigger system to manage creation of the child tables so that they are created automatically upon insert and all the necessary rules are added to the parent table dynamically. The code to do this is here.
SQL Server Implementation: Converting this technique to SQL Server was not very difficult. Because the concept and desired effect were known I simply needed to find the implementation details. The resources I used to do this were MSDN and specifically the following document: MSSQL Partitioning Strategies.docx. At first read I was tempted to use the Partitioned View strategy and it on the surface is very similar to what Pg was doing: Horizontal partitioning with multiple tables. However upon further reading it became apparent that technique would not be the best. Instead SQL Server 2008 has the ability to internally manage partitions within a single table. This technique is actually closer to what Pg's technique is doing because it allows the queries to reference one table name, handles inserts automatically and places them in the right place and allows the planner to select the proper partition based on a where clause. The steps needed to accomplish this were:
1) Create a partitioning function.
2) Create a partition scheme.
3) Create the table and place it on the partition scheme.
The following code samples and screen shots illustrate the above steps.
CREATE PARTITION FUNCTION cdr_by_month ( varchar(14) )
AS RANGE LEFT
FOR VALUES ( 'november_2009', 'december_2009',
'january_2010', 'february_2010','march_2010', 'april_2010', 'may_2010', 'june_2010', 'july_2010',
'august_2010','september_2010', 'october_2010', 'november_2010', 'december_2010' ) ;
CREATE PARTITION SCHEME cdr_monthly
AS PARTITION cdr_by_month ALL TO ([PRIMARY]) ;


One thing I did not go into, but is extremely important is the disk management aspect of partitioning.
Knowing the access and usage patterns of your queries and knowing that you need to put your highly accessed tables on fast
disks is extremely important in db design. In Pg this is accomplished through Tablespaces, in MS SQL Server it is through
file groups. Different terms for the same concept which is knowing where to place your tables to best take advantage of fast
disks.
Final Thoughts: Partitioning is an important database concept. All major relational database systems implement it in some
form. Understanding the concept from a design point of view is important and being able to implement the strategy in
different systems is also important. You never know what tool you will be called upon to get a job done. I hope to have shown
how I am able to use and work with SQL Server 2008 because of my past experience with older SQL Server editions, as well as my ability to take a concept, research it for a different platform, and then implement it in that platform.
Table Definition
Partition String Function