I attended the SQL intersection conference in November 2014 in Las Vegas, Nevada. I am including my notes from the sessions in this blog, primarily for my own reference later. For anyone that comes across these notes, please take them with a grain of salt – I may very easily have misheard something or have written it down incorrectly.
Session Title: Squeezing the Most out of Row and Page Compression
Instructor: Andrew J. Kelly
A highly underutilized feature in SQL Server
-
- Row compression
- Fixed length columns stored as variable
- Does not compress BLOB/LOB (i.e., LOB pages)
- Storage format is a superset of vardecimal
- Overhead for more metadata (i.e., variable lengths), but optimized for this
- Recommended for high update systems
- Overall better compression than page compression
- Page compression
- Does row compression first
- Once page is full, further compression based on column and page dictionary
- Also no BLOB/LOG (LOB pages)
- Recommended for high read workloads
- Can be enabled at table, index (clustered and/or nonclustered), partition level
- Partition can be some or all
- sp_estimate_data_compression_savings
- Copies sample into tempdb and compresses it; extrapolate to entire table
- Can be enabled or disabled online (essentially does an online rebuild)
- Nulls and zero stored as four bits
- No trailing spaces on char columns
- Column prefix
- Determine common prefix on columns
- Even on binary and integer data
- Stores prefixes, then uses pointers to prefixes
- Page dictionary
- Find repeating columns
- Store dictionary, use pointer to dictionary entries
- If bulk-inserting or inserting into a heap with page compression, will only get row compression
- Unless TABLOCK is specified
- May increase size of compressed backups
- Don’t compress small objects without a good reason
- Row compression