I attended the SQL intersection conference in November 2014 in Las Vegas, Nevada. I am including my notes from the sessions in this blog, primarily for my own reference later. For anyone that comes across these notes, please take them with a grain of salt – I may very easily have misheard something or have written it down incorrectly.
Session Title: Parallel Query Execution, Part 2
Instructor: Paul White
Batch mode parallel execution
-
- Geared toward DW scenarios
- Row mode model not as efficient on modern CPU architecture
- Frequently has to access memory rather than L1, L2, L3 cache
- Batch mode
- Batches contain 64-900 rows
- Code designed to fit on L1 cache
- Data sized to fit on L2 cache
- Geared toward 64-bit register use
- Uses push model (whereas row mode is a pull model)
- Avoids expensive exchanges with low-contention shared structures
- Has high startup cost, so not good for OLTP workloads
- Threads are dynamic and can move between operators
- Batch is a 64KB structure designed to fit in L2 cache
- Contains a column vector that fits in L1 cache
- Allocated one time, reused many times
- TF 9453 – Disable batch mode
- Must keep columns small so that at least 64 rows will fit into 64KB structure
- Batch mode not implemented on string aggregates
- Columnstore
- High compression, reduces IO
- Stored as LOBs, optimizations for LOB
- Created by partitioning into row groups of 1024^2 (about one million)
- Partitioned into segment; reordered for compression
- Dictionary per column per segment, plus global dictionary per column
- SQL 2012 had many limitations, including being read-only
- Also had very limited plan operators that were supported
- Small changes to query could change to row mode and be much slower
- Can often rewrite query to stay in batch mode, but can get ugly and requires a fair bit of internals knowledge – not well suited for end users
- If operators spill to tempdb, query could start over in row mode
- SQL 2014 has many improvements
- Writeable, updateable, better compression, DDL allowed, more column types
- More operators added, but still limited
- No sort still
- Spills stay in batch mode
- Allows mixed mode (i.e., plan contains row mode, switch to batch mode, back to row mode) leaving more options for optimizer
- Improved bitmaps
- Any supported datatype
- Both simple and Bloom filters
- Bookmark support (this is not clustered index/RID lookups)
- Unique key for columnstore indexes
- Delta store for inserts; tuple mover to merge when full
- Deleted rows stay in columnstore (marked as deleted) until REBUILD
- Update = delete + insert
- Still has limitations
- No PK, FK, nonclustered indexes, unique constraints (can work around with materialized views), replication, CDC
- Not readable on AG secondaries
- (Interesting note – bug in error message when attempting to create FK – error message lists wrong table!)
- TF 8607 – Output text of internal execution plan
- TF 3604 – Output to client
- TABLESAMPLE SYSTEM (15 PERCENT) – sample 15% of a table