
1 Sort in GPDB
Feng Tian, GreenPlum Inc.

2 WARNING: NON-TECH SLIDES
- Why (NOW)?
  - Real customers, real problems.
  - About to get the code into MAIN
    - Make Joy/Brian's code-reading experience easier.

3 Outline (Doesn't this look familiar?)
- Motivation
- Review of Current Status
- Improve Sort Performance
- Remaining Work

4 Sort in Database
- One of the most important operators
  - Order by
  - Group by
  - OLAP
    - Rollup and Cube
    - Window (partition by and order by)
  - Merge Join
  - Build index

5 Sort in GPDB
- One of the most mysterious operators
  - "Sort is slow" vs. "Sort is OK"
  - "Fix the planner to avoid sort" vs. "Fix sort"

6 Sort is fun
- One of the most extensively studied algorithms
- In-memory sorting algorithms
  - CK always has some interesting links
  - Jie challenged my interview question
  - Sedgewick: Quicksort is optimal
  - Bentley & McIlroy, 93.
- External sort
  - TAOCP

7 GPDB Sort is funny
- Good
  - Honest TAOCP
  - Honest BM93.
- Bad
  - Equal keys
  - Lots of columns
  - Sort strings
- Ugly
  - Combination of the bads

8 Goal
- Get rid of the ugly part of GPDB Sort.

9 Outline
- Motivation
- Review of Current Status
- Improve Sort Performance
- Remaining Work

10 GPDB Sort
- Quicksort if entries fit in memory
- External sort
  - An honest implementation from TAOCP
    - I/O pattern is pretty good
    - Amount of I/O when sorting tuples is OK
  - No compression
  - Sorting datums is terrible, but not a concern at this moment
    - Only used for distinct
    - May eventually be replaced by hash
  - Uses a heap to merge
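The overall shape of an external sort (sorted runs, then a heap-based merge) can be sketched in a few lines of Python. This is only an illustration: it builds fixed-size runs in memory instead of using the TAOCP run-generation and tape-merge scheme the slide refers to, and the name `external_sort` and the `run_size` parameter are invented for the example.

```python
import heapq

def external_sort(values, run_size):
    """Sort by building sorted runs of at most run_size items, then merging them.

    Returns (sorted_values, number_of_runs). A real external sort would
    spill each run to disk; here the runs simply stay in memory.
    """
    if run_size < 1:
        raise ValueError("run_size must be at least 1")
    # Phase 1: produce sorted runs that each "fit in memory".
    runs = [sorted(values[i:i + run_size])
            for i in range(0, len(values), run_size)]
    # Phase 2: heapq.merge does a k-way merge of the runs using a heap.
    return list(heapq.merge(*runs)), len(runs)
```

For example, `external_sort([5, 3, 8, 1, 9, 2, 7], 3)` builds three runs and merges them into one sorted list.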

11 GPDB Sort Details
- Cost of comparison
  - Non-trivial overhead
  - (Unicode) string comparison is extremely slow
    - strcoll vs. strxfrm + strcmp
- Cost of memtuple_getattr
  - It is way better than heap_getattr
  - Postgres devs have known this for a long time
- Cache first sort column
  - Sort (1, 'a'), (2, 'a'), (3, 'c')... is fast.
  - Sort (1, 'a'), (1, 'b'), (1, 'c')... is miserably slow.
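To make the strcoll vs. strxfrm + strcmp trade-off concrete, here is a small Python sketch that sorts the same words both ways. It uses Python's `locale.strcoll` and `locale.strxfrm` wrappers, not GPDB code, and the two helper names are invented for this example.

```python
import locale
from functools import cmp_to_key

def sort_with_strcoll(words):
    # Every single comparison calls the locale-aware collation routine.
    return sorted(words, key=cmp_to_key(locale.strcoll))

def sort_with_strxfrm(words):
    # Each word is transformed once up front; all later comparisons are
    # ordinary string comparisons on the transformed keys.
    return sorted(words, key=locale.strxfrm)
```

Both functions produce the same order. The second pays the collation cost once per string rather than once per comparison, at the price of holding the transformed keys in memory.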

12 Outline
- Motivation
- Review of Current Status
- Improve Sort Performance
- Remaining Work

13 Goal
- It should be “invisible”
  - No API change
  - Keep fast cases fast
- Slow cases? What slow cases?
  - The planner can honestly optimize a query, without worrying about “avoiding” sort
  - Users can write a query without trying to be creative
  - In the cases where a sort cannot be avoided, it may save our neck.

14 Quicksort Is Optimal (Sedgewick)
- Equal keys
  - Equal keys are good (Bentley & McIlroy)
- Do not special-case small n
  - Why? Not sure. Cache oblivious?
- Multi-column sort keys
  - Comparison gets slower and slower

15 Quicksort
- As in the old algorithm, cache the first sort column
- Quicksort on the first column
- For each range with an equal first column, cache the second sort column and quicksort that range
- Repeat until all sort columns are processed
  - May stop early.
- Sort (1, 'a'), (2, 'b'), (3, 'c') will not compare strings at all.
- Sort (1, 'a'), (1, 'b'), (1, 'c') will only call memtuple_getattr when necessary.

16 Example
(1, ?), (3, ?), (2, ?), (0, ?), (3, ?), (2, ?)
Choose pivot (2, ?):
(2, ?), (1, ?), (0, ?) :: (3, ?), (3, ?), (2, ?)
Swap to middle:
(0, ?), (1, ?) :: (2, ?), (2, ?) :: (3, ?), (3, ?)

17 Recurse Down
- Quicksort each partition
- For left and right, just quicksort.
- For the middle part, expand to level k+1
  - (2, ?), (2, ?)... (2, ?) to (2, 'a'), (2, 'x'), (2, 'd')... (2, 'z')
  - Of course, only if the middle has not already expanded all levels
- NO EXTRA LEVEL EXPANSION
- NO EXTRA COMPARISON
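The multi-key quicksort described on the last three slides can be sketched as follows. This is a simplified Python illustration, not the code in mk_qsort.c: it uses a plain three-way partition with a random pivot rather than the Bentley & McIlroy swap-to-ends scheme, and `mk_qsort`, `fetch`, and the `[cached key, row]` entry layout are assumptions made for the example. `fetch(row, level)` stands in for memtuple_getattr.

```python
import random

def mk_qsort(rows, num_keys, fetch):
    """Return rows sorted on columns 0..num_keys-1.

    Column k is fetched only for ranges of rows that are equal on
    columns 0..k-1 and contain more than one row.
    """
    entries = [[None, row] for row in rows]  # [cached key, row]

    def sort_range(lo, hi, level):
        # entries[lo:hi] all agree on the columns before `level`.
        if hi - lo < 2 or level >= num_keys:
            return  # nothing to order, or all sort columns are used up
        for entry in entries[lo:hi]:
            entry[0] = fetch(entry[1], level)  # cache this level's column
        quick3(lo, hi, level)

    def quick3(lo, hi, level):
        if hi - lo < 2:
            return
        pivot = entries[random.randrange(lo, hi)][0]
        lt, i, gt = lo, lo, hi
        while i < gt:  # three-way partition: < pivot | == pivot | > pivot
            key = entries[i][0]
            if key < pivot:
                entries[lt], entries[i] = entries[i], entries[lt]
                lt += 1
                i += 1
            elif key > pivot:
                gt -= 1
                entries[gt], entries[i] = entries[i], entries[gt]
            else:
                i += 1
        quick3(lo, lt, level)          # left: same column
        quick3(gt, hi, level)          # right: same column
        sort_range(lt, gt, level + 1)  # middle: expand to the next column

    sort_range(0, len(entries), 0)
    return [row for _, row in entries]
```

With rows such as (1, 'a'), (2, 'b'), (3, 'c'), every middle range has a single row, so `fetch` is never called for the second column.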

18 Heapsort
- Used in external sort (both producing runs and merging runs)
- Cache the first sort column when inserting into the heap
- Expand to the (n+1)th sort column only when the first n columns equal those of the heap top
- Remember the level (lv) of expansion
  - Maintain an array of datums d,
  - entry.sort_column[x] = d[x] if x < lv
- Siftup and Siftdown
  - Siftdown hole
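The idea of expanding sort columns lazily inside a heap can be sketched in Python as below. This is a deliberately simplified variant and not the mk_heap.c design: each entry keeps its own list of fetched columns and expands whenever a comparison ties on the columns fetched so far, rather than sharing a prefix array with the heap top. `LazyEntry`, `mk_heapsort`, and `fetch` are names invented for the example.

```python
import heapq

class LazyEntry:
    """Heap entry that fetches a sort column only when a comparison needs it."""

    def __init__(self, row, num_keys, fetch):
        self.row = row
        self.num_keys = num_keys
        self.fetch = fetch
        self.cols = [fetch(row, 0)]  # first sort column is cached on insert

    def col(self, k):
        # Expand one level at a time, remembering what was already fetched.
        while len(self.cols) <= k:
            self.cols.append(self.fetch(self.row, len(self.cols)))
        return self.cols[k]

    def __lt__(self, other):
        for k in range(self.num_keys):
            a, b = self.col(k), other.col(k)
            if a != b:
                return a < b  # decided without touching later columns
        return False

def mk_heapsort(rows, num_keys, fetch):
    """Sort rows (num_keys >= 1) by pushing lazy entries through a heap."""
    heap = [LazyEntry(row, num_keys, fetch) for row in rows]
    heapq.heapify(heap)
    return [heapq.heappop(heap).row for _ in range(len(heap))]
```

When all first columns are distinct, no entry ever fetches a second column.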

19 HeapSort Continued
- NO EXTRA EXPANSION
- NO EXTRA COMPARISON
- However, the code became more complicated.

20 Handling Strings
- When caching a sort column, cache the strxfrm result
  - Comparison uses strcmp
- Equal strings
  - Collapse equal strings
    - Compare pointer values first
    - Save memory
- Problems
  - Memory consumption
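A rough Python analogue of caching strxfrm results and collapsing equal strings is shown below. Object identity (`is`) plays the role of the pointer comparison; the `KeyPool` class and `keys_equal` helper are hypothetical names for this sketch, not part of GPDB.

```python
import locale

class KeyPool:
    """Caches one transformed key per distinct string."""

    def __init__(self):
        self._pool = {}
        self.transforms = 0  # how many times strxfrm actually ran

    def key_for(self, s):
        key = self._pool.get(s)
        if key is None:
            self.transforms += 1
            key = locale.strxfrm(s)
            self._pool[s] = key  # equal strings now share this one object
        return key

def keys_equal(a, b):
    # Identity test first (the "pointer" check); fall back to a full compare.
    return a is b or a == b
```

Equal input strings are transformed only once and share a single key object, which saves memory when duplicates are common. The pool itself still costs memory when most strings are distinct, which matches the problem noted on the slide.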

21 Minor improvements
- Fast-path some basic types
  - Int, maybe float later
- Limit sort: use heapsort instead of insertion sort
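The limit-sort idea can be illustrated with a bounded heap: keep only the k smallest values seen so far, so each incoming value costs O(log k) instead of an O(k) insertion. The sketch below is Python and restricted to numeric values (it negates them to turn `heapq`'s min-heap into a max-heap); `limit_sort` is a name made up for the example.

```python
import heapq

def limit_sort(values, k):
    """Return the k smallest numeric values in ascending order."""
    if k <= 0:
        return []
    heap = []  # max-heap of the current k smallest, stored negated
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:
            # v beats the largest value currently kept, so replace it.
            heapq.heapreplace(heap, -v)
    return sorted(-x for x in heap)
```

For instance, `limit_sort([5, 1, 4, 2, 3], 2)` keeps at most two values in the heap at any time.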

22 Outline
- Motivation
- Review of Current Status
- Improve Sort Performance
- Remaining Work

23 “Honest” Implementation
- Cutting corners in a performance prototype is dangerous
  - Error handling
  - Special cases
- Relatively honest
  - Does not handle unique check etc.
- Passes make installcheck-good.
- Passes TPCH and opperf if hashagg and hashjoin are turned off

24 TPCH 1G Q1
- Hashagg ~5.7 sec
- Old sort ~15 sec
- New sort ~8 sec
  - Aggregate computing takes ~4 sec
  - Hashagg proper ~1.5 sec
  - New sort generated 3 runs, motioned 6M tuples, and did one more comparison in Agg, all in less than 4 sec.
    - The extra comparison takes more than 1 sec
    - Sort proper is ~2 sec

25 Building index
- On ship_instruction, ship_mode, comment
  - Old: all take 24 to 26 sec
  - New: 4 sec, 6 sec, 11 sec
- On two columns
  - Old: 70+ sec
  - New: 16?

26 OLAP (Cube and Rollup)
- For “Big” OLAP CUBE/ROLLUP queries, 10~15% faster
  - Not much on “smaller” ones; some may even see a small regression
- Unstable timing; regression comes and goes
  - Our OLAP plans have many sorts on 1 or 2 integer columns, so this is expected
- However, we can now finish some “machine freezing” queries

27 Yahoo Hashagg
- Slightly slower
  - Heapsort overhead :-(
  - On par once I fast-pathed int4cmp

28 Outline
- Motivation
- Review of Current Status
- Improve Sort Performance
- Remaining Work

29 More Improvements
- We know the level of key change
  - Important for sort agg
  - Important for OLAP
  - Important for merge join
- Take (more) advantage of unique, limit, aggregate.
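One way to read "the level of key change" is: for each sorted row, the first sort column at which it differs from the previous row. A sort-based aggregate could then find group boundaries from that number alone, without comparing keys again. The Python sketch below illustrates this under that assumption. Here `change_levels` recomputes the levels from already sorted rows, whereas the slide suggests the sort would know them as a by-product; both function names are invented.

```python
def change_levels(sorted_rows, num_keys):
    """For each row, the first column that differs from the previous row.

    The first row gets level 0; a row identical to its predecessor on all
    sort columns gets level num_keys.
    """
    levels = []
    prev = None
    for row in sorted_rows:
        level = 0 if prev is None else num_keys
        if prev is not None:
            for k in range(num_keys):
                if row[k] != prev[k]:
                    level = k
                    break
        levels.append(level)
        prev = row
    return levels

def group_counts(sorted_rows, levels, group_cols):
    """Count rows per group on the first group_cols columns, using only levels."""
    groups = []
    for row, level in zip(sorted_rows, levels):
        if level < group_cols:
            groups.append([row[:group_cols], 1])  # key changed: new group
        else:
            groups[-1][1] += 1                    # same group as previous row
    return [(key, count) for key, count in groups]
```

The same levels serve every grouping prefix, which is why they would matter for ROLLUP-style queries that aggregate on several prefixes of one sort order.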

30 Improve the code
- Heap code (maybe) is (more) complicated (than necessary); don't know how to improve it yet.
- Memory management.
- Explain analyze accounting and reporting.

31 Code Review
- Code is on the ftian_main_cr2 branch
  - tuplesort.c
    - Should make it tuplesortnew.c, and probably GUC it.
    - Uses memtuple and logtape as before.
    - Uses the new quicksort and heapsort.
  - mk_qsort.c
    - Multi-key quicksort. Straightforward.
  - mk_heap.c
    - Multi-key heapsort. A 700-line heapsort :-(
- About time to port into MAIN.

32 Feedback (Thanks!)
- Ideas, new improvements, and critiques of the approach are welcome.

