Copyright © 2005, SAS Institute Inc. All rights reserved. Getting the Best Performance from V9 Threaded PROC SORT Scott Mebust System Developer Base Information.


Slide 1: Getting the Best Performance from V9 Threaded PROC SORT
Scott Mebust, System Developer, Base Information Technology
Copyright © 2005, SAS Institute Inc. All rights reserved.

Slide 2: The (Unofficial) SAS Skydiving Team

Slide 3: Keys to Sorting Performance
- Know the conditions
- Observe actual performance
- Understand theoretical performance
- Make adjustments

Slide 4: Know the Conditions
- System
- SAS
- Sort job

Slide 5: System Conditions
- Operating System
  - Size of virtual memory
  - Swap file location
  - Load
    - Computational
    - Memory
    - Input/Output
- Hardware
  - Number of processors
  - Size of RAM
  - Storage devices
    - Sustained transfer rate
    - Average positional latency
    - Average rotational latency

Slide 6: SAS Conditions
- Library Assignments
  - LIBNAME to logical location
  - Logical location to physical location
- System Options
  - Sort Choice
    - SORTPGM
    - SORTCUT
    - SORTCUTP
    - THREADS
    - CPUCOUNT
  - Memory Group
    - MEMSIZE
    - REALMEMSIZE
    - SORTSIZE
  - Other
    - UBUFSIZE
    - WORK
    - UTILLOC
    - SORTDUP
    - STIMER
    - MSGLEVEL

Slide 7: Sort Job Conditions
- Dataset (Input, Output)
  - Location
  - Dimensions
    - Size
    - Number of observations
    - Observation length
  - Compression
  - Subsetting options
- Sort key
  - Length
  - Value characteristics
- Procedure Options
  - THREADS
  - DETAILS
  - TAGSORT
  - PSIZE
  - NODUPREC
  - NODUPKEY
- Utility file location

Slide 8: Observe Actual Performance
- Monitor system activity
- Examine the SAS log
- Measure system capabilities

Slide 9: Identify and Observe Sorting Phases
[Figure: sort phase and merge phase of an I/O-bound, external, single-threaded sort]

Slide 10: Identify and Observe Sorting Phases
[Figure, two panels: "CPU Bound, Internal, Single-Threaded" and "CPU Bound, External, Single-Threaded", with the sort phase and merge phase marked]

Slide 11: Examine the SAS Log

Log excerpt ending "SAS sort was used":

    mrgcount = 1 mempage=16896 alocsize=24 isa=16896 osa=16896 xmisa=0
    holds=2 nway=24789 sortsize=419430400 memoryuse=419429880.00
    keylen=16 reclen=8184 dkin=0 inrec=262144 outrec=262144 yieldobs=0
    nruns=6 xcbpage=16896 npages=131073 diskuse=2214609408.0
    NOTE: SAS sort was used.
    NOTE: PROCEDURE SORT used (Total process time):
          real time           5:35.68
          cpu time            54.39 seconds

Log excerpt ending "SAS threaded sort was used":

    NOTE: 6 sorted runs written to utility file.
    NOTE: Utility file contains 32768 pages of size 65536 bytes for a total of 2097152.00 KB.
    NOTE: SAS threaded sort was used.
    NOTE: PROCEDURE SORT used (Total process time):
          real time           5:43.06
          cpu time            1:27.49
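The timings in these excerpts are easier to compare once they are converted to seconds. The following is a minimal Python sketch, not part of the original slides; the helper names `sas_time_to_seconds` and `cpu_to_real_ratio` are hypothetical. It converts the `real time` and `cpu time` values from the log and computes the ratio of CPU time to real time for each run.

```python
def sas_time_to_seconds(text: str) -> float:
    """Convert a SAS log duration such as '5:35.68' or '54.39 seconds' to seconds."""
    value = text.replace("seconds", "").strip()
    seconds = 0.0
    # Fields are separated by ':' (hours:minutes:seconds or minutes:seconds).
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_to_real_ratio(real: str, cpu: str) -> float:
    """Ratio of CPU time to real time; a small ratio suggests the job waited on I/O."""
    return sas_time_to_seconds(cpu) / sas_time_to_seconds(real)


# Values copied from the two log excerpts on this slide.
single_threaded = cpu_to_real_ratio("5:35.68", "54.39 seconds")
threaded = cpu_to_real_ratio("5:43.06", "1:27.49")

# Cross-check of the utility file NOTE: pages * page size, expressed in KB.
utility_file_kb = 32768 * 65536 / 1024

print(f"single-threaded CPU/real: {single_threaded:.2f}")
print(f"threaded CPU/real:        {threaded:.2f}")
print(f"utility file size (KB):   {utility_file_kb:.2f}")
```

In both runs CPU time is a small fraction of real time, which is the pattern the deck associates with an I/O-bound job. For a threaded sort the ratio can exceed 1 when several processors are busy, so it should be read as a rough indicator only.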

Slide 12: Measure Storage Device Sequential Transfer Rates From Within SAS
- Create a large dataset (e.g. 4 × RAM)
- Read the dataset, dumping to _NULL_
- Ensure real time ≫ CPU time
- Compute the transfer rate (R):
    R = F / t
  Where
  - F: size of the dataset (bytes)
  - t: real time (seconds)
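The rate calculation on this slide can be sketched in a few lines of Python. This is an illustration only: the function names, the `min_ratio` threshold of 5, and the sample figures (an 8 GiB dataset read in 160 seconds with 12 seconds of CPU time) are hypothetical and do not come from the presentation.

```python
def transfer_rate(file_size_bytes: float, real_seconds: float) -> float:
    """Sequential transfer rate R = F / t, in bytes per second."""
    if real_seconds <= 0:
        raise ValueError("real_seconds must be positive")
    return file_size_bytes / real_seconds


def looks_io_bound(real_seconds: float, cpu_seconds: float, min_ratio: float = 5.0) -> bool:
    """Check that real time is much greater than CPU time.

    min_ratio is an arbitrary, hypothetical threshold for 'much greater'.
    """
    return real_seconds >= min_ratio * cpu_seconds


# Hypothetical measurement: 8 GiB dataset, 160 s real time, 12 s CPU time.
F = 8 * 1024**3
t_real, t_cpu = 160.0, 12.0

if looks_io_bound(t_real, t_cpu):
    R = transfer_rate(F, t_real)
    print(f"R = {R / 1024**2:.1f} MiB/s")
else:
    print("Read was not clearly I/O bound; the rate would understate the device.")
```

The real-time check matters because a read that is limited by CPU work measures the processor rather than the storage device.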

Slide 13: Measure In-Core Sorting Costs and Extrapolate

Normalized CPU time is CPU time divided by N × ln(N). A "-" marks a cell that is empty in the slide.

N            CPU time (s),   ln(N)    Normalized CPU    Normalized CPU     CPU time (s),
             actual                   time, actual      time, predicted    predicted
10000        0.04            9.21     4.34E-07          1.22E-07           0.01
100000       0.22            11.51    1.91E-07          1.57E-07           0.18
1000000      2.64            13.82    1.91E-07          -                  2.64
10000000     36.37           16.12    2.26E-07          -                  36.39
20000000     -               16.81    -                 2.36E-07           79.41
50000000     -               17.73    -                 2.50E-07           221.52
100000000    -               18.42    -                 2.60E-07           479.51
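The slide does not state how the predicted values were produced. One reading that fits the numbers, offered here as an inference rather than the author's documented method, is that the normalized cost t / (N × ln N) is extrapolated linearly in ln(N) through the two measurements at N = 1,000,000 and N = 10,000,000. The Python sketch below implements that reading; `normalized_cost` and `make_predictor` are hypothetical names.

```python
import math


def normalized_cost(n: int, cpu_seconds: float) -> float:
    """CPU time divided by N * ln(N), the normalization used in the table."""
    return cpu_seconds / (n * math.log(n))


def make_predictor(point_a, point_b):
    """Build a CPU-time predictor from two (N, measured CPU seconds) points.

    Assumption: normalized cost varies linearly with ln(N) between and
    beyond the two calibration points.
    """
    (n_a, t_a), (n_b, t_b) = point_a, point_b
    c_a = normalized_cost(n_a, t_a)
    c_b = normalized_cost(n_b, t_b)
    slope = (c_b - c_a) / (math.log(n_b) - math.log(n_a))

    def predict(n: int) -> float:
        cost = c_a + slope * (math.log(n) - math.log(n_a))
        return cost * n * math.log(n)

    return predict


# Calibrate on the two largest measured jobs from the table.
predict = make_predictor((1_000_000, 2.64), (10_000_000, 36.37))

for n in (10_000, 100_000, 20_000_000, 50_000_000, 100_000_000):
    print(f"N = {n:>11,}: predicted CPU time {predict(n):8.2f} s")
```

Under this assumption the predictions land within about half a second of the table's predicted column. The smallest job (N = 10,000) is measured at 0.04 s against a prediction near 0.01 s, which is consistent with a fixed small-job overhead.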

Slide 14: Measure In-Core Sorting Costs
[Figure annotated "Small job overhead"]

Slide 15: Understand Theoretical Performance
- Classify the job
- Estimate SORT running time
- Consider estimation hazards

Slide 16: Classify the Job (Performance Limitation)
- Compute bound
- I/O bound
- Mixed

Slide 17: Classify the Job (Size)
Where
- F: size of input dataset
- O: size of internal sorting overhead
- M: size of RAM
- B: utility file page (block) size

Slide 18: Internal (in-core) Sorting
[Diagram labels: Random Access Memory, Sorting Overhead, Data; flow labels: Input, RAM, Output]

Slide 19: External (out-of-core) Sorting
[Diagram labels: Random Access Memory, Sorting Overhead, Data]

Slide 20: External Sorting: Data Flow
[Diagram, two panels. Single-Pass: Input, RAM, Temp, Output. Double-Pass: Input, RAM, Temp (1st half, 2nd half), Output]

Slide 21: Estimate the Running Time (Internal Sort, I/O Bound)
[Diagram: Input, RAM, Output, with a sequential read into RAM and a sequential write out]
Where
- t: real time (sec)
- F: dataset size (bytes)
- R: transfer rate (bytes/sec)
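The formula itself did not survive in this transcript. A plausible form, given one sequential read and one sequential write of the dataset, is t ≈ F / R_read + F / R_write. The Python sketch below uses that form as an assumption. The separate read and write rates, the function name, and the 50 MB/s figures are hypothetical; only the dataset dimensions are taken from the log on slide 11.

```python
def internal_io_bound_time(dataset_bytes: float,
                           read_rate: float,
                           write_rate: float) -> float:
    """Estimated real time for an I/O-bound internal sort.

    Assumes the dataset is read once and written once, sequentially,
    and that the read and the write do not overlap in time.
    """
    if read_rate <= 0 or write_rate <= 0:
        raise ValueError("transfer rates must be positive")
    return dataset_bytes / read_rate + dataset_bytes / write_rate


# Dataset from the slide 11 log: 262144 observations of 8184 bytes each.
F = 262144 * 8184
# Hypothetical device rates of 50 MB/s for both reading and writing.
t = internal_io_bound_time(F, read_rate=50e6, write_rate=50e6)
print(f"F = {F} bytes, estimated t = {t:.1f} s")
```

If input and output sit on different devices and the operations overlap, the actual time could be closer to the larger of the two terms than to their sum.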

Slide 22: Estimate the Running Time (Single-Pass External, I/O Bound)
[Diagram: Input, RAM, Temp, Output, with transfers labeled Sequential Read, Sequential Write, Random Read, Sequential Write]
Where
- U: utility file size (bytes)

Slide 23: Utility File Read Time
- File size
  - Single-threaded
  - Multi-threaded
  where
  - F: size of input dataset
  - o: number of observations × sort key length
- Number of pages
  where
  - B: utility file page (block) size
- Best case (sequential) read time
- Worst case (random) read time
  where
  - s: average positional latency
  - r: average rotational latency
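The read-time formulas are missing from this transcript, so the following Python sketch substitutes a common disk cost model and should be read as an assumption. The best case streams the whole utility file at the sequential rate, and the worst case additionally pays one positional and one rotational latency for every page. The function names, the 50 MB/s rate, and the 8 ms and 4 ms latencies are hypothetical. The page count and page size come from the log on slide 11.

```python
import math


def page_count(utility_bytes: int, page_bytes: int) -> int:
    """Number of utility file pages, rounding up a partial last page."""
    return math.ceil(utility_bytes / page_bytes)


def best_case_read_time(utility_bytes: int, rate: float) -> float:
    """Sequential read: the whole file streams at the sustained transfer rate."""
    return utility_bytes / rate


def worst_case_read_time(utility_bytes: int, page_bytes: int, rate: float,
                         positional_latency: float,
                         rotational_latency: float) -> float:
    """Random read: every page pays both latencies before it transfers."""
    pages = page_count(utility_bytes, page_bytes)
    latency = pages * (positional_latency + rotational_latency)
    return latency + utility_bytes / rate


# Utility file from the slide 11 log: 32768 pages of 65536 bytes.
B = 65536
U = 32768 * B
# Hypothetical device: 50 MB/s sustained, 8 ms positional, 4 ms rotational latency.
best = best_case_read_time(U, 50e6)
worst = worst_case_read_time(U, B, 50e6, 0.008, 0.004)
print(f"best case: {best:.1f} s, worst case: {worst:.1f} s")
```

In this model the worst-case latency term scales with the number of pages, so a larger page size B reduces it. That is one reason utility file page size appears among the conditions that can be altered.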

Slide 24: Multi-Pass External Sorting
- Number of sorted runs
- Number of utility file passes
- Maximum external merge order
Where
- F: size of input dataset
- O: size of internal sorting overhead
- M: SORTSIZE
- B: utility file page (block) size
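The three formulas are not preserved here. The Python sketch below uses textbook external merge sort relations as stand-ins: runs = ceil(F / (M − O)), merge order = floor((M − O) / B), and one utility file pass per round of merging. These are assumptions, not the slide's own expressions, and the function names are hypothetical. Treating `mempage` from the slide 11 log as the page size B, and taking O as zero, are also assumptions.

```python
import math


def sorted_runs(F: int, M: int, O: int) -> int:
    """Number of sorted runs: each run holds what fits in M - O bytes."""
    return math.ceil(F / (M - O))


def max_merge_order(M: int, O: int, B: int) -> int:
    """Most runs that can be merged at once, with one page buffer per run."""
    return (M - O) // B


def utility_file_passes(runs: int, order: int) -> int:
    """Merge rounds needed until a single sorted run remains."""
    if order < 2:
        raise ValueError("merge order must be at least 2")
    passes = 0
    while runs > 1:
        runs = math.ceil(runs / order)
        passes += 1
    return passes


# Values taken from the slide 11 log; O is assumed to be negligible.
F = 262144 * 8184      # inrec * reclen
M = 419430400          # sortsize
B = 16896              # mempage, assumed here to be the page size
O = 0

runs = sorted_runs(F, M, O)
order = max_merge_order(M, O, B)
passes = utility_file_passes(runs, order)
print(f"runs = {runs}, merge order = {order}, passes = {passes}")
```

With these inputs the sketch gives 6 runs, matching `nruns=6` in the log, and a merge order of 24824, close to the reported `nway=24789`. The small difference is what a nonzero overhead O would produce. A job with zero passes needs no merge at all (internal sort), one pass is single-pass external, and more than one is multi-pass.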

Slide 25: Estimate the Running Time (Single-Pass External, Compute Bound)
[Diagram: Input, RAM, Temp, Output, with transfers to and from Temp labeled Sequential Write and Random Read]

Slide 26: Single-Pass External, Compute Bound
- Utility file creation time
  where
  - n_obs is the total number of observations in the dataset
  - t_run is the time required to perform an in-memory sort of the number of observations in a single run
- Utility file merge time, compute bound
- Utility file read time: as previously described for I/O bound
- Utility file merge time, I/O bound
  - Worst case
  - Best case

Slide 27: Consider Estimation Hazards
- File cache effects
- Pseudo-internal sorting (thrashing)
- Pseudo-external sorting (file cache)
- Limitations within each sorting phase

Slide 28: Pseudo-Internal Sorting
[Diagram labels: Random Access Memory, Virtual Memory (RAM+swap), Sorting Overhead, Data, SORTSIZE]

Slide 29: Pseudo-External Sorting
[Diagram labels: Random Access Memory, Overhead, Data, SORTSIZE, File Cache, Utility File]

Slide 30: Make Adjustments
- Determine if there is a problem
- Identify the problem
- Alter the conditions
- Re-evaluate

Slide 31: Identify the Problem
- Processing speed
- Memory
- External storage

Slide 32: Alter the Conditions
- Memory settings
- Library to storage device mappings
- Utility file location
- Utility file page size

Slide 33: Memory Group Option Settings
[Diagram labels: Random Access Memory, Virtual Memory (RAM+swap), SORTSIZE, MEMSIZE, REALMEMSIZE, SAS, Other Active Processes, Operating System]
