1
Copyright © 2005, SAS Institute Inc. All rights reserved. Getting the Best Performance from V9 Threaded PROC SORT Scott Mebust System Developer Base Information Technology
2
The (Unofficial) SAS Skydiving Team
3
Keys to Sorting Performance
  Know the conditions
  Observe actual performance
  Understand theoretical performance
  Make adjustments
4
Know the Conditions
  System
  SAS
  Sort job
5
System Conditions
  Operating System
    Size of Virtual Memory
    Swap file location
    Load
      − Computational
      − Memory
      − Input/Output
  Hardware
    Number of processors
    Size of RAM
    Storage Devices
      − Sustained Transfer Rate
      − Average positional latency
      − Average rotational latency
6
SAS Conditions
  Library Assignments
    LIBNAME to logical location
    Logical location to physical location
  System Options
    Sort Choice
      − SORTPGM
      − SORTCUT
      − SORTCUTP
      − THREADS
      − CPUCOUNT
    Memory Group
      − MEMSIZE
      − REALMEMSIZE
      − SORTSIZE
    Other
      − UBUFSIZE
      − WORK
      − UTILLOC
      − SORTDUP
      − STIMER
      − MSGLEVEL
7
Sort Job Conditions
  Dataset (Input, Output)
    Location
    Dimensions
      − Size
      − # of observations
      − Observation length
    Compression
    Subsetting options
  Sort key
    Length
    Value characteristics
  Procedure Options
    THREADS
    DETAILS
    TAGSORT
    PSIZE
    NODUPREC
    NODUPKEY
  Utility file location
8
Observe Actual Performance
  Monitor System Activity
  Examine the SAS Log
  Measure System Capabilities
9
Identify and Observe Sorting Phases
  [Figure: I/O Bound, External, Single-Threaded; regions labeled Sort Phase and Merge Phase]
10
Identify and Observe Sorting Phases
  [Figure, two panels: CPU Bound, Internal, Single-Threaded; CPU Bound, External, Single-Threaded; regions labeled Sort Phase and Merge Phase]
11
Examine the SAS Log

  mrgcount = 1 mempage=16896 alocsize=24 isa=16896 osa=16896 xmisa=0 holds=2
  nway=24789 sortsize=419430400 memoryuse=419429880.00 keylen=16 reclen=8184
  dkin=0 inrec=262144 outrec=262144 yieldobs=0 nruns=6 xcbpage=16896
  npages=131073 diskuse=2214609408.0
  NOTE: SAS sort was used.
  NOTE: PROCEDURE SORT used (Total process time):
        real time           5:35.68
        cpu time            54.39 seconds

  NOTE: 6 sorted runs written to utility file.
  NOTE: Utility file contains 32768 pages of size 65536 bytes for a total of 2097152.00 KB.
  NOTE: SAS threaded sort was used.
  NOTE: PROCEDURE SORT used (Total process time):
        real time           5:43.06
        cpu time            1:27.49
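A quick way to read these two logs is to compare CPU time with real time and to check the reported utility file size. The sketch below is only an illustration in Python; the helper name `to_seconds` is hypothetical, and the numbers are copied from the log excerpts on this slide.

```python
def to_seconds(stamp: str) -> float:
    """Convert a SAS log time such as '5:35.68' or '54.39 seconds' to seconds."""
    value = stamp.replace("seconds", "").strip()
    seconds = 0.0
    for part in value.split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60 + float(part)
    return seconds

# (real time, cpu time) pairs copied from the two log excerpts.
logs = {
    "SAS sort": ("5:35.68", "54.39 seconds"),
    "SAS threaded sort": ("5:43.06", "1:27.49"),
}
for name, (real, cpu) in logs.items():
    ratio = to_seconds(cpu) / to_seconds(real)
    print(f"{name}: CPU/real = {ratio:.2f}")

# Utility file size from the threaded log: pages * page size, expressed in KB.
utility_kb = 32768 * 65536 / 1024
print(f"utility file: {utility_kb:.2f} KB")
```

In both runs CPU time is a small fraction of real time, which is at least consistent with a job that spends most of its elapsed time waiting on I/O.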
12
Measure Storage Device Sequential Transfer Rates From Within SAS
  Create a large dataset (e.g. 4 × RAM)
  Read the dataset, dumping to _NULL_
  Ensure real time ≫ CPU time
  Compute the transfer rate: R = F / t
  Where
    F: size of the dataset (bytes)
    t: real time (seconds)
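The rate computation on this slide is a single division of dataset size by real time. A minimal Python sketch follows; the function name and the measurement values are hypothetical, chosen only to show the arithmetic.

```python
def transfer_rate(file_bytes: float, real_seconds: float) -> float:
    """Sequential transfer rate R = F / t, in bytes per second."""
    if real_seconds <= 0:
        raise ValueError("real time must be positive")
    return file_bytes / real_seconds

# Hypothetical measurement: an 8 GiB dataset read to _NULL_ in 160 s of real time.
F = 8 * 1024**3
t = 160.0
R = transfer_rate(F, t)
print(f"R = {R / 1024**2:.1f} MiB/s")
```

The result is only meaningful as a device rate if the read was I/O bound, which is why the slide asks for real time to be much larger than CPU time.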
13
Measure In-Core Sorting Costs and Extrapolate

              CPU Time (s)            Normalized CPU Time      CPU Time (s)
  N           Actual       ln(N)      Actual      Predicted    Predicted
  10000       0.04         9.21       4.34E-07    1.22E-07     0.01
  100000      0.22         11.51      1.91E-07    1.57E-07     0.18
  1000000     2.64         13.82      1.91E-07                 2.64
  10000000    36.37        16.12      2.26E-07                 36.39
  20000000                 16.81                  2.36E-07     79.41
  50000000                 17.73                  2.50E-07     221.52
  100000000                18.42                  2.60E-07     479.51
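The normalized column in this table matches CPU time divided by N × ln(N), for example 0.04 / (10000 × 9.21) ≈ 4.34E-07. The slide does not say how the predicted values were fitted. The sketch below assumes, as one plausible reading, that the normalized cost is linear in ln(N) and fits a line through the two largest measured points; it lands close to, but not exactly on, the slide's predicted CPU times.

```python
import math

def normalized(n: int, cpu_seconds: float) -> float:
    """CPU time per unit of N * ln(N)."""
    return cpu_seconds / (n * math.log(n))

# Two largest measured points from the table (N, actual CPU seconds).
# Assumption: small-job overhead matters least here, so fit through these.
(n1, t1), (n2, t2) = (1_000_000, 2.64), (10_000_000, 36.37)
c1, c2 = normalized(n1, t1), normalized(n2, t2)

# Assumption: normalized cost grows linearly in ln(N).
slope = (c2 - c1) / (math.log(n2) - math.log(n1))

def predict_cpu(n: int) -> float:
    """Extrapolated in-core sort CPU time (seconds) for n observations."""
    c = c1 + slope * (math.log(n) - math.log(n1))
    return c * n * math.log(n)

for n in (20_000_000, 50_000_000, 100_000_000):
    print(n, round(predict_cpu(n), 2))
```

The two smallest measurements sit well above this line in normalized terms, which is the fixed small-job overhead that the next slide calls out.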
14
Measure In-Core Sorting Costs
  [Figure: annotation "Small job overhead"]
15
Understand Theoretical Performance
  Classify the job
  Estimate SORT running time
  Consider estimation hazards
16
Classify the Job
  Performance Limitation
    Compute Bound
    I/O Bound
    Mixed
17
Classify the Job
  Size
    [Formulas not preserved in this transcript]
  Where
    F: size of input dataset
    O: size of internal sorting overhead
    M: size of RAM
    B: utility file page (block) size
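The slide's own classification formulas are not available here, so the sketch below substitutes a textbook merge-sort model using the same symbols: a job is internal if F + O fits in M, and otherwise single-pass external if the number of sorted runs does not exceed the number of page-sized merge buffers. The function name and the value of O are hypothetical.

```python
def classify_by_size(F: int, O: int, M: int, B: int) -> str:
    """Classify a sort job by size (textbook model, not the slide's formulas)."""
    if F + O <= M:
        return "internal"            # data plus overhead fits in memory
    work = M - O                     # memory left for data after overhead
    if work < 2 * B:
        raise ValueError("memory must hold at least two utility file pages")
    runs = -(-F // work)             # ceiling division: number of sorted runs
    merge_order = work // B          # assume one page buffer per run merged
    if runs <= merge_order:
        return "single-pass external"
    return "multi-pass external"

# F and B come from the log on slide 11 (inrec * reclen, 65536-byte pages);
# M is that log's sortsize. O = 16 MiB is purely hypothetical.
F = 262144 * 8184
print(classify_by_size(F, O=16 * 1024**2, M=419430400, B=65536))
```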
18
Internal (in-core) Sorting
  [Diagram labels: Random Access Memory, Sorting Overhead, Data, Input, Output, RAM]
19
External (out-of-core) Sorting
  [Diagram labels: Random Access Memory, Sorting Overhead, Data]
20
External Sorting – Data Flow
  Single-Pass
    [Diagram labels: Input, RAM, Temp, Output]
  Double-Pass
    [Diagram labels: Input, RAM, Temp (1st Half, 2nd Half), Output]
21
Estimate the Running Time
  Internal Sort, I/O Bound
    [Diagram labels: Input, Sequential Read, RAM, Sequential Write, Output]
    [Formula not preserved in this transcript]
  Where
    t: real time (sec)
    F: dataset size (bytes)
    R: transfer rate (bytes/sec)
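Given the data flow on this slide (one sequential read of the input and one sequential write of the output), a natural estimate is the sum of the two transfer times. This is a sketch under that assumption, not necessarily the slide's exact formula; the separate read and write rates are an added convenience, and the example values are hypothetical.

```python
from typing import Optional

def internal_io_bound_time(F: float, read_rate: float,
                           write_rate: Optional[float] = None) -> float:
    """Estimate t = F / R_read + F / R_write for an I/O bound internal sort."""
    if write_rate is None:
        write_rate = read_rate       # single rate R, as defined on the slide
    return F / read_rate + F / write_rate

# Hypothetical job: 1 GiB dataset, 50 MiB/s for both reading and writing.
t = internal_io_bound_time(1024**3, 50 * 1024**2)
print(f"t = {t:.2f} s")
```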
22
Estimate the Running Time
  Single-Pass External, I/O Bound
    [Diagram labels: Input, Sequential Read, RAM, Sequential Write, Temp, Random Read, RAM, Sequential Write, Output]
    [Formula not preserved in this transcript]
  Where
    U: utility file size (bytes)
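The data flow on this slide has four transfers: read the input, write the utility file, read the utility file back, and write the output. The sketch below simply adds them, assuming one device rate R and no overlap between transfers; both are simplifying assumptions, and the function name and example sizes are hypothetical.

```python
from typing import Optional

def single_pass_external_time(F: float, U: float, R: float,
                              utility_read_seconds: Optional[float] = None) -> float:
    """Elapsed-time estimate for an I/O bound, single-pass external sort."""
    if utility_read_seconds is None:
        utility_read_seconds = U / R     # best case: utility file read sequentially
    read_input = F / R                   # sequential read of the input dataset
    write_utility = U / R                # sequential write of the sorted runs
    write_output = F / R                 # sequential write of the output dataset
    return read_input + write_utility + utility_read_seconds + write_output

# Hypothetical job: 2 GB input and utility file, 50 MB/s device.
best = single_pass_external_time(2e9, 2e9, 50e6)
# Same job if the random utility file read were to take 400 s instead.
slow = single_pass_external_time(2e9, 2e9, 50e6, utility_read_seconds=400.0)
print(best, slow)
```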
23
Utility File Read Time
  File Size
    Single-threaded:
    Multi-threaded:
  Number of Pages
    where B: utility file page (block) size
  Best Case (Sequential) Read Time
  Worst Case (Random) Read Time
  [Formulas not preserved in this transcript]
  where
    F: size of input dataset
    o: # of observations × sort key length
    s: average positional latency
    r: average rotational latency
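A common disk model for the two cases named on this slide charges only transfer time in the sequential case, and adds one positional plus one rotational delay per page in the random case. The sketch below uses that model as an assumption, since the slide's formulas are not available; U is taken from the threaded log on slide 11, while R, s, and r are hypothetical device figures.

```python
import math

def utility_read_best(U: float, R: float) -> float:
    """Best case: the utility file is read sequentially."""
    return U / R

def utility_read_worst(U: float, B: float, R: float, s: float, r: float) -> float:
    """Worst case: every page read pays a positional and a rotational delay."""
    pages = math.ceil(U / B)
    return pages * (s + r) + U / R

U = 32768 * 65536        # utility file size in bytes, from the threaded log
B = 65536                # page size in bytes, from the same log
R = 50e6                 # hypothetical transfer rate, bytes/sec
s, r = 0.0085, 0.0042    # hypothetical latencies, seconds

best = utility_read_best(U, R)
worst = utility_read_worst(U, B, R, s, r)
print(f"best {best:.1f} s, worst {worst:.1f} s")
```

Under this model a larger page size B reduces the number of pages and therefore the latency term, which is one reason the utility file page size appears later among the adjustable conditions.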
24
Multi-Pass External Sorting
  Number of Sorted Runs
  Number of Utility File Passes
  Maximum External Merge Order
  [Formulas not preserved in this transcript]
  where
    F: size of input dataset
    O: size of internal sorting overhead
    M: SORTSIZE
    B: utility file page (block) size
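The three quantities named on this slide have standard textbook forms for an external merge sort, sketched below with the slide's symbols. These are assumptions standing in for the slide's formulas: runs = ceil(F / (M − O)), merge order = floor((M − O) / B), and the pass count is how many times the run count must be divided by the merge order to reach one.

```python
import math

def sorted_runs(F: int, O: int, M: int) -> int:
    """Number of sorted runs written to the utility file."""
    return math.ceil(F / (M - O))

def max_merge_order(O: int, M: int, B: int) -> int:
    """How many runs can be merged at once, one page buffer per run."""
    return (M - O) // B

def utility_file_passes(runs: int, merge_order: int) -> int:
    """Merge passes needed to reduce `runs` sorted runs to a single one."""
    if merge_order < 2:
        raise ValueError("merge order must be at least 2")
    passes = 0
    while runs > 1:
        runs = math.ceil(runs / merge_order)
        passes += 1
    return passes

# Hypothetical small-memory case: 100 runs, at most 4 merged at a time.
print(utility_file_passes(100, 4))
```

Raising SORTSIZE helps twice in this model: fewer runs are produced, and more of them can be merged per pass.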
25
Estimate the Running Time
  Single-Pass External, Compute Bound
    [Diagram labels: Input, RAM, Sequential Write, Temp, Random Read, RAM, Output]
26
Single-Pass External, Compute Bound
  Utility File Creation Time
    [Formulas not preserved in this transcript]
    Where
      n_obs is the total number of observations in the dataset
      t_run is the time required to perform an in-memory sort of the number of observations in a single run
  Utility File Merge Time, Compute Bound
    As previously described for I/O bound Utility File Read Time
  Utility File Merge Time, I/O Bound
    Worst Case:
    Best Case:
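From the two definitions on this slide, one reasonable reading is that the compute bound utility file creation time is the number of runs multiplied by the in-memory sort time for one run. The sketch below assumes exactly that; `n_run` (observations per run) is a name introduced here, and the example reuses the measured 2.64 s for 1,000,000 observations from slide 13 with a hypothetical dataset size.

```python
import math

def utility_file_creation_time(n_obs: int, n_run: int, t_run: float) -> float:
    """Compute bound estimate: (number of runs) * (in-memory sort time per run)."""
    if n_run <= 0:
        raise ValueError("observations per run must be positive")
    runs = math.ceil(n_obs / n_run)
    return runs * t_run

# Hypothetical job: 10,000,000 observations sorted in runs of 1,000,000,
# using the 2.64 s in-core sort time measured for 1,000,000 observations.
estimate = utility_file_creation_time(10_000_000, 1_000_000, 2.64)
print(f"{estimate:.1f} s")
```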
27
Consider Estimation Hazards
  File cache effects
  Pseudo-internal sorting (thrashing)
  Pseudo-external sorting (file cache)
  Limitations within each sorting phase
28
Pseudo-Internal Sorting
  [Diagram labels: Random Access Memory, Virtual Memory (RAM+swap), Sorting Overhead, Data, SORTSIZE]
29
Pseudo-External Sorting
  [Diagram labels: Random Access Memory, Overhead, Data, SORTSIZE, File Cache, Utility File]
30
Make Adjustments
  Determine if there is a problem
  Identify the problem
  Alter the conditions
  Re-evaluate
31
Identify the Problem
  Processing speed
  Memory
  External Storage
32
Alter the Conditions
  Memory settings
  Library to storage device mappings
  Utility file location
  Utility file page size
33
Memory Group Option Settings
  [Diagram labels: Random Access Memory, Virtual Memory (RAM+swap), SORTSIZE, MEMSIZE, REALMEMSIZE, SAS, Other Active Processes, Operating System]