SAS Performance on SPARC T4 + Solaris: Customer experience performance study from the U.S. Bureau of Labor Statistics Edmond Cheng, Economist, Bureau of.


1 SAS Performance on SPARC T4 + Solaris: Customer experience performance study from the U.S. Bureau of Labor Statistics
Edmond Cheng, Economist, Bureau of Labor Statistics
Steven Holmes, UNIX Systems Administrator, G&B Solutions
Topics
SAS solutions and technologies as part of BLS operations and statistical outputs
Experience with SPARC T4-2 server performance
Copyright © 2010, SAS Institute Inc. All rights reserved.

2 Bureau of Labor Statistics
The Bureau of Labor Statistics of the U.S. Department of Labor is the principal Federal agency responsible for measuring labor market activity, working conditions, and price changes in the economy. Its mission is to collect, analyze, and disseminate essential economic information to support public and private decision-making. As an independent statistical agency, BLS serves its diverse user communities by providing products and services that are objective, timely, accurate, and relevant.
History
The BLS has provided essential economic information to support public and private decision-making since [year missing]. That was long before calculators and printers existed; statisticians had to process surveys and calculate statistics by hand.
Vision
The Bureau of Labor Statistics will meet the information needs of a rapidly changing U.S. and global economy by continuously improving its products and services, investing in its work force, and modernizing its business processes.

3 Industry Employment
Edmond and Steven are both members of the Division of Industry Data Development in BLS, which is tasked with developing and maintaining the IT system and producing several major industry employment statistics products. Notably:
The monthly payroll number you hear on the first Friday of each month
The monthly real earnings report
The monthly job openings and labor turnover statistics
The annual green jobs number
Why are these statistics so important? They are a measure of current economic conditions. The change in employment is a key indicator of the state of the economy. Congress uses CES data to help make policy decisions. The Fed, the Bureau of Economic Analysis, and other statistical agencies use BLS employment, hours, and earnings data as inputs into their models. State and local governments use these data to measure the economic health of States and areas and to guide monetary policy decisions. Businesses may use CES data to negotiate contracts, select building sites, forecast market demand for their products, and develop marketing strategies. There are also other uses in academia, labor organizations, and research.

4 Operation and Business Process
Survey Frame & Sample Design
Questionnaire Design & Testing
Data Collection & Cycle Management
Data Processing & Validation / Micro Editing
Estimation, Data Tabulation & Macro Editing
Macro modeling, seasonal adjustment
Data Dissemination / Publication
Maintaining the operation that produces these economic statistics requires tremendous planning, coordination, budgeting, human effort, and IT resources. Take the monthly payroll survey as an example. Data collection sends, collects, and processes surveys from over 400,000 establishments each month, from all over the U.S. and through different collection modes. In a relatively short amount of time, the data have to be validated, edited, and reconciled before they can go into estimation. The data then feed into macro modeling, tabulation, editing, and adjustment before they become relevant and reliable estimates. Once all that is completed, the data are reported to the program analysts for review and verification. Finally, the official statistics are passed to the BLS publication office for public press release.

5 SAS Solutions and Others
SAS Base 9.2, SAS AppDev Studio, SAS/ACCESS, SAS/Connect, SAS/ETS, SAS/Graph, SAS/IML, SAS/IntrNet, SAS/Share, SAS/STAT
SAS® Business Intelligence, SAS Enterprise Guide 4.3, SAS Enterprise Guide, BI Server, Data Integration Server, Metadata Server, Microsoft Office Integration, Others
SAS software and solutions are used for data processing, statistical analysis, reporting, and data warehousing. For example:
SAS Base: ETL, customized statistical models, functions, reporting
SAS AppDev Studio: Java-based applications
SAS/ACCESS, SAS/Connect: access to different databases and platforms
SAS/IntrNet: web-client applications
SAS/ETS, SAS/IML, SAS/STAT: statistical needs

6 Oracle Servers
SPARC T4-2 server processor
Eight-core 2.85 GHz SPARC T4 processor
Two processors per system, maximum 128 threads
Eight floating-point units
Dual multithreaded 10 GbE PCI integrated onto chip
Server platforms chosen to run SAS
Long history of using UNIX servers and the Solaris OS for the production system:
multi-user, multi-tasking, resource-sharing
secure, expandable, manageable, performance
compatible with the software and other needs of our office
Sun Fire V, Sun Fire V, Sun Fire V, Sun E3500, Sun Fire T, SPARC M, SPARC M4000, SPARC T4-2 (certification)

7 Performance Test Servers Baselines
Server Model: Linux Lab | Linux HP Blade | Sun Fire T5240 | SPARC Enterprise M3000 | SPARC T4-2
Operating System: Red Hat Enterprise Linux Server release 6.3 (Santiago) | Solaris 10
Processor: Intel Xeon E5430 | Intel Xeon X5550 | UltraSPARC T2+ | SPARC64 VII | SPARC T4
Specs: 2 CPUs, 2.66 GHz, quad-core | 2 CPUs, 1.2 GHz, 6-core | 1 CPU, 2.75 GHz, quad-core | 2 CPUs, 2.85 GHz, 8-core
Threads: 8 | 96 | 128
RAM: 14 GB | 16 GB | 32 GB | 128 GB
SAS Version: 9.2 | 9.1.3 | 9.3
A summary of the hardware configurations and SAS installations. We want to know how SPARC T4-2 performance compares to our existing UNIX SPARC servers, as well as to the Linux servers we have in the lab. All the SAS software and servers were configured similarly. We then ran benchmarking tests using identical production jobs selected from the current SAS system. The key points are:
Comparable results between different configurations
How performance, positively or negatively, might affect SAS users and the production of timely and accurate statistics
[Extra Information] Other miscellaneous test setup information: All SAS jobs were restricted to 1024 MB of memory through the sasv9.cfg file. No solid-state drives were used on any server for this testing. The Solaris servers' file systems are ZFS and the Linux servers' are ext4.

8 SAS DATA and PROC Steps
Test #1: A quick performance check
The first performance test is a self-contained program using the ZIPCODE database available with all Base SAS installations. The program runs common procedures used in most offices. The CPU time and real time are recorded. This gives a quick look at how the test servers measure against results recorded from previous testing.
… Relevant result highlights…
Database attributes
Name: SAS ZIPCODE
Size: GB
Number of obs: millions
Number of cols: 19
Test Setup
Duplicate the SASHELP ZIPCODE database 500 times. Run the sort procedure, calculate summary means, and run a regression.
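The two measures recorded in Test #1, CPU time and real time, differ in what they count: real time is wall-clock elapsed time, while CPU time is the processor time the job actually consumed. The following Python sketch shows one way a wrapper could capture both for a batch job on a UNIX system. It is not the harness used in the study; the function name `time_command` is hypothetical, and the stand-in workload replaces a real SAS invocation so the sketch runs without SAS installed.

```python
import resource
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd to completion; return (real_seconds, cpu_seconds, returncode)."""
    # CPU usage of already-finished child processes, taken before the run.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time = user time + system time consumed by the child we just ran.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return real, cpu, completed.returncode


if __name__ == "__main__":
    # Stand-in workload (assumption). On a test server the command might be
    # something like ["sas", "-sysin", "test1.sas"], where test1.sas is a
    # hypothetical file name for the Test #1 program.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(200000))"]
    real, cpu, rc = time_command(cmd)
    print(f"real={real:.2f}s cpu={cpu:.2f}s rc={rc}")
```

A real time much larger than the CPU time usually points to waiting on I/O or on other jobs rather than to a slow processor, which is why both numbers are worth recording.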

9 Single-Threading Processing
Test #2: Single production job (single-thread)
The second test performs record linkage using the BLS Establishment Longitude Database. The program runs through a series of complex logic steps over 13 consecutive quarters for about 8.5 million establishments. This test is taken from a production job and closely simulates the typical SAS production work that runs in the office.
…Highlight relevant results…
Database attributes
Name: Longitude Database
Size: x 1.0 GB
Number of obs: millions
Number of cols: 41
Test Setup
Merge 13 databases by specified linkage rules. Run logic procedures and mathematical computations to produce a final database table.

10 Multi-Threading Processing
Test #3: Four production jobs (multi-threads)
The setup for Test #3 prepares four identical versions of the Test #2 program and then starts all four SAS jobs simultaneously. The results provide different measures of how each server performs when running multiple concurrent threads.
…Highlight relevant results…
Database attributes
Name: Longitude Database
Size: x 1.0 GB
Number of obs: millions
Number of cols: 41
Test Setup
Running four Test #2 programs at the same time.
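The Test #3 setup, starting several identical jobs at once and measuring how each one fares, can be sketched in Python as below. This is an illustration under stated assumptions rather than the study's actual procedure: the helper names `timed_run` and `run_concurrently` are hypothetical, and a trivial stand-in command replaces the SAS batch job.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run one job to completion; return (elapsed_seconds, returncode)."""
    start = time.perf_counter()
    rc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode
    return time.perf_counter() - start, rc


def run_concurrently(cmd, copies=4):
    """Start `copies` identical jobs together; return (per-job results, wall time)."""
    start = time.perf_counter()
    # One thread per job, so all jobs are launched without waiting on each other.
    with ThreadPoolExecutor(max_workers=copies) as pool:
        results = list(pool.map(timed_run, [cmd] * copies))
    wall = time.perf_counter() - start
    return results, wall


if __name__ == "__main__":
    # Stand-in job (assumption). A real run might use a command such as
    # ["sas", "-sysin", "test2.sas"], with test2.sas a hypothetical file name.
    cmd = [sys.executable, "-c", "sum(i * i for i in range(200000))"]
    results, wall = run_concurrently(cmd, copies=4)
    for i, (elapsed, rc) in enumerate(results, start=1):
        print(f"job {i}: {elapsed:.2f}s rc={rc}")
    print(f"all jobs finished in {wall:.2f}s")
```

Comparing each job's elapsed time here against the single-job time from Test #2 indicates how much a server slows down under concurrent load.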

11 PROC IML Statistical Modeling
Test #4: Single and multiple statistical procedures
The final test runs a set of SAS IML statistical procedures that perform a combination of matrix algebra, statistical modeling, and sample and estimate replications. We know this process is both CPU- and memory-intensive. And here is the result. Remark: the "eight threads" test was not performed for the M3000.
…Highlight relevant results…
Database attributes
Name: None
Size: N/A
Number of obs: N/A
Number of cols: N/A
Test Setup
Running a SAS/IML PROC IML program in a single thread and in eight concurrent threads.

12 Contacts
Edmond Cheng
U.S. Bureau of Labor Statistics
2 Massachusetts Avenue, NE
Washington, DC 20212

Steven Holmes
U.S. Bureau of Labor Statistics
2 Massachusetts Avenue, NE
Washington, DC 20212

Any opinions expressed in this paper are those of the authors and do not constitute policy of the Bureau of Labor Statistics.
