1
Grid Computing at Texas Tech University using SAS
Ron Bremer, Jerry Perez, Phil Smith, Peter Westfall*
*Director, Center for Advanced Analytics and Business Intelligence, Texas Tech University
2
What is Grid Computing?
Grid computing means using multiple resources connected over a network to perform demanding calculations.
Example:
3
Economies of High Performance Computing
Current fastest machine: ~40 Tflops ($300M)
10 Tflops machines: ~$50M
Fastest cluster at TTU: 0.1 Tflops (~$0.1M)
Speed of a PC: 0.003 Tflops (~$0.001M)
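The economics on this slide are easier to compare as cost per Tflops. The sketch below is a Python illustration (not part of the original talk) that divides each quoted price by the quoted speed; the dictionary keys are labels chosen here for readability, and the figures are the rough ones from the slide.

```python
# Figures transcribed from the slide: (speed in Tflops, cost in millions of USD).
machines = {
    "Fastest machine": (40.0, 300.0),
    "10 Tflops machine": (10.0, 50.0),
    "Fastest TTU cluster": (0.1, 0.1),
    "Single PC": (0.003, 0.001),
}


def cost_per_tflops(tflops, cost_musd):
    """Return the cost, in millions of USD, of one Tflops of capacity."""
    return cost_musd / tflops


cost_table = {name: cost_per_tflops(*spec) for name, spec in machines.items()}

for name, value in cost_table.items():
    print(f"{name:20s} ${value:.2f}M per Tflops")
```

On these numbers, commodity PCs are by far the cheapest source of raw floating-point capacity, which is the motivation for building a grid from them.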
4
Underused Resources
Computers are everywhere, mostly idle!
Grid computing leverages unused resources to create an effective “supercomputer”
Teraflops = (N computers) x (Tflops per computer)
For free! (Almost)
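The formula above can be tried with the per-PC speed quoted earlier in the talk (0.003 Tflops). The Python sketch below is an idealized illustration: it assumes perfect scaling with no communication or scheduling overhead, and the machine count of 100 is only an example value.

```python
import math

PC_TFLOPS = 0.003  # per-PC speed quoted in the talk


def grid_tflops(n_computers, tflops_per_computer=PC_TFLOPS):
    """Idealized aggregate speed: N computers times Tflops per computer."""
    return n_computers * tflops_per_computer


def pcs_needed(target_tflops, tflops_per_computer=PC_TFLOPS):
    """Smallest number of PCs whose idealized aggregate reaches the target."""
    return math.ceil(target_tflops / tflops_per_computer)


aggregate_100 = grid_tflops(100)      # example: a lab of 100 PCs
match_cluster = pcs_needed(0.1)       # PCs needed to match a 0.1 Tflops cluster
match_fastest = pcs_needed(40.0)      # PCs needed to match a 40 Tflops machine

print(f"100 PCs: about {aggregate_100:.2f} Tflops")
print(f"PCs to match 0.1 Tflops: {match_cluster}")
print(f"PCs to match 40 Tflops: {match_fastest}")
```

Real workloads fall short of this bound unless the tasks are independent, which is why the talk later focuses on replicated simulations.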
5
Grid Initiatives at TTU and in Texas
HipCAT – High Performance Computing Across Texas
TIGRE – Texas Internet Grid for Research and Education
SORCER – Service ORiented Computing EnviRonment (TTU CS dept.)
SAS/Connect grid
6
HipCAT
Consortium of Texas institutions working together to use
–High performance computing
–Clusters
–Massive data storage
–Scientific visualization
–Grid computing
Director: Phil Smith, Texas Tech University
Members:
–Baylor College of Medicine
–Rice University
–Texas A&M University
–Texas Tech University
–University of Houston
–University of Texas
–University of Texas at Austin
–University of Texas at Arlington
–University of Texas at El Paso
–University of Texas Southwestern Medical Center
7
TIGRE
Texas Internet Grid for Research & Education
Two-year project involving UT, TTU, UH, Rice, and TAMU
Funding announced by the Governor in September
TIGRE will develop a grid software stack, together with policies and procedures, to facilitate Texas grid computing efforts.
8
Grid Software Products Used at TTU
AVAKI
Globus
Jini Networking Technology
SAS/Connect (MPConnect), %Distribute macro
9
Benefits of SAS
Ease of use (relative to other grid products)
Available and applicable for many scientists in their respective fields
Flexibility
–Database (DATA step, PROC SQL)
–Math/Optimization (SAS/IML, SAS/OR)
–Stat (SAS/STAT, SAS/ETS)
10
Problems Amenable to SAS Grid
Replicates of a fundamental task
Fundamental tasks are time-consuming, with many replicates
Examples
–Simulation
–Astrophysics
–Bioinformatics
–Ensembles of predictive models
11
Success Story
Financial Event Studies
–Developed simulation tool to detect events
–Simulated its performance
–25 hours of computing finished in 40 minutes
–Published in J. Fin. Econometrics
Old system: “Sneaker grid”
12
Another Success Story: Portfolio Analysis
300 portfolios of 50 securities each, formed by randomly sampling securities from the CRSP daily database (7.23 gigabytes)
15 models created for each of the 50 securities (PROC AUTOREG of SAS/ETS), under 169 treatment settings
126,750 models and associated DATA steps per portfolio
500 days of continuous computing time reduced to two weeks
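The figures in the two success stories can be checked with a few lines of arithmetic. The Python sketch below (an illustration added here, not from the talk) recomputes the model count per portfolio and the implied speedups, taking "two weeks" as 14 days.

```python
# Portfolio analysis: models fitted per portfolio.
models_per_security = 15
securities_per_portfolio = 50
treatment_settings = 169
models_per_portfolio = (
    models_per_security * securities_per_portfolio * treatment_settings
)

# Total across all 300 portfolios.
total_models = 300 * models_per_portfolio

# Implied speedups (serial time divided by grid time).
speedup_event_study = (25 * 60) / 40   # 25 hours reduced to 40 minutes
speedup_portfolio = 500 / 14           # 500 days reduced to two weeks (14 days assumed)

print(f"Models per portfolio: {models_per_portfolio:,}")
print(f"Total models: {total_models:,}")
print(f"Event-study speedup: {speedup_event_study:.1f}x")
print(f"Portfolio speedup: {speedup_portfolio:.1f}x")
```

Both speedups come out in the mid-thirties, which is plausible for independent replicates spread over a few dozen machines.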
13
Notoriety
Web articles appeared in SAS, Grid Today, Next-Gen Data Forum
Interviewed by Database Trends and Applications
14
SAS Grid Structure
Client connects to host machines
Client sends replicates of the fundamental task (“chunks”) to hosts
Hosts process chunks and send results back to client
Client combines results and summarizes
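The client/host pattern described above can be sketched in a few lines. This is a Python illustration rather than the SAS/Connect code used in the talk: a thread pool stands in for the remote hosts, and the names `process_chunk` and `run_grid`, along with the toy task (averaging uniform random numbers), are assumptions made for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def process_chunk(seed, n_replicates):
    """Host side: run one chunk of replicates and return partial results."""
    rng = random.Random(seed)  # separate seed per chunk keeps chunks distinct
    total = sum(rng.random() for _ in range(n_replicates))
    return total, n_replicates


def run_grid(n_chunks=8, n_replicates=10_000, n_hosts=4):
    """Client side: send chunks to hosts, then combine what comes back."""
    with ThreadPoolExecutor(max_workers=n_hosts) as hosts:
        partials = list(
            hosts.map(process_chunk, range(n_chunks), [n_replicates] * n_chunks)
        )
    grand_total = sum(total for total, _ in partials)
    grand_n = sum(count for _, count in partials)
    return grand_total / grand_n, grand_n


mean, n = run_grid()
print(f"Combined mean over {n} replicates: {mean:.4f}")
```

Each chunk returns only a small summary (a sum and a count), so the client's combining step stays cheap regardless of how many replicates each host ran.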
15
The SAS Grid
16
SAS Farm
100 SAS machines in student lab
2.66 GHz per node
All have SAS software installed
SAS “Spawner” must be started on all machines
Avaki also installed; it diagnoses problems
17
Student Lab
18
Load Balancing
Automatically supports load balancing by farming out independent tasks to the next available resource.
Students never noticed that their machines were being used!
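"Farming out independent tasks to the next available resource" amounts to a shared work queue: whichever host finishes first takes the next task. The Python sketch below illustrates that idea only; it is not how SAS/Connect implements it, and the function name `run_balanced` and the sleep-based stand-in for work are assumptions.

```python
import queue
import threading
import time


def run_balanced(task_costs, n_hosts=3):
    """Give each task to whichever host becomes free next."""
    tasks = queue.Queue()
    for task_id, cost in enumerate(task_costs):
        tasks.put((task_id, cost))

    done_by = {host_id: [] for host_id in range(n_hosts)}

    def host(host_id):
        while True:
            try:
                task_id, cost = tasks.get_nowait()
            except queue.Empty:
                return  # no work left, host goes idle
            time.sleep(cost)  # stand-in for the real computation
            done_by[host_id].append(task_id)

    threads = [threading.Thread(target=host, args=(h,)) for h in range(n_hosts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done_by


# One slow task and several fast ones, in seconds of simulated work.
assignment = run_balanced([0.05, 0.01, 0.01, 0.01, 0.01, 0.01])
for host_id, task_ids in assignment.items():
    print(f"host {host_id} completed tasks {task_ids}")
```

A host stuck on the slow task simply takes fewer tasks overall, so no advance knowledge of task durations is needed. The exact assignment varies from run to run because it depends on timing.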
19
Simulation-Based Methods
PROC MULTTEST of SAS/STAT (first hard-coded bootstrap?)
20
Simulation-Based Methods, II
Adjust=simulate in GLM and MIXED
Posterior simulation in MIXED
21
Toy Example – Testing Random Number Generators
Random number generators often fail to provide independent numbers.
Test case: $U_1$, $U_2$ are Uniform on (0,1). If independent, then $E\{6(U_1 - U_2)^2\} = 1.00$.
Check: Generate many pairs, report average (should be 1.000000)
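The expected value follows from the variance of a uniform variable: Var(U) = 1/12, so for independent U1 and U2, E(U1 - U2)^2 = 2/12 = 1/6, and multiplying by 6 gives 1. The Python sketch below runs the check with Python's built-in generator; it is an illustration, not the SAS code from the talk, and the function name, pair count, and seed are arbitrary choices.

```python
import math
import random


def rng_independence_check(n_pairs=200_000, seed=12345):
    """Average 6*(U1 - U2)^2 over many pairs; also return its standard error."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_pairs):
        x = 6.0 * (rng.random() - rng.random()) ** 2
        total += x
        total_sq += x * x
    mean = total / n_pairs
    variance = total_sq / n_pairs - mean ** 2
    return mean, math.sqrt(variance / n_pairs)


mean, std_err = rng_independence_check()
print(f"average = {mean:.6f} (should be near 1), standard error = {std_err:.6f}")
```

Reporting the standard error alongside the average shows how far from 1 the result can reasonably be for a given number of pairs. Tightening the check to six decimal places requires a very large number of pairs, which is what makes the toy problem a natural candidate for splitting into chunks across a grid.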
22
Code
23
Results
24
Startup (Windows)
1. Start Spawner:
C:\Program Files\SAS\SAS 9.1>spawner -i -comamid tcp
2. Activate Spawner:
3. Set batch log in permissions:
25
The %Distribute Macro
Written by Cheryl Doninger and Randy Tobias
File: http://support.sas.com/rnd/scalability/papers/distribute.zip
Supporting document: http://support.sas.com/rnd/scalability/papers/distConnect0401.pdf
26
Problems We Have Experienced
Random crashes (client as well as hosts)
Diagnosing errors
I/O problems
Windows Service Pack 2 Firewall
Social issues (grid involves people!)
27
Future Plans
Support from business and government:
–grid-enabled bioinformatics
–business intelligence/data mining
Support HPC at TTU and in Texas