Grid Computing at Texas Tech University using SAS
Ron Bremer, Jerry Perez, Phil Smith, Peter Westfall*
*Director, Center for Advanced Analytics and Business Intelligence, Texas Tech University

What is Grid Computing?
Grid computing means using multiple computing resources, connected by a network, to perform demanding calculations together.

Economies of High Performance Computing
– Current fastest machine: ~40 Tflops ($300M)
– 10 Tflops machines: ~$50M
– Fastest cluster at TTU: 0.1 Tflops (~$0.1M)
– Speed of a PC: a tiny fraction of a Tflop (~$.001M)

Underused Resources
Computers are everywhere, and mostly idle!
Grid computing leverages these unused resources to create an effective "supercomputer":
Total Tflops = (N computers) x (Tflops per computer)
For free! (Almost)

Grid Initiatives at TTU and in Texas
– HiPCAT – High Performance Computing Across Texas
– TIGRE – Texas Internet Grid for Research and Education
– SORCER – Service-ORiented Computing EnviRonment (TTU CS dept.)
– SAS/CONNECT grid

HiPCAT
Consortium of Texas institutions working together to advance:
– High performance computing
– Clusters
– Massive data storage
– Scientific visualization
– Grid computing
Director: Phil Smith, Texas Tech University
Members:
– Baylor College of Medicine
– Rice University
– Texas A&M University
– Texas Tech University
– University of Houston
– University of Texas
– University of Texas at Austin
– University of Texas at Arlington
– University of Texas at El Paso
– University of Texas Southwestern Medical Center

TIGRE
Texas Internet Grid for Research & Education
– Two-year project involving UT, TTU, UH, Rice, and TAMU
– Funding announced by the Governor in September
– TIGRE will develop a grid software stack, along with policies and procedures, to facilitate grid computing efforts across Texas

Grid Software Products Used at TTU
– AVAKI
– Globus
– Jini Networking Technology
– SAS/CONNECT (MP CONNECT), %Distribute macro

Benefits of SAS
– Ease of use (relative to other grid products)
– Available to, and applicable for, many scientists in their respective fields
– Flexibility:
 – Database (DATA step, PROC SQL)
 – Math/optimization (SAS/IML, SAS/OR)
 – Statistics (SAS/STAT, SAS/ETS)

Problems Amenable to SAS Grid
– Replicates of a fundamental task
– The fundamental task is time consuming, and there are lots of replicates
Examples:
– Simulation
– Astrophysics
– Bioinformatics
– Ensembles of predictive models

Success Story: Financial Event Studies
– Developed a simulation tool to detect events
– Simulated its performance
– 25 hours of computing finished in 40 minutes
– Published in the Journal of Financial Econometrics
– Old system: a "sneaker grid"

Another Success Story: Portfolio Analysis
– 300 portfolios of 50 securities each, built by randomly sampling securities from the CRSP daily database (7.23 gigabytes)
– 15 models (PROC AUTOREG of SAS/ETS) created for each of the 50 securities, under 169 treatment settings: 15 x 50 x 169 = 126,750 models and associated DATA steps per portfolio
– 500 days of continuous computing time reduced to two weeks
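Each of those fits is a single PROC AUTOREG call. As a rough illustration only (the data set, variables, and options below are hypothetical, not the study's actual settings), one such fit might look like:

/* One illustrative fit out of the 126,750: a market model with
   AR(1) errors for a single security's daily returns. */
proc autoreg data=sec001;
   model ret = mktret / nlag=1 method=ml;   /* one "treatment setting" */
run;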

Notoriety
– Web articles appeared in SAS publications, GRIDtoday, and the Next-Gen Data forum
– Interviewed by Database Trends and Applications

SAS Grid Structure
– The client connects to the host machines
– The client sends replicates of the fundamental task ("chunks") to the hosts
– The hosts process their chunks and send the results back to the client
– The client combines the chunks and summarizes
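This cycle runs on SAS/CONNECT's MP CONNECT facility. The deck shows no code for it, so the following is a minimal sketch of the pattern, assuming two hypothetical host machines (lab01, lab02) with spawners listening on port 7551; the %Distribute macro described later automates this same SIGNON/RSUBMIT/WAITFOR cycle.

/* Minimal MP CONNECT sketch of the client/host cycle.
   Host names and the port number are hypothetical. */
options comamid=tcp;

/* A remote session ID is a macro variable holding "host port". */
%let lab01=lab01.ttu.edu 7551;
%let lab02=lab02.ttu.edu 7551;

signon lab01;
signon lab02;

/* Send one chunk to each host; WAIT=NO makes the submits asynchronous. */
rsubmit lab01 wait=no;
   data chunk;                 /* replicate set 1 of the fundamental task */
      do rep = 1 to 1000;
         y = ranuni(0); output;   /* a real run would manage seeds per host */
      end;
   run;
endrsubmit;

rsubmit lab02 wait=no;
   data chunk;                 /* replicate set 2 of the fundamental task */
      do rep = 1 to 1000;
         y = ranuni(0); output;
      end;
   run;
endrsubmit;

/* Block until both hosts are done, then pull the results back. */
waitfor _all_ lab01 lab02;

rsubmit lab01;
   proc download data=chunk out=work.chunk1; run;
endrsubmit;
rsubmit lab02;
   proc download data=chunk out=work.chunk2; run;
endrsubmit;

/* Combine and summarize on the client. */
data all;
   set work.chunk1 work.chunk2;
run;
proc means data=all mean; var y; run;

signoff lab01;
signoff lab02;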

The SAS Grid

SAS Farm
– 100 SAS machines in a student lab
– 2.66 GHz per node
– All have SAS software installed
– The SAS "Spawner" must be started on every machine
– Avaki is also installed, to diagnose problems

Student Lab

Load Balancing
The grid automatically balances load by farming each independent task out to the next available resource.
Students never noticed that their machines were being used!

Simulation-Based Methods
PROC MULTTEST of SAS/STAT (perhaps the first hard-coded bootstrap in SAS?)
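PROC MULTTEST specifies the resampling directly in its syntax; here is a minimal bootstrap sketch, with hypothetical data set, class, and variable names:

/* Bootstrap-adjusted p-values for two treatment contrasts. */
proc multtest data=mydata bootstrap nsample=20000 seed=12345;
   class trt;                      /* treatment groups A, B, C */
   test mean(response);            /* compare group means of 'response' */
   contrast 'A vs B' 1 -1  0;
   contrast 'A vs C' 1  0 -1;
run;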

Simulation-Based Methods, II
– ADJUST=SIMULATE in PROC GLM and PROC MIXED
– Posterior simulation in PROC MIXED
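Both features ride on familiar statements; a small sketch of ADJUST=SIMULATE on an LSMEANS statement, with hypothetical data set and variable names:

/* Simulation-based multiplicity adjustment for all pairwise
   comparisons of treatment least-squares means. */
proc mixed data=trial;
   class center trt;
   model y = trt;
   random center;
   lsmeans trt / diff adjust=simulate(seed=8675309 nsamp=100000);
run;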

Toy Example – Testing Random Number Generators
Random number generators often fail to provide independent numbers.
Test case: U1 and U2 are Uniform(0,1). If they are independent, then E{6(U1 - U2)^2} = 1, since E{(U1 - U2)^2} = Var(U1) + Var(U2) = 1/12 + 1/12 = 1/6.
Check: generate many pairs and report the average of 6(U1 - U2)^2; it should be close to 1.0000.

Code
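The code slide itself is an image that does not survive in this transcript. As a stand-in, here is a minimal serial version of the check; on the grid, the replicate loop would be split into chunks and farmed out to the hosts (the replicate count and clock seeding are illustrative choices):

/* Toy RNG test: for independent U1, U2 ~ Uniform(0,1),
   the mean of 6*(U1 - U2)**2 should be 1. */
%let nreps = 1000000;      /* number of simulated pairs (hypothetical) */

data rngtest;
   do i = 1 to &nreps;
      u1 = ranuni(0);      /* consecutive draws from one stream: */
      u2 = ranuni(0);      /* exactly what the independence check probes */
      y  = 6*(u1 - u2)**2;
      output;
   end;
   keep y;
run;

proc means data=rngtest mean stderr;
   var y;                  /* the mean should be close to 1.0000 */
run;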

Results

Startup (Windows)
1. Start Spawner: C:\Program Files\SAS\SAS 9.1>spawner -i -comamid tcp
2. Activate Spawner:
3. Set batch log-in permissions:

The %Distribute Macro
– Written by Cheryl Doninger and Randy Tobias
– File: rs/distribute.zip
– Supporting document: rs/distConnect0401.pdf

Problems We Have Experienced
– Random crashes (the client as well as the hosts)
– Diagnosing errors
– I/O problems
– The Windows Service Pack 2 firewall
– Social issues (a grid involves people!)

Future Plans
– Support from business and government:
 – grid-enabled bioinformatics
 – business intelligence/data mining
– Support HPC at TTU and in Texas