Published by Kieran Kimberly. Modified over 10 years ago.
1
Data Integration in a Grid-Enabled Environment
Cheryl Doninger, R&D Director, SAS
Nancy Rausch, Senior Software Mgr, SAS
Copyright © 2008 SAS Institute Inc. All rights reserved. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
2
30 Years Ago - the Mainframe
3
Exploiting multiple processors in a machine
4
Grid goes beyond a single machine
SAS Grid Manager
5
SAS Grid Manager Key Capabilities
Distributed Enterprise Scheduling: Distribute jobs within workflows to a shared pool of resources.
Workload Balancing: Distribute workloads to a shared pool of resources.
Parallelized Workload Balancing: Distribute parallelized SAS workloads to a shared pool of resources.
Optimize the Efficiency and Utilization of Computing Resources
6
What Products Can Leverage SAS Grid Manager?
Distributed Enterprise Scheduling:
  SAS Data Integration Studio
  SAS Web Report Studio
  SAS Marketing Automation
  SAS Marketing Optimization
  Any SAS program*
  *(with modification)
Workload Balancing:
  SAS Data Integration Studio
  SAS Enterprise Guide*
  SAS Workspace Server
  Any SAS program*
  SAS Stored Processes**
  *(with wrapper) **(with limitations)
Parallelized Workload Balancing:
  SAS Data Integration Studio
  SAS Enterprise Miner
  SAS Risk Dimensions*
  Any SAS program*
  SAS Stored Processes**
  *(with modification) **(with limitations)
7
SAS Code Importer
8
Once Imported...
http://support.sas.com/documentation/onlinedoc/gridmgr/index.html
9
SAS Data Integration Studio – Distributed Enterprise Scheduling
10
SAS Data Integration Studio – Multi-User Workload Balancing
[Figure labels: Public Sector, Manufacturing, Financial, Life Sciences, …]
11
Data Integration Studio on a Grid: Loops and Iterations
Example: A simple job
  Specific physical tables referenced
  Specific transform logic
12
Repetition can be helpful
Processing data in multiple pieces
Same process over several data sets
Examples:
  Same process every hour
  Same process for multiple stores
  Same process for every state
13
How to do repetitive things?
Here is one way:
  Copy, Paste
  Edit in new job
Problem: Multiple maintenance
14
Doing this more automatically
Use Looping
[Figure labels: Loop, Loop End]
15
How to do iteration
Loop input: list of items to repeat over
Loop body: one or more jobs and transforms to run repeatedly
Loop output: status table (optional)
  Can be input into next loop
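The loop structure on this slide can be sketched in Python. This is an illustration only, not SAS or Data Integration Studio code: the names `run_loop` and `load_state` and the status-row fields are hypothetical. The sketch shows a list of items driving a repeated body and producing an optional status table that can feed a following loop.

```python
from typing import Callable, Dict, List


def run_loop(items: List[str], loop_body: Callable[[str], None]) -> List[Dict[str, str]]:
    """Run loop_body once per item and return a status table (one row per item)."""
    status_table = []
    for item in items:
        try:
            loop_body(item)
            status = "ok"
        except Exception as exc:  # record the failure instead of stopping the loop
            status = f"failed: {exc}"
        status_table.append({"item": item, "status": status})
    return status_table


def load_state(state: str) -> None:
    """Hypothetical loop body standing in for one or more jobs and transforms."""
    if not state:
        raise ValueError("empty state code")


status = run_loop(["NC", "", "VA"], load_state)
# Items that succeeded can be passed on as the input list of a next loop.
next_input = [row["item"] for row in status if row["status"] == "ok"]
```

Here the status table plays the role of the optional loop output: filtering it yields the input list for a second loop.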
16
How to loop
Sequential: One SAS session runs them all
Parallel:
  Connect licensed: Parallel on same machine (SMP)
  Grid Manager licensed: Parallel on a grid
17
How to loop in parallel – Lots of options
1 per CPU
  Don't overload machine
Specified number
  Help prevent overload
  Can double up per CPU
Run all
  Let 'er rip!
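The three choices above amount to different caps on how many iterations run at once. A minimal Python sketch follows, assuming threads as a stand-in for separate SAS sessions. The option names `one_per_cpu`, `specified`, and `run_all` and both function names are hypothetical, not Data Integration Studio settings.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def max_workers_for(option: str, n_items: int, specified: Optional[int] = None) -> int:
    """Translate a loop option into a cap on concurrently running iterations."""
    if option == "one_per_cpu":
        return max(1, os.cpu_count() or 1)
    if option == "specified":
        if specified is None or specified < 1:
            raise ValueError("a positive number of processes is required")
        return specified
    if option == "run_all":
        return max(1, n_items)
    raise ValueError(f"unknown option: {option}")


def run_parallel(items: List[str], body: Callable[[str], str], option: str,
                 specified: Optional[int] = None) -> List[str]:
    """Run body over items with at most the chosen number of workers; keeps input order."""
    workers = max_workers_for(option, len(items), specified)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))


results = run_parallel(["Atlanta", "Chicago", "Miami"], lambda s: s.upper(), "specified", 2)
```

With a specified number larger than the CPU count, iterations double up per CPU; with `run_all`, every iteration starts at once and nothing protects the machine from overload.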
18
Controlling iteration with parameters
By default, tables are specific physical locations
Many things in SAS can accept macro variables
Parameters are macro-enabled ETL objects
Data Integration Studio provides user interface
Input values can be mapped to parameters
19
Creating Parameters
Parameter name
Macro variable: &StateParm
Default value: used in many ETL/S activities
  Running a test job
  Viewing data
20
Parameters on objects
Property tab: add parameters for that object
Jobs can import them
  From referenced tables
  From included nested objects
Loop transform will use them
Can supply a default for testing
21
Some Good Examples of Parameters
In a table name: RETAIL&StateParm
In a file path
In a library path
ODS Titles
Mapping
SQL Query
…Anywhere you want…
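How a parameter such as &StateParm turns one table name into many can be shown with a small Python sketch. It is a simplified imitation of macro-variable substitution, not the SAS macro processor: it handles only `&name` with an optional closing period. The function name `resolve` and the example file path are hypothetical.

```python
import re
from typing import Dict

# Matches &name with an optional trailing period, which SAS uses to end a macro variable reference.
_MACRO_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9_]*)\.?")


def resolve(template: str, params: Dict[str, str]) -> str:
    """Substitute &name references using params; names are matched case-insensitively."""
    lookup = {name.lower(): value for name, value in params.items()}

    def replace(match: "re.Match[str]") -> str:
        key = match.group(1).lower()
        # Leave unknown references untouched rather than guessing a value.
        return lookup.get(key, match.group(0))

    return _MACRO_REF.sub(replace, template)


table_name = resolve("RETAIL&StateParm", {"StateParm": "NC"})
file_path = resolve("/data/&StateParm./orders.csv", {"StateParm": "NC"})
```

Each loop iteration would supply a different value for the parameter, so the same job definition reads or writes a different table or file each time.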
22
Real Example: Start with 1 Retail Store
1,000,000 orders
1 year = 80 MB data
23
Scale up the data
10,000 stores
52 billion orders
5 years = 4.2 terabytes of data
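The scaled-up figure can be checked against the single-store example with a few lines of Python. The sketch assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), which the slides do not state.

```python
MB = 10**6   # decimal megabyte (assumption)
TB = 10**12  # decimal terabyte (assumption)

bytes_per_order = 80 * MB / 1_000_000      # one store: 80 MB for 1,000,000 orders
total_bytes = 52_000_000_000 * bytes_per_order
total_tb = total_bytes / TB

print(f"{bytes_per_order:.0f} bytes per order, {total_tb:.2f} TB in total")
```

Under that assumption, 52 billion orders at 80 bytes each come to 4.16 TB, which is consistent with the 4.2 terabytes quoted on the slide.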
24
Run Jobs in Parallel by Looping
Loop transform
  1 Atlanta Store
  2 Chicago Store
  3 Miami Store
  4 …
25
Substitute Variables
Input: add parameter to table, Name = &week
  1 Week1
  2 Week2
  3 Week3
  4 …
Output: add parameter to table, Name = Out&week
  1 OutWeek1
  2 OutWeek2
  3 OutWeek3
  4 …
[Figure labels: Input, Existing Job (Process, Load), Output]
26
The Results were very good …
3.22 terabytes per hour
50 GB / minute
~1 GB / sec
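The three throughput figures are the same rate in different units, rounded. A short Python conversion, assuming decimal units (1 TB = 1000 GB), shows how closely they agree.

```python
GB_PER_TB = 1000  # decimal units (assumption)

tb_per_hour = 3.22
gb_per_minute = tb_per_hour * GB_PER_TB / 60
gb_per_second = gb_per_minute / 60

print(f"{gb_per_minute:.1f} GB/min, {gb_per_second:.2f} GB/s")
```

The conversion gives about 53.7 GB per minute and about 0.89 GB per second, so the slide's 50 GB / minute and ~1 GB / sec are round-number approximations of 3.22 terabytes per hour.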
27
Grid Partitions
[Figure labels: Data Integration Studio … n; Enterprise Miner; EM grid; DI grid; Base, Connect, SAS Grid Mgr; SAS Servers; Connect Client]
28
Useful Loop Transform Options
Grid Partitions
Restart sessions
Log directory
Error handling
  Abort all remaining
  Abort only current
  Continue on error
…others
29
Another Case Study: Census data
Running sequentially
Data from 50 states
Running on one computer at a time
About 580 minutes (just under 10 hours)
30
Running in parallel
Running on six computers
About 108 minutes (under 2 hours)
31
Adding more computers
Across nine computers
~ 77 minutes
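The census timings can be turned into speedup and parallel efficiency with a short Python calculation. The function name is hypothetical, the inputs are the elapsed minutes quoted on the slides, and the 77-minute figure is itself approximate.

```python
from typing import Tuple


def speedup_and_efficiency(sequential_min: float, parallel_min: float,
                           computers: int) -> Tuple[float, float]:
    """Return (speedup, efficiency), where efficiency is speedup per computer."""
    speedup = sequential_min / parallel_min
    return speedup, speedup / computers


# Elapsed minutes taken from the slides: 580 sequential, 108 on six, ~77 on nine.
for computers, minutes in [(6, 108), (9, 77)]:
    speedup, efficiency = speedup_and_efficiency(580, minutes, computers)
    print(f"{computers} computers: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

On these numbers, six computers give roughly a 5.4x speedup (about 90% efficiency) and nine give roughly 7.5x (about 84%), so the added machines still help but with somewhat lower efficiency.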
32
Did we keep the computers busy?
In this case, we really did
Running 6 jobs at a time on 6 processors
33
Additional Case Studies
Unstructured Data
…See the paper for more examples
Using Grid harnesses the power of your enterprise
34
Questions
Want to know more?
Achieving High Availability in a SAS® Grid Environment, Paper 001-2009
What's New in SAS® Data Integration Studio 4.2, Paper 093-2009
For Base SAS® Users: Welcome to SAS® Data Integration!, Paper 092-2009
Cross Validation and Learning Curve Model Comparison with JMP® Genomics and Grid Computing, Paper 286-2009
ISO's Evolution to BI on the Grid: A Customer Perspective, Mon, 5:30 PM, Maryland 3; Paper 269-2009
Going from Good to Great: The Value of an Analytic Grid Platform at ISO, Tues 11:00 PM, National Harbor 12; Presentation only
The University of Phoenix Wins Big with SAS® Grid, Tues 11:00 PM, National Harbor 5; Presentation only
36
Supply a Default Value for Testing
Default value: Week1
37
Existing Job
[Figure labels: Input, Existing Job (Extract, Load), Output]
38
Reuse existing job and run in parallel
  1 Portugal
  2 France
  3 Spain
  4 …
[Figure labels: Existing Job …]
39
Iteration (Looping) in Parallel
Input: add parameter to table, Name = &Country
  1 Portugal
  2 France
  3 Spain
  4 …
Output: add parameter to table, Name = Out&Country
  1 OutPortugal
  2 OutFrance
  3 OutSpain
  4 …
[Figure labels: Input, Existing Job (Extract, Load), Output]
40
What products can leverage SAS Grid Manager?
Optimize the Efficiency and Utilization of Computing Resources
Distributed Enterprise Scheduling:
  SAS Data Integration Studio
  SAS Web Report Studio
  SAS Marketing Automation
  SAS Marketing Optimization
  Any SAS program*
  *(with modification)
Multi-User Workload Balancing:
  SAS Data Integration Studio
  Any SAS program*
  SAS Enterprise Guide*
  SAS Workspace Server
  SAS Stored Processes**
  *(with modification) **(with limitations)
Parallel Workload Balancing:
  SAS Data Integration Studio
  SAS Enterprise Miner
  SAS Risk Dimensions*
  Any SAS program*
  SAS Stored Processes**
  *(with modification) **(with limitations)