Copyright © 2004, SAS Institute Inc. All rights reserved.
Forthcoming Changes in SAS
Paul Kent, VP, SAS Platform Research & Development
Where do I come from?
- New Hill, North Carolina: “Y’all”
- Johannesburg, South Africa: “Julle”
- Fareham, England: ???
R & D :: Loyal Employees
R & D groups, and where I come from
- Platform
- Clients
- Solutions
- With Analytics
What do we programmers do?
- Gather Data
- Organise Data
- Arrange Data for consumption
- Facilitate said consumption
- Create understanding of Data
- Promote understanding of said Data
(Each step adds Value)
Who do we programmers do it for?
[Diagram: Audience Continuum in the Information Delivery Framework. Audience labels: Information Consumers, Domain Experts, Power User, Business Analyst, Info Tech (from a large % to a small % of users). Activity labels: Web Report Viewing, Web Reporting, Power Reporting, Analytic Reporting. Axis: Value.]
Forthcoming Improvements in the SAS Foundation
- ODS (and the new ODS statistical graphics)
- SAS Database Storage capabilities
- The DATA Step and PROC SQL
- Grid Computing Capabilities
- Bits and Pieces
ODS Statistical Graphics
Survival Plot Using PROC LIFETEST in SAS 8
- J. Zhou, NESUG 2002
- Three-page SAS program with macros
- Use GPLOT and GREPLAY for graphics
- Statistical Metadata
- Overlaid Curves
Statistical Graphics
- Essential for modern data analysis
- Difficult to create in SAS prior to SAS 9
  - Context lost when statistical procedure terminates
  - Programmer must recreate context, metadata
- Statistical procedures should automatically create graphics
  - Follow the 80/20 rule: 20% of these might need further tweaking, but for the most part…
Life Is Easier in SAS 9 …
    ods graphics on;
    ods html file="lifetest.htm";
    proc lifetest data=surv;
      time surv*censor(1);
      survival plots=(survival hwb);
      strata trt;
      id patient;
    run;
    ods html close;
    ods graphics off;
LIFETEST Procedure – Survival Plot
LIFETEST Procedure – HWB plot
Usage of ODS Statistical Graphics in SAS 9
- Experimental in 30 SAS/STAT and SAS/ETS procedures in SAS 9.1
- Automates creation of commonly used graphical displays for a particular analysis
- Production in SAS 9.2
Procedures Using ODS Graphics
- SAS/STAT procedures: ANOVA, CORRESP, GAM, GENMOD, GLM, KDE, LIFETEST, LOESS, LOGISTIC, MI, MIXED, PHREG, PLS, PRINCOMP, PRINQUAL, REG, ROBUSTREG, TPSPLINE
- SAS/ETS procedures: ARIMA, AUTOREG, ENTROPY, EXPAND, MODEL, SPECTRA, SYSLIN, TIMESERIES, UCM, VARMAX, X12
- SAS High Performance Forecasting: HPF
ODS Graphics Primer
- One statement “turns on graphics”: ODS GRAPHICS ON;
- Procedure options determine “which plot”
- Template determines “what plot looks like”
  - SAS provides default template for each plot
- Style determines “what all my plots look like”
- Destination determines “where my plots go”
PROC ROBUSTREG Templates
[Diagram labels: Table Data, Graph Data; Table Template, Graph Template; ODS Output Object (Data & Template); ODS Output Destination Engine; Statistical Graphic Engine; Styles; destinations HTML, Data Set, RTF, PDF, PostScript]
Graph templates:
- Diagnostics: Stat.Robustreg.Graphics.Diagnostics
- ResidualHistogram: Stat.Robustreg.Graphics.ResidualHistogram
- ResidualQQPlot: Stat.Robustreg.Graphics.ResidualQQPlot
Graphics Supported in ODS Destinations
    Destination   SAS Release
    HTML          9.1
    RTF           9.1
    PRINTER       9.1
    PDF           9.1
    LATEX         9.1
    LISTING       9.2
Histogram of Robust Residuals
Template for Histogram
    proc template;
      define statgraph Stat.Robustreg.Graphics.ResidualHistogram;
        dynamic _DEPLABEL;
        Layout Gridded;
          Layout Gridded / columns=2;
            EntryTitle "Distribution of Robust Residuals for";
            EntryTitle _DEPLABEL;
          EndLayout;
          Layout Overlay / xaxisopts=(label="Robust Residuals")
                           yaxisopts=(label="Percent");
            Histogram RResidual;
            Density RResidual / LegendLabel="Normal Density" name="Normal";
            Density RResidual / Kernel() LinePattern=dashlong
                                LegendLabel="Kernel Density" name="Kernel";
          EndLayout;
          DiscreteLegend "Normal" "Kernel";
        EndLayout;
      end;
    run;
PROC GLM
PROC GLM (ANCOVA)
GAM Procedure
HPF Procedure
KDE Procedure
KDE Procedure
LOESS Procedure
LOGISTIC Procedure
MIXED Procedure
MIXED Procedure
PHREG Procedure
PLS Procedure
PRINCOMP Procedure
REG Procedure
REG Procedure (Simple Regression)
TIMESERIES Procedure
UCM Procedure
UCM Procedure
Integration with ODS Styles
- Over 30 different styles
- New style elements for statistical graphics
  - Fitted line
  - Confidence lines and bands
  - Prediction lines
  - Outliers
  - Classification groups
Style Demonstration
    ods html file="robustreg.htm" style=journal;
    ods graphics on;
    title "Journal Style";
    proc robustreg data=mydata plot=all;
      model y = x1 x2 x3;
    run;
    ods html close;
Styles shown: Journal, Analysis, Default, Statistical (only Summary Statistics and Residual Histogram output shown)
Summary
- Goal is to automate creation of graphics by statistical procedures
  - Minimum work for user
  - Maximum built-in functionality
- Experimental in SAS 9.1
- Production in SAS 9.2
SAS Transactional Storage (aka SAS Database Capabilities)
Demo Time
1. Color_table: remember to start your TableServer
2. Customers: remember to start your AppServer (tomcat5)
SAS Transactional Storage (aka SAS Database Capabilities)
- A more traditional database capability
- From SAS (not Oracle, IBM, or Microsoft)
- Based on the open-source “Firebird”
- Real datatypes: INT, MONEY, VARCHAR
- Real connectors: JDBC, ODBC, SAS LIBNAME
- Real transactions: rollback and commit
- Multi-user server
What’s New in SAS Grid Automation
- Cheryl Doninger, R&D Director, Grid Development
- Roger Thompson, Relationship Manager
- Merry Rabb, Product Manager, Grid
Grid Computing Market Size & Growth
- Rapid Adoption of Grid Computing Based on Benefits
Grid Adoption is Increasing
- A high percentage of firms using analytical applications are considering grid
- 2/3 of firms surveyed are using or considering grid technology
Benefits of Grid Computing
- Faster results
- More executions – more data
- Time to recover from errors
- Better use of resources
- Virtualize resources
- Incremental IT spend
Types of Applications Suitable for Grid
- Long running
- Many replicate runs of same fundamental task
  - simulation (what-if analysis)
  - optimization (testing lots of scenarios)
  - BY GROUP processing
  - data segmentation
- Independent tasks running against large data sources
  - scoring – risk analysis
  - multiple procedures and data steps
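To make the "BY GROUP processing" case concrete, here is a minimal sketch in Python rather than SAS. It assumes each BY group can be processed independently, and it uses a local thread pool as a stand-in for grid nodes. The function names, the `region` and `amount` columns, and the summary task are all hypothetical and do not come from the slides.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_group(rows, key):
    """Partition rows into one list per distinct value of the BY variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def summarize(rows):
    """The 'same fundamental task', run once per group (hypothetical)."""
    values = [r["amount"] for r in rows]
    return {"n": len(values), "mean": sum(values) / len(values)}


def run_by_group(rows, key, task, workers=4):
    """Submit one independent task per BY group and collect the results.

    The thread pool only stands in for grid nodes; a real grid would
    dispatch each group to a separate machine.
    """
    groups = split_by_group(rows, key)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(task, grp) for name, grp in groups.items()}
        return {name: fut.result() for name, fut in futures.items()}


rows = [
    {"region": "east", "amount": 10.0},
    {"region": "west", "amount": 4.0},
    {"region": "east", "amount": 20.0},
    {"region": "west", "amount": 6.0},
]
results = run_by_group(rows, "region", summarize)
```

Because no group needs another group's rows, adding workers shortens elapsed time without changing the results. That independence is the property the slide uses to mark a job as suitable for grid.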
SAS Grid Strategy
- Infrastructure benefits SAS applications
  - large data / complex algorithms
- Focus areas
  - Development
  - Run-time
  - System management
- Incremental Releases
SAS Grid Roadmap Phase I
- SAS 8.2 functionality
  - %Distribute
  - SAS/CONNECT
  - SAS log
SAS Grid Success Stories
- Texas Tech University
- Statistics Canada
- Large Pharmaceutical Company
SAS Grid Roadmap Phase II
- SAS Q3/2005 functionality
  - smarter engines for SAS IDEs
  - SAS/Platform integration
  - SASMC monitoring
Business Analytics - Enterprise Miner on SMP
Business Analytics - Enterprise Miner on Grid
Data Integration – ETL Studio on SMP/Grid
Data Integration – ETL Studio on SMP/Grid
Business Intelligence – Enabled on SMP/Grid
[Diagram labels: SAS Stored Process, SAS Program, ETL Studio, Enterprise Miner, Web Services]
Grid Manager Plugin – job view
Grid Manager Plugin – host view
SAS 9 Grid Computing Components
Multiple Components Working Together to Provide Grid Computing
[Diagram labels: SAS Applications; Piping; Distribution; Session Spawning; Grid Enabled Code Generation; NEW September 2005; Multi-Processor SAS; SAS 9 Grid Computing; Grid Manager Plug-in; Platform Suite for SAS; Grid Monitoring; Grid Management; Job Termination; Dynamic Load Balancing; Job, Queue & Host Management; Enterprise Miner; Stored Processes; Data Integration; SAS Connect]
General Layout of a SAS Grid
[Diagram labels: Client Machine; Metadata Server; Grid Control Machine; Grid Node … n; SAS Grid Machine; Grid Mgr plugin; Platform Suite for SAS; LSF; SAS ETL; SAS EM; SAS Foundation]
Grid Work Flow
[Diagram labels: ETL Studio; Enterprise Miner; Connect Client; SAS Metadata; Metadata Server; Workspace Server; SAS MC; LSF; SAS Servers; Node1, Node2, Node3 … n]
LSF cluster file:
    Node1 ! ! 1 () (SASMain)
    Node2 ! ! 1 () ()
    Node3 ! ! 1 () (SASMain)
    …
SAS Metadata:
    SASMain – Server Context
    Platform Server Component: sas -noobjectserver
Session table:
    session   resource   sascmd                 wl options
    p1        SASMain    sas -noobjectserver
Client code:
    grdsvc_enable(p1, "resource=SASMain");
    signon p1;
Partitioning the Grid
[Diagram labels: EM grid; ETL grid; ETL Studio; Enterprise Miner; Connect Client; SAS Metadata; Metadata Server; Workspace Server; SAS MC; LSF; SAS Servers; Node1, Node2, Node3 … n]
LSF cluster file:
    Node1 ! ! 1 () (SASMain,EM)
    Node2 ! ! 1 () (SASMain,EM,ETL)
    Node3 ! ! 1 () (SASMain,ETL)
    …
SAS Metadata:
    SASMain – Server Context
    Platform Server Component: sas -noobjectserver; workloads EM, ETL
Session table:
    session   resource   sascmd                 wl options
    p1        SASMain    sas -noobjectserver    ETL
Client code:
    grdsvc_enable(p1, "resource=SASMain, workload=ETL");
    signon p1;
Grid Provides: Speed and Efficiency
Analytics are working, so people…
- Build more models
  - For successively refined segments of customers
- Use more data in those models
- Integrate the results into operational systems
A SAS 9.2 DATA step movie
Implications
- More multi-thread enablement within SAS
  - Yes, even the DATA step
- Saved Programs
- Multi-threaded server capabilities
  - Same model, parallel data for throughput
  - Many models, same data: one-off scores in operational systems
- Models Management can deploy models to “score servers” without restarting them
Bits and Pieces
- Reverse-engineer SAS jobs
- Checkpoint and restart SAS jobs
- Encode (and protect) your SAS jobs
- ZIP functions
- CRC
- …
Protect your IP
    PROC SCRAMBLE file='myfile.sas' outfile='secret.sas' … ;
- Send secret.sas to your customers
    %include 'secret.sas';
- Implies nosource; your macros can reset NOMPRINT…
Checkpoint/Restart and Parallelization Features in the Core Supervisor
Rick Langston, Core Systems Department
Checkpoint/Restart
- Craig R.’s request as per user community
- Job fails – want to restart where it left off
- ETL Studio also wanted a restart facility
A simple solution
- Record a checkpoint number, save it in WORK
- If restarting, skip PROC / DATA steps to there
- Tokenize everything
- Execute all global statements
To set up for checkpointing
- Use NOWORKINIT, NOWORKTERM
- Have WORK refer to a permanent directory
- Use the CHECKPOINT option
Subsequent restarting
- Again use NOWORKINIT, NOWORKTERM
- Again point WORK to the permanent directory
- Use the RESTART option
- Job will restart as of the last successful step
Is this what users want?
- We can’t do this without the user being proactive
- data temp / set temp issues
- Skipped steps may need to be executed
- Output files (flat files – DISP=MOD, databases…)
EXECUTE_ALWAYS
    CHECKPOINT / EXECUTE_ALWAYS;
- Use it for a step that must be executed
- For example, SYMPUT and CALL EXECUTE
Example
- Using options debug='checkpoint-implicit';
- Option names still to be decided
    data temp1; x=1; run;
    data temp2; x=2; run;
    data temp3; x=3; run;
    data _null_;
      if "&sysparm."="1" then abort abend 999;
    run;
    data temp4; x=4; run;
Invoke once with checkpoint-implicit, then reinvoke with restart-implicit.
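The checkpoint/restart idea on the preceding slides can be modelled outside SAS. The Python sketch below is only an illustration of the described mechanism: record the number of the last successful step, skip completed steps on restart, and still run any step flagged as execute-always. It is not the SAS implementation. The JSON checkpoint file, the `run_job` function, and the `symput` step are hypothetical.

```python
import json
import os
import tempfile


class StepFailed(Exception):
    """Raised by a step to simulate 'abort abend 999'."""


def run_job(steps, checkpoint_path, restart=False):
    """Run steps in order, recording the number of the last successful step.

    steps: list of (name, function, execute_always) tuples.
    Returns the names of the steps that executed in this invocation.
    """
    last_done = 0
    if restart and os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            last_done = json.load(fh)["last_done"]
    executed = []
    for number, (name, func, execute_always) in enumerate(steps, start=1):
        if number <= last_done and not execute_always:
            continue  # finished in an earlier run: skip it
        func()  # an exception here leaves the checkpoint at the last success
        executed.append(name)
        if number > last_done:
            with open(checkpoint_path, "w") as fh:
                json.dump({"last_done": number}, fh)
    return executed


def make_steps(fail):
    """Loosely mirrors the slide example, plus one execute-always step."""
    def maybe_abort():
        if fail:
            raise StepFailed("abort abend 999")
    return [
        ("symput", lambda: None, True),   # hypothetical execute-always step
        ("temp1", lambda: None, False),
        ("temp2", lambda: None, False),
        ("temp3", lambda: None, False),
        ("check", maybe_abort, False),
        ("temp4", lambda: None, False),
    ]


# The permanent location plays the role of a WORK library that survives the job.
checkpoint_file = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
try:
    run_job(make_steps(fail=True), checkpoint_file)  # first run fails at "check"
except StepFailed:
    pass
rerun = run_job(make_steps(fail=False), checkpoint_file, restart=True)
```

On the second invocation only the execute-always step, the previously failed step, and the remaining step run. The sketch does not address the harder issues the slides raise, such as steps that overwrite their own input or output files that were partially written.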
Additional info
- Planned for 9.2
- Option names still being decided
- Wanting additional input
Parallelization Efforts
- Reading in arbitrary SAS code
- Producing metadata in comments
- This could be post-processed by ETL Studio
- This could be post-processed by Grid Computing
Parallelization Efforts
- Researching so far
- Hooks in dependency opens
  - Catalogs, flat files, SAS data sets, etc.
- Emitting info in comments
- Example of use
Exposure to User
- New option, such as DEPMETA=fileref
- SAS program with comments written to this file
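The slides do not say what the emitted dependency metadata looks like or how a post-processor would use it. As a rough illustration only, the Python sketch below assumes each step has been reduced to the set of data sets it reads and writes, and that each data set is written by exactly one step. It then groups steps into stages that could run in parallel. The `reads`/`writes` format, the step names, and `parallel_stages` are hypothetical.

```python
def parallel_stages(steps):
    """Group steps into stages whose members do not depend on each other.

    steps: dict mapping step name -> {"reads": set, "writes": set}.
    Assumes every data set has a single writer (a simplification).
    """
    # Map each data set to the step that produces it.
    producers = {}
    for name, meta in steps.items():
        for target in meta["writes"]:
            producers[target] = name
    # A step depends on the producers of everything it reads.
    depends_on = {
        name: {producers[src] for src in meta["reads"] if src in producers} - {name}
        for name, meta in steps.items()
    }
    stages, done = [], set()
    while len(done) < len(steps):
        ready = sorted(n for n in steps if n not in done and depends_on[n] <= done)
        if not ready:
            raise ValueError("circular dependency between steps")
        stages.append(ready)
        done.update(ready)
    return stages


steps = {
    "step1": {"reads": {"raw.sales"}, "writes": {"work.sales"}},
    "step2": {"reads": {"raw.customers"}, "writes": {"work.customers"}},
    "step3": {"reads": {"work.sales", "work.customers"}, "writes": {"work.joined"}},
    "step4": {"reads": {"work.joined"}, "writes": {"out.report"}},
}
stages = parallel_stages(steps)
```

Here `step1` and `step2` share no inputs or outputs, so they land in the same stage and could be dispatched together. `step3` and `step4` each wait for the stage before them.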
Questions/comments?
Ideas for the Future!
- How can the software learn?
  - So the user doesn’t have to learn about the software; they can learn the business!
- Some future ETL Studio job
  - Remembers data volumes from last week’s run
  - Uses that memory to choose a better strategy
Your Turn!!
- You tell me next time SAS forgets something it should have remembered
- And why remembering that would help SAS improve next time
Thanks for listening!