Dane Stubben QuintilesIMS Database Manager

Presentation on theme: "Dane Stubben QuintilesIMS Database Manager"— Presentation transcript:

1 Microsoft R Explained
Dane Stubben
QuintilesIMS Database Manager
Dane@KelsandDane

2 TAILGATE
When: Between sessions
Where: Parking lot
Why:
  Get up
  Move around
  Network
  Grab a drink or snack

3 What is R?
Powerful statistical programming language
Data visualization tools
Scalable to big data
Most widely used data analysis software
Used by 2M+ data scientists, statisticians, and analysts
Thriving open-source community
Leading edge of analytics research
Provides a suite of operators for calculations on arrays, lists, vectors, and matrices
Provides graphical facilities for data analysis and display
11/14/2018 | @2016 KelsandDane All Rights Reserved
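
The operators on vectors, matrices, and lists mentioned on this slide can be illustrated with a few lines of base R. This is only a minimal sketch; the variable names and values are made up for illustration and do not come from the talk.

```r
# Vectors: arithmetic operators work element-wise
x <- c(1, 2, 3, 4)
y <- x * 2 + 1                     # 3 5 7 9

# Matrices: %*% is matrix multiplication, t() transposes
m  <- matrix(1:6, nrow = 2)        # 2 x 3 matrix, filled column by column
mm <- m %*% t(m)                   # 2 x 2 result

# Lists can hold vectors of different lengths;
# sapply() applies a function to each element
scores <- list(a = c(90, 80), b = c(70, 60, 50))
means  <- sapply(scores, mean)     # named vector: a = 85, b = 60

print(y)
print(mm)
print(means)
```

The same element-wise style carries over to data frames, which is a large part of why R code for data analysis tends to be short.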

4 History of R
1993: Research project
  Ross Ihaka and Robert Gentleman, Auckland, NZ
1995: Open-source release
  Compatible w/ the S statistical language (developed at Bell Labs)
1997: R Development Core Team
2000: R 1.0.0 release
2003: R Foundation
2004: First international user conference
2007: Revolution Analytics founded
2013: Revolution R Open release
2015: Microsoft acquires Revolution Analytics

5 Microsoft R Editions
Microsoft R Open
  Open-source R distribution
  Enhanced and distributed by Revolution Analytics
SQL Server R Services or R Services (In-Database)
  Built-in advanced analytics
  Standalone server capability
  Integrated w/ SQL Server 2016 Enterprise
  Used to develop and deploy R packages in a development environment
Microsoft R Server
  Microsoft R Server for Red Hat Linux, SUSE Linux, Teradata DB, Hadoop on Red Hat
  Microsoft R Server Developer Edition

6 Microsoft R Editions
Microsoft R Client
  Separate, free installer
  Develop solutions that can be deployed to R Services (In-Database) or Microsoft R Server running on Windows, Teradata, or Hadoop
Microsoft Data Science Virtual Machine
  Azure VM pre-installed and configured with common data analytics and machine learning tools:
    Microsoft R Server Developer Edition
    Anaconda Python distribution
    Jupyter notebook (w/ R, Python kernels)
    Visual Studio Community Edition
    Power BI Desktop
    SQL Server 2016 Developer Edition
    Machine learning tools:
      Computational Network Toolkit (CNTK)
      Vowpal Wabbit
      XGBoost
      Rattle
      MXNet
    Libraries in R and Python for Azure Machine Learning
    Git

7 I Did The Math
Microsoft R Open
  Multi-threaded
Microsoft R Server
  Capacity
  Handles large datasets and models
  Speed
  Overcome R's traditional memory limits
  Parallelize across cores and nodes
  Minimize data movement w/ in-database

8 RStudio

9 Microsoft R Client

10 R Services (In-Database)

11 R Libraries / Components
ScaleR
  Collection of proprietary functions in Microsoft R Client and R Server used for practicing data science at scale
  Works on both small and large datasets
  Enables analysis of very large datasets that would otherwise exceed the memory and processing capabilities of the machine
DeployR
  Turns R scripts into analytic web services
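
The idea of analyzing data that does not fit in memory can be sketched with chunked processing: read a block of rows, update running totals, discard the block, repeat. The example below is not ScaleR and does not use its API; it uses only base R, and the helper name `chunked_mean`, its parameters, and the temporary CSV file are hypothetical, chosen purely for illustration.

```r
# Hypothetical helper: mean of one numeric CSV column, reading the file
# in fixed-size chunks so only `chunk_rows` rows are in memory at once.
chunked_mean <- function(path, column, chunk_rows = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # First line holds the column names (assumes a simple, unquoted header)
  header <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]

  total <- 0
  n <- 0
  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0) break            # end of file

    chunk <- read.csv(text = lines, header = FALSE, col.names = header)
    total <- total + sum(chunk[[column]])    # update running sum
    n <- n + nrow(chunk)                     # update running row count
  }
  total / n
}

# Usage on a small temporary file (stand-in for a much larger one)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = seq(10, 100, by = 10)),
          tmp, row.names = FALSE, quote = FALSE)
result <- chunked_mean(tmp, "value", chunk_rows = 3)
print(result)
```

Only statistics that can be updated incrementally (sums, counts, cross-products) fit this pattern directly, which is why scale-out libraries generally ship their own versions of summaries and model-fitting routines.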

12 Microsoft Data Science Virtual Machine

13 Advanced Analytics w/ Data Science

14 Data Science Focus
Big Data Engineering
Data Visualization
Advanced Analytics
Cybersecurity
Healthcare
  Preventative Policies (Reactive vs Proactive)
  Virulent outbreak
  Intervention Deployment
  Personalize Healthcare Delivery
Artificial Intelligence / Machine Learning
  Deep Learning
  Computer Vision
  Natural Language Processing
  Autonomous Systems (Robots!)
GIS
Research
Politics
Finance
Marketing
Education
Sports Analytics

15 Data Science Skillset
Mathematics / Statistics: Linear Algebra, Statistics, Logic, Discrete Optimization, Calculus
Computer/DB Languages (Computer): R, Python, Spark / Scala, Data Visualization, Machine Learning, Artificial Intelligence, Julia, Java, SAS, T-SQL, Tableau, MATLAB, Octave
Computer/DB Languages (Database): SQL, Hadoop, Oracle, MongoDB
Business Skills: Data Modeling, Communication, Analytical Curiosity, Continuous Learner, Tenacity, Adaptability
Tools: SSIS / BI, RStudio, Microsoft R Open, Microsoft R Client, Microsoft R Server, R Services (In-Database), Microsoft Data Science VM

16 Build-out Training Environment
R Environment
Microsoft Environment Route:
  SQL R Services – Download and Install Feature in SQL Server 2016 Developer
  Microsoft SQL R Developer – Download and Install
R System Route:
  R – Download and Install
  OR
  RStudio Server – Download and Install

17 Build-out Training Environment
R Client
Microsoft Environment Route:
  Microsoft R Client – Download
  RStudio – Download and Install
R System Route:

18 Build-out Training Environment
R Examples
  MRAN Tutorials
  R Services (In-Database) Data Science Walkthroughs
  R Examples

19 Microsoft R References
MRAN Packages
R Project for Statistical Computing
Comprehensive R Archive Network (CRAN)
R Services for SQL Server 2016 (YouTube)
R Services for SQL Server 2016 KB
Data Science with SQL Server R Services

20 Community and Learning
Twitter
  @MicrosoftR
  @revoDavid
  @RBloggers
  @BecomingDataSci
  @RWomenTaskForce
Online Community
  PASS Big Data Virtual Chapter
  PASS Data Science Virtual Chapter
  PASS Women in Technology Virtual Chapter
  Microsoft R Server Tiger Team <- How cool of a team name is that?
Online Training
  DataCamp
  Pluralsight
  Coursera
  edX
  Lynda.com <- Omaha/Lincoln libraries have free access
  Microsoft Virtual Academy
  MIT

21 Becoming a Data Scientist References
Data Science Learning Camp
Doing Data Science
The Data Science Handbook

22 Thank our Sponsors!

23 Questions?
Dane Stubben
URL: KelsandDane
Event & Session Evals – ONLINE ONLY
Event:
Session:

