R and the Cloud Cloud Era Ltd 27 June 2013 Karim Chine Deuxièmes rencontres R.

Slides:



Advertisements
Similar presentations
Immersive Teaching and Research in Data Sciences via Cloud Computing Cloud Era Ltd 13 June 2013 Karim Chine.
Advertisements

Leveraging scriptable infrastructures, Towards a paradigm shift in software for data science Cloud Era Ltd 14 June 2013 Karim.
Elastic-R A cloud platform for web computing, real-time collaboration, rapid applications development and reproducible modelling Karim Chine Cloud Era.
Introduction To Java Objectives For Today â Introduction To Java â The Java Platform & The (JVM) Java Virtual Machine â Core Java (API) Application Programming.
, Towards a universal platform for research and education in the cloud Karim Chine, Cloud Era Ltd 13 December 2013 Elastic-R e-Age 2013.
Emerging Platform#6: Cloud Computing B. Ramamurthy 6/20/20141 cse651, B. Ramamurthy.
© Prentice Hall CHAPTER 5 Organizational Systems.
Embedding - for better collaborations - Toshiyuki Takahei RIKEN.
V1.00 © 2009 Research In Motion Limited Introduction to Mobile Device Web Development Trainer name Date.
Web-Enabling the Warehouse Chapter 16. Benefits of Web-Enabling a Data Warehouse Better-informed decision making Lower costs of deployment and management.
Be Smart, Use PwrSmart What Is The Cloud?. Where Did The Cloud Come From? We get the term “Cloud” from the early days of the internet where we drew a.
M.A.Doman Model for enabling the delivery of computing as a SERVICE.
Innovative Data Integration Foundation to Business Intelligence Tiara USA (San Ramon, CA) | India (Chennai) | Singapore India Market.
Cloud computing Tahani aljehani.
Data-intensive Computing on the Cloud: Concepts, Technologies and Applications B. Ramamurthy This talks is partially supported by National.
Amazon EC2 Quick Start adapted from EC2_GetStarted.html.
TPAC Digital Library Talk Overview Presenter:Glenn Hyland Tasmanian Partnership for Advanced Computing & Australian Antarctic Division Outline: TPAC Overview.
What the Cloud can do for Computational Life Sciences: Biocep-R's Unified Perspective Karim Chine
Clouds on IT horizon Faculty of Maritime Studies University of Rijeka Sanja Mohorovičić INFuture 2009, Zagreb, 5 November 2009.
Introduction to Cloud Computing
Cloud Computing.
Portal and AQAS-Philadelphia University 21-22/6/2011 AVCI Platform in PU Dr. Abdel-Rahman Al-Qawasmi Philadelphia University Director of Computer Center.
February Semantion Privately owned, founded in 2000 First commercial implementation of OASIS ebXML Registry and Repository.
Chapter 4 Computer Software.
By Mihir Joshi Nikhil Dixit Limaye Pallavi Bhide Payal Godse.
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space The Capabilities of the GridSpace2 Experiment.
Cloud Computing. What is Cloud Computing? Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable.
Cloud Computing 1. Outline  Introduction  Evolution  Cloud architecture  Map reduce operation  Platform 2.
CSCI 115 Computer Programming Overview. Computer Software System Software –Operating systems –Utility programs –Language compilers Application Software.
Introduction to Cloud Technology StratusLab Tutorial (Orsay, France) 28 November 2012.
SCI-BUS is supported by the FP7 Capacities Programme under contract nr RI CloudBroker Platform integration into WS-PGRADE/gUSE Zoltán Farkas MTA.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Overview of Atmosphere.
M.A.Doman Short video intro Model for enabling the delivery of computing as a SERVICE.
Presented by: Sanketh Beerabbi University of Central Florida COP Cloud Computing.
GumTree Feature Overview Tony Lam Data Acquisition Team Bragg Institute eScience Workshop 2006.
James Williams e: eTutor Project SUMMARY OF KEY FINDINGS for 2 Pilot studies of the.
Ch 1. A Python Q&A Session Spring Why do people use Python? Software quality Developer productivity Program portability Support libraries Component.
Plan  Introduction  What is Cloud Computing?  Why is it called ‘’Cloud Computing’’?  Characteristics of Cloud Computing  Advantages of Cloud Computing.
August 2003 At A Glance VMOC-CE is an application framework that facilitates real- time, remote cooperative work among geographically dispersed mission.
Visualization Workshop David Bock Visualization Research Programmer National Center for Supercomputing Applications - NCSA University of Illinois at Urbana-Champaign.
Experiment Management with Microsoft Project Gregor von Laszewski Leor E. Dilmanian Acknowledgement: NSF NMI, CMMI, DDDAS
Australian Nuclear Science & Technology Organisation GumTree A Java Based GUI Framework for Beamline Experiments Tony Lam (ANSTO) Andy Götz (ESRF) Ferdi.
Convert generic gUSE Portal into a science gateway Akos Balasko 02/07/
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Machine Learning as a Service
Datalayer Notebook Allows Data Scientists to Play with Big Data, Build Innovative Models, and Share Results Easily on Microsoft Azure MICROSOFT AZURE ISV.
PaaSport Introduction on Cloud Computing PaaSport training material.
Paperless Timesheet Management Project Anant Pednekar.
IBM Software Group ® Managing Reusable Assets Using Rational Suite Shimon Nir.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Cloud computing Cloud Computing1. NIST: Five essential characteristics On-demand self-service Computing capabilities, disks are demanded over the network.
Introduction to PresentED 6/2014. PresentED is a software solution merging Video & Presentation, Attachments & Links in a single, powerful and uniform.
CSCI 115 Computer Programming Overview. Computer Software System Software –Operating systems –Utility programs –Language compilers Application Software.
Document Name CONFIDENTIAL Version Control Version No.DateType of ChangesOwner/ Author Date of Review/Expiry The information contained in this document.
1 TCS Confidential. 2 Objective : In this session we will be able to learn:  What is Cloud Computing?  Characteristics  Cloud Flavors  Cloud Deployment.
Built on the Microsoft Azure Platform, UberCloud Helps Engineers and Software Providers to Offer and Deploy Powerful Cloud Services On Demand MICROSOFT.
1 NETE4631 Using Google Web Services Lecture Notes #6.
ENEA GRID & JPNM WEB PORTAL to create a collaborative development environment Dr. Simonetta Pagnutti JPNM – SP4 Meeting Edinburgh – June 3rd, 2013 Italian.
CyberGIS Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
JIA’2016 March 25th 2016 A platform at the crossroads of data science, big data and IoT.
Technologies For Creating Rich Internet Applications Presenter's name
Google. Android What is Android ? -Android is Linux Based OS -Designed for use on cell phones, e-readers, tablet PCs. -Android provides easy access to.
Cloud Computing for Science
VisIt Project Overview
Introduction to Cloud Technology
IBM Predictive Analytics Virtual Users’ Group Meeting March 30, 2016
Integrating Scientific Tools and Web Portals
The Improvement of PaaS Platform ZENG Shu-Qing, Xu Jie-Bin 2010 First International Conference on Networking and Distributed Computing SQUARE.
CNIT131 Internet Basics & Beginning HTML
Module 01 ETICS Overview ETICS Online Tutorials
Presentation transcript:

R and the Cloud Cloud Era Ltd 27 June 2013 Karim Chine Deuxièmes rencontres R

22 Outline Introduction Rethinking virtual research and teaching Elastic-R: Towards a universal platform for data science Elastic-R: Design and technologies overview Elastic-R: The scriptability framework Demo Conclusion

33 Introduction Science and the 4th paradigm The e-Research dancing bears Experimental Science Theoretical Science Computational Science e-Science / Data-intensive Science The townspeople gather to see the wondrous sight as the massive, lumbering beast shambles and shuffles from paw to paw. The bear is really a terrible dancer, and the wonder isn't that the bear dances well but that the bear dances at all.

4 Introduction * The rise of data science Data science incorporates varying elements and builds on techniques and theories from many fields, including math, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing with the goal of extracting meaning from data and creating data products

5 Introduction office.microsoft.com  Open-source (GPL) software environment for statistical computing and graphics  Lingua franca of data analysis.  Repositories of contributed R packages related to a variety of problem domains in life sciences, social sciences, finance, econometrics, chemo metrics, etc. are growing at an exponential rate.  R is Super Glue Fragmentation and friction in the data science arena

6 Introduction Arduino / Raspberry pi Democratizing electronics Elastic-R Democratizing data science The Next Generation Data Science Platform

77 Introduction On-demand self service Rapid elasticity Resource pooling Mesured service Broad network access 5 Essential Characteristics The Cloud and its capabilities

88 Introduction The cloud and its capabilities

9 Introduction 3D printers are becoming a common place:  Creating three-dimensional solid object of virtually any shape from a digital model.  Scripting the physical world  Sharing physical reality on Facebook is now as easy as sharing your holiday pictures The cloud and its capabilities

10 Rethinking virtual research and teaching Free researchers from their IT services dictatorship. Give them self-service access to the IT resources they need Make real-time collaboration a free, reliable and ubiquitous service Allow Researchers to share without restrictions. Make the cloud become an ecosystem for Open Science where all research artifacts can be produced, discovered and reused Allow researchers to produce and publish to the web advanced applications/services without recourse to developers/admins

11 Rethinking virtual research and teaching Provide platforms for Science-as-a-Service Allow Researchers to « sell » the software/models/algorithms/techniques they invent seamlessly: Create a market place for data science artifacts and application Provide capabilities for making data analysis and computational research traceable and reproducible Bridge the gap between the different computational research tools: interconnect SCEs, workflow workbenches, Documents editors…

12 Rethinking virtual research and teaching Provide affordable and reliable tools for remote education  High-quality voice and video chat for a large number of users  Self-service collaboration tools: Editors, White boards, IDEs, etc.  Modules for Traceability/reproducibilty Extend existing on-line courses platforms to include capabilities such as:  Companion software environments in SaaS mode  Collaborative problem solving tools  Interactive courses  Tokens for Ready-to-run e-Learning applications  E-Learning environments’ visual designers

13 Elastic-R: Towards a universal platform for data science Computational Components R packages, Wrapped C,C++,Fortran code, Python modules, Matlab Toolkits… Open source or commercial Computational Resources Clusters, grids, private or public clouds Free or pay-per-use Computational GUIs HTML5 and Desktop Workbench Built-in views /Plugins /Collaborative views Open source or commercial Computational Scripts R / Python / Matlab / Groovy Computational APIs Java / SOAP / REST, Stateless and stateful Computational Storage Local, NFS, FTP, Amazon S3, EBS Generated Computational Web Services Stateful or stateless, mapping of R objects/functions Elastic -R

14 Elastic-R: Towards a universal platform for data science Robot submarine dives to the deepest part of the ocean controlled by a 7-mile cable as thin as single human hair

15 Elastic-R: Towards a universal platform for data science

16 Elastic-R: Towards a universal platform for data science Public Clouds Private Cloud

17 Elastic-R: Design and technologies overview You are here Your data is here

18 Elastic-R: Design and technologies overview Remote Java/R Processes  Events-driven Remote Objects/Engines  R, Python, Mathematica, Matlab, Scilab,...  Collaborative Spreadsheets  Collaborative Scientific Graphics Canvas  Collaborative Dashboard with collaborative widgets

19 Elastic-R: Design and technologies overview Elastic-R AMI 1 R 2.10 BioC 2.5 Elastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R AMI 3 R 2.8 BioC 2.0 Elastic-R Amazon Machine Images Elastic-R EBS 1 Data Set XXX Elastic-R EBS 2 Data Set YYY Elastic-R EBS 3 Data Set ZZZ Elastic-R EBS 4 Data Set VVV Elastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R EBS 4 Data Set VVV Amazon Elastic Block Stores Eastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R.org Elastic-R EBS 4 Data Set VVV

20 Elastic-R: Design and technologies overview Individuals with AWS accounts  Standard AMIs (Amazon Machine Images): paid-per-use to Amazon. For Academic use and trial purposes  Paid AMI: Paid-per-use software model. For business users Individuals without AWS accounts  Trial tokens, purchased tokens, tokens granted by other users.  Resources (data science engines) shared by other users  Individual subscribtions Companies/Educational & Research Institutions  Dedicated Platform and AMIs on an Amazon VPC (Virtual Private Cloud). Paid via subscribtion Modes of access

21 Elastic-R: The scriptability framework Command Line Web Console SDK API

22 Elastic-R: The scriptability framework Command Line Web Console SDK API

23 Elastic-R: The scriptability framework WS generator rws.war + mapping.jar + pooling framework + R Java Bridge + JAX-WS - Servlets - Generated artifacts Eclipse Web Service Client Generator public static void main(String[] args) throws Exception { RGlobalEnvFunctionWeb g=new RGlobalEnvFunctionWebServiceLocator().getrGlobalEnvFunctionWebPort(); RNumeric x=new RNumeric(); x.setValue(new Double[]{6.0}); System.out.println(g.square(x).getValue()[0]); } square  function(x) {return(x^2) } typeInfo(square)  SimultaneousTypeSpecification( TypedSignature(x = "numeric"), returnType = "numeric")‏ Script / globals.r Script / rjmap.xml Deploy

24 Elastic-R: The scriptability framework API: Soap and Restful Web Services  Data analysis engines control  Data analysis engines management (life cycle, etc.)  Virtual appliances/artifacts management  Platform administration  Generated web services from R functions SDKs: Java, R, Microsoft Office (Vba) Command line: elasticR package Html5 Workbench  Elastic-R artifacts management interface  Engines control interface

25 Demo Register to Elastic-R academic and trial portal ( ) Create data science engines using trial tokens Work with R, Python and scientific Spreadsheets in the browser Share Data Science Engine and Collaborate Use The Visual and Collaborative Scientific Applications designer to create and publish to the web an interactive dashboard Connect to the remote Data Science Engine from withing a local R session, push and pull data, execute commands and show impact on the dashboard

26 Conclusion Elastic -R unlocks the potential of the cloud for Data scientists and educators With Elastic-R, the cloud becomes a cyberspace for collaborative research and sharing and an eco-system suited for open Science, open innovation and open education Elastic-R improves dramatically the productivity of the data scientists: The entire data science factory chain, from resources acquisition to services and applications publishing, becomes under their direct control Elastic-R provides Analytics-as-a-Service platform that can extend any existing portal or application

27 What to do Next Register to Elastic-R and try the HTML 5 Workbench and the collaboration Download the R package elasticR and use it to access the cloud from local R sessions Download the Java SDK and try to create your first Analytical application using AWS and the most advanced tools for programming with data. Get in touch with me to explore potential collaborations

28 Contact details Karim Chine