Empowering teams with scalable Shiny applications

Slides:



Advertisements
Similar presentations
How We Manage SaaS Infrastructure Knowledge Track
Advertisements

DevOps and Private Cloud Automation 23 April 2015 Hal Clark.
INTRODUCTION TO CLOUD COMPUTING Cs 595 Lecture 5 2/11/2015.
Introduction to Amazon Web Services (AWS)
This presentation will guide you though the initial stages of installation, through to producing your first report Click your mouse to advance the presentation.
Oracle Application Express (Oracle APEX)
Getting to Push Button Deploys Moovweb January 19, 2012.
Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.
Lecture 8 – Platform as a Service. Introduction We have discussed the SPI model of Cloud Computing – IaaS – PaaS – SaaS.
Customized cloud platform for computing on your terms !
1 G A A new Document Control System “A new system to manage LIGO documents” Stuart Anderson Melody Araya David Shoemaker 29 September, 2008
Cloud Computing & Amazon Web Services – EC2 Arpita Patel Software Engineer.
608D CloudStack 3.0 Omer Palo Readiness Specialist, WW Tech Support Readiness May 8, 2012.
Open Search Office Web Services Database Doc Mgt Sys Pipeline Index Geospatial Analysis Text Search Faceting Caching Query parsing Clustering Synonyms.
Amazon Web Services MANEESH MOHANAVILASAM. OLD IS GOLD?...NOT Predicting peaks Developing partnerships Buying and maintaining hardware Upgrading hardware.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
1 Isolating Web Programs in Modern Browser Architectures CS6204: Cloud Environment Spring 2011.
Cloud Computing from a Developer’s Perspective Shlomo Swidler CTO & Founder mydrifts.com 25 January 2009.
Amazon Web Services. Amazon Web Services (AWS) - robust, scalable and affordable infrastructure for cloud computing. This session is about:
Deploying Elastic Java EE Microservices in the Cloud with Docker
If it’s not automated, it’s broken!
242: Get Your Head in the Cloud!
Progress Apama Fundamentals
Segments Introduction: slides 2–6, 8 10 minutes
Joonas Sirén, Technology Architect, Emerging Technologies Accenture
Nithyamoorthy S Core Mind Technologies
Deployment Architectures For Containers
4/24/ :07 PM © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN.
100% Exam Passing Guarantee & Money Back Assurance
Dag Toppe Larsen UiB/CERN CERN,
Docker and Azure Container Service
Deploy, Manage, and Scale Your Apps with OpsWorks, Elastic Beanstalk, and CodeDeploy Part 1 – Elastic Beanstalk © 2017 Amazon Web Services, Inc. and.
Infrastructure Orchestration to Optimize Testing
Dag Toppe Larsen UiB/CERN CERN,
Google App Engine Mandeep Singh (37926)
Platform as a Service.
4th Forum How to easily offer your application as a self-service template by using OpenShift and GitLab-CI 4th Forum Alberto.
AWS COURSE DEMO BY PROFESSIONAL-GURU. Amazon History Ladder & Offering.
Printer Admin Print Job Manager
AWS DevOps Engineer - Professional dumps.html Exam Code Exam Name.
2018 Amazon AWS DevOps Engineer Professional Dumps - DumpsProfessor
Get Amazon AWS-DevOps-Engineer-Professional Exam Real Questions - Amazon AWS-DevOps-Engineer-Professional Dumps Realexamdumps.com
Buy September 2018 Valid Amazon AWS-SysOps Dumps Questions - Amazon AWS-SysOps Braindumps Realexamdumps.com
Solving ETL Bottlenecks with SSIS Scale Out
Azure Container Instances
X in [Integration, Delivery, Deployment]
Tapping the Power of Your Historical Data
Winter 2016 (c) Ian Davis.
Simplified Development Toolkit
Microsoft Virtual Academy
Google App Engine Ying Zou 01/24/2016.
AWS Cloud Computing Masaki.
Docker in AWS ECS.
JENKINS TIPS Ideas for making your life with Jenkins easier
Versioning system (e.g. github) holding code artifacts like war files
Distributed System using Web Services
Configuration management suite
Introduction to Docker
OpenShift as a cloud for Data Science
Roots/Git to Deploy What is continuous integration and continuous delivery How they are used at the Innovation Co-Lab Victor Wang, Software Engineer &
NIEM Tool Strategy Next Steps for Movement
OpenStack Summit Berlin – November 14, 2018
Distributed System using Web Services
Azure Container Service
Harrison Howell CSCE 824 Dr. Farkas
Erik Vollebekk Application Architect
Deploying machine learning models at scale
Containers and DevOps.
Thanks to our Sponsors Platinum Sponsor: Gold Sponsors:
Presentation transcript:

Empowering teams with scalable Shiny applications Data for all Empowering teams with scalable Shiny applications

Data culture In many organisations, data scientists find themselves responsible for introducing a data culture in an environment where data was not used historically One strategy for doing so involves making data accessible to teams across the organisation in a way that is easy to understand and use Internal dashboards are a great way of exposing data to teams at the right level – not too raw, but not so summarised as to be unactionable Our tool of choice for building internal dashboards is Shiny

No-ETL No ETL required – queries run directly against the analytics database Pro: data scientists work independently of engineers as they can simply code any extra complexity (e.g. cohorting) into the dashboard directly Con: there's potential for long-running queries that would lock up the dashboard for other users

Shiny tech challenges How do we: Allow for potentially long-running reactive() expressions Support 10s or 100s of concurrent users Package all this up for cloud deployment via a DevOps pipeline Integrate with an existing user account system (e.g. Exchange)

Asynchronicity and concurrency How we use futures and promises to support long-running queries and many concurrent users

Promises R is traditionally single-threaded This restriction is inherited by Shiny Whilst we wait for a DB query to return, our dashboard stalls for other users Enter promises and futures! Promises represent a value that will be ready later We can do other work whilst waiting In a dashboard context, we use promises to represent DB queries that retrieve a data frame, and are started with the "Apply filters" button

Futures Promises alone don’t fix our problem They represent the “promise” of an eventual value but don’t give us any way to actually do work elsewhere to retrieve it Futures are a concrete method of fulfilling promises by executing code in a separate R interpreter instance They meet the spec for promises, so we can use them like so: future({ slow_db_query() }) %...>% print()

Futures and promises in Shiny Support has been rolling out into the ecosystem over the last few years Fully supported by reactive expressions and render functions

Packaging with Docker and Packrat How do we take our source code plus dependencies and package it all up into a single standalone container?

Packrat Solves the problem of managing package dependencies for an R project Maintains a manifest file, packrat.lock, describing exactly which versions of what packages your project needs Saves your colleagues from the inevitable hell of coordinating package upgrades Just updated to latest from your dashboard repo? Simply run packrat::restore()

Docker Allows us to package an application up with all of its system-level dependencies, even the OS You write a Dockerfile, which describes how to build the service These containers can then be downloaded and run via several different methods: Standalone, with docker run $your_container_name On an orchestration platform, like Kubernetes or ECS More on this later!

Docker – pseudo-Dockerfile steps Use Ubuntu 18 as a base Install R and packrat Copy our packrat.lock into the container, and run packrat::install() Copy the rest of our dashboard source code Record that we’re exposing port 3838 for incoming connections

Deploying to the cloud With our dashboard packaged up, how do we get it deployed somewhere? What other services can we make use of to solve some of our problems?

Amazon Web Services Most popular cloud provider Basic services like Elastic Compute Cloud (EC2), for manually provisioning VMs Sometimes useful, but we want a little less work to do Value-add services like Elastic Container Service (ECS)

Elastic Container Service (ECS) Allows us to orchestrate the deployment of applications we’ve bundled up into Docker containers First, we push our dashboard container image to Amazon’s private container registry Then we can represent each dashboard as a service in ECS Upgrades are simple: just update the service definition with the latest container ID, and ECS will roll out the change for you

Scaling on ECS A single service is fine, but do we handle situations where we have more users than one instance can handle? A service can have a configurable number of tasks For static use cases, you can just set a number of tasks to run For complex or variable situations, you can auto-scale your task count based on metrics like current CPU usage This gives us additional ability to handle traffic beyond even what the aforementioned futures and promises give us

Network infrastructure Services can integrate with an Application Load Balancer (ALB) that's tied to a domain, like some-dashboard.my-company.com Now, users can simply navigate to that address and browse the dashboard! To lock this down to just our authenticated users, we utilise another service called Cognito that integrates with our user account database and handles the sign-in process for us

Application Load Balancer Users User loads https://dashboard.my-company.com in a browser Application Load Balancer ALB passes the request to Cognito Account DB Cognito If the user is already authenticated, forward the request to Shiny! Otherwise, prompt the user to sign in Dashboard service in ECS Data Warehouse Dashboard task 1 Dashboard task 2 Dashboard task N...

The data science workflow

Dashboard workflow goals No data engineers required for day-to-day work Minimal latency when publishing new analyses and views Code review in GitHub for all non-trivial changes

Continuous deployment Just responsible for building and pushing our Docker images Tooling: TeamCity, Jenkins, CodePipeline, etc Any commits merged into the master branch automatically trigger a build and redeploy New content is made available to users in a matter of minutes

HypotheSci Consultancy for data science and engineering End-to-end data and dashboard pipelines that empower teams and help organisations leverage their data Feel free to reach out :)  Ruan Pearce-Authers Twitter: @returnString Email: ruan@hypothesci.com Alexandra Țurcan Twitter: @AleTurcan Email: alex@hypothesci.com