Using Jupyter to Empower Enterprise Analysts

Presentation transcript:

Using Jupyter to Empower Enterprise Analysts Dave Stuart Department of Defense @somedavestu github.com/nbgallery

About me 15 years in the Intelligence Community Multiple large scale adoptions of technology Last 3 years leading Jupyter efforts

This Presentation Jupyter in the Intelligence Community Challenges of enterprise-wide adoption of Jupyter

Analysis in the IC

Analysis in a Large Enterprise Large analytic user-base Diverse and highly dynamic mission space Pairing analysts to developers doesn’t scale

Analysis in a Large Enterprise Use many different tools to access data Weave together manual workflow Use Excel for analysis

Reality “I recognized how much of my day was spent doing menial tasks – writing queries, bouncing different tools off each other, manipulating my data in various ways.”

Jupyter as a Solution

Jupyter as a Solution Developers writing scripts → Analysts writing reproducible workflows

Jupyter as a Solution “Instead of thinking about my workflow largely in terms of tools, I’ve started thinking about it more in terms of questions and data.”

Python Ecosystem

Goals for Jupyter Efficiency Collaboration Approachability

Enterprise Challenges And Some Solutions

Enterprise Challenges User-base Collaboration Data Security Compliance

User-Base Challenges Mostly non-coders Domain knowledge experts Tasked with deriving insight from data Huge untapped coding potential

Collaboration Challenges Large geographically dispersed user-base Maximize discovery and collaboration

Data Security Challenges Sensitive subject matter Extremely fine grained security Input data varies by time and user

Compliance Challenges System accreditation Retention Oversight

Our Approach

Jupyter Execution Environment Unified execution environment Ephemeral personalized virtual machines Creates a micro data science environment

nbgallery sharing and collaboration platform (code only) Open source released Easy-to-use repository of community-authored notebooks

Tie Together Decoupled Environments

What have we addressed? User-base Collaboration Compliance Data Security

Collaboration at Enterprise Scale

Collaboration Challenges High volume and redundancy Recommendations Notebook health

Recommendations Recommend notebooks to users based on: Notebook usage Non-personalized recommendations Personalized recommendations White paper: https://nbgallery.github.io/recommendation.html
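
The two kinds of recommendation named on this slide can be illustrated with a minimal Python sketch. This is not nbgallery's actual algorithm (the white paper linked above describes that); the usage log, the notebook names, and the functions `popular` and `personalized` are all hypothetical. The non-personalized ranking counts distinct users per notebook, and the personalized ranking scores notebooks a user has not run by how much their usage overlaps with other users who have run them.

```python
from collections import Counter, defaultdict

# Hypothetical usage log: one (user, notebook) pair per execution.
usage = [
    ("ana", "geo_lookup"), ("ana", "csv_cleaner"),
    ("ben", "geo_lookup"), ("ben", "csv_cleaner"), ("ben", "timeline_plot"),
    ("cy", "geo_lookup"), ("cy", "timeline_plot"),
    ("dee", "report_export"),
]


def notebooks_by_user(log):
    """Group the log into {user: set of notebooks that user has run}."""
    by_user = defaultdict(set)
    for user, notebook in log:
        by_user[user].add(notebook)
    return by_user


def ranked(scores, n):
    """Top-n names by score (descending), ties broken alphabetically."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ordered][:n]


def popular(log, n=3):
    """Non-personalized: rank notebooks by number of distinct users."""
    counts = Counter()
    for notebooks in notebooks_by_user(log).values():
        counts.update(notebooks)
    return ranked(counts, n)


def personalized(log, user, n=3):
    """Personalized: score unseen notebooks by usage overlap with other users."""
    by_user = notebooks_by_user(log)
    mine = by_user.get(user, set())
    scores = Counter()
    for other, theirs in by_user.items():
        overlap = len(mine & theirs)
        if other == user or overlap == 0:
            continue
        for notebook in theirs - mine:
            scores[notebook] += overlap
    return ranked(scores, n)
```

With this toy log, `popular(usage, 2)` ranks `geo_lookup` first because three distinct users ran it, and `personalized(usage, "ana")` suggests `timeline_plot` because both users who share notebooks with `ana` also ran it.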

Notebook Health Broken notebooks decrease confidence How to define health? Why not use automated tests?

How We Measure Notebook Health A notebook is only healthy if the cells are healthy The further a user makes it the better Repeated use and more users builds confidence Health decays over time White paper: https://nbgallery.github.io/health_paper.html
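
The four principles on this slide can be combined into a toy score, sketched below in Python. This is an illustration under stated assumptions, not the formula from the health white paper: the function `notebook_health`, the run record fields, the 90-day half-life, and the `evidence_scale` constant are all hypothetical. Each run contributes the fraction of cells it completed, older runs count for less, and a confidence factor grows with both repeated use and the number of distinct users.

```python
import math


def notebook_health(runs, now_day, half_life_days=90.0, evidence_scale=10.0):
    """Toy health score in [0, 1]; all constants are illustrative assumptions.

    Each run is a dict with keys: user, day, cells_completed, cells_total.
    """
    total_weight = 0.0
    weighted_progress = 0.0
    best_weight_per_user = {}
    for run in runs:
        # Health decays over time: a run loses half its weight every half-life.
        age = max(0.0, now_day - run["day"])
        weight = 0.5 ** (age / half_life_days)
        total_weight += weight
        # The further a user makes it through the cells, the better.
        weighted_progress += weight * run["cells_completed"] / run["cells_total"]
        user = run["user"]
        best_weight_per_user[user] = max(best_weight_per_user.get(user, 0.0), weight)
    if total_weight == 0.0:
        return 0.0
    progress = weighted_progress / total_weight
    # Repeated use and more distinct users both add evidence, with saturation.
    evidence = total_weight + sum(best_weight_per_user.values())
    confidence = 1.0 - math.exp(-evidence / evidence_scale)
    return progress * confidence


# Hypothetical run history: three users each completed all 10 cells on day 0.
example_runs = [
    {"user": name, "day": 0, "cells_completed": 10, "cells_total": 10}
    for name in ("ana", "ben", "cy")
]
```

Calling `notebook_health(example_runs, now_day=1)` gives a higher score than `notebook_health(example_runs, now_day=365)`, and the same number of runs from a single user scores lower than runs spread across three users.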

Python Training and Outreach

Analyst-level Training Find balance between training and a demanding workload Success doesn’t mean new computer/data scientists But rather an analyst who can use some code

Learning to Code is not a Career Shift “Learning to code is not a career shift, but a means to better think about and tackle my workflow.” “I worried that learning enough code to actually add value would be a bigger time commitment than I was willing to make…I’d been wrong about how much value I could add even with just some basic coding skills. Even as a beginner, I’ve written several notebooks that automate parts of my workflow or manipulate my data in a way that helps me make quicker, better sense of it all.”

Intro to Python for Analysis Demystify code to nontraditional coders Flipped classroom using Jupyter notebooks Virtual and in-person mentoring

Provide Metrics to Users “It seems like a little thing on its face, but I want to specifically applaud this. Automating collection of ‘news you can use’ metrics reduces toil and drag, improves morale, and hopefully improves our efficacy. Numeric metrics aren’t mission impact, but ‘making the numbers cheap’ is a great start.”

Results and Use

Current Status 1000s of users 1000s of notebooks by 100s of authors 10:1 ratio of notebook users to authors

Notebook Use-Cases Operational Course Material Building Blocks

Notebook Use-Cases: Operational Supports a business/mission function Uses data from corporate repositories Most prompt users for query parameters
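
A notebook that prompts for query parameters might look roughly like the Python sketch below. Everything here is an assumption for illustration: the `events` table, its columns, and the functions `prompt_parameters` and `run_query` are hypothetical, and an in-memory SQLite database stands in for a corporate repository. The point is the pattern: the analyst supplies values, and the query text itself stays fixed.

```python
import sqlite3
from datetime import date


def prompt_parameters(ask=input):
    """Collect query parameters from the analyst.

    `ask` defaults to the built-in input(); it can be swapped out for testing.
    """
    region = ask("Region code (e.g. EU): ").strip().upper()
    # fromisoformat raises ValueError on a malformed date, so bad input fails early.
    since = date.fromisoformat(ask("Start date (YYYY-MM-DD): ").strip())
    return {"region": region, "since": since.isoformat()}


def run_query(conn, params):
    """Run a parameterized query; named placeholders keep input out of the SQL text."""
    sql = (
        "SELECT id, region, seen_on FROM events "
        "WHERE region = :region AND seen_on >= :since ORDER BY id"
    )
    return conn.execute(sql, params).fetchall()


# Hypothetical stand-in for a corporate repository: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, region TEXT, seen_on TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "EU", "2019-01-05"), (2, "EU", "2019-03-01"), (3, "AP", "2019-03-02")],
)
```

In a notebook cell, `rows = run_query(conn, prompt_parameters())` would ask the analyst for a region and start date and then return the matching rows.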

Notebook Use-Cases: Course Material Notebooks in formal training series Teach code within the same platform they can use for business purposes

Notebook Use-Cases: Building Blocks Organic use-case Reproducible code snippets shared within the community

Notebook Use-Cases 48% Operational 13% Course Material 39% Building Blocks

Future Efforts

Mirror Jupyter for reporting/dissemination use-cases

Questions? @somedavestu github.com/nbgallery