1
THE BUSINESS CASE FOR AI, SPARK & MORE
Sanjay Mathur, CEO
2
Silicon Valley Data Science is a boutique consulting firm focused on transforming business through data science and engineering. We are a company of experienced software engineers, architects, data scientists, strategists, and designers who specialize in data-driven product development and experimentation. We work in blended teams trained in holistic, strategic, and agile approaches.
3
Come see us & say hello!
Tuesday, 23 May
  Data 101: The Business Case for Deep Learning, Spark, and Friends, with Sanjay Mathur
  Architecting a Data Platform, with John Akred & Stephen O’Sullivan
  Developing a Modern Enterprise Data Strategy, with John Akred & Scott Kurth
Thursday, 25 May
  What’s Your Data Worth? with John Akred
  Ask Me Anything, with John Akred, Scott Kurth & Stephen O’Sullivan
4
To view SVDS speakers and scheduling, or to receive a copy of our slides, go to:
5
BIG DATA … it’s really about agility
6
BUYING AGILITY
Linear scale-out cost
Opex vs capex
Ease of purchase
7
Scale-out systems move us from managing scarcity to promoting utility
8
DEVELOPMENT AGILITY
Architectural factors
  Schema on read
  Rapid deployment
  Mirror production setup
  Executes faster
Programmer factors
  Fun to program
  Concision
  Easier to test
  Faster to write
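Schema on read, one of the architectural factors on this slide, means raw records are stored as they arrive and a schema is applied only when something reads them. The following is a minimal Python sketch of that idea, not taken from the deck; the sample records, field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Raw events are stored exactly as they arrived; nothing was validated on write.
# (Hypothetical sample data for illustration.)
RAW_EVENTS = [
    '{"user": "ana", "amount": "19.99", "country": "US"}',
    '{"user": "raj", "amount": 5}',
    '{"user": "li", "amount": "n/a", "device": "mobile"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to a (converter, default) pair. Missing or
    unconvertible values fall back to the default instead of failing the load.
    """
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            try:
                row[field] = convert(record[field])
            except (KeyError, ValueError, TypeError):
                row[field] = default
        yield row


# Two consumers read the same raw data through different schemas.
billing_schema = {"user": (str, ""), "amount": (float, 0.0)}
geo_schema = {"user": (str, ""), "country": (str, "unknown")}

billing_rows = list(read_with_schema(RAW_EVENTS, billing_schema))
geo_rows = list(read_with_schema(RAW_EVENTS, geo_schema))

print(billing_rows)
print(geo_rows)
```

The agility argument is that a new question only needs a new schema dictionary; the stored data does not have to be reloaded or migrated.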
9
The experimental enterprise
Supports investigative work and builds a solid layer for production. Conducts experiments and responds to the changing environment. Makes foundational infrastructure readily accessible.
10
[Diagram: data platform. Labels: Data Acquisition, Internal, External Systems, Data Ingest, Repository, Persistence, Offline Processing, Batch Processing, Real-Time Processing, Analytics, Low Latency Access, Data Services, Data Platform. Data Management: Security, Operations, Data Quality, Metadata Management, and Data Lineage.]
11
What is Docker?
Container technology: bundles every part of an application
Provides isolation for each application without the overhead of running a virtual machine
Ships only the parts that are needed; leaves out a full guest operating system
Answers the question of “how do I get my data”
VM: full isolation, guaranteed resources
Docker: process isolation, and better resource sharing
12
WHY SHOULD BUSINESS CARE?
Better use of server resources than virtual machines
A fast and reliable way of deploying applications
It’s the ideal packaging mechanism for scale-out distributed systems
Easy for developers to work in an environment identical to production
Sharing containers leads to innovation
DevOps enabler with programmatic infrastructure components
Fast scale up and tear down
13
What is Apache Spark?
In-memory distributed computing platform
Comes from Berkeley AMPLab
In production with early adopters, now integral to every commercial Hadoop distribution
Doesn’t need Hadoop, but runs easily on top
14
Use cases
Managing a major retailer’s inventory across a diverse network of entities in near real time
Managing and processing event streams for online gaming
Supporting data science initiatives across massive data sets at a media analytics company
15
Why should business care?
Enables use cases Hadoop didn’t provide, all in one platform
  Streaming, interactive analytics, machine learning, graphs
Fast
  Iteration time down, more productive
Use existing cluster investment
  Sits on HDFS, can run under YARN (or use Amazon S3, or Cassandra)
16
Why should business care? (2)
SparkSQL
  Use SQL skills and tools, e.g. Tableau
  DataFrames integrate external data sources into one context: RDBMS, Hive, JSON…
Developer-friendly
  Concise and fluid to program
  Language integration: Scala, R, Python, Java
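The point about bringing several data sources into one SQL context can be illustrated without a cluster. The sketch below is not Spark code: it uses Python's built-in `sqlite3` module as a stand-in, and the `customers` table, the JSON order records, and all column names are hypothetical. It only shows the idea of loading a relational source and a JSON source into one context and answering a question with a single SQL query.

```python
import json
import sqlite3

# One in-memory context that will hold data from both sources.
conn = sqlite3.connect(":memory:")

# Source 1: a relational table (stands in for an existing RDBMS).
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# Source 2: JSON records (stands in for a JSON file or feed).
orders_json = """
[
  {"customer_id": 1, "total": 120.0},
  {"customer_id": 1, "total": 30.0},
  {"customer_id": 2, "total": 75.5}
]
"""
conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (:customer_id, :total)",
    json.loads(orders_json),
)

# A single SQL query spans both sources.
revenue = conn.execute(
    """
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(revenue)
```

Anyone who already knows SQL can read the final query, which is the skills-reuse argument the slide makes.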
17
THE AI BOOM
Deep learning uses deep neural networks, which have been around for a few decades; what has changed in recent years is the availability of large labeled datasets and powerful GPUs. Neural networks are inherently parallel algorithms, and GPUs with thousands of cores can exploit this parallelism to dramatically reduce the computation time needed to train deep learning networks.
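To see why neural networks parallelize well, consider one fully connected layer: every output neuron is a separate weighted sum over the same inputs. The sketch below is plain Python written for illustration only; the `dense_layer` function and the numbers are hypothetical, and a real framework would run the same arithmetic as a matrix operation on a GPU.

```python
def dense_layer(inputs, weights, biases):
    """Compute one fully connected layer with a ReLU activation.

    `weights` holds one list of input weights per output neuron.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Each pass through this loop uses only the shared inputs and this
        # neuron's own weights. No neuron waits on another neuron's result,
        # so the iterations could all run at the same time on separate cores.
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU: negative sums become 0.0
    return outputs


# Hypothetical example: 2 inputs feeding 3 neurons.
inputs = [1.0, 2.0]
weights = [
    [0.5, -1.0],
    [1.0, 1.0],
    [0.0, 0.25],
]
biases = [0.0, -1.0, 0.5]

hidden = dense_layer(inputs, weights, biases)
print(hidden)
```

A production network has millions of such independent sums per layer, which is the workload shape that thousands of GPU cores are suited to.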
18
THE AI BOOM
A long history of under-delivered promise: why?
Traditionally, an encoding of knowledge
  Expert systems
Current excitement is based on statistical methods
  Machine learning: many methods
  Deep learning: neural networks
19
AI USE CASES
Cognition at scale
  Pattern recognition
  Predictive analytics
Deep learning takes this further than before
  Extraordinary levels of composition, e.g. image captioning, generation
20
AI, WHY CARE?
Leverage data previously “invisible”
  e.g. search images by text
Automate low-value cognition tasks
  Call center triage
Natural interfaces
  X.ai: scheduling bot
  Amazon Alexa, Google Now, Cortana, etc.
Amazing open source toolkits and cloud APIs
  Google TensorFlow, MSFT, IBM Watson
21
What are Notebooks?
Interactive documents that contain a program and its output
Long history: Mathematica
Particularly successful with data science
Projects to watch
  Jupyter
  Apache Zeppelin
Answers the question of “how do I get my data”
22
Demo goes here
23
Why should business care?
Easy collaboration and sharing of data science
  Think “Docker for analysis”
Easy access to data and compute resources
A building block for more self-service analytical capabilities
Commercial version of Notebooks + Spark is the Databricks Cloud
24
Data is your business
25
SILICON VALLEY’S DATA MACHINE
26
The experimental enterprise
Supports investigative work and builds a solid layer for production. Conducts experiments and responds to the changing environment. Makes foundational infrastructure readily accessible.
27
BECOME DATA NATIVE
Can only win with situational awareness
New architectures offer new opportunities
Creation of data-driven value requires a new approach
Create an Experimental Enterprise
Business must lead, and understand the potential of the technology
28
Sanjay Mathur • sanjay@svds.com • @sanjaymathur • www.svds