Open Data Science: A Strategy for Success in 2018
Speaker Introduction: Paul Norris, Portfolio Director – Data Analytics; @V1Analytics; Paul.Norris@version1.com; ie.linkedin.com/in/norrispaul; www.version1.com
End of an era
Compare to app. dev. evolution: Standalone Code → Desktop App → Web App → Mobile App → API → Microservices. We are at least 10 years behind (DevOps, 2008)
Where are we today?
Open Data Science Today: Standalone scripts and desktop development environments are the norm; Reliance on key individuals, not teams & processes; Tooling makes it difficult to collaborate; Version, release and source control management processes are often not defined
A Vision for 2018
Open Data Science in 2018: R & Python usage overtakes proprietary systems; Demand for data science as an application emerges in enterprises; Industry data operations best practices start to be defined and used; GDPR forces our profession to mature fast; I.T.’s a team sport, and lone data scientists are seen as a risk; Desktop development moves server side; Data lake hype ends, as they don’t work for analytics; Columnar storage databases make a comeback in the cloud as PaaS
Strategy For Success
1. Team; 2. Platform; 3. DataOps
Team: Data Engineer, Data Analyst, Data Scientist
Open Data Science Platform: jupyter.org, dataiku.com
Data Platform: druid.io, snowflake.net
Analytics Portal: superset.incubator.apache.org, looker.com
The DataOps Manifesto: DevOps, Process Engineering, Data Management (dataopsmanifesto.org)
Thank You. Any questions?