
1 Introduction to ODPi Roman Shaposhnik @rhatr VP of Technology @ODPi

2 How the Hadoop stack has grown
Hadoop 10 years ago: MapReduce on HDFS.
Hadoop today:
Data Access: Interactive SQL, Machine Learning, Streaming Data, Other Data Flows
Data Processing: MapReduce, YARN
Data Storage: HDFS/Hadoop Compatible Filesystems, Column Data Stores (HBase)
Also shown: Monitoring, Security, Governance, Workflow, Data Management
Notes: In the early days, Hadoop meant core Hadoop. The ecosystem then filled in the holes in how we work with this data (MapReduce, then SQL and streaming), both in a broad sense and in how the pieces work together. Now the concepts of governance and security come into play. Car analogy: weld-on vs. bolt-on.

3 Hadoop Apache Project Commercial Support Tracker April 2016
Columns: Projects, Amazon, Cloudera, HortonWorks, IBM, MapR, Number of Supporters
Apache HDFS 2.7.2 2.6 2.7.1 API 5
Apache Mapreduce 2.6.0
Apache YARN
Apache Avro 1.7.5 1.7.6 1.7.7 1.7.4
Apache Flume 1.5.0* 1.6 1.5.3 1.5.2 1.6.0
Apache HBase 1.2 1.1.2 1.1.1 1.1
Apache Hive 1.0 1.2.1
Apache Oozie 4.2.0 4.1
Apache Parquet 1.5.0 1.5 2.2.0 1.8.1
Apache Pig 0.14 0.12 0.15.0 0.15
Apache Solr 4.10.3 5.2.1 5.1.0
Apache Spark 1.6.1 1.5.1
Apache Sqoop 1.4.6 1.4.6 (1.99.6)
Apache Zookeeper 3.4.6* 3.4.5 3.4.6
Apache Kafka 0.9 0.9.0 4
Apache Mahout 0.11.1* .11
Hue 3.7.1 3.1 2.6.1 3.9.0
Apache DataFu 1.0.0 1.3.0 3
Cascading 2.5 3.0.1
Notes: This has caused inconsistency in what the market perceives Hadoop to be. While distros are aligning better on the components to include, the versions included still don’t align. Often the same version doesn’t mean the same thing across distros: API changes, config differences. GE HDFS
Source: Adrian, M. (2016, April 27). Hadoop Apache Project Commercial Support Tracker April 2016. Merv Adrian. Retrieved April 29, 2016, from

4 Apache (ASF) Projects Produce
Apache (ASF) projects produce components for downstream products. End users want predictability, stability, and consistency.
Why is this? Upstream projects focus on building the raw components, but end users are not capable of absorbing them without the right resources. These projects have left the end-user needs of consistency, predictability, and stability to distro vendors.

5 ODPi Overview Without ODPi With ODPi
Without ODPi: Multi-distro certifications and regression testing increase ISV development burden and enterprise support costs. Commercial/ISV solutions require compliance with multiple Hadoop distributions (Hadoop Distribution A, B, C, D), which means duplicate effort.
With ODPi: Commercial/ISV solutions need a single compliance (ODPi Runtime Compliance) for multiple deployments: Altiscale, IBM, Arenadata, Infosys, Hortonworks. The ODPi Runtime covers HDFS, YARN, and MapReduce. @ODPiorg

6 ODPi Bridges ASF Engagement
Constituents shown: Hadoop Distros, Hadoop Components, App Vendors, Solution Providers, End Users, ranging from more ASF engagement to less ASF engagement.
We see many different constituents in the Hadoop ecosystem. The pattern we’ve seen is that engagement with the ASF and its projects decreases as you get to the constituents closer to the end user.
ODPi is organized to support the ASF, as we promote innovation and development of upstream projects like Apache Hadoop and Apache Ambari.
By providing a common runtime, reference implementations and test suites, ODPi helps to and of big data solutions.
Provide stability and predictability at the Hadoop layer without limiting innovation and vendor value adds.
In keeping with our mission to increase ease of deployment for Hadoop and Big Data apps, our technical investments are directed to the initiatives that will have the greatest impact on the greatest number of Hadoop/Big Data deployments.

7 ODPi is a nonprofit organization committed to simplification & standardization of the big data ecosystem. As a shared industry effort, ODPi is focused on promoting and advancing the state of Apache Hadoop® and big data technologies for the enterprise.

8 ODPi Benefits
End-users: Run any “ODPi-compatible” big data software on any “ODPi-compliant” platform. Long-term interoperability with “ODPi-compliant” solutions.
Apps developers, ISVs, system integrators: Compatibility guidelines to “test once, run everywhere”. Eliminate the cost of certification and testing across multiple Hadoop distributions.
Hadoop platform providers: Enable more ODPi-compatible software to run successfully on their solutions.

9 ODPi Members
ODPi is open to all, with a very low hurdle for ANY developer or company to participate and have an impact.
Membership includes 29 companies from Hadoop distributions, ISVs, SIs, and end users, and more than 35 maintainers from 25 companies are dedicated to its ongoing work.
All members have an equal vote on ODPi decisions, regardless of investment level, ensuring equality among all participants and an industry-wide consolidation of enterprise requirements.

10 Runtime and Operations Spec Components
Runtime Spec 1.0 included: Hadoop 2.7 Common, HDFS, YARN, and MapReduce components. JRE 7 and 8. Supported OS platforms: Linux (any), Windows Server 2012. Architectures: x86_64, s390x, PPC64le.
Runtime Spec 2.0: Evolution from the v1 spec to include new components and expand coverage of existing components. Next release focus areas include Apache Hive and alternative filesystems (HCFS). Other areas considered for future specs: Apache Spark, and the impact of security projects on Hadoop (Apache Ranger, Apache Sentry, Apache Knox).
Operations Spec 1.0 (draft): This specification outlines the requirements for ODPi-compliant applications to be installed, managed, and monitored by tools such as Apache Ambari, and the guarantees that consumers, ISVs, and service developers can count on to develop custom applications. The focus is on building a standard service definition language, based on the work done in Apache Ambari 2.4.
Available Sept/Oct. Available Late Fall 2016.
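The platform list given for Runtime Spec 1.0 can be read as a small support matrix. The following is a minimal sketch in Python, not part of any ODPi tooling: the function name is hypothetical, and it assumes that any listed OS may be combined with any listed architecture, which the slide does not state.

```python
# Values taken from the Runtime Spec 1.0 slide; everything else is illustrative.
SUPPORTED_JRE_MAJOR = {7, 8}
SUPPORTED_ARCHITECTURES = {"x86_64", "s390x", "ppc64le"}


def is_supported_platform(os_name, architecture, jre_major):
    """Return True if the OS, architecture, and JRE all appear on the slide.

    Assumption: "Linux (any)" matches any OS name equal to "linux"
    (case-insensitive), and Windows is matched by the "Windows Server 2012"
    prefix. OS/architecture pairs are not cross-checked.
    """
    name = os_name.strip().lower()
    os_ok = name == "linux" or name.startswith("windows server 2012")
    arch_ok = architecture.strip().lower() in SUPPORTED_ARCHITECTURES
    jre_ok = jre_major in SUPPORTED_JRE_MAJOR
    return os_ok and arch_ok and jre_ok
```

For example, is_supported_platform("Linux", "PPC64le", 8) is accepted, while an unlisted architecture or a JRE other than 7 or 8 is rejected.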

11 ODPi Runtime Compliant Apache Hadoop Distribution Vendors
To be ODPi Runtime Compliant, a Hadoop distro must: be a descendant of version 2.7; use version 7 or 8 of Java; expose environment variables; not alter the public API.
Vendors may include additional features/functions, provided that: they make the source code available; all the code is committed to the ASF.
ODPi Runtime Compliant tests are: self-certification tests run by Hadoop distro vendors; linked directly to lines in the ODPi Runtime Specification, which covers HDFS, YARN, and MapReduce components and specifies how Apache components should be installed and configured. The specification also provides a set of tests for validation, to make it easier to create big data solutions and data-driven applications.
ODPi provides a reference build to assist with testing. The ODPi test framework is based on Apache Bigtop.
These controls ensure that Big Data application vendors can, to the greatest extent possible, test their app against one ODPi Runtime Compliant distro and be confident that it will run on all ODPi Runtime Compliant distros.
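The first three compliance rules are mechanical enough to sketch as a self-check. The Python below is only an illustration under stated assumptions, not the ODPi test suite: "descendant of version 2.7" is approximated as a version string starting with 2.7, the function name is hypothetical, and the environment variable names are placeholders because the slide does not say which variables must be exposed. The "do not alter the public API" rule is not checked here.

```python
import re

# Hypothetical list: the slide only says a distro must expose environment
# variables and does not name them.
REQUIRED_ENV_VARS = (
    "HADOOP_COMMON_HOME",
    "HADOOP_HDFS_HOME",
    "HADOOP_YARN_HOME",
    "HADOOP_MAPRED_HOME",
)


def check_runtime_compliance(hadoop_version, java_major, env):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []

    # Assumption: a "descendant of 2.7" reports a version beginning with 2.7.
    match = re.match(r"(\d+)\.(\d+)", hadoop_version)
    if not match or (int(match.group(1)), int(match.group(2))) != (2, 7):
        problems.append("Hadoop version %r is not a 2.7 descendant" % hadoop_version)

    if java_major not in (7, 8):
        problems.append("Java %s is not version 7 or 8" % java_major)

    # Report all missing variables in a single finding.
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    if missing:
        problems.append("missing environment variables: " + ", ".join(missing))

    return problems
```

A vendor-side script might call check_runtime_compliance("2.7.1", 8, dict(os.environ)) after importing os, and treat any returned problem as a reason to fail self-certification.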

12 ODPi Interoperable ODPi Compliant
ODPi Interoperable (launching Fall 2016): Compliance program for Big Data application vendors. A vendor that is ODPi Interoperable commits that their product is, and will continue to be, designed in a way that adheres to the ODPi specifications for the Apache Hadoop components that they leverage.
ODPi Compliant: available now.

13 Release Plan
Timeline: March 2016 through October 2016 (March, April, May, June, July, August, September, October).
Runtime Spec 1.0 (HDFS, YARN, MapReduce): Release
Runtime Spec 2.0 (Hive, HCFS): Proposal, Draft, Release Candidate, Release
Operations Spec 1.0 (Ambari): Proposal, Draft, Release Candidate, Release

14 ODPi is open to all, with a very low hurdle for ANY developer or company to participate and have an impact. Membership includes 29 companies from Hadoop distributions, ISVs, SIs, and end users, and more than 35 maintainers from 25 companies are dedicated to its ongoing work. All members have an equal vote on ODPi decisions, regardless of investment level, ensuring equality among all participants and an industry-wide consolidation of enterprise requirements.

15 Join us via: User Advisory Board
User Advisory Board (UAB): Collecting and addressing CDO (Chief Data Officer) concerns.
Special Interest Groups (SIGs): Collecting and addressing CIO (Chief Information Officer) concerns. Collecting and addressing cross-cutting systems integration concerns.
SIGs: Security, Data Sci.
Certification: ODPi Platform Compliance, ODPi ISV Interoperable.

16 Join us as a member
ODPi is a nonprofit organization committed to simplification & standardization of the big data ecosystem. As a shared industry effort, ODPi is focused on promoting and advancing the state of Apache Hadoop® and big data technologies for the enterprise.

17 Thank You

