
© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Big Data Directions Greg.



Presentation transcript:

1 Big Data Directions
Greg Battas, Big Data Chief Technologist

2 Several shifts beginning in Big Data Architecture
Big Data is Growing Up
- Big Data cluster consolidation
- Software defined storage taking root
- Software organizing around a common base
- Purpose built hardware for Big Data

3 Comparing Big Data and CI architectures
[Diagram labels: Ethernet Switches, Shared Storage, Blade, SAN Switches, Argos, Ethernet Switches]
Converged Infrastructure
- Ethernet designed for flexibility
- Blades allow dense compute nodes
- Storage arrays shared by SAN, designed to be accessible to any node so that storage can be dynamically allocated
Big Data
- Network designed for low cost and high cross-sectional bandwidth
- Argos allows maximum density with mediocre CPU power
- Direct attached storage with minimal hardware resiliency is used for cost and cultural reasons

4 Big Data Architecture Principles and Pitfalls
Principles
- Began with a movement away from proprietary storage and databases
- Parallel programming and distributed filesystems on industry standard hardware
- “Move” compute closer to the data/disk to reduce overhead
- Direct Attached Storage with S/W resiliency
- Strong open source culture
- Major ecosystems with rapidly evolving, mix and match functionality
Pitfalls
- Provisioning servers means moving data
- Difficult to quickly “re-slice” a configuration
- No simple sharing of data amongst clusters
- Big Data must be copied to each cluster to leverage various H/W and S/W
[Diagram labels: Node; Hadoop Batch Processing; HBase Event Processing; Vertica Analytics; SAS VA; 12am – 6am; 6am – 12am]

5 Common Wisdom: “Take the Processing to the Data”
- Unlike other apps, big data depends on massive IO to read huge amounts of data from disk
- Traditional SAN approaches, where every block must be shipped over a SAN, do not scale cost effectively
- Big data scales because the processing happens close to the data, by using internal DAS and shipping work to each node
[Diagram labels: Ethernet Switches, Shared Storage, SAN Switches, Ethernet Switches, Traditional IT, Big Data, App]
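The scaling argument on this slide can be illustrated with back-of-the-envelope arithmetic. The sketch below is only an illustration: the function names and every figure in it (node counts, disk and link throughput, data size) are hypothetical and do not come from the presentation. It assumes a full scan is limited purely by aggregate bandwidth, either the sum of all local disks (DAS) or the sum of the SAN links.

```python
# Hypothetical model: full-scan time is data size divided by aggregate bandwidth.
# None of these numbers come from the slides; they are placeholders.

MB_PER_TB = 1_000_000  # decimal units, 1 TB = 1,000,000 MB


def das_scan_seconds(data_tb, nodes, disks_per_node, mb_per_s_per_disk):
    """Scan time when every node reads its own direct attached disks in parallel."""
    aggregate_mb_per_s = nodes * disks_per_node * mb_per_s_per_disk
    return data_tb * MB_PER_TB / aggregate_mb_per_s


def san_scan_seconds(data_tb, san_links, mb_per_s_per_link):
    """Scan time when every block must cross a fixed number of SAN links."""
    aggregate_mb_per_s = san_links * mb_per_s_per_link
    return data_tb * MB_PER_TB / aggregate_mb_per_s


if __name__ == "__main__":
    # Assumed example: 100 TB, 50 nodes x 12 disks at 100 MB/s vs. 8 links at 800 MB/s.
    print("DAS scan (s):", das_scan_seconds(100, 50, 12, 100))
    print("SAN scan (s):", san_scan_seconds(100, 8, 800))
```

Under these assumed figures the DAS aggregate grows with every node added, while the SAN figure only improves when links or array controllers are added, which is the cost argument the slide makes.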

6 Reality: “Take the Processing to the Data”
- Only a portion of the processing can be done locally
  - Shuffles redistribute data across the grid
  - Replication pushes inserts and updates to multiple nodes
- MPP RDBMSs have spent years optimizing this problem
  - Learned that operations should be pushed down if they:
    - Are data reducing
    - Have complete locality of data
  - Learned that the majority of the CPU power is still needed for work that can’t be pushed down
[Diagram label: Reduce]
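The point about pushing down data-reducing operations can be shown with a toy grouped sum. This is a minimal sketch, not how any particular product works: the function names, the CRC-based key placement, and the sample data are all assumptions made for illustration. It compares shuffling raw rows across nodes with aggregating locally first and shuffling only the partial results.

```python
import zlib
from collections import defaultdict


def owner(key, num_nodes):
    """Pick the node responsible for a key (hypothetical placement rule)."""
    return zlib.crc32(key.encode()) % num_nodes


def aggregate(records):
    """Sum values per key; this is the data-reducing operation."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)


def shuffle_then_reduce(partitions):
    """Send every record to the node that owns its key, then aggregate there.

    Returns the final totals and the number of records that crossed the network.
    """
    num_nodes = len(partitions)
    inbox = [[] for _ in range(num_nodes)]
    shipped = 0
    for node_id, rows in enumerate(partitions):
        for key, value in rows:
            dest = owner(key, num_nodes)
            if dest != node_id:
                shipped += 1
            inbox[dest].append((key, value))
    result = {}
    for rows in inbox:
        result.update(aggregate(rows))
    return result, shipped


def pushdown_then_reduce(partitions):
    """Aggregate on each node first, then shuffle only the partial sums."""
    partials = [list(aggregate(rows).items()) for rows in partitions]
    return shuffle_then_reduce(partials)


# Made-up sample data: one list of (key, value) rows per node.
demo_partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("c", 5), ("c", 6)],
    [("b", 7), ("b", 8), ("a", 9)],
]

if __name__ == "__main__":
    print(shuffle_then_reduce(demo_partitions))
    print(pushdown_then_reduce(demo_partitions))
```

Both paths produce the same totals, and the pushed-down version never ships more records than the raw shuffle, because each node sends at most one record per key. The final merge still needs a shuffle, which matches the slide's point that only part of the work can be done locally.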

7 Big Data is often deployed with distributed file systems on industry standard hardware
Software Defined Storage: A different approach
- The largest data stores in the world chose to move to industry standard servers running parallel file systems rather than traditional storage arrays or databases
- HDFS, S3, Swift and Cinder are becoming the most significant interfaces
- Today a mix of proprietary and open source technologies
- Big data has accelerated the adoption of SDS into other areas

8 HDFS becoming the common substrate for many Big Data software vendors
[Diagram labels: HDFS, Hadoop MapReduce, MPP DBMS, Data Integration, Analytic Tools & Frameworks, Enterprise Security, Unstructured Analytics]

9 Fueling a shift to open source
NoSQL products being adopted by software vendors
- The first wave of Big Data was around batch Hadoop for analytics and ETL offload
  - Often coupled with interactive SQL co-processors
- Now we are seeing growing interest in NoSQL products
  - Commercial ISVs are the canary in the coal mine
  - Some very aggressive projects to port to NoSQL
  - HBase seems to be preferred by ISVs
- Challenge of moving commercial products to NoSQL
  - SQL language
  - Transactions
  - Joins

10 System on a Chip creates a new model for servers
The Shift to Optimized Hardware
- The significance of Moonshot goes far beyond packaging
- The power of purpose built hardware
- The Economics of Dark Silicon Acceleration
- Open source opens the door

11 Where we are working in Big Data
- Allow customers to converge big data clusters
  - Leverage shared resources for multiple big data environments
  - Allow rapid elasticity and provisioning without moving data
  - Ability to store data once and operate on it with different types of compute nodes
- Bring big data software together into a common framework
  - Hadoop, unstructured analytics, MPP DBMS, enterprise security, analytic tools and data integration tools
  - Aligned around a common distributed filesystem (HDFS compliant)
  - Support multi-temperature data
- Assist ISVs and customers moving to NoSQL
  - Leverage HP intellectual property in database
- Use Moonshot to leverage the shift to optimized hardware


