Published by Rosaline Randall. Modified over 9 years ago.
Hadoop? Cloud data warehousing? Machine learning? NoSQL?
Ecosystems around open source projects are very active
Basis in commodity hardware
Scale out, and cloud
Change in economics of computing power
Change in economics of storage
Imagine if instead of:

Employee ID   Age   Income
1             43    9000
2             38    10000
3             35    10000

You have:

Employee ID   1      2      3
Age           43     38     35
Income        9000   10000  10000

Perf: values you wish to aggregate are adjacent
Efficiency: great compression from identical or nearly-identical values in proximity
Fast aggregation and high compression mean huge volumes of data can be stored and processed in RAM
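The row-versus-column point can be sketched in a few lines of Python. This is only an illustration, not how any particular engine stores data: the three sample employees are a best reading of the slide's table, and the names `rows`, `columns`, and `run_length_encode` are made up for this example. Run-length encoding stands in for "great compression" as one simple scheme that benefits from identical adjacent values.

```python
from itertools import groupby

# Row layout: one tuple per employee (employee_id, age, income).
rows = [
    (1, 43, 9000),
    (2, 38, 10000),
    (3, 35, 10000),
]

# Column layout: one list per attribute, values of the same kind adjacent.
columns = {
    "employee_id": [1, 2, 3],
    "age": [43, 38, 35],
    "income": [9000, 10000, 10000],
}

# Row layout: every whole record is visited just to read one field.
total_income_rows = sum(income for _, _, income in rows)

# Column layout: the values to aggregate already sit next to each other.
total_income_columns = sum(columns["income"])


def run_length_encode(values):
    """Collapse runs of identical adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# Identical neighbouring incomes collapse into a single pair.
encoded_income = run_length_encode(columns["income"])

print(total_income_rows, total_income_columns, encoded_income)
```

Both layouts give the same total; the difference is that the column layout reads only the income values and lets repeated neighbours compress into one entry.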
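[Diagram: MapReduce data flow. Input is split across mappers; mappers emit intermediate keys K1, K2, K3; reducers receive the values for each key and write the Output.]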
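The mapper, key grouping, and reducer flow in the diagram can be imitated in a single Python process. This is a minimal sketch of the programming model using word count as the job, not Hadoop's actual API; the function names `mapper`, `shuffle`, `reducer`, and `map_reduce` and the sample inputs are assumptions made for illustration.

```python
from collections import defaultdict


def mapper(line):
    """Emit one (key, 1) pair per word in an input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Combine all values seen for one key into a single output record."""
    return key, sum(values)


def map_reduce(inputs):
    """Run map, shuffle, and reduce sequentially over a list of input lines."""
    intermediate = [pair for line in inputs for pair in mapper(line)]
    grouped = shuffle(intermediate)
    return dict(reducer(key, values) for key, values in grouped.items())


# Three input splits; the words play the role of keys K1, K2, K3.
inputs = ["k1 k2 k3", "k2 k3", "k3"]
counts = map_reduce(inputs)
print(counts)
```

In a real cluster the mappers and reducers run on different machines and the shuffle moves data over the network; the sketch keeps only the logical steps.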
Impala + Kafka
Data Lake
Store raw data centrally in HDFS
Use different processing engines for different analyses
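A toy version of the data lake idea, assuming a temporary local directory in place of HDFS and two plain Python functions in place of separate processing engines. The event records, file name, and function names are invented for this sketch. The point it illustrates is that the raw data is written once and each analysis applies its own interpretation when reading.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical raw events, stored exactly as they arrived.
raw_events = [
    {"user": "a", "action": "view", "ms": 120},
    {"user": "b", "action": "buy", "ms": 340},
    {"user": "a", "action": "buy", "ms": 200},
]

# A temporary directory stands in for the central store (HDFS on the slide).
lake_dir = Path(tempfile.mkdtemp())
raw_file = lake_dir / "events.jsonl"
raw_file.write_text("\n".join(json.dumps(event) for event in raw_events))


def read_raw(path):
    """Parse the raw JSON-lines file; every analysis starts from the same bytes."""
    with path.open() as handle:
        return [json.loads(line) for line in handle if line.strip()]


def purchases_per_user(path):
    """First 'engine': count purchase events per user."""
    counts = {}
    for event in read_raw(path):
        if event["action"] == "buy":
            counts[event["user"]] = counts.get(event["user"], 0) + 1
    return counts


def average_latency_ms(path):
    """Second 'engine': average the latency field over all events."""
    events = read_raw(path)
    return sum(event["ms"] for event in events) / len(events)


purchases = purchases_per_user(raw_file)
latency = average_latency_ms(raw_file)
print(purchases, latency)
```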