Published by Marianna Barnett. Modified over 8 years ago.
1
Big Data Analysis for Page Ranking using Map/Reduce
R. Renuka, R. Vidhya Priya, III B.Sc., IT, The S.F.R. College for Women, Sivakasi.
2
Overview
–Introduction
–What is Big Data?
–Why Big Data?
–4 V’s of Big Data
–Big Data Analytics Technologies
–Map/Reduce
–Applications
–Case Study
–Conclusion
3
Introduction
Data have outgrown the storage and processing capabilities of a single host.
Two fundamental challenges:
–how to store and work with voluminous data sizes, and
–how to understand data and turn it into a competitive advantage.
4
What is Big Data?
–‘Big data’ is similar to ‘small data’, but bigger.
–Handling bigger data requires different approaches: techniques, tools and architectures.
–The aim is to solve new problems, and old problems in a better way.
5
The Blind Men and the Elephant
6
Why Big Data?
Key enablers for the growth of “Big Data” are:
–Increase in processing power
–Increase in storage capacity
–Availability of data
7
4 V’s of Big Data
8
Big Data Analytics Technologies
–Hadoop
–Platfora
–WibiData
–Pig
–Hive
–MapReduce
–NoSQL databases
–Column-oriented databases
9
Hadoop
Hadoop is a distributed file system and data processing engine.
Hadoop has two components:
–The Hadoop Distributed File System (HDFS)
–The MapReduce programming model
10
Map / Reduce
–A high-level framework for distributed processing of large datasets
–Provides fault tolerance and parallelization
–Computation consists of two phases: Map and Reduce
–A master-slave architecture: computation occurs on multiple slave nodes
–Tries to provide data locality as much as possible, running computation on the nodes that already hold the data
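The two phases can be illustrated on a single machine. The sketch below is not Hadoop code: it is a minimal word count in plain Python, assuming threads stand in for slave nodes and an in-memory grouping step stands in for the framework's shuffle. The function names (`map_chunk`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map phase for one input split: emit (word, 1) for every word."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle(mapped_chunks):
    """Group intermediate values by key, as happens between the two phases."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: merge all values that share a key."""
    return key, sum(values)


def word_count(lines, workers=2):
    # Split the input into one chunk per (hypothetical) worker node.
    chunks = [lines[i::workers] for i in range(workers)]
    # Map tasks run in parallel; threads stand in for slave nodes here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in grouped.items())


counts = word_count(["big data", "big clusters", "data locality"])
print(counts)
```

A real cluster adds what this sketch leaves out: a master that schedules tasks, re-execution of failed tasks, and placement of map tasks close to their input blocks.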
11
MR model
Map
–Processes a key/value pair to generate intermediate key/value pairs
Reduce
–Merges all intermediate values associated with the same key
Users implement an interface of two primary methods:
1. Map: (key1, val1) → [(key2, val2)]
2. Reduce: (key2, [val2]) → [val3]
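Since the deck's title concerns page ranking, one way to see the two methods at work is a single PageRank iteration written against these signatures. This is a simplified sketch, not the authors' implementation: the damping factor of 0.85, the three-page graph, and all function names are assumptions made for illustration, and the code assumes every page has at least one outgoing link.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional damping factor; an assumption, not from the slides


def pagerank_map(page, state):
    """Map: (page, (rank, out_links)) -> list of (target, value) pairs."""
    rank, out_links = state
    emitted = [(page, ("links", out_links))]  # pass the graph structure along
    for target in out_links:
        # Each page shares its rank equally among the pages it links to.
        emitted.append((target, ("rank", rank / len(out_links))))
    return emitted


def pagerank_reduce(page, values, num_pages):
    """Reduce: (page, [values]) -> new (rank, out_links) for that page."""
    out_links, incoming = [], 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload
        else:
            incoming += payload
    rank = (1 - DAMPING) / num_pages + DAMPING * incoming
    return rank, out_links


def pagerank_iteration(graph):
    """Run one map, shuffle, reduce round over the whole graph."""
    grouped = defaultdict(list)
    for page, state in graph.items():
        for key, value in pagerank_map(page, state):
            grouped[key].append(value)
    n = len(graph)
    return {page: pagerank_reduce(page, vals, n) for page, vals in grouped.items()}


# Hypothetical three-page web: A -> B, C; B -> C; C -> A
graph = {"A": (1 / 3, ["B", "C"]), "B": (1 / 3, ["C"]), "C": (1 / 3, ["A"])}
for _ in range(20):
    graph = pagerank_iteration(graph)
ranks = {page: rank for page, (rank, _) in graph.items()}
print(ranks)
```

The map step emits the link list alongside the rank contributions so that the reduce step can rebuild each page's state for the next iteration. In a cluster setting each iteration would be a separate Map/Reduce job.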
12
Applications
13
–Homeland Security
–Finance
–Smarter Healthcare
–Multi-channel Sales
–Telecom
–Manufacturing
–Traffic Control
–Trading Analytics
–Fraud and Risk
–Log Analysis
–Search Quality
–Retail
14
Case Study
16
Conclusion
Real-time big data isn’t just a process for storing petabytes or exabytes of data in a data warehouse. It’s about the ability to make better decisions and take meaningful actions at the right time.
17
Queries?