Published by Frederica Booker. Modified over 8 years ago.
2
BIG DATA
3
Big Data: a collection of data sets so large and complex that they are difficult to process using on-hand database tools.
5
[Figure: six items, each labeled "V"; the rest of the label text is not recoverable]
6
WHY IS IT INTRODUCED?
- Lots of data is being collected and warehoused.
- Processing exceeds the capacity of database systems.
- Structured and unstructured data.
- Separation of data from application.
- Understanding data analytics.
- Faster development, faster runtime.
- Elastic feature-level scalability.
7
APACHE HADOOP
- Provides massively scalable storage; it is not a database
- Data processing platform
- HDFS, a fault-tolerant storage layer
- Stores data in its native format
- Reduces cost and lowers risk
- Extracts business value from data
- Delivers new insights
- Automatically handles software and hardware failures
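To make the "data processing platform" point concrete, here is a minimal sketch of the MapReduce model that Hadoop jobs follow, written in plain Python rather than against any Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are assumptions made for illustration; a real job would run these phases in parallel across the cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values seen for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence on a list of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records, one line of text each.
    print(word_count(["big data is big", "data is stored in blocks"]))
```

Because each record is mapped independently and each key is reduced independently, both phases can be spread over many machines, which is what lets this style of program scale with the cluster.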
8
HDFS
- Fault-tolerant storage
- Survives failures of disks, networks and network interfaces
- Serves as the storage layer for MapReduce programs
- Creates clusters of machines and coordinates them
- Stores data across the cluster in blocks
- Needs no special hardware, unlike RAID
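The block-based, fault-tolerant storage described on this slide can be illustrated with a toy model. This is a sketch under stated assumptions, not HDFS code: the round-robin placement policy, the node names and the sizes are all hypothetical, and real HDFS placement also takes rack topology into account.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks that a file of file_size is cut into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes (toy round-robin policy)."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement, failed_nodes):
    """A file stays readable while every block has at least one live replica."""
    return all(
        any(node not in failed_nodes for node in replicas)
        for replicas in placement.values()
    )


if __name__ == "__main__":
    # Hypothetical numbers: a 300-unit file, 128-unit blocks, 4 nodes, 3 replicas.
    blocks = split_into_blocks(300, 128)
    placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"], 3)
    print(blocks)
    print(placement)
    print(readable_after_failure(placement, {"n1", "n2"}))
```

With three replicas per block, the file in this example survives the loss of any two nodes, using only ordinary machines and no RAID controller.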
10
PROBLEMS WITH BIG DATA
- Users can easily be overwhelmed
- Costs escalate too fast
- Storage is consumed three times over
- Timeliness of analysis
- Poor data locality
- Incompatible and replicated data
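The "storage consumed 3 times" point presumably refers to replication: if every block is kept in three copies, the cluster needs three times the raw disk space of the data it holds. The arithmetic can be sketched as follows, assuming a replication factor of 3; the function names are hypothetical.

```python
def raw_storage_needed(logical_bytes, replication=3):
    """Raw disk space consumed when every byte is stored `replication` times."""
    return logical_bytes * replication


def usable_capacity(raw_bytes, replication=3):
    """Logical data that fits on raw_bytes of disk at the given replication."""
    return raw_bytes // replication


if __name__ == "__main__":
    # Assumed example: 10 TB of data and a cluster with 100 TB of raw disk.
    print(raw_storage_needed(10), "TB of raw disk for 10 TB of data")
    print(usable_capacity(100), "TB of data fits on 100 TB of raw disk")
```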
11
CONCLUSION
- Big Data will replace the approaches, tools and systems that underpin development work.
- Better analysis of large volumes of data.
- Potential for advances in many scientific disciplines.
- Improved profitability.
- Technical challenges have to be addressed dynamically.
13
THANK YOU
Today’s Big Data Is Not Tomorrow’s Big Data