Published by Laurence Oliver. Modified over 9 years ago.
1
MIS 3500
Instructor: Bob Travica
Newer DB Topics
2015
2
Big Data
3 big Vs:
Volume: terabytes (12 zeroes), petabytes (15 zeroes)
Variety: social media, communications, sensors everywhere*, Internet of Things, video feeds, GPS… Implication: various formats
Velocity: wired and wireless continuous feeds
3
Goals and Uses
Goals:
  Integrate data on the same object across sources (Customer, Citizen, etc.; spatial mashups)
  Analysis: existing patterns, predictive analysis
Application domains:
  Monitoring for business and other purposes (sensors)
  Marketing (relationship marketing, sentiment analysis in social media…)
  Energy grid management
  Transportation network management
  Health (analysis of cancer cell behavior and of patient vital signs)
  Science (human genome)
  Policy analysis (United Nations’ system for predicting social problems)
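The first goal, integrating data on the same object across sources, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `integrate` function, the `customer_id` key, the two source names, and the sample records do not come from the slides.

```python
from collections import defaultdict


def integrate(*sources):
    """Merge records about the same customer from several named sources.

    Each source is a (name, records) pair; records are dicts that share a
    hypothetical 'customer_id' key. Field names are prefixed with the source
    name so that values from different sources never overwrite each other.
    """
    merged = defaultdict(dict)
    for source_name, records in sources:
        for record in records:
            key = record["customer_id"]
            for field, value in record.items():
                if field != "customer_id":
                    merged[key][f"{source_name}.{field}"] = value
    return dict(merged)


# Hypothetical sample data: one customer seen in two systems.
crm = ("crm", [{"customer_id": 1, "name": "A. Customer"}])
social = ("social", [{"customer_id": 1, "sentiment": "positive"}])

profiles = integrate(crm, social)
print(profiles[1])
```

In practice the hard part is usually deciding that two records describe the same object at all; the sketch assumes a shared key already exists.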
4
Big Data Tasks
5
Machine-generated data (sensors); automatic creation and transfer *
Home appliances (security, energy consumption, heating, food, entertainment)
Monitoring/Control (cars, athletic equipment, machinery, appliances)*
Example: Smart power grid**
Smart meter; Internet & Wi-Fi connectivity
6
Technologies
Hadoop (framework for a file system and for processing large datasets on server clusters)*
Machine learning: automated construction of models to fit data (instead of hypothesis testing, as with data warehousing (DW) and analytics)
Open source
Notable developers: Yahoo!, Facebook, Google, Microsoft
Microsoft Azure-based Hadoop
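Hadoop's processing side is commonly described in terms of the MapReduce model, which the slide does not name. The following is a minimal single-machine sketch of that model in plain Python. It does not use any Hadoop API; the function names and the sample input are assumptions made for illustration, and each inner list merely stands in for a block of a file stored on a different cluster node.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(splits):
    """Run map, shuffle, and reduce over every split, one after another."""
    pairs = [
        pair
        for split in splits
        for record in split
        for pair in map_phase(record)
    ]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two splits, as if stored on two different nodes.
splits = [["big data big volume"], ["big variety", "data velocity"]]
counts = word_count(splits)
print(counts)
```

On a real cluster the map calls would run in parallel on the nodes that hold each split, and only the grouped intermediate pairs would travel over the network.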
7
DATA PROCESSING
8
A database for Big Data
Distributed, non-relational, scalable
Based on Google’s BigTable *

Row Key (reversed URL)   Time Stamp   Column Key: “Anchor” (Family) + URL part (Qualifier)
"com.cnn.www"            t9           anchor:cnnsi.com = "CNN"
"com.cnn.www"            t8           anchor:my.look.ca = "CNN.com"
Data are cites of “CNN*”; the column qualifiers are the referencing sites.

Row Key                  Time Stamp   Column Key: “Contents” + keyword in tagged content
"com.cnn.www"            t6           contents:html = " … "
"com.cnn.www"            t5           contents:html = " … "
"com.cnn.www"            t3           contents:html = " … "
Data are webpages, compressed.

There can be any number of unbounded Contents columns. All columns put together make a “BigTable”.
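The table layout above can be read as a map from (row key, column key, time stamp) to a value. The sketch below models that reading in plain Python. It is not the API of BigTable or of any database built on it: the `TinyTable` class, its method names, the integer time stamps standing in for t3 to t9, and the placeholder page contents are all assumptions made for illustration.

```python
def reverse_domain(host):
    """Turn 'www.cnn.com' into the reversed-URL row key 'com.cnn.www'."""
    return ".".join(reversed(host.split(".")))


class TinyTable:
    """Toy model: (row key, column key) -> {time stamp: value}."""

    def __init__(self):
        self._cells = {}

    def put(self, row, column, timestamp, value):
        """Store one version of a cell; older versions are kept."""
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version, or the newest one at or before timestamp."""
        versions = self._cells.get((row, column), {})
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(eligible)] if eligible else None

    def columns(self, row, family):
        """List the column keys of one family (e.g. 'anchor') present in a row."""
        prefix = family + ":"
        return sorted(
            c for (r, c) in self._cells if r == row and c.startswith(prefix)
        )


row = reverse_domain("www.cnn.com")
table = TinyTable()
table.put(row, "anchor:cnnsi.com", 9, "CNN")
table.put(row, "anchor:my.look.ca", 8, "CNN.com")
for t in (3, 5, 6):
    # Placeholder contents; the slide elides the real page text.
    table.put(row, "contents:html", t, f"page at t{t}")

print(table.get(row, "contents:html"))     # newest stored version
print(table.get(row, "contents:html", 5))  # version as of time stamp 5
print(table.columns(row, "anchor"))
```

Reversing the host name keeps pages from the same domain next to each other when rows are sorted by key, which is the reason the slide's row key reads "com.cnn.www".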
9
NoSQL: Not Only SQL
10
Modern Database Environments
11
Modern Database Environments