MIS 3500 Instructor: Bob Travica Trendy Database Topics 2016.


1 MIS 3500 Instructor: Bob Travica Trendy Database Topics 2016

2 Big Data
The 3 Vs:
• Volume: terabytes (10^12 bytes, 12 zeroes), petabytes (10^15 bytes, 15 zeroes)
• Variety: social media, communications, sensors everywhere*, Internet of Things, video feeds, GPS… Implication: various formats
• Velocity: continuous wired and wireless feeds
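The volume figures are easier to keep straight as powers of ten. A minimal sketch, assuming decimal (SI) units; the helper name `humanize_bytes` is illustrative and not from the slides:

```python
# Decimal (SI) byte units: each step up multiplies by 1,000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB"]

TERABYTE = 10**12  # 1 followed by 12 zeroes
PETABYTE = 10**15  # 1 followed by 15 zeroes


def humanize_bytes(n: int) -> str:
    """Render a byte count in the largest listed unit that keeps the value >= 1."""
    value = float(n)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1,000,
        # or at the last unit we know about.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(humanize_bytes(TERABYTE))      # 1 TB
print(humanize_bytes(2 * PETABYTE))  # 2 PB
```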

3 Big Data Goals and Uses
Goals:
• Integrate data on the same object across sources (Customer, Citizen, Patient...; spatial mashups*)
• Analysis: existing patterns (e.g., electrical energy consumption over time), predictive analysis (prediction of energy needs)
Application domains:
• Product & object monitoring in real time via sensors (Internet of Things, IoT)
• Marketing (sentiment analysis in social media; discovering investment opportunities: major US banks)
• Energy grid management (IoT)
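The first goal, integrating data on the same object across sources, can be sketched as a merge on a shared key. This is only an illustration: the three sources, the field names, and the `integrate` function are hypothetical, and real integration also has to handle mismatched identifiers and conflicting values.

```python
from collections import defaultdict

# Hypothetical records about the same customers held in three separate systems.
crm = [{"customer_id": 1, "name": "Ana"}, {"customer_id": 2, "name": "Raj"}]
web = [{"customer_id": 1, "last_visit": "2016-03-01"}]
support = [{"customer_id": 2, "open_tickets": 3}]


def integrate(*sources):
    """Merge records that share a customer_id into one profile per customer."""
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            # Later sources add fields to (or overwrite fields in) the profile.
            profiles[record["customer_id"]].update(record)
    return dict(profiles)


profiles = integrate(crm, web, support)
print(profiles[1])
print(profiles[2])
```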

4 Big Data Uses (cont’d)
• Transportation network management (queuing airplanes in air corridors in Brazil, optimizing the cargo railroad network in Germany)
• Operations/process optimization (UPS sensors in trucks, manufacturing)
• Strategy making (Google, Facebook, banks; emergent rather than planned strategy)
• Health (integration of customer data, tracking/analyzing patient vital signs & cancer cell behavior)
• Science (human genome analysis, 2 TB of data per person + gene interactions)
• Public safety/security (profiling outlaws)
• Policy analysis (United Nations’ system for predicting social problems)

5 Big Data Tasks
Querying unstructured data

6 Big Data Benefits & Costs
Benefits:
• Comprehensive informing on business objects (customers, patients…)
• Pattern discovery, predictive analysis (fraud detection)
• More effective decision making (Citigroup)
• Savings (e.g., UPS operations)
• Strategizing for innovation (Google)
Costs:
• Direct technology costs
• Truthfulness (“veracity”) of sources & findings
• Sense-making challenges (big & “small” data)
• Legality, ethics
• Implementation & fit with organization

7
• Machine-generated data (sensors); automatic creation and transfer*
• Home appliances (security, energy consumption, heating, food, entertainment)
• Monitoring/control (cars, athletic equipment, machinery, appliances)*
• Example: smart power grid**
[Figure: smart meter; Internet & Wi-Fi connectivity]

8 Technologies
• Hadoop (framework for file system and processing of large datasets on server clusters)*
• Machine learning: automated construction of models to fit data (instead of hypothesis testing as with DW and analytics)
• Open source
• Notable developers: Google, Facebook, Yahoo!, Microsoft
[Figure: Microsoft Azure-based Hadoop]
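"Automated construction of models to fit data" can be illustrated with the simplest possible case, a least-squares line. This is a sketch under stated assumptions: the hourly load figures are made up, `fit_line` is an illustrative name, and real machine-learning systems use far richer models and libraries.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up energy readings: hour of observation and load in megawatts.
hours = [1, 2, 3, 4, 5]
load_mw = [10.0, 12.0, 14.0, 16.0, 18.0]

# The model is derived from the data rather than hypothesized in advance.
slope, intercept = fit_line(hours, load_mw)

# Use the fitted model to predict the load for the next hour.
forecast = slope * 6 + intercept
print(slope, intercept, forecast)
```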

9 Data Processing
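The diagram on this slide is not part of the transcript. As one illustration of cluster-style processing, here is a single-process sketch of the MapReduce pattern that Hadoop implements, applied to a word count. The function names and sample documents are assumptions; a real Hadoop job distributes the map and reduce steps across many servers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


# Each document stands in for a data block stored on a different server.
documents = ["big data big", "data feeds"]

mapped = chain.from_iterable(map_phase(d) for d in documents)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```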

10
• A database for Big Data
• Distributed, non-relational, scalable
• Based on Google’s BigTable*

Row Key (reversed URL)   Time Stamp   Column Key: “Anchor” (Family) + URL part (Qualifier)
"com.cnn.www"            t9           anchor:cnnsi.com = "CNN"
"com.cnn.www"            t8           anchor:my.look.ca = "CNN.com"
DATA are citations of “CNN*” on referencing sites.

Row Key                  Time Stamp   Column Key: “Contents” + keyword in tagged content
"com.cnn.www"            t6           contents:html = " … "
"com.cnn.www"            t5           contents:html = " … "
"com.cnn.www"            t3           contents:html = " … "
DATA are compressed webpages. There can be any number of unbound Contents columns. All columns put together make a “BigTable”.
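The data model in the tables above can be sketched as a sparse map from (row key, column key, timestamp) to a value. This is a toy in-memory illustration, not the API of any real product: the `SparseTable` class, its methods, integer timestamps, and the page contents are all assumptions made for the example.

```python
class SparseTable:
    """Toy model: (row key, column key) -> {timestamp: value}."""

    def __init__(self):
        self._cells = {}

    def put(self, row_key, column_key, timestamp, value):
        """Store one version of a cell; absent cells take no space."""
        self._cells.setdefault((row_key, column_key), {})[timestamp] = value

    def get(self, row_key, column_key, timestamp=None):
        """Return the value at a timestamp, or the newest version by default."""
        versions = self._cells.get((row_key, column_key), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def columns(self, row_key, family):
        """List the column keys of one family that exist for a row."""
        prefix = family + ":"
        return sorted(c for (r, c) in self._cells
                      if r == row_key and c.startswith(prefix))


def row_key_for(hostname):
    """Reverse the hostname so pages of one domain sort next to each other."""
    return ".".join(reversed(hostname.split(".")))


table = SparseTable()
row = row_key_for("www.cnn.com")

# Anchor family: one column per referencing site (qualifier = that site).
table.put(row, "anchor:cnnsi.com", 9, "CNN")
table.put(row, "anchor:my.look.ca", 8, "CNN.com")

# Contents family: several timestamped versions of the same page.
# The page bodies are placeholders; the slide elides the real contents.
for ts in (3, 5, 6):
    table.put(row, "contents:html", ts, f"<html>version at t{ts}</html>")

print(row)
print(table.columns(row, "anchor"))
print(table.get(row, "contents:html"))
```

Reversing the hostname is the design choice worth noting: with keys such as "com.cnn.www" and "com.cnn.edition", rows from the same domain end up adjacent when keys are kept in sorted order.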

11 NoSQL – Not Only SQL*

12 Modern Database Systems

13 Modern Database Systems

14 Conclusion
• Modern database systems (DBS) still rely predominantly on relational DBS, while trying to integrate them with Big Data systems for unstructured and multi-type data, which are based on distributed storage and parallel processing.
• Ad hoc relationship discovery and predictive analytics are the major tasks and benefits.

