1
Big Data
2
Different Types of Analytics
Descriptive Analytics: Reporting what happened and analyzing the data to figure out why it happened.
Predictive Analytics: Using statistics and data mining techniques to make predictions about the future.
Prescriptive Analytics: Analytics that recommends actions.
Social Media Analytics: Analysis of public opinion (behavioral patterns, tastes, targeted marketing).
Entity Analytics: Analytics that groups or clusters data about entities (and learns from the raw data).
Cognitive Computing: Human-computer interaction aimed at information exchange.
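To make the difference between descriptive and predictive analytics concrete, here is a minimal Python sketch. The sales figures and the function names `describe` and `predict_next` are made up for illustration, and the straight-line fit is only one simple stand-in for the statistics and data mining techniques the slide mentions.

```python
from statistics import mean

# Hypothetical monthly sales figures (made-up data for illustration only).
sales = [100.0, 110.0, 120.0, 130.0]


def describe(values):
    """Descriptive analytics: summarize what already happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def predict_next(values):
    """Predictive analytics: fit a least-squares line and extrapolate one step."""
    n = len(values)
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    # Slope and intercept of the ordinary least-squares line y = a + b * x.
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = y_bar - slope * x_bar
    # The next observation would sit at index n.
    return intercept + slope * n


summary = describe(sales)
forecast = predict_next(sales)
print(summary)
print(forecast)
```

The descriptive step only reports on the past values, while the predictive step uses the same values to estimate one that has not been observed yet.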
3
Big Data
Big data are datasets whose size exceeds the typical reach of a DBMS to capture, store, manage, and analyze.
One way to characterize big data is by the 4 V's:
Volume
Velocity
Variety
Veracity
4
Big Volume
Volume: the size of the data managed by the system.
Automatically collected information often leads to huge amounts of data.
Examples:
Sensor data (environmental or manufacturing/processing)
Scanning equipment (card readers)
Industrial Internet of Things (heavily sensored manufacturing and processing, RFID)
Multimedia data (video, audio, everything else)
5
Velocity
Velocity: the speed at which data is created, accumulated, ingested, and processed.
Even if a database can handle the amount of data that needs to be stored, it must also be fast enough to process the information as quickly as needed.
Examples:
High-frequency stock trading
Detection of malicious activity in a call network
Real-time processing of trends on Facebook / Twitter
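High-velocity workloads such as trend detection usually cannot wait for a batch job; they keep a small running summary that is updated as each event arrives. The sketch below shows one common approach, a sliding-window counter. The class name `SlidingWindowCounter` and the timestamps are hypothetical, and the code assumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds` seconds.

    Assumes timestamps are recorded in non-decreasing order.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def _evict(self, now):
        # Drop events that have fallen out of the window ending at `now`.
        cutoff = now - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()

    def record(self, timestamp):
        self.timestamps.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        self._evict(now)
        return len(self.timestamps)


# Hypothetical event times, in seconds.
counter = SlidingWindowCounter(window_seconds=10)
for t in [1, 2, 3, 11, 12]:
    counter.record(t)

recent = counter.count(now=12)
print(recent)
```

Because old timestamps are evicted as new ones arrive, memory use is bounded by the number of events in one window, not by the total number of events seen.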
6
Variety
Big data includes structured, semi-structured, and unstructured data in different proportions depending on context.
Structured data features a formally structured data model, such as the relational model (rows and columns) or a hierarchical model (nested structures).
Unstructured data has no identifiable formal structure.
Examples:
In MongoDB, we used semi-structured, document-oriented data.
In Neo4j, we stored data as a graph.
Other unstructured data: emails, web content (blogs), PDFs, audio, video, images, clickstreams (cookie tracking).
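The shapes of data named on this slide can be compared side by side in plain Python. This is only an illustrative sketch: the records, field names, and the helpers `fields_used` and `neighbors` are invented here, and no MongoDB or Neo4j API is involved.

```python
# Structured: a fixed set of columns, and every row has the same shape.
columns = ("id", "name", "city")
rows = [
    (1, "Alice", "Springfield"),
    (2, "Bob", "Shelbyville"),
]

# Semi-structured: self-describing documents whose fields may differ
# from one document to the next (the style used by document stores).
documents = [
    {"id": 1, "name": "Alice", "languages": ["English", "French"]},
    {"id": 2, "name": "Bob", "employer": {"name": "Acme"}},
]

# Graph: nodes connected by labeled edges (source, label, destination).
edges = [(1, "KNOWS", 2)]


def fields_used(docs):
    """Collect every field name that appears in at least one document."""
    return sorted({key for doc in docs for key in doc})


def neighbors(node, edge_list):
    """Return the nodes reachable from `node` over one outgoing edge."""
    return [dst for src, _label, dst in edge_list if src == node]


all_fields = fields_used(documents)
alice_knows = neighbors(1, edges)
print(all_fields)
print(alice_knows)
```

Every row must fit the three declared columns, while the two documents share only some of their fields; that difference is what distinguishes structured from semi-structured data.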
7
Veracity
Veracity is composed of two components:
Credibility of the source
Suitability of the data for its target audience
Much of the data in big data stores has different levels of trustworthiness and must go through quality testing and credibility analysis before being used.
Many sources generate data that is uncertain, incomplete, or inaccurate. Databases holding such information need to be able to manage such questionable data.
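As a rough illustration of quality testing and credibility analysis, the sketch below separates incoming records into trusted and questionable groups. The source names, credibility scores, required fields, and the 0.5 threshold are all assumptions chosen for the example, not values from the slides.

```python
# Hypothetical credibility scores per source, between 0.0 and 1.0.
SOURCE_CREDIBILITY = {
    "calibrated_sensor": 0.9,
    "anonymous_post": 0.3,
}

# Hypothetical fields a record needs before it counts as complete.
REQUIRED_FIELDS = ("source", "value")


def is_complete(record):
    """Quality test: every required field is present and not None."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)


def credibility(record):
    """Credibility analysis: look up the source; unknown sources score 0.0."""
    return SOURCE_CREDIBILITY.get(record.get("source"), 0.0)


def triage(records, threshold=0.5):
    """Split records into (trusted, questionable) without discarding any."""
    trusted, questionable = [], []
    for record in records:
        if is_complete(record) and credibility(record) >= threshold:
            trusted.append(record)
        else:
            questionable.append(record)
    return trusted, questionable


records = [
    {"source": "calibrated_sensor", "value": 21.5},
    {"source": "anonymous_post", "value": 40.0},
    {"source": "calibrated_sensor", "value": None},
]
trusted, questionable = triage(records)
print(len(trusted), len(questionable))
```

Questionable records are kept in their own list instead of being dropped, in line with the point that a store must be able to manage such data and not merely reject it.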