Big Data Management and Analytics Introduction Spring 2015 Dr. Latifur Khan 1.

Slides:



Advertisements
Similar presentations
Nokia Technology Institute Natural Partner for Innovation.
Advertisements

CS 791z Graduate Topics on Software Engineering
Big Data and Predictive Analytics in Health Care Presented by: Mehadi Sayed President and CEO, Clinisys EMR Inc.
Big Data Workflows N AME : A SHOK P ADMARAJU C OURSE : T OPICS ON S OFTWARE E NGINEERING I NSTRUCTOR : D R. S ERGIU D ASCALU.
Architecting for the Internet of Things
Hadoop tutorials. Todays agenda Hadoop Introduction and Architecture Hadoop Distributed File System MapReduce Spark 2.
Data Mining – Intro.
Brenda Woods John Williams Daniel Bailey Breia Stamper.
CS525: Special Topics in DBs Large-Scale Data Management
Evolution in Coming 10 Years: What's the Future of Network? - Evolution in Coming 10 Years: What's the Future of Network? - Big Data- Big Changes in the.
Introduction to Data Science Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu Computer Science and Mathematical Sciences College of Engineering Tennessee.
Big Data. What is Big Data? Analog starage vs digital. The FOUR V’s of Big Data. Who’s Generating Big Data The importance of Big Data. Optimalization.
Copyright © 2014 Pearson Education, Inc. 1 It's what you learn after you know it all that counts. John Wooden Key Terms and Review (Chapter 6) Enhancing.
Basic Marketing Research Customer Insights and Managerial Action
This presentation was scheduled to be delivered by Brian Mitchell, Lead Architect, Microsoft Big Data COE Follow him Contact him.
1 © Goharian & Grossman 2003 Introduction to Data Mining (CS 422) Fall 2010.
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
Last Words COSC Big Data (frameworks and environments to analyze big datasets) has become a hot topic; it is a mixture of data analysis, data mining,
Big Data. What is Big Data? Big Data Analytics: 11 Case Histories and Success Stories
CIS 9002 Kannan Mohan Department of CIS Zicklin School of Business, Baruch College.
Charles Tappert Seidenberg School of CSIS, Pace University
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
The Future of the iPlant Cyberinfrastructure: Coming Attractions.
Data Science and Big Data Analytics Chap1: Intro to Big Data Analytics
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
Big Data Analytics Large-Scale Data Management Big Data Analytics Data Science and Analytics How to manage very large amounts of data and extract value.
How Companies are Using Spark And where the Edge in Big Data will be Matei Zaharia.
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
© 2007 IBM Corporation IBM Information Management Accelerate information on demand with dynamic warehousing April 2007.
Internet Studies. Faculty Members The specialty has now 2 faculty members Prof. Ronen Feldman: Text Mining, Data Mining, Social Media Analysis, Information.
Business Intelligence Transparencies 1. ©Pearson Education 2009 Objectives What business intelligence (BI) represents. The technologies associated with.
Big Data, Learning Analytics and Education Aleksanda Klašnja-Milićević Mirjana Ivanović.
Big Data – Big Opportunity Mohammad Khansari ITRC President Jan 2015 ITRC, Tehran, Iran.
Advanced Database Concepts
SUPPLY CHAIN OF BIG DATA. WHAT IS BIG DATA?  A lot of data  Too much data for traditional methods  The 3Vs  Volume  Velocity  Variety.
What we know or see What’s actually there Wikipedia : In information technology, big data is a collection of data sets so large and complex that it.
IoT Meets Big Data Standardization Considerations
Big Data Analytics Platforms. Our Team NameApplication Viborov MichaelApache Spark Bordeynik YanivApache Storm Abu Jabal FerasHPCC Oun JosephGoogle BigQuery.
BUSINESS INTELLIGENCE & ADVANCED ANALYTICS DISCOVER | PLAN | EXECUTE JANUARY 14, 2016.
Big Data Yuan Xue CS 292 Special topics on.
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 11: BIG DATA AND.
BIG DATA. The information and the ability to store, analyze, and predict based on that information that is delivering a competitive advantage.
Big Data Javad Azimi May First of All… Sorry about the language  Feel free to ask any question Please share similar experiences.
BIG DATA. Big Data: A definition Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database.
BIG DATA BIGDATA, collection of large and complex data sets difficult to process using on-hand database tools.
Microsoft Ignite /28/2017 6:07 PM
Leverage Big Data With Hadoop Analytics Presentation by Ravi Namboori Visit
Big Data-An Analysis. Big Data: A definition Big data is a collection of data sets so large and complex that it becomes difficult.
Data Analytics (CS40003) Introduction to Data Lecture #1
CNIT131 Internet Basics & Beginning HTML
Data Analytics 1 - THE HISTORY AND CONCEPTS OF DATA ANALYTICS
Big Data Analytics on Large Scale Shared Storage System
Sushant Ahuja, Cassio Cristovao, Sameep Mohta
ANOMALY DETECTION FRAMEWORK FOR BIG DATA
Tutorial: Big Data Algorithms and Applications Under Hadoop
BIG DATA IN ENGINEERING APPLICATIONS
Chapter 14 Big Data Analytics and NoSQL
Dr. Chengwei Lei CEECS California State University, Bakersfield
Big-Data Fundamentals
Mohammad J. Mansourzadeh
Database Design (constructed by Hanh Pham based on slides from books and websites listed under reference section) Big Data.
Database Vs. Data Warehouse
Data Warehousing and Data Mining
INNOvation in TRAINING BUSINESS ANALYSTS HAO HElEN Zhang UniVERSITY of ARIZONA
Lecture 1: Introduction
Big Data Analysis in Digital Marketing
Big DATA.
Data Analysis and R : Technology & Opportunity
UNIT 6 RECENT TRENDS.
Big Data.
Presentation transcript:

Big Data Management and Analytics Introduction Spring 2015 Dr. Latifur Khan 1

Introduction to Big Data What is Big Data? What makes data, “Big” Data? 2

Big Data Definition No single standard definition… “Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it… 3

Characteristics of Big Data: 1-Scale (Volume) Data Volume 44x increase from From 0.8 zettabytes to 35zb Data volume is increasing exponentially 4 Exponential increase in collected/generated data

Characteristics of Big Data: 2-Complexity (Variety) Various formats, types, and structures Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc… Static data vs. streaming data A single application can be generating/collecting many types of data 5

Characteristics of Big Data: 3-Speed (Velocity) Data is begin generated fast and need to be processed fast Online Data Analytics Late decisions  missing opportunities Examples E-Promotions: Based on your current location, your purchase history, what you like  send promotions right now for store next to you Healthcare monitoring: sensors monitoring your activities and body  any abnormal measurements require immediate reaction 6

Big Data: 3V’s 7

Some Make it 4V’s 8

Harnessing Big Data OLTP: Online Transaction Processing (DBMSs) OLAP: Online Analytical Processing (Data Warehousing) RTAP: Real-Time Analytics Processing (Big Data Architecture & technology) 9

Who’s Generating Big Data Social media and networks (all of us are generating data) Scientific instruments (collecting all sorts of data) Mobile devices (tracking all objects all the time) Sensor technology and networks (measuring all kinds of data) The progress and innovation is no longer hindered by the ability to collect data But, by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion 10

The Model Has Changed… The Model of Generating/Consuming Data has Changed Old Model: Few companies are generating data, all others are consuming data New Model: all of us are generating data, and all of us are consuming data 11

What’s driving Big Data - Ad-hoc querying and reporting - Data mining techniques - Structured data, typical sources - Small to mid-size datasets - Optimizations and predictive analytics - Complex statistical analysis - All types of data, and many sources - Very large datasets - More of a real-time 12

Value of Big Data Analytics Big data is more real-time in nature than traditional DW applications Traditional DW architectures (e.g. Exadata, Teradata) are not well- suited for big data apps Shared nothing, massively parallel processing, scale out architectures are well-suited for big data apps 13

Challenges in Handling Big Data The Bottleneck is in technology New architecture, algorithms, techniques are needed Also in technical skills Experts in using the new technology and dealing with big data 14

What Technology Do We Have For Big Data ?? 15

16

Big Data Technology 17

What You Will Learn… We focus on Hadoop/MapReduce technology, NoSQL (key- Value), BigTable, Cassandra, SPARK, Stream… Learn the platform (how it is designed and works) How big data are managed in a scalable, efficient way Learn writing Hadoop jobs/SPARK in different languages Programming Languages: Java, C, Python, Scala High-Level Languages: Apache Pig, Hive, CQL Learn advanced analytics tools on top of Hadoop Mahout: Data mining and machine learning tools over big data Applications: Recommendations Graph Processing: Pregel (Giraf) 18