Big Data. What is Big Data? Analog storage vs. digital. The FOUR V’s of Big Data. Who’s Generating Big Data. The importance of Big Data. Optimization.

Presentation transcript:

Big Data

What is Big Data? Analog storage vs. digital. The FOUR V’s of Big Data. Who’s Generating Big Data. The importance of Big Data. Optimization. HDFS.

Definition. Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.

The FOUR V’s of Big Data. From traffic patterns and music downloads to web history and medical records, data is recorded, stored, and analyzed to enable the technology and services that the world relies on every day. But what exactly is big data? According to IBM scientists, big data can be broken down into four dimensions: Volume, Velocity, Variety and Veracity.

Volume. Many factors contribute to the increase in data volume: transaction-based data stored through the years, unstructured data streaming in from social media, and increasing amounts of sensor and machine-to-machine data being collected. In the past, excessive data volume was a storage issue. But with decreasing storage costs, other issues emerge, including how to determine relevance within large data volumes and how to use analytics to create value from relevant data.

Variety. Data today comes in all types of formats: structured, numeric data in traditional databases; information created from line-of-business applications; and unstructured text documents, email, video, audio, stock ticker data and financial transactions. Managing, merging and governing different varieties of data is something many organizations still grapple with.

Velocity. Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.

Veracity. Big data veracity refers to the biases, noise and abnormality in data. Is the data that is being stored and mined meaningful to the problem being analyzed? Inderpal feels that veracity in data analysis is the biggest challenge when compared to things like volume and velocity. In scoping out your big data strategy, you need to have your team and partners work to help keep your data clean, and you need processes to keep ‘dirty data’ from accumulating in your systems.

Who’s Generating Big Data. Social media and networks (all of us are generating data); scientific instruments (collecting all sorts of data); mobile devices (tracking all objects all the time); sensor technology and networks (measuring all kinds of data). Progress and innovation are no longer hindered by the ability to collect data, but by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion.

The importance of Big Data. The real issue is not that you are acquiring large amounts of data. It's what you do with the data that counts. The hopeful vision is that organizations will be able to take data from any source, harness relevant data and analyze it to find answers that enable cost reductions, time reductions, new product development and optimized offerings, and smarter business decision making.

The importance of Big Data For instance, by combining big data and high-powered analytics, it is possible to: Determine root causes of failures, issues and defects in near-real time, potentially saving billions of dollars annually. Optimize routes for many thousands of package delivery vehicles while they are on the road. Analyze millions of SKUs to determine prices that maximize profit and clear inventory. Generate retail coupons at the point of sale based on the customer's current and past purchases. Send tailored recommendations to mobile devices while customers are in the right area to take advantage of offers. Recalculate entire risk portfolios in minutes. Quickly identify customers who matter the most. Use clickstream analysis and data mining to detect fraudulent behavior

HDFS / Hadoop. Data in an HDFS cluster is broken down into smaller pieces (called blocks) and distributed throughout the cluster. In this way, the map and reduce functions can be executed on smaller subsets of your larger data sets, and this provides the scalability that is needed for big data processing. The goal of Hadoop is to use commonly available servers in a very large cluster, where each server has a set of inexpensive internal disk drives.
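
The idea of running map and reduce over blocks can be illustrated with a small, single-machine sketch. This is not Hadoop code: the function names (split_into_blocks, map_block, reduce_counts, word_count), the tiny block size, and the sample lines are all assumptions made for illustration, and the "blocks" here are just slices of a Python list rather than real HDFS blocks.

```python
from collections import Counter
from functools import reduce


def split_into_blocks(records, block_size):
    """Split a list of records into fixed-size chunks (a stand-in for HDFS blocks)."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def map_block(block):
    """Map step: count the words inside one block only."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(records, block_size=2):
    blocks = split_into_blocks(records, block_size)
    # In a real cluster each block would be processed on a node that stores it;
    # here the map step simply runs in a loop on one machine.
    partials = [map_block(block) for block in blocks]
    return reduce(reduce_counts, partials, Counter())


# Hypothetical sample data.
lines = ["big data is big", "data is stored in blocks", "blocks are distributed"]
totals = word_count(lines)
```

Because each block is counted independently and the partial counts are merged afterwards, the result does not depend on how the data was split, which is what lets the work be spread over many servers.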

PROS OF HDFS Scalable – New nodes can be added as needed, and added without needing to change data formats, how data is loaded, how jobs are written, or the applications on top. Cost effective – Hadoop brings massively parallel computing to commodity servers. The result is a sizeable decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data. Flexible – Hadoop is schema-less, and can absorb any type of data, structured or not, from any number of sources. Data from multiple sources can be joined and aggregated in arbitrary ways enabling deeper analyses than any one system can provide. Fault tolerant – When you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.
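
The fault-tolerance point can be sketched as follows, under stated assumptions: every block is stored on several nodes, and when one node is lost the work is redirected to another node that holds a copy. The round-robin placement, the node and block names, and the functions place_replicas and pick_node are hypothetical; the real HDFS placement policy is more elaborate than this.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (an assumed policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    placement = {}
    for i, block_id in enumerate(block_ids):
        placement[block_id] = [nodes[(i + k) % len(nodes)] for k in range(replication)]
    return placement


def pick_node(block_id, placement, failed):
    """Return the first live node that holds the block, or None if every copy is lost."""
    for node in placement[block_id]:
        if node not in failed:
            return node
    return None


# Hypothetical cluster with four nodes and two blocks.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_replicas(["blk-0", "blk-1"], nodes, replication=3)

before = pick_node("blk-0", placement, failed=set())       # all nodes healthy
after = pick_node("blk-0", placement, failed={"node-a"})   # node-a is lost
```

With three copies of each block, losing a single node still leaves two locations from which the same data can be read, so processing can continue.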

Sources: McKinsey Global Institute, Cisco, Gartner, EMC, SAS, IBM, MEPTEC

Thank you for your attention. Authors: Tomasz Wis, Krzysztof Rudnicki