CNIT131 Internet Basics & Beginning HTML


CNIT131 Internet Basics & Beginning HTML Week 15 – Big Data http://fog.ccsf.edu/~hyip

What is Big Data?
Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications. (From Wikipedia)
Big Data refers to extremely vast amounts of multi-structured data that typically has been cost prohibitive to store and analyze. (My view)
NOTE: However, big data refers only to digital data, not the paper files stored in the basement at FBI headquarters or the piles of magnetic tapes in our data center.

Big Data – a Brief History

Types of Big Data
In the simplest terms, Big Data can be broken down into two basic types, structured and unstructured data, with semi-structured data falling between them.
Structured – data with a predefined data type
  Spreadsheets and Oracle relational databases
Unstructured – data that has no predefined data model or is not organized in a predefined manner
  Video, audio, images, metadata, etc.
Semi-structured – structured data embedded with some unstructured data
  Email, text messaging
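As a rough illustration of the three categories above, the following Python sketch handles one item of each kind. The sample rows, the email text, and the byte values are invented for the example, and only the standard library is used.

```python
import csv
import io
from email import message_from_string

# Structured: every row has the same predefined fields (hypothetical sample data).
table = io.StringIO("customer_id,amount\n1001,25.50\n1002,13.00\n")
rows = list(csv.DictReader(table))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: an email has structured headers plus a free-text body.
raw_email = (
    "From: alice@example.com\n"
    "Subject: Order question\n"
    "\n"
    "Hi, my package has not arrived yet. Can you check?\n"
)
msg = message_from_string(raw_email)
subject = msg["Subject"]   # structured part: a named header field
body = msg.get_payload()   # unstructured part: free text

# Unstructured: raw bytes with no predefined fields
# (a stand-in for image, audio, or video data).
blob = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0x01])
blob_size = len(blob)

print(total, subject, blob_size)
```

The structured rows can be summed directly because their fields are known in advance. The email body and the byte blob would need further processing, such as text analysis or decoding, before any similar calculation is possible.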

A Big Data Platform must:
Analyze a variety of different types of information
  This information by itself could be unrelated, but when paired with other information it can illustrate a causation for various events that the business can take advantage of.
Analyze information in motion
  Various data types will be streaming and contain a large number of "bursts". Ad hoc analysis needs to be done on the streaming data to search for relevant events.
Cover extremely large volumes of data
  Due to the proliferation of devices in the network and how they are used, along with customer interactions on smartphones and the web, a cost-efficient process to analyze the petabytes of information is required.

A Big Data Platform must: (2)
Cover varying types of data sources
  Data can be streaming, batch, structured, unstructured, and semi-structured, depending on the information type, where it comes from, and its primary use. A Big Data platform must be able to accommodate all of these various types of data on a very large scale.
Analytics
  A Big Data platform must provide the mechanisms to allow ad hoc queries, data discovery, and experimentation on the large data sets, in order to effectively correlate various events and data types and to get an understanding of the data that is useful and addresses business needs.
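The requirement to analyze information in motion and search a stream for relevant events can be sketched with a sliding window. This is a minimal Python illustration, not a production stream processor. The function name `find_bursts`, the window length, the threshold, and the sample timestamps are all assumptions made for the example, and the timestamps are assumed to arrive in increasing order.

```python
from collections import deque

def find_bursts(timestamps, window_seconds=10, threshold=3):
    """Yield each timestamp at which at least `threshold` events fall
    inside the trailing window. Each event is examined exactly once."""
    window = deque()
    for t in timestamps:
        window.append(t)
        # Drop events that have fallen out of the trailing window.
        while window[0] <= t - window_seconds:
            window.popleft()
        if len(window) >= threshold:
            yield t

# Hypothetical event arrival times, in seconds.
events = [1, 2, 30, 31, 32, 33, 70]
bursts = list(find_bursts(events))
print(bursts)
```

Only the events inside the current window are held in memory, so the stream never has to be stored in full before it is analyzed.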

Five Characteristics of Big Data
Big Data is defined by five characteristics:
Volume: Data created by and moving through today’s services may describe tens of millions of customers, hundreds of millions of devices, and billions of transactions or statistical records. Such scale requires careful engineering, as it is necessary to conserve even the number of CPU instructions, operating system events, and network messages per data item. Parallel processing is a powerful tool to cope with scale. MapReduce computing frameworks like Hadoop and storage systems like HBase and Cassandra provide low-cost, practical system foundations. Analysis also requires efficient algorithms, because “data in flight” may only be observed one time, so conventional storage-based approaches may not work. Large volumes of data may require a mix of “move the data to the processing” and “move the processing to the data” architectural styles.
Velocity: Timeliness is often critical to the value of Big Data. For example, online customers may expect promotions (coupons) received on a mobile device to reflect their current location, or they may expect recommendations to reflect their most recent purchases or the media they accessed. The business value of some data decays rapidly. Because raw data is often delivered in streams, or in small batches in near real-time, the requirement to deliver rapid results can be demanding and does not mesh well with conventional data warehouse technology.

Five Characteristics of Big Data (2)
Variety: Big Data often means integrating and processing multiple types of data. We can consider most data sources as structured, semi-structured, or unstructured. Structured data refers to records with fixed fields and types. Unstructured data includes text, speech, and other multimedia. Semi-structured data may be a mixture of the above, such as web documents, or sparse records with many variants, such as personal medical records with well-defined but complex types.
Veracity: Data sources (even in the same domain) are of widely differing qualities, with significant differences in the coverage, accuracy, and timeliness of the data provided. Per IBM's Big Data website, one in three business leaders do not trust the information they use to make decisions. Establishing trust in big data presents a huge challenge as the variety and number of sources grow.
Variability: Beyond the immediate implications of having many types of data, the variety of data may also be reflected in the frequency with which new data sources and types are introduced.
NOTE: Big Data generally includes data sets with sizes beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target, from hundreds of terabytes to many petabytes of data in a single data set. Because of this difficulty, a new set of tools has arisen to make sense of these large quantities of data. Big data is difficult to work with using relational databases and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers".
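The Volume characteristic mentions MapReduce frameworks such as Hadoop. The Python sketch below shows only the general map, shuffle, and reduce idea on a single machine. It is not Hadoop's API, the function names and input text are made up for the illustration, and threads stand in for the separate servers of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical input, already split into chunks the way a
# distributed file system would split a large file.
chunks = ["big data is big", "data in motion", "big volumes of data"]

# Each chunk is mapped independently, so the work can run in parallel.
with ThreadPoolExecutor() as pool:
    mapped = list(pool.map(map_phase, chunks))

counts = reduce_phase(shuffle(mapped))
print(counts["big"], counts["data"])
```

Because each chunk is mapped without reference to any other chunk, the map step can be spread across as many workers as there are chunks, which is the property that lets this style of processing scale out to many servers.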

References
Discovering the Internet: Complete, Jennifer Campbell, Course Technology, Cengage Learning, 5th Edition, 2015, ISBN 978-1-285-84540-1.
Basics of Web Design: HTML5 & CSS3, Second Edition, Terry Felke-Morris, Pearson, ISBN 978-0-13-312891-8.
A Very Short History of Big Data.