Published by Willa Mills. Modified over 8 years ago.
1
Computing & Information Sciences, Kansas State University
An Overview of Big Data Analytics: Challenges & Selected Applications
Guest Seminar, Drake University Ed Tech
Drake University Educational Technology Guest Seminar
An Overview of Big Data Analytics: Challenges and Selected Applications
Saturday, 20 September 2013
William H. Hsu, Department of Computing and Information Sciences, KSU
Speaker home page: http://www.cis.ksu.edu/~bhsu
Research group page of speaker: http://www.kddresearch.org
Twitter channel: http://twitter.com/kstate_bigdata
Reference Material: “How Big Data is Different”, MIT Sloan Management Review, 30 July 2012, T. H. Davenport
2
Outline
What Is Big Data?
  Volume, rate of collection: terabytes (trillions of bytes) to petabytes
  Applications and tools: analytics, trends, predictions, insights
Key Kinds of Data and How They Are Different
  Three V’s: Volume, Velocity, Variety
  Sources of uncertainty
  Spatial data and time series
  Social data
What to Use Big Data For: Applications
  Analytics
  Data mining
  Information visualization
  Decision support
Where to Get Big Data Tools
3
What Is Big Data? Moving Parts (Tools, Methods, Goals)
Figure © 2011, D. Hinchcliffe http://bit.ly/what-is-big-data-zdnet
4
Characterizing Big Data: Three V’s – Volume, Velocity, Variety
Adapted from slide © 2012, M. Grobelnik http://bit.ly/bigdata-tutorial-grobelnik
5
How Big Is Big? Data at Scale in 2014
Figure: Big Data (2014)
6
How Much Is There? A Growing Torrent (2012)
Adapted from slide © 2012, M. Grobelnik http://bit.ly/bigdata-tutorial-grobelnik
7
Sources: Web & Mobile Content, Messaging, Search, Social Media
© 2013, Hewlett-Packard http://bit.ly/what-is-big-data-hp
8
Analytics: Scientific, Data, & Info Visualization
© 2011 – 2012 TextMap.org
9
Predictive Analytics (Google Trends)
© 2012, Google Trends http://bit.ly/trend-analysis-example
10
Information Retrieval (IR): Topic Modeling – Basic Task
Elshamy (2012) http://hdl.handle.net/2097/15176
11
Text Mining & Natural Language Processing (NLP)
12
Data Mining: Sentiment Analysis & Crowdsourcing
http://twitrratr.com/search/EuroHCIR © 2012 Twitrratr
13
Information Visualization: Spatiotemporal Thematic Mapping
http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.
14
Civic Application for Social Good: Mapping Meth Labs in Kansas
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)
15
Applied Data Science: Question Answering (QA) & Informatics
© 2011 National Broadcasting Company http://youtu.be/TFe2pJETNuw
Reflexive Reasoning: Knowledge Quality Human-Comprehensible Explanations Not Always Guaranteed
Deep Blue (1997)
Blue Gene (1999 - present)
16
What Are the Issues? “Four Vs”
Figure © 2011, IBM http://bit.ly/big-data-howto-ibm
17
How Does It Work? Big Data Landscape (Forbes, 2012)
Adapted from figure © 2011, D. Feinleib http://bit.ly/what-is-big-data-forbes
18
MapReduce in a Nutshell
Figure labels: Task 1, Task 2, Task 3, Output data, Aggregated Result
© Sven Schlarb
19
MapReduce (MR): Heart of Big Data Processing
Map
  Accepts input key/value pair
  Emits intermediate key/value pair
Reduce
  Accepts intermediate key/value* pair
  Emits output key/value pair
Figure labels: Very Big Data, MAP, Partitioning Function, REDUCE, Result
Adapted from slide © 2006, H. Setiawan, National University of Singapore http://bit.ly/mapreduce-intro-setiawan
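The Map, Partitioning Function, and Reduce roles named on this slide can be sketched in a few lines of single-process Python. This is only an illustration, not the implementation behind the slide: the word-count task, the function names (map_fn, partition, reduce_fn, map_reduce), and the sample documents are all assumptions, and "value*" is read here as "a list of values for one key".

```python
from collections import defaultdict

def map_fn(key, value):
    """Map: accept one input (key, value) pair and emit intermediate pairs."""
    # key is a document name (unused here); value is the document text.
    for word in value.lower().split():
        yield word, 1

def partition(key, num_reducers):
    """Partitioning function: choose which reducer receives a given key."""
    # Summing character codes keeps the assignment deterministic across runs.
    return sum(ord(ch) for ch in key) % num_reducers

def reduce_fn(key, values):
    """Reduce: accept one intermediate key with all its values, emit one output pair."""
    return key, sum(values)

def map_reduce(inputs, num_reducers=2):
    # One bucket per (simulated) reducer; each bucket groups values by key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            buckets[partition(out_key, num_reducers)][out_key].append(out_value)
    result = {}
    for bucket in buckets:
        for key, values in bucket.items():
            out_key, out_value = reduce_fn(key, values)
            result[out_key] = out_value
    return result

# Hypothetical input: (document name, text) pairs.
docs = [("doc1", "big data is big"), ("doc2", "data at rest")]
counts = map_reduce(docs)
print(counts)
```

In a real deployment the buckets would live on different machines; here they are plain in-memory dictionaries so the data flow stays visible.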
20
What Is MapReduce?
A programming model (& its associated implementation)
  For processing large data sets
  Exploits a large set of commodity computers
  Executes processes in a distributed manner
  Offers a high degree of transparency
In other words: simple and maybe suitable for your tasks
© 2006, H. Setiawan, National University of Singapore http://bit.ly/mapreduce-intro-setiawan
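The "transparency" point above can be illustrated with a small sketch: the caller writes only the per-chunk function, and a worker pool decides where and when each chunk runs. A local thread pool stands in for a cluster of commodity machines here, which is an assumption made for the sake of a runnable example; the names count_records, split, and run_job are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def count_records(chunk):
    """Work done independently by one worker: count the non-empty records."""
    return sum(1 for record in chunk if record.strip())

def split(records, num_chunks):
    """Divide the input into roughly equal chunks, one per task."""
    size = max(1, -(-len(records) // num_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def run_job(records, num_workers=3):
    chunks = split(records, num_workers)
    # The caller supplies only count_records; the pool handles scheduling.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_records, chunks))
    # Aggregate the partial outputs into a single result.
    return sum(partial_counts)

# Hypothetical input records; blank entries are ignored.
records = ["a", "b", "", "c", " ", "d", "e"]
total = run_job(records)
print(total)
```

Swapping the thread pool for a distributed scheduler would not change count_records at all, which is the sense in which the model hides distribution from the programmer.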
21
Bring Your Own Data (BYOD)
How to BYOD
  Have and understand your application domain
  Consider task (integration, transformation, modeling) before scale!
  Review challenges of big data: http://KDNuggets.com
  Consult data scientists and purveyors of data
Three States of Data: In Use, In Motion, At Rest
Data Acquisition in General: Crawling, Crowdsourcing
Figure: Data at Rest © 2011, Wikipedia http://bit.ly/wikipedia-data-at-rest
22
Data Transformation
Multiple Data Sources: Customer Relationship Management (CRM), Finance, Sales, Human Resources, Enterprise Resource Planning (ERP)
ETL System
  Extract: Import Process
    Relational queries
    Automation
  Transform: Prepare Data
    Data cleaning
    Lookup tasks
  Load: Consolidation
    Aggregation
    Staging
Metadata, Summary Data, Processed Data
Client Applications: Analytics, Data Mining, Visualization
Data Mart(s), Data Warehouse
Adapted from figure © 2012 B. Silva, The Omega Group http://bit.ly/etl-omega-group
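The Extract, Transform, and Load stages listed on this slide can be sketched with the Python standard library. Everything concrete below is an assumption for illustration: the two in-memory source lists stand in for CRM and sales exports, an in-memory SQLite database stands in for the warehouse, and the function and column names are hypothetical.

```python
import sqlite3

# Hypothetical source records, standing in for CRM and sales exports.
crm_rows = [
    {"customer": " Alice ", "region": "ks"},
    {"customer": "Bob", "region": "IA"},
]
sales_rows = [
    {"customer": "alice", "amount": "120.50"},
    {"customer": "Bob", "amount": "80"},
    {"customer": "alice", "amount": "30"},
    {"customer": "Carol", "amount": "n/a"},  # dirty record, dropped in cleaning
]

def extract():
    """Extract: pull raw rows from each source system."""
    return crm_rows, sales_rows

def transform(crm, sales):
    """Transform: clean values and look up each sale's region."""
    region_of = {r["customer"].strip().lower(): r["region"].upper() for r in crm}
    cleaned = []
    for row in sales:
        name = row["customer"].strip().lower()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # data cleaning: skip unparseable amounts
        if name in region_of:  # lookup task: keep only known customers
            cleaned.append((name, region_of[name], amount))
    return cleaned

def load(rows):
    """Load: stage processed rows, then aggregate them into summary data."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (customer TEXT, region TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    summary = db.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    db.close()
    return summary

summary = load(transform(*extract()))
print(summary)
```

The per-region totals play the role of "Summary Data" in the figure; a client application for analytics or visualization would read from that table rather than from the raw sources.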
23
Summary
“What Is Big Data?” – Three V’s, Current Scale (Tera to Peta)
Building Blocks of Data Science
  Model-building: statistics & machine learning
  Data mining & knowledge discovery in databases (KDD)
  Visualization
  Working with unstructured data, natural language (human language)
  Data transformations: integration & warehousing
Data Science Application Areas
  Analytics: trends (topics, social patterns, sentiment, etc.)
  Information retrieval: search, structured queries, QA
  User modeling, personalization, adaptation; recommender systems
Aspects and Applications of Analytics
  Types: text analytics, predictive analytics, link mining
  Decision support: social good, differentiated instruction, etc.
24
Terminology
Big Data – Data at Scale (Currently Terabytes to Petabytes)
Building Blocks
  Visualization – displaying data, info, processes for understanding
  Data mining – analyzing data to build novel, valid, useful models
  Analytics – deriving insights from data (data mining, visualization)
Three Vs: Massive, High-Bandwidth, Heterogeneous Data Sets/Streams
  Volume – amount of data to be processed, stored
  Velocity – rate of arrival of data relative to real-time requirements
  Variety – differences in data type, format, organization
Veracity – Data Quality, Reliability (Quantitative and Qualitative Aspects)
Sources of Data
  Crawling: traversing, scraping sites using a web crawler program
  Crowdsourcing: polls & other online voting, rating systems
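A rough back-of-the-envelope calculation helps make the Volume and Velocity definitions concrete. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a hypothetical single machine that scans 100 MB per second; both figures are illustrative assumptions, not numbers from the seminar.

```python
TERABYTE = 10**12  # bytes, decimal convention (assumption)
PETABYTE = 10**15

def hours_to_process(volume_bytes, throughput_bytes_per_sec):
    """Volume: how long one pass over the data takes at a given throughput."""
    return volume_bytes / throughput_bytes_per_sec / 3600

def keeps_up(arrival_bytes_per_sec, throughput_bytes_per_sec):
    """Velocity: can processing match the rate at which data arrives?"""
    return throughput_bytes_per_sec >= arrival_bytes_per_sec

one_machine = 100 * 10**6  # hypothetical 100 MB/s scan rate

hours_tb = hours_to_process(TERABYTE, one_machine)
hours_pb = hours_to_process(PETABYTE, one_machine)
print(f"1 TB: {hours_tb:.1f} hours; 1 PB: {hours_pb / 24:.0f} days")
print(keeps_up(arrival_bytes_per_sec=250 * 10**6,
               throughput_bytes_per_sec=one_machine))
```

Under these assumptions a terabyte is an afternoon's work for one machine while a petabyte takes months, which is one way to see why processing at that scale is spread across many machines.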