
1 Computing & Information Sciences Kansas State University An Overview of Big Data Analytics: Challenges & Selected Applications Guest Seminar Drake University Ed Tech
Drake University Educational Technology Guest Seminar
An Overview of Big Data Analytics: Challenges and Selected Applications
Saturday, 20 September 2013
William H. Hsu, Department of Computing and Information Sciences, KSU
Speaker home page: http://www.cis.ksu.edu/~bhsu
Research group page of speaker: http://www.kddresearch.org
Twitter channel: http://twitter.com/kstate_bigdata
Reference Material: “How Big Data is Different”, MIT Sloan Management Review, 30 July 2012, T. H. Davenport

2 Outline
What Is Big Data?
- Volume, rate of collection: terabytes (trillions of bytes) to petabytes
- Applications and tools: analytics, trends, predictions, insights
Key Kinds of Data and How They Are Different
- Three V’s: Volume, Velocity, Variety
- Sources of uncertainty
- Spatial data and time series
- Social data
What to Use Big Data For: Applications
- Analytics
- Data mining
- Information visualization
- Decision support
Where to Get Big Data Tools

3 What Is Big Data?: Moving Parts (Tools, Methods, Goals) Figure © 2011, D. Hinchcliffe http://bit.ly/what-is-big-data-zdnet

4 Characterizing Big Data: Three V’s – Volume, Velocity, Variety Adapted from slide © 2012, M. Grobelnik http://bit.ly/bigdata-tutorial-grobelnik

5 How Big Is Big? Data at Scale in 2014 Big Data (2014)

6 How Much Is There? A Growing Torrent (2012) Adapted from slide © 2012, M. Grobelnik http://bit.ly/bigdata-tutorial-grobelnik

7 Sources: Web & Mobile Content, Messaging, Search, Social Media © 2013, Hewlett-Packard http://bit.ly/what-is-big-data-hp

8 Analytics: Scientific, Data, & Info Visualization © 2011 – 2012 TextMap.org

9 Predictive Analytics (Google Trends) © 2012, Google Trends http://bit.ly/trend-analysis-example

10 Information Retrieval (IR): Topic Modeling – Basic Task Elshamy (2012) http://hdl.handle.net/2097/15176

11 Text Mining & Natural Language Processing (NLP)

12 Data Mining: Sentiment Analysis & Crowdsourcing http://twitrratr.com/search/EuroHCIR © 2012 Twitrratr

13 Information Visualization: Spatiotemporal Thematic Mapping http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.

14 Civic Application for Social Good: Mapping Meth Labs in Kansas Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

15 Applied Data Science: Question Answering (QA) & Informatics
© 2011 National Broadcasting Company http://youtu.be/TFe2pJETNuw
Reflexive Reasoning: Knowledge Quality Human-Comprehensible Explanations Not Always Guaranteed
Deep Blue (1997), Blue Gene (1999 - present)

16 What are the Issues? “Four Vs” Figure © 2011, IBM http://bit.ly/big-data-howto-ibm

17 How does it Work? Big Data Landscape (Forbes, 2012) Adapted from figure © 2011, D. Feinleib http://bit.ly/what-is-big-data-forbes

18 MapReduce in a nutshell
Figure labels: Task 1, Task 2, Task 3, Output data, Aggregated Result
© Sven Schlarb

19 MapReduce (MR): Heart of Big Data Processing
Map
- Accepts input key/value pair
- Emits intermediate key/value pair
Reduce
- Accepts intermediate key/value* pair
- Emits output key/value pair
Figure labels: Very Big Data, MAP, Partitioning Function, REDUCE, Result
Adapted from slide © 2006, H. Setiawan, National University of Singapore http://bit.ly/mapreduce-intro-setiawan
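The Map, Partitioning Function, and Reduce roles on this slide can be sketched in a few lines of single-machine Python. This is only an illustration, not the slide author's code or any framework's API: the word-count task, the names `map_fn`, `reduce_fn`, `partition`, and `run_mapreduce`, and the byte-sum partitioner are all assumptions chosen for the example.

```python
from collections import defaultdict

def map_fn(_key, line):
    """Map: accept one input key/value pair, emit intermediate (word, 1) pairs."""
    for word in line.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: accept an intermediate key with all of its values, emit one output pair."""
    return word, sum(counts)

def partition(key, num_reducers):
    """Partitioning function (illustrative): send every copy of a key to the same reducer."""
    return sum(key.encode("utf-8")) % num_reducers

def run_mapreduce(records, num_reducers=2):
    # One bucket per simulated reducer: intermediate key -> list of values.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            buckets[partition(out_key, num_reducers)][out_key].append(out_value)
    # Each reducer processes its own bucket; results are merged at the end.
    result = {}
    for bucket in buckets:
        for key in sorted(bucket):
            out_key, out_value = reduce_fn(key, bucket[key])
            result[out_key] = out_value
    return result

# Hypothetical input: (line number, text) pairs.
lines = [(0, "big data is big"), (1, "data at rest")]
word_counts = run_mapreduce(lines)
```

In a real deployment the map and reduce calls would run on different machines; here the buckets only simulate that split, which is enough to show why the partitioning function must be deterministic for a given key.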

20 What is MapReduce?
A programming model (& its associated implementation)
For processing large data sets
Exploits a large set of commodity computers
Executes processes in a distributed manner
Offers a high degree of transparency
In other words:
- simple and maybe suitable for your tasks
© 2006, H. Setiawan, National University of Singapore http://bit.ly/mapreduce-intro-setiawan

21 Bring Your Own Data (BYOD)
Three States of Data: In Use, In Motion, At Rest
Data Acquisition in General: Crawling, Crowdsourcing
How to BYOD
- Have and understand your application domain
- Consider task (integration, transformation, modeling) before scale!
- Review challenges of big data: http://KDNuggets.com
- Consult data scientists and purveyors of data
Figure: Data at Rest © 2011, Wikipedia http://bit.ly/wikipedia-data-at-rest

22 Data Transformation
Multiple Data Sources: Customer Relationship Management (CRM), Finance, Sales, Human Resources, Enterprise Resource Planning (ERP)
ETL System
- Extract (Import Process): relational queries, automation
- Transform (Prepare Data): data cleaning, lookup tasks
- Load (Consolidation): aggregation, staging
Data Warehouse: Metadata, Summary Data, Processed Data
Data Mart(s)
Client Applications: Analytics, Data Mining, Visualization
Adapted from figure © 2012 B. Silva, The Omega Group http://bit.ly/etl-omega-group
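The Extract, Transform, and Load stages on this slide can be illustrated with a small in-memory sketch. Everything concrete here is an assumption made for the example: the two CSV extracts (`CRM_CSV`, `SALES_CSV`), their column names, and the choice of "total sales per region" as the summary data. A production ETL system would read from live source systems and write to a warehouse.

```python
import csv
import io
from collections import defaultdict

# Hypothetical source extracts standing in for CRM and Sales systems.
CRM_CSV = "customer_id,region\n1, midwest \n2,West\n"
SALES_CSV = "customer_id,amount\n1,100.50\n2,20\n1,\n2,79.50\n"

def extract(text):
    """Extract: import rows from one source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(sales_rows, crm_rows):
    """Transform: clean values, drop incomplete rows, look up each customer's region."""
    region_by_customer = {
        row["customer_id"]: row["region"].strip().lower() for row in crm_rows
    }
    cleaned = []
    for row in sales_rows:
        if not row["amount"]:
            continue  # data cleaning: skip sales rows with a missing amount
        cleaned.append({
            "region": region_by_customer[row["customer_id"]],  # lookup task
            "amount": float(row["amount"]),
        })
    return cleaned

def load(rows):
    """Load: consolidate processed rows into summary data (total per region)."""
    summary = defaultdict(float)
    for row in rows:
        summary[row["region"]] += row["amount"]
    return dict(summary)

warehouse_summary = load(transform(extract(SALES_CSV), extract(CRM_CSV)))
```

Keeping the three stages as separate functions mirrors the slide's pipeline and makes each stage testable on its own; client applications such as analytics or visualization would then read from the loaded summary.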

23 Summary
“What Is Big Data?” – Three V’s, Current Scale (Tera to Peta)
Building Blocks of Data Science
- Model-building: statistics & machine learning
- Data mining & knowledge discovery in databases (KDD)
- Visualization
- Working with unstructured data, natural language (human language)
- Data transformations: integration & warehousing
Data Science Application Areas
- Analytics: trends (topics, social patterns, sentiment, etc.)
- Information retrieval: search, structured queries, QA
- User modeling, personalization, adaptation; recommender systems
Aspects and Applications of Analytics
- Types: text analytics, predictive analytics, link mining
- Decision support: social good, differentiated instruction, etc.

24 Terminology
Big Data – Data at Scale (Currently Terabytes to Petabytes)
Building Blocks
- Visualization – displaying data, info, processes for understanding
- Data mining – analyzing data to build novel, valid, useful models
- Analytics – deriving insights from data (data mining, visualization)
Three Vs: Massive, High-Bandwidth, Heterogeneous Data Sets/Streams
- Volume – amount of data to be processed, stored
- Velocity – rate of arrival of data relative to real-time requirements
- Variety – differences in data type, format, organization
Veracity – Data Quality, Reliability (Quantitative and Qualitative Aspects)
Sources of Data
- Crawling: traversing, scraping sites using web crawler program
- Crowdsourcing: polls & other online voting, rating systems

