Ch 6. The Evolution of Analytic Tools and Methods Taming The Big Data Tidal Wave 31 May 2012 SNU IDB Lab. Sengyu Rim.


Outline
• Introduction
• The Evolution of Analytic Methods
• The Evolution of Analytic Tools

Introduction
• Analytic professionals have used a range of tools over the years to:
  – Execute analytic algorithms
  – Assess the results

The Evolution of Analytic Methods (1/7): The Evolution of Analytic Methods
• Until the advent of computers, it wasn’t feasible to run:
  – Many iterations of a model
  – Highly advanced methods
  – Analyses on large datasets
[Figure: DATA → naïve algorithm → output, contrasted with DATA → sophisticated algorithm → output (labeled NOW).]

The Evolution of Analytic Methods (2/7): Ensemble Methods
• Ensemble methods combine multiple modeling techniques
  – The combined result can go beyond what any individual model achieves
[Figure: linear regression, logistic regression, decision tree, and neural network models feed an aggregator, which produces the final result.]
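The aggregator in the ensemble slide can be sketched in a few lines. This is only an illustration under assumptions: the three scoring functions are hypothetical stand-ins for fitted models (not real regressions or trees), and simple averaging is just one possible aggregation rule.

```python
from statistics import mean

# Hypothetical stand-ins for fitted models; each maps an input to a score.
def linear_score(x):
    return 0.2 * x

def tree_score(x):
    return 1.0 if x > 3 else 0.0

def saturating_score(x):
    return x / (1 + x)

def ensemble_predict(models, x):
    """Aggregate by averaging the individual model scores for one input."""
    return mean(model(x) for model in models)

models = [linear_score, tree_score, saturating_score]
print(ensemble_predict(models, 4))
```

Other aggregators (majority vote, weighted average) fit the same shape: many model outputs in, one final result out.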

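The Evolution of Analytic Methods (3/7): The Wisdom of Crowds
• One reason ensemble models are gaining traction is the wisdom of crowds
  – The average of many independent judgments is often closer to the truth than a typical individual judgment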
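A small simulation can make the wisdom-of-crowds idea concrete. It is a sketch under assumptions that are not from the slides: the true value, the crowd size, and the noise model (independent, unbiased Gaussian errors) are all made up for illustration.

```python
import random
from statistics import mean

def crowd_vs_individual(truth, guesses):
    """Return (error of the averaged guess, average error of individual guesses)."""
    crowd_error = abs(mean(guesses) - truth)
    mean_individual_error = mean(abs(g - truth) for g in guesses)
    return crowd_error, mean_individual_error

random.seed(0)  # fixed seed so the run is repeatable
truth = 100.0   # hypothetical quantity being estimated
guesses = [truth + random.gauss(0, 20) for _ in range(50)]

crowd_error, mean_individual_error = crowd_vs_individual(truth, guesses)
print(f"crowd error: {crowd_error:.2f}")
print(f"average individual error: {mean_individual_error:.2f}")
```

By the triangle inequality, the error of the averaged guess can never exceed the average individual error; how much smaller it is depends on how independent the guesses are.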

The Evolution of Analytic Methods (4/7): Commodity Models
• A commodity model is one that is produced rapidly
• A commodity modeling process stops when something good enough is found
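The "stop when good enough" rule can be sketched as an early-exit search over candidate models. Everything specific here is an assumption for illustration: the candidate names, the validation scores, and the 0.75 threshold are hypothetical.

```python
def first_good_enough(candidates, score, threshold):
    """Try candidates in order and stop at the first one that meets the threshold.

    If none meets it, the best candidate seen is returned instead.
    """
    best, best_score = None, float("-inf")
    for candidate in candidates:
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
        if s >= threshold:
            break  # good enough: do not spend effort on the remaining candidates
    return best, best_score

# Hypothetical validation scores, ordered from cheapest to most expensive to build.
validation_score = {
    "mean_baseline": 0.61,
    "logistic_regression": 0.78,
    "decision_tree": 0.74,
    "tuned_ensemble": 0.83,
}

chosen = first_good_enough(validation_score, validation_score.get, threshold=0.75)
print(chosen)
```

With these made-up numbers the search stops at the second candidate, even though a later one scores higher; that trade of accuracy for speed is the point of a commodity model.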

The Evolution of Analytic Methods (5/7): Uses for Commodity Models
• Traditionally, building models was time-intensive and expensive
  – Modeling for low-value problems didn’t make sense
• Commodity models provide an option for low-value problems

The Evolution of Analytic Methods (6/7): Text Analysis
• Analysis of text and other unstructured data sources is growing rapidly
• Unstructured data is given some structure when it is processed
  – The structured results are what is analyzed
[Figure: parser → structured data.]
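As a minimal sketch of turning unstructured text into structured results, the snippet below parses each document into word counts. The tokenizer (a simple regular expression) and the sample sentences are assumptions made for illustration; real text-analysis parsers are considerably more involved.

```python
import re
from collections import Counter

def parse_to_counts(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical sample documents
docs = [
    "The book was great, great value.",
    "Please book a ticket.",
]

# One structured row (word -> count) per document
rows = [parse_to_counts(d) for d in docs]
for row in rows:
    print(dict(row))
```

Once the text is in this form it can be analyzed like any other table. Note that simple counts treat "book" the same way in both sentences, so context is lost at this step.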

The Evolution of Analytic Methods (7/7): Ambiguity
• Applying context to text is no easy task
  – The same word can play different roles: “read a book” vs. “book a ticket”
• Emphasis can change the meaning (emphasized word marked with asterisks)

Varying the emphasis | Changes the meaning
*I* didn’t say Bill’s book stinks | But my buddy Bob did!
I *didn’t* say Bill’s book stinks | How dare you accuse me of such a thing
I didn’t *say* Bill’s book stinks | But I admit that I did write it in an
I didn’t say *Bill’s* book stinks | It’s that other guy’s book that stinks
I didn’t say Bill’s *book* stinks | I said his blog stinks
I didn’t say Bill’s book *stinks* | I simply said it wasn’t my favorite

The Evolution of Analytic Tools (1/7): Previous Tools
• In the 1980s, analytics work was done against a mainframe
  – Not user-friendly
  – Analysts had to write program code directly to do analytics

The Evolution of Analytic Tools (2/7): Graphical User Interface
• Graphical user interfaces can accelerate the generation of code while ensuring it is bug-free and optimized
  – Point-and-click environment
  – Code is generated automatically
  – Users should still understand the generated code so they can validate that it does what they intended

The Evolution of Analytic Tools (3/7): The Explosion of Point Solutions
• Analytic point solutions are software packages that address a specific set of problems
  – Price optimization applications
  – Fraud applications
  – Demand forecasting applications
• One downside of point solutions is the high price
  – The cost can be $10 million
  – Implementing point solutions serially, one at a time, is preferred

The Evolution of Analytic Tools (4/7): Open Source
• Open-source software has been around for some time
  – In many cases, open-source products are outside the mainstream
• Many individuals contribute to improving the functionality
  – Bugs can be patched quickly

The Evolution of Analytic Tools (5/7): The R Project for Statistical Computing
• The R Project is open-source software for statistical computing
• Features of the R Project:
  – More object-oriented
  – Integrates new features faster
  – Free of charge
  – Programming-intensive

The Evolution of Analytic Tools (6/7): Data Visualization
• An effective visualization can make a pattern jump right off the page at you
• Today’s visualization tools allow:
  – Multiple tabs
  – Linking graphs and charts to the underlying data
• New idea for data visualization
  – 3-D

The Evolution of Analytic Tools (7/7): Importance of Data Visualization
• Appropriate visualization will increase an audience’s comprehension
• Understanding how to visualize data will help analytic professionals become more effective