By Matt Goliber and Jim Hougas Data Mining and Knowledge Discovery.



What is Data Mining?
- Not like gold or diamond mining
- Mining of knowledge from data
- Important to many different fields
- A part of Knowledge Discovery in Databases (KDD)

The Process of Knowledge Discovery
- Raw data
  - Data cleaning and integration
- Data Warehouse
  - Data transformation, selection, and mining
- Patterns
  - Pattern evaluation and knowledge presentation
- KNOWLEDGE!
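The stages on this slide can be pictured as a chain of small functions. The sketch below is only an illustration under stated assumptions: the records, the function names (`clean`, `transform`, `mine`, `evaluate`, `discover`) and the `min_count` threshold are all hypothetical and do not come from the presentation.

```python
from collections import Counter

# Hypothetical raw records: (customer, item); None marks a missing value.
RAW = [
    ("ann", "Bagels"), ("ann", "cream cheese"),
    ("bob", "bagels "), ("bob", None),
    ("cat", "bagels"), ("cat", "cream cheese"),
    ("dan", "milk"),
]

def clean(records):
    """Data cleaning: drop incomplete rows and normalize item names."""
    return [(c, i.strip().lower()) for c, i in records if c and i]

def transform(records):
    """Data transformation: group items into one basket per customer."""
    baskets = {}
    for customer, item in records:
        baskets.setdefault(customer, set()).add(item)
    return list(baskets.values())

def mine(baskets):
    """Data mining: count how often each pair of items appears together."""
    pairs = Counter()
    for basket in baskets:
        items = sorted(basket)
        for a in range(len(items)):
            for b in range(a + 1, len(items)):
                pairs[(items[a], items[b])] += 1
    return pairs

def evaluate(pairs, min_count=2):
    """Pattern evaluation: keep only pairs seen at least min_count times."""
    return {pair: n for pair, n in pairs.items() if n >= min_count}

def discover(records):
    """Run the stages in order, from raw data to retained patterns."""
    return evaluate(mine(transform(clean(records))))

print(discover(RAW))
```

Each function takes the previous stage's output, which mirrors the slide's flow from raw data through a cleaned store to patterns and, finally, the patterns judged worth keeping.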

Why is Data Mining useful?
- We are data rich but information poor
  - Internet
  - Intelligence
- Humans often lack the ability to comprehend and manage the immense amount of available and sometimes seemingly unrelated data

How long has this idea been around?
- Late 60’s and early 70’s
- Stanford’s Meta-DENDRAL ( )
  - Extension of DENDRAL
- Doug Lenat with AM (1976)

Meta-DENDRAL
- Extension of the DENDRAL (1965) program
  - One of the first expert systems
  - Interpreted mass spectra
- Meta-DENDRAL took the mass spectra of compounds of known 3-D structure and formulated rules about the interpretation of the spectra
- Came up with known rules and some new ones!

Sample Mass Spec: ethyl 3-oxo-3-phenylpropanoate (ethyl benzoylacetate)

AM
- Doug Lenat, 1976
- Name means nothing, stand alone
- AM was given sets, bags, ordered sets, and lists
- AM was also given operations to perform on these data sets
  - Union, Intersection, etc.
- Came up with ideas about counting, addition, multiplication, prime numbers, and Goldbach’s conjecture
- AM thought that these were all uninteresting
- Liked maximally divisible numbers though…
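To make two of the concepts named on this slide concrete, the sketch below defines prime numbers and maximally divisible numbers in terms of a number's set of divisors. It is not AM's actual implementation or search method; the function names and the brute-force approach are assumptions chosen for readability.

```python
def divisors(n):
    """Return the set of positive divisors of n."""
    return {d for d in range(1, n + 1) if n % d == 0}

def primes(limit):
    """Numbers up to limit with exactly two divisors."""
    return [n for n in range(1, limit + 1) if len(divisors(n)) == 2]

def maximally_divisible(limit):
    """Numbers up to limit with more divisors than every smaller number."""
    best, found = 0, []
    for n in range(1, limit + 1):
        count = len(divisors(n))
        if count > best:
            best = count
            found.append(n)
    return found

print(primes(20))               # [2, 3, 5, 7, 11, 13, 17, 19]
print(maximally_divisible(60))  # [1, 2, 4, 6, 12, 24, 36, 48, 60]
```

Seen this way, the two ideas sit at opposite extremes of the same measure: primes have as few divisors as a number greater than 1 can have, and maximally divisible numbers set a new record for divisor count.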

What next?
- Not a whole lot…
- Databases were not prevalent enough, no great demand
- Did benefit from machine learning research
- Beginning of the 1990’s, “The next area…”
  - Ranked as one of the most promising research areas (NSF)
  - Information explosion
- Early commercial systems
  - Farm Journal
  - GM

Next Generation Techniques
- Decision Trees
  - Each branch is a classification question
  - Allows businesses to segment customers, products, and sales regions
  - Questions organize the data
- Rule Induction
  - All patterns are pulled from the data
  - Accuracy and Significance are then added to them
  - Help the user know how strong a pattern is and the likelihood of it occurring again
  - Ex: If bagels are purchased then cream cheese is purchased 90% of the time and this pattern occurs in 3% of all shopping baskets
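Both techniques can be shown in a few lines. In the sketch below, `segment` is a hand-written decision tree whose branches each ask one classification question, and `rule_stats` computes the two numbers quoted in the bagel example. The customer fields, thresholds, segment names, and the synthetic baskets are hypothetical. The sketch also assumes that "occurs in 3% of all shopping baskets" means the share of baskets containing both items.

```python
def segment(customer):
    """Hand-written decision tree: each branch asks one question."""
    if customer["orders_per_year"] >= 12:
        if customer["avg_order"] >= 50:
            return "loyal"
        return "frequent-small"
    return "occasional"

def rule_stats(baskets, antecedent, consequent):
    """Return (accuracy, share of all baskets) for 'if antecedent then consequent'."""
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequent in b]
    accuracy = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    share = len(with_both) / len(baskets) if baskets else 0.0
    return accuracy, share

# Synthetic data built to reproduce the slide's numbers (an assumption):
# 3,000 baskets, 100 with bagels, 90 of those also with cream cheese.
baskets = (
    [{"bagels", "cream cheese"}] * 90
    + [{"bagels"}] * 10
    + [{"milk"}] * 2900
)
accuracy, share = rule_stats(baskets, "bagels", "cream cheese")
print(f"accuracy={accuracy:.0%}, share of baskets={share:.0%}")
print(segment({"orders_per_year": 20, "avg_order": 80}))
```

The first number tells the user how often the rule holds when it applies, and the second how much of the data it touches; a rule can score high on one and low on the other.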

Decision Trees vs. Rule Induction
- Decision Trees
  - Always one and only one rule covers an instance
- Rule Induction
  - Many rules may cover the same instance, or
  - no rule may cover an instance
- Example
  - Decision Trees use height or shoe size (one or the other) to determine the size of a person
  - Rule Induction uses both

Examples of Significant Developments
- Stock Market Advances (1991)
  - Astrophysicists Doyne Farmer and Norman Packard
  - Prediction Company could predict stock market trends
- Bell Atlantic (1996)
  - Consumer phone buying trends
  - Rule Induction
- Advanced Scout (1997)
  - Inderpal Bhandari assists NBA coaches
  - Rule Induction
- Persuade 400,000 undecided voters (2004)
  - MoveOn attempts to influence the election
  - Decision Tree

Challenges
- Large Data Sets with High Complexity
  - One or the other is currently possible, but not both
- Expensive
  - Costs of Bell Atlantic (experts are needed)
  - Cost for a two-day course in Las Vegas ($1,300)
  - Software ($100,000)

Research
- DARPA
  - Defense Advanced Research Projects Agency
  - ACLU claims this is an invasion of privacy
  - Decision Tree
- Uncovering terrorists in public chat rooms
  - Tracks the times that messages are sent
- Advanced Scout
  - Bhandari is working on Advanced Scout for the NHL
  - Rule Induction

Current State
- Out of the Lab
  - Into Fortune 500 companies
- Automate Model Scoring
  - Fingers are currently crossed in hopes that scoring by IT personnel is done correctly

Future States
- Utilizing Company Warehouses
  - Data miners must take advantage of a million dollar warehouse that a company builds
- Effort Knob
  - Low for quick model, high for quality model
- Computed Target Columns
  - User could create a new target variable
  - Ex: finance information that a business has
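A computed target column might look something like the sketch below, where the user derives a new variable from financial fields the business already stores. The helper `add_target`, the field names, and the profitability rule are all hypothetical illustrations, not features of any particular product.

```python
def add_target(rows, name, compute):
    """Return copies of rows with a new computed target column added."""
    return [{**row, name: compute(row)} for row in rows]

# Hypothetical finance information a business might already hold.
customers = [
    {"id": 1, "revenue": 1200.0, "cost": 900.0},
    {"id": 2, "revenue": 300.0, "cost": 450.0},
]

# The user defines the target: is this customer profitable?
labeled = add_target(customers, "profitable", lambda r: r["revenue"] > r["cost"])

for row in labeled:
    print(row["id"], row["profitable"])
```

Because `add_target` builds new rows rather than modifying the stored ones, the same data can carry several candidate targets, one per model a user wants to try.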

Sources
- rithmetic%22+computer+data+mining+prove&hl=en
- Han, J. and Kamber, M. Data Mining: Concepts and Techniques.