AI, machine learning, neural networks, deductive databases



Detecting regularities in data (ex: bird flu cases)
Detecting rare occurrences, rare events
Finding “causal” relationships

Opportunities
Collecting vast amounts of data has become possible.
Ex1: Astronomy: petabytes of information are collected (Laboratory for Cosmological Data Mining, LCDM)
1 petabyte (PB) = 2^50 bytes = 1,125,899,906,842,624 bytes
1 petabyte = 1,024 terabytes
1 terabyte (TB) = 1,024 gigabytes
=> The armchair astronomer
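The unit arithmetic on this slide can be checked in a few lines. This is only a sketch, assuming the binary (power-of-1,024) definitions the slide uses; the constant names are illustrative.

```python
# Binary storage units as defined on the slide (each step is a factor of 1,024).
GIGABYTE = 2**30
TERABYTE = 1024 * GIGABYTE
PETABYTE = 1024 * TERABYTE

# Print the petabyte size with thousands separators for comparison with the slide.
print(f"1 PB = {PETABYTE:,} bytes")
```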

Ex2: Biology: huge sequences of nucleotides have been collected (the human genome contains more than 3.2 billion base pairs and more than … genes). Very little of that has been interpreted yet.

Ex: physics, geography, weather data, …, business, …
numerical, discrete, continuous, categorical
raw data, cleaned data
complete records, incomplete records (missing data)
formatted data, unformatted data

Tasks
Fit data to model
–Descriptive
–Predictive
Finding the “best” model ???
–Beware of model overfitting!
Interpreting results
Evaluating models (ex: lift charts)
=> Usually a lot of going back and forth between model(s) and data

Another complementary tack: interactive visual data exploration
Remarkable properties of the human visual system (ex: analysis of a pseudo-random number generator)
Various visual representation schemes
–Simultaneous viewing
–(Fast) sequential viewing
Animating data (dynamic queries)
Other possibilities: converting data to sounds, etc.

Two broad approaches to learning
Supervised learning
ex: we want to discover a model to help classify stars, based on emission spectra. In the “training set” the correct classification of the stars is known. The resulting model is used to predict the class of a new star (not in the training set).
Unsupervised learning
ex: we want to group a set of stars into a small number of sufficiently homogeneous sub-groups of stars.

Many techniques
Fast evolving field
Statistical
–Descriptive stats, graphics, …
–Regression analysis
–Principal components analysis
–Time series analysis
–Cluster analysis (use of a distance measure)
–Naïve Bayes classifiers
Artificial intelligence
–Rule induction (machine learning)
–Various inference techniques (various logics, deductive databases, …)

–Pattern matching (speech recognition)
–Neural networks (many approaches)
–Genetic algorithms
–Bayesian networks (probably the best approach to model complex causal structures)
Information retrieval
–Many specialized models (vector model, …)
–Concepts of precision and recall
Many ad hoc techniques
–Co-occurrence analysis
–MK generality analysis
–Association analysis
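Precision and recall are named here without a definition. A minimal sketch of the usual set-based definitions follows; the function name and the document identifiers are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = fraction of retrieved documents that are relevant
    recall    = fraction of relevant documents that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 4 documents returned, 3 documents truly relevant.
p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d7"],
)
print(f"precision={p:.2f} recall={r:.2f}")
```

Two of the four retrieved documents are relevant, and two of the three relevant documents were found.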

One famous technique: Ross Quinlan’s ID3 algorithm

The weather data
Object  Outlook   Temperature  Humidity  Windy  Class
1       sunny     hot          high      FALSE  N
2       sunny     hot          high      TRUE   N
3       overcast  hot          high      FALSE  P
4       rain      mild         high      FALSE  P
5       rain      cool         normal    FALSE  P
6       rain      cool         normal    TRUE   N
7       overcast  cool         normal    TRUE   P
8       sunny     mild         high      FALSE  N
9       sunny     cool         normal    FALSE  P
10      rain      mild         normal    FALSE  P
11      sunny     mild         normal    TRUE   P
12      overcast  mild         high      TRUE   P
13      overcast  hot          normal    FALSE  P
14      rain      mild         high      TRUE   N
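ID3 picks the attribute to split on by information gain (the reduction in class entropy). The sketch below covers only that first selection step on the weather data, not the full recursive tree construction; the function and variable names are illustrative.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Windy"]

# The 14 weather records from the slide; the last field is the class (P or N).
ROWS = [
    ("sunny", "hot", "high", False, "N"),
    ("sunny", "hot", "high", True, "N"),
    ("overcast", "hot", "high", False, "P"),
    ("rain", "mild", "high", False, "P"),
    ("rain", "cool", "normal", False, "P"),
    ("rain", "cool", "normal", True, "N"),
    ("overcast", "cool", "normal", True, "P"),
    ("sunny", "mild", "high", False, "N"),
    ("sunny", "cool", "normal", False, "P"),
    ("rain", "mild", "normal", False, "P"),
    ("sunny", "mild", "normal", True, "P"),
    ("overcast", "mild", "high", True, "P"),
    ("overcast", "hot", "normal", False, "P"),
    ("rain", "mild", "high", True, "N"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, index):
    """Entropy of the class minus the weighted entropy after splitting on one attribute."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[index] for row in rows}:
        subset = [row[-1] for row in rows if row[index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name:12s} gain = {gain:.3f}")
print("split on:", best)
```

On this data Outlook has the largest gain (about 0.247 bits), so it becomes the root of the tree; ID3 then repeats the same computation inside each branch.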

From decision trees to rules
Reading rules from a tree
–Unambiguous
–Rule order does not matter
–Alternative rules for the same conclusion are ORed
–But the rules can be too complex

Rules can be much more compact than trees
Ex:
if x=1 and y=1 then class=a
if z=1 and w=1 then class=a
otherwise class=b
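Written as code, the rule set is just two tests and a default. This is a sketch; the function name and the binary encoding of the attributes are assumptions.

```python
def classify(x, y, z, w):
    """Apply the two rules from the slide, then the default."""
    if x == 1 and y == 1:
        return "a"
    if z == 1 and w == 1:
        return "a"
    return "b"


print(classify(1, 1, 0, 0))  # first rule fires
print(classify(0, 0, 1, 1))  # second rule fires
print(classify(1, 0, 0, 1))  # neither rule fires, default class
```

A decision tree for the same concept has to repeat the test on z and w under every branch where the test on x and y fails, which is why the tree is larger than the rule set.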

From rules to decision trees
Rule disjunction results in overly complex trees.
Ex: write as a tree
–If a and b then x
–If c and d then x
(Fig. 3.2) (replicated sub-tree problem)
Ex: tree and rules of equivalent complexity
Ex: tree much more complex than rules

To learn from examples, the examples must be rich enough
Ex: sister-of relation (fig 2-1)
Denormalization (fig 2-3)
Importance of data preparation

Attributes
An attribute may be irrelevant in a given context (ex: number of wheels for a ship in a database of transportation vehicles)
=> Create value “irrelevant”

Software tools
Many commercial software packages
–CART
–SPSS modules
–WEKA (free)
–For a larger list:
Many field-specific software packages
–In the context of GRID computing
Demonstrating WEKA

Ad hoc methods
Co-occurrence analysis
MK generality analysis

Term Co-occurrence Analysis
The following approach measures the strength of association between a term i and a term j of the set of documents by:
$e(i,j)^2 = \dfrac{C_{ij}^2}{C_i \, C_j}$
where:
$C_i$ is the number of documents indexed by term i
$C_j$ is the number of documents indexed by term j
$C_{ij}$ is the number of documents indexed by both terms i and j
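The measure is easy to compute from an inverted index. The sketch below assumes the index is a mapping from each term to the set of documents it indexes; the function name and the toy index are made up for illustration.

```python
def cooccurrence_strength(index, term_i, term_j):
    """Return e(i,j)^2 = C_ij^2 / (C_i * C_j) for two terms.

    `index` maps each term to the set of document ids indexed by that term.
    """
    docs_i = index.get(term_i, set())
    docs_j = index.get(term_j, set())
    c_i, c_j = len(docs_i), len(docs_j)
    c_ij = len(docs_i & docs_j)  # documents indexed by both terms
    if c_i == 0 or c_j == 0:
        return 0.0
    return c_ij**2 / (c_i * c_j)


# Hypothetical index over six documents.
index = {
    "mining": {1, 2, 3, 4},
    "clustering": {2, 3, 5},
    "genome": {6},
}

strong = cooccurrence_strength(index, "mining", "clustering")
none = cooccurrence_strength(index, "mining", "genome")
print(f"mining/clustering: {strong:.3f}, mining/genome: {none:.3f}")
```

The value ranges from 0 (the terms never index the same document) to 1 (the terms always occur together).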

Interactive Data Visualization
Fish eye views
Hyperbolic trees
Linear visual data sequences
Dynamic queries

Tree Maps
Financial data

Conclusion
Current state of the art (graphical models – Markov networks)
Still an art
Ethical issues

Bayesian Networks
Objective: determine probability estimates that a given sample belongs to a class
Probability(x ∈ Class | attribute values)
Bayesian network:
–One node for each attribute
–Nodes connected in an acyclic graph
–Conditional independence
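For reference, the acyclic graph and the conditional independence assumption together give the standard factorization of the joint distribution. The notation is an assumption, not from the slides: $X_1, \dots, X_n$ are the nodes, $\mathrm{pa}(X_i)$ denotes the parents of $X_i$ in the graph, $C$ is the class and $a_1, \dots, a_n$ are the observed attribute values.

```latex
% Joint distribution as a product of one conditional table per node
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)

% Class estimate: normalize the joint over the possible classes
P(C = c \mid a_1, \dots, a_n)
  = \frac{P(c, a_1, \dots, a_n)}{\sum_{c'} P(c', a_1, \dots, a_n)}
```

Each factor is a small conditional probability table, so the network needs far fewer parameters than a full joint table over all attributes.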

Learning a Bayesian network from data
Function for evaluating a given network based on the data
Function for searching through the space of possible networks
K2 and TAN algorithms

Bayesian Networks   Graphical Models = Markov models (undirected edges)