These slides are additional material for Lecture 5 of TIES445 Data mining, Nov-Dec 2007, Sami Äyrämö.

A data mining algorithm
- "A data mining algorithm is a well-defined procedure that takes data as input and produces output in the form of models and patterns"
  - "Well-defined" indicates that the procedure can be precisely encoded as a finite set of rules
  - An "algorithm" is a procedure that always terminates after some finite number of steps and produces an output
  - A "computational method" has all the properties of an algorithm except a guarantee that the procedure will terminate in a finite number of steps (a computational method is usually described more abstractly than an algorithm; e.g., steepest descent is a computational method)

Data mining tasks
- Explorative (visualization)
- Descriptive (clustering, rule finding, …)
- Predictive (classification, regression, …)

Elements of data mining algorithms
- Data mining task
- Structure of the model or pattern
- Score function
- Search/optimization method
- Data management technique

Structure
- Structure (functional form) of the model or pattern that will be fitted to the data
- Defines the boundaries of what can be approximated or learned
- Within these boundaries, the data guide us to a particular model or pattern
- E.g., hierarchical clustering model, linear regression model, mixture model
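To make the point concrete, here is a minimal sketch (not from the slides; the data and function name are invented) of fitting one of the structures listed above, a linear regression model y = a*x + b, by closed-form least squares. The chosen structure bounds what can be learned: this model can recover a line exactly, but a curve only approximately.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of the linear structure y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated exactly by y = 2x + 1
a, b = fit_line(xs, ys)
print(round(a, 6), round(b, 6))    # 2.0 1.0
```

Within this structure, the data then guide us to a particular model, here the parameter values a = 2, b = 1.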

Structure: decision tree
(Figure from the book "Tan, Steinbach, Kumar, Introduction to Data Mining, Addison Wesley, 2006.")

Structure: MLP
(Figures by Tommi Kärkkäinen)

Score function
- Judges the quality of the fitted models or patterns based on the observed data
- Minimized/maximized when fitting parameters to our models and patterns
- Critical for learning and generalization
  - Goodness-of-fit vs. generalization, e.g., the number of neurons in a neural network
- E.g., misclassification error, squared error, support/accuracy
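Two of the score functions named above can be sketched in a few lines (toy labels and predictions are invented for illustration):

```python
def misclassification_error(y_true, y_pred):
    """Fraction of predictions that disagree with the true class labels."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def squared_error(y_true, y_pred):
    """Sum of squared differences between targets and estimates."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

labels = [0, 1, 1, 0]
preds  = [0, 1, 0, 0]
print(misclassification_error(labels, preds))   # 0.25

targets   = [1.0, 2.0]
estimates = [0.5, 2.5]
print(squared_error(targets, estimates))        # 0.5
```

Fitting a model means choosing the parameters that minimize such a score on the observed data.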

Score functions: Prototype-based clustering
- Score function of the form J(C) = Σᵢ min_k ‖xᵢ − c_k‖_q^α:
  - α = 2, q = 2 → K-means
  - α = 1, q = 2 → K-spatial-medians
  - α = 1, q = 1 → K-coordinate-wise medians
- Different statistical properties of the cluster models
- Different algorithms and computational methods for solving
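Assuming the score has the form implied by the (α, q) settings on the slide, J(C) = Σᵢ min_k ‖xᵢ − c_k‖_q^α, a direct sketch (toy points and prototypes are invented) is:

```python
def score(points, prototypes, alpha, q):
    """Prototype-based clustering score: sum over points of the
    distance to the nearest prototype, in the q-norm, raised to alpha."""
    total = 0.0
    for x in points:
        dists = [
            sum(abs(xi - ci) ** q for xi, ci in zip(x, c)) ** (1.0 / q)
            for c in prototypes
        ]
        total += min(dists) ** alpha
    return total

pts = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
cs  = [(0.0, 0.0), (4.0, 0.0)]

print(score(pts, cs, alpha=2, q=2))  # K-means score: 2.0
print(score(pts, cs, alpha=1, q=2))  # K-spatial-medians: ~1.4142
print(score(pts, cs, alpha=1, q=1))  # K-coordinate-wise medians: 2.0
```

Changing (α, q) changes which prototypes minimize the score, which is why the resulting cluster models have different statistical properties (e.g., means vs. medians as optimal prototypes).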

Score function: Overfitting vs. generalization
(Figures by Tommi Kärkkäinen)

Search/optimization method
- Used to search over parameters and structures
- Computational procedures and algorithms used to find the maximum/minimum of the score function for particular models or patterns. Includes:
  - Computational methods used to optimize the score function, e.g., steepest descent
  - Search-related parameters, e.g., the maximum number of iterations or the convergence specification for an iterative algorithm
- Single fixed structure (e.g., kth-order polynomial function of the inputs) or a family of different structures (i.e., search over both the structures and their associated parameter spaces)
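A minimal sketch of steepest descent, the computational method named above, together with the search-related parameters the slide mentions (an iteration cap and a convergence tolerance). The score function and step size are invented for illustration:

```python
def steepest_descent(grad, w0, step=0.1, max_iter=1000, tol=1e-8):
    """Minimize a 1-D score function by moving against its gradient."""
    w = w0
    for _ in range(max_iter):      # search-related parameter: iteration cap
        g = grad(w)
        if abs(g) < tol:           # search-related parameter: convergence spec
            break
        w = w - step * g           # steepest-descent update
    return w

# Toy score f(w) = (w - 3)^2, so grad f(w) = 2*(w - 3); the minimum is at w = 3
w_star = steepest_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_star, 6))            # 3.0
```

Note that the method is generic: the same loop applies to any differentiable score function, which is what makes it a computational method rather than a complete data mining algorithm.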

Search/optimization: K-means-like clustering
1. Initialize the cluster prototypes
2. Assign each data point to the closest cluster prototype
3. Compute the new estimates for the cluster prototypes (may require another iterative algorithm)
4. Termination: stop if the termination criteria are satisfied (usually no changes in the assignments I)
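The four steps above can be sketched as a plain K-means loop (a minimal sketch assuming squared Euclidean distances and mean prototypes; the toy data are invented):

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    prototypes = rng.sample(points, k)            # 1. initialize the prototypes
    assignment = None
    for _ in range(max_iter):
        new_assignment = [                        # 2. assign to closest prototype
            min(range(k), key=lambda j: sum(
                (xi - ci) ** 2 for xi, ci in zip(x, prototypes[j])))
            for x in points
        ]
        if new_assignment == assignment:          # 4. stop: no changes in assignments
            break
        assignment = new_assignment
        for j in range(k):                        # 3. re-estimate prototypes (means)
            members = [x for x, a in zip(points, assignment) if a == j]
            if members:
                prototypes[j] = tuple(sum(c) / len(members)
                                      for c in zip(*members))
    return prototypes, assignment

pts = [(0.0, 0.0), (0.0, 1.0), (9.0, 9.0), (10.0, 9.0)]
protos, assign = kmeans(pts, k=2)
print(sorted(protos))   # [(0.0, 0.5), (9.5, 9.0)]
```

With α = 1 score functions, step 3 would instead require computing medians, which for the spatial median has no closed form and needs its own iterative algorithm, exactly the caveat in step 3.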

Data management technique
- Storing, indexing, and retrieving data
- Not usually specified by statistical or machine learning algorithms
  - A common assumption is that the data set is small enough to reside in main memory, so that random access to any data point is free relative to the actual computational costs
- Massive data sets may exceed the capacity of the available main memory
  - The physical location of the data and the manner in which it is accessed can be critically important for algorithm efficiency
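As an illustration (not from the slides), a quantity such as a mean can be computed over data that do not fit in main memory with a single sequential scan over fixed-size chunks, instead of assuming free random access. Here an in-memory range stands in for a file too large for RAM:

```python
def chunked(stream, size):
    """Yield fixed-size chunks of a stream; simulates block reads from disk."""
    chunk = []
    for item in stream:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

def streaming_mean(chunks):
    """One linear scan; memory use is bounded by the chunk size."""
    total, count = 0.0, 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    return total / count

values = range(1, 101)              # stands in for out-of-core data
print(streaming_mean(chunked(values, size=8)))   # 50.5
```

Sequential access patterns like this are why the physical layout of the data matters: one linear scan over disk blocks is vastly cheaper than random access to individual records.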

Data management technique: memory
- A general categorization of different memory structures:
  1. Registers of processors: direct access, no slowdown
  2. On-processor or on-board cache: fast semiconductor memory on the same chip as the processor
  3. Main memory: normal semiconductor memory (up to several gigabytes)
  4. Disk cache: intermediate storage between main memory and disks
  5. Disk memory: terabytes; access time in milliseconds
  6. Magnetic tape: access time up to minutes

Data management: index structures
- B-trees
- Hash indices
- Kd-trees
- Multidimensional indexing
- Relational data tables
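One of the structures above, a hash index, can be sketched in a few lines: a dictionary maps an attribute value to the row positions where it occurs, so equality lookups avoid a full table scan. The table and attribute are invented for illustration:

```python
from collections import defaultdict

rows = [
    {"id": 0, "city": "Oulu"},
    {"id": 1, "city": "Jyväskylä"},
    {"id": 2, "city": "Oulu"},
]

# Build the hash index in one pass over the table
index = defaultdict(list)
for pos, row in enumerate(rows):
    index[row["city"]].append(pos)

def lookup(city):
    """Equality lookup via the index, without scanning the table."""
    return [rows[p] for p in index.get(city, [])]

print([r["id"] for r in lookup("Oulu")])   # [0, 2]
```

B-trees and kd-trees serve the same purpose for range and multidimensional queries, respectively, where a hash index cannot help.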

Examples

CART:
- Task: classification and regression
- Structure: decision tree
- Score function: cross-validated loss function
- Search method: greedy search over structures
- Data management technique: unspecified

Backpropagation:
- Task: regression
- Structure: neural network (non-linear function)
- Score function: squared error
- Search method: gradient descent on parameters
- Data management technique: unspecified

APriori:
- Task: rule pattern discovery
- Structure: association rules
- Score function: support/accuracy
- Search method: breadth-first search
- Data management technique: linear scans