Presentation on theme: "Pattern Recognition K-Nearest Neighbor Explained By Arthur Evans John Sikorski Patricia Thomas."— Presentation transcript:

1 Pattern Recognition K-Nearest Neighbor Explained By Arthur Evans John Sikorski Patricia Thomas

2 Overview Pattern Recognition, Machine Learning, Data Mining: How do they fit together? Example Techniques K-Nearest Neighbor Explained

3 Data Mining Searching through electronically stored data in an automatic way. Solving problems with already known data. Essentially, discovering patterns in data. Has several subsets, from statistics to machine learning.

4 Machine Learning Construct computer programs that improve with use. A methodology that draws from many fields: statistics, information theory, biology, philosophy, computer science... Several sub-disciplines: feature extraction, pattern recognition.

5 Pattern Recognition The operation and design of systems that detect patterns in data; the algorithmic process. Applications include image analysis, character recognition, speech analysis, and machine diagnostics.

6 Pattern Recognition Process Gather data, determine the features to use, extract the features, train your recognition engine, classify new instances.

7 Artificial Neural Networks A type of artificial intelligence that attempts to imitate the way a human brain works. Creates connections between processing elements, the computer equivalent of neurons. A supervised technique.

8 ANN continued Tolerant of errors in data. Many applications: speech recognition, visual scene analysis, robot control. Best at interpreting complex real-world sensor data.

9 The Brain The human brain has about 10^11 neurons, each connected to about 10^4 other neurons. Neurons switch in about 10^-3 seconds, slow compared to computers at 10^-13 seconds. Yet the brain recognizes a familiar face in about 10^-1 seconds, only 200-300 cycles at its switching rate. The brain utilizes MASSIVE parallel processing, considering many factors at once.

10 Neural Network Diagram

11 Stuttgart Neural Network Simulator

12 Bayesian Theory Deals with statistical probabilities. One of the best approaches for classifying text. Requires prior knowledge about the expected probabilities.

13 Conveyor Belt Example Want to sort apples and oranges on a conveyor belt. Notice that 80% of the fruit are oranges, so the prior probability of an orange is 0.8. Bayesian theory says: decide w_org if P(w_org | x) > P(w_app | x); otherwise decide w_app.
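
A minimal sketch of this decision rule, assuming made-up Gaussian class-conditional densities for a hypothetical hue feature (only the 80/20 priors come from the slide):

```python
# Bayes decision rule for the conveyor-belt example.
# The likelihood models below are illustrative assumptions, not slide content.
import math

PRIOR = {"orange": 0.8, "apple": 0.2}

def gaussian(x, mean, std):
    """Assumed class-conditional density p(x | class)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical feature: measured hue of the fruit (0 = green, 1 = orange).
LIKELIHOOD = {
    "orange": lambda x: gaussian(x, mean=0.8, std=0.15),
    "apple": lambda x: gaussian(x, mean=0.3, std=0.15),
}

def decide(x):
    """Decide w_org if P(w_org | x) > P(w_app | x); otherwise decide w_app."""
    # Posteriors are proportional to likelihood * prior (the shared p(x) cancels).
    posterior = {c: LIKELIHOOD[c](x) * PRIOR[c] for c in PRIOR}
    return max(posterior, key=posterior.get)

print(decide(0.75))  # likely "orange"
print(decide(0.20))  # likely "apple"
```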

14 Clustering A process of partitioning data into meaningful sub-classes (clusters). Most techniques are unsupervised. Two main categories: Hierarchical - nested clusters displayed as a dendrogram; Non-Hierarchical - each instance belongs to one and only one cluster, and the clusters are not nested.

15 Phylogenetic Tree - Hierarchical [Dendrogram of Ciliary Neurotrophic Factor sequences from Rattus norvegicus, Mus musculus, Homo sapiens, Equus caballus, Gallus gallus, Oryctolagus cuniculus, and Macaca mulatta]

16 Non-Hierarchical

17 K-Means Method [Scatter plot showing the initial cluster seeds and initial cluster boundaries]

18 After One Iteration [Scatter plot showing the new cluster assignments after one iteration]
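
Slides 17-18 can be summarized as the standard K-means loop: assign every point to its nearest seed, recompute each seed as the mean of its cluster, and repeat. A minimal sketch, with illustrative 2-D points and k = 2 that are not from the presentation:

```python
# Minimal K-means: seed, assign to nearest seed, recompute means, repeat.
import random

def kmeans(points, k, iterations=10):
    seeds = random.sample(points, k)               # initial cluster seeds
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:                           # assign to the nearest seed
            distances = [sum((a - b) ** 2 for a, b in zip(p, s)) for s in seeds]
            clusters[distances.index(min(distances))].append(p)
        for i, cluster in enumerate(clusters):     # recompute cluster means
            if cluster:                            # keep old seed if cluster is empty
                seeds[i] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
    return seeds, clusters

points = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.1), (5.2, 4.9), (0.9, 1.1), (4.8, 5.0)]
centers, assignments = kmeans(points, k=2)
print(centers)
```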

19 Decision Tree A flow-chart-like tree structure. An internal node denotes a test on an attribute (feature). A branch represents an outcome of the test; all records in a branch have the same value for the tested attribute. A leaf node represents a class label or class label distribution.

20 Example of Decision Tree A decision tree forecast for playing golf (P = play, N = don't play): the root tests outlook; sunny leads to a humidity test (high -> N, normal -> P); overcast -> P; rain leads to a windy test (true -> N, false -> P).
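
The tree can be read as nested attribute tests; a small sketch of the golf example as reconstructed above, assuming the attribute values shown on the slide:

```python
# The play-golf decision tree as nested tests (P = play, N = don't play).
def classify(outlook, humidity, windy):
    if outlook == "sunny":
        return "N" if humidity == "high" else "P"   # sunny -> test humidity
    elif outlook == "overcast":
        return "P"                                  # overcast -> always play
    else:  # rain
        return "N" if windy else "P"                # rain -> test windy

print(classify("sunny", "high", False))      # N
print(classify("overcast", "normal", True))  # P
print(classify("rain", "normal", True))      # N
```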

21 Instance Based Learning Training consists of simply storing the data; no generalizations are made. All calculations occur at classification time. Referred to as "lazy learning". Can be very accurate, but computationally expensive.

22 Instance Based Methods Locally weighted regression, case-based reasoning, nearest neighbor.

23 Advantages The training stage is trivial, so the classifier adapts easily to new instances. Very accurate. Different "features" may be used for each classification. Able to model complex data with less complex approximations.

24 Difficulties All processing is done at query time: computationally expensive. Determining an appropriate distance metric for retrieving related instances. Irrelevant features may have a negative impact.

25 Case Based Reasoning Does not use Euclidean space; instances are represented as complex logical descriptions. Examples: retrieving help desk information, legal reasoning, conceptual design of mechanical devices.

26 Case Based Process Based on the idea that current problems are similar to past problems. Apply matching algorithms to past problem-solution pairs.

27 Nearest Neighbor Assumes all instances correspond to points in n-dimensional space. The nearest neighbor is defined as the instance closest in Euclidean space: D = sqrt((A_x - B_x)^2 + (A_y - B_y)^2 + ...)
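
A minimal sketch of 1-nearest-neighbor classification with this Euclidean distance; the two stored flower instances and the query are illustrative assumptions, not data from the slides:

```python
# 1-nearest-neighbor: return the label of the closest stored instance.
import math

def euclidean(a, b):
    """D = sqrt((A_x - B_x)^2 + (A_y - B_y)^2 + ...)"""
    return math.sqrt(sum((ax - bx) ** 2 for ax, bx in zip(a, b)))

def nearest_neighbor(query, instances):
    """instances: list of (point, label) pairs; returns the closest label."""
    point, label = min(instances, key=lambda inst: euclidean(query, inst[0]))
    return label

training = [((4, 30), "Species A"), ((12, 200), "Species B")]  # (petal count, color)
print(nearest_neighbor((5, 40), training))  # "Species A"
```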

28 Feature Extraction Features: unique characteristics that define an object. The features used depend on the problem you are trying to solve. Developing a good feature set is more art than science.

29 Sample Case – Identify Flower Species Consider two features: petal count (range 3-15) and color (range 0-255). Assumptions: no two species have exactly the same color; multiple species may have the same petal count.

30 Graph of Instances [Scatter plot of Species A and Species B instances plus the query point; axes: petal count vs. color]

31 Calculate Distances [Same scatter plot, with distances measured from the query point to the stored instances]

32 Species is Closest Neighbor [Scatter plot highlighting the nearest neighbor to the query point among the Species A and Species B instances]

33 Problems The data range of each feature is different. Noisy data may lead to a wrong conclusion. One attribute may hold more importance than another.

34 Without Normalization [Scatter plot on the raw feature ranges: petal count 3-15, color 0-255]

35 Normalized Normalize each feature by subtracting its smallest value from all values and then dividing by the largest; all values then range from 0 to 1. [Scatter plot with both axes rescaled to 0-1]
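
A sketch of this min-max normalization, applied column-wise to a tiny, made-up (petal count, color) data set:

```python
# Min-max normalization: subtract each feature's minimum, divide by its range.
def normalize(rows):
    """rows: list of feature tuples; returns rows rescaled column-wise to 0-1."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - lo or 1 for col, lo in zip(columns, lows)]  # avoid /0
    return [tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
            for row in rows]

data = [(3, 10), (15, 255), (7, 128)]   # (petal count, color), illustrative
print(normalize(data))                  # every feature now lies in 0-1
```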

36 Noise Strategies Take an average of the k closest instances (k-nearest neighbor). Prune noisy instances.

37 K-Nearest Neighbors With k = 5, identify the query as the majority class among its k nearest neighbors. [Scatter plot of Species A, Species B, and the query point; axes: petal count vs. color]
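
A minimal k-nearest-neighbor sketch of this majority-vote rule; the training points and labels are made up for illustration and assumed already normalized:

```python
# k-NN: label the query with the majority class of its k closest instances.
import math
from collections import Counter

def knn_classify(query, instances, k=5):
    """instances: list of (point, label); returns the majority label of the k nearest."""
    by_distance = sorted(instances, key=lambda inst: math.dist(query, inst[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

training = [((0.10, 0.20), "A"), ((0.20, 0.10), "A"), ((0.15, 0.25), "A"),
            ((0.80, 0.90), "B"), ((0.90, 0.80), "B"), ((0.85, 0.95), "B"),
            ((0.30, 0.30), "A")]
print(knn_classify((0.2, 0.2), training, k=5))  # "A"
```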

38 Prune "Noisy" Instances Keep track of how often each stored instance correctly predicts new instances. When that value drops below a certain threshold, remove the instance from the graph.
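
One possible way to implement this bookkeeping; the thresholding scheme below is an assumption for illustration, not the presenters' exact procedure:

```python
# Drop stored instances whose prediction accuracy falls below a threshold.
def prune(instances, stats, threshold=0.5, min_uses=5):
    """stats[i] = (times_correct, times_used) for instances[i]."""
    kept = []
    for inst, (correct, used) in zip(instances, stats):
        # Keep instances with too little evidence, or with acceptable accuracy.
        if used < min_uses or correct / used >= threshold:
            kept.append(inst)
    return kept
```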

39 "Pruned" Graph [Scatter plot of Species A, Species B, and the query point after the noisy instances have been removed]

40 Avoid Overfitting - Occam's Razor A: poor but simple. B: good but less simple. C: excellent but too data-specific.

41 Weights Weights are given to features that are more significant than others in producing accurate predictions: multiply the feature value by the weight. [Scatter plot with the petal-count and color axes rescaled by their weights]
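
A sketch of feature weighting inside the distance calculation; the weight values (2 for petal count, 1 for color) are hypothetical:

```python
# Weighted Euclidean distance: multiply each feature by its weight before comparing.
import math

WEIGHTS = (2.0, 1.0)   # (petal count, color), hypothetical values

def weighted_distance(a, b, weights=WEIGHTS):
    return math.sqrt(sum((w * (ax - bx)) ** 2
                         for w, ax, bx in zip(weights, a, b)))

print(weighted_distance((0.2, 0.5), (0.4, 0.5)))  # the petal-count gap counts double
```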

42 Validation Used to calculate error rates and the overall accuracy of the recognition engine. Leave-one-out: use n-1 instances in the classifier, test on the remaining one, repeat n times. Holdout: divide the data into n groups, use n-1 groups in the classifier, test on the remaining group, repeat n times. Bootstrapping: test with a randomly sampled subset of instances.
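
A sketch of the leave-one-out procedure, using a 1-nearest-neighbor classifier and an illustrative data set:

```python
# Leave-one-out: classify each instance with the other n-1, repeat n times.
import math

def leave_one_out_accuracy(instances):
    correct = 0
    for i, (point, label) in enumerate(instances):
        rest = instances[:i] + instances[i + 1:]          # the n-1 training instances
        nearest = min(rest, key=lambda inst: math.dist(point, inst[0]))
        correct += (nearest[1] == label)
    return correct / len(instances)

data = [((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
        ((0.8, 0.9), "B"), ((0.9, 0.8), "B")]
print(leave_one_out_accuracy(data))   # overall accuracy; error rate is 1 - accuracy
```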

43 Potential Pattern Recognition Problems Are there adequate features to distinguish the different classes? Are the features highly correlated? Are there distinct subclasses in the data? Is the feature space too complex?
