Presentation on theme: "Pattern Recognition K-Nearest Neighbor Explained By Arthur Evans John Sikorski Patricia Thomas."— Presentation transcript:

1 Pattern Recognition K-Nearest Neighbor Explained By Arthur Evans John Sikorski Patricia Thomas

2 Overview Pattern Recognition, Machine Learning, Data Mining: How do they fit together? Example Techniques K-Nearest Neighbor Explained

3 Data Mining Searching through electronically stored data in an automatic way. Solving problems with already known data. Essentially, discovering patterns in data. Has several subsets, from statistics to machine learning.

4 Machine Learning Construct computer programs that improve with use. A methodology that draws from many fields: statistics, information theory, biology, philosophy, computer science... Several sub-disciplines: feature extraction, pattern recognition.

5 Pattern Recognition The operation and design of systems that detect patterns in data; the algorithmic process. Applications include image analysis, character recognition, speech analysis, and machine diagnostics.

6 Pattern Recognition Process Gather data, determine the features to use, extract the features, train your recognition engine, classify new instances.

7 Artificial Neural Networks A type of artificial intelligence that attempts to imitate the way a human brain works. Creates connections between processing elements, the computer equivalent of neurons. A supervised technique.

8 ANN continued Tolerant of errors in data. Many applications: speech recognition, visual scene analysis, robot control. Best at interpreting complex real-world sensor data.

9 The Brain The human brain has about 10^11 neurons, each connected to about 10^4 other neurons. Neurons switch in about 10^-3 seconds, slow compared to computers at 10^-13 seconds. Yet the brain recognizes a familiar face in about 10^-1 seconds, only 200-300 cycles at its switching rate. The brain utilizes MASSIVE parallel processing, considering many factors at once.

10 Neural Network Diagram

11 Stuttgart Neural Network Simulator

12 Bayesian Theory Deals with statistical probabilities. One of the best approaches for classifying text. Requires prior knowledge about the expected probabilities.

13 Conveyor Belt Example Want to sort apples and oranges on a conveyor belt. Notice that 80% of the fruit are oranges, so the prior probability of an orange is 0.8. Bayesian theory says: decide w_org if P(w_org | x) > P(w_app | x); otherwise decide w_app.
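
A minimal sketch of this decision rule, assuming made-up Gaussian class-conditional densities for a hypothetical hue feature (only the 80/20 priors come from the slide):

```python
# Bayes decision rule for the conveyor-belt example.
# The likelihood models below are illustrative assumptions, not slide content.
import math

PRIOR = {"orange": 0.8, "apple": 0.2}

def gaussian(x, mean, std):
    """Assumed class-conditional density p(x | class)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical feature: measured hue of the fruit (0 = green, 1 = orange).
LIKELIHOOD = {
    "orange": lambda x: gaussian(x, mean=0.8, std=0.15),
    "apple": lambda x: gaussian(x, mean=0.3, std=0.15),
}

def decide(x):
    """Decide w_org if P(w_org | x) > P(w_app | x); otherwise decide w_app."""
    # Posteriors are proportional to likelihood * prior (the shared p(x) cancels).
    posterior = {c: LIKELIHOOD[c](x) * PRIOR[c] for c in PRIOR}
    return max(posterior, key=posterior.get)

print(decide(0.75))  # likely "orange"
print(decide(0.20))  # likely "apple"
```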

14 Clustering A process of partitioning data into meaningful sub-classes (clusters). Most techniques are unsupervised. Two main categories: Hierarchical - nested clusters displayed as a dendrogram; Non-Hierarchical - each instance belongs to one and only one cluster, and the clusters are not nested.

15 Phylogenetic Tree - Hierarchical [Dendrogram of Ciliary Neurotrophic Factor sequences from Rattus norvegicus, Mus musculus, Homo sapiens, Equus caballus, Gallus gallus, Oryctolagus cuniculus, and Macaca mulatta]

16 Non-Hierarchical

17 K-Means Method [Scatter plot showing the initial cluster seeds and initial cluster boundaries]

18 After One Iteration [Scatter plot showing the new cluster assignments after one iteration]
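
Slides 17-18 can be summarized as the standard K-means loop: assign every point to its nearest seed, recompute each seed as the mean of its cluster, and repeat. A minimal sketch, with illustrative 2-D points and k = 2 that are not from the presentation:

```python
# Minimal K-means: seed, assign to nearest seed, recompute means, repeat.
import random

def kmeans(points, k, iterations=10):
    seeds = random.sample(points, k)               # initial cluster seeds
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:                           # assign to the nearest seed
            distances = [sum((a - b) ** 2 for a, b in zip(p, s)) for s in seeds]
            clusters[distances.index(min(distances))].append(p)
        for i, cluster in enumerate(clusters):     # recompute cluster means
            if cluster:                            # keep old seed if cluster is empty
                seeds[i] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
    return seeds, clusters

points = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.1), (5.2, 4.9), (0.9, 1.1), (4.8, 5.0)]
centers, assignments = kmeans(points, k=2)
print(centers)
```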

19 Decision Tree A flow-chart-like tree structure. An internal node denotes a test on an attribute (feature). A branch represents an outcome of the test; all records in a branch have the same value for the tested attribute. A leaf node represents a class label or class label distribution.

20 Example of Decision Tree A decision tree forecast for playing golf (P = play, N = don't play): the root tests outlook; sunny leads to a humidity test (high -> N, normal -> P); overcast -> P; rain leads to a windy test (true -> N, false -> P).
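
The tree can be read as nested attribute tests; a small sketch of the golf example as reconstructed above, assuming the attribute values shown on the slide:

```python
# The play-golf decision tree as nested tests (P = play, N = don't play).
def classify(outlook, humidity, windy):
    if outlook == "sunny":
        return "N" if humidity == "high" else "P"   # sunny -> test humidity
    elif outlook == "overcast":
        return "P"                                  # overcast -> always play
    else:  # rain
        return "N" if windy else "P"                # rain -> test windy

print(classify("sunny", "high", False))      # N
print(classify("overcast", "normal", True))  # P
print(classify("rain", "normal", True))      # N
```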

21 Instance Based Learning Training consists of simply storing the data; no generalizations are made. All calculations occur at classification time. Referred to as "lazy learning". Can be very accurate, but computationally expensive.

22 Instance Based Methods Locally weighted regression, case-based reasoning, nearest neighbor.

23 Advantages The training stage is trivial, so the classifier adapts easily to new instances. Very accurate. Different "features" may be used for each classification. Able to model complex data with less complex approximations.

24 Difficulties All processing is done at query time: computationally expensive. Determining an appropriate distance metric for retrieving related instances. Irrelevant features may have a negative impact.

25 Case Based Reasoning Does not use Euclidean space; instances are represented as complex logical descriptions. Examples: retrieving help desk information, legal reasoning, conceptual design of mechanical devices.

26 Case Based Process Based on the idea that current problems are similar to past problems. Apply matching algorithms to past problem-solution pairs.

27 Nearest Neighbor Assumes all instances correspond to points in n-dimensional space. The nearest neighbor is defined as the instance closest in Euclidean space: D = sqrt((A_x - B_x)^2 + (A_y - B_y)^2 + ...)
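
A minimal sketch of 1-nearest-neighbor classification with this Euclidean distance; the two stored flower instances and the query are illustrative assumptions, not data from the slides:

```python
# 1-nearest-neighbor: return the label of the closest stored instance.
import math

def euclidean(a, b):
    """D = sqrt((A_x - B_x)^2 + (A_y - B_y)^2 + ...)"""
    return math.sqrt(sum((ax - bx) ** 2 for ax, bx in zip(a, b)))

def nearest_neighbor(query, instances):
    """instances: list of (point, label) pairs; returns the closest label."""
    point, label = min(instances, key=lambda inst: euclidean(query, inst[0]))
    return label

training = [((4, 30), "Species A"), ((12, 200), "Species B")]  # (petal count, color)
print(nearest_neighbor((5, 40), training))  # "Species A"
```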

28 Feature Extraction Features: unique characteristics that define an object. The features used depend on the problem you are trying to solve. Developing a good feature set is more art than science.

29 Sample Case – Identify Flower Species Consider two features: petal count (range 3-15) and color (range 0-255). Assumptions: no two species have exactly the same color; multiple species may have the same petal count.

30 Graph of Instances [Scatter plot of Species A and Species B instances plus the query point; axes: petal count vs. color]

31 Calculate Distances [Same scatter plot, with distances measured from the query point to the stored instances]

32 Species is Closest Neighbor [Scatter plot highlighting the nearest neighbor to the query point among the Species A and Species B instances]

33 Problems The data range of each feature is different. Noisy data may lead to a wrong conclusion. One attribute may hold more importance than another.

34 Without Normalization [Scatter plot on the raw feature ranges: petal count 3-15, color 0-255]

35 Normalized Normalize each feature by subtracting its smallest value from all values and then dividing by the largest; all values then range from 0 to 1. [Scatter plot with both axes rescaled to 0-1]
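
A sketch of this min-max normalization, applied column-wise to a tiny, made-up (petal count, color) data set:

```python
# Min-max normalization: subtract each feature's minimum, divide by its range.
def normalize(rows):
    """rows: list of feature tuples; returns rows rescaled column-wise to 0-1."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - lo or 1 for col, lo in zip(columns, lows)]  # avoid /0
    return [tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
            for row in rows]

data = [(3, 10), (15, 255), (7, 128)]   # (petal count, color), illustrative
print(normalize(data))                  # every feature now lies in 0-1
```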

36 Noise Strategies Take an average of the k closest instances (k-nearest neighbor). Prune noisy instances.

37 K-Nearest Neighbors With k = 5, identify the query as the majority class among its k nearest neighbors. [Scatter plot of Species A, Species B, and the query point; axes: petal count vs. color]
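
A minimal k-nearest-neighbor sketch of this majority-vote rule; the training points and labels are made up for illustration and assumed already normalized:

```python
# k-NN: label the query with the majority class of its k closest instances.
import math
from collections import Counter

def knn_classify(query, instances, k=5):
    """instances: list of (point, label); returns the majority label of the k nearest."""
    by_distance = sorted(instances, key=lambda inst: math.dist(query, inst[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

training = [((0.10, 0.20), "A"), ((0.20, 0.10), "A"), ((0.15, 0.25), "A"),
            ((0.80, 0.90), "B"), ((0.90, 0.80), "B"), ((0.85, 0.95), "B"),
            ((0.30, 0.30), "A")]
print(knn_classify((0.2, 0.2), training, k=5))  # "A"
```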

38 Prune "Noisy" Instances Keep track of how often each stored instance correctly predicts new instances. When that value drops below a certain threshold, remove the instance from the graph.
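
One possible way to implement this bookkeeping; the thresholding scheme below is an assumption for illustration, not the presenters' exact procedure:

```python
# Drop stored instances whose prediction accuracy falls below a threshold.
def prune(instances, stats, threshold=0.5, min_uses=5):
    """stats[i] = (times_correct, times_used) for instances[i]."""
    kept = []
    for inst, (correct, used) in zip(instances, stats):
        # Keep instances with too little evidence, or with acceptable accuracy.
        if used < min_uses or correct / used >= threshold:
            kept.append(inst)
    return kept
```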

39 "Pruned" Graph [Scatter plot of Species A, Species B, and the query point after the noisy instances have been removed]

40 Avoid Overfitting - Occam's Razor A: poor but simple. B: good but less simple. C: excellent but too data-specific.

41 Weights Weights are given to features that are more significant than others in producing accurate predictions: multiply the feature value by the weight. [Scatter plot with the petal-count and color axes rescaled by their weights]
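
A sketch of feature weighting inside the distance calculation; the weight values (2 for petal count, 1 for color) are hypothetical:

```python
# Weighted Euclidean distance: multiply each feature by its weight before comparing.
import math

WEIGHTS = (2.0, 1.0)   # (petal count, color), hypothetical values

def weighted_distance(a, b, weights=WEIGHTS):
    return math.sqrt(sum((w * (ax - bx)) ** 2
                         for w, ax, bx in zip(weights, a, b)))

print(weighted_distance((0.2, 0.5), (0.4, 0.5)))  # the petal-count gap counts double
```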

42 Validation Used to calculate error rates and the overall accuracy of the recognition engine. Leave-one-out: use n-1 instances in the classifier, test on the remaining one, repeat n times. Holdout: divide the data into n groups, use n-1 groups in the classifier, test on the remaining group, repeat n times. Bootstrapping: test with a randomly sampled subset of instances.
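
A sketch of the leave-one-out procedure, using a 1-nearest-neighbor classifier and an illustrative data set:

```python
# Leave-one-out: classify each instance with the other n-1, repeat n times.
import math

def leave_one_out_accuracy(instances):
    correct = 0
    for i, (point, label) in enumerate(instances):
        rest = instances[:i] + instances[i + 1:]          # the n-1 training instances
        nearest = min(rest, key=lambda inst: math.dist(point, inst[0]))
        correct += (nearest[1] == label)
    return correct / len(instances)

data = [((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
        ((0.8, 0.9), "B"), ((0.9, 0.8), "B")]
print(leave_one_out_accuracy(data))   # overall accuracy; error rate is 1 - accuracy
```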

43 Potential Pattern Recognition Problems Are there adequate features to distinguish the different classes? Are the features highly correlated? Are there distinct subclasses in the data? Is the feature space too complex?
