Published by Theodore Nash. Modified over 9 years ago.

Data Mining and Visualization
Jeremy Walton, NAG Ltd, Oxford
24 October 2002

Tools
- NAG Data Mining Components
  - Statistical & machine learning routines (written in C)
  - Data cleaning
  - Data transformation
  - Model building
    - classification, clustering, prediction
- IRIS Explorer
  - Modular visualization environment
  - Visual programming interface
  - Extensible: users can add new modules

Example
- Image from Landsat Multi-Spectral Scanner
  - Each pixel = 80 m × 80 m
- 36 independent variables per region
  - Each region = 3 by 3 array of pixels
  - 4 spectral bands per pixel
- Each pixel is in one of six classes
  - types of land use
- Want to extrapolate class values elsewhere
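
The count of 36 variables follows from 3 × 3 pixels × 4 spectral bands. A minimal sketch of that flattening step is given here, assuming a hypothetical in-memory layout where `image[row][col]` holds the 4 band values of one pixel. The function name `region_features` and the pixel values are made up for illustration and are not taken from the presentation or from the NAG components.

```python
BANDS = 4  # spectral bands per pixel
SIZE = 3   # each region is a 3 by 3 array of pixels

def region_features(image, top, left):
    """Flatten the 3x3 region with top-left pixel (top, left) into one vector."""
    features = []
    for r in range(top, top + SIZE):
        for c in range(left, left + SIZE):
            features.extend(image[r][c])  # append the 4 band values of this pixel
    return features

# Synthetic 3 x 3 x 4 image with made-up intensities (assumption, not real data).
image = [[[r * 10 + c + b for b in range(BANDS)] for c in range(3)]
         for r in range(3)]

vector = region_features(image, 0, 0)
print(len(vector))  # 3 * 3 * 4 values
```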

Treatment (1)
- Use principal component analysis
  - reduces 36 dimensions to 2
    - explains ~85% of variance
- Choose three classes
  - 2: cotton crop
  - 4: damp grey soil
  - 5: soil with vegetation stubble
- 2 independent variables, 1 class variable
  - 1364 points
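
The dimension reduction step can be sketched in plain Python. This is an illustration only, not the NAG routine: it assumes power iteration with deflation as the eigen-solver, and it runs on a small synthetic data set (6 variables driven by 2 hidden factors) rather than the Landsat data, so the fraction of variance it reports is unrelated to the ~85% quoted on the slide. All function names are hypothetical.

```python
import random

def center(rows):
    """Subtract the column means so every variable has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centred data."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]

def top_eigenpair(matrix, rng, iterations=300):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    d = len(matrix)
    vec = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0:
            break
        vec = [x / norm for x in product]
    value = sum(vec[i] * matrix[i][j] * vec[j] for i in range(d) for j in range(d))
    return value, vec

def pca(rows, k, seed=0):
    """Project rows onto the first k principal components."""
    rng = random.Random(seed)
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))  # total variance = trace
    values, components = [], []
    for _ in range(k):
        value, vec = top_eigenpair(cov, rng)
        values.append(value)
        components.append(vec)
        # Deflate: remove the component just found before looking for the next.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(row[j] * comp[j] for j in range(d)) for comp in components]
                 for row in centered]
    return projected, components, sum(values) / total

# Synthetic data (assumption): two hidden factors a, b plus small noise.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    a, b = data_rng.gauss(0, 1), data_rng.gauss(0, 1)
    signal = [3 * a, 2 * a, a, b, 0.5 * b, a + b]
    rows.append([s + data_rng.gauss(0, 0.1) for s in signal])

projected, components, explained = pca(rows, k=2)
print(f"{len(projected)} points reduced to {len(projected[0])} dimensions")
print(f"fraction of variance explained: {explained:.3f}")
```

In practice a library eigen-solver or singular value decomposition would normally replace the hand-written power iteration; it is used here only to keep the sketch self-contained.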

Treatment (2)
- Model data using a decision tree (Quinlan’s C4.5)
  - Each node splits data into two sets
  - Aims to maximise separation of classes
  - Splitting continues recursively
- Classify original data using decision tree
  - each point is assigned to a class
  - 92% agreement with original classes
- Classify new data using decision tree
  - e.g. points on a grid
  - establish boundaries of classes
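
The three steps above (build a tree by recursive binary splits, re-classify the original points, classify a grid of new points) can be sketched as follows. This is a simplified stand-in, not C4.5 itself and not the NAG routine: it chooses splits by plain information gain and omits the gain ratio and pruning that C4.5 uses. The three clusters are synthetic and well separated, so the agreement it prints says nothing about the 92% reported for the Landsat data. All names are hypothetical.

```python
import random
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_split(points, labels):
    """Return (gain, feature, threshold) of the best binary split, or None."""
    base, best = entropy(labels), None
    for feature in range(len(points[0])):
        values = sorted(set(p[feature] for p in points))
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [lab for p, lab in zip(points, labels) if p[feature] <= threshold]
            right = [lab for p, lab in zip(points, labels) if p[feature] > threshold]
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / len(labels)
            if best is None or gain > best[0]:
                best = (gain, feature, threshold)
    return best

def build_tree(points, labels, depth=0, max_depth=4):
    """Split recursively; a leaf is a class label, a node is a tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(points, labels)
    if split is None or split[0] <= 0:
        return majority
    _, feature, threshold = split
    below = [(p, lab) for p, lab in zip(points, labels) if p[feature] <= threshold]
    above = [(p, lab) for p, lab in zip(points, labels) if p[feature] > threshold]
    return (feature, threshold,
            build_tree([p for p, _ in below], [lab for _, lab in below],
                       depth + 1, max_depth),
            build_tree([p for p, _ in above], [lab for _, lab in above],
                       depth + 1, max_depth))

def classify(tree, point):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        feature, threshold, below, above = tree
        tree = below if point[feature] <= threshold else above
    return tree

# Synthetic 2-D data (assumption): one cluster per class label 2, 4 and 5.
rng = random.Random(1)
centres = {2: (0.0, 0.0), 4: (4.0, 0.0), 5: (2.0, 4.0)}
points, labels = [], []
for label, (cx, cy) in centres.items():
    for _ in range(40):
        points.append((rng.gauss(cx, 0.4), rng.gauss(cy, 0.4)))
        labels.append(label)

tree = build_tree(points, labels)

# Classify the original data and measure agreement with the known classes.
predicted = [classify(tree, p) for p in points]
agreement = sum(p == lab for p, lab in zip(predicted, labels)) / len(labels)
print(f"agreement with original classes: {agreement:.0%}")

# Classify new points on a regular grid to expose the class boundaries.
grid = [(x * 0.5, y * 0.5) for y in range(-2, 11) for x in range(-2, 11)]
grid_classes = [classify(tree, g) for g in grid]
```

Because every split tests a single variable against a threshold, the class boundaries recovered from the grid are axis-aligned rectangles in the plane of the two principal components.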

Visualization

Original class values

Predicted class values

Visualization

Lessons learnt / next steps
- Visual programming interface
- Easy to link components & IRIS Explorer
  - building modules using C interface
- Extensible to other data sets?
- What about larger data sets?