Data Mining and Visualization
Jeremy Walton, NAG Ltd, Oxford
24 October 2002



2. Tools
- NAG Data Mining Components
  - Statistical and machine learning routines (written in C)
  - Data cleaning
  - Data transformation
  - Model building
    - classification, clustering, prediction
- IRIS Explorer
  - Modular visualization environment
  - Visual programming interface
  - Extensible: users can add new modules

3. Example
- Image from Landsat Multi-Spectral Scanner
  - Each pixel = 80 m × 80 m
- 36 independent variables per region
  - Each region = 3 by 3 array of pixels
  - 4 spectral bands per pixel
- Each pixel is in one of six classes
  - types of land use
- Want to extrapolate class values elsewhere
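
The count of 36 variables follows from 3 × 3 pixels × 4 bands. As a minimal sketch, assuming the image is held as nested Python lists with 4 band intensities per pixel, a region can be flattened into one feature vector as follows. The function name `region_features` and the row-by-row ordering are illustrative assumptions, not the layout used in the talk.

```python
def region_features(image, row, col):
    """Flatten the 3x3 neighbourhood centred on (row, col) into 36 values.

    Assumes image[r][c] is a sequence of 4 band intensities and that
    (row, col) is not on the image border.
    """
    features = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            features.extend(image[r][c])  # 4 values per pixel
    return features


# Hypothetical 5x5 image with 4 made-up band values per pixel.
image = [[[r, c, r + c, r * c] for c in range(5)] for r in range(5)]
print(len(region_features(image, 2, 2)))
```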

4. Treatment (1)
- Use principal component analysis
  - reduces 36 dimensions to 2
    - explains ~85% of variance
- Choose three classes
  - 2 - cotton crop
  - 4 - damp grey soil
  - 5 - soil with vegetation stubble
- 2 independent variables, 1 class variable
  - 1364 points
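
The dimension reduction above can be sketched in plain Python. This is an illustration only: it uses power iteration with deflation on the sample covariance matrix, not the NAG routines from the talk, and the function names and the synthetic 4-variable data are assumptions made for the example.

```python
import math
import random


def center(rows):
    """Subtract the column means so every variable has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(mat, iters=500, seed=0):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    d = len(mat)
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue estimate for the unit vector.
    val = sum(vec[i] * mat[i][j] * vec[j] for i in range(d) for j in range(d))
    return val, vec


def pca(rows, k=2):
    """Project rows onto the first k principal components.

    Returns (scores, explained), where explained is the fraction of the
    total variance captured by those k components.
    """
    x = center(rows)
    cov = covariance(x)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    comps, captured = [], 0.0
    for _ in range(k):
        val, vec = top_eigenpair(cov)
        comps.append(vec)
        captured += val
        # Deflate: remove the component just found from the covariance.
        cov = [[cov[i][j] - val * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in comps] for r in x]
    return scores, captured / total


# Synthetic data: 4 variables whose variance lies almost entirely in 2 directions.
rows = [[float(i), 2.0 * i, float((i * 13) % 17), 0.01 * ((i * 7) % 5)]
        for i in range(50)]
scores, explained = pca(rows, k=2)
print(len(scores[0]), round(explained, 3))
```

The `explained` fraction plays the role of the ~85% figure on the slide; the value obtained here reflects only the synthetic data.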

5. Treatment (2)
- Model data using a decision tree (Quinlan's C4.5)
  - Each node splits data into two sets
  - Aims to maximise separation of classes
  - Splitting continues recursively
- Classify original data using decision tree
  - each point is assigned to a class
  - 92% agreement with original classes
- Classify new data using decision tree
  - e.g. points on a grid
  - establish boundaries of classes
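
The recursive binary splitting described above can be sketched as follows. This is a simplified illustration, not C4.5 itself: it chooses thresholds by plain information gain and omits C4.5's gain ratio and pruning. The function names, the depth limit, and the toy points (labelled with the class codes 2, 4 and 5 from the previous slide) are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def best_split(points, labels):
    """Return (gain, feature, threshold) of the best binary split, or None."""
    base, n, best = entropy(labels), len(labels), None
    for f in range(len(points[0])):
        values = sorted(set(p[f] for p in points))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbours
            left = [y for p, y in zip(points, labels) if p[f] <= t]
            right = [y for p, y in zip(points, labels) if p[f] > t]
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best


def build(points, labels, depth=0, max_depth=5):
    """Grow a tree recursively; a leaf is simply the majority class label."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(points, labels)
    if split is None or split[0] <= 0.0:
        return majority
    _, f, t = split
    left = [(p, y) for p, y in zip(points, labels) if p[f] <= t]
    right = [(p, y) for p, y in zip(points, labels) if p[f] > t]
    return (f, t,
            build([p for p, _ in left], [y for _, y in left],
                  depth + 1, max_depth),
            build([p for p, _ in right], [y for _, y in right],
                  depth + 1, max_depth))


def classify(tree, point):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if point[f] <= t else right
    return tree


# Toy 2-D points standing in for the two principal components.
points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6),
          (3.0, 0.1), (3.4, 0.5), (2.9, 0.8),
          (1.5, 3.0), (1.8, 3.4), (1.2, 2.9)]
labels = [2, 2, 2, 4, 4, 4, 5, 5, 5]
tree = build(points, labels)

# Agreement between the tree's output and the original classes.
agreement = sum(classify(tree, p) == y
                for p, y in zip(points, labels)) / len(points)

# Classifying points on a grid shows where the class boundaries lie.
grid = [(0.5 * i, 0.5 * j) for i in range(8) for j in range(8)]
predicted = [classify(tree, g) for g in grid]
print(agreement, sorted(set(predicted)))
```

The `agreement` value corresponds to the kind of figure quoted on the slide (92% for the real data); the toy data here is trivially separable, so it says nothing about the real result.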

6. Visualization

7. Original class values

8. Predicted class values

9. Visualization

10. Lessons learnt / next steps
- Visual programming interface
- Easy to link components and IRIS Explorer
  - building modules using C interface
- Extensible to other data sets?
- What about larger data sets?

