Active Learning on Spatial Data Christine Körner Fraunhofer AIS, Uni Bonn.


1 Active Learning on Spatial Data Christine Körner Fraunhofer AIS, Uni Bonn

2 Outline
– Active Learning
– FAW-Project
– Spatial Data
– Experiment Outline

3 Active Learning
Labelled data is difficult / expensive to obtain
– manual preparation of documents for text mining
– analysis of drugs or molecules
Active learning strategies actively select which data points to query in order to
– minimize the number of training examples for a given classification quality
– maximize the quality of results for a given number of data points

4 Selective Sampling
Which instance to choose next? Where we
– have no data?
– perform poorly?
– have a low confidence?
– expect our model to change?
– previously found data that improved quality?
[Diagram: Instance → ORACLE → Label? → add to training set]
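The selection loop on this slide can be sketched roughly as follows. This is a minimal illustration, not the method used in the talk: it implements only the first criterion ("where we have no data") by always querying the pool instance that lies farthest from every instance labelled so far. The names `selective_sampling`, `oracle` and `budget` are hypothetical.

```python
import math

def nearest_labelled_distance(x, labelled):
    """Distance from x to the closest labelled instance (inf if none yet)."""
    if not labelled:
        return float("inf")
    return min(math.dist(x, point) for point, _ in labelled)

def selective_sampling(pool, oracle, budget):
    """Query up to `budget` instances, always picking the one that is
    farthest from the instances labelled so far."""
    labelled = []            # list of (instance, label) pairs
    remaining = list(pool)   # unlabelled candidates
    for _ in range(min(budget, len(remaining))):
        best = max(remaining,
                   key=lambda x: nearest_labelled_distance(x, labelled))
        remaining.remove(best)
        labelled.append((best, oracle(best)))  # ask the oracle for the label
    return labelled

# Toy usage: the "oracle" is a stand-in function, not real measurements.
pool = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (5.0, 0.0)]
training_set = selective_sampling(pool, oracle=lambda p: 2 * p[0], budget=3)
```

The other criteria on the slide (poor performance, low confidence, expected model change) would replace the scoring function inside `max` while the loop itself stays the same.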

5 The FAW-Project
FAW: Association to regulate outdoor commercials
Goal: Prediction of traffic frequencies for 82 major German cities
Samples: ~400-1500 poster sites measured per city

6 Data Characteristics, Prediction
Data characteristics:
– street name, segment ID
– speed class
– street type
– sidewalks
– one-way-road
– POIs: no. restaurants, no. public buildings, …
– spatial coordinates
Prediction with KNN:
– similarity calculated based on scalar attributes and spatial coordinates
– applies weights according to (spatial) distance of neighbors
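A distance-weighted KNN prediction of this kind might look like the sketch below. The slide does not give the exact similarity measure or weighting, so two assumptions are made here: neighbours are chosen by Euclidean distance over the concatenated attributes and coordinates, and each neighbour is weighted by the inverse of its spatial distance. The function name and the record layout (`attrs`, `xy`, `freq`) are hypothetical.

```python
import math

def knn_predict(query_attrs, query_xy, samples, k=3, eps=1e-9):
    """Predict a frequency as the spatially weighted mean of the k most
    similar measured samples.

    Each sample is a dict with tuple-valued 'attrs' and 'xy' and a numeric
    'freq'. Attributes are assumed to be scaled to comparable ranges.
    """
    def similarity_distance(sample):
        # Assumption: similarity = Euclidean distance over attributes + coordinates.
        return math.dist(query_attrs + query_xy, sample["attrs"] + sample["xy"])

    neighbours = sorted(samples, key=similarity_distance)[:k]
    # Assumption: inverse spatial distance as weight; eps avoids division by zero.
    weights = [1.0 / (math.dist(query_xy, s["xy"]) + eps) for s in neighbours]
    weighted_sum = sum(w * s["freq"] for w, s in zip(weights, neighbours))
    return weighted_sum / sum(weights)

# Toy usage with invented values.
samples = [
    {"attrs": (1.0,), "xy": (1.0, 0.0), "freq": 100.0},
    {"attrs": (1.0,), "xy": (-1.0, 0.0), "freq": 300.0},
    {"attrs": (1.0,), "xy": (0.0, 50.0), "freq": 1000.0},
]
prediction = knn_predict((1.0,), (0.0, 0.0), samples, k=2)
```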

7 Spatial Data
Spatial data: spatial covariance between data points
High autocorrelation and concentrated linkage* on street name bias test accuracy
– 1:n relationship between street name and segments
– frequencies within one street are alike
Here: complete instance space is known (all street segments of a city)
[Figure: frequency (axis 0 to 2000) of the segments of the streets Nordstraße and Riesenweg]
*David Jensen, Jennifer Neville: Autocorrelation and Linkage Cause Bias in Evaluation of Relational Learners
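One common way to keep this kind of linkage from inflating test accuracy is to split by street rather than by segment, so that no street contributes segments to both training and test set. The slide does not say whether the project did this; the sketch below only illustrates the idea, and `split_by_street` with its parameters is a hypothetical helper.

```python
import random
from collections import defaultdict

def split_by_street(segments, test_fraction=0.3, seed=0):
    """Split segments so that all segments of a street land on the same side."""
    by_street = defaultdict(list)
    for segment in segments:
        by_street[segment["street"]].append(segment)

    streets = sorted(by_street)              # fixed order before shuffling
    random.Random(seed).shuffle(streets)     # reproducible shuffle
    n_test = max(1, round(test_fraction * len(streets)))
    test_streets = set(streets[:n_test])

    train = [s for s in segments if s["street"] not in test_streets]
    test = [s for s in segments if s["street"] in test_streets]
    return train, test

# Toy usage: the frequencies are invented, only the street names come from the slide.
segments = [
    {"street": "Nordstraße", "segment_id": 1, "freq": 1500},
    {"street": "Nordstraße", "segment_id": 2, "freq": 1480},
    {"street": "Riesenweg", "segment_id": 3, "freq": 400},
    {"street": "Riesenweg", "segment_id": 4, "freq": 420},
]
train, test = split_by_street(segments, test_fraction=0.5)
```

With a segment-level random split, a test segment would usually have a near-identical sibling from the same street in the training set, which is the bias the slide refers to.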

8 Active Learning in FAW
Usage: additional samples at ~50 places per city
KNN needs cross product of street segments with all poster places
– Cologne: 50 GB, 5 days
Strategy:
– Data density: mean distance of next k neighbors
– Model differences: build Model Tree with predicted frequencies; disagreement between models?
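The two strategies named on this slide can be expressed as scoring functions over candidate segments. The sketch below is an interpretation under stated assumptions: density is taken as the mean Euclidean distance to the k nearest measured poster sites, and model difference as the absolute gap between two sets of predictions that are passed in as plain dictionaries. All function and parameter names are hypothetical, and the predictions are assumed to come from models trained elsewhere.

```python
import math

def mean_knn_distance(candidate_xy, measured_xy, k=3):
    """Mean distance from a candidate to its k nearest measured sites.
    Large values indicate sparsely sampled areas. Assumes measured_xy is non-empty."""
    nearest = sorted(math.dist(candidate_xy, p) for p in measured_xy)[:k]
    return sum(nearest) / len(nearest)

def rank_candidates(candidates, measured_xy, knn_pred, tree_pred, k=3):
    """Return two rankings of candidate IDs:
    sparsest area first, and largest model disagreement first."""
    by_density = sorted(
        candidates,
        key=lambda c: mean_knn_distance(candidates[c], measured_xy, k),
        reverse=True,
    )
    by_disagreement = sorted(
        candidates,
        key=lambda c: abs(knn_pred[c] - tree_pred[c]),
        reverse=True,
    )
    return by_density, by_disagreement

# Toy usage with invented coordinates and predictions.
measured = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
candidates = {"a": (0.5, 0.5), "b": (10.0, 10.0)}
density_rank, disagreement_rank = rank_candidates(
    candidates, measured,
    knn_pred={"a": 500.0, "b": 520.0},
    tree_pred={"a": 900.0, "b": 510.0},
)
```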

9 Experiment Outline
[Diagram labels: Test, Training, Oracle, Samples, Model Tree, Distance, Ranking for AL, KNN, Frequencies, Iterations]
Comparison of accuracy increase using ranking vs. random order of added samples
Alternatives:
– iterative ranking (reality?, greedy search optimal?)
– rank once, remove similar objects (e.g. exclude segments of same street, …)
Possible problems:
– KNN not very stable
– few samples, Oracle has little choice to provide requested data sets
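The comparison of ranked versus random order can be pictured as two learning curves over the same added samples. The sketch below is a toy harness, not the experiment from the talk: it uses a 1-nearest-neighbour predictor on one-dimensional inputs as a stand-in for the KNN model, mean absolute error as the quality measure, and invented data. `learning_curve` and `nearest_neighbour` are hypothetical names.

```python
import random

def nearest_neighbour(train, x):
    """Stand-in model: return the target of the closest training input."""
    return min(train, key=lambda sample: abs(sample[0] - x))[1]

def learning_curve(initial, additions, test, predict=nearest_neighbour):
    """Mean absolute error on `test` after each sample in `additions`
    has been added to the training set, in the given order."""
    train = list(initial)
    errors = []
    for sample in additions:
        train.append(sample)
        mae = sum(abs(predict(train, x) - y) for x, y in test) / len(test)
        errors.append(mae)
    return errors

# Toy usage: (input, target) pairs, all values invented.
initial = [(0.0, 0.0)]
test = [(9.0, 90.0), (2.0, 20.0)]
ranked = [(10.0, 100.0), (1.0, 10.0)]                 # order from an AL ranking
shuffled = random.Random(0).sample(ranked, len(ranked))  # random baseline order

ranked_curve = learning_curve(initial, ranked, test)
random_curve = learning_curve(initial, shuffled, test)
```

Both curves end at the same error once all samples are added; the interesting part is how quickly each order gets there.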

10 Thank you!
Suggestions, Ideas, Questions

