Published by Dustin Scott. Modified over 9 years ago.
1
Introduction to IEEE ICDM Data Mining Contest (ICDM DMC 2007) guofeng314@163.com
2
Main Parts: (1) Introduction to ICDM DMC 2007; (2) The work of our team
3
Introduction to ICDM DMC 2007: This year's contest is the first IEEE ICDM Data Mining Contest, which will be held in conjunction with the 2007 IEEE International Conference on Data Mining. http://www.cse.ust.hk/~qyang/ICDMDMC07/
4
What is the Problem? This year's contest is about indoor location estimation from radio signal strengths received by a client device from various WiFi Access Points (APs).
5
What is the AP? Access Points are base stations for the wireless network. They transmit and receive radio frequencies that wireless-enabled devices use to communicate.
6
The client device (which can be a PDA) is equipped with a wireless card that can receive signals from many surrounding wireless access points (APs). Each of these APs is identifiable with a unique ID. Based on the collection of signal strength values (RSS values), a data mining algorithm running on the client device tries to figure out the current location of the user.
7
RSS Vectors: An RSS vector is a collection of k (AP ID, value) pairs. The AP ID is an integer between 0 and 100. The value is also an integer, between 0 and -99. The number k differs from one RSS vector to another. The WiFi data are very noisy because of the so-called multi-path effect in indoor environments.
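A minimal Python sketch of how one RSS collection could be held in memory: a dictionary from AP ID to signal value. The text layout parsed here (an optional location label followed by space-separated `APID:value` tokens) is an assumption modeled on the example collections in the Step 2 slides, and `parse_collection` is a hypothetical helper; the contest files may be organized differently.

```python
def parse_collection(line):
    """Parse 'label apid:value apid:value ...' into (label, {apid: value}).

    Assumed layout, for illustration only. A label of None marks an
    unlabeled collection (no leading token without a colon).
    """
    tokens = line.split()
    label = None
    if tokens and ":" not in tokens[0]:
        label = int(tokens[0])
        tokens = tokens[1:]
    rss = {}
    for token in tokens:
        ap_id, value = token.split(":")
        rss[int(ap_id)] = int(value)
    return label, rss


label, rss = parse_collection("119 18:-96 23:-87 66:-69")
print(label, rss)  # 119 {18: -96, 23: -87, 66: -69}
```

A dictionary fits the data because the number of pairs k varies from one collection to the next, so a fixed-length array would need a filler for every AP that was not heard.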
8
Location Label: All WiFi data are collected at 247 locations, where each location is a grid cell of about 1.5m×1.5m. The location label is an integer between 1 and 247.
9
Task 1. Indoor Location Estimation: All the WiFi data (training data and test data) are collected by the same device in the same time period. Two types of data are provided in this task: (1) trace data and (2) non-trace data.
10
Task1. trace data
11
Some statistics of the Task 1 trace data: 40 traces; 1404 collections, 130 of them labeled; 11881 pairs of APID and value; on average 8.5 pairs of APID and value per collection (minimum 1, maximum 19).
12
Task1. non-trace data
13
Some statistics of the Task 1 non-trace data: 1792 collections of RSS values, 375 of them labeled; 15256 pairs of APID and value; on average 8.5 pairs of APID and value per collection (minimum 1, maximum 19).
14
Task_2_training_data
15
Some statistics of Task_2_training_data: 2322 collections of RSS values, 621 of them labeled; on average 2.5 labeled collections per class (minimum 1, maximum 8); on average 8.6 pairs of APID and value per collection (minimum 2, maximum 19).
16
Task2 Test Dataset
17
Task2 Landmark Dataset
18
Evaluation Criterion: For Task 1, the baseline precision is 60%. For Task 2, the baseline precision is 30%.
19
The algorithm of our team for task2
20
Step 1: Sieve out the labeled collections
21
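Step 2: Get the differences between every two labeled collections. For each pair of collections, compute five numbers:
1. The number of APID-value pairs that appear in only one of the two collections.
2. The sum of the absolute differences between those RSS values and -100.
3. The number of APIDs that appear in both collections.
4. The sum of the absolute differences between the two RSS values of each such shared APID.
5. Whether the two collections come from the same location: 1 if they do, -1 if not.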
22
An example:
collectionA: 119 18:-96 23:-87 66:-69
collectionB: 54 18:-94 83:-62 85:-76 86:-72 89:-85
The five numbers are 6, 149, 1, 2, -1
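The five numbers in this example can be reproduced with a short Python sketch. It reads the leading number of each collection (119 and 54) as the location label and the rest as APID:value pairs; that reading is an inference, but it is the one that yields the slide's result. The function name `pair_features` and the constant `MISSING_RSS` are hypothetical names introduced for illustration.

```python
MISSING_RSS = -100  # reference value used for an AP heard by only one collection


def pair_features(label_a, rss_a, label_b, rss_b):
    """Return the five Step 2 numbers for two labeled collections."""
    only_one = set(rss_a) ^ set(rss_b)   # APs present in exactly one collection
    shared = set(rss_a) & set(rss_b)     # APs present in both collections
    merged = {**rss_a, **rss_b}          # lookup for the single available value

    n_single = len(only_one)
    sum_single = sum(abs(merged[ap] - MISSING_RSS) for ap in only_one)
    n_shared = len(shared)
    sum_shared = sum(abs(rss_a[ap] - rss_b[ap]) for ap in shared)
    same_location = 1 if label_a == label_b else -1
    return n_single, sum_single, n_shared, sum_shared, same_location


collection_a = {18: -96, 23: -87, 66: -69}                   # label 119
collection_b = {18: -94, 83: -62, 85: -76, 86: -72, 89: -85}  # label 54
print(pair_features(119, collection_a, 54, collection_b))
# (6, 149, 1, 2, -1)
```

Six APs (23, 66, 83, 85, 86, 89) are heard by only one collection, and their distances from -100 are 13 + 31 + 38 + 24 + 28 + 15 = 149. AP 18 is the only shared one, with |-96 - (-94)| = 2.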
23
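Step 3: Get coefficients by linear fitting

    e = dlmread('distance_matrix.txt');   % load the pairwise difference table
    b = e(:,5);                           % same-location flag: 1 or -1
    x = e(:,7:9);
    x(:,2) = [];                          % keep columns 7 and 9 as features
    [x1,y1] = find(b>0);                  % rows of the same-location pairs
    x_pos = x(x1,:);
    b_pos = b(x1,1);
    x_append = x;
    b_append = b;
    % append copies of the same-location pairs to offset their scarcity
    for i = 1:floor(length(b)/length(b_pos))
        x_append = cat(1,x_append,x_pos);
        b_append = cat(1,b_append,b_pos);
    end
    a = x_append\b_append;                % least-squares coefficients
    c = (x*a).*b;                         % positive where sign(x*a) agrees with b
    accuracy = sum(c>0)/length(b);
    display(accuracy);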
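For readers without MATLAB, the same fitting idea can be sketched in Python with NumPy: duplicate the same-location rows, solve a least-squares problem without an intercept, and score by sign agreement. The function name `fit_pair_weights` and the random data are made up for illustration. The slides do not describe the column layout of `distance_matrix.txt`, so the sketch takes the feature matrix and the flag vector as given inputs.

```python
import numpy as np


def fit_pair_weights(x, b):
    """Least-squares fit of b (+1 / -1) on x, with positive rows duplicated.

    x: (n, d) array of pair features; b: (n,) array of +1 / -1 flags.
    Returns the coefficient vector and the sign-agreement rate on (x, b).
    """
    pos = b > 0
    n_pos = int(pos.sum())
    # Same repetition count as the MATLAB loop: floor(n / number of positives).
    repeats = len(b) // n_pos if n_pos else 0
    x_aug = np.vstack([x] + [x[pos]] * repeats)
    b_aug = np.concatenate([b] + [b[pos]] * repeats)
    # np.linalg.lstsq stands in for MATLAB's backslash on a tall system.
    a, *_ = np.linalg.lstsq(x_aug, b_aug, rcond=None)
    accuracy = float(np.mean((x @ a) * b > 0))
    return a, accuracy


# Synthetic stand-in data, not contest data.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
b = np.where(rng.random(200) < 0.1, 1.0, -1.0)
a, accuracy = fit_pair_weights(x, b)
print(a.shape, accuracy)
```

Duplicating the positive rows matters because same-location pairs are rare among all pairs of labeled collections. Without it, a fit that always predicts -1 would already score well on sign agreement.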
24
Remaining steps: Step 4: Get the center of each class (the collections from the same location). Step 5: Testing. Our highest precision was 28.30%.
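The slides do not say how the class centers are computed or how a test collection is matched against them, so the following Python sketch is only one plausible reading, not the team's method. It assumes a center is the per-AP mean RSS over a location's labeled collections, that an unheard AP counts as -100 (mirroring Step 2), and that a test collection is assigned to the center with the smallest sum of absolute differences. The fitted coefficients from Step 3 are not used here. All names are hypothetical.

```python
from collections import defaultdict

MISSING_RSS = -100  # assumed value for an AP that a collection did not hear


def class_centers(labeled):
    """labeled: iterable of (label, {apid: rss}). Returns {label: {apid: mean}}."""
    groups = defaultdict(list)
    for label, rss in labeled:
        groups[label].append(rss)
    centers = {}
    for label, group in groups.items():
        aps = set().union(*group)  # every AP heard at this location
        centers[label] = {
            ap: sum(r.get(ap, MISSING_RSS) for r in group) / len(group)
            for ap in aps
        }
    return centers


def predict(rss, centers):
    """Return the label of the center closest to rss in absolute difference."""
    def distance(center):
        aps = set(rss) | set(center)
        return sum(
            abs(rss.get(ap, MISSING_RSS) - center.get(ap, MISSING_RSS))
            for ap in aps
        )
    return min(centers, key=lambda label: distance(centers[label]))


# Toy labeled collections, not contest data.
training = [(1, {18: -60, 23: -70}), (1, {18: -62}), (2, {66: -50})]
centers = class_centers(training)
print(predict({18: -61}, centers), predict({66: -55}, centers))  # 1 2
```

With an average of only 2.5 labeled collections per location in the Task 2 training data, each center is estimated from very few samples.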
25
Thank you!