1
Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date: 2012.10.11
2
Introduction
Network Data
Collective Classification
ICA
Problem
Algorithm for Collective Inference with Noise
Experiments
Conclusions
3
Traditional data: instances are independent of each other.
Network data: instances may be related to each other.
Applications: email, web pages, paper citations.
4
Collective classification: classify interrelated instances using relational features.
Related instances: the instances to be classified are related to one another.
Classifier: the base (local) classifier uses both content features and relational features.
Collective inference: iteratively update the class labels and recompute the relational feature values.
5
ICA: Iterative Classification Algorithm

Step 1 (initialization):
    train the local classifier on content features
    use it to predict the unlabeled instances
Step 2 (iterate) {
    for each unlabeled instance {
        set the unlabeled instance's relational features
        use the local classifier to re-predict the unlabeled instance
    }
}
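A minimal runnable sketch of ICA in Python, assuming a scikit-learn-style local classifier trained on content features concatenated with the count-based relational feature. The function names, the 0-based class ids, and the zero-padding used for the content-only bootstrap are illustrative assumptions, not from the slides.

```python
import numpy as np

def relational_features(node, labels, neighbors, n_classes):
    """Count-based relational feature: the fraction of the node's
    neighbors currently assigned to each class (class ids are 0-based)."""
    counts = np.zeros(n_classes)
    for nb in neighbors[node]:
        counts[labels[nb]] += 1
    return counts / counts.sum() if counts.sum() > 0 else counts

def ica(clf, X_content, labels, unlabeled, neighbors, n_classes, n_iters=10):
    """clf: a trained scikit-learn-style classifier over
    [content features | relational features]. labels holds the known
    labels; entries for unlabeled nodes are filled in and updated."""
    # Step 1: bootstrap from content features alone (zero relational
    # features stand in for a content-only prediction here).
    for u in unlabeled:
        x = np.concatenate([X_content[u], np.zeros(n_classes)])
        labels[u] = int(clf.predict(x.reshape(1, -1))[0])
    # Step 2: iterate; recompute relational features, then re-predict.
    for _ in range(n_iters):
        for u in unlabeled:
            rel = relational_features(u, labels, neighbors, n_classes)
            x = np.concatenate([X_content[u], rel])
            labels[u] = int(clf.predict(x.reshape(1, -1))[0])
    return labels
```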
6
Example of ICA.
[Figure: a graph of training nodes with class labels 1, 2, or 3 and unlabeled nodes A-H.]
Initial: use content features to predict the unlabeled instances.
Iteration 1: (1) set each unlabeled instance's relational features; (2) re-predict the unlabeled instances with the local classifier.
Iteration 2: (1) set each unlabeled instance's relational features; (2) re-predict the unlabeled instances with the local classifier.
7
Problem: an instance labeled with the wrong class makes the classifier make mistakes, and such noisy instances are difficult to judge.
[Figure: nodes A-G; the classifier cannot decide between class 1 and class 2 for node A.]
Content features in the example:
    feature a: woman = 0, man = 1
    feature b: age ≤ 20 = 0, age > 20 = 1
Class labels: 1 = non-smoking, 2 = smoking.
8
The problem in ICA.
[Figure: unlabeled node A with three neighbors; the training data are labeled 1, except one noise node labeled 2. A's true label differs from what ICA predicts.]
Initial: use content features to predict the unlabeled instances.
Iterations 1 and 2: set each unlabeled instance's relational features, then re-predict with the local classifier.

A's relational feature (fraction of neighbors per class):
                Class 1   Class 2   Class 3
    Iteration 1   2/3       1/3       0
    Iteration 2   2/3       1/3       0

Because the count-based relational feature gives every neighbor equal weight, the noisy neighbor's influence never decreases across iterations, and ICA keeps mislabeling A.
9
ACIN: Algorithm for Collective Inference with Noise

Step 1 (initialization):
    train the local classifier on content features
    use it to predict the unlabeled instances
Iterate {
    for each unlabeled instance A {
        Step 2: for each neighbor nb of A {
            if nb needs to be predicted again:
                (class label, probability) = local classifier(nb)
        }
        Step 3: set A's relational features
        Step 4: (class label, probability) = local classifier(A)
    }
    Step 5: retrain the local classifier
}
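A Python sketch of ACIN's main loop, following the five steps above. It relies on three helpers that correspond to the later slides: `prob_relational_features` (the probability-weighted relational feature), `repredict_neighbor` (the re-prediction rule), and `retrain` (step 5); each is sketched after the slide that describes it. All names and signatures are assumptions for illustration.

```python
import numpy as np

def featurize(u, X_content, labels, probs, neighbors, n_classes):
    # Content features concatenated with the probability-weighted
    # relational feature (sketched after a later slide).
    rel = prob_relational_features(u, labels, probs, neighbors, n_classes)
    return np.concatenate([X_content[u], rel]).reshape(1, -1)

def acin(clf, X_content, y_train, train_idx, labels, probs, unlabeled,
         neighbors, n_classes, n_iters=10):
    """labels/probs hold each node's current (class id, probability);
    entries for unlabeled nodes are updated in place."""
    repredict = {u: True for u in unlabeled}  # first iteration: re-predict all
    for _ in range(n_iters):
        for u in unlabeled:
            # Step 2: re-predict this instance's flagged neighbors.
            for nb in neighbors[u]:
                if repredict.get(nb, False):
                    x = featurize(nb, X_content, labels, probs,
                                  neighbors, n_classes)
                    repredict_neighbor(clf, x, nb, labels, probs, repredict)
            # Steps 3-4: set the relational feature and re-predict u.
            x = featurize(u, X_content, labels, probs, neighbors, n_classes)
            p = clf.predict_proba(x)[0]
            labels[u], probs[u] = int(p.argmax()), float(p.max())
        # Step 5: retrain the local classifier on the current predictions.
        clf = retrain(clf, X_content, y_train, train_idx, labels, probs,
                      unlabeled, neighbors, n_classes)
    return labels, probs
```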
10
The same example with ACIN.
[Figure: the same graph; one neighbor of A is noise.]
Initial: use content features to predict the unlabeled instances.
Iteration 1: (1) predict each unlabeled instance's neighbors; (2) set the instance's relational features; (3) re-predict the instance with the local classifier.
Iteration 2: (1) re-predict the neighbors that are flagged for re-prediction; (2) set the instance's relational features; (3) re-predict the instance with the local classifier.

A's probability-weighted relational feature:
                Class 1   Class 2   Class 3
    Iteration 1  70/130    60/130     0
    Iteration 2  60/120    60/120     0

Unlike ICA's count-based feature, the probability-weighted feature changes across iterations as neighbors are re-predicted, and ACIN ends up assigning A its true label.
11
Comparison with ICA. ACIN differs in three ways:
1. a different method for computing the relational feature,
2. re-predicting each unlabeled instance's neighbors,
3. retraining the local classifier.
12
Computing the relational feature with probabilities.
Example: A's three neighbors are predicted as (1, 80%), (2, 60%), (3, 70%).
Our method (probability-weighted):
    Class 1: 80/(80+60+70)   Class 2: 60/(80+60+70)   Class 3: 70/(80+60+70)
General (count-based) method:
    Class 1: 1/3   Class 2: 1/3   Class 3: 1/3
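A sketch of the probability-weighted relational feature in Python, reproducing the slide's 80/(80+60+70) example; the function name and the 0-based class ids are assumptions. Each neighbor contributes its prediction probability to its predicted class, and the vector is normalized.

```python
import numpy as np

def prob_relational_features(node, labels, probs, neighbors, n_classes):
    """Our method: weight each neighbor by its prediction probability."""
    w = np.zeros(n_classes)
    for nb in neighbors[node]:
        w[labels[nb]] += probs[nb]
    return w / w.sum() if w.sum() > 0 else w

# Slide example: A's neighbors are predicted (1, 80%), (2, 60%), (3, 70%).
neighbors = {0: [1, 2, 3]}
labels = {1: 0, 2: 1, 3: 2}        # class ids 0, 1, 2 stand for 1, 2, 3
probs = {1: 0.8, 2: 0.6, 3: 0.7}
print(prob_relational_features(0, labels, probs, neighbors, 3))
# -> approximately [0.381, 0.286, 0.333], i.e. 80/210, 60/210, 70/210
```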
13
Re-predicting an unlabeled instance's neighbors (a sketch follows below).
In the first iteration, every neighbor needs to be predicted again. After that:
If the original and re-predicted labels differ:
    ▪ the neighbor must be predicted again next iteration;
    ▪ the new prediction is not adopted this iteration.
If the original and re-predicted labels are the same:
    ▪ no re-prediction is needed next iteration;
    ▪ the two probabilities are averaged.
Example: B is re-predicted from (1, 80%) to (2, 60%), so the new prediction is not adopted and B must be predicted again; C is re-predicted from (2, 80%) to (2, 60%), so its probability is averaged to (2, 70%).
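The re-prediction rule as a Python sketch that plugs into the ACIN loop above; `x` is the neighbor's current feature row. Setting the probability to 0 when the labels disagree anticipates Method 3 on the next slide; names, and the simplification of not restoring the original probability before a later averaging step, are assumptions.

```python
def repredict_neighbor(clf, x, nb, labels, probs, repredict):
    """Re-predict one neighbor and apply the rule from this slide."""
    p = clf.predict_proba(x)[0]
    new_label, new_prob = int(p.argmax()), float(p.max())
    if new_label == labels[nb]:
        # Same label: average the probabilities, stop re-predicting.
        probs[nb] = (probs[nb] + new_prob) / 2.0
        repredict[nb] = False
    else:
        # Different label: do not adopt the new prediction this iteration
        # (the neighbor contributes probability 0 to the relational
        # feature) and predict this neighbor again next iteration.
        probs[nb] = 0.0
        repredict[nb] = True
```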
14
Handling a neighbor whose re-predicted label disagrees: three candidate methods.
[Figure: node A with neighbors B, C, and D; one neighbor is noise. Original predictions: B = (1, 60%), C = (2, 70%), D = (3, 60%). On re-prediction, C goes from (2, 70%) to (2, 80%) and is averaged to (2, 75%); D stays (3, 60%); B's new label disagrees with its original label (the exact re-predicted value is left as (?, ??%) on the slide).]
Method 1: keep B's original label and average the probability: B = (1, 50%).
Method 2: adopt B's new prediction: B = (2, 60%).
Method 3: keep B's label but give it probability 0 this iteration: B = (1, 0%).

A's relational feature:
              Class 1   Class 2   Class 3
    original  60/190    70/190    60/190
    Method 1  50/185    75/185    60/185
    Method 2   0/195   135/195    60/195
    Method 3   0/135    75/135    60/135

B's true label is 2. If B is noise, Method 2 > Method 3 > Method 1; if B is not noise, Method 1 > Method 3 > Method 2. Methods 1 and 2 are both too extreme, so we choose Method 3.
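A small Python check that reproduces the numbers in the table above; the per-method (label, probability) values for B are read off the slide, and class ids 1-3 are kept as-is here.

```python
def rel_feature(preds, n_classes=3):
    # Probability-weighted class vector over A's neighbors, normalized.
    w = [0.0] * n_classes
    for label, prob in preds.values():
        w[label - 1] += prob * 100   # work in percentage points
    total = sum(w)
    return [x / total for x in w]

after = {"C": (2, 0.75), "D": (3, 0.60)}               # C averaged, D unchanged
original = {"B": (1, 0.60), "C": (2, 0.70), "D": (3, 0.60)}
method1 = dict(after, B=(1, 0.50))   # keep B's label, average the probability
method2 = dict(after, B=(2, 0.60))   # adopt B's new prediction
method3 = dict(after, B=(1, 0.00))   # exclude B this iteration

for name, preds in [("original", original), ("Method 1", method1),
                    ("Method 2", method2), ("Method 3", method3)]:
    print(name, rel_feature(preds))
# original -> [60/190, 70/190, 60/190], Method 1 -> [50/185, 75/185, 60/185],
# Method 2 -> [0, 135/195, 60/195],     Method 3 -> [0, 75/135, 60/135]
```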
15
Retraining the local classifier.
[Figure: initially (as in ICA) the classifier is trained only on the labeled training nodes; for retraining, the predicted nodes A-E, carrying predicted (label, probability) pairs such as (1, 90%), (2, 60%), (2, 70%), (1, 80%), and (3, 70%), are added to the training data and the classifier is refit.]
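A sketch of step 5 that plugs into the ACIN loop above. The slide shows the predicted nodes, with their (label, probability) pairs, being added to the training data; weighting each added instance by its prediction probability is an assumption on my part, not something the slide states.

```python
import numpy as np

def retrain(clf, X_content, y_train, train_idx, labels, probs, unlabeled,
            neighbors, n_classes):
    """Refit the local classifier on the original training data plus the
    currently predicted unlabeled instances."""
    rows, ys, ws = [], [], []
    for t, y in zip(train_idx, y_train):
        rows.append(featurize(t, X_content, labels, probs,
                              neighbors, n_classes)[0])
        ys.append(y)
        ws.append(1.0)                 # known labels: full weight
    for u in unlabeled:
        rows.append(featurize(u, X_content, labels, probs,
                              neighbors, n_classes)[0])
        ys.append(labels[u])
        ws.append(probs[u])            # predicted labels: weight = confidence
    clf.fit(np.array(rows), np.array(ys), sample_weight=np.array(ws))
    return clf
```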
16
Data sets: Cora, CiteSeer, WebKB.
17