Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date:
Outline:
Introduction
  Network data
  Collective Classification
  ICA
Problem
Algorithm For Collective Inference With Noise (ACIN)
Experiments
Conclusions
Network data
Traditional data: instances are independent of each other.
Network data: instances may be related to each other.
Applications: web pages, paper citations.
Collective Classification
Classify interrelated instances using relational features.
Related instances: the instances to be classified are related to each other.
Classifier: the base classifier uses both content features and relational features.
Collective inference: update the class labels and recompute the relational feature values.
ICA: Iterative Classification Algorithm

Initial:                                                      (step 1)
  train the local classifier
  use content features to predict the unlabeled instances
Iterate {
  for each unlabeled instance {
    set the unlabeled instance's relational features          (step 2)
    use the local classifier to predict the unlabeled instance
  }
}
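A minimal sketch of this loop in Python, assuming scikit-learn-style classifiers with fit/predict and 0-indexed class labels; all names here (relational_feature, ica, content_clf, etc.) are illustrative, not from the thesis:

```python
import numpy as np

def relational_feature(i, labels, neighbors, n_classes):
    """Fraction of i's neighbors currently assigned to each class (count-based)."""
    counts = np.zeros(n_classes)
    for nb in neighbors[i]:
        if labels[nb] is not None:
            counts[labels[nb]] += 1
    return counts / counts.sum() if counts.sum() > 0 else counts

def ica(content_clf, full_clf, content, labels, neighbors,
        train_idx, unlabeled_idx, n_classes, n_iters=10):
    y_train = [labels[i] for i in train_idx]

    # Initial step: bootstrap the unlabeled instances from content features alone.
    content_clf.fit(content[train_idx], y_train)
    for i in unlabeled_idx:
        labels[i] = content_clf.predict(content[i:i + 1])[0]

    # Train the local classifier on content + relational features.
    X_train = np.array([np.concatenate(
        [content[i], relational_feature(i, labels, neighbors, n_classes)])
        for i in train_idx])
    full_clf.fit(X_train, y_train)

    # Iterative step: recompute relational features, then re-predict each instance.
    for _ in range(n_iters):
        for i in unlabeled_idx:
            rel = relational_feature(i, labels, neighbors, n_classes)
            x = np.concatenate([content[i], rel]).reshape(1, -1)
            labels[i] = full_clf.predict(x)[0]
    return labels
```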
Example (ICA)
[Figure: a graph with training nodes labeled 1 and 3 and unlabeled nodes A-H; the labels spread over the iterations.]
Initial: use content features to predict the unlabeled instances.
Iteration 1:
  1. set each unlabeled instance's relational features
  2. use the local classifier to predict the unlabeled instances
Iteration 2:
  1. set each unlabeled instance's relational features
  2. use the local classifier to predict the unlabeled instances
Problem
If the local classifier labels an instance with the wrong class, the mistake propagates: the wrong label enters the neighbors' relational features and makes them difficult to judge.
[Figure: a graph A-G with content features (man/woman, age <= 20 / age > 20) and class labels (smoking/non-smoking); one node's prediction is ambiguous between class 1 and class 2.]
Example (ICA with noise)
[Figure: unlabeled node A with neighbors; the training data contains a noisy node. A's true label is 2, but ICA predicts 1.]
Initial: use content features to predict the unlabeled instances.
Iteration 1:
  1. set each unlabeled instance's relational features
  2. use the local classifier to predict the unlabeled instances
Iteration 2: the same two steps.

A's relational feature:
              Class 1   Class 2   Class 3
Iteration 1     2/3       1/3       0
Iteration 2     2/3       1/3       0

The count-based relational feature gives the noisy neighbor the same weight in every iteration, so A stays mislabeled.
ACIN: Algorithm For Collective Inference With Noise

Initial:                                                        (step 1)
  train the local classifier
  use content features to predict the unlabeled instances
Iterate {
  for each unlabeled instance {
    for each neighbor nb of the unlabeled instance {            (step 2)
      if nb needs to be predicted again:
        (class label, probability) = local classifier(nb)
    }
    set the unlabeled instance's relational features            (step 3)
    (class label, probability) = local classifier(the instance) (step 4)
  }
  retrain the local classifier                                  (step 5)
}
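A sketch of one ACIN iteration, under the same assumptions as the ICA sketch above. Here pred[i] holds the current (class label, probability) pair for instance i, needs_repredict is a set of neighbors whose last re-prediction conflicted, and the classifier is assumed to expose predict_proba; the handling of conflicting neighbors is refined later (Method 3). All helper names are illustrative:

```python
import numpy as np

def prob_relational_feature(i, pred, neighbors, n_classes):
    """Probability-weighted class distribution over i's neighbors."""
    w = np.zeros(n_classes)
    for nb in neighbors[i]:
        label, prob = pred[nb]
        w[label] += prob
    return w / w.sum() if w.sum() > 0 else w

def build_features(i, content, pred, neighbors, n_classes):
    rel = prob_relational_feature(i, pred, neighbors, n_classes)
    return np.concatenate([content[i], rel]).reshape(1, -1)

def acin_iteration(full_clf, content, pred, neighbors, unlabeled_idx,
                   needs_repredict, n_classes):
    for i in unlabeled_idx:
        # Step 2: re-predict i's neighbors when required.
        for nb in neighbors[i]:
            if nb in needs_repredict:
                proba = full_clf.predict_proba(
                    build_features(nb, content, pred, neighbors, n_classes))[0]
                new_label = int(proba.argmax())
                old_label, old_prob = pred[nb]
                if new_label == old_label:
                    # Labels agree: average the probabilities, stop re-predicting.
                    pred[nb] = (old_label, (old_prob + proba[new_label]) / 2)
                    needs_repredict.discard(nb)
                # Labels disagree: keep the old prediction; re-predict next time.
        # Steps 3-4: set i's relational feature, then predict i itself.
        proba = full_clf.predict_proba(
            build_features(i, content, pred, neighbors, n_classes))[0]
        pred[i] = (int(proba.argmax()), float(proba.max()))
```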
Example (ACIN, same noisy data)
[Figure: the same graph; A's true label is 2, and ACIN predicts 2.]
Initial: use content features to predict the unlabeled instances.
Iteration 1:
  1. predict the unlabeled instance's neighbors
  2. set the unlabeled instance's relational features
  3. use the local classifier to predict the unlabeled instances
Iteration 2:
  1. re-predict the unlabeled instance's neighbors
  2. set the unlabeled instance's relational features
  3. use the local classifier to predict the unlabeled instances

A's relational feature:
              Class 1    Class 2    Class 3
Iteration 1    70/130     60/130       0
Iteration 2    60/120     60/120       0

Because ACIN weights neighbors by probability and re-predicts them, the noisy neighbor's weight for class 1 drops after re-prediction (70 to 60), so the relational feature no longer favors the wrong class.
Comparison with ICA
1. A different method for computing the relational features (probability-weighted).
2. Re-predict each unlabeled instance's neighbors.
3. Retrain the local classifier.
Computing the relational feature with probabilities

Example: A's neighbors are predicted (1, 80%), (2, 60%), (3, 70%).
Our method:
  Class 1: 80/(80+60+70)
  Class 2: 60/(80+60+70)
  Class 3: 70/(80+60+70)
General method (count-based):
  Class 1: 1/3
  Class 2: 1/3
  Class 3: 1/3
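A quick check of the arithmetic for this three-neighbor example:

```python
# Neighbor predictions: (class, probability) for A's three neighbors.
neighbor_preds = [(1, 0.80), (2, 0.60), (3, 0.70)]
total = sum(p for _, p in neighbor_preds)        # 0.80 + 0.60 + 0.70 = 2.10
rel = {c: p / total for c, p in neighbor_preds}
print(rel)  # {1: ~0.381, 2: ~0.286, 3: ~0.333}, vs. the uniform 1/3 of the count-based method
```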
Re-predicting an unlabeled instance's neighbors

In the first iteration, every neighbor needs to be predicted again.
If the re-predicted label differs from the original label:
  - the neighbor must be predicted again next iteration
  - the new prediction is not adopted this iteration
If the re-predicted label matches the original label:
  - the neighbor does not need to be predicted again next iteration
  - the two probabilities are averaged

Example: A's neighbors are B and C. B: original (1, 80%), re-predicted (2, 60%): the labels differ, so B is predicted again next iteration. C: original (2, 80%), re-predicted (2, 60%): the labels match, so C's probability is averaged to (2, 70%).
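The decision rule as a small helper (a sketch; pred and needs_repredict follow the ACIN sketch above):

```python
def update_neighbor(nb, pred, needs_repredict, new_label, new_prob):
    """Apply the re-prediction rule to neighbor nb."""
    old_label, old_prob = pred[nb]
    if new_label == old_label:
        # Same label: average the probabilities; no re-prediction next iteration.
        pred[nb] = (old_label, (old_prob + new_prob) / 2)
        needs_repredict.discard(nb)
    else:
        # Different label: do not adopt it this iteration; re-predict next time.
        needs_repredict.add(nb)

pred = {"C": (2, 0.80)}
update_neighbor("C", pred, set(), new_label=2, new_prob=0.60)
print(pred["C"])  # (2, 0.70), matching the slide's example
```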
Example: three methods for a conflicting neighbor

[Figure: an unlabeled node with neighbors B, C, D; the training data contains noise. B: original (1, 60%), re-predicted (2, 60%): the labels conflict. C: original (2, 70%), re-predicted (2, 80%): averaged to (2, 75%). D: (3, 60%): unchanged.]

How should B enter the relational feature?
Method 1: keep B's original label: (1, 50%).
Method 2: adopt B's re-prediction: (2, 60%).
Method 3: exclude B: (1, 0%).

Relational feature:
            Class 1    Class 2    Class 3
original     60/190     70/190     60/190
Method 1     50/185     75/185     60/185
Method 2      0/195    135/195     60/195
Method 3      0/135     75/135     60/135

B's true label is 2.
If B is noise: Method 2 > Method 3 > Method 1.
If B is not noise: Method 1 > Method 3 > Method 2.
Methods 1 and 2 are both too extreme, so we choose Method 3.
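A sketch of Method 3 applied to the probability-weighted relational feature; conflict is a hypothetical set of neighbors whose re-predicted label disagreed with the original, and the pred values below are the slide's numbers:

```python
import numpy as np

def method3_relational_feature(i, pred, neighbors, conflict, n_classes):
    """Method 3: conflicting neighbors contribute nothing (probability 0)."""
    w = np.zeros(n_classes)
    for nb in neighbors[i]:
        if nb in conflict:
            continue               # excluded from the relational feature
        label, prob = pred[nb]
        w[label - 1] += prob       # classes numbered 1..n_classes on the slides
    return w / w.sum() if w.sum() > 0 else w

# B excluded, C averaged to (2, 75%), D at (3, 60%):
pred = {"B": (1, 0.60), "C": (2, 0.75), "D": (3, 0.60)}
neighbors = {"X": ["B", "C", "D"]}
print(method3_relational_feature("X", pred, neighbors, {"B"}, 3))
# -> [0.0, 0.555..., 0.444...], i.e. 0/135, 75/135, 60/135: the Method 3 row
```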
Retrain the local classifier

[Figure: the predicted unlabeled instances A (1, 90%), B (2, 60%), C (2, 70%), D (1, 80%), E (3, 70%) are added to the original training data, and the local classifier is retrained. In the initial step the classifier is trained on the original training data only, as in ICA.]
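A sketch of this retraining step (step 5), reusing prob_relational_feature from the ACIN sketch above; it simply appends the currently predicted instances to the original training data:

```python
import numpy as np

def retrain(full_clf, X_train, y_train, content, pred, neighbors,
            unlabeled_idx, n_classes):
    """Retrain on the original training data plus the predicted instances."""
    X_extra, y_extra = [], []
    for i in unlabeled_idx:
        rel = prob_relational_feature(i, pred, neighbors, n_classes)
        X_extra.append(np.concatenate([content[i], rel]))
        y_extra.append(pred[i][0])   # predicted class label
    X = np.vstack([X_train, np.array(X_extra)])
    y = list(y_train) + y_extra
    full_clf.fit(X, y)
    return full_clf
```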
Experiments
Data sets: Cora, CiteSeer, WebKB