1
Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date: 2012.10.11
2
Introduction
Network Data
Collective Classification
ICA
Problem
Algorithm for Collective Inference with Noise
Experiments
Conclusions
3
Traditional data: instances are independent of each other.
Network data: instances may be related to each other.
Applications: email, web pages, paper citations.
4
Collective classification: classify interrelated instances using relational features.
Related instances: the instances to be classified are related to one another.
Classifier: the base (local) classifier uses both content features and relational features.
Collective inference: iteratively update the class labels and recompute the relational feature values.
5
ICA: Iterative Classification Algorithm

Step 1 (initialization):
    train the local classifier on content features
    use it to predict the unlabeled instances
Step 2 (iterate) {
    for each unlabeled instance {
        set the unlabeled instance's relational features
        use the local classifier to re-predict the unlabeled instance
    }
}
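A minimal runnable sketch of ICA in Python, assuming a scikit-learn-style local classifier trained on content features concatenated with the count-based relational feature. The function names, the 0-based class ids, and the zero-padding used for the content-only bootstrap are illustrative assumptions, not from the slides.

```python
import numpy as np

def relational_features(node, labels, neighbors, n_classes):
    """Count-based relational feature: the fraction of the node's
    neighbors currently assigned to each class (class ids are 0-based)."""
    counts = np.zeros(n_classes)
    for nb in neighbors[node]:
        counts[labels[nb]] += 1
    return counts / counts.sum() if counts.sum() > 0 else counts

def ica(clf, X_content, labels, unlabeled, neighbors, n_classes, n_iters=10):
    """clf: a trained scikit-learn-style classifier over
    [content features | relational features]. labels holds the known
    labels; entries for unlabeled nodes are filled in and updated."""
    # Step 1: bootstrap from content features alone (zero relational
    # features stand in for a content-only prediction here).
    for u in unlabeled:
        x = np.concatenate([X_content[u], np.zeros(n_classes)])
        labels[u] = int(clf.predict(x.reshape(1, -1))[0])
    # Step 2: iterate; recompute relational features, then re-predict.
    for _ in range(n_iters):
        for u in unlabeled:
            rel = relational_features(u, labels, neighbors, n_classes)
            x = np.concatenate([X_content[u], rel])
            labels[u] = int(clf.predict(x.reshape(1, -1))[0])
    return labels
```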
6
Example of ICA.
[Figure: a graph of training nodes with class labels 1, 2, or 3 and unlabeled nodes A-H.]
Initial: use content features to predict the unlabeled instances.
Iteration 1: (1) set each unlabeled instance's relational features; (2) re-predict the unlabeled instances with the local classifier.
Iteration 2: (1) set each unlabeled instance's relational features; (2) re-predict the unlabeled instances with the local classifier.
7
Problem: an instance labeled with the wrong class makes the classifier make mistakes, and such noisy instances are difficult to judge.
[Figure: nodes A-G; the classifier cannot decide between class 1 and class 2 for node A.]
Content features in the example:
    feature a: woman = 0, man = 1
    feature b: age ≤ 20 = 0, age > 20 = 1
Class labels: 1 = non-smoking, 2 = smoking.
8
The problem in ICA.
[Figure: unlabeled node A with three neighbors; the training data are labeled 1, except one noise node labeled 2. A's true label differs from what ICA predicts.]
Initial: use content features to predict the unlabeled instances.
Iterations 1 and 2: set each unlabeled instance's relational features, then re-predict with the local classifier.

A's relational feature (fraction of neighbors per class):
                Class 1   Class 2   Class 3
    Iteration 1   2/3       1/3       0
    Iteration 2   2/3       1/3       0

Because the count-based relational feature gives every neighbor equal weight, the noisy neighbor's influence never decreases across iterations, and ICA keeps mislabeling A.
9
ACIN: Algorithm for Collective Inference with Noise

Step 1 (initialization):
    train the local classifier on content features
    use it to predict the unlabeled instances
Iterate {
    for each unlabeled instance A {
        Step 2: for each neighbor nb of A {
            if nb needs to be predicted again:
                (class label, probability) = local classifier(nb)
        }
        Step 3: set A's relational features
        Step 4: (class label, probability) = local classifier(A)
    }
    Step 5: retrain the local classifier
}
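A Python sketch of ACIN's main loop, following the five steps above. It relies on three helpers that correspond to the later slides: `prob_relational_features` (the probability-weighted relational feature), `repredict_neighbor` (the re-prediction rule), and `retrain` (step 5); each is sketched after the slide that describes it. All names and signatures are assumptions for illustration.

```python
import numpy as np

def featurize(u, X_content, labels, probs, neighbors, n_classes):
    # Content features concatenated with the probability-weighted
    # relational feature (sketched after a later slide).
    rel = prob_relational_features(u, labels, probs, neighbors, n_classes)
    return np.concatenate([X_content[u], rel]).reshape(1, -1)

def acin(clf, X_content, y_train, train_idx, labels, probs, unlabeled,
         neighbors, n_classes, n_iters=10):
    """labels/probs hold each node's current (class id, probability);
    entries for unlabeled nodes are updated in place."""
    repredict = {u: True for u in unlabeled}  # first iteration: re-predict all
    for _ in range(n_iters):
        for u in unlabeled:
            # Step 2: re-predict this instance's flagged neighbors.
            for nb in neighbors[u]:
                if repredict.get(nb, False):
                    x = featurize(nb, X_content, labels, probs,
                                  neighbors, n_classes)
                    repredict_neighbor(clf, x, nb, labels, probs, repredict)
            # Steps 3-4: set the relational feature and re-predict u.
            x = featurize(u, X_content, labels, probs, neighbors, n_classes)
            p = clf.predict_proba(x)[0]
            labels[u], probs[u] = int(p.argmax()), float(p.max())
        # Step 5: retrain the local classifier on the current predictions.
        clf = retrain(clf, X_content, y_train, train_idx, labels, probs,
                      unlabeled, neighbors, n_classes)
    return labels, probs
```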
10
The same example with ACIN.
[Figure: the same graph; one neighbor of A is noise.]
Initial: use content features to predict the unlabeled instances.
Iteration 1: (1) predict each unlabeled instance's neighbors; (2) set the instance's relational features; (3) re-predict the instance with the local classifier.
Iteration 2: (1) re-predict the neighbors that are flagged for re-prediction; (2) set the instance's relational features; (3) re-predict the instance with the local classifier.

A's probability-weighted relational feature:
                Class 1   Class 2   Class 3
    Iteration 1  70/130    60/130     0
    Iteration 2  60/120    60/120     0

Unlike ICA's count-based feature, the probability-weighted feature changes across iterations as neighbors are re-predicted, and ACIN ends up assigning A its true label.
11
Comparison with ICA. ACIN differs in three ways:
1. a different method for computing the relational feature,
2. re-predicting each unlabeled instance's neighbors,
3. retraining the local classifier.
12
Computing the relational feature with probabilities.
Example: A's three neighbors are predicted as (1, 80%), (2, 60%), (3, 70%).
Our method (probability-weighted):
    Class 1: 80/(80+60+70)   Class 2: 60/(80+60+70)   Class 3: 70/(80+60+70)
General (count-based) method:
    Class 1: 1/3   Class 2: 1/3   Class 3: 1/3
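A sketch of the probability-weighted relational feature in Python, reproducing the slide's 80/(80+60+70) example; the function name and the 0-based class ids are assumptions. Each neighbor contributes its prediction probability to its predicted class, and the vector is normalized.

```python
import numpy as np

def prob_relational_features(node, labels, probs, neighbors, n_classes):
    """Our method: weight each neighbor by its prediction probability."""
    w = np.zeros(n_classes)
    for nb in neighbors[node]:
        w[labels[nb]] += probs[nb]
    return w / w.sum() if w.sum() > 0 else w

# Slide example: A's neighbors are predicted (1, 80%), (2, 60%), (3, 70%).
neighbors = {0: [1, 2, 3]}
labels = {1: 0, 2: 1, 3: 2}        # class ids 0, 1, 2 stand for 1, 2, 3
probs = {1: 0.8, 2: 0.6, 3: 0.7}
print(prob_relational_features(0, labels, probs, neighbors, 3))
# -> approximately [0.381, 0.286, 0.333], i.e. 80/210, 60/210, 70/210
```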
13
Re-predicting an unlabeled instance's neighbors (a sketch follows below).
In the first iteration, every neighbor needs to be predicted again. After that:
If the original and re-predicted labels differ:
    ▪ the neighbor must be predicted again next iteration;
    ▪ the new prediction is not adopted this iteration.
If the original and re-predicted labels are the same:
    ▪ no re-prediction is needed next iteration;
    ▪ the two probabilities are averaged.
Example: B is re-predicted from (1, 80%) to (2, 60%), so the new prediction is not adopted and B must be predicted again; C is re-predicted from (2, 80%) to (2, 60%), so its probability is averaged to (2, 70%).
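The re-prediction rule as a Python sketch that plugs into the ACIN loop above; `x` is the neighbor's current feature row. Setting the probability to 0 when the labels disagree anticipates Method 3 on the next slide; names, and the simplification of not restoring the original probability before a later averaging step, are assumptions.

```python
def repredict_neighbor(clf, x, nb, labels, probs, repredict):
    """Re-predict one neighbor and apply the rule from this slide."""
    p = clf.predict_proba(x)[0]
    new_label, new_prob = int(p.argmax()), float(p.max())
    if new_label == labels[nb]:
        # Same label: average the probabilities, stop re-predicting.
        probs[nb] = (probs[nb] + new_prob) / 2.0
        repredict[nb] = False
    else:
        # Different label: do not adopt the new prediction this iteration
        # (the neighbor contributes probability 0 to the relational
        # feature) and predict this neighbor again next iteration.
        probs[nb] = 0.0
        repredict[nb] = True
```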
14
Handling a neighbor whose re-predicted label disagrees: three candidate methods.
[Figure: node A with neighbors B, C, and D; one neighbor is noise. Original predictions: B = (1, 60%), C = (2, 70%), D = (3, 60%). On re-prediction, C goes from (2, 70%) to (2, 80%) and is averaged to (2, 75%); D stays (3, 60%); B's new label disagrees with its original label (the exact re-predicted value is left as (?, ??%) on the slide).]
Method 1: keep B's original label and average the probability: B = (1, 50%).
Method 2: adopt B's new prediction: B = (2, 60%).
Method 3: keep B's label but give it probability 0 this iteration: B = (1, 0%).

A's relational feature:
              Class 1   Class 2   Class 3
    original  60/190    70/190    60/190
    Method 1  50/185    75/185    60/185
    Method 2   0/195   135/195    60/195
    Method 3   0/135    75/135    60/135

B's true label is 2. If B is noise, Method 2 > Method 3 > Method 1; if B is not noise, Method 1 > Method 3 > Method 2. Methods 1 and 2 are both too extreme, so we choose Method 3.
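A small Python check that reproduces the numbers in the table above; the per-method (label, probability) values for B are read off the slide, and class ids 1-3 are kept as-is here.

```python
def rel_feature(preds, n_classes=3):
    # Probability-weighted class vector over A's neighbors, normalized.
    w = [0.0] * n_classes
    for label, prob in preds.values():
        w[label - 1] += prob * 100   # work in percentage points
    total = sum(w)
    return [x / total for x in w]

after = {"C": (2, 0.75), "D": (3, 0.60)}               # C averaged, D unchanged
original = {"B": (1, 0.60), "C": (2, 0.70), "D": (3, 0.60)}
method1 = dict(after, B=(1, 0.50))   # keep B's label, average the probability
method2 = dict(after, B=(2, 0.60))   # adopt B's new prediction
method3 = dict(after, B=(1, 0.00))   # exclude B this iteration

for name, preds in [("original", original), ("Method 1", method1),
                    ("Method 2", method2), ("Method 3", method3)]:
    print(name, rel_feature(preds))
# original -> [60/190, 70/190, 60/190], Method 1 -> [50/185, 75/185, 60/185],
# Method 2 -> [0, 135/195, 60/195],     Method 3 -> [0, 75/135, 60/135]
```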
15
Retraining the local classifier.
[Figure: initially (as in ICA) the classifier is trained only on the labeled training nodes; for retraining, the predicted nodes A-E, carrying predicted (label, probability) pairs such as (1, 90%), (2, 60%), (2, 70%), (1, 80%), and (3, 70%), are added to the training data and the classifier is refit.]
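A sketch of step 5 that plugs into the ACIN loop above. The slide shows the predicted nodes, with their (label, probability) pairs, being added to the training data; weighting each added instance by its prediction probability is an assumption on my part, not something the slide states.

```python
import numpy as np

def retrain(clf, X_content, y_train, train_idx, labels, probs, unlabeled,
            neighbors, n_classes):
    """Refit the local classifier on the original training data plus the
    currently predicted unlabeled instances."""
    rows, ys, ws = [], [], []
    for t, y in zip(train_idx, y_train):
        rows.append(featurize(t, X_content, labels, probs,
                              neighbors, n_classes)[0])
        ys.append(y)
        ws.append(1.0)                 # known labels: full weight
    for u in unlabeled:
        rows.append(featurize(u, X_content, labels, probs,
                              neighbors, n_classes)[0])
        ys.append(labels[u])
        ws.append(probs[u])            # predicted labels: weight = confidence
    clf.fit(np.array(rows), np.array(ys), sample_weight=np.array(ws))
    return clf
```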
16
Data sets: Cora, CiteSeer, WebKB.
17