Team: flyingsky Reporter: YanJie Fu & ChuanRen Liu Institution: Chinese Academy of Sciences.

1. Challenge Result 2. Environment and Tools 3. Approach and Strategy 4. Summary

Global Score 0.628873 Score 0.862212

Linux; Python + Java; Weka + our extension based on Weka

Approach and Strategy - Procedure: 1st submit, 2nd submit, 3rd submit, ……, Final submit. 1st submit: predict with clustering method. Kth (K>1) submit: predict with ensemble method.

Approach and Strategy - Method in 1st Submit: with only 1 given label, cluster the whole dataset into 2 groups, Cluster(+1) and Cluster(-1), then use label judging to decide which cluster is which.
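The label judging step can be sketched as follows. This is a minimal illustration, not the team's code: it assumes the clustering has already produced a cluster id (0 or 1) for each instance, and the function name `label_clusters` and its parameters are hypothetical.

```python
def label_clusters(assignments, known_index, known_label):
    """Map cluster ids (0 or 1) to labels +1/-1 using a single known label."""
    if known_label not in (+1, -1):
        raise ValueError("known_label must be +1 or -1")
    known_cluster = assignments[known_index]
    # The cluster that holds the labeled instance takes its label;
    # every instance in the other cluster takes the opposite label.
    return [known_label if c == known_cluster else -known_label
            for c in assignments]


# Hypothetical cluster ids for five instances; instance 2 is the one
# whose label (+1) is given.
assignments = [0, 0, 1, 1, 0]
predictions = label_clusters(assignments, known_index=2, known_label=+1)
print(predictions)
```

With a single label there is no way to check the clustering itself, so the quality of this first prediction depends entirely on how well the two clusters match the two classes.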

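Approach and Strategy - Query Sample Selection: the whole dataset is divided into the train dataset and the test dataset. A prediction committee (SVM, J48, Bayes) predicts each instance. Agree: the instance is not queried first. Disagree: the instance becomes a query sample. Query samples are the instances with the biggest differences between prediction results from different predictors in the committee.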
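A minimal sketch of this selection rule, under stated assumptions: each committee member outputs a score in [0, 1] for the +1 class, and "biggest difference" is measured as the spread between the highest and lowest score for an instance. The slides do not define the measure, so this spread, the function name `select_queries`, and the sample scores are all illustrative.

```python
def select_queries(scores_by_model, k):
    """Return indices of the k instances on which the committee disagrees most."""
    names = list(scores_by_model)
    n = len(scores_by_model[names[0]])
    spread = []
    for i in range(n):
        column = [scores_by_model[m][i] for m in names]
        # Disagreement on instance i: gap between the most and least
        # confident +1 score across the committee (an assumed measure).
        spread.append(max(column) - min(column))
    ranked = sorted(range(n), key=lambda i: spread[i], reverse=True)
    return ranked[:k]


# Hypothetical +1 scores from the three predictors for four instances.
scores = {
    "SVM":   [0.9, 0.2, 0.8, 0.1],
    "J48":   [0.8, 0.7, 0.9, 0.2],
    "Bayes": [0.9, 0.3, 0.1, 0.1],
}
queries = select_queries(scores, k=2)
print(queries)
```

Instances 0 and 3 are skipped here because all three predictors give nearly the same score, which mirrors the "Agree" branch of the slide.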

SVM: higher accuracy. J48: based on a classification tree; runs more quickly with fewer features (Dataset A has the fewest features of the 5 final datasets). Bayes: fast with good accuracy.

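Prediction: SVM, J48, and Bayes are trained on the train data (query results) and applied to the test data; their outputs are combined by weighted voting, followed by an adjustment step, to give the final prediction.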
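The weighted voting step might look like the sketch below. It assumes each predictor casts a hard vote of +1 or -1 and that the final label is the sign of the weighted sum, with ties going to +1. The weights are made-up values, and the adjustment step is not modeled because the slides do not describe it.

```python
def weighted_vote(votes_by_model, weights):
    """Combine +1/-1 votes per instance using one weight per predictor."""
    names = list(votes_by_model)
    n = len(votes_by_model[names[0]])
    result = []
    for i in range(n):
        total = sum(weights[m] * votes_by_model[m][i] for m in names)
        # Sign of the weighted sum; a tie is resolved as +1 (an assumption).
        result.append(1 if total >= 0 else -1)
    return result


# Hypothetical votes for three test instances and hypothetical weights.
votes = {"SVM": [1, -1, 1], "J48": [1, 1, -1], "Bayes": [-1, 1, -1]}
weights = {"SVM": 0.9, "J48": 0.7, "Bayes": 0.6}
final = weighted_vote(votes, weights)
print(final)
```

In this example SVM has the largest weight but is still outvoted on the second and third instances, because J48 and Bayes together outweigh it.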

On the first submit, with only one label, we use a clustering method.

On the second, third, fourth, and later submits, we use an ensemble method including SVM, J48, and Bayes.

Each time, we choose as query samples the instances with the biggest differences between the prediction results of the different predictors (SVM, J48, Bayes). In the ensemble method, if every "committee" member (actually a predictor) says "I think the result is +1", or every member says "the result is -1", there is no need to take that instance as a query sample first. We should pay more attention to the case where one committee member, say SVM, stands out and says "I do not agree with the prediction result of Bayes." Such instances should be taken as query samples first.

There is a method called "must link" and "cannot link". Its main idea is to transpose the matrix of the test dataset, viewing instances as features and features as instances, and then apply association analysis to the transposed matrix. The result of this analysis can be used to improve the final ensemble voting results before every submit.

We think the predictors are not equally suitable for every dataset; some do better than others on datasets of a particular topic, for example handwriting. So each predictor in the committee should be a weighted predictor. After several queries, we can do cross evaluation on the labels provided by the system and get the prediction accuracy of each predictor (SVM, J48, Bayes).
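One way to turn those per-predictor accuracies into committee weights is sketched below. It assumes held-out predictions on the queried labels are already available for each predictor and that weights are simply accuracies normalized to sum to 1. The slides do not give the weighting formula, so this normalization and the name `accuracy_weights` are assumptions.

```python
def accuracy_weights(true_labels, predictions_by_model):
    """Turn each predictor's accuracy on queried labels into a voting weight."""
    accuracy = {}
    for name, preds in predictions_by_model.items():
        correct = sum(1 for t, p in zip(true_labels, preds) if t == p)
        accuracy[name] = correct / len(true_labels)
    total = sum(accuracy.values())
    if total == 0:
        # No predictor got anything right: fall back to equal weights.
        return {name: 1.0 / len(accuracy) for name in accuracy}
    return {name: acc / total for name, acc in accuracy.items()}


# Hypothetical queried labels and held-out predictions from each predictor.
true_labels = [1, -1, 1, -1]
held_out = {
    "SVM":   [1, -1, 1, -1],   # 4 of 4 correct
    "J48":   [1, 1, 1, -1],    # 3 of 4 correct
    "Bayes": [1, 1, -1, -1],   # 2 of 4 correct
}
weights = accuracy_weights(true_labels, held_out)
print(weights)
```

Because the accuracies are re-estimated after each round of queries, the weights can shift from one submit to the next as more labels arrive.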
