Published by Rolf Washington. Modified over 8 years ago.
1
Team: flyingsky
Reporter: YanJie Fu & ChuanRen Liu
Institution: Chinese Academy of Sciences
2
1. Challenge Result
2. Environment and Tools
3. Approach and Strategy
4. Summary
3
Global Score: 0.628873
Score: 0.862212
4
Linux
Python + Java
Weka + our extension based on Weka
5
Approach and Strategy - Procedure
Submission sequence: 1st submit, 2nd submit, 3rd submit, ……, Final submit
1st submit: predict with clustering method
Kth (K>1) submit: predict with ensemble method
6
Approach and Strategy - Method in 1st Submit
The Whole Dataset: clustering into 2 groups, Cluster(+1) and Cluster(-1)
Label judging: only 1 given label
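The first-submit idea on this slide can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which clustering algorithm was used, so a plain 2-means loop stands in for it, and the names `sq_dist`, `two_means`, and `label_clusters` as well as the toy data are hypothetical. The cluster that contains the single labeled instance receives that label, and the other cluster receives the opposite one.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(points, iterations=20):
    """Split points into two clusters with a basic 2-means loop; returns 0/1 ids."""
    # Deterministic start: the first point and the point farthest from it.
    centers = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    assign = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assign = [0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def label_clusters(assign, known_index, known_label):
    """Label judging: the known instance's cluster gets its label, the other the opposite."""
    known_cluster = assign[known_index]
    return [known_label if a == known_cluster else -known_label for a in assign]


# Toy data (hypothetical): two well-separated groups, one known label (+1) at index 3.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
assign = two_means(points)
predictions = label_clusters(assign, known_index=3, known_label=+1)
print(predictions)
```

With a single label there is no way to check the clustering itself, so the quality of this first prediction depends entirely on how well the two clusters match the two classes.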
7
Approach and Strategy - Query Sample Selection
The Whole Dataset: Train Dataset and Test Dataset
Prediction committee (SVM, J48, Bayes): Agree / Disagree
Query samples: instances with the biggest differences between the prediction results of the different predictors in the committee
8
SVM: higher accuracy
J48: based on a classification tree; faster with fewer features (Dataset A has the fewest features of the 5 final datasets)
Bayes: fast, with good accuracy
9
Train Data (Query Results) and Test Data -> SVM, J48, Bayes -> Weighted Voting -> Adjustment -> Prediction
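The weighted voting step in this pipeline might look like the sketch below. It assumes each predictor outputs +1 or -1 per test instance and that a weight per predictor is already available; the function name `weighted_vote`, the weights, and the predictions are hypothetical, and the "Adjustment" step is not modeled because the slides do not describe it.

```python
def weighted_vote(predictions, weights):
    """Combine +1/-1 predictions per instance using one weight per predictor.

    predictions: dict mapping predictor name to a list of +1/-1 labels
    weights:     dict mapping predictor name to a non-negative weight
    Returns (labels, scores); a score's sign gives the label, ties go to +1.
    """
    n = len(next(iter(predictions.values())))
    scores = [
        sum(weights[name] * preds[i] for name, preds in predictions.items())
        for i in range(n)
    ]
    labels = [1 if s >= 0 else -1 for s in scores]
    return labels, scores


# Hypothetical committee output on four test instances.
predictions = {
    "SVM":   [1, -1,  1, -1],
    "J48":   [1,  1, -1, -1],
    "Bayes": [1, -1, -1, -1],
}
# Hypothetical weights: SVM is trusted more than the other two combined.
weights = {"SVM": 0.6, "J48": 0.25, "Bayes": 0.15}

labels, scores = weighted_vote(predictions, weights)
print(labels)
```

On the third instance an unweighted majority vote would return -1, while these weights let SVM override the other two, which is the practical difference between weighted and plain voting.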
10
On the first submission, with only one label available, we use a clustering method. From the second submission onward, we use an ensemble method that combines SVM, J48, and Bayes. Each time, we choose as query samples the instances with the biggest differences between the prediction results of the different predictors (SVM, J48, Bayes). The reasoning is this: if every committee member (each member is a predictor) says the result is +1, or every member says it is -1, there is no need to query that instance first. We should pay more attention to the case where one member, say SVM, stands out and says, "I do not agree with the prediction result of Bayes." Such instances should be taken as query samples first.
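A minimal sketch of this selection rule follows. The slides do not define how the "difference" between predictors is measured, so this example assumes each committee member reports a probability for the +1 class and measures disagreement as the spread (maximum minus minimum) of those probabilities. The names `disagreement` and `select_queries`, the `budget` parameter, and the numbers are hypothetical.

```python
def disagreement(probabilities):
    """Spread of the committee's P(+1) estimates for one instance."""
    return max(probabilities) - min(probabilities)


def select_queries(committee_probs, budget):
    """Return indices of the `budget` instances the committee disagrees on most.

    committee_probs: dict mapping predictor name to a list of P(+1) values,
                     one per unlabeled instance.
    """
    n = len(next(iter(committee_probs.values())))
    scored = []
    for i in range(n):
        spread = disagreement([probs[i] for probs in committee_probs.values()])
        scored.append((spread, i))
    # Largest spread first; equal spreads are ordered by instance index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [i for _, i in scored[:budget]]


# Hypothetical P(+1) estimates for four unlabeled instances.
committee_probs = {
    "SVM":   [0.90, 0.80, 0.20, 0.70],
    "J48":   [0.95, 0.10, 0.10, 0.30],
    "Bayes": [0.85, 0.60, 0.15, 0.50],
}

queries = select_queries(committee_probs, budget=2)
print(queries)
```

Instances 0 and 2 are skipped because all three members roughly agree on them, while instances 1 and 3 are queried because the members contradict each other.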
11
There is a method based on "must link" and "cannot link" constraints. Its main idea is to transpose the matrix of the test dataset, viewing instances as features and features as instances, and then apply association analysis to the transposed matrix. The result of this analysis can be used to improve the final ensemble voting results before every submission. We also think the predictors are not equally suitable for every dataset; some do better on datasets of a particular topic, for example handwriting. So each predictor in the committee should be a weighted predictor. After several queries, we can do cross evaluation on the labels provided by the system and obtain the prediction accuracy of each predictor (SVM, J48, Bayes).
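One way to turn those accuracies into committee weights is sketched below. It is a simplification, not the team's procedure: it assumes each predictor's labels for the queried instances were produced without training on those same instances (for example, as out-of-fold predictions from the cross evaluation), and it simply normalizes the accuracies so they sum to 1. The names `accuracy` and `committee_weights` and all data are hypothetical.

```python
def accuracy(predicted, truth):
    """Fraction of positions where the predicted label equals the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def committee_weights(held_out_predictions, queried_labels):
    """Weight each predictor by its accuracy on the queried labels, normalized to sum to 1."""
    acc = {
        name: accuracy(preds, queried_labels)
        for name, preds in held_out_predictions.items()
    }
    total = sum(acc.values())
    if total == 0:
        # No predictor got anything right: fall back to equal weights.
        return {name: 1.0 / len(acc) for name in acc}
    return {name: a / total for name, a in acc.items()}


# Hypothetical labels returned by the system for four queried instances.
queried_labels = [1, -1, 1, -1]
# Hypothetical held-out predictions of each committee member on those instances.
held_out_predictions = {
    "SVM":   [1, -1,  1, -1],   # 4 of 4 correct
    "J48":   [1,  1,  1, -1],   # 3 of 4 correct
    "Bayes": [1, -1, -1,  1],   # 2 of 4 correct
}

weights = committee_weights(held_out_predictions, queried_labels)
print(weights)
```

Because the weights are recomputed from whatever labels have been queried so far, they can shift from one submission to the next as more labels arrive.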