EigenTransfer: A Unified Framework for Transfer Learning
Wenyuan Dai, Ou Jin, Gui-Rong Xue, Qiang Yang and Yong Yu
Shanghai Jiao Tong University & Hong Kong University of Science and Technology
Outline
◦ Motivation
◦ Problem Formulation
◦ Graph Construction
◦ A Brief Review of Spectral Analysis
◦ Learning from Graph Spectra
◦ Experimental Results
◦ Conclusion
A variety of transfer learning tasks have been investigated.
Differences
◦ Different tasks
◦ Different approaches and algorithms
Commonality
◦ Shared parts and relations among the tasks
We can build a single graph whose nodes are the features, the auxiliary data, the training data, the test data, and the labels, and derive a new representation from it.
Spectral analysis of this graph gives a new representation of the training and test data. A traditional non-transfer learner can then be applied on top of it.
Target training data: with labels
Target test data: without labels
Auxiliary data
Tasks
◦ Cross-domain learning
◦ Cross-category learning
◦ Self-taught learning
Cross-domain Learning: the auxiliary data are labeled and come from a different domain than the target data.
Cross-category Learning: the auxiliary data are labeled with categories different from the target categories.
Self-taught Learning: the auxiliary data are unlabeled.
The doc-token matrix (rows: documents, columns: tokens) is turned into an adjacency matrix over three kinds of nodes. Documents connect to the features (tokens) they contain and to their labels; there are no feature-feature or label-label edges:

             Doc          Feature        Label
Doc          0            doc-token      doc-label
Feature      (sym.)       0              0
Label        (sym.)       0              0
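The block structure above can be sketched in a few lines of numpy. The function name and the dense layout are illustrative assumptions (in practice these matrices are sparse):

```python
import numpy as np

def build_adjacency(doc_token, doc_label):
    """Assemble the block adjacency matrix W over document, feature
    and label nodes, following the block layout sketched above.

    doc_token : (n_docs, n_feats) co-occurrence weights
    doc_label : (n_docs, n_labels) indicators; all-zero rows for unlabeled docs
    """
    n_d, n_f = doc_token.shape
    n_l = doc_label.shape[1]
    n = n_d + n_f + n_l
    W = np.zeros((n, n))
    # Documents connect to the tokens they contain...
    W[:n_d, n_d:n_d + n_f] = doc_token
    # ...and to their labels (unlabeled documents contribute no label edges).
    W[:n_d, n_d + n_f:] = doc_label
    # Mirror the blocks so the graph is undirected (W symmetric).
    W = W + W.T
    return W
```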
G is an undirected weighted graph with weight matrix W, where w_ij = w_ji >= 0.
D is the diagonal degree matrix, where d_i = sum_j w_ij.
Unnormalized graph Laplacian: L = D - W.
Normalized graph Laplacians: L_sym = D^{-1/2} L D^{-1/2} and L_rw = D^{-1} L.
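These definitions translate directly into numpy; a minimal sketch, assuming W is symmetric and has no isolated nodes:

```python
import numpy as np

def laplacians(W):
    """Return the unnormalized and the two normalized graph Laplacians
    for a symmetric non-negative weight matrix W with no isolated nodes."""
    d = W.sum(axis=1)                  # degrees d_i = sum_j w_ij
    D = np.diag(d)
    L = D - W                          # unnormalized Laplacian
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # L_sym = D^{-1/2} L D^{-1/2}, computed without forming D^{-1/2} explicitly
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    L_rw = L / d[:, None]              # L_rw = D^{-1} L
    return L, L_sym, L_rw
```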
Calculate the first k eigenvectors of the graph Laplacian. Stacked as columns, they form the new representation: row i is the new feature vector of node i.
◦ Build the graph G and its adjacency matrix W.
◦ Form the graph Laplacian L of G.
◦ Solve the generalized eigenproblem L v = λ D v.
◦ The first k eigenvectors form the new feature representation.
◦ Apply a traditional learner such as NB or SVM on the new representation.
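A minimal numpy sketch of the spectral step, assuming a connected graph. It solves the generalized problem indirectly: if L_sym u = λ u for the symmetric normalized Laplacian, then v = D^{-1/2} u solves L v = λ D v, with the same eigenvalue ordering:

```python
import numpy as np

def spectral_features(W, k):
    """Map every graph node to the first k generalized eigenvectors of
    (L, D), i.e. solve L v = lambda D v, smallest eigenvalues first."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    n = len(d)
    # L_sym = D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, U = np.linalg.eigh(L_sym)    # eigenvalues in ascending order
    # Back-transform: v = D^{-1/2} u solves L v = lambda D v.
    return d_inv_sqrt[:, None] * U[:, :k]
```

The rows corresponding to training and test documents are then fed to the traditional learner.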
(Diagram: the training, test, and auxiliary documents, together with features and labels, are mapped into the shared eigenvector representation (v1, v2, ...); a classifier is trained on the re-represented training data and applied to the re-represented test data.)
The only remaining problem is computation time. Fortunately:
◦ The matrix L is sparse.
◦ There are fast eigensolvers for sparse matrices (e.g. Lanczos).
The overall computational cost is linear in the number of non-zero entries of L.
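A sketch of the sparse route using SciPy's ARPACK wrapper `eigsh` (a Lanczos-type solver). The shift value used here is an implementation detail of this sketch, not something from the slides:

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh

def sparse_spectral_features(W_sparse, k):
    """Smallest-k eigenpairs of the sparse Laplacian via ARPACK (Lanczos)."""
    d = np.asarray(W_sparse.sum(axis=1)).ravel()
    L = diags(d) - W_sparse            # sparse unnormalized Laplacian
    # Shift-invert around a point just below 0 to target the smallest
    # eigenvalues robustly (L itself is singular, so we avoid sigma = 0).
    vals, vecs = eigsh(L, k=k, sigma=-1e-3, which='LM')
    return vals, vecs
```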
Basic procedure (diagram):
◦ Training, test, and auxiliary data are transformed into new training and test data.
◦ Sample 15 positive and 15 negative instances as the training set.
◦ Train a classifier (NB / SVM / TSVM), with parameters selected by CV.
◦ Repeat 10 times and calculate the average, compared against a baseline result.
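The sampling-and-averaging protocol can be sketched as follows. The `fit` interface (a function returning a predictor) is an assumption for illustration; NB/SVM/TSVM training and the CV step are omitted:

```python
import numpy as np

def repeated_eval(X, y, fit, n_runs=10, n_per_class=15, seed=0):
    """Sample n_per_class instances per class for training, test on the
    rest, and average the test error over n_runs repetitions."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_runs):
        # Balanced training sample: n_per_class indices from each class.
        train_idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), n_per_class, replace=False)
            for c in np.unique(y)])
        test_idx = np.setdiff1d(np.arange(len(y)), train_idx)
        predict = fit(X[train_idx], y[train_idx])  # returns a predict function
        errors.append(np.mean(predict(X[test_idx]) != y[test_idx]))
    return float(np.mean(errors)), float(np.std(errors))
```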
Cross-domain Learning
Data
◦ SRAA
◦ 20 Newsgroups (Lang, 1995)
◦ Reuters-21578
Target data and auxiliary data share the same categories (top directories) but belong to different domains (sub-directories).
Cross-domain result with NB
Cross-domain result with SVM
Cross-domain result with TSVM
Cross-domain result on average (error rate ± std):

        Non-Transfer   Simple Combine   EigenTransfer
NB      0.250±0.036    0.239±0.000      0.134±0.031
SVM     0.190±0.039    0.213±0.000      0.095±0.018
TSVM    0.140±0.038    0.145±0.000      0.101±0.019
Cross-category Learning
Data
◦ 20 Newsgroups (Lang, 1995)
◦ Ohscal data set from OHSUMED (Hersh et al., 1994)
Randomly select two categories as target data; take the other categories as labeled auxiliary data.
Cross-category result with NB
Cross-category result with SVM
Cross-category result with TSVM
Cross-category result on average (error rate ± std):

        Non-Transfer   EigenTransfer
NB      0.186±0.038    0.099±0.025
SVM     0.131±0.032    0.065±0.016
TSVM    0.104±0.010    0.091±0.013
Self-taught Learning
Data
◦ 20 Newsgroups (Lang, 1995)
◦ Ohscal data set from OHSUMED (Hersh et al., 1994)
Randomly select two categories as target data; take the other categories, with their labels removed, as unlabeled auxiliary data.
Self-taught result with NB
Self-taught result with SVM
Self-taught result with TSVM
Self-taught result on average (error rate ± std):

        Non-Transfer   EigenTransfer
NB      0.189±0.038    0.107±0.032
SVM     0.126±0.030    0.070±0.017
TSVM    0.106±0.011    0.098±0.024
Effect of the number of Eigenvectors
Labeled Target Data
We proposed a general transfer learning framework, EigenTransfer. It can model a variety of existing transfer learning problems and solutions, and our experimental results show that it greatly outperforms non-transfer learners in many experiments.