
1 Type Independent Correction of Sample Selection Bias via Structural Discovery and Re-balancing. Jiangtao Ren 1, Xiaoxiao Shi 1, Wei Fan 2, Philip S. Yu 2. 1 Sun Yat-Sen University, China; 2 IBM T.J. Watson; 3 University of Illinois at Chicago

2 What is sample selection bias?
Inductive learning: training data (x, y) is sampled from the universe of examples.
In many applications, training data (x, y) is not sampled randomly.
  Insurance and mortgage data: you only know the people to whom you gave a policy.
  School data: students self-select.
There are different possibilities for how (x, y) is selected (Zadrozny04). S = 1 denotes that (x, y) is chosen.
  S is independent of x and y: total random sample.
  S depends on y but not on x: class bias.
  S depends on x but not on y: feature bias.
  S depends on both x and y: both class and feature bias.
Ubiquitous: loan approval, drug screening, weather forecasting, ad campaigns, fraud detection, user profiling, biomedical informatics, intrusion detection, insurance, etc.
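The four selection cases on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the synthetic universe, the 0.5 threshold, the selection probabilities, and the names `make_universe`, `selection_prob`, and `biased_sample` are all hypothetical and do not come from the slides.

```python
import random

def make_universe(n, rng):
    """Synthetic universe (assumption): x uniform in [0, 1], y a noisy function of x."""
    universe = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x + rng.gauss(0, 0.2) > 0.5 else 0
        universe.append((x, y))
    return universe

def selection_prob(x, y, bias):
    """P(S = 1 | x, y) for each case; the probabilities are illustrative."""
    if bias == "none":      # S independent of x and y
        return 0.5
    if bias == "class":     # S depends on y only
        return 0.8 if y == 1 else 0.2
    if bias == "feature":   # S depends on x only
        return 0.8 if x > 0.5 else 0.2
    if bias == "both":      # S depends on x and y
        return 0.9 if (x > 0.5 and y == 1) else 0.1
    raise ValueError(f"unknown bias type: {bias}")

def biased_sample(universe, bias, rng):
    """Keep each example with probability P(S = 1 | x, y)."""
    return [(x, y) for x, y in universe
            if rng.random() < selection_prob(x, y, bias)]

def positive_rate(data):
    """Fraction of examples with y = 1."""
    return sum(y for _, y in data) / len(data)

def mean_x(data):
    """Average feature value."""
    return sum(x for x, _ in data) / len(data)
```

Comparing `positive_rate` and `mean_x` between the universe and each biased sample shows how class bias shifts the label distribution while feature bias shifts the feature distribution.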

3 Our method
Key ideas: Original Dataset -> Structural Discovery -> Structural Rebalance -> Corrected Dataset
Structural Discovery: automatic clustering
Structural Rebalance: 1. The same proportion 2. Select trustful ones 3. Label by neighbors
Advantages: 1. Type independent 2. Model independent 3. Straightforward
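One possible reading of the rebalancing steps on this slide is sketched below. It is not the authors' algorithm: the slide gives no details, so the sketch assumes 1-D features, cluster centers that are already known (the automatic clustering step is not shown), "trustful" meaning closest to the cluster center, and "label by neighbors" meaning copying the label of the nearest labeled example. The names `nearest`, `rebalance`, and `ratio` are hypothetical.

```python
from collections import defaultdict

def nearest(value, candidates):
    """Index of the candidate closest to value (1-D distance)."""
    return min(range(len(candidates)), key=lambda i: abs(candidates[i] - value))

def rebalance(labeled, unlabeled, centers, ratio):
    """Return a corrected labeled set.

    labeled:   list of (x, y) pairs from the biased sample
    unlabeled: list of x values from the unbiased population
    centers:   cluster centers (assumed given)
    ratio:     fraction of each cluster's unlabeled size to keep as labeled
    """
    lab_by_cluster = defaultdict(list)
    unl_by_cluster = defaultdict(list)
    for x, y in labeled:
        lab_by_cluster[nearest(x, centers)].append((x, y))
    for x in unlabeled:
        unl_by_cluster[nearest(x, centers)].append(x)

    labeled_xs = [x for x, _ in labeled]
    corrected = []
    for c, center in enumerate(centers):
        # Step 1 (assumed): every cluster gets the same proportion of labeled data.
        target = round(ratio * len(unl_by_cluster[c]))
        # Step 2 (assumed): "trustful" examples are those nearest the center.
        ranked = sorted(lab_by_cluster[c], key=lambda p: abs(p[0] - center))
        kept = ranked[:target]
        corrected.extend(kept)
        # Step 3 (assumed): fill any shortfall with unlabeled points,
        # labeled by their nearest labeled neighbor.
        missing = target - len(kept)
        if missing > 0 and labeled:
            pool = sorted(unl_by_cluster[c], key=lambda x: abs(x - center))
            for x in pool[:missing]:
                _, y = labeled[nearest(x, labeled_xs)]
                corrected.append((x, y))
    return corrected
```

Because the procedure only looks at cluster membership and distances, it does not need to know which type of selection bias produced the labeled sample, which is in the spirit of the "type independent" advantage listed on the slide.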

