Scalable Multi-Label Annotation Jia Deng Olga Russakovsky Jonathan Krause, Michael Bernstein Alexander Berg Li Fei-Fei
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird Task: Crowdsource object labels for images. Generalization: musical attributes of songs actions in movies sentiments in documents Application: Benchmarking, training, modeling Multi-label annotation Data Item Labels
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Current focus: 200 Category Detection (~100,000 fully labeled images) Large-Scale Visual Recognition Challenge ILSVRC
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird ?????? Answer Question MachineCrowd Is there a table? Yes
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird +????? Answer Question MachineCrowd Is there a table? Yes
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird ++???? Answer Question MachineCrowd Is there a chair? Yes
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird ++-??? Answer Question MachineCrowd Is there a horse? No
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird ++--?? Answer Question MachineCrowd Is there a dog? No
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird ++---? Answer Question MachineCrowd Is there a cat? No
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Naïve approach: ask for each object cost: estimation: use the crowd-machine diagram show UI. TableChairHorseDogCatBird Answer Question MachineCrowd Is there a bird? No
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird Naïve approach: ask for each object Cost: O(NK) for N images and K objects
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird Furniture Mammal Animal Hierarchy
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird Sparsity Correlation Furniture Mammal Animal Hierarchy
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird ?????? Answer Question Machine Crowd Furniture Mammal Animal Better approach: exploit label structure
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird ?????? Answer Question Machine Crowd Is there an animal? No Furniture Mammal Animal Better approach: exploit label structure
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird ??---- Answer Question Machine Crowd Is there an animal? No Furniture Mammal Animal Better approach: exploit label structure
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird ??---- Answer Question Machine Crowd Is there furniture? Yes Mammal Better approach: exploit label structure Animal Furniture
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Answer Question Machine Crowd Is there a table? Yes Mammal Better approach: exploit label structure Animal TableChairHorseDogCatBird ??---- Furniture
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird +?---- Answer Question Machine Crowd Is there a chair? Yes Mammal Better approach: exploit label structure Animal Furniture
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Answer Question Machine Crowd Is there a chair? Yes Mammal Better approach: exploit label structure Animal TableChairHorseDogCatBird Furniture
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Selecting the Right Question Goal: Get as much utility (new labels) as possible, for as little cost (worker time) as possible, given a desired level of accuracy
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Accuracy constraint User-specified accuracy threshold, e.g., 95% Majority voting assuming uniform worker quality [GAL: Sheng, Provost, Ipeirotis KDD ‘08] Might require only one worker, might require several based on the task
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Cost: worker time (time = money) Question (is there …)Cost (second) a thing used to open cans/bottles14.4 an item that runs on electricity (plugged in or using batteries) 12.6 a stringed instrument3.4 a canine2.0 expected human time to get an answer with 95% accuracy
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Utility: expected # of new labels TableChairHorseDogCatBird ?????? Is there a table? Yes No TableChairHorseDogCatBird +????? TableChairHorseDogCatBird -????? utility = 1
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Utility: expected # of new labels TableChairHorseDogCatBird ?????? Is there a table? Yes No TableChairHorseDogCatBird +????? TableChairHorseDogCatBird -????? TableChairHorseDogCatBird ?????? Is there an animal? TableChairHorseDogCatBird ?????? TableChairHorseDogCatBird ??---- utility = 1 utility = 0.5 * * 4 = 2 Pr(Y) = 0.5 Pr(N) = 0.5
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Pick the question with the most labels per second Query: Is there a... Utility (num labels) Cost (worker time in secs) Utility-Cost Ratio (labels per sec) mammal with claws or fingers living organism mammal creature without legs land or avian creature Selecting the Right Question
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Dataset: 20K images from ImageNet Challenge Labels: 200 basic categories (dog, cat, table…), 64 internal nodes in hierarchy Results
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Dataset: 20K images from ImageNet Challenge Labels: 200 basic categories (dog, cat, table…), 64 internal nodes in hierarchy Setup: – training test split – Estimate parameters on training, simulate on test – Future work: online estimation Results
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Results: accuracy Accuracy Threshold per question (parameter) Accuracy (F1 score) Naïve approach Accuracy (F1 score) Our approach (75.67)99.75 (76.97) (60.17)99.62 (60.69) Annotating 10K images with 200 objects
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Results: cost Accuracy Threshold per question (parameter) Cost saving (our approach compared to naïve approach) x x Annotating 10K images with 200 objects
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Results: cost Accuracy Threshold per question (parameter) Cost saving (our approach compared to naïve approach) x x Annotating 10K images with 200 objects 6 times more labels per second
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Speeds up crowdsourced multi-label annotation by exploiting the structure and distribution of labels. Could be a bargain for you! TableChairHorseDogCatBird Furniture Mamma l Animal Hierarchy Correlation Sparsity Conclusions
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei
TableChairHorseDogCatBird ?????? Answer Question MachineCrowd Is there an animal? No Furniture Mammal Animal Better approach: exploiting hierarchy, sparsity and correlation
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird ??---- Answer Question MachineCrowd Is there an animal? No Furniture Mammal Animal Better Approach
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird ??---- Answer Question MachineCrowd Is there furniture? Yes Furniture Mammal Animal Better Approach
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird +?---- Answer Question MachineCrowd Is there a table? Yes Furniture Mammal Animal Better Approach
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei TableChairHorseDogCatBird Answer Question MachineCrowd Is there a chair? Yes Furniture Mammal Animal Better Approach
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Algorithm Initialize all labels to ‘missing’ While there are missing labels do – Select a question Q from all possible questions – Obtain an answer A to question Q from the crowd – Set values of some labels to +1 or -1
Deng, Russakovsky, Krause, Bernstein, Berg, Fei-Fei Algorithm Initialize all labels to ‘missing’ While there are missing labels do – Select a question Q from all possible questions – Obtain an answer A to question Q from the crowd – Set values of some labels to +1 or -1