Beyond Nouns: Exploiting Prepositions and Comparative Adjectives for Learning Visual Classifiers
Nouns
- Nouns paired with images: widely used for object detection, tag recommendation, etc.
- Co-occurrence over large data -> relations
Co-occurrence
- Examples: Jack and Jill, Laurel and Hardy
Co-occurrence
Co-occurrence
- Co-occurrence is good for detecting whether the objects are present, e.g. for suggesting tags.
- Co-occurrence is bad when we want to disambiguate: we might never be able to tell Jack and Jill apart.
Relationships
- How do they help? Additional constraints limit the possibilities (see the sketch below).
- Other intuitive examples: aptitude questions. Like ILP?
- "Since cars are typically found on streets, it is difficult to resolve the correspondence using co-occurrence alone"
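A toy sketch of this pruning idea (not from the paper; the regions, the single y-coordinate feature, and the crude geometric test for "on" are all invented for illustration):

```python
from itertools import permutations

# Hypothetical toy example: two image regions, tags {"car", "street"}.
# Co-occurrence alone cannot say which region is which, but a relation
# such as "car on street" prunes the inconsistent assignment.

regions = {
    "r1": {"y_center": 0.40},   # higher in the image
    "r2": {"y_center": 0.85},   # lower in the image
}
nouns = ["car", "street"]

def satisfies_on(car_region, street_region):
    # Crude geometric test standing in for a learned "on" classifier:
    # the car region should sit above the street region's center.
    return car_region["y_center"] < street_region["y_center"]

consistent = []
for assignment in permutations(regions, len(nouns)):   # orderings of regions
    mapping = dict(zip(nouns, assignment))
    if satisfies_on(regions[mapping["car"]], regions[mapping["street"]]):
        consistent.append(mapping)

print(consistent)   # only {"car": "r1", "street": "r2"} survives
```

The point is only that a relationship predicate shrinks the set of tag-to-region assignments that co-occurrence leaves ambiguous.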
Relationships
Relationships
- How do we learn them?
- One option: manually annotate, learn a model, use the relationships to predict at test time ...
- ... or ...
The Problem
- We have a weakly labeled dataset (tags only).
- A relationship model helps us label it strongly; strong labeling helps us derive the relationship model.
- Therefore, EM: the labeling (assignment) is treated as the missing data (see the sketch below).
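A minimal, runnable sketch of this EM alternation on made-up 1-D region features (an illustration of the idea, not the authors' model; it only learns noun models and omits the relationship term for brevity):

```python
import random

# Each "image" is a list of region features plus its bag of tags.
dataset = [
    {"regions": [0.1, 0.9], "tags": ["sky", "grass"]},
    {"regions": [0.2, 0.8], "tags": ["sky", "grass"]},
    {"regions": [0.85, 0.15], "tags": ["grass", "sky"]},
]

def bootstrap(dataset):
    # Bootstrapping "can be done using any image annotation approach";
    # here we simply assign tags to regions at random.
    assignments = []
    for img in dataset:
        tags = img["tags"][:]
        random.shuffle(tags)
        assignments.append(dict(zip(tags, range(len(img["regions"])))))
    return assignments

def m_step(dataset, assignments):
    # Learn noun models (per-noun mean feature) from the current assignment.
    sums, counts = {}, {}
    for img, assign in zip(dataset, assignments):
        for noun, r in assign.items():
            sums[noun] = sums.get(noun, 0.0) + img["regions"][r]
            counts[noun] = counts.get(noun, 0) + 1
    return {n: sums[n] / counts[n] for n in sums}

def e_step(dataset, noun_models):
    # Re-assign each tag to the free region whose feature best matches its model.
    assignments = []
    for img in dataset:
        assign, used = {}, set()
        for noun in img["tags"]:
            best = min((r for r in range(len(img["regions"])) if r not in used),
                       key=lambda r: abs(img["regions"][r] - noun_models[noun]))
            assign[noun] = best
            used.add(best)
        assignments.append(assign)
    return assignments

random.seed(0)
assignments = bootstrap(dataset)
for _ in range(5):
    noun_models = m_step(dataset, assignments)
    assignments = e_step(dataset, noun_models)

print(noun_models)    # per-noun mean features
print(assignments)    # per-image tag-to-region assignments
```

In the full model, the relationship classifiers learned in the M-step would additionally score each candidate assignment in the E-step.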
Feature Representation
Model
EM
EM
- M-step: for the noun assignment done earlier, we learn relationship and object classifiers.
- The relationship classifier is modeled on a single feature (see the decision-stump sketch below) ... GOTO E-step.
- Bootstrapping can be done using any image annotation approach.
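A sketch of such a single-feature (decision stump) relationship classifier, assuming made-up differential features for region pairs (vertical offset dy and overlap) and binary relation labels derived from the current assignment:

```python
# Try every (feature, threshold, sign) and keep the stump with the
# fewest training errors. All feature values below are invented.

# Each sample: differential features for a region pair, and whether the
# relation (say, "above") holds under the current noun assignment.
samples = [
    ({"dy": -0.6, "overlap": 0.0}, True),
    ({"dy": -0.4, "overlap": 0.1}, True),
    ({"dy":  0.3, "overlap": 0.0}, False),
    ({"dy":  0.5, "overlap": 0.2}, False),
]

def train_stump(samples):
    best = None
    for f in samples[0][0].keys():
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds between consecutive feature values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for sign in (+1, -1):   # which side of the threshold is positive
                errors = sum((sign * (x[f] - t) > 0) != y for x, y in samples)
                if best is None or errors < best[0]:
                    best = (errors, f, t, sign)
    return best[1:]   # (feature, threshold, sign)

feature, threshold, sign = train_stump(samples)
print(feature, threshold, sign)

def predict(x):
    return sign * (x[feature] - threshold) > 0

print(predict({"dy": -0.2, "overlap": 0.05}))   # classified as "above"
```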
Model
Inference
Inference
Testing
- Likelihood models: nouns use nearest neighbor, relationships use a decision stump (sketched below)
- Evaluation on: a subset of Corel5k
- Training on: 850 images with 173 nouns and 19 hand-labeled relationships
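A minimal illustration of the nearest-neighbor noun likelihood, using invented 1-D exemplar features (the real model operates on richer region descriptors); relationships would be scored with a stump like the one sketched earlier:

```python
import math

# Hypothetical per-noun exemplar features collected during training.
exemplars = {
    "sky":   [0.10, 0.15, 0.20],
    "grass": [0.80, 0.85, 0.90],
}

def noun_likelihood(region_feature, noun, bandwidth=0.1):
    # Nearest-neighbor likelihood: closer exemplar -> higher score.
    d = min(abs(region_feature - e) for e in exemplars[noun])
    return math.exp(-(d / bandwidth) ** 2)

for noun in exemplars:
    print(noun, round(noun_likelihood(0.12, noun), 3))
# A region with feature 0.12 scores much higher under "sky" than "grass".
```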
Evaluation – resolution of correspondence ambiguities
- Metrics: range of semantics (number of unique nouns correct?) and frequency (number of total nouns correct?)
- Compared with: image annotation algorithms and human-assisted annotation (are machines better at relationships than us?)
Results
Examples
Evaluation – labeling new images
- Tested on a random subset of Corel5k, based on the learnt vocabulary
- Labels verified by "us" (the Corel annotations can be misleading)
- Precision/recall: recall rates are reported with respect to the Corel annotations, while precision rates are reported with respect to correctness as judged by human observers (see the sketch below).
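A tiny sketch of this precision/recall bookkeeping with invented label sets, just to make the two different references explicit:

```python
# Recall is measured against the Corel tags, while precision counts
# predictions a human observer judged correct, mirroring the two
# references described on this slide. All sets below are made up.

predicted      = {"sky", "water", "boat", "bird"}
corel_tags     = {"sky", "water", "boat"}            # recall reference
human_verified = {"sky", "water", "boat", "bird"}    # precision reference

recall = len(predicted & corel_tags) / len(corel_tags)
precision = len(predicted & human_verified) / len(predicted)

print(f"recall={recall:.2f} precision={precision:.2f}")
```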
Results
Results
Examples