What Did We See? & WikiGIS Chris Pal University of Massachusetts A Talk for Memex Day MSR Redmond, July 19, 2006
Research Questions
1. How do personal and community photo-journals and blogs interact? Spectrum from personal blogs, through community portals (blikis), to Wiki articles (most public). User Interface & Social Computing Research
2. Can we 'mine' information in blogs? Find blog entries that look like Wiki entries, extract information, encourage contributions? Document and Text Processing Research
3. What is the role of computer vision for location and object recognition? Can we use these methods to provide the user with relevant information?
Search Blogs and Wiki Entries
Questions About Observations
Search and Social Computing: I discover that my friend Justin also found an interesting mushroom. Have I been here as well?
Object and Location Recognition: 1. Object Recognition From Images and Text 2. Location Recognition From Images and Text
Conditional Random Fields. Widely applicable, many positive results, e.g. speech recognition. Information Extraction Example: fact extraction (from Blogs and Wikis), address extraction. [Figure: linear-chain CRF with label nodes y_{t-1}, y_t, ..., y_{t+3} above input nodes x_t, ..., x_{t+3}. Input Sequence: "said Ling a Microsoft VP …". Named Entities (SFSM states): OTHER PERSON OTHER ORG TITLE …. Binary Features.]
Research Result - Training a CRF. Define the vector of feature values at time t. Define the global feature function as the sum of these vectors over time. The gradient of the conditional log likelihood is the empirical expectation of the global feature function minus its model expectation.
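The quantities named on this slide can be written in standard linear-chain CRF notation as follows. The symbols (f for the per-position feature vector, F for the global feature function, θ for the weights, Z for the partition function) are assumptions chosen for illustration and may differ from the notation used in the talk.

```latex
% Feature vector at time t (assumed notation):
\mathbf{f}(y_{t-1}, y_t, \mathbf{x}, t)

% Global feature function: sum of the per-position feature vectors
\[
F(\mathbf{y}, \mathbf{x}) = \sum_{t} \mathbf{f}(y_{t-1}, y_t, \mathbf{x}, t)
\]

% Conditional model
\[
p_\theta(\mathbf{y} \mid \mathbf{x})
  = \frac{\exp\!\big(\theta^\top F(\mathbf{y}, \mathbf{x})\big)}{Z_\theta(\mathbf{x})},
\qquad
Z_\theta(\mathbf{x}) = \sum_{\mathbf{y}'} \exp\!\big(\theta^\top F(\mathbf{y}', \mathbf{x})\big)
\]

% Gradient of the conditional log likelihood for one training pair:
% empirical feature counts minus the model expectation
\[
\nabla_\theta \log p_\theta(\mathbf{y} \mid \mathbf{x})
  = F(\mathbf{y}, \mathbf{x})
  - \mathbb{E}_{p_\theta(\mathbf{Y} \mid \mathbf{x})}\big[ F(\mathbf{Y}, \mathbf{x}) \big]
\]
```

The model expectation is the term that requires forward-backward inference, which is why its cost dominates training.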
Results: CRF Training. NetTalk text-to-speech: linear-chain CRF training using sparse inference takes 75% less training time than exact training, with no loss in accuracy. Accuracy: Fixed: 85.7; KL: 91.6; Exact: 91.6
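One way to read the "KL" beam in these results is as a beam chosen by a divergence bound rather than a fixed size. The sketch below is an illustration under that assumption, not the implementation from the talk: at each position it keeps the smallest set of states whose truncated, renormalized distribution q satisfies KL(q || p) <= eps. For such a q, KL(q || p) equals -log(kept mass), so the rule reduces to keeping probability mass of at least exp(-eps). The function name and the `eps` parameter are hypothetical.

```python
import math


def min_divergence_beam(probs, eps):
    """Return indices of the smallest state set with KL(q || p) <= eps.

    probs: list of state probabilities at one sequence position (sums to 1).
    eps:   divergence budget (hypothetical parameter name).

    q is p restricted to the kept states and renormalized, so
    KL(q || p) = -log(sum of kept probabilities).
    """
    threshold = math.exp(-eps)  # minimum mass that must be kept
    # Visit states from most to least probable; greedy selection gives
    # the smallest set reaching the required mass.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= threshold:
            break
    return sorted(kept)


if __name__ == "__main__":
    beliefs = [0.5, 0.3, 0.15, 0.05]
    print(min_divergence_beam(beliefs, eps=0.25))
    print(min_divergence_beam(beliefs, eps=0.01))
```

A peaked distribution yields a small beam and a flat one yields a large beam, which a fixed-size beam cannot do.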
SenseCam Enhanced Blogs Produce Lots of Data for Location Recognition
Multi-Conditional Learning Motivation - Simple GMM Example [Figure panels: Joint, Conditional, Multi-Conditional]
Multi-Conditional Learning. One motivation: Conditional Random Fields can be derived from a traditional joint model. But there are many other conditional distributions that could be defined. What do we gain if we model those as well? Other combinations are possible.
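As a hedged illustration of the idea, the conditional used by a CRF can be obtained from a joint model by normalizing over labels, and a multi-conditional objective can combine more than one such conditional. The two-term form and the weights α and β below are assumptions made for this sketch; the slide notes that other combinations are possible.

```latex
% Conditional derived from a joint model p(x, y; \theta)
\[
p(y \mid x; \theta) = \frac{p(x, y; \theta)}{\sum_{y'} p(x, y'; \theta)}
\]

% A multi-conditional objective over training pairs (x_i, y_i),
% with assumed non-negative weights \alpha and \beta
\[
\mathcal{L}_{\mathrm{MC}}(\theta)
  = \alpha \sum_{i} \log p(y_i \mid x_i; \theta)
  + \beta \sum_{i} \log p(x_i \mid y_i; \theta)
\]
```

Setting β = 0 recovers purely conditional (discriminative) training, while β > 0 also asks the model to explain the inputs given the labels.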
Image Segmentation/Pixel Classification MSR Cambridge / Berkeley Data
Mixtures of Factor Analyzers. Generative model for simultaneous dimensionality reduction and clustering. We wish to obtain a discriminatively trained version of this type of model.
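For reference, the standard generative form of a mixture of factor analyzers can be written as below. The symbols (π for mixing weights, μ_k and Λ_k for the mean and factor loadings of component k, Ψ for diagonal noise covariance) are the usual textbook choices and are assumed here, not taken from the slide.

```latex
% Pick a component, draw a low-dimensional factor, then generate the observation
\[
k \sim \mathrm{Categorical}(\pi), \qquad
\mathbf{z} \sim \mathcal{N}(\mathbf{0}, I), \qquad
\mathbf{x} \mid \mathbf{z}, k \sim \mathcal{N}(\mu_k + \Lambda_k \mathbf{z},\ \Psi)
\]

% Marginal density: a mixture of Gaussians with low-rank-plus-diagonal covariances
\[
p(\mathbf{x}) = \sum_{k} \pi_k \,
  \mathcal{N}\!\big(\mathbf{x};\ \mu_k,\ \Lambda_k \Lambda_k^\top + \Psi\big)
\]
```

The component index k provides the clustering and the factor z provides the dimensionality reduction.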
Performance vs. Model Complexity. Interesting? Joint optimization benefits more substantially from additional data.
Performance with More Data [Figure panels: Training Set Accuracy, Test Set Accuracy] hmm…
Search Blogs of Friends
Detect and Find Expert Knowledge
Simple Exponential Family Models for Documents
Results: Document Classification
New Graphical Models for and Blogs. New Idea: Plated Factor Graphs. Scenario: Predict which friends might be interested in your new Blog entry. Model: Nb words in the body, Ns words in the subject, Nr recipients. The graph describes the joint distribution of random variables in terms of the product of local functions. [Figure: plated factor graph over x_b, x_s, x_r and y, with plates N_b, N_s, N_r-1 and N_r; node labels: Body Words, Title Words, Friends discussed, Predicted Recipient; legend: function, random variable, N replications]
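A minimal sketch of what "joint distribution as a product of local functions" means, using a brute-force toy rather than the model from the talk. The variable names, the `word_factor` function, and its values are hypothetical. A plate is mimicked by reusing the same factor for each replicated word variable.

```python
from itertools import product


def joint_distribution(variables, factors):
    """Enumerate the joint distribution defined by a small factor graph.

    variables: dict mapping variable name -> list of possible values
    factors:   list of (scope, fn) pairs; scope is a tuple of variable
               names and fn returns a non-negative score for their values
    Returns a dict mapping each full assignment (tuple of values, in the
    order of `variables`) to its normalized probability.
    """
    names = list(variables)
    scores = {}
    for values in product(*(variables[n] for n in names)):
        assignment = dict(zip(names, values))
        score = 1.0
        for scope, fn in factors:
            # Each local function only sees the variables in its scope.
            score *= fn(*(assignment[v] for v in scope))
        scores[values] = score
    z = sum(scores.values())  # partition function
    return {values: s / z for values, s in scores.items()}


def word_factor(y, w):
    """Hypothetical local function: prefers word indicator w to agree with y."""
    return 2.0 if y == w else 1.0


# Toy graph: one recipient variable y and a "plate" of two word variables,
# each tied to y by the same replicated factor.
toy_variables = {"y": [0, 1], "w1": [0, 1], "w2": [0, 1]}
toy_factors = [(("y", "w1"), word_factor), (("y", "w2"), word_factor)]
toy_joint = joint_distribution(toy_variables, toy_factors)

if __name__ == "__main__":
    for assignment, prob in sorted(toy_joint.items()):
        print(assignment, round(prob, 4))
```

Enumeration is exponential in the number of variables, so it only serves to make the factorization concrete. Real models of this kind rely on message passing or other approximate inference.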
Detect Quality Content and Encourage Knowledge Contributions
Conclusions, Present & Future Work. WikiGIS: merged Blogs, Blikis and Wikis with Microsoft Virtual Earth. Merge the SenseCam with a smart phone: enable intelligent digital assistants, output to the television. Next Steps: location and object recognition enabling information retrieval. Other Uses: assistive technology for the elderly.
References & Results so Far
With Charles Sutton and Andrew McCallum. Sparse Forward-Backward using Minimum Divergence Beams for Fast Training of Conditional Random Fields. In proceedings of ICASSP 2006.
With Michael Kelm and Andrew McCallum. Combining Generative and Discriminative Methods for Pixel Classification with Multi-Conditional Learning. To appear in the proceedings of ICPR 2006.
With Andrew McCallum, Greg Druck and Xuerui Wang. Multi-Conditional Learning: Generative/Discriminative Training for Clustering and Classification. To appear in the proceedings of AAAI.
CC Prediction with graphical models. To appear in the proceedings of CEAS 2006.