Task 1 (I1.1): Fundamentals of Context-aware Real-time Data Fusion.


Fundamentals of Multi-modal Data Fusion on Multimedia Information Networks
Principal Investigator: Thomas Huang
Postdoctoral Researcher: Xi Zhou
PhD Student: Guo-Jun Qi
Electrical and Computer Engineering, University of Illinois at Urbana-Champaign

Project Team
Principal Investigator: Thomas Huang
Collaborators:
– IBM: Charu Aggarwal (QoI and sensor networks)
– IBM: Zhen Wen (social networks)
– UIUC: Tarek Abdelzaher (communication networks)
– CUNY: Heng Ji (natural language processing)
Postdoctoral researcher: Xi Zhou
PhD students: Guo-Jun Qi, Mert Dickman, and Zhaowen Wang
Undergraduate student: Shiyu Chang

Motivation
Structured information networks
– Can handle heterogeneous structures with various input types
– Can effectively model large structured ontological networks at the semantic level
– Structure is a way to represent context
Utilization
– Efficient and effective inference engine
– Information and knowledge extraction from ontological networks

Contributions to I1.1
Connections to the constrained conditional model (CCM)
– Discover constraint links between heterogeneous objects and between concept nodes
Connections to latency analysis
– Reveal cross-media redundancy and relationships
– Trade-off between low latency and high quality

Multimedia Information Network
A graph with both data nodes and concept nodes
– Edges linking concepts: ontology
– Edges linking data nodes: similarity, association, and co-occurrence
– Edges linking concepts and data: attachment of a concept to data

Multimedia Information Network (MINet)
Data nodes: heterogeneous networks with cross-media content
– Videos/images/speech
– Surrounding text/user tags
– GPS metadata
Concept nodes: ontological networks with correlated categories
– Non-flat concept structure
– Example links between concepts: A is a subclass of B; C is a part of D; X attacks Y
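The node and edge types described above can be sketched as a small data structure. This is only an illustration under assumptions: the class name `MINet`, its methods, the relation strings, and the example node ids are all hypothetical and do not come from the slides.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MINet:
    """Toy multimedia information network: typed nodes plus labeled edges."""
    # node id -> "data" or "concept"
    node_kind: dict = field(default_factory=dict)
    # node id -> list of (relation, neighbor id)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, node_id, kind):
        if kind not in ("data", "concept"):
            raise ValueError("kind must be 'data' or 'concept'")
        self.node_kind[node_id] = kind

    def add_edge(self, src, relation, dst):
        # Both endpoints must already exist so that edge types stay well defined.
        if src not in self.node_kind or dst not in self.node_kind:
            raise KeyError("add both nodes before linking them")
        self.edges[src].append((relation, dst))

    def edge_type(self, src, dst):
        """Classify an edge by the kinds of its two endpoints."""
        kinds = {self.node_kind[src], self.node_kind[dst]}
        if kinds == {"concept"}:
            return "ontology"
        if kinds == {"data"}:
            return "data-data"
        return "attachment"

    def concepts_of(self, data_id):
        """Concept nodes attached directly to a data node."""
        return [dst for _, dst in self.edges[data_id]
                if self.node_kind[dst] == "concept"]


# Hypothetical example nodes and relations.
net = MINet()
net.add_node("img_001", "data")
net.add_node("tag_text_001", "data")
net.add_node("tank", "concept")
net.add_node("vehicle", "concept")
net.add_edge("tank", "is_subclass_of", "vehicle")          # concept-concept (ontology)
net.add_edge("img_001", "co_occurs_with", "tag_text_001")  # data-data
net.add_edge("img_001", "has_concept", "tank")             # concept attached to data
```

Keeping the node kind separate from the edge list lets the three edge families (ontology, data-data, attachment) be derived from the endpoints rather than stored twice.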

Network Structure

Potential Army Impact
Construct large-scale MINets combining
– Cross-media heterogeneous data networks, for example battlefield videos/images, satellite images, and acoustic sensor signals
– Ontological concept networks: military-related concepts and their links
Make better military decisions
– More timely and more accurate
– More robust to missing information

Technical Contributions
Cross-Domain Knowledge Propagation
– Propagating knowledge in surrounding text to visual data
– Published in WWW’11, collaboration with Dr. Charu Aggarwal, IBM
Cross-Category Knowledge Sharing
– Exploring concept correlations to enhance inference accuracy
– To appear in CVPR’11, collaboration with Dr. Charu Aggarwal, IBM
Modeling Context-Aware Image Similarity
– Using Hierarchical Gaussianization (HG), ICCV’09
– Applications to disaster assessment (collaboration with Prof. Tarek Abdelzaher)
– KDD’11, submitted

Cross-Domain Knowledge Propagation: Two Steps
How to bridge the domain gap between text and images?
– Our approach: construct a translator function between text and images that establishes “virtual” links between them.
How can we annotate images using text labels?
– Our approach: propagate the labels of text to images via the learned translator.

Challenges
The model can
– Work in a constrained environment: links between text and images are missing, so a translation function is learned to link them
– Be resistant to noisy cross-media links (improved QoI): text surrounding an image can be misleading, so a compact intermediate representation is used to remove nonessential and noisy links
– Follow a low-rank principle: use the fewest topics for cross-domain translation

Cross-Domain Label Propagation
Label propagation: [equation not recovered]
Successive builds of the slide annotate its parts: source labels, cross-domain translator, prediction function.
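The propagation equation itself is missing from the transcript. One form consistent with the three annotated parts (source labels, cross-domain translator, prediction function) is given below as an assumption, not as the slide's exact formula; the symbols $n_s$, $x_i^{(s)}$, $y_i^{(s)}$ and $x^{(t)}$ are assumed notation.

```latex
f_T\big(x^{(t)}\big) \;=\; \sum_{i=1}^{n_s} y_i^{(s)} \, T\big(x_i^{(s)},\, x^{(t)}\big)
```

Here $y_i^{(s)}$ is the label of the $i$-th source (text) instance, $T$ is the cross-domain translator, and $f_T$ is the prediction function on a target (image) instance $x^{(t)}$. Under this reading, each labeled text votes for its own label with a weight given by how strongly the translator links it to the image.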

Learning Optimal Translator
Learning formulation: optimize the translator function over three terms
– The first term maximizes the cross-domain association over a set of co-occurrence pairs of source-target instances.
– The second term minimizes the training loss.
– The third term is a regularizer that prefers a concise translator to a tedious one.
Improved QoI: nonessential and noisy observations are removed from the translation process.
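The objective itself did not survive in the transcript. A form consistent with the three terms described might look like the following; this is a sketch under assumptions, and the trade-off weights $\gamma$ and $\eta$, the index sets, and the generic functions $\chi$, $\ell$ and $\Omega$ are hypothetical notation.

```latex
\min_{T}\;\;
\gamma \sum_{(k,l)\in\mathcal{C}} \chi\!\Big(c_{k,l}\, T\big(x_k^{(s)}, x_l^{(t)}\big)\Big)
\;+\; \eta \sum_{j=1}^{n_t} \ell\!\Big(y_j^{(t)}\, f_T\big(x_j^{(t)}\big)\Big)
\;+\; \Omega(T)
```

In this reading, $\mathcal{C}$ is the set of co-occurring text-image pairs with counts $c_{k,l}$, $\chi$ is a monotonically decreasing function (so minimizing it raises the translator response on frequently co-occurring pairs), $\ell$ is a classification loss such as the logistic loss on the labeled target examples, and $\Omega$ is a regularizer that favors a concise translator.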

Constructing Cross-Domain Translator
Source instances (text) and target instances (images): how to bridge the cross-domain gap?

Constructing Cross-Domain Translator
Source instances (text) are mapped by W^(s) and target instances (images) by W^(t) into a common latent space. The inner product in the latent space serves as the translator.
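A minimal sketch of the latent-space construction, assuming linear projections stored as row-major lists (one row per latent dimension). The function names, the toy matrices, and the label-propagation helper `predict` are hypothetical illustrations, not the authors' implementation.

```python
def project(W, x):
    """Map a feature vector x into the latent space with matrix W (rows = latent dims)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def translator(W_s, W_t, x_s, x_t):
    """T(x_s, x_t): inner product of the two latent projections."""
    z_s = project(W_s, x_s)
    z_t = project(W_t, x_t)
    return sum(a * b for a, b in zip(z_s, z_t))


def predict(W_s, W_t, source_examples, x_t):
    """Propagate source labels to a target instance: sum of y_i * T(x_i, x_t)."""
    return sum(y * translator(W_s, W_t, x_s, x_t) for x_s, y in source_examples)


# Hypothetical toy setup: 3-dim text features, 2-dim image features, 2 latent topics.
W_s = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 1.0]]
W_t = [[1.0, 0.0],
       [0.0, 1.0]]
source_examples = [([1.0, 0.0, 0.0], +1),   # text labeled positive
                   ([0.0, 1.0, 1.0], -1)]   # text labeled negative
image = [2.0, 0.5]
score = predict(W_s, W_t, source_examples, image)  # sign gives the predicted label
```

The number of rows in `W_s` and `W_t` is the number of latent topics, which is the quantity the low-rank preference tries to keep small.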

Constructing Cross-Domain Translator
A low-dimensional latent space is preferred
– Impose the usual l2 regularizer to improve prediction accuracy
Trace norm
– Equivalent to a low-rank prior on the latent space
– Reflects the principle of concise cross-domain translation: “fewer latent topics (dimensions) are preferred!”
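A standard identity, stated here as background rather than taken from the slide, shows how an l2 (Frobenius) penalty on the two projections relates to the trace norm of the translator matrix. The notation $T = W^{(s)\top} W^{(t)}$ is an assumption about how the translator is parameterized.

```latex
\|T\|_{*} \;=\; \sum_{i} \sigma_i(T)
\;=\; \min_{W^{(s)},\,W^{(t)}:\; T = W^{(s)\top} W^{(t)}}
\;\tfrac{1}{2}\Big( \big\|W^{(s)}\big\|_F^2 + \big\|W^{(t)}\big\|_F^2 \Big)
```

Here $\sigma_i(T)$ are the singular values of $T$, and the minimum is taken over factorizations with no limit on the latent dimension. Because the trace norm is a convex surrogate for rank, penalizing it tends to zero out small singular values, which corresponds to using fewer latent topics.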

Experiments: Cross-Domain Dataset
The text corpus and associated images were crawled from Flickr.com and wikipedia.com. We extract and stem all tokens in each text document and use their frequencies as text features. For each image, visual words are extracted using a codebook of size 500.
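The two feature types can be illustrated roughly as follows. This is a sketch under assumptions: `crude_stem` is a deliberately simple stand-in for a real stemmer, the local descriptors and the codebook are toy values, and the slides do not say how the codebook was built or which descriptors were used.

```python
import re
from collections import Counter


def crude_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def text_features(document):
    """Token-frequency features: lowercase, tokenize, stem, count."""
    tokens = re.findall(r"[a-z]+", document.lower())
    return Counter(crude_stem(t) for t in tokens)


def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(descriptor, c))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def visual_word_histogram(descriptors, codebook):
    """Bag-of-visual-words histogram with one bin per codeword."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    return hist


# Toy example with a 3-word codebook; the experiments use 500 codewords.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, -0.1], [0.9, 1.2], [4.0, 6.0], [1.1, 0.8]]
hist = visual_word_histogram(descriptors, codebook)
freqs = text_features("Birds flying over the waterfall; a bird sings.")
```

With a 500-entry codebook, each image becomes a 500-dimensional count vector, while each text document becomes a sparse vector of stemmed-token counts.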

Dataset Statistics
The number of text and image pairs for each category
Category    Number of crawled pairs
Birds       930
Buildings   9216
Cars        728
Cat         229
Dog         486
Horses      654
Mountain    4153
Plane       1356
Train       457
Waterfall   22006

Dataset (cont’d)

Compared Algorithms
Image only
– Only the visual features are used to train classifiers on the target image domain.
Translated Learning by minimizing Risk (TLRisk)
– Transfers text labels in the source domain to the target image domain via a Markov chain.
Heterogeneous Transfer Learning (HTL)
– Implicitly constructs a distance function between images by a matrix factorization between images and text documents.

Results
Average error rates for different numbers of training samples in the image domain.

Results
Average error rates for different numbers of text/image co-occurrence pairs (with five training examples).

Results
Number of topics in the latent space for establishing the cross-domain translator
Category    # topics
Birds       11
Buildings   88
Cars        19
Cat         18
Dog         7
Horses      4
Mountain    6
Plane       15
Train       6
Waterfall   21
Too many building variants!

Revisit Technical Contributions
Cross-Domain Knowledge Propagation
– Propagating knowledge in surrounding text to visual data
– Published in WWW’11, collaboration with Dr. Charu Aggarwal, IBM
Cross-Category Knowledge Sharing
– Exploring concept correlations to enhance inference accuracy
– To appear in CVPR’11, collaboration with Dr. Charu Aggarwal, IBM
Modeling Image Similarity
– Hierarchical Gaussianization (HG), ICCV’09
– Applications to disaster assessment (collaboration with Prof. Tarek Abdelzaher)

Future Work (Q3)
Resource allocation based on heterogeneous links for communication
– Low redundancy: at the base station, send the most informative message (text/multimedia data)
– High quality: at the data center, recover lost information based on redundancy in cross-media links
Effective linkage analysis with constraints in CCM

Future Work (Q4)
Develop a stochastic and dynamic model and theory for MINet
– The effect of structural changes in MINet: for latency analysis in communication networks, and for constrained linkage discovery in CCM
– The changes of QoI in a dynamic MINet

Path Ahead: Theory and Algorithm
Construct Cross-Media Analysis (CMA) theory
– Stochastic model for cross-media relations and redundancy: QoI theory in cross-media networks; information recovery based on cross-media redundancy
– Dynamic model for cross-media networks: analyze constrained links for CCM
Practical algorithms for sharing and transmitting information over cross-media links
– Achieve low latency and high quality in communication networks based on cross-media analysis
– Applications to CCM for robust constrained link discovery
– Cross-media knowledge sharing and discovery

Collaboration Summary
INARC 1.1: Prof. Tarek Abdelzaher
– Cross-media analysis for communication networks
– Trade-off between low latency and high quality
INARC 1.2: Dr. Charu Aggarwal
– Cross-domain knowledge propagation
– Cross-category knowledge sharing
– Quality of Information

Publications
Collaboration with Dr. Charu Aggarwal (IBM)
– Guo-Jun Qi, Charu Aggarwal, and Thomas Huang. Towards Cross-Domain Knowledge Propagation from Text Corpus to Web Images. To appear in Proc. of the International World Wide Web Conference (WWW 2011), Hyderabad, India, March 28-April 1, 2011.
– Guo-Jun Qi, Charu Aggarwal, Yong Rui, Qi Tian, Shiyu Chang, and Thomas Huang. Towards Cross-Category Knowledge Propagation for Learning Visual Concepts. To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011), Colorado Springs, Colorado, June 21-23, 2011.
– Guo-Jun Qi, Charu Aggarwal, and Thomas Huang. Transfer Learning with Distance Functions between Text and Web Images. Submitted to the ACM KDD Conference, 2011.
Collaboration with Prof. Tarek Abdelzaher
– Md Y. S. Uddin, Guo-Jun Qi, Tarek Abdelzaher, Thomas Huang, and Guohong Cao. PhotoNet: A Similarity-aware Image Delivery Service for Situation Awareness. IPSN Demo, April 2011.

Thanks! Q&A

Dataset in Target Domain
The number of images for each category in the target domain
Category    # Positive examples   # Negative examples
Birds       338                   349
Buildings   2301                  2388
Cars        120                   125
Cat         67                    72
Dog         132                   142
Horses      263                   268
Mountain    927                   1065
Plane       509                   549
Train       52                    53
Waterfall   5153                  5737

Learning Optimal Translator
Learning formulation via optimizing the translator function
The first term measures the consistency between the translator and the observed co-occurrence of text and images.
– The function applied to each co-occurrence pair is monotonically decreasing, so that a pair with a larger co-occurrence count c_{k,l} is weighted more.
– Co-occurring pairs of source and target samples probably share the same labels, so the translator T should respond more strongly to propagate labels between them.

Learning Optimal Predictor
Learning formulation via optimizing the translator function
The second term is the loss of the predictor f_T on the training set (e.g., logistic loss).
– It encodes the discriminative knowledge in the training set.
– Large-margin principle: it can reduce the noisy information in the co-occurrence set for the classification task.

Learning Optimal Predictor
Learning formulation via optimizing the translator function
The third term encodes the preference for a concise semantic translation over a tedious one.
– This is the principle for constructing the cross-domain translator.
– Nonessential and noisy observations can be filtered out of the translation process.

Results
Number of topics in the latent space for establishing the cross-domain translator
Category    Two Trn. Ex.   Ten Trn. Ex.
Birds       11             9
Buildings   88             102
Cars        19             3
Cat         18             2
Dog         7              5
Horses      4              1
Mountain    6              1
Plane       15             25
Train       6              3
Waterfall   21             26
Too many building variants!

Modeling Context-Aware Image Similarity
Current method
– Image visual similarity
– Hierarchical Gaussianization, ICCV’09 (Zhou, Huang, et al.)
– Hard to model image similarity at the semantic level
Model image semantic similarity
– Link images to text documents via the translator
– Compare the similarity of the associated text to compare image semantics
– Advantages: the “semantic gap” in text documents is smaller; such similarity reflects semantic-level information

Diagram: image similarity (target domain) is derived from text similarity (source domain) through the text-image association given by the learned translator T(x, y).
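One possible way to turn the diagram into a computation, offered as an assumption since the slides do not specify it: describe each image by its translator responses to a fixed set of text documents, then compare those response profiles with cosine similarity. The names `text_profile`, `semantic_similarity`, and `toy_translator` are hypothetical, and the dot-product translator is only a stand-in for a learned one.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def text_profile(image, documents, translator):
    """Describe an image by its translator response to each text document."""
    return [translator(doc, image) for doc in documents]


def semantic_similarity(image_a, image_b, documents, translator):
    """Compare two images through the text documents they associate with."""
    return cosine(text_profile(image_a, documents, translator),
                  text_profile(image_b, documents, translator))


# Hypothetical stand-in translator: a plain dot product on equal-length vectors.
def toy_translator(doc, image):
    return sum(d * i for d, i in zip(doc, image))


documents = [[1.0, 0.0], [0.0, 1.0]]
img_a, img_b, img_c = [2.0, 0.0], [4.0, 0.0], [0.0, 3.0]
sim_ab = semantic_similarity(img_a, img_b, documents, toy_translator)
sim_ac = semantic_similarity(img_a, img_c, documents, toy_translator)
```

In this toy case the first two images respond to the same document and come out as similar, while the third responds to a different document and does not.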

Path Ahead
Improve the Quality of Information (QoI) transmitted across domains.
– In some cases, the transmitted information may have a negative effect on the classification task (negative information transfer).
– Construct a new model that predicts from the target domain alone when the cross-domain information is detected to be noise.

Future Work
Semantic-level image similarity in heterogeneous networks
– Different sources of heterogeneous sensors, e.g., cameras, human annotations, and textual descriptions
– Fusing heterogeneous sources in the networks to learn a more descriptive image similarity
– Collaboration with Dr. Charu Aggarwal at IBM on sensor networks and Prof. Tarek Abdelzaher at UIUC on Fact Finder

Future Work
Social media networks
– Explore social relationships in social media networks: track abnormal events and topic evolution based on images and videos
– Collaboration with Zhen Wen at IBM on social networks

Linked to INARC Projects
Collaborators
– Prof. Tarek Abdelzaher, CS, UIUC (I1.1). Fact Finder: compare image similarity at the semantic level to discover trustworthy sources
– Dr. Charu Aggarwal, IBM (I1.2). Sensor networks: compare signal similarity using cross-domain knowledge
