Motivation Subjects and objects are correlated with each other through semantic relationships.

Contributions We propose a context-dependent diffusion network (CDDN) framework for visual relationship detection:
Semantic graphs to encapsulate semantic priors of subjects/objects
Visual scene graphs to capture the surrounding context
A graph diffusion network to learn latent representations of objects

Architecture

Object Association
Global semantic graph 𝒢₁ = (𝒱₁, ℰ₁), built over the whole training set: each vertex 𝓋 is an object category and each edge ℯ encodes a semantic relation between categories.
Spatial scene graph 𝒢₂ = (𝒱₂, ℰ₂), built per image: each vertex 𝓋 is a bounding box and each edge ℯ connects spatially related boxes 𝑏ᵢ and 𝑏ⱼ.
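The slide only names the two graphs, so here is a minimal Python sketch of how such adjacency matrices could be assembled. The edge criteria (category co-occurrence for the semantic graph, bounding-box overlap for the scene graph), the thresholds, and the function names are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def semantic_graph(image_labels, num_classes):
    """Global graph over object categories, built from the whole training set."""
    A = np.zeros((num_classes, num_classes))
    for labels in image_labels:            # labels: category ids present in one image
        for i in labels:
            for j in labels:
                if i != j:
                    A[i, j] += 1           # co-occurrence count (assumed edge weight)
    row_sum = A.sum(axis=1, keepdims=True)
    return A / np.maximum(row_sum, 1)      # row-normalized adjacency

def iou(b1, b2):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    x2, y2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + 1e-9)

def scene_graph(boxes, overlap_thresh=0.0):
    """Per-image spatial graph over bounding boxes b_i, b_j."""
    n = len(boxes)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and iou(boxes[i], boxes[j]) > overlap_thresh:
                A[i, j] = 1.0              # connect spatially related boxes
    return A
```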

Diffusion Layer
Graph 𝒢 = (𝒱, A, X), with vertices 𝒱 = {v₁, v₂, …, v_N}, adjacency matrix A ∈ ℝ^(N×N), and node features X = [x₁, x₂, …, x_N] ∈ ℝ^(N×d).
A* ∈ ℝ^(N×H×N): tensor stacking the H-hop power series of A.
W ∈ ℝ^(N×H×N): weight tensor.
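Reading the definitions above as a diffusion-convolution over the graph, a minimal NumPy sketch follows. The choice of powers (A¹ through Aᴴ), the element-wise weighting of the power-series tensor, and the tanh nonlinearity are assumptions made for illustration; they are not given on the slide.

```python
import numpy as np

def power_series(A, H):
    """Stack A^1 ... A^H into an (N, H, N) tensor (starting power is an assumption)."""
    N = A.shape[0]
    out = np.empty((N, H, N))
    P = np.eye(N)
    for h in range(H):
        P = P @ A                          # A^(h+1)
        out[:, h, :] = P
    return out

def diffusion_layer(A, X, W, H):
    """Return diffused node representations Z of shape (N, H, d)."""
    A_pow = power_series(A, H)             # (N, H, N)
    # Element-wise weighting of the diffusion tensor, then propagation of
    # node features along each diffusion hop.
    Z = np.einsum('nhm,md->nhd', W * A_pow, X)   # (N, H, d)
    return np.tanh(Z)                      # nonlinearity is an assumption

# Tiny usage example with random data (illustrative only).
N, d, H = 5, 8, 3
A = (np.random.rand(N, N) > 0.5).astype(float)
X = np.random.randn(N, d)
W = np.random.randn(N, H, N)
print(diffusion_layer(A, X, W, H).shape)   # (5, 3, 8)
```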

Ranking Loss Triplet r = (s, p, o), with subject s, predicate p, and object o.
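The slide gives only the triplet form; one common way to realize a ranking loss over such triplets is a margin (hinge) formulation, sketched below. The margin value, the use of corrupted triplets as negatives, and the function name are assumptions, not the paper's exact loss.

```python
import torch

def ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Hinge-style ranking loss: push scores of correct triplets r = (s, p, o)
    above scores of corrupted (negative) triplets by at least `margin`."""
    # pos_scores / neg_scores: (batch,) scores for ground-truth and corrupted triplets.
    return torch.clamp(margin - pos_scores + neg_scores, min=0).mean()
```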

Experiment