Xilai Li, Tianfu Wu, and Xi Song CVPR 2019 Presented by Dingquan Li

Presentation transcript:

Learning Deep Compositional Grammatical Architectures for Visual Recognition
Xilai Li, Tianfu Wu, and Xi Song, CVPR 2019. Presented by Dingquan Li.
Notes on the authors:
Xilai Li (李曦来): EE -> CS, AND-OR graphs.
Tianfu Wu: PhD with Song-Chun Zhu.
Xi Song: student of Yunde Jia; CVPR 2013 AND-OR graph work with Song-Chun Zhu.

Outline
Contributions
Motivation and Objective
Method Overview
AOGNets
Experiments
Conclusions

Contributions
The first work to use (AND-OR) grammar models in network engineering, which facilitates both feature exploration and exploitation in a hierarchical and compositional way.
Better performance than state-of-the-art networks in image classification and object detection.

Motivation and Objective
DLA, …
Unify the best practices developed in the popular networks?
Generate building blocks, and thus networks, in a principled way?
By compositional grammatical architectures!

AOG Building Block
The phrase structure grammar: Terminal, AND, and OR nodes.
The dependency grammar: models lateral connections.
The hierarchy facilitates a gradual increase of feature channels, as in Deep Pyramid ResNets [20], and also leads to a good balance between the depth and width of networks.
The compositional structure provides much more flexible information flows than DPN [7] and DLA [69].
The lateral connections induce feature diversity and increase the effective depth of nodes along the path without introducing extra parameters.
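
To make the phrase structure grammar concrete, here is a minimal sketch that enumerates the nodes of a one-dimensional AND-OR graph over `n` primitives. The construction is an assumption, not taken from the slides: every sub-interval `(i, j)` gets one OR-node and one Terminal-node, and every binary split of a sub-interval gets one AND-node. The function name `build_aog` and the tuple encoding are hypothetical.

```python
def build_aog(n):
    """Enumerate nodes of a 1-D AND-OR graph over n primitives.

    Assumed construction (not confirmed by the slides):
      - one OR-node and one Terminal-node per sub-interval (i, j)
      - one AND-node per binary split (i, m) + (m + 1, j) of a sub-interval
    """
    nodes = {"OR": [], "AND": [], "T": []}
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            nodes["OR"].append((i, j))   # alternatives for interval (i, j)
            nodes["T"].append((i, j))    # interval taken as a whole
            for m in range(i, j):        # every way to cut (i, j) in two
                nodes["AND"].append(((i, m), (m + 1, j)))
    return nodes


aog = build_aog(4)
counts = {kind: len(items) for kind, items in aog.items()}
print(counts)
```

Under this assumed construction the node count grows quadratically with the number of sub-intervals, which suggests why a pruned or simplified block is worth considering.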

Nodes in AOG Building Block
Terminal-nodes implement the split-transform heuristic.
AND-nodes implement DenseNet-like aggregation (i.e., concatenation) for feature exploration.
OR-nodes implement ResNet-like aggregation (i.e., summation) for feature exploitation.
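
The three node types can be sketched with NumPy arrays of shape (channels, height, width). This is an illustration under stated assumptions, not the authors' implementation: `transform` is a hypothetical stand-in (a plain ReLU) for whatever learned transform each node applies, and the function names are invented for this example.

```python
import numpy as np


def transform(x):
    # Hypothetical stand-in for a node's learned transform: ReLU only.
    return np.maximum(x, 0.0)


def terminal_node(x, start, end):
    # Split-transform: slice channels [start, end) of the input, then transform.
    return transform(x[start:end])


def and_node(left, right):
    # DenseNet-like aggregation: concatenate children along the channel axis.
    return transform(np.concatenate([left, right], axis=0))


def or_node(children):
    # ResNet-like aggregation: element-wise sum of same-shaped children.
    return transform(np.sum(children, axis=0))


x = np.random.default_rng(0).standard_normal((8, 4, 4))  # (channels, H, W)
t_left = terminal_node(x, 0, 4)    # first half of the channels
t_right = terminal_node(x, 4, 8)   # second half of the channels
t_full = terminal_node(x, 0, 8)    # all channels at once
a = and_node(t_left, t_right)      # 4 + 4 channels concatenated -> 8 channels
o = or_node([a, t_full])           # two 8-channel alternatives summed
print(a.shape, o.shape)
```

Concatenation grows the channel count (exploration of new features), while summation keeps the shape fixed and reuses existing features (exploitation), which is why the OR-node requires its children to have matching shapes.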

Node Operations in AOG Block

AOGNet

Simplifying AOG Building Blocks

Experiments
Image Classification: CIFAR-10, CIFAR-100, ImageNet-1K
Object Detection: PASCAL VOC 2007, PASCAL VOC 2012
Ablation Study

CIFAR-10 and CIFAR-100
Model naming: AOGNet-PrimitiveSize-(#AOG blocks per stage)-[OutputFeatDim].
FLOPS (also written flops or flop/s) normally means floating point operations per second. In the table, FLOPs most likely denotes the number of floating point operations instead.
Pooling has no parameters but still costs floating point operations, and Conv has a larger FLOPs/#Params ratio than FC.
FLOPs/#Params: Pooling/ReLU/… > Conv > FC
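
The FLOPs/#Params ordering can be checked with a small back-of-the-envelope calculation. The sketch below makes several assumptions: it counts multiply-accumulates as the operation unit, ignores biases, and charges pooling one operation per window element. The function names and the layer sizes are hypothetical and chosen only for illustration.

```python
def conv_cost(c_in, c_out, kernel, h_out, w_out):
    # Weights are shared across all output positions, so the operation
    # count is the parameter count times the number of output positions.
    params = kernel * kernel * c_in * c_out
    ops = params * h_out * w_out
    return params, ops


def fc_cost(n_in, n_out):
    # Each weight is used exactly once per input, so ops == params.
    params = n_in * n_out
    return params, params


def pool_cost(channels, kernel, h_out, w_out):
    # No learned parameters; one operation per element of each window.
    ops = channels * kernel * kernel * h_out * w_out
    return 0, ops


conv_params, conv_ops = conv_cost(c_in=3, c_out=64, kernel=3, h_out=32, w_out=32)
fc_params, fc_ops = fc_cost(n_in=512, n_out=10)
pool_params, pool_ops = pool_cost(channels=64, kernel=2, h_out=16, w_out=16)

print("conv ops/params:", conv_ops / conv_params)  # equals h_out * w_out
print("fc ops/params:", fc_ops / fc_params)        # equals 1
print("pool params:", pool_params, "pool ops:", pool_ops)
```

With these counting conventions the ratio for a convolution equals the number of output positions, the ratio for a fully connected layer is 1, and the ratio for pooling is unbounded because the parameter count is zero.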

ImageNet-1K (cloud platforms)

ImageNet-1K (mobile platforms)

PASCAL VOC 2007 and 2012

Ablation Study
RS: removing symmetric child nodes of OR-nodes in the pruned AOG building blocks.
LC: adding lateral connections for dependency grammars.

Conclusions
A method for learning deep compositional grammatical architectures that harness the best of grammars and deep neural networks for visual recognition.
An implementation with AND-OR grammars, called AOGNets.
Promising performance on three image classification datasets (CIFAR-10, CIFAR-100, and ImageNet-1K) and two object detection datasets (PASCAL VOC 2007 and 2012).

Update!
AOGNets: Compositional Grammatical Architectures for Deep Learning (arXiv 1711.05847v3, possibly the camera-ready version)
Model Interpretability
Adversarial Defense
Object Detection and Segmentation in COCO

Model Interpretability

Adversarial Defense