Xilai Li, Tianfu Wu, and Xi Song CVPR 2019 Presented by Dingquan Li

Presentation transcript:

Learning Deep Compositional Grammatical Architectures for Visual Recognition. Xilai Li, Tianfu Wu, and Xi Song, CVPR 2019. Presented by Dingquan Li. About the authors: Xilai Li (李曦来), EE to CS, AND-OR graphs; Tianfu Wu, PhD under Song-Chun Zhu; Xi Song, student of Yunde Jia, CVPR 2013 work on AND-OR graphs with Song-Chun Zhu.

Outline: Contributions; Motivation and Objective; Method Overview; AOGNets; Experiments; Conclusions.

Contributions: (1) The first work to use (AND-OR) grammar models in network engineering, which facilitates both feature exploration and exploitation in a hierarchical and compositional way. (2) Better performance than state-of-the-art networks in image classification and object detection.

Motivation and Objective: DLA, … Can we unify the best practices developed in the popular networks? Can we generate building blocks, and thus networks, in a principled way? Yes: by compositional grammatical architectures!

AOG Building Block: The phrase structure grammar provides three node types: Terminal, AND, and OR. The dependency grammar models lateral connections. The hierarchy facilitates a gradual increase of feature channels, as in Deep Pyramid ResNets [20], and also leads to a good balance between the depth and width of networks. The compositional structure provides much more flexible information flows than DPN [7] and DLA [69]. The lateral connections induce feature diversity and increase the effective depth of nodes along the path without introducing extra parameters.
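The slide names the node types but not how a block is assembled, so the following is only a sketch of one plausible construction. It assumes the block is built over a 1-D sequence of n primitive channel groups, the way a phrase structure grammar is built over a sentence of n words: every sub-sequence [i, j] gets a Terminal-node and an OR-node, and every binary split of [i, j] gets an AND-node. The function names `build_aog` and `count_by_type` and the tuple node ids are hypothetical, and lateral connections and pruning are left out.

```python
from collections import Counter


def build_aog(n):
    """Enumerate nodes of an AND-OR graph over n primitives (assumed construction).

    Returns a dict mapping each node id to the list of its child node ids.
    Node ids are tuples: ('T', i, j), ('OR', i, j), or ('AND', i, j, k),
    where [i, j] is the covered sub-sequence and k is the split point.
    """
    nodes = {}
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            terminal = ('T', i, j)
            nodes[terminal] = []               # terminals have no children
            or_children = [terminal]
            for k in range(i, j):              # every binary split of [i, j]
                and_id = ('AND', i, j, k)
                # An AND-node composes the OR-nodes of its two halves.
                nodes[and_id] = [('OR', i, k), ('OR', k + 1, j)]
                or_children.append(and_id)
            # An OR-node chooses among the terminal and all AND-node splits.
            nodes[('OR', i, j)] = or_children
    return nodes


def count_by_type(nodes):
    """Count how many nodes of each type the graph contains."""
    return dict(Counter(node_id[0] for node_id in nodes))


aog = build_aog(4)
counts = count_by_type(aog)
print(counts)
```

Under these assumptions the number of nodes grows quadratically to cubically with n, which is one reason a pruned or simplified block may be preferable in practice.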

Nodes in AOG Building Block: Terminal-nodes implement the split-transform heuristic. AND-nodes implement DenseNet-like aggregation (i.e., concatenation) for feature exploration. OR-nodes implement ResNet-like aggregation (i.e., summation) for feature exploitation.
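The three node operations can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the (channels, height, width) tensor layout, and the use of a 1x1 linear map plus ReLU as the "transform" are all assumptions made for illustration.

```python
import numpy as np


def terminal_node(x, start, end, num_groups, weight):
    """Split-transform (assumed form): slice channel groups [start, end],
    then apply a 1x1 linear map followed by ReLU."""
    group_size = x.shape[0] // num_groups
    sliced = x[start * group_size:(end + 1) * group_size]   # split
    out = np.einsum('oc,chw->ohw', weight, sliced)          # transform
    return np.maximum(out, 0.0)


def and_node(left, right):
    """DenseNet-like aggregation: concatenate children along channels."""
    return np.concatenate([left, right], axis=0)


def or_node(children):
    """ResNet-like aggregation: element-wise sum of same-shaped children."""
    return np.stack(children).sum(axis=0)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))        # 8 channels = 4 groups of 2

# Two terminals over single groups, composed by an AND-node.
t0 = terminal_node(x, 0, 0, 4, rng.standard_normal((2, 2)))
t1 = terminal_node(x, 1, 1, 4, rng.standard_normal((2, 2)))
and01 = and_node(t0, t1)                  # 2 + 2 = 4 channels

# A terminal over both groups, summed with the AND-node by an OR-node.
t01 = terminal_node(x, 0, 1, 4, rng.standard_normal((4, 4)))
or01 = or_node([t01, and01])              # still 4 channels
```

Note how concatenation grows the channel count while summation keeps it fixed, which is why the children of an OR-node must agree in shape.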

Node Operations in AOG Block

AOGNet

Simplifying AOG Building Blocks

Experiments: Image Classification (CIFAR-10, CIFAR-100, ImageNet-1K); Object Detection (PASCAL VOC 2007, PASCAL VOC 2012); Ablation Study.

CIFAR-10 and CIFAR-100: Models are named AOGNet-PrimitiveSize-(#AOG blocks per stage)-[OutputFeatDim]. A note on terminology: FLOPS (also written flops or flop/s) usually means floating point operations per second, but in the table FLOPs most likely denotes the number of floating point operations. Pooling contains no parameters but still performs floating point operations, and a Conv layer has a larger FLOPs/#Params ratio than an FC layer. Ordering by FLOPs/#Params: Pooling/ReLU/… > Conv > FC.
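The FLOPs/#Params ordering can be checked with a quick back-of-the-envelope calculation. The sketch below assumes one multiply-accumulate counts as one operation and ignores bias additions; the helper names and the layer sizes are hypothetical and are not taken from the paper's tables.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out):
    """Return (params, ops) for a 2-D convolution with bias."""
    params = c_out * (c_in * kernel * kernel + 1)
    # Each output position reuses the same weights, so ops scale with H*W.
    ops = c_out * h_out * w_out * c_in * kernel * kernel
    return params, ops


def fc_cost(n_in, n_out):
    """Return (params, ops) for a fully connected layer with bias."""
    params = n_out * (n_in + 1)
    ops = n_in * n_out            # each weight is used exactly once
    return params, ops


def pool_cost(channels, kernel, h_out, w_out):
    """Return (params, ops) for a pooling layer: no parameters at all."""
    return 0, channels * h_out * w_out * kernel * kernel


conv_params, conv_ops = conv2d_cost(64, 64, 3, 32, 32)
fc_params, fc_ops = fc_cost(512, 10)
pool_params, pool_ops = pool_cost(64, 2, 16, 16)

conv_ratio = conv_ops / conv_params   # roughly h_out * w_out
fc_ratio = fc_ops / fc_params         # slightly below 1
print(conv_ratio, fc_ratio, pool_params, pool_ops)
```

The convolution reuses each weight at every spatial position, so its ratio is on the order of the output resolution, while the FC layer uses each weight once and pooling has operations with zero parameters.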

ImageNet-1K (cloud platforms)

ImageNet-1K (mobile platforms)

PASCAL VOC 2007 and 2012

Ablation Study: RS means removing symmetric child nodes of OR-nodes in the pruned AOG building blocks; LC means adding lateral connections for dependency grammars.

Conclusions: (1) A method of learning deep compositional grammatical architectures that harness the best of grammars and deep neural networks for visual recognition. (2) An implementation with AND-OR grammars, called AOGNets. (3) Promising performance on three image classification datasets (CIFAR-10, CIFAR-100, and ImageNet-1K) and two object detection datasets (PASCAL VOC 2007 and 2012).

Update! AOGNets: Compositional Grammatical Architectures for Deep Learning (1711.05847v3, maybe camera-ready). Topics: Model Interpretability; Adversarial Defense; Object Detection and Segmentation in COCO.

Model Interpretability

Adversarial Defense