Structured Predictions with Deep Learning

Slides:

Advertisements

Similar presentations

Classification spotlights

Advertisements

Generic object detection with deformable part-based models

Detection, Segmentation and Fine-grained Localization

ECE 6504: Deep Learning for Perception

Fully Convolutional Networks for Semantic Segmentation

Feedforward semantic segmentation with zoom-out features

Unsupervised Visual Representation Learning by Context Prediction

Cascade Region Regression for Robust Object Detection

Deep Residual Learning for Image Recognition

Gaussian Conditional Random Field Network for Semantic Segmentation

NEIL: Extracting Visual Knowledge from Web Data Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta Carnegie Mellon University CS381V Visual Recognition -

Machine learning & object recognition Cordelia Schmid Jakob Verbeek.

When deep learning meets object detection: Introduction to two technologies: SSD and YOLO Wenchi Ma.

Recent developments in object detection

Deep Residual Learning for Image Recognition

CS 4501: Introduction to Computer Vision Object Localization, Detection, Semantic Segmentation Connelly Barnes Some slides from Fei-Fei Li / Andrej Karpathy.

CNN-RNN: A Uniﬁed Framework for Multi-label Image Classiﬁcation

The Relationship between Deep Learning and Brain Function

Object Detection based on Segment Masks

Deep Learning Amin Sobhani.

Compact Bilinear Pooling

Object detection with deformable part-based models

Convolutional Neural Fabrics by Shreyas Saxena, Jakob Verbeek

The Problem: Classification

Krishna Kumar Singh, Yong Jae Lee University of California, Davis

Announcements Project proposal due tomorrow

Combining CNN with RNN for scene labeling (segmentation)

Compositional Human Pose Regression

Part-Based Room Categorization for Household Service Robots

Inception and Residual Architecture in Deep Convolutional Networks

Efficient Deep Model for Monocular Road Segmentation

CS6890 Deep Learning Weizhen Cai

Deep Residual Learning for Image Recognition

Project Implementation for ITCS4122

Adversarially Tuned Scene Generation

Object detection.

Fully Convolutional Networks for Semantic Segmentation

Normalized Cut Loss for Weakly-supervised CNN Segmentation

Computer Vision James Hays

Image Classification.

A Comparative Study of Convolutional Neural Network Models with Rosenblatt’s Brain Model Abu Kamruzzaman, Atik Khatri , Milind Ikke, Damiano Mastrandrea,

Object Detection + Deep Learning

KFC: Keypoints, Features and Correspondences

A Proposal Defense On Deep Residual Network For Face Recognition Presented By SAGAR MISHRA MECE

Semantic segmentation

Neural network training

Lecture: Deep Convolutional Neural Networks

Visualizing CNNs and Deeper Deep Architectures

Outline Background Motivation Proposed Model Experimental Results

Object Tracking: Comparison of

RCNN, Fast-RCNN, Faster-RCNN

Deep Learning Some slides are from Prof. Andrew Ng of Stanford.

Going Deeper with Convolutions

Problems with CNNs and recent innovations 2/13/19

Heterogeneous convolutional neural networks for visual recognition

Convolutional Neural Network

Course Recap and What’s Next?

Department of Computer Science Ben-Gurion University of the Negev

Dynamic Neural Networks Joseph E. Gonzalez

Human-object interaction

VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION

Semantic Segmentation

Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation

Presented By: Harshul Gupta

Learning Deconvolution Network for Semantic Segmentation

End-to-End Facial Alignment and Recognition

Appearance Transformer (AT)

Report 2 Brandon Silva.

An introduction to Machine Learning (ML)

Presentation transcript:

Structured Predictions with Deep Learning James Hays

Recap of previous lecture COCO dataset. Instance segmentation of 80 categories. Keypoints + Language + other annotations, as well. Deeper deep models VGG networks GoogLeNet built from “Inception” modules ResNet Deeper networks seem to work better than the equivalent shallow network with the same number of parameters, but they aren’t trivial to train.

a classification network “tabby cat” note omissions “activations” fixed size input, single label output desire: efficient per-pixel output Fully Convolutional Networks for Semantic Segmentation. Jon Long, Evan Shelhamer, Trevor Darrell. CVPR 2015

becoming fully convolutional Note: “Fully Convolutional” and “Fully Connected” aren’t the same thing. They’re almost opposites, in fact.

becoming fully convolutional

upsampling output

end-to-end, pixels-to-pixels network

What if we want other types of outputs? Easy: Predict any number of labels (with classification, there will be just one best answer, but for other labels like attributes dozens could be appropriate for an image)

What if we want other types of outputs? Easy*: Predict any fixed dimensional output, whether a feature (embedding networks) or an image. Scribbler: Controlling Deep Image Synthesis with Sketch and Color. Sangkloy, Lu, Chen Yu, and Hays. CVPR 2017 *easy to design an architecture. Not necessarily easy to get working.

What if we want other types of outputs? Hard: Outputs with varying dimensionality or cardinality A natural language image caption An arbitrary number of human keypoints (17 points each) An arbitrary number of bounding boxes (4 parameters each) Today we will examine state-of-the-art methods for keypoint prediction and object detection

Convolutional Pose Machines Variant of Convolutional Pose Machines that won the inaugural COCO keypoint challenge. http://image-net.org/challenges/talks/2016/Multi- person%20pose%20estimation-CMU.pdf Videos: https://www.youtube.com/playlist?list=PLNh5A7Ht LRcpsMfvyG0DED-Dr4zW5Lpcg Multi-Person Pose Estimation using Part Affinity Fields Zhe Cao, Shih-En Wei, Tomas Simon, Yaser Sheikh Carnegie Mellon University

SSDBox Object Detector that is very nearly state-of-the-art accuracy and very, very fast http://www.cs.unc.edu/~wliu/papers/ssd_eccv201 6_slide.pdf