© 2013 IBM Corporation Efficient Multi-stage Image Classification for Mobile Sensing in Urban Environments Presented by Shashank Mujumdar IBM Research,

Presentation transcript:

Efficient Multi-stage Image Classification for Mobile Sensing in Urban Environments
Presented by Shashank Mujumdar, IBM Research, India

Point and Shoot
- Smartphones make image capture easy.
- The growing number of smartphone users opens up possibilities for real-world applications of image classification.

Overview of GHMC
- GHMC (Greater Hyderabad Municipal Corporation) has a vision to make Hyderabad a citizen-friendly, well-governed and environmentally friendly city by providing high-quality services.
- Goal: ensure the city is clean by monitoring trash collection on a daily basis.
- Third-party supervisors use smartphones to capture images of the dumpsters through a mobile application and submit them to an online server, where they are analyzed manually.
- Need:
  - Provide a transparent interface to the citizens.
  - Allow validation of submitted feedback and corrective action where required.

The Task
- We automate the process of identifying the state of the dumpster bins.
- We perform binary image classification, assigning each dumpster image to one of two categories:
  - Clean (trash is not visible from the bin opening)
  - Unclean (trash is visible from the bin opening)
- Unique problem:
  - Classification between two states of the same object.
  - In the literature, work on mobile imagery focuses on retrieval and recognition tasks.
  - Challenging imaging conditions, background clutter in the images, and a complex urban environment.

The Proposed Framework
Figure (pipeline diagram). Blocks: Image Data; Stage 1: Detection; Region of Interest (ROI); Stage 2: Classification; Feature Computation; Train and Test with SVM Classifier.
- We propose a simple multi-stage pipeline to perform image classification.
- Data collection:
  - Using a web crawler, we downloaded the images from the publicly accessible web portal.
  - We excluded images that are ambiguous:
    - contain multiple dumpsters
    - dumpster lid area is not visible
  - A total of 1710 images were collected.
  - Manual labels for the images served as ground truth.
- Challenges:
  - Varying illumination conditions
  - Image background clutter
  - Different scales and viewing angles of the dumpsters

The Detection Stage
Figure (detection flowchart). Blocks: Cropped Images; Compute SIFT Features; Cluster SIFT Features; Generate Visual Vocabulary; Match Visual Words; Frequent Visual Words; Find Visual Words; Extract Region of Interest (ROI); Generate Bounding Box; Image Data.
Step 1: Training to Generate Frequent Visual Words
Step 2: Finding Frequent Visual Words to Extract ROI
- Object localization typically uses a sliding-window approach, which is computationally expensive.
- We use a Bag of Words (BoW) approach for detection; BoW is typically used for recognition/classification.
- A dumpster is present and identifiable in every image.
- Identify the visual words that represent the dumpster.
- Match local features (SIFT) with these visual words to obtain the region of interest (ROI).
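The ROI step can be illustrated with a minimal sketch. It assumes keypoint locations and descriptors are already available, that a visual vocabulary and a set of "frequent" word indices were produced in Step 1, and that the ROI is the tight bounding box around the matched keypoints. The function names, the nearest-centroid matching rule, and the toy 2-D descriptors (real SIFT descriptors are 128-D) are illustrative assumptions, not details taken from the slides.

```python
from math import dist


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary centroid closest to the descriptor (Euclidean)."""
    return min(range(len(vocabulary)), key=lambda i: dist(descriptor, vocabulary[i]))


def extract_roi(keypoints, descriptors, vocabulary, frequent_words):
    """Bounding box (x_min, y_min, x_max, y_max) around keypoints whose
    descriptors map to a frequent visual word; None if nothing matches."""
    matched = [
        pt
        for pt, desc in zip(keypoints, descriptors)
        if nearest_word(desc, vocabulary) in frequent_words
    ]
    if not matched:
        return None
    xs = [x for x, _ in matched]
    ys = [y for _, y in matched]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy data (hypothetical): three visual words, only word 1 is "frequent".
vocabulary = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
frequent_words = {1}
keypoints = [(5, 5), (40, 60), (80, 30), (300, 300)]
descriptors = [(0.5, 0.2), (9.0, 11.0), (10.5, 9.0), (19.0, 1.0)]

roi = extract_roi(keypoints, descriptors, vocabulary, frequent_words)
print(roi)
```

Only the second and third keypoints map to the frequent word, so the box spans those two points. Because each image is visited once per keypoint rather than once per window position and scale, this kind of matching avoids the cost of a sliding-window search.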

The Classification Stage
Figure (classification flowchart). Blocks: Image Data; Train Data; Test Data; Detect ROI; HOG Feature Computation; k-Fold Cross Validation; SVM Classifier; Predict Labels; Learning; Training (Step 1); Testing (Step 2).
- Classifier training:
  - Identify and extract the ROI.
  - Compute the HOG features over the ROI.
  - Reduce dimensionality with Fisher's LDA.
  - Train a kernel SVM with an RBF kernel.
  - Use k-fold cross validation to estimate the optimal classifier parameters.
- Classifier testing:
  - Classification is performed using the parameters learned in training.
  - Extract ROI -> compute HOG -> perform LDA -> classify with SVM.
- The 1710 images were divided into a training set (90%) and a testing set (10%).
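The data handling on this slide can be sketched as follows. The slides give the 90%/10% division and mention k-fold cross validation, but not the value of k or how images were assigned to each set; the random shuffle, the seed, and k = 5 below are assumptions made only for illustration.

```python
import random


def train_test_split(n_items, test_fraction, seed=0):
    """Shuffle item indices and hold out a fraction of them for testing."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = round(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k):
    """Yield (train, validation) index lists, one pair per fold."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 1710 images, 10% held out for testing (figures from the slide).
train_idx, test_idx = train_test_split(1710, 0.10)

# k = 5 is an assumed value; the slides do not state k.
splits = list(k_fold(train_idx, k=5))
print(len(train_idx), len(test_idx), [len(val) for _, val in splits])
```

In a full pipeline, each candidate SVM parameter setting would be scored on the validation folds, and the held-out test indices would be touched only once, after the parameters are fixed.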

Results and Performance Evaluation
- An accuracy of 80.59% was achieved on the unseen test data.
- ROC curves were generated to assess the performance and AUC was computed.
- Area under the curve (AUC) was computed to be
- Comparison with conventional single-stage classification pipelines:
  - HOG, LBP and Haralick's texture features were used in a single-stage SVM.
  - The proposed multi-stage approach outperforms all of the single-stage variants.
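For reference, the two metrics named on this slide can be computed as in the sketch below. The labels and scores are made-up toy values, not the results of the presented system, and the AUC is computed with the pairwise ranking formulation (the fraction of positive/negative pairs that the classifier orders correctly), which equals the area under the ROC curve.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the ground-truth label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): 1 = unclean, 0 = clean.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, auc)
```

Accuracy depends on the chosen decision threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are reported together.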

Conclusion
- Developed and implemented a multi-stage image classification system.
- Efficient and robust under challenging imaging conditions in real-world mobile sensing applications.
- Demonstrated its effectiveness on real-world images of dumpsters captured with mobile phones.
- Achieved an accuracy of 80.59% on a challenging (public) dataset.
- Shown to outperform conventional single-stage image classification techniques.
- The proposed pipeline can be extended to other real-world applications in mobile sensing by experimenting with other features suited to the task at hand.

Thank You