Im2Calories: towards an automated mobile vision food diary

Im2Calories: towards an automated mobile vision food diary. Graduate Presentation Assignment, CS 674. Sara Davis.

Relevance: Obesity rates are climbing. Current calorie-counting apps are time-consuming and inaccurate. Some apps are expensive and rely on nutritionists. Others are not informative enough.

How can we make these apps better? Semi-automation: recognize what is recognizable, and prompt the user for help when something is not. The system has to work in the real world.

What will this app look like? Take a photo in the app. The system processes it: is there food, and where are you? If you are at a restaurant, the app classifies the meal and offers the top 5 choices for the user to select. The user can delete bad labels and add new ones. If offline, the photo is stored for later upload.
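
The flow on this slide can be sketched as a small decision function. This is a minimal illustration, not the app's actual code: the function names, the status strings, and the order of the checks (food check first, then connectivity) are all assumptions.

```python
def handle_photo(is_food, online, ranked_labels, top_k=5):
    """Decide what the app does with a new photo (hypothetical sketch).

    is_food:       result of the food / not-food check
    online:        whether the phone can reach the server
    ranked_labels: candidate menu items, best guess first
    """
    if not is_food:
        return {"status": "no_food", "suggestions": []}
    if not online:
        # Assumption: offline photos are queued and classified later.
        return {"status": "queued_for_upload", "suggestions": []}
    # Offer the top-k candidates for the user to pick from.
    return {"status": "awaiting_user_choice", "suggestions": ranked_labels[:top_k]}


def apply_user_edits(labels, removed, added):
    """Let the user delete bad labels and add new ones."""
    kept = [label for label in labels if label not in removed]
    return kept + [label for label in added if label not in kept]


candidates = ["burger", "fries", "salad", "soup", "wrap", "pizza"]
result = handle_photo(is_food=True, online=True, ranked_labels=candidates)
final = apply_user_edits(result["suggestions"], removed={"soup"}, added=["cola"])
```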

Main contributions: Identify what is in a meal on a larger scale than previously achieved. Create a new dataset for image segmentation. Begin to map photos to calorie count in the real world.

Meal detection steps: Determine whether the image is clear enough, using food/not-food data sets. Rescale the images. Create a "new" data set, Food-101 Background. Train a CNN to detect meals.

Analyzing the meal: Identify which restaurant the user is at, then get the menu or ask the user for information. A multi-label classifier determines what is on the plate (there can be several items). Find each food item in the restaurant database. Estimate calories.
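
Once the classifier has produced a probability for each menu item, there are at least two simple ways to turn those probabilities into a calorie total. The sketch below shows a thresholded sum and a probability-weighted sum. These may correspond to the C-hat and C-bar estimators compared later in the talk, but that mapping, the 0.5 threshold, and the menu values are assumptions made for illustration.

```python
def thresholded_calories(probs, calories, threshold=0.5):
    """Sum the calories of every item predicted present (p > threshold)."""
    return sum(calories[item] for item, p in probs.items() if p > threshold)


def expected_calories(probs, calories):
    """Weight each item's calories by its predicted probability."""
    return sum(p * calories[item] for item, p in probs.items())


# Hypothetical menu database and classifier output (placeholder numbers).
menu_calories = {"burger": 500.0, "fries": 300.0, "salad": 100.0}
item_probs = {"burger": 0.9, "fries": 0.6, "salad": 0.1}

hard_total = thresholded_calories(item_probs, menu_calories)
soft_total = expected_calories(item_probs, menu_calories)
```

The thresholded version commits to a set of items, while the weighted version hedges across uncertain ones.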

Two methods of computing calorie content: (1) Use the MenuMatch dataset (nutritionist + menu) with a one-vs-all SVM. (2) Use a pretrained GoogLeNet CNN model: remove the 1000-way softmax and replace it with a 101-way softmax; train; then replace that with 41 nodes and fine-tune.

Comparing study method (C-bar & C-hat) to others

Method      Mean Error         Mean Absolute Error
Baseline    -37.3 ± 3.9        239.9 ± 1.4
Meal Snap   -268.5 ± 13.3      330.9 ± 11.0
MenuMatch   -21.0 ± 11.6       232.0 ± 7.2
C-hat       -31.90 ± 28.10     163.42 ± 16.32
C-bar       -25.35 ± 26.37     152.95 ± 15.61
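
For reference, the two metrics in this comparison can be computed as below. This is a generic sketch: the sign convention (predicted minus true) and the sample numbers are assumptions, not values from the study.

```python
def mean_error(predicted, actual):
    """Average signed error; negative means underestimating on average."""
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)


def mean_absolute_error(predicted, actual):
    """Average error magnitude, ignoring direction."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up calorie estimates for three meals.
predicted_kcal = [500.0, 700.0, 300.0]
actual_kcal = [550.0, 650.0, 400.0]

me = mean_error(predicted_kcal, actual_kcal)
mae = mean_absolute_error(predicted_kcal, actual_kcal)
```

A mean error near zero can hide large mistakes that cancel out, which is why the absolute error is reported alongside it.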

Augmenting the data: Restaurant dataset. Previous results were based on 646 images of 41 menu items from 3 restaurants. Download menus for the top 25 US restaurants and create a list of 4857 items. Search for non-promotional food photos. Verify the photos. New set: 2517 menu items, 99,000 images.

Retraining with new data: Harder-to-identify images were used to retrain the CNN (75/25 train/test split). The error rate was high; fix with clustering.

Augmenting the data: Food201-Multilabel. Take 50,000 images from the Food101 set and combine them with user-named food items. Create a list of foods eaten together. Allow users to enter new items (2). Merge synonyms and prune terms occurring fewer than 100 times. New set with 201 labels. Train the CNN using the same process as before.
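
The "merge synonyms, prune rare terms" step can be sketched as follows. The function name, the synonym table, and the toy annotations are hypothetical; the slide's cutoff of 100 is lowered to 2 here so the example fits on a few lines.

```python
from collections import Counter


def build_vocabulary(annotations, synonyms, min_count=100):
    """Map synonyms to one canonical term, then drop rare terms.

    annotations: one list of user-entered tags per image
    synonyms:    dict mapping a variant spelling to its canonical term
    min_count:   minimum number of images a term must appear in
    """
    canonical = [[synonyms.get(tag, tag) for tag in tags] for tags in annotations]
    # Count each term once per image.
    counts = Counter(tag for tags in canonical for tag in set(tags))
    keep = {tag for tag, n in counts.items() if n >= min_count}
    labels = [sorted({tag for tag in tags if tag in keep}) for tags in canonical]
    return labels, sorted(keep)


toy_annotations = [["fries", "burger"], ["french fries", "salad"], ["burger", "chips"]]
toy_synonyms = {"french fries": "fries"}

labels, vocabulary = build_vocabulary(toy_annotations, toy_synonyms, min_count=2)
print(vocabulary)  # ['burger', 'fries']
```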

Retraining with new data: Highest error for sides and small items. Newer images have higher error rates (quality).

Segmenting the images: Separate sides from the main course. Why? To quantify the number or volume of items for nutritional analysis. Use the DeepLab CNN model.

Results of segmentation: Because of the size of the Food201 segmentation dataset, there is a false-positive problem. Create a binary mask vector, multiply the mask vector by the label distribution, and smooth. Results are not as good as on the PASCAL VOC 2012 challenge because of the size of the set, outliers, and variation in the size and shape of samples.
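
The masking step can be sketched as below, assuming the binary mask marks which classes are believed to be present in the image and that each pixel carries a probability distribution over classes. The renormalization is an added assumption, and the smoothing step is left out.

```python
def apply_class_mask(pixel_probs, class_mask):
    """Zero out classes not allowed by the mask, then renormalize per pixel."""
    masked = []
    for probs in pixel_probs:
        kept = [p * m for p, m in zip(probs, class_mask)]
        total = sum(kept)
        masked.append([k / total for k in kept] if total > 0 else kept)
    return masked


def argmax_labels(pixel_probs):
    """Pick the most likely class index for each pixel."""
    return [max(range(len(probs)), key=probs.__getitem__) for probs in pixel_probs]


# Toy example with classes [background, rice, pasta] and two pixels.
pixel_probs = [[0.2, 0.3, 0.5], [0.7, 0.2, 0.1]]
class_mask = [1, 1, 0]  # pasta is assumed absent from this image

before = argmax_labels(pixel_probs)
after = argmax_labels(apply_class_mask(pixel_probs, class_mask))
print(before, after)  # [2, 0] [1, 0]
```

The first pixel flips from pasta to rice once pasta is ruled out, which is how the mask suppresses false positives.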

Estimate volume: Create a depth map using a CNN that predicts pixel distances. Project the distances into space and create a 2D grid (voxel). Compare the voxels to the segmentation mask. Good accuracy.
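
A minimal sketch of the last step, assuming the depth map has already been converted into a 2D grid of food heights above the table and that the segmentation mask assigns a label to each grid cell. The grid layout, units, and function name are assumptions.

```python
def volume_by_label(height_cm, labels, cell_cm=1.0):
    """Sum height x cell area over the cells belonging to each food label.

    height_cm: 2D grid of heights above the table surface, in cm
    labels:    2D grid of the same shape; None marks non-food cells
    cell_cm:   side length of one square grid cell, in cm
    """
    cell_area = cell_cm * cell_cm
    volumes = {}
    for height_row, label_row in zip(height_cm, labels):
        for height, label in zip(height_row, label_row):
            if label is None:
                continue
            volumes[label] = volumes.get(label, 0.0) + height * cell_area
    return volumes


heights = [[0.0, 2.0, 2.0],
           [0.0, 3.0, 1.0]]
mask = [[None, "rice", "rice"],
        [None, "rice", "beans"]]

volumes = volume_by_label(heights, mask, cell_cm=0.5)
print(volumes)  # {'rice': 1.75, 'beans': 0.25}
```

Volumes come out in cubic centimeters, which equal milliliters.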

Calorie estimate: Need to map volume to caloric content. Hard to do; most databases are inaccurate. Focus on raw. This portion is incomplete.
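
The talk does not give a formula for this mapping. One common way to frame it is calories = volume × density × energy per gram, sketched below. The lookup table holds placeholder numbers chosen for easy arithmetic, not real nutrition data.

```python
# Placeholder values only: (density in g/ml, energy in kcal/g).
FOOD_TABLE = {
    "rice": (1.0, 2.0),
    "beans": (1.0, 1.0),
}


def calories_from_volume(volumes_ml, table=FOOD_TABLE):
    """Convert per-food volumes (ml) into a total calorie estimate."""
    total = 0.0
    for food, volume in volumes_ml.items():
        density, kcal_per_gram = table[food]
        total += volume * density * kcal_per_gram
    return total


estimate = calories_from_volume({"rice": 100.0, "beans": 50.0})
print(estimate)  # 250.0
```

The estimate is only as good as the density and energy values, which is the database-accuracy problem the slide points to.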

Issues: Restricted settings. Sample size and type. Food analysis. Still has trouble with mixed and occluded foods. K = 1 in clustering. Volume performance depends on image quality. Calorie estimates are still being worked on.