Announcements Homework 5 due tonight (11:59pm) via Carmen Homework 6 out now: Due 12/1 @ 11:59pm.

Today’s learning goals
At the end of today, you should be able to:
  - Describe AI applications to image classification and gestural control
  - Describe AI applications to problems in robotics

Information retrieval
Flip side of information extraction
  - Input: query (structured data or unstructured text)
  - Output: list of records (documents), ranked by relevance to the query
  - Example query passed to the IR system: “Information about platypi”

Example

IR is more than just language
Information retrieval can be mostly structured
  - E.g., PageRank (Google’s early search algorithm)
      - Uses page connections via hyperlinks to define “important” pages
      - Does some keyword-based matching
Or mostly unstructured
  - Example medical document retrieval system:
      - Query: string of disease names (“emphysema, breast cancer, diabetes”)
      - Method: find documents that frequently mention these words
      - Rank by date entered (recent first)
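The link-based idea behind PageRank can be sketched with a few lines of power iteration. This is a simplified textbook version, not Google's production algorithm: the toy link graph `TOY_WEB`, the damping factor of 0.85 (a commonly used value), and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `links` maps each page to the list of pages it links to. Every link
    target must also appear as a key (assumption of this sketch).
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to "home", nothing links to "orphan".
TOY_WEB = {
    "home": ["about", "docs"],
    "about": ["home"],
    "docs": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(TOY_WEB)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Pages with many incoming links from well-ranked pages end up "important"; the page that nothing links to keeps only its baseline share.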

IR example
Query: “emphysema AND diabetes”
Extra knowledge base info:
  - “chronic cough” is related to “emphysema”
  - “insulin” is related to “diabetes”
Document  F(emphysema)  F(diabetes)  F(cough)  F(insulin)
D1        14            0            2         0
D2        0             3            1         4
D3        0             0            0         0
D4        10            7            12        6

IR example
Query: “emphysema AND diabetes”
Linear ranking model: $\mathrm{Rank}(D_i) = 2F(\text{emph}) + F(\text{diab}) + 0.3F(\text{cough}) + 0.7F(\text{insulin})$
Document  F(emphysema)  F(diabetes)  F(cough)  F(insulin)  Score
D1        14            0            2         0           28.6
D2        0             3            1         4           6.1
D3        0             0            0         0           0
D4        10            7            12        6           34.8
Return documents in ranked order: [D4, D1, D2, D3]
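The linear ranking model above can be reproduced in a few lines. This is a minimal sketch: the names `FREQUENCIES`, `WEIGHTS`, `score`, and `rank_documents` are illustrative, and cells with no value on the slide are assumed to be 0, which is consistent with the listed scores for D1 and D2.

```python
# Term frequencies per document, in the order
# (emphysema, diabetes, cough, insulin).
# Assumption: cells left blank on the slide are treated as 0.
FREQUENCIES = {
    "D1": (14, 0, 2, 0),
    "D2": (0, 3, 1, 4),
    "D3": (0, 0, 0, 0),
    "D4": (10, 7, 12, 6),
}

# Weights from the linear ranking model, in the same term order.
WEIGHTS = (2.0, 1.0, 0.3, 0.7)


def score(freqs, weights=WEIGHTS):
    """Weighted sum of term frequencies for one document."""
    return sum(w * f for w, f in zip(weights, freqs))


def rank_documents(frequencies):
    """Return document names sorted by descending score, plus the scores."""
    scores = {doc: score(freqs) for doc, freqs in frequencies.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


ranking, scores = rank_documents(FREQUENCIES)
print(ranking)
```

The query terms get the larger weights (2 and 1), while the related knowledge-base terms contribute with smaller weights (0.3 and 0.7), so a document can gain some score from related vocabulary alone.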

Sample IR experimental questions
  - How does a unigram (1-word) frequency model compare to a bigram (2-word) model?
  - How important are hyperlinks vs text matches?
  - Does my new way of representing documents give me better IR generalization to different kinds of documents?
  - How do I efficiently rank tens of millions of documents?
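To make the unigram vs bigram distinction concrete, here is a small sketch of how the two kinds of counts could be collected. The helper `ngram_counts`, the whitespace tokenization, and the sample sentence are assumptions for illustration only.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count n-grams (as tuples of n consecutive lowercase tokens) in a text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


doc = "chronic cough and chronic fatigue"
unigrams = ngram_counts(doc, 1)
bigrams = ngram_counts(doc, 2)

print(unigrams[("chronic",)])         # how often the single word appears
print(bigrams[("chronic", "cough")])  # how often the two-word phrase appears
```

A unigram model only sees that "chronic" occurs twice; a bigram model can tell "chronic cough" apart from "chronic fatigue", at the cost of many more distinct features to count.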

Computer vision Video from Joshua Mosley

Vision data
Data come in different forms
  - Single images
  - Video (image sequence)
  - Grayscale/RGB/CMYK/etc.
  - Multiple cameras
Other factors in data
  - Lighting
  - Lens shape
  - Distance to subject
  - etc.

Vision applications
  - Image classification
  - Motion tracking
  - Motion capture
  - Gestural control
  - Sports highlights/analysis
  - …

Image classification
  - Input: single image
  - Output: label(s) describing image content
Single image → Classifier → “Llama”

Feature extraction – edge detection
  - Use pixel value gradients to find sharp changes
  - Thresholding the gradient at a chosen value gives you edges
[Figure panels: Original, Edges, Overlaid. Images from Jim Davis]
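A gradient-plus-threshold edge detector can be sketched in plain Python. The central-difference gradient, the toy 5x6 image, and the threshold of 50 are assumptions for illustration; practical systems typically use library routines with smoothing and more careful border handling.

```python
import math


def gradient_magnitude(image):
    """Central-difference gradient magnitude for interior pixels.

    Border pixels are left at 0 because they lack a neighbor on one side.
    """
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0  # horizontal change
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0  # vertical change
            magnitude[r][c] = math.hypot(gx, gy)
    return magnitude


def detect_edges(image, threshold):
    """Mark a pixel as an edge (1) when its gradient magnitude exceeds the threshold."""
    return [[1 if m > threshold else 0 for m in row]
            for row in gradient_magnitude(image)]


# Hypothetical grayscale image: dark on the left, bright on the right.
IMAGE = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]

edges = detect_edges(IMAGE, threshold=50)
for row in edges:
    print(row)
```

Only the pixels on either side of the dark-to-bright step have a large gradient, so they are the only ones marked as edges.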

Feature extraction – region segmentation
  - Break the image into contiguous regions
  - Can use clustering methods like k-means
[Figure panels: Original, Segmented with k-means (k=16)]
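As a rough illustration of clustering-based segmentation, the sketch below runs k-means on grayscale intensities only. The helper names, the evenly spaced initial centroids, and the toy image are assumptions. Clustering on intensity alone does not guarantee contiguous regions; including pixel coordinates as extra features is one common way to encourage that.

```python
def nearest_centroid(value, centroids):
    """Index of the centroid closest to a pixel value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, k, iterations=20):
    """k-means on scalar values with evenly spaced (deterministic) initial centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest_centroid(v, centroids)].append(v)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def segment(image, k):
    """Replace every pixel with the index of its intensity cluster."""
    pixels = [p for row in image for p in row]
    centroids = kmeans_1d(pixels, k)
    return [[nearest_centroid(p, centroids) for p in row] for row in image]


# Hypothetical grayscale image with a dark, a mid-gray, and a bright area.
IMAGE = [
    [12, 10, 11, 200, 205],
    [9, 13, 10, 198, 202],
    [90, 95, 92, 201, 199],
]

labels = segment(IMAGE, k=3)
for row in labels:
    print(row)
```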

Convolutional neural networks
  - Say we want to learn to extract useful features for classification
      - May be edges, faces, color patterns, etc.
  - Use a neural network to get arbitrary image features

Convolutional neural networks
Normal fully-connected neural net
  - Linear combination of every single pixel in the image
  - Way too many parameters!

Convolutional neural networks
Use local connections instead
  - Important information w.r.t. one pixel is usually nearby
  - Local statistics can be similar at different locations
      - A nose is a nose
      - Translation invariance
  - So slide local “windows” around the image
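A quick parameter count shows why local, shared windows help. The layer sizes below (a 224x224 RGB image, 1000 hidden units, 64 filters of size 3x3) are hypothetical values picked only to make the comparison concrete.

```python
def fully_connected_params(height, width, channels, hidden_units):
    """Weights plus biases when every pixel value feeds every hidden unit."""
    return height * width * channels * hidden_units + hidden_units


def conv_params(kernel_size, in_channels, num_filters):
    """Weights plus biases for filters that are shared across all image positions."""
    return kernel_size * kernel_size * in_channels * num_filters + num_filters


dense = fully_connected_params(224, 224, 3, 1000)
conv = conv_params(3, 3, 64)

print(f"fully connected: {dense:,} parameters")
print(f"convolutional:   {conv:,} parameters")
```

The convolutional count does not depend on the image size at all, because the same small window of weights is reused at every location.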

Convolutional operation (single 3x3 filter) Images from Jim Davis
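The sliding-window operation for a single 3x3 filter can be written out directly. This sketch computes the window-by-window weighted sum without flipping the filter, which is how the operation is usually implemented in CNNs; the vertical-edge filter and the toy image are assumptions for illustration.

```python
def convolve2d(image, kernel):
    """Slide a square filter over the image and take a weighted sum at each position.

    Only positions where the filter fits entirely inside the image are computed,
    so the output is smaller than the input.
    """
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical filter that responds to dark-to-bright changes from left to right.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

IMAGE = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

feature_map = convolve2d(IMAGE, VERTICAL_EDGE)
for row in feature_map:
    print(row)
```

The same nine weights are applied at every window position, so the filter responds to the edge wherever it appears in the image.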

Different filters for different features
  - Use many different filters over the same image
  - Each filter is just a linear combination of pixels
      - Parameters are the same no matter where you apply the filter
  - Ideally, each learns different features
      - Llama faces, color shifts, noses, etc.

Actually doing image classification
General process:
  1. Extract features from labeled images
  2. Plug them into some classification model
      - Logistic regression
      - Neural network
      - Support Vector Machine, etc.
  3. Profit
Image → Features → Classifier → “Llama”
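The two-step process (extract features, then train a classifier) can be sketched end to end with a tiny logistic regression. Everything concrete here is an assumption for illustration: the hand-picked features (mean brightness and contrast), the day/night labels, the 2x2 toy images, and the learning rate and epoch count. A task like recognizing llamas would need far richer features.

```python
import math


def extract_features(image):
    """Two hand-picked features: mean brightness and contrast, both scaled to [0, 1]."""
    pixels = [p / 255.0 for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias with plain stochastic gradient descent on the log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            prediction = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = prediction - y
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(image, weights, bias):
    """Return 1 (day) or 0 (night) for a new image."""
    x = extract_features(image)
    return 1 if sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5 else 0


# Hypothetical labeled images: 1 = day (bright), 0 = night (dark).
TRAINING = [
    ([[220, 200], [210, 230]], 1),
    ([[180, 190], [240, 200]], 1),
    ([[20, 30], [10, 40]], 0),
    ([[50, 35], [25, 15]], 0),
]

train_x = [extract_features(image) for image, _ in TRAINING]
train_y = [label for _, label in TRAINING]
weights, bias = train_logistic_regression(train_x, train_y)

print(predict([[250, 240], [230, 245]], weights, bias))  # a bright test image
print(predict([[5, 15], [25, 10]], weights, bias))       # a dark test image
```

Swapping in a different classifier only changes the training and prediction functions; the feature extraction step stays the same, which is why the choice of features is an experimental question in its own right.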

Experimental questions for image classification
  - Which features work best for different classification tasks?
      - Labeling animals, night vs day, city where image was taken, etc.
  - Can I automatically find the best threshold for edge detection on pictures of faces?
  - …

Gestural control
Main idea: use hand gesture input to control computer applications
  - Track user’s hand in 3D
  - Pre-defined gestures act as discrete computer commands
  - One or more cameras
  - Need real-time processing!
Slides adapted from Jim Davis

Simple example
  - 3 kinds of gestures: Point, Reach, and Click
  - Recognize using edges of hand against background
  - Track how edge shape changes between video frames
Example from Kumar and Segen (1999). Slides adapted from Jim Davis

Application 1: Flight simulator Example from Kumar and Segen (1999) Slides adapted from Jim Davis

Application 2: Doom

Modern application
  - Toshiba concept demo (2009)
  - Combination of intelligent heuristics (tracking the hand) and learning (classifying gestures)

Robotics
Video: new robot dog (YouTube)

Robotics
What kinds of problems do you have to deal with?
  - Physical environment: partially observable, full of stuff
  - Physical components: unreliable, may fail
  - Sensors: limited ability to perceive environment
Two examples of very different robotics settings: industrial robotics and swarm robotics

Industrial robotics

Industrial environments
Controlled environment
  - Agent typically has limited or no movement
  - Know what other agents (robots, humans) will be around
  - Know what physical conditions should be

Industrial applications
Applications/autonomy vary
  - Highly specialized for a single task with hard input/output constraints
      - Pre-defined routines, no agency
      - Bottling, welding, etc.
  - Multiple tasks or variable environment; multi-agent
      - Some conditional behavior, possibly learning routines
      - Warehouse robots
[Image credit: Reuters]

Swarm robotics
Hundreds of small robots operating together
Each agent has
  - Limited (and noisy) sensors
  - Very limited compute capability
  - Communication with other agents
General idea: support complex environment interaction with minimal resources

Swarm robotics Demo

Very different problems from industrial
Industrial robotics
  - Highly controlled environment
  - Complex actuators
  - Special-purpose sensors
Swarm robotics
  - Open environment
  - Very simple actuators
  - Simple, general sensors

Multi-agent systems
  - For each agent, environment includes many other agents
  - All agents act concurrently, must predict what others will do
  - One solution: single overall controller
      - Problem: inflexible, hard to maintain
  - Better solution: minimax-like
      - Problem: too many other agents! And not turn-based.
  - Best solution: robust to unexpected outcomes (bumps)

Application notes: Problems mix together
  - Information retrieval gets easier and faster when you have good information extraction
  - Errors in POS tagging mean errors farther down the line in information extraction or other applications
  - Speech recognition requires good (text-based) language modeling

Many problems span multiple areas
  - One popular task: image captioning
  - Requires both good vision and language components

Robotics: combining everything
A general-purpose, learning robot needs to combine many tasks
  - Perception: vision, speech recognition
  - Understanding/answering: information retrieval, language generation
  - Acting: reinforcement learning, probabilistic modeling

Next time
AI philosophy: Chinese room and society of mind