CS256 Intelligent Systems – Vision Systems: Module Overview
Timetable
Week 1 (2L): Introduction to the module and vision systems
Week 2 (2L): Case studies and basic concepts
Week 3 (2L): Java and image fundamentals
Week 4 (LP): Feature extraction and image transforms
Week 5 (LP): Edge detection and segmentation
Week 6 (LP): Colour and texture
Week 7 (LP): Recover 3D information
Week 8 (LP): System architecture
Week 9 (LP): Knowledge and reasoning
Week 10 (2L): Image classification and retrieval (including revision)
Coursework
Develop a system that is able to identify key features in selected images. Write a report describing the design, implementation and evaluation of the system. Please see details of the coursework assignment in the separate document. Questions will be asked during lab sessions.
Deadline: Monday 18th April, 2005
Assessment
Examination – 60% – three questions from four
Coursework – 40% – report based on experiments
Recommended Texts
Nick Efford (2000), Digital Image Processing: A Practical Introduction using Java, Addison Wesley
Tim Morris (2004), Computer Vision and Image Processing, Palgrave Macmillan
Patrick H. Winston (1992), Artificial Intelligence (Third Edition), Addison Wesley
Rob Callan (2003), Artificial Intelligence, Palgrave Macmillan
Paul F. Whelan and Derek Molloy (2001), Machine Vision Algorithms in Java: Techniques and Implementation, Springer
Objectives of the module
Understand the fundamentals of machine intelligence
–Focus on vision systems, but relate to other domains
Understand the components of vision systems
–Be familiar with common operations for processing images
–Be able to implement simple image processing operations
Evaluate a vision system
Additionally: encourage students to practise basic and advanced Java programming
Intelligence and Perception
First we must understand how we perceive the world; then we can teach the machine to interpret the world based on the primitive data it has received.
Human Perceptual Modalities
–Tactile – touch
–Gustatory – taste
–Visual – sight
–Auditory – hearing
–Olfactory – smell
Intelligent Systems
Intelligent robots and intelligent machines
–built on artificial intelligence principles
–reason about the world and take appropriate actions by manipulating knowledge
–sense the world directly
Vision – computational perception
–a diverse and interdisciplinary body of knowledge and techniques
–aims to understand the principles behind the processes that interpret perceptual signals provided by various sensors
Intelligent Systems
In vision, the software's job is to process the input from the hardware or sensors.
Humans have natural abilities to speak, see, think, smell, sense and so on. Machines have no such inborn abilities; they only have simple engines that follow logical algorithms.
Giving a computer similar natural abilities, such as speech and vision, is closely related to building a knowledge system, but it also requires combining simulation of the perception process with that knowledge.
Intelligent Systems
A complete intelligent system integrates different levels of processing to bridge the gaps between sensors, raw data, low-level processing, high-level processing and knowledge. This is reflected in the structure of this module.
Figure 5-10: image B S1.X5.4.jpg (above) and its annotation window generated in the I-Browse system
Applications
Classical
–robotics
–medical imaging
–remote sensing
–astronomy
Today
–DTV
–image interpretation
–biometry
–GIS (Earth/planetary observation, monitoring, exploration)
–human genome project
–creative media, art and entertainment
Sample applications – Biometry
Using personal characteristics to identify a person:
–fingerprints
–face
–iris
–DNA
–gait
–etc.
Iris Scan
Striations on the iris are individually unique.
Obvious applications
–security
–PIN
Iris-recognition processing steps:
–Locate the eye in the head image
–Radial resampling of the iris (a fixed number of samples)
–Numerical description
–Analysis
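A minimal Java sketch of how these stages might be chained together; every class and method name below is hypothetical and stands in for the real detection, resampling and matching algorithms, which are beyond this overview.

// Hypothetical skeleton of the iris-recognition stages listed above.
// None of these names come from a real library; they only show how
// the stages could be chained: locate -> resample -> describe -> analyse.
public class IrisPipeline {

    /** Stage 1: find the region containing the eye in a head image. */
    int[] locateEye(int[][] headImage) {
        // Eye detection (e.g. template matching) would go here.
        return new int[] {0, 0, 0, 0};          // x, y, width, height placeholder
    }

    /** Stage 2: resample the iris along radial lines so that every
     *  iris is described by the same, fixed number of samples. */
    double[] radialResample(int[][] headImage, int[] eyeRegion, int samples) {
        return new double[samples];             // placeholder samples
    }

    /** Stage 3: turn the samples into a compact numerical description. */
    double[] describe(double[] samples) {
        return samples;                         // encoding/filtering would go here
    }

    /** Stage 4: analysis – compare the description with a stored template. */
    boolean matches(double[] description, double[] template, double threshold) {
        double distance = 0.0;
        for (int i = 0; i < description.length; i++) {
            double d = description[i] - template[i];
            distance += d * d;
        }
        return Math.sqrt(distance) < threshold;
    }
}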
Image Representation
A digital image is an array F of m x n pixels; the pixel in the x-th column and y-th row has an intensity equal to f(x,y). For a colour image, each pixel carries the vector (r(x,y), g(x,y), b(x,y)).
Colour image and video sequence
Colour can be conveyed by combining different colours of light, using three components (red, green and blue): R = r(x,y); G = g(x,y); B = b(x,y), where R, G and B are defined in a similar way to F. The vector (r(x,y), g(x,y), b(x,y)) defines the intensity and colour at the point (x,y) in the colour image.
A video sequence is, in effect, a time-sampled representation of the original moving scene. Each frame in the sequence is a standard colour or monochrome image and can be coded as such. A monochrome video sequence may therefore be represented digitally as a sequence of 2-D arrays [F1, F2, F3, ..., FN].
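As a rough sketch (not code from the lecture), a colour frame can be held as three 2-D arrays r, g and b indexed by (x, y), and a video sequence as a time-ordered list of such frames; the class names Frame and VideoSequence are only illustrative.

import java.util.ArrayList;
import java.util.List;

// Sketch only: a colour frame as three m x n arrays and a video sequence
// as a time-ordered list of frames [F1, F2, ..., FN].
class Frame {
    final int[][] r, g, b;                 // red, green and blue components

    Frame(int m, int n) {                  // m columns, n rows
        r = new int[m][n];
        g = new int[m][n];
        b = new int[m][n];
    }

    /** The colour vector (r(x,y), g(x,y), b(x,y)) at pixel (x, y). */
    int[] colourAt(int x, int y) {
        return new int[] { r[x][y], g[x][y], b[x][y] };
    }
}

class VideoSequence {
    // Each element is one time-sampled frame of the moving scene.
    final List<Frame> frames = new ArrayList<>();
}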
Java example for image representation:
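The original example is not reproduced here; the sketch below uses the standard javax.imageio and java.awt.image classes to load an image (the file name example.jpg is just a placeholder) and read off the colour components (r, g, b) and a simple intensity f at a pixel (x, y).

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ImageRepresentation {
    public static void main(String[] args) throws IOException {
        // Load the image into an m x n array of pixels.
        BufferedImage image = ImageIO.read(new File("example.jpg")); // placeholder file name

        int m = image.getWidth();              // number of columns (x direction)
        int n = image.getHeight();             // number of rows    (y direction)

        int x = 10, y = 20;                    // an arbitrary pixel (assumes the image is at least this large)
        int rgb = image.getRGB(x, y);          // packed 0xAARRGGBB value

        int r = (rgb >> 16) & 0xFF;            // r(x,y)
        int g = (rgb >> 8)  & 0xFF;            // g(x,y)
        int b =  rgb        & 0xFF;            // b(x,y)

        // A simple greyscale intensity f(x,y) as the mean of the components.
        int f = (r + g + b) / 3;

        System.out.println("Image size: " + m + " x " + n);
        System.out.println("(r,g,b) at (" + x + "," + y + ") = ("
                           + r + "," + g + "," + b + "), f = " + f);
    }
}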
The Difficulty in Vision Computing – Taking the Human Visual System for Granted
The processing capability of the human visual system is often taken for granted. The subtlety and difficulty of describing the exact operation of its subconscious functions presents a significant obstacle to developing algorithms that emulate human visual behaviour. If we were the computer…
Difficulties in vision computing – the sensory gap
The sensory gap is the gap between the object in the world and the information in a (computational) description derived from a recording of that scene. Bridging it calls for disambiguation processing.
Difficulties in vision computing – the semantic gap
The semantic gap is the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation (Arnold, 2000). The higher the level of interpretation, the more domain knowledge, and management of that knowledge, is required.