AIBO VISION
by Utku Tatlıdede, Kemal Kaplan

2 AIBO ROBOT
Specifications (ERS-210):
- CPU clock speed: 192 MHz
- 20 degrees of freedom
- Sensors: temperature, infrared distance, acceleration, pressure, vibration
- CMOS image sensor
- Miniature microphones, miniature speaker, LCD display
- Size (W x H x L): 6.06" x 10.47" x 10.79"
- Weight: 3.3 lbs (1.5 kg)

3 AIBO VISION
- CMOS color camera, 16.8 million colors (24-bit)
- 176x144 pixel image
- Field of view: 57.6° wide and 47.8° tall
- Up to 25 fps
- Stores information in the YUV color space

4 PROCESSING OUTLINE
Color Segmentation
- Representations
- Algorithms
Region Building and Merging
- Region growing
- Edge detection
Object Recognition
- Classification
- Template matching
- Sanity check
Bounding Box Creation

Color can be physically described as a spectrum, that is, the intensity of light at each wavelength.
- Radiance: energy (W) from the light source
- Luminance: energy perceived by the observer
- Brightness: subjective descriptor
Color representations:
- CIE-XYZ
- RGB
- CMY, CMYK
- HSV, HSI, HLS
- YIQ
- YUV, YCbCr
- LUT

6 RGB (Red, Green, Blue)
- Additive color space
- Three primaries: red, green, and blue
- Cannot reproduce every color: some single-wavelength (spectral) colors have no equivalent mix of the three primaries

7 HSI (Hue, Saturation, Intensity)
- Similar to HSV (Hue, Saturation, Value)
- Represents colors in a way similar to how the human eye senses them

8 YUV (Luminance, Chrominance)
- Similar to YIQ and YCbCr
- Used for the PAL and SECAM broadcast television systems
- Greatly reduces the amount of information needed to define a color

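9 CONVERSIONS
RGB to YUV:
$$Y = 0.299R + 0.587G + 0.114B$$
$$U = -0.147R - 0.289G + 0.436B$$
$$V = 0.615R - 0.515G - 0.100B$$
RGB to HSV:
$$H = \cos^{-1}\left(\frac{\frac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^2+(R-B)(G-B)\right]^{1/2}}\right)$$
$$S = 1 - \frac{3\,\min(R,G,B)}{R+G+B}$$
$$V = \frac{R+G+B}{3}$$
When B > G, the hue is taken as 360° - H.
Can we reduce the color space by using unsupervised dimension reduction techniques (like PCA)?
Can we use different domains?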
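The two conversions above can be sketched in a few lines of Python. This is a minimal illustration, not the code that runs on the robot: the function names `rgb_to_yuv` and `rgb_to_hsi` are hypothetical, the inputs are assumed to be floats in [0, 1], the YUV coefficients are the standard analog ones, and the gray-pixel handling (hue set to 0) is an assumption.

```python
import math


def rgb_to_yuv(r, g, b):
    """Convert RGB components in [0, 1] to YUV using the standard weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    total = r + g + b
    intensity = total / 3.0
    if total == 0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, use 0
    saturation = 1.0 - 3.0 * min(r, g, b) / total
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        return 0.0, saturation, intensity  # gray pixel: hue is undefined, use 0
    ratio = 0.5 * ((r - g) + (r - b)) / denominator
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding outside [-1, 1]
    theta = math.degrees(math.acos(ratio))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity
```

Note that the YUV conversion is three multiply-add rows, while the hue needs a square root and an inverse cosine per pixel, which is one reason a linear space is attractive for real-time work.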

For each object, find the most accurate subspaces of the color space to represent the object.
YUV seems the most promising color representation for our real-time application.

11 FINDING THE SUBSPACES
First label images, then use supervised pattern recognition techniques. The most common ones:
- C4.5
- MLP
- KNN

12 C4.5
- Forms a decision tree for classification.
- Uses the concept of information gain (the effective decrease in entropy).
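As a rough illustration of information gain, the sketch below scores one candidate split of labeled pixels on a single color channel. The names `entropy` and `information_gain`, the threshold-style split, and the sample layout (tuples of channel values) are assumptions made for this example. C4.5 itself normalizes this quantity into a gain ratio, which is not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(samples, labels, channel, threshold):
    """Entropy decrease from splitting on samples[i][channel] <= threshold."""
    left = [lab for s, lab in zip(samples, labels) if s[channel] <= threshold]
    right = [lab for s, lab in zip(samples, labels) if s[channel] > threshold]
    n = len(labels)
    # Weighted entropy of the two partitions; empty partitions contribute nothing.
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder
```

A split that separates the classes perfectly recovers the full entropy of the label set, while a split that leaves both sides equally mixed has a gain of zero.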

13 MLP (Multi-Layer Perceptron)
The MLP network is suited to a wide range of applications such as pattern classification and recognition, interpolation, prediction and forecasting.

14 KNN (K-Nearest Neighbor)
KNN is a simple algorithm that stores all available examples and classifies new instances based on a similarity measure.
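A minimal sketch of this idea for color classification follows. It assumes the stored examples are (color, label) pairs, uses squared Euclidean distance as the similarity measure, and takes a majority vote among the k nearest examples; the function name `knn_classify` and the sample values are hypothetical.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    train: list of (color_tuple, label) pairs, e.g. ((y, u, v), "orange").
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Every query scans the whole training set, so the cost grows with the number of stored examples.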

15 CONDENSED KNN
The training set can be reduced by removing the samples that add no extra information to the classifier.
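One common way to do this is Hart's condensing procedure, sketched below under the assumption that a 1-nearest-neighbor rule is used: a sample is kept only if the samples kept so far would misclassify it. The names `nearest_label` and `condense` are hypothetical, and the slides do not say which condensing variant was used.

```python
def nearest_label(store, point):
    """Label of the stored sample closest to `point` (squared Euclidean distance)."""
    best = min(store, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)))
    return best[1]


def condense(train):
    """Return a subset of `train` that still classifies all of `train` correctly with 1-NN.

    train: non-empty list of (point_tuple, label) pairs.
    """
    store = [train[0]]
    changed = True
    while changed:  # repeat passes until no sample needs to be added
        changed = False
        for sample in train:
            if sample in store:
                continue
            if nearest_label(store, sample[0]) != sample[1]:
                store.append(sample)  # misclassified, so it carries new information
                changed = True
    return store
```

The result depends on the order of the training samples, so the condensed set is consistent with the original labels but not necessarily minimal.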

16 PCA (Principal Component Analysis)
PCA is a mathematical procedure that converts a number of possibly correlated variables into a hopefully smaller number of uncorrelated variables called principal components.
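As a small illustration, the first principal component can be estimated by power iteration on the covariance matrix of the data. This is a sketch only: the function names are hypothetical, the fixed iteration count is an arbitrary choice, and the all-ones starting vector is an assumption that fails if it happens to be orthogonal to the dominant direction.

```python
import math


def covariance_matrix(points):
    """Sample covariance matrix of a list of equal-length numeric tuples."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_principal_component(points, iterations=100):
    """Unit vector along the direction of largest variance (power iteration)."""
    cov = covariance_matrix(points)
    d = len(cov)
    v = [1.0] * d  # assumed starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    return v
```

Projecting each color onto the first one or two components is the kind of reduction the question on the conversions slide has in mind.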


18 RLE (Run Length Encoding)
RLE encodes consecutive repetitions of the same value as a single value and its run length.
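A minimal sketch of run-length encoding applied to one row of a color-segmented image follows. The function names and the use of string color labels are assumptions for illustration.

```python
def rle_encode(row):
    """Encode a sequence as a list of (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return [(value, length) for value, length in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]
```

For a row with long stretches of a single color class, the list of runs is much shorter than the row itself, and later stages can work on runs instead of individual pixels.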

19 REGION GROWING
- Depends on the satisfactory selection of a number of seed pixels.
- May be performed before color segmentation.
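The sketch below grows a single region from one seed pixel with a breadth-first search. It assumes a grayscale image stored as a list of rows, 4-connectivity, and a similarity test against the seed value with a fixed tolerance; all of these are illustrative choices, as is the function name `grow_region`.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Return the set of (row, col) pixels connected to `seed` and similar to it."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region
```

A poorly placed seed, for example one on an object boundary, yields a poor region, which is why seed selection matters.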

Merging algorithms: neighboring regions are compared and merged if they are similar enough in some features.
Splitting algorithms: large non-uniform regions are broken up into smaller regions that are hopefully uniform.


22 CLASSIFICATION
- Already done by color segmentation.
- Ball: the biggest blob with "ball color"
- Beacons: two adjacent blobs with beacon colors, etc.
- Unclassified blobs are discarded.
- Each object is classified with a certainty.
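The ball rule above can be sketched as follows. The blob representation (a dictionary with hypothetical "color" and "area" keys) and the color name "orange" are assumptions for illustration, and the certainty value mentioned on the slide is not modeled.

```python
def find_ball(blobs, ball_color="orange"):
    """Return the largest blob of the ball color, or None if there is none.

    blobs: list of dicts with hypothetical keys "color" and "area".
    """
    candidates = [blob for blob in blobs if blob["color"] == ball_color]
    if not candidates:
        return None
    return max(candidates, key=lambda blob: blob["area"])
```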

23 TEMPLATE MATCHING
- Accomplished by using convolution or correlation.
- Only handles translation of the template; it is ineffective under rotation or size changes.
- Also fails for partial views of objects.
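A minimal sketch of correlation-based matching follows: the template is slid over the image and each position is scored by the sum of products. Subtracting the template mean before correlating is an assumption added here so that uniformly bright areas do not score well; the function name and the list-of-rows image layout are also illustrative.

```python
def match_template(image, template):
    """Return ((row, col), score) of the best template position in the image."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    mean_t = sum(map(sum, template)) / (th * tw)
    best_score, best_pos = float("-inf"), None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # Correlate the image window with the zero-mean template.
            score = sum(
                image[r + i][c + j] * (template[i][j] - mean_t)
                for i in range(th)
                for j in range(tw)
            )
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

The search only shifts the template, so a rotated or rescaled object would need a separate template for each pose, which is the limitation noted above.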

24 SANITY CHECK
The AIBO vision module performs a series of sanity checks to ensure that the object classification is logically consistent:
- The ball cannot be above the goal.
- Goals cannot be below the field.
- There cannot be two balls, etc.

25 BOUNDING BOX CREATION
- Requires affine transformations (translation, rotation, scaling).
- Required for calculating distance and position information.
- The final products of the vision module are the bounding boxes of each visible object.
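As a simplified illustration, the sketch below computes an axis-aligned bounding box from the pixels of one region. It deliberately leaves out the affine correction mentioned above, and the function name and the (row, col) pixel representation are assumptions.

```python
def bounding_box(pixels):
    """Return (min_row, min_col, max_row, max_col) for an iterable of (row, col) pixels."""
    pixels = list(pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)
```

The height and width of the box follow directly from these four values and could feed a distance estimate for objects of known size.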


