1 Image Databases
- In a conventional relational database, the user types in a query and obtains an answer in response.
- Image databases are different: a police officer may issue a query such as "Retrieve all pictures from the image database that are similar to this person, and give the identities of the people."
- This query is fundamentally different from ordinary queries for two reasons:
  - The query includes a picture as part of the query.
  - The query asks about similar pictures and therefore uses a notion of "imprecise match".
2 Raw images
- The content of an image consists of all "interesting" objects in that image.
- Each object is characterized by:
  - a shape descriptor, which describes the shape/location of the region within which the object is located inside the image
  - a property descriptor, which describes the properties of individual pixels (e.g., RGB values of the pixel, RGB values aggregated over a group of pixels, grayscale levels)
- A property consists of:
  - a property name, e.g., red, green, blue, texture
  - a property domain: the range of values the property can assume, e.g., {0, 1, ..., 7}
3 Images
- Every image is associated with a pair of positive integers (m, n), called the grid resolution, which divides the image into (m x n) cells of equal size (the image grid).
- Each cell consists of a collection of pixels.
- A cell property is a triple (Name, Values, Method). Examples:
  - (bwcolor, {b, w}, bwalgo), where the possible values are b (black) and w (white), and bwalgo is an algorithm that takes a cell as input and returns either black or white by somehow combining the black/white levels of the pixels in the cell
  - (graylevel, [0, 1], grayalgo), where the possible values are real numbers in the interval [0, 1]
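A minimal sketch, assuming a Python representation, of the (Name, Values, Method) triple; the class name CellProperty, the 0.5 threshold in bwalgo, and the use of a mean in grayalgo are illustrative choices, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable
import numpy as np

@dataclass
class CellProperty:
    name: str                             # e.g. "bwcolor"
    values: Any                           # the property domain, e.g. {"b", "w"} or the interval [0, 1]
    method: Callable[[np.ndarray], Any]   # algorithm mapping a cell (block of pixels) to a value

def bwalgo(cell: np.ndarray) -> str:
    """Combine the gray levels of the pixels in a cell into a single black/white value."""
    return "w" if cell.mean() > 0.5 else "b"   # 0.5 threshold is an arbitrary illustrative choice

def grayalgo(cell: np.ndarray) -> float:
    """Aggregate the cell's pixels into one gray level in [0, 1]."""
    return float(cell.mean())

bwcolor   = CellProperty("bwcolor",   {"b", "w"}, bwalgo)
graylevel = CellProperty("graylevel", (0.0, 1.0), grayalgo)
```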
4 Image Database
- An image database is a triple (GI, Prop, Rec), where:
  - GI is a set of gridded images (Image, m, n)
  - Prop is a set of cell properties
  - Rec is a mapping that associates with each image a set of rectangles denoting objects (in fact, the regions do not necessarily have to be rectangles)
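Continuing the sketch above, the (GI, Prop, Rec) triple might be represented as follows; the field names and the (row_lo, col_lo, row_hi, col_hi) rectangle encoding are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple
import numpy as np

@dataclass
class GriddedImage:
    pixels: np.ndarray   # the raw image
    m: int               # grid resolution: the image is divided into m x n cells
    n: int

Rectangle = Tuple[int, int, int, int]   # (row_lo, col_lo, row_hi, col_hi) in cell coordinates

@dataclass
class ImageDatabase:
    gi: List[GriddedImage]            # GI: the gridded images
    prop: List["CellProperty"]        # Prop: cell properties (see the previous sketch)
    rec: Dict[int, List[Rectangle]]   # Rec: image index -> rectangles denoting objects
```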
5 Problems with image databases
- Images are often very large, so it is infeasible to explicitly store the properties on a pixel-by-pixel basis.
  - This has led to a family of image "compression" techniques that attempt to compress the image into one containing fewer pixels.
- There is a need to determine the "features" of the image (compressed or raw).
  - This is done by "segmentation": breaking the image up into a set of homogeneous rectangular regions called segments.
- There is a need to support "match" operations that compare either a whole image or a segmented image against another.
6 Image Compression
- Lossy compression: an image may contain details that the human eye cannot recognize, so we can get rid of those details.
  - DCT (Discrete Cosine Transform)
  - DFT (Discrete Fourier Transform)
  - DWT (Discrete Wavelet Transform)
- These transforms convert images from the time (spatial) domain to the frequency domain; we then get rid of the frequencies that do not carry information.
- DCT and DFT are similar concepts:
  - Both map from the time domain to the frequency domain.
  - Given a signal of length n, these transforms return a sequence of n frequencies:
    Sample_1, Sample_2, ..., Sample_n  transforms to  Freq_1, Freq_2, ..., Freq_n.
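A small sketch of the "n samples in, n frequencies out" idea using NumPy's FFT; keeping only the k largest-magnitude coefficients is an illustrative compression strategy, not the scheme used by any particular standard.

```python
import numpy as np

n = 8
signal = np.random.rand(n)            # Sample_1 ... Sample_n

freqs = np.fft.fft(signal)            # DFT: Freq_1 ... Freq_n (complex coefficients)

# Illustrative "compression": zero out all but the k largest-magnitude coefficients.
k = 4
small = np.argsort(np.abs(freqs))[:n - k]
freqs[small] = 0

approx = np.real(np.fft.ifft(freqs))  # inverse DFT gives an approximation of the signal
print(np.max(np.abs(signal - approx)))
```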
7 Why do we use the transform
- Noise removal is easier in the frequency domain.
- Various filters are easier to implement in the frequency domain.
- Compression (the transform gathers similar values together).
8 Desirable Properties of Transforms
- DFT
  - Invertibility: it is possible to get back the original image I from its DFT representation (useful for decompression).
    Note: practical implementations often combine the DFT with other, non-invertible operations and thus sacrifice invertibility.
  - Distance preservation: the DFT preserves Euclidean distance.
    This is important in image matching applications, where we often use distance measures to represent similarity levels.
- DCT
  - The DCT preserves all of the above, and a given signal can be represented with fewer frequencies.
- DWT
  - The DFT and DCT have no temporal locality: a change in one single part of the data changes all frequencies.
  - Wavelets introduce locality.
9 Distance preservation
10 Distance preservation
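A small numerical check of the distance-preservation property claimed for the DFT (Parseval's theorem); the orthonormal ("ortho") scaling of NumPy's FFT is assumed here so that the transform is unitary.

```python
import numpy as np

# With an orthonormal DFT, the Euclidean distance between two signals equals
# the Euclidean distance between their DFT coefficient vectors.
x = np.random.rand(64)
y = np.random.rand(64)

X = np.fft.fft(x, norm="ortho")
Y = np.fft.fft(y, norm="ortho")

d_time = np.linalg.norm(x - y)
d_freq = np.linalg.norm(X - Y)    # complex vectors: norm over real and imaginary parts

print(d_time, d_freq)             # the two distances agree up to floating-point error
```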
11 Fractal Compression
- Transform-based approaches benefit from the difference in visual perception at different frequencies.
- What else can we use for compression? Self-similarity:
  - We can find self-similarities in a given image and describe the image in terms of these similarities.
12 Fractal Compression
13 Image Processing: Segmentation
- Segmentation is the process of taking an image as input and cutting it up into disjoint homogeneous regions.
- Connected region: R is a connected region if its cells can be ordered as C_1, ..., C_n such that the Euclidean distance between C_i and C_{i+1} is 1 for all i < n.
- Example (from a 4 x 4 grid figure with regions R1, R2, R3 marked):
  - R1, R2, and R3 are each connected
  - R1 ∪ R2 is connected
  - R2 ∪ R3 is connected
  - R1 ∪ R2 ∪ R3 is connected
  - R1 ∪ R3 is not connected, because the Euclidean distance between (2,3) and (3,4) is √2 > 1
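A minimal sketch of the connectivity test implied by this definition; the cell coordinates used for R1 and R3 below are hypothetical, since the grid figure itself was not preserved.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def is_connected(region):
    """True iff every cell in the region can be reached from any other cell
    through steps of Euclidean distance 1 (i.e. edge-adjacent cells)."""
    cells = set(region)
    if not cells:
        return True
    seen = {next(iter(cells))}
    frontier = list(seen)
    while frontier:
        c = frontier.pop()
        for other in cells:
            if other not in seen and dist(c, other) == 1:
                seen.add(other)
                frontier.append(other)
    return seen == cells

# Hypothetical coordinates (the original grid figure was not preserved):
R1 = {(1, 1), (1, 2), (2, 1), (2, 2)}
R3 = {(3, 4), (4, 4)}
print(is_connected(R1), is_connected(R1 | R3))   # True False
```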
14 Measuring Homogeneity
- A homogeneity predicate is a function H that takes any connected region as input and returns either true or false.
- Example 1: Suppose α is some real number between 0 and 1. H^α_bw(R) is true if over (100·α)% of the cells in R have the same color.

  Region   # black cells   # white cells
  R1       800             200
  R2       900             100
  R3       100             900

  Region   H^0.8_bw   H^0.89_bw   H^0.92_bw
  R1       true       false       false
  R2       true       true        false
  R3       true       true        false
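A sketch of the H_bw predicate; the symbol α and the use of >= (so that R1 at exactly 80% comes out true, matching the table above) are assumptions.

```python
from collections import Counter

def h_bw(region_colors, alpha):
    """H^alpha_bw: true iff at least (100*alpha)% of the cells in the region share one color.
    region_colors is a list of per-cell colors, e.g. ['b', 'b', 'w', ...]."""
    most_common_count = Counter(region_colors).most_common(1)[0][1]
    return most_common_count / len(region_colors) >= alpha  # >= chosen so the result matches the slide's table

# R1 from the slide: 800 black cells and 200 white cells
r1 = ["b"] * 800 + ["w"] * 200
print(h_bw(r1, 0.8), h_bw(r1, 0.89), h_bw(r1, 0.92))        # True False False
```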
15 Measuring Homogeneity
- Example 2: Suppose each cell has a real value in [0, 1], called its bw-level, and suppose f assigns a value between 0 and 1 to each cell.
- Assume δ is the noise factor and t a threshold.
- H_{δ,f,t}(R) is true if the cells (x,y) in R satisfying |bwlevel(x,y) − f(x,y)| ≤ δ make up at least a fraction t of R.
16 Segmentation
Given an image I with (m x n) cells, a segmentation of I w.r.t. a homogeneity predicate H is a set of regions R1, ..., Rk such that:
- Ri ∩ Rj = ∅ for all 1 ≤ i < j ≤ k (the regions are pairwise disjoint)
- I = R1 ∪ ... ∪ Rk (the regions cover the image)
- H(Ri) = true for all 1 ≤ i ≤ k (each region is homogeneous)
- for all distinct i, j with 1 ≤ i, j ≤ k such that Ri ∪ Rj is a connected region, H(Ri ∪ Rj) = false (no two adjacent regions can be merged)
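A sketch of a checker for the four conditions above, assuming each region is a set of (row, col) cells; it reuses an is_connected test like the one sketched earlier, and H is passed in rather than fixed.

```python
def is_valid_segmentation(image_cells, regions, H, is_connected):
    """image_cells: set of (row, col) tuples; regions: list of sets of cells;
    H: homogeneity predicate; is_connected: connectivity test."""
    pairs = [(i, j) for i in range(len(regions)) for j in range(i + 1, len(regions))]
    # 1. regions are pairwise disjoint
    disjoint = all(not (regions[i] & regions[j]) for i, j in pairs)
    # 2. the regions together cover the whole image
    covers = set().union(*regions) == set(image_cells)
    # 3. every region is homogeneous
    homogeneous = all(H(r) for r in regions)
    # 4. no two regions whose union is connected are jointly homogeneous
    maximal = all(not (is_connected(regions[i] | regions[j]) and H(regions[i] | regions[j]))
                  for i, j in pairs)
    return disjoint and covers and homogeneous and maximal
```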
17 An Example of Segmentation
For H_{dyn,0.03}(R), the following (4 x 4) image will yield the following segmentation:

  Row/Col   1      2      3      4
  1         0.1    0.25   0.5    0.5
  2         0.05   0.30   0.6    0.6
  3         0.35   0.30   0.55   0.8
  4         0.6    0.63   0.85   0.90

  R1 = {(1,1),(1,2)}
  R2 = {(1,3),(2,1),(2,2),(2,3)}
  R3 = {(3,1),(3,2),(3,3),(4,1),(4,2)}
  R4 = {(3,4),(4,3),(4,4)}
  R5 = {(1,4),(2,4)}
18 Segmentation Algorithm
- Split: if the whole image is homogeneous, we are done. Otherwise, split the image into two parts and recursively repeat this process until we find a set of regions R1, ..., Rn such that each region is homogeneous.
- Merge: check whether any of the Ri's can be merged together; at the end of this step, we obtain a valid segmentation R1, ..., Rk.
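A rough sketch of the split-and-merge idea; halving the bounding rectangle along its longer side and stopping at single cells are implementation choices the slide does not specify, and H and is_connected are the predicates sketched earlier.

```python
def split(cells, H):
    """Split phase: if a block of cells is homogeneous keep it, otherwise halve its
    bounding rectangle along the longer side and recurse (cells: set of (row, col))."""
    if H(cells) or len(cells) == 1:
        return [cells]
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    if max(rows) - min(rows) >= max(cols) - min(cols):
        mid = (max(rows) + min(rows)) // 2
        a = {(r, c) for r, c in cells if r <= mid}
    else:
        mid = (max(cols) + min(cols)) // 2
        a = {(r, c) for r, c in cells if c <= mid}
    b = cells - a
    return split(a, H) + split(b, H)

def merge(regions, H, is_connected):
    """Merge phase: repeatedly merge pairs of regions whose union is connected and homogeneous."""
    regions = list(regions)
    changed = True
    while changed:
        changed = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                union = regions[i] | regions[j]
                if is_connected(union) and H(union):
                    regions[i] = union
                    del regions[j]
                    changed = True
                    break
            if changed:
                break
    return regions
```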
19 Similarity Based Retrieval
20 Similarity Based Retrieval
21 Similarity Based Retrieval
- The metric approach:
  - Uses a distance measure d that can compare two images; the smaller the distance, the more similar they are.
  - I.e., given an input image I, find the "nearest neighbor" of I in the image archive.
- The transformation approach:
  - The metric approach assumes that the notion of similarity is fixed, whereas the transformation approach computes the cost of transforming one image into another based on user-specified cost functions that may vary from one query to another.
22 The Metric Approach
- We define a distance function d on a k-dimensional space (k = n+2). The distance function satisfies the following properties:
  - d(x,y) = d(y,x)
  - d(x,y) ≤ d(x,z) + d(z,y)
  - d(x,x) = 0
- Example: Let the image object consist of (256 x 256) cells with 3 attributes (red, green, blue), each of which assumes a value from {0, ..., 7}.
  d(o1, o2) = Σ_{i,j} (diff_r[i,j] + diff_g[i,j] + diff_b[i,j]), where
    diff_r[i,j] = (o1[i,j].red − o2[i,j].red)²
    diff_g[i,j] = (o1[i,j].green − o2[i,j].green)²
    diff_b[i,j] = (o1[i,j].blue − o2[i,j].blue)²
- Such computations can be cumbersome (65536 expressions being computed inside the sum).
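A sketch of the example distance on synthetic data; storing the red/green/blue attributes along the last axis of a NumPy array is an assumption made for convenience.

```python
import numpy as np

# o1 and o2 are 256 x 256 cell grids with (red, green, blue) values in {0, ..., 7}.
rng = np.random.default_rng(0)
o1 = rng.integers(0, 8, size=(256, 256, 3))
o2 = rng.integers(0, 8, size=(256, 256, 3))

# d(o1, o2): sum over all 65536 cells of the squared per-channel differences
d = np.sum((o1 - o2) ** 2)
print(d)
```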
23 The Metric Approach
- How can this massive similarity computation be avoided? Through feature extraction!
- Use a good feature extraction function fe to map objects into single points in an s-dimensional space, where s is typically quite small compared to n+2.
- This leads to two reductions:
  - An object is originally a set of points in an (n+2)-dimensional space; in contrast, fe(o) is a single point.
  - fe(o) is a point in an s-dimensional space, where s << n+2.
- The feature extraction mapping must preserve the distance relationships of the original space.
- Diagram: objects in (n+2)-dimensional space are mapped into the s-dimensional space, fed to an indexing algorithm, and the resulting index (which could be a quadtree or R-tree for s-dimensional data) points into the object repository.
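A toy feature extraction function fe: the slides do not fix a particular fe, so a normalized gray-level histogram is used here purely to illustrate mapping an object to a single s-dimensional point.

```python
import numpy as np

def fe(image, s=8):
    """Map a (rows x cols x 3) image with channel values in {0,...,7}
    to a single point in an s-dimensional space via a gray-level histogram."""
    gray = image.mean(axis=2)                         # collapse red/green/blue per cell
    hist, _ = np.histogram(gray, bins=s, range=(0, 7))
    return hist / hist.sum()                          # normalized s-dimensional feature vector

rng = np.random.default_rng(1)
img = rng.integers(0, 8, size=(256, 256, 3))
print(fe(img))                                        # one point in 8-dimensional space
```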
24 Searching
- Finding the best matches: find the nearest neighbors of fe(o) in the tree using a nearest-neighbor search technique.
- Finding sufficiently similar objects: execute a range query in the tree with center fe(o) and radius r.
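A sketch of both search modes using a k-d tree (SciPy's cKDTree) as a stand-in for the quadtree/R-tree mentioned earlier; the feature vectors and radius are synthetic.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
features = rng.random((1000, 8))       # fe(o) for 1000 archived objects, s = 8
tree = cKDTree(features)

query = rng.random(8)                  # fe of the query image

# Finding the best matches: the 5 nearest neighbors of fe(o)
dists, idx = tree.query(query, k=5)

# Finding sufficiently similar objects: a range query with center fe(o) and radius r
r = 0.3
within_r = tree.query_ball_point(query, r)
print(idx, within_r)
```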
25 The Transformation Approach
- The main principle: the level of dissimilarity between o1 and o2 is proportional to the cost of transforming o1 into o2, or vice versa.
- Transformation operators:
  - translation
  - rotation
  - scaling (uniform and non-uniform)
  - excision
- A transformation of o into o' is a sequence of transformation operations TS = (to_1, to_2, ..., to_r) such that to_1(o) = o_1, to_2(o_1) = o_2, ..., to_r(o_{r-1}) = o'.
- Cost of the transformation: cost(TS) = Σ_i cost(to_i).
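A minimal sketch of a transformation sequence and its cost; the TransformationOp structure and any concrete costs are hypothetical, since the slides leave the operators and cost functions user-specified.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class TransformationOp:
    name: str                        # e.g. "translation", "rotation", "scaling", "excision"
    apply: Callable[[Any], Any]      # the operation itself
    cost: float                      # user-specified cost of applying this operator

def apply_sequence(o: Any, ts: List[TransformationOp]) -> Any:
    """Apply to_1, ..., to_r in order: to_1(o) = o_1, to_2(o_1) = o_2, ..., to_r(o_{r-1}) = o'."""
    for op in ts:
        o = op.apply(o)
    return o

def transformation_cost(ts: List[TransformationOp]) -> float:
    """cost(TS) = sum of the costs of the individual operations in the sequence."""
    return sum(op.cost for op in ts)
```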
26 Example
27 Example
28 Example
29 Transformation vs. Metric
- Advantages of the transformation model:
  - The user can set up his own notion of similarity by specifying certain transformation operators.
  - The user may associate a cost function with each transformation operator.
- Advantages of the metric model:
  - By forcing the user to use only one similarity metric, the system can facilitate the indexing of data so as to optimize nearest-neighbor search.