Spring 2008MMDB--MIRS1 Multimedia Information Retrieval Systems
MIRS Architecture
An MIRS architecture should be flexible and extensible so that it can support diverse applications, query types, and contents (features).
An MIRS consists of functional modules, which can be added to extend functionality, deleted, or replaced.
Another characteristic of MIRSs is that they are normally distributed, consisting of servers and clients. This results from the large size of multimedia data.
MIRS Architecture (figure)
Components shown: User Interface, Feature Extractor, Communication Manager, Indexing and Search Engine, Storage Manager.
MIRS Architecture
User interface: handles insertion of new multimedia items and retrieval. These items can be stored files or input from various devices.
Feature extractor: the contents or features of multimedia items are extracted either automatically or semi-automatically.
Communication managers: these features and the original items are sent to the server or servers.
Indexing and retrieval engine: at the server(s), the features are organized according to a certain indexing scheme for efficient retrieval.
Storage manager: the indexing information and the original items are stored appropriately.
Multimedia Databases
MMDB Architectures
The Principle of Autonomy: each media type (e.g. image, video, etc.) is organized in a media-specific manner suitable for that media type.
The Principle of Uniformity: the content of all the different media objects is represented within a single data structure (“unified index”).
The Principle of Hybrid Organization: certain media types use their own indexes, while others use the “unified” index.
Autonomy
Uniformity
Hybrid Organization
Specialized Indexing Structures
Multidimensional Data Structures
A geographic information system (GIS) stores information about some physical region of the world.
A map is viewed simply as a 2-d image, and certain “points” on the map are considered to be of interest.
Specialized data structures:
k-d Trees
Point Quadtrees
MX-Quadtrees
R-Trees
k-d Trees
Used to store k-dimensional point data. A 2-d tree stores 2-dimensional point data, a 3-d tree stores 3-dimensional point data, and so on.
Node structure:
nodetype = record
    INFO: infotype;   // any user-defined type
    XVAL: real;       // x coordinate
    YVAL: real;       // y coordinate
    LLINK: nodetype;
    RLINK: nodetype;
end
2-d Tree
Def: A 2-d tree is a binary tree (with the root at level 0) satisfying the following conditions.
For a node N where level(N) is even: every node M under N.LLINK has M.XVAL < N.XVAL, and every node P under N.RLINK has P.XVAL ≥ N.XVAL.
For a node N where level(N) is odd: every node M under N.LLINK has M.YVAL < N.YVAL, and every node P under N.RLINK has P.YVAL ≥ N.YVAL.
Example: a map (figure)
Points on the map: A (19, 45), B (40, 50), C (38, 38), D (54, 40), E (4, 4).
Example 2-d Tree
INFO    (XVAL, YVAL)
A       (19, 45)
B       (40, 50)
C       (38, 38)
D       (54, 40)
E       (4, 4)
Figure: the 2-d tree for these points, spanning Level 0 to Level 3, with edges labelled LLINK and RLINK.
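The definition can be exercised with a minimal Python sketch. The names `Node`, `insert` and `search` are illustrative, not from the slides, and the points are assumed to be inserted in the order they appear in the table (A, B, C, D, E).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    info: str
    xval: float
    yval: float
    llink: Optional["Node"] = None
    rlink: Optional["Node"] = None


def insert(root, info, x, y, level=0):
    """Insert a point and return the (possibly new) subtree root."""
    if root is None:
        return Node(info, x, y)
    # Even levels discriminate on XVAL, odd levels on YVAL.
    if level % 2 == 0:
        go_left = x < root.xval
    else:
        go_left = y < root.yval
    if go_left:
        root.llink = insert(root.llink, info, x, y, level + 1)
    else:
        root.rlink = insert(root.rlink, info, x, y, level + 1)
    return root


def search(root, x, y):
    """Return the node storing (x, y), or None if it is absent."""
    level = 0
    while root is not None:
        if (root.xval, root.yval) == (x, y):
            return root
        key, split = (x, root.xval) if level % 2 == 0 else (y, root.yval)
        root = root.llink if key < split else root.rlink
        level += 1
    return None


# Assumed insertion order: the order of the table.
tree = None
for info, x, y in [("A", 19, 45), ("B", 40, 50), ("C", 38, 38),
                   ("D", 54, 40), ("E", 4, 4)]:
    tree = insert(tree, info, x, y)
```

Under that insertion order, A is the root, E hangs off A.LLINK, B off A.RLINK, C off B.LLINK and D off C.RLINK, which gives the four levels mentioned in the figure.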
Split Regions (figure)
The map divided into the regions induced by the points A (19, 45), B (40, 50), C (38, 38), D (54, 40), E (4, 4).
Deletion in 2-d Trees
Suppose T is a 2-d tree, and we wish to delete N(x, y).
If N is a leaf node, set the parent's link to N to NIL and delete N.
Otherwise:
Step 1: Find a “candidate replacement” node R that occurs in T_i for some i ∈ {ℓ, r}, where T_ℓ and T_r are the left and right subtrees under N.
Step 2: Replace all of N’s non-link fields by those of R.
Step 3: Recursively delete R from T_i.
Candidate Replacement Node
Taking the replacement node from the left subtree may violate the definition of a 2-d tree: if two nodes there share the maximal value, one of them stays in the left subtree with a value equal to, rather than less than, that of the new N.
So find the replacement node R in the right subtree (T_r). R must be such that every node M in T_r has M.XVAL ≥ R.XVAL if level(N) is even, and M.YVAL ≥ R.YVAL if level(N) is odd.
If T_r is not empty and level(N) is even (odd), then any node in T_r that has the smallest XVAL (YVAL) field in T_r is a candidate replacement node.
Candidate Replacement Node
If T_r is empty:
Find the node R’ in the left subtree T_ℓ with the smallest XVAL if level(N) is even, or the smallest YVAL if level(N) is odd.
Replace all of N’s non-link fields by those of R’.
Set N.RLINK = N.LLINK and set N.LLINK = NIL.
Recursively delete R’.
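The deletion steps above can be sketched as follows. This is a minimal illustration that assumes all stored points are distinct; `find_min` and `delete` are hypothetical helper names, and the example tree is the one obtained by inserting A to E in table order (an assumption about the figure).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    info: str
    xval: float
    yval: float
    llink: Optional["Node"] = None
    rlink: Optional["Node"] = None


def coord(n, axis):
    return n.xval if axis == 0 else n.yval


def find_min(t, axis, level):
    """Node with the smallest coordinate on `axis` in the subtree t rooted at `level`."""
    if t is None:
        return None
    candidates = [t, find_min(t.llink, axis, level + 1)]
    # When this level splits on `axis`, the right subtree holds only values >= t's,
    # so it needs to be searched only at levels that split on the other axis.
    if level % 2 != axis:
        candidates.append(find_min(t.rlink, axis, level + 1))
    return min((c for c in candidates if c is not None), key=lambda c: coord(c, axis))


def delete(t, x, y, level=0):
    """Delete point (x, y) from subtree t and return the new subtree root."""
    if t is None:
        return None
    axis = level % 2
    if (t.xval, t.yval) == (x, y):
        if t.llink is None and t.rlink is None:
            return None  # leaf: the parent's link becomes NIL
        if t.rlink is None:
            # T_r empty: N.RLINK = N.LLINK, N.LLINK = NIL, then take the minimum
            # of the moved subtree as the replacement R'.
            t.rlink, t.llink = t.llink, None
        r = find_min(t.rlink, axis, level + 1)
        t.info, t.xval, t.yval = r.info, r.xval, r.yval  # copy non-link fields
        t.rlink = delete(t.rlink, r.xval, r.yval, level + 1)
        return t
    key, split = (x, t.xval) if axis == 0 else (y, t.yval)
    if key < split:
        t.llink = delete(t.llink, x, y, level + 1)
    else:
        t.rlink = delete(t.rlink, x, y, level + 1)
    return t


# Example tree (assumed shape): A at the root, E left of A, B right of A,
# C left of B, D right of C.
tree = Node("A", 19, 45,
            llink=Node("E", 4, 4),
            rlink=Node("B", 40, 50,
                       llink=Node("C", 38, 38, rlink=Node("D", 54, 40))))
tree = delete(tree, 19, 45)  # delete A
```

Deleting A picks C, the node with the smallest XVAL in the right subtree, as the replacement; C's old position is then filled by D in the recursive step.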
Range Queries in 2-d Trees
Find all the points within distance r of the point (x_c, y_c).
Each node N in a 2-d tree implicitly represents a region R_N.
If the circle specified in a query has no intersection with R_N, then there is no point in searching the subtree rooted at node N.
Example of Range Query (figure)
Points: A (19, 45), B (40, 50), C (38, 38), D (54, 40), E (4, 4).
Query: all points within distance r = 5 of the point (51, 43).
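The pruning rule for range queries can be sketched as below. This is one possible implementation, not the slides' own: the region R_N is passed down as explicit bounds, `circle_hits_region` and `range_query` are hypothetical names, and the tree shape assumes the points were inserted in the order A to E.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    info: str
    xval: float
    yval: float
    llink: Optional["Node"] = None
    rlink: Optional["Node"] = None


def circle_hits_region(cx, cy, r, xlb, xub, ylb, yub):
    """True if the query circle intersects the axis-aligned region."""
    nx = min(max(cx, xlb), xub)  # point of the region closest to the centre
    ny = min(max(cy, ylb), yub)
    return (nx - cx) ** 2 + (ny - cy) ** 2 <= r * r


def range_query(n, cx, cy, r, level=0,
                xlb=-math.inf, xub=math.inf, ylb=-math.inf, yub=math.inf,
                out=None):
    """Collect INFO of every point within distance r of (cx, cy)."""
    out = [] if out is None else out
    if n is None or not circle_hits_region(cx, cy, r, xlb, xub, ylb, yub):
        return out  # the circle misses R_N, so skip the whole subtree
    if (n.xval - cx) ** 2 + (n.yval - cy) ** 2 <= r * r:
        out.append(n.info)
    if level % 2 == 0:  # N splits its region on x
        range_query(n.llink, cx, cy, r, level + 1, xlb, n.xval, ylb, yub, out)
        range_query(n.rlink, cx, cy, r, level + 1, n.xval, xub, ylb, yub, out)
    else:               # N splits its region on y
        range_query(n.llink, cx, cy, r, level + 1, xlb, xub, ylb, n.yval, out)
        range_query(n.rlink, cx, cy, r, level + 1, xlb, xub, n.yval, yub, out)
    return out


# Assumed tree shape for the example points.
tree = Node("A", 19, 45,
            llink=Node("E", 4, 4),
            rlink=Node("B", 40, 50,
                       llink=Node("C", 38, 38, rlink=Node("D", 54, 40))))
result = range_query(tree, 51, 43, 5)
```

For the query above, the subtree holding E is pruned without being visited, and D is the only point returned, since its squared distance to (51, 43) is 9 + 9 = 18, which is at most 25.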
k-d Trees for k ≥ 2
For each node N, suppose level(N) mod k = i.
For each node M in N’s left subtree, M.VAL[i] < N.VAL[i].
For each node P in N’s right subtree, P.VAL[i] ≥ N.VAL[i].
Note: when k = 1, we get a standard binary search tree.
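A short sketch of the general rule, assuming each node stores its coordinates as a tuple `val` of length k. `KDNode`, `kd_insert` and `inorder` are illustrative names only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KDNode:
    info: str
    val: Tuple[float, ...]
    llink: Optional["KDNode"] = None
    rlink: Optional["KDNode"] = None


def kd_insert(root, info, val, level=0):
    """Insert a k-dimensional point; k is taken from len(val)."""
    if root is None:
        return KDNode(info, tuple(val))
    i = level % len(val)  # discriminating coordinate at this level
    if val[i] < root.val[i]:
        root.llink = kd_insert(root.llink, info, val, level + 1)
    else:
        root.rlink = kd_insert(root.rlink, info, val, level + 1)
    return root


def inorder(n):
    """Left-root-right traversal, returning the stored coordinate tuples."""
    return [] if n is None else inorder(n.llink) + [n.val] + inorder(n.rlink)


# k = 1: the structure behaves like a binary search tree.
bst = None
for v in [5, 2, 8, 1]:
    bst = kd_insert(bst, str(v), (v,))

# k = 3: levels cycle through coordinates 0, 1, 2.
t3 = None
for name, v in [("p", (5, 5, 5)), ("q", (6, 1, 9)), ("r", (6, 3, 0)), ("s", (6, 3, 7))]:
    t3 = kd_insert(t3, name, v)
```

With k = 1 every level compares the same single coordinate, so an in-order traversal of `bst` yields the keys in sorted order.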
Point Quadtrees
Point quadtrees split regions into four parts. These four parts are called the NW (northwest), SW (southwest), NE (northeast), and SE (southeast) quadrants determined by node N.
Node structure in a point quadtree:
qtnodetype = record
    INFO: infotype;
    XVAL: real;
    YVAL: real;
    XLB, XUB, YLB, YUB: real ∪ {+∞, −∞};
    NW, SW, NE, SE: qtnodetype;
end
Point Quadtrees
XLB and XUB are the lower and upper bounds on X, and YLB and YUB are the lower and upper bounds on Y.
Suppose N is the root node of a quadtree or of one of its subtrees:
Every node P1 in N.NW is such that P1.XVAL < N.XVAL and P1.YVAL ≥ N.YVAL.
Every node P2 in N.SW is such that P2.XVAL < N.XVAL and P2.YVAL < N.YVAL.
Every node P3 in N.NE is such that P3.XVAL ≥ N.XVAL and P3.YVAL ≥ N.YVAL.
Every node P4 in N.SE is such that P4.XVAL ≥ N.XVAL and P4.YVAL < N.YVAL.
Nodes in a Point Quadtree (figure)
Example of Point Quadtree
INFO    (XVAL, YVAL)
A       (19, 45)
B       (40, 50)
C       (38, 38)
D       (54, 40)
E       (4, 4)
Figure: the point quadtree for these points. Each node has NW, SW, NE, and SE links, and A (19, 45) is the root.
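The quadrant conditions translate into a small insertion routine. This sketch omits the XLB/XUB/YLB/YUB fields for brevity and assumes the points are inserted in table order; `QTNode`, `quadrant` and `insert` are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QTNode:
    info: str
    xval: float
    yval: float
    nw: Optional["QTNode"] = None
    sw: Optional["QTNode"] = None
    ne: Optional["QTNode"] = None
    se: Optional["QTNode"] = None


def quadrant(n, x, y):
    """Name of the quadrant of node n that point (x, y) falls into."""
    if x < n.xval:
        return "nw" if y >= n.yval else "sw"
    return "ne" if y >= n.yval else "se"


def insert(root, info, x, y):
    """Insert a point and return the (possibly new) subtree root."""
    if root is None:
        return QTNode(info, x, y)
    q = quadrant(root, x, y)
    setattr(root, q, insert(getattr(root, q), info, x, y))
    return root


# Assumed insertion order: the order of the table.
tree = None
for info, x, y in [("A", 19, 45), ("B", 40, 50), ("C", 38, 38),
                   ("D", 54, 40), ("E", 4, 4)]:
    tree = insert(tree, info, x, y)
```

With this order, B lands in A.NE, C in A.SE, E in A.SW, and D in the NE quadrant of C.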
Split Regions (figure)
The map divided into quadrants by the points A (19, 45), B (40, 50), C (38, 38), D (54, 40), E (4, 4).
Deletion in Point Quadtrees
Let N be the node to be deleted.
If N is a leaf node, set the link from its parent to N to NIL and release node N.
Otherwise, try to find a replacement node R in one of N's subtrees such that every other node R’ in N.NW (N.SW / N.NE / N.SE) is to the northwest (southwest / northeast / southeast) of R.
Can’t Find Replacement Node
Unfortunately, a replacement node may not exist (e.g. when deleting A).
Thus, in the worst case, deletion of an interior node N may require reinsertion of some of the nodes in the subtrees pointed to by N.NE, N.SE, N.NW, and N.SW.
Range Queries in Point Quadtrees
Do not search regions that do not intersect the circle defined by the query.
Search procedure:
proc ProcRangeQueryPQtree(T: newqtnodetype, C: circle)
    // stop exploring this subtree if it is empty or its region misses the circle
    if T = NIL or region(T) ∩ C = Ø then return
    else
        if (T.XVAL, T.YVAL) ∈ C then print (T.XVAL, T.YVAL);
        ProcRangeQueryPQtree(T.NW, C);
        ProcRangeQueryPQtree(T.SW, C);
        ProcRangeQueryPQtree(T.NE, C);
        ProcRangeQueryPQtree(T.SE, C);
end proc
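A runnable counterpart of the search procedure, as a sketch. It assumes the bound fields XLB, XUB, YLB, YUB are filled in at insertion time from the parent's bounds, which the slides do not spell out, and it collects results in a list instead of printing them. The function names are illustrative.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class QTNode:
    info: str
    xval: float
    yval: float
    xlb: float = -math.inf
    xub: float = math.inf
    ylb: float = -math.inf
    yub: float = math.inf
    nw: Optional["QTNode"] = None
    sw: Optional["QTNode"] = None
    ne: Optional["QTNode"] = None
    se: Optional["QTNode"] = None


def insert(root, info, x, y, bounds=(-math.inf, math.inf, -math.inf, math.inf)):
    """Insert a point; a new node's region is the quadrant of its parent it fell into."""
    if root is None:
        return QTNode(info, x, y, *bounds)
    xlb, xub, ylb, yub = root.xlb, root.xub, root.ylb, root.yub
    if x < root.xval:
        xub, ew = root.xval, "w"
    else:
        xlb, ew = root.xval, "e"
    if y < root.yval:
        yub, ns = root.yval, "s"
    else:
        ylb, ns = root.yval, "n"
    q = ns + ew
    setattr(root, q, insert(getattr(root, q), info, x, y, (xlb, xub, ylb, yub)))
    return root


def range_query(t, cx, cy, r, out=None):
    """Collect INFO of every point within distance r of (cx, cy)."""
    out = [] if out is None else out
    if t is None:
        return out
    nx = min(max(cx, t.xlb), t.xub)  # point of region(T) closest to the centre
    ny = min(max(cy, t.ylb), t.yub)
    if (nx - cx) ** 2 + (ny - cy) ** 2 > r * r:
        return out  # region(T) does not intersect the circle
    if (t.xval - cx) ** 2 + (t.yval - cy) ** 2 <= r * r:
        out.append(t.info)
    for child in (t.nw, t.sw, t.ne, t.se):
        range_query(child, cx, cy, r, out)
    return out


tree = None
for info, x, y in [("A", 19, 45), ("B", 40, 50), ("C", 38, 38),
                   ("D", 54, 40), ("E", 4, 4)]:
    tree = insert(tree, info, x, y)
result = range_query(tree, 51, 43, 5)
```

Run on the example points with the circle centred at (51, 43) and r = 5, the SW quadrant of A (containing E) is skipped entirely and only D is reported.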
MX-Quadtree
MX-quadtrees attempt to ensure that the shape of the tree is independent of the number of nodes present in the tree, as well as of the order of insertion of these nodes.
MX-quadtrees also attempt to provide efficient deletion and search algorithms.
The map being represented is “split up” into a grid of size 2^k × 2^k for some k.
All data points are stored at leaf nodes.
Node structure: exactly the same as for point quadtrees.
Example of MX-quadtree
INFO    (XVAL, YVAL)
A       (1, 3)
B       (3, 3)
C       (3, 1)
D       (3, 0)
Figure: the grid with points A, B, C, D, and the corresponding MX-quadtree in which each node has NW, SW, NE, and SE links and the points appear at the leaf level.
Deletion in MX-Quadtrees
Deletion in an MX-quadtree is a fairly simple operation, because all points are represented at the leaf level.
When a node is deleted, check whether its parent node is left with no child nodes. If so, delete the parent as well, and continue upward recursively.
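Insertion and deletion in an MX-quadtree can be sketched as follows. This is a simplified illustration: it assumes integer grid coordinates in the range 0 to 2^k − 1, and it keeps children in a dictionary keyed by quadrant name instead of the four link fields and bounds of the node structure. All names are hypothetical.

```python
class MXNode:
    """Simplified MX-quadtree node with children keyed by quadrant name."""

    def __init__(self):
        self.children = {}  # "NW" / "SW" / "NE" / "SE" -> MXNode
        self.info = None    # payload, set only on leaf-level nodes


def _descend(x, y, xlb, ylb, half):
    """Pick the quadrant of (x, y); return (name, new xlb, new ylb)."""
    east, north = x >= xlb + half, y >= ylb + half
    name = ("N" if north else "S") + ("E" if east else "W")
    return name, xlb + (half if east else 0), ylb + (half if north else 0)


def insert(root, k, info, x, y):
    """Store info at the leaf for cell (x, y); the path always has k links."""
    node, xlb, ylb, size = root, 0, 0, 2 ** k
    while size > 1:
        size //= 2
        q, xlb, ylb = _descend(x, y, xlb, ylb, size)
        node = node.children.setdefault(q, MXNode())
    node.info = info


def delete(root, k, x, y):
    """Remove the leaf for (x, y), then prune ancestors left without children."""
    path, node, xlb, ylb, size = [], root, 0, 0, 2 ** k
    while size > 1:
        size //= 2
        q, xlb, ylb = _descend(x, y, xlb, ylb, size)
        if q not in node.children:
            return False  # point not present
        path.append((node, q))
        node = node.children[q]
    for parent, q in reversed(path):
        del parent.children[q]
        if parent.children:  # parent still has other children: stop pruning
            break
    return True


# The example points on a 4 x 4 grid (k = 2).
root = MXNode()
for info, x, y in [("A", 1, 3), ("B", 3, 3), ("C", 3, 1), ("D", 3, 0)]:
    insert(root, 2, info, x, y)
```

Because every point sits exactly k links below the root, the tree's shape depends only on which grid cells are occupied, not on the insertion order.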
Range Queries in MX-Quadtrees
Handled in exactly the same way as for point quadtrees, with two differences:
The content of the XLB, XUB, YLB, and YUB fields is different from that in the case of point quadtrees.
As points are stored at the leaf level, checking whether a point is in the circle defined by the range query needs to be performed only at the leaf level.
R-Trees
Used to store rectangular regions of an image or a map.
R-trees are particularly useful for storing very large amounts of data on disk. They provide a convenient way of minimizing the number of disk accesses.
Each R-tree has an associated order, which is an integer k.
Each nonleaf R-tree node contains a set of at most k rectangles and at least k/2 rectangles.
Each disk access brings back a page containing at least k/2 rectangles.
Example: a map (figure)
Rectangles R1 to R9 on the map, enclosed by the group rectangles G1, G2, and G3.
Example R-Tree
R-tree of order 4:
Root: G1, G2, G3
G1: R1, R2, R3
G2: R4, R5, R6, R7
G3: R8, R9
Structure of an R-tree node:
rtnodetype = record
    Rec_1, …, Rec_k: rectangle;
    P_1, …, P_k: rtnodetype;
end
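To show how the node structure limits the pages visited, here is a minimal search sketch. The rectangle coordinates are invented for illustration (the figure's real coordinates are not available), the `names` field is an addition for readable output, and only part of the example tree is built.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class RTNode:
    recs: List[Rect] = field(default_factory=list)                # Rec_1 .. Rec_k
    ptrs: List[Optional["RTNode"]] = field(default_factory=list)  # P_1 .. P_k, None in leaves
    names: List[str] = field(default_factory=list)                # labels, for illustration only


def intersects(a, b):
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, query, visited):
    """Return labels of stored rectangles intersecting `query`; record visited nodes."""
    visited.append(node)
    hits = []
    for rec, child, name in zip(node.recs, node.ptrs, node.names):
        if not intersects(rec, query):
            continue  # nothing below this rectangle can match
        if child is None:
            hits.append(name)
        else:
            hits.extend(search(child, query, visited))
    return hits


# Hypothetical coordinates: two leaves and a root holding their bounding rectangles.
leaf1 = RTNode([(0, 0, 2, 2), (3, 0, 5, 2)], [None, None], ["R1", "R2"])
leaf3 = RTNode([(10, 10, 12, 12), (13, 10, 15, 12)], [None, None], ["R8", "R9"])
root = RTNode([(0, 0, 5, 2), (10, 10, 15, 12)], [leaf1, leaf3], ["G1", "G3"])

visited = []
hits = search(root, (4, 1, 6, 3), visited)
```

Here only the root and the G1 leaf are read; the G3 leaf is never touched because its bounding rectangle does not meet the query, which is the sense in which an R-tree reduces disk accesses.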
Insertion into an R-Tree (figure)
The example map with rectangles R1 to R9 and groups G1, G2, G3, plus two new rectangles R10 and R11.
Option 1 (figure)
The example map with R10 and R11 added, using only the groups G1, G2, and G3.
Option 2: Preferred (figure)
The example map with R10 and R11 added and a new group G4.
The total area of the group rectangles is smallest.
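The selection criterion (smallest total area of the group rectangles, among groupings where no group holds fewer than k/2 or more than k rectangles) can be sketched as below. The coordinates and the three candidate groupings are invented for illustration and do not reproduce the figure; the function names are hypothetical.

```python
def mbr(rects):
    """Minimum bounding rectangle of a list of (x1, y1, x2, y2) rectangles."""
    x1s, y1s, x2s, y2s = zip(*rects)
    return (min(x1s), min(y1s), max(x2s), max(y2s))


def area(rect):
    x1, y1, x2, y2 = rect
    return (x2 - x1) * (y2 - y1)


def is_valid(grouping, k):
    """Every group must hold at least k/2 and at most k rectangles."""
    return all(k <= 2 * len(g) and len(g) <= k for g in grouping)


def total_area(grouping):
    """Sum of the areas of the group (bounding) rectangles."""
    return sum(area(mbr(g)) for g in grouping)


def best_grouping(candidates, k):
    """Among the valid candidate groupings, pick the one with the smallest total area."""
    valid = [g for g in candidates if is_valid(g, k)]
    return min(valid, key=total_area)


# Hypothetical rectangles: two clusters plus one rectangle e between them.
a, b = (0, 0, 1, 1), (1, 0, 2, 1)
c, d = (10, 10, 11, 11), (11, 10, 12, 11)
e = (10, 0, 11, 1)

option_1 = [[a, b, e], [c, d]]    # total area 11 + 2 = 13
option_2 = [[a, b], [c, d, e]]    # total area 2 + 22 = 24
option_3 = [[a, b], [c, d], [e]]  # group [e] underflows for k = 4

choice = best_grouping([option_1, option_2, option_3], k=4)
```

Note that the underflowing candidate would have the smallest total area of all, which is why the validity check has to come before the area comparison.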
Option 3: Incorrect (figure)
The example map with R10 and R11 added and a new group G4.
G4 underflows.
Deletion in R-Trees
Deletion of objects from an R-tree may cause a node in the R-tree to “underflow”, that is, to contain fewer than k/2 rectangles. Deleting R9 from our example is such a case.
If we delete R9, we must create a new logical grouping. One possibility is as follows:
Root: G1, G2, G3
G1: R1, R2, R3
G2: R4, R6, R7
G3: R5, R8