Multimedia Information Retrieval System Architecture

Multimedia Information Retrieval Systems MMDB--MIRS

MIRS Architecture • Should be flexible and extensible • to support diverse applications, query types and contents (features). • MIRSs consist • functional modules －added to extend, deleted or replaced. • Another characteristic of MIRSs is that they are normally distributed • consisting of servers and clients • results from the large size of multimedia data MMDB--MIRS

User Interface Feature Extractor Communication Manager Indexing and Search Engine Storage Manager MIRS Architecture Figure MMDB--MIRS

MIRS architecture • user interface • insertion of new multimedia items and retrieval. • These items can be stored files or input from variant devices • feature extractor • The contents or features of multimedia items are extracted either automatically or semi-automatically • communication managers • These features and the original items are sent to the server or servers • indexing and retrieval engine • At the server(s), the features are organized according to a certain indexing scheme for efficient retrieval • storage manager • The indexing information and the original items are stored appropriately. MMDB--MIRS

Multimedia Databases MMDB

MMDB Architectures • The Principle of Autonomy: each media type (e.g. image, video, etc.) is organized in a media-specific manner suitable for that media type. • The Principle of Uniformity: we represent the content of all the different media objects within a single data structure (“unified index”). • The Principle of Hybrid Organization: certain media types use their own indexes, while others use the “unified” index. MMDB

Autonomy MMDB

Uniformity MMDB

Hybrid Organization MMDB

Specialized Indexing Structures MMDB--Indexing

Multidimensional Data Structures • A geographic information system (GIS) stores information about some physical region of the world. • A map is just viewed as a 2-d image, and certain “points” on the map are considered to be interest. • Specialized data structures: • k-d Trees • Point Quadtrees • MX-Quadtrees • R-Trees MMDB--Indexing

k-D Trees • Used to store k dimensional point data. • A 2-d tree stores 2-dimensional point data; a 3-d tree stores 3-dimensional point data, and so on. • Node structure: nodetype = record INFO: infotype; // any user-defined type XVAL: real; // coordinate YVAL: real; // coordinate LLINK: nodetype; RLINK: nodetype; end MMDB--Indexing

2-d Tree Def: A 2-d tree is a binary tree satisfying the following condition. (with root a level 0) • For node N with level(N) is even, then every node M under N.LLINK has the property that M.XVAL < N.XVAL, and every node P under N.RLINK has the property that P.XVAL  N.XVAL. • For node N with level(N) is odd, then every node M under N.LLINK has the property that M.YVAL < N.YVAL, and every node P under N.RLINK has the property that P.YVAL  N.YVAL. MMDB--Indexing

B (40, 50) A (19, 45) D (54, 40) C (38, 38) E (4, 4) Example: a map MMDB--Indexing

A(19, 45) Level 0 LLINK RLINK E(4, 4) B(40, 50) Level 1 C(38, 38) Level 2 D(54, 40) Level 3 Example 2-d Tree INFO (XVAL, YVAL) A (19, 45) B (40, 50) C (38, 38) D (54, 40) E (4, 4) MMDB--Indexing

B (40, 50) A (19, 45) D (54, 40) C (38, 38) E (4, 4) Split Regions MMDB--Indexing

Deletion in 2-d Trees Suppose T is a 2-d tree, and we wish to delete N(x, y). • If N is a leaf node, then set link to NIL and delete N. • Otherwise, Step 1: Find a “candidate replacement” node R that occurs either in Ti for i {, r}. (Ti under N) Step 2: Replace all of N’s non-link fields by those of R. Step 3: Recursively delete R from Ti. MMDB--Indexing

Candidate Replacement Node • If we find a replacement node from left subtree, it might be violated the definition of 2-d tree. (Two nodes have equal maximal values.) • Find the replacement node R from right subtree (Tr). • Every node M in Tr is such that M.XVAL  R.XVAL if level(N) is even, and M.YVAL  R.YVAL if level(N) is odd. • If Tr is not empty, and level(N) is even (odd), then any node in Tr that has the smallest possible XVAL (YVAL) field in Tr is a candidate replacement node. MMDB--Indexing

Candidate Replacement Node • If Tr is empty – • Find the node R’ in left subtree T with the smallest XVAL if level(N) is even, or the smallest YVAL if level(N) is odd. • Replace all of N’s nonlink fields by those of R’. • Set N.RLINK = N.LLINK and set N.LLINK = NIL. • Recursively delete R’. MMDB--Indexing

Range Queries in 2-d Trees • Find all the points within distance r of point (xc, yc). • Each node N in a 2-d tree implicitly represents a region RN. • If the circle specified in a query has no intersection with RN, then there is no point searching the subtree rooted at node N. MMDB--Indexing

B (40, 50) A (19, 45) (51, 43) r=5 D (54, 40) C (38, 38) E (4, 4) Example of Range Query Query: point (51, 43) r = 5 MMDB--Indexing

k-d Tree for k 2 For each node N • Suppose level(N)mod k = i. • For each node M in N’s left subtree, M.VAL[i] < N.VAL[i]. • For each node P in N’s right subtree, P.VAL[i]  N.VAL[i]. Note: when k=1, we get a standard binary search tree. MMDB--Indexing

Point Quadtrees • Point quardtrees split regions into four parts. • These four parts are called the NW (northwest), SW (southwest), NE (northeast), and SE (southeast) quadrants determined by node N. • Node structure in a point quadtree: qtnodetype = record INFO: infotype; XVAL: real; YVAL: real; XLB, XUB, YLB, YUB: real  {+, - } NW, SW, NE, SE: qtnodetype; end MMDB--Indexing

Point Quadtrees • XLB, XUB, YLB, and YUB are upper bounds and lower bounds of X and Y, respectively. • Suppose N is a root node of a tree or subtree of quadtree: • Every node P1 in N.NW is such that P1.XVAL < N.XVAL and P1.YVAL  N.YVAL. • Every node P2 in N.SW is such that P2.XVAL < N.XVAL and P2.YVAL < N.YVAL. • Every node P3 in N.NE is such that P3.XVAL  N.XVAL and P3.YVAL  N.YVAL. • Every node P4 in N.SE is such that P4.XVAL  N.XVAL and P4.YVAL < N.YVAL. MMDB--Indexing

Nodes in a Point Quadtree MMDB--Indexing

A (19, 45) NW SW NE SE E (4, 4) NW SW NE SE B (40, 50) NW SW NE SE C (38, 38) NW SW NE SE D (54, 40) NW SW NE SE Example of Point Quadtree INFO (XVAL, YVAL) A (19, 45) B (40, 50) C (38, 38) D (54, 40) E (4, 4) MMDB--Indexing

B (40, 50) A (19, 45) D (54, 40) C (38, 38) E (4, 4) Split Regions MMDB--Indexing

Deletion in Point Quadtrees A node N to be deleted -- • If N is a leaf node, the link from parent to N is set to NIL then releases node N. • Otherwise, try to find a replacement node R from subtrees such that: • Every other node R’ in N.NW (N.SW / N.NE / N.SE) is to the north west (south west / north east / south east) of R. MMDB--Indexing

Can’t Find Replacement Node • Unfortunately, replacement node may not be found. (i.e. deleting A) • Thus, in the worst case, deletion of an interior node N may require reinsertion of some nodes in the subtrees pointed to by N.NE, N.SE, N.NW, and N.SW. MMDB--Indexing

Range queries in Point Quadtrees • Do not search regions that do not intersect the circle defined by the query. • Search procedure: ProcRangeQueryPQtree(T:newqtnodetype, C:circle) if region(T)  C = Ø then Halt else if (T.XVAL, T.YVAL)  C then print (T.XVAL, T.YVAL); ProcRangeQueryPQtree(T.NW, C); ProcRangeQueryPQtree(T.SW, C); ProcRangeQueryPQtree(T.NE, C); ProcRangeQueryPQtree(T.SE, C); end proc MMDB--Indexing

MX-Quadtree • MX-quadtrees attempt to: ensure that the shape of the tree are independent of the number of nodes present in the tree, as well as the order of insertion of these nodes. • MX-quadtrees also attempt to provide efficient deletion and search algorithms. • The map being represented is “split up” into a grid of size (2k2k) for some k. • All physical data are performed at leaf nodes. • Node Structure: Exactly the same as for point quadtrees. MMDB--Indexing

NW SW NE SE NW SW NE SE NW SW NE SE NW SW NE SE A NW SW NE SE B NW SW NE SE C NW SW NE SE D NW SW NE SE Example of MX-quadtree A B C D INFO (XVAL, YVAL) A (1, 3) B (3, 3) C (3, 1) D (3, 0) MMDB--Indexing

Deletion in MX-Quadtrees • Deletion in an MX-quadtree is a fairly simple operation, because all points are represented at the leaf level. • When a node is deleted, we have to keep trace its parent node to see whether it contains no child node? If so, continue to delete the parent nodes recursively. MMDB--Indexing

Range Queries in MX-Quadtrees Handled in exactly the same way as for point quardtrees. But there are two differences: • The content of XLB, XUB, YLB, YUB, fields is different from that in the case of point quadtrees. • As points are stored at the leaf level, checking to see if a point is in the circle defined by the range query needs to be performed only at the leaf level. MMDB--Indexing

R-Trees • Used to store rectangular regions of an image or a map. • R-trees are particularly useful in storing very large amounts of data on disk. • They provide a convenient way of minimizing the number of disk accesses. • Each R-tree has an associated order, integer k. • Each nonleaf R-tree node contains a set of at most k rectangles and at least k/2 rectangles. • Each disk access brings back a page containing at least k/2 rectangles. MMDB--Indexing

Example: a map G1 G2 R5 R1 R6 R7 R4 R2 R3 G3 R8 R9 MMDB--Indexing

G1 G2 G3 R1 R2 R3 R4 R5 R6 R7 R8 R9 Example R-Tree R-Tree of order 4 Structure of R-tree node: rtnodetype = record Rec1, …, Reck: rectangle; P1, …, Pk: rtnodetype; end MMDB--Indexing

Insertion into a R-Tree G1 G2 R5 R1 R6 R7 R4 R2 R3 R10 G3 R11 R8 R9 MMDB--Indexing

Option 1 G1 G2 R5 R1 R6 R7 R4 R2 R3 R10 G3 R11 R8 R9 MMDB--Indexing

Option 2 : Preferred G1 G2 R5 R1 R6 R7 R4 R2 R3 R10 G3 R11 R8 G4 *** Total area of the group rectangles is smallest. R9 MMDB--Indexing

Option 3 : Incorrect G1 G2 R5 R1 R6 R7 R4 R2 R3 R10 G3 R11 R8 G4 *** G4 is underflow. R9 MMDB--Indexing

G1 G2 G3 R1 R2 R3 R4 R6 R7 R5 R8 Deletion in R-Trees • Deletion of objects from R-trees may cause a node in the R-tree to “underflow” in which contains less than k/2 rectangles. Such as delete R9 from our example. • If delete R9, we must create a new logical grouping. One possibility is as follows: MMDB--Indexing

Multimedia Information Retrieval System Architecture