Multimedia Database Schema Design
Jianguo Huo
Outline
MMDB Design Issues
Multimedia Data Types
Features and Similarity Functions
M-Dependencies
Normalization
Evaluation
MMDB Design Issues
Requirements for the MMDB
–Representation, storage, interpretation, composition, retrieval and delivery of diverse data types
Data Model
Storage structure
Architecture
Retrieval algorithms
MMDB Schema Design
Building blocks
–Data type
–Relations
–Rows (tuples)
–Columns
–Similarity function and thresholds
–Dependencies
–MMDB schema
Knobs
–Data types, relations, similarity functions, thresholds
Data Types
Semantics of Multimedia Attributes
Why not BLOB?
Generalized Icon
–$(x_m, x_i)$
–Earcons, ticons, micons, vicons
–Multicons
Features and Similarity Functions
Feature and tuple comparison
–Equal vs. similar
Similarity Functions
–Distance functions
–Threshold
–Combination of similarity functions
Let $R(z_1{:}Z_1, \ldots, z_n{:}Z_n)$ be a relation, $\omega$ a tuple distance function on $R$, $t$ a maximum distance threshold, and $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ two tuples in $R$. We say that $x$ is similar within $t$ to $y$ with respect to $\omega$, denoted $x \cong_{\omega(t)} y$, iff $\omega(x, y) \le t$.
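The similarity test defined on this slide (a distance function compared against a maximum threshold) can be sketched in Python. This is a minimal illustration under stated assumptions, not the method used in the slides: the names `normalized_euclidean`, `max_abs_difference`, `weighted_combination`, and `similar_within` are hypothetical, feature values are assumed to be pre-scaled to [0, 1], and the convex weighted sum is only one possible way to combine similarity functions.

```python
from math import sqrt
from typing import Callable, Sequence

FeatureTuple = Sequence[float]
Distance = Callable[[FeatureTuple, FeatureTuple], float]


def _check_arity(x: FeatureTuple, y: FeatureTuple) -> None:
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("tuples must be non-empty and have the same arity")


def normalized_euclidean(x: FeatureTuple, y: FeatureTuple) -> float:
    """Hypothetical tuple distance; stays in [0, 1] if every feature is in [0, 1]."""
    _check_arity(x, y)
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def max_abs_difference(x: FeatureTuple, y: FeatureTuple) -> float:
    """Second hypothetical distance: largest per-feature difference."""
    _check_arity(x, y)
    return max(abs(a - b) for a, b in zip(x, y))


def weighted_combination(distances: Sequence[Distance],
                         weights: Sequence[float]) -> Distance:
    """Combine distance functions as a convex weighted sum (an assumption)."""
    if len(distances) != len(weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per distance, and weights must sum to 1")

    def combined(x: FeatureTuple, y: FeatureTuple) -> float:
        return sum(w * d(x, y) for d, w in zip(distances, weights))

    return combined


def similar_within(x: FeatureTuple, y: FeatureTuple,
                   dist: Distance, t: float) -> bool:
    """True iff x is similar within threshold t to y with respect to dist."""
    return dist(x, y) <= t
```

Raising the threshold makes more tuples count as similar, which is why thresholds appear among the design knobs alongside the similarity functions themselves.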
M-Dependencies
The benefits from exploiting dependencies
Classes of M-Dependencies
–Type-M functional dependency (MFD)
–Type-M multivalued dependency (MMD)
–Type-M join dependency (MJD)
–…technically
MFD: Let $R$ be a relation with attribute set $U$, and $X, Y \subseteq U$. $X_{g_1(t')} \rightarrow Y_{g_2(t'')}$ is a type-M functional dependency (MFD) relation if and only if, for any two tuples $t_1$ and $t_2$ in $R$ with $t_1[X] \cong_{g_1(t')} t_2[X]$, we also have $t_1[Y] \cong_{g_2(t'')} t_2[Y]$, where $g_1 \in TD(X)$ and $g_2 \in TD(Y)$, and $t', t'' \in [0,1]$ are thresholds.
MMD: Let $R$ be a multimedia relation with attribute set $U$, and $X, Y \subseteq U$. $X_{g_1(t')} \twoheadrightarrow Y_{g_2(t'')}[g_3(t''')]$ is a type-M multivalued dependency (MMD) relation if and only if, for any two tuples $t_1$ and $t_2$ in $R$ such that $t_1[X] \cong_{g_1(t')} t_2[X]$, there also exist in $R$ two tuples $t_3$ and $t_4$ with the following properties:
–$t_3[X], t_4[X] \in [t_1[X]]_{\cong_{g_1(t')}}$
–$t_3[Y] \cong_{g_2(t'')} t_1[Y]$ and $t_4[Y] \cong_{g_2(t'')} t_2[Y]$
–$t_3[R - (XY)] \cong_{g_3(t''')} t_2[R - (XY)]$ and $t_4[R - (XY)] \cong_{g_3(t''')} t_1[R - (XY)]$
where $g_1 \in TD(X)$, $g_2 \in TD(Y)$ and $g_3 \in TD(R - (XY))$, and $t', t'', t''' \in [0,1]$ are thresholds.
MJD: Let $R$ be a relation on $U$, and $\{X_1, \ldots, X_n\} \subseteq U$, with the union of the $X_i$'s being $U$. If $R = \Pi_{X_1, g_1(t_1)}(R) \bowtie_{g_1(t_1)} \Pi_{X_2, g_2(t_2)}(R) \bowtie_{g_2(t_2)} \cdots \bowtie_{g_{n-1}(t_{n-1})} \Pi_{X_n, I}(R)$, we say that $R$ satisfies a Type-M Join Dependency (MJD), denoted by $[g_1(t_1), \ldots, g_{n-1}(t_{n-1})][X_1, \ldots, X_n]$, where $g_i \in TD(X_i \cap X_{i+1})$ and $t_i \in [0,1]$ for each $1 \le i \le n-1$.
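The MFD definition on this slide can be checked by brute force on a small relation instance: every pair of tuples that is similar on X must also be similar on Y. The sketch below is illustrative only. The names `project`, `max_abs_difference`, and `satisfies_mfd`, the dictionary-based row layout, and the attribute names `fingerprint` and `photo` are all hypothetical, and the distance functions are assumed to be symmetric so that each unordered pair needs to be tested only once.

```python
from itertools import combinations
from typing import Callable, Mapping, Sequence, Tuple

Row = Mapping[str, float]
Distance = Callable[[Tuple[float, ...], Tuple[float, ...]], float]


def project(row: Row, attrs: Sequence[str]) -> Tuple[float, ...]:
    """Projection of a row onto the attribute list attrs."""
    return tuple(row[a] for a in attrs)


def max_abs_difference(x: Tuple[float, ...], y: Tuple[float, ...]) -> float:
    """Hypothetical symmetric distance on feature tuples."""
    return max(abs(a - b) for a, b in zip(x, y))


def satisfies_mfd(rows: Sequence[Row],
                  X: Sequence[str], Y: Sequence[str],
                  g1: Distance, t1: float,
                  g2: Distance, t2: float) -> bool:
    """Check the MFD X -> Y with distances g1, g2 and thresholds t1, t2."""
    for r, s in combinations(rows, 2):
        similar_on_x = g1(project(r, X), project(s, X)) <= t1
        similar_on_y = g2(project(r, Y), project(s, Y)) <= t2
        if similar_on_x and not similar_on_y:
            return False  # r and s form a counterexample
    return True


# Made-up feature values, scaled to [0, 1] for illustration only.
rows = [
    {"fingerprint": 0.10, "photo": 0.50},
    {"fingerprint": 0.12, "photo": 0.52},
    {"fingerprint": 0.90, "photo": 0.10},
]
holds_loose = satisfies_mfd(rows, ["fingerprint"], ["photo"],
                            max_abs_difference, 0.05,
                            max_abs_difference, 0.05)
holds_tight = satisfies_mfd(rows, ["fingerprint"], ["photo"],
                            max_abs_difference, 0.05,
                            max_abs_difference, 0.01)
```

With these values the dependency holds when the threshold on Y is 0.05 and fails when it is tightened to 0.01, which shows that whether an MFD holds depends on the chosen thresholds as well as on the data. The check is quadratic in the number of rows, so it is suitable only for small instances.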
Normal Forms
Dependency-based design practice
Benefits
Types of Normal Forms
–1MNF
–2MNF
–3MNF
–4MNF
–5MNF
–…technically
1MNF: We say that a multimedia database schema is in first multimedia normal form (1MNF) if each attribute $A$ has the type of number, string or elementary generalized icon.
2MNF: We say that a multimedia database schema is in second multimedia normal form (2MNF) if it is in 1MNF and each nonprime attribute $A$ is fully dependent on the primary key.
3MNF: We say that a multimedia database schema $R$ is in third multimedia normal form (3MNF) if it is in 2MNF and the nonprime attributes are not mutually dependent.
4MNF: We say that a multimedia database schema $R$ is in fourth multimedia normal form (4MNF) with respect to a set of multimedia dependencies $D$ if, for every nontrivial MMD $X_{g_1(t_1)} \twoheadrightarrow Y_{g_2(t_2)}[g_3(t_3)]$ in $D^+$, $X$ is a superkey for $R$.
5MNF: We say that $R$ is in fifth multimedia normal form (5MNF) with respect to a set $D$ of MFDs, MMDs, and MJDs if, for every nontrivial type-M join dependency $[g_1(t_1), \ldots, g_{n-1}(t_{n-1})][X_1, \ldots, X_n]$ in $D^+$, every $X_i$ is a superkey for $R$.
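Of the normal forms listed here, 1MNF is the simplest to test mechanically because it constrains attribute types only. The following sketch is a rough illustration under loud assumptions: the schema is represented as a hypothetical mapping from attribute name to a type label, and the set of labels treated as "elementary generalized icons" (`earcon`, `ticon`, `micon`, `vicon`) is an assumption made for this example, with `multicon` assumed to be a composite type. The names `ELEMENTARY_TYPES`, `violations_of_1mnf`, and `is_in_1mnf` are likewise hypothetical.

```python
from typing import List, Mapping

# ASSUMPTION: which type labels count as elementary is chosen for illustration.
ELEMENTARY_TYPES = frozenset(
    {"number", "string", "earcon", "ticon", "micon", "vicon"}
)


def violations_of_1mnf(schema: Mapping[str, str]) -> List[str]:
    """Return the attributes whose type is not number, string, or an
    (assumed) elementary generalized icon, in sorted order."""
    return sorted(
        attr for attr, type_label in schema.items()
        if type_label not in ELEMENTARY_TYPES
    )


def is_in_1mnf(schema: Mapping[str, str]) -> bool:
    """A schema is in 1MNF when no attribute violates the type rule."""
    return not violations_of_1mnf(schema)


# Hypothetical schema: 'lecture' is a composite object, so it violates 1MNF.
schema = {
    "name": "string",
    "age": "number",
    "voice_sample": "earcon",
    "lecture": "multicon",
}
offending = violations_of_1mnf(schema)
```

A normalization tool would typically repair such a schema by decomposing each offending composite attribute into elementary ones. The higher normal forms additionally require reasoning over the dependency closure $D^+$, which this sketch does not attempt.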
Term Project: MMDB Refactoring
Project Goal
To implement a usable tool to enable automatic database transformation
–Improve the user interface
–Improve the algorithm