Object Oriented Database Management
Outline Motivation Embedding SQL in host language Object Data Model Persistent Programming Languages Object Query Language Object-orientation in SQL
Motivation of ODBMSs Complex objects in emerging DBMS applications cannot be effectively represented as records in relational model. Representing information in RDBMSs requires complex and inefficient conversion into and from the relational model to the application programming language ODBMSs provide a direct representation of objects to DBMSs overcoming the impedance mismatch problem Application data structures Relational representation RDBMS Copy and translation Transparent ODBMS data transfer
Embedded SQL Access to database from a general purpose programming language required since: –Not all queries can be expressed in SQL --e.g., recursive queries cannot be written in SQL. –Non declarative actions -- e.g., printing reports cannot be done from SQL. General purpose language in which SQL is embedded called host language. SQL structures permitted in host language called embedded SQL. SQL+ C.o file SQL library object code pre- compiler loader SQL library calls + C C compiler Embedded SQL Compilation
Embedded SQL SQL commands embedded in the host programming language Data exchanged between host language and DBMS using cursors SQL query passed from host language to DBMS which computes the answer set A cursor can be viewed as a pointer into the answer set DBMS returns the cursor to the programming language Programming language can use the cursor to get a record at a time access to materialized answer.
:dname = “toy”; raise = 0.1; EXEC SQL SELECT dnum into :dnum FROM Department WHERE dname= :dname; EXEC SQL DECLARE Emp CURSOR FOR SELECT * FROM Employee WHERE dno = :dnum FOR UPDATE EXEC SQL OPEN Emp; EXEC SQL FETCH Emp INTO :E.ssn, :E.dno, :E.name, :E.sal; while (SQLCODE == 0) { EXEC SQL UPDATE WHERE CURRENT OF CURSOR SET sal = sal * (1 + ::raise); EXEC SQL FETCH Emp INTO :E.ssn, :E.dno, :E.name, :E.sal; } EXEC SQL CLOSE CURSOR Emp /* SQL embedded in C to read the list of employees who work for the toy department and give them a 10 percent raise */ Example of Embedded SQL
Object Oriented Database Management Object Oriented databases have evolved along two different paths: Persistent Object Oriented Programming Languages: (pure ODBMSs) –Start with an OO language (e.g., C++, Java, SMALLTALK) which has a rich type system –Add persistence to the objects in programming language where persistent objects stored in databases Object Relational Database Management Systems (SQL3 Systems) –Extend relational DBMSs with the rich type system and user-defined functions. –Provide a convenient path for users of relational DBMSs to migrate to OO technology –All major vendors (e.g., Informix, Oracle) will/are supporting features of SQL3.
Object Database Management Group (ODMG) Special interest group to develop standards that allow ODBMS customers to write portable applications Standards include: –Object Model –Object Specification Languages Object Definition Language (ODL) for schema definition Object Interchange Format (OIF) to exchange objects between databases –Object Query Language declarative language to query and update database objects –Language Bindings (C++, Java, Smalltalk) Object manipulation language Mechanisms to invoke OQL from language Procedures for operation on databases and transactions
Object Model Object: –observable entity in the world being modeled –similar to concept to entity in the E/R model An object consists of: –attributes: properties built in from primitive types –relationships: properties whose type is a reference to some other object or a collection of references –methods: functions that may be applied to the object.
Class Similar objects with the same set of properties and describing similar real-world concepts are collected into a class. Class definition: interface Employee { attribute string name; attribute integer salary; attribute date date-of-birth; attribute integer empid; relationship Projects works-for inverse Projects::team; age-typeage(); } Interface Projects{ attributestring name; attribute integer projid; relationship Employee team inverse Emplolyee works-for; int number-of-employees(); }
Class Extents For each ODL class, an extent may be declared. Extent is the current set of objects belonging to the class. –Similar notion to the relation in the relational model. –Queries in OQL refer to the extent of a class and not the class directly. interface Employee (extent Emp-set) { attribute string name; attribute integer salary; attribute date date-of-birth; attribute integer empid; relationship Projects works-for inverse Projects::team; age-typeage();}
Subclasses and Inheritance A class can be declared to be a subclass of another class. Subclasses inherit all the properties –attributes –relationships –methods from the superclass. Interface Married-Employee: Employees { string spouse-name; } Substitutability: any method of superclass can be invoked over objects of any subclass (code reuse)
Class Hierarchy person employee student staff faculty student assistant grad undergrad RATA
Multiple Inheritance A class may have more than one superclass. A class inherits properties fromeach of its superclasses. There is a potential of ambiguity -- variable with same name inherited from two superclasses: –flag and error –rename variable –choose one
Object Identity Each object has an identity which it maintains even if some or all of its attributes change. Object identity is a stronger notion of identity than in relational DBMSs. Identity in relational DBMSs is value based (primary key). Identity in ODBMSs built into data model –no user specified identifier is required OID is a similar notion as pointer in programming language Object identifier (OID) can be stored as attribute in object to refer to another object. References to other objects via their OIDs can result in a containment hierarchy Note: containment hierarchy different from class hierarchy
Containment Hierarchy bicycle wheelbrakegear frame rim spoke tire leverpad Links in containment hierarchy should be read as is-part-of instead of is-a
Persistence Objects created may have different lifetimes: –transient: allocated memory managed by the programming language run-time system. E.g., local variables in procedures have a lifetime of a procedure execution global variables have a lifetime of a program execution –persistent: allocated memory and stored managed by ODBMS runtime system. Classes are declared to be persistence-capable or transient. Different languages have different mechanisms to make objects persistent: –creation time: Object declared persistent at creation time (e.g., in C++ binding) (class must be persistent-capable) –persistence by reachability: object is persistent if it can be reached from a persistent object (e.g., in Java binding) (class must be persistent- capable).
Persistent Object-Oriented Programming Languages Persistent objects are stored in the database and accessed from the programming language. Classes declared in ODL mapped to the programming language type system (ODL binding). Single programming language for applications as well as data management. –Avoid having to translate data to and from application programming language and DBMS efficient implementation less code –Programmer does not need to write explicit code to fetch data to and from database persistent objects to programmer looks exactly the same as transient objects. System automatically brings the objects to and from memory to storage device. (pointer swizzling).
Disadvantages of ODBMS Approach Low protection –since persistent objects manipulated from applications directly, more changes that errors in applications can violate data integrity. Non-declarative interface: –difficult to optimize queries –difficult to express queries But ….. –Most ODBMSs offer a declarative query language OQL to overcome the problem. –OQL is very similar to SQL and can be optimized effectively. –OQL can be invoked from inside ODBMS programming language. –Objects can be manipulated both within OQL and programming language without explicitly transferring values between the two languages. –OQL embedding maintains simplicity of ODBMS programming language interface and yet provides declarative access.
OQL Example interface Employee { attribute string name; relationship setof(Projects) works-for inverse Projects::team; } Interface Projects{ attributestring name; relationship setof(Employee) team inverse Emplolyee works-for; int number-of-employees(); } Select number-of-employees() From Employee e, e.works-for where name = “sharad” Find number of employees working on each project “sharad” works on
Migration of RDBMSs towards OO Technologies SQL3 standard incorporates OO concepts in the relational model. A row in a table considered as an object SQL3 allows a type to be declared for tuples (similar to class in ODBMSs) Relations are collection of tuples of a row type (similar to extent in ODBMSs) Rows in a relation can refer to each other using a reference type (similar to object identity in ODBMSs) A reference can be dereferenced to navigate among tables Attributes in a relation can belong to abstract data types Methods and functions (expressed in SQL as well as host programming language) can be associated with abstract data types
SQL-3 Example CREATE ROW TYPE Employee-type { name CHAR(30) works-for REF(Projects-type) } CREATE ROW TYPE Projects-type { name CHAR(30) team setof(REF(Employee-type)) } CREATE TABLE Emp OF TYPE Employee-type CREATE TABLE Project of TYPE Project-type Select works-for --> name From Emp Where name = ‘sharad’ Return name of the project sharad works for
OQL CMSC-461 Database Management Systems
OQL -- Motivation Relational languages suffer from impedance mismatch when we try to connect them to conventional languages like C or C++. – The data models of C and SQL are radically different, e.g. C does not have relations, sets, or bags as primitive types; C is tuple-at-a-time, SQL is relation-at-a-time.
OQL -- Motivation (II) OQL is an attempt by the OO community to extend languages like C++ with SQL-like, relation-at-a-time dictions. OQL is query language paired with schema-definition language ODL.
OQL Types Basic types: strings, ints, reals, etc., plus class names. Type constructors: – Struct for structures. – Collection types: set, bag, list, array. Like ODL, but no limit on the number of times we can apply a type constructor. Set(Struct()) and Bag(Struct()) play special roles akin to relations.
OQL Uses ODL as its Schema-Definition Portion For every class we can declare an extent = name for the current set of objects of the class. – Remember to refer to the extent, not the class name, in queries.
Example interface Bar (extent Bars) { attribute string name; attribute string addr; relationship Set beersSold inverse Sell::bar; }
Example (II) interface Beer (extent Beers) { attribute string name; attribute string manf; relationship Set soldBy inverse Sell::beer; }
Example (III) interface Sell (extent Sells) { attribute float price; relationship Bar bar inverse Bar::beersSold; relationship Beer beer inverse Beer::soldBy; }
Path Expressions Let x be an object of class C. If a is an attribute of C, then x.a = the value of a in the x object. If r is a relationship of C, then x.r = the value to which x is connected by r. – Could be an object or a collection of objects, depending on the type of r. If m is a method of C, then x.m (...) is the result of applying m to x.
Examples Let s be a variable whose type is Sell. s.price = the price in the object s. s.bar.addr = the address of the bar mentioned in s. – Note: cascade of dots OK because s.bar is an object, not a collection.
Example of Illegal Use of Dot b.beersSold.price, where b is a Bar object. Why illegal? Because b.beersSold is a set of objects, not a single object.
OQL Select-From-Where SELECT FROM WHERE
OQL Select-From-Where (II) Collections in FROM can be: 1. Extents. 2. Expressions that evaluate to a collection. Following a collection is a name for a typical member, optionally preceded by AS.
Example Get the menu at Joe's. SELECT s.beer.name, s.price FROM Sells s WHERE s.bar.name = "Joe's Bar" Notice double-quoted strings in OQL. Result is of type Bag(Struct(name: string, price: float))
Example Another way to get Joe's menu, this time focusing on the Bar objects. SELECT s.beer.name, s.price FROM Bars b, b.beersSold s WHERE b.name = "Joe's Bar" Notice that the typical object b in the first collection of FROM is used to help define the second collection. – Typical usage: if x.a is an object, you can extend the path expression; if x.a is a collection, you use it in the FROM list.
Tailoring the Type of the Result Default: bag of structs, field names taken from the ends of path names in SELECT clause. Example SELECT s.beer.name, s.price FROM Bars b, b.beersSold s WHERE b.name = "Joe's Bar" has result type: Bag(Struct( name: string, price: real))
Rename Fields Prefix the path with the desired name and a colon. Example SELECT beer: s.beer.name, s.price FROM Bars b, b.beersSold s WHERE b.name = "Joe's Bar"
Change the Collection Type Use SELECT DISTINCT to get a set of structs.
Example SELECT DISTINCT s.beer.name, s.price FROM Bars b, b.beersSold s WHERE b.name = "Joe's Bar" Use ORDER BY clause to get a list of structs.
Example joeMenu = SELECT s.beer.name, s.price FROM Bars b, b.beersSold s WHERE b.name = "Joe's Bar" ORDER BY s.price ASC ASC = ascending (default); DESC = descending. We can extract from a list as if it were an array, e.g. cheapest = joeMenu[1].name;
Subqueries Used mainly in FROM clauses and with quantifiers EXISTS and FORALL.
Example: Subquery in FROM Find the manufacturers of the beers served at Joe's. SELECT b.manf FROM ( SELECT s.beer FROM Sells s WHERE s.bar.name = "Joe's Bar" ) b
Quantifiers Boolean-valued expressions for use in WHERE-clauses. FOR ALL x IN : EXISTS x IN : The expression has value TRUE if the condition is true for all (resp. at least one) elements of the collection.
Example Find all bars that sell some beer for more than $5. SELECT b.name FROM Bars b WHERE EXISTS s IN b.beersSold : s.price > 5.00 Problem How would you find the bars that only sold beers for more than $5?
Example Find the bars such that the only beers they sell for more than $5 are manufactured by Pete's. SELECT b.name FROM Bars b WHERE FOR ALL be IN ( SELECT s.beer FROM b.beersSold s WHERE s.price > 5.00 ) : be.manf = "Pete's"
Extraction of Collection Elements a) A collection with a single member: Extract the member with ELEMENT.
Example Find the price Joe charges for Bud and put the result in a variable p. p = ELEMENT( SELECT s.price FROM Sells s WHERE s.bar.name = "Joe's Bar" AND s.beer.name = "Bud" )
Extraction of Collection Elements (II) b) Extracting all elements of a collection, one at a time: – 1. Turn the collection into a list. – 2. Extract elements of a list with [i].
Example Print Joe's menu, in order of price, with beers of the same price listed alphabetically.
Example (II) L = SELECT s.beer.name, s.price FROM Sells s WHERE s.bar.name = "Joe's Bar" ORDER BY s.price, s.beer.name; printf("Beer\tPrice\n\n"); for(I = 1; I <= COUNT(L); i++) printf("%s\t%f\n", L[i].name, L[i].price );
Aggregation The five operators avg, min, max, sum, count apply to any collection, as long as the operators make sense for the element type.
Example Find the average price of beer at Joe's. x = AVG( SELECT s.price FROM Sells s WHERE s.bar.name = "Joe's Bar" ); Note coercion: result of SELECT is technically a bag of 1-field structs, which is identified with the bag of the values of that field.
Grouping Recall SQL grouping, for example: SELECT bar, AVG(price) FROM Sells GROUP BY bar; Is the bar value the "name" of the group, or the common value for the bar component of all tuples in the group?
Grouping (II) In SQL it doesn't matter, but in OQL, you can create groups from the values of any function(s), not just attributes. – Thus, groups are identified by common values, not \name." – Example: group by first letter of bar names (method needed).
Outline of OQL Group-By Collection Defined by FROM, WHERE Collection with function values and partition Output collection Group by values of function(s) Terms from SELECT clause
Example l Find the average price of beer at each bar. SELECT barName, avgPrice: AVG( SELECT p.s.price FROM partition p ) FROM Sells s GROUP BY barName: s.bar.name
Example (II) l 1. Initial collection = Sells. »But technically, it is a bag of structs of the form Struct(s: s1) Where s1 is a Sells object. Note, the lone field is named s; in general, there are fields for all of the tuple variables in the FROM clause.
Example (II) l 2. Intermediate collection: »One function: s.bar.name maps Sells objects s to the value of the name of the bar referred to by s. »Collection is a set of structs of type: Struct{barName: string, partition: Set }
Example (III) »For example: Struct(barName = "Joe's Bar", partition = {s 1,…, s n }) where s 1,…, s n are all the structs with one field, named s, whose value is one of the Sells objects that represent Joe's Bar selling some beer.
Example (IV) l 3. Output collection: consists of beer- average price pairs, one for each struct in the intermediate collection. »Type of structures in the output: Struct{barName: string, avgPrice: real}
Example (V) »Note that in the subquery of the SELECT clause: SELECT barName, avgPrice: AVG( SELECT p.s.price FROM partition p ) We let p range over all structs in partition. Each of these structs contains a single field named s and has a Sells object as its value. Thus, p.s.price extracts the price from one of the Sells tuples. »Typical output struct: Struct(barName = "Joe's Bar", avgPrice = 2.83)
Another, Less Typical Example l Find, for each beer, the number of bars that charge a "low" price ( 2.00) and a "high" price ( 4.00) for that beer. l Strategy: group by three things: »1. The beer name, »2. A boolean function that is true iff the price is low. »3. A boolean function that is true iff the price is high.
The Query l SELECT beerName, low, high, count: COUNT(partition) FROM Beers b, b.soldBy s GROUP BY beerName: b.name, low: s.price = 4.00
The Query (II) l 1. Initial collection: Pairs (b; s), where b is a beer, and s is a Sells object representing the sale of that beer at some bar. »Type of collection members: Struct{b: Beer, s: Sell}
2. Intermediate collection l Quadruples consisting of a beer name, booleans telling whether this group is for high, low, or neither prices for that beer, and the partition for that group. l The partition is a set of structs of the type: Struct{b: Beer, s: Sell} A typical value: Struct(b: "Bud" object, s: a Sells object involving Bud)
2. Intermediate collection (II) »Type of quadruples in the intermediate collection: Struct{ beerName: string, low: boolean, high: boolean, partition: Set }
2. Intermediate collection (III) »BeerName low high partition »Bud TRUE FALSE S low Bud FALSE TRUE S high Bud FALSE FALSE S mid »where S low S high, and S mid are the sets of beer-sells pairs (b; s) where the beer is Bud and s has, respectively, a low ( 2:00), high ( 4:00) and medium (between 2.00 and 4.00) price. » Note the partition with low = high = TRUE must be empty and will not appear.
3. Output collection: l The first three components of each group's struct are copied to the output, and the last (partition) is counted. The result: beerName low high count Bud TRUE FALSE 27 Bud FALSE TRUE 14 Bud FALSE FALSE 36
SQL3 Objects
Objects in SQL3 OQL extends C++ with database concepts, while SQL3 extends SQL with OO concepts.
Objects in SQL3 (II) Ullman's personal opinion: the relation is so fundamental to data manipulation that retaining it as the core, as SQL3 does, is "right." Systems using the SQL3 philosophy are called object-relational.
Objects in SQL3 (III) – All the major relational vendors have something of this kind, allowing any class to become the type of a column. InformixData Blades OracleCartridges SybasePlug-Ins IBM/DB2Extenders
Two Levels of SQL3 Objects 1. For tuples of relations = "row types." 2. For columns of relations = "types." – But row types can also be used as column types.
References Row types can have references. If T is a row type, then REF(T) is the type of a reference to a T object. Unlike OO systems, refs are values that can be seen by queries.
Example of Row Types CREATE ROW TYPE BarType ( name CHAR(20) UNIQUE, addr CHAR(20) ); CREATE ROW TYPE BeerType ( name CHAR(20) UNIQUE, manf CHAR(20) );
Example of Row Types (II) CREATE ROW TYPE MenuType ( bar REF(BarType), beer REF(BeerType), price FLOAT );
Creating Tables Row-type declarations do not create tables. – They are used in place of element lists in CREATE TABLE statements. Example – CREATE TABLE Bars OF TYPE BarType – CREATE TABLE Beers OF TYPE BeerType – CREATE TABLE Sells OF TYPE MenuType
Dereferencing A B = the B attribute of the object referred to by reference A. Example – Find the beers served by Joe. SELECT beer -> name FROM Sells WHERE bar -> name = 'Joe''s Bar';
OID's as Values A row type can have a reference to itself. – Serves as the OID for tuples of that type. Example CREATE ROW TYPE BarType ( name CHAR(20), addr CHAR(20), barID REF(BarType) ); CREATE TABLE Bars OF TYPE BarType VALUES FOR barID ARE SYSTEM GENERATED
OID's as Values (II) VALUES... clause forces the barID of each tuple to refer to the tuple itself. NameaddrbarID Joe'sMaple St.
Example: Using References as Values Find the menu at Joe's. SELECT Sells.beer->name, Sells.price FROM Bars, Sells WHERE Bars.name = 'Joe''s Bar' AND Bars.barID = Sells.bar;
ADT's in SQL3 Allows a column of a relation to have a type that is a "class," including methods. Intended application: data that doesn't fit relational model well, e.g., locations, signals, images, etc. The type itself is usually a multi-attribute tuple.
ADT's in SQL3 (II) Type declaration: CREATE TYPE ( attributes method declarations or definitions ); Methods defined in a PL/SQL-like language.
Example CREATE TYPE BeerADT ( name CHAR(20), manf CHAR(20), FUNCTION newBeer( :n CHAR(20), :m CHAR(20)) RETURNS BeerADT; :b BeerADT; /* local decl. */ BEGIN :b := BeerADT(); /* built-in constructor */ :b.name := :n; :b.manf := :m; RETURN :b; END; FUNCTION getMinPrice(:b BeerADT) RETURNS FLOAT; );
Example (II) getMinPrice is declaration only; newBeer is definition. getMinPrice must be defined somewhere where relation Sells is available.
Example (III) FUNCTION getMinPrice(:b BeerADT) RETURNS FLOAT; :p FLOAT; BEGIN SELECT MIN(price) INTO :p FROM Sells WHERE beer->name = :b.name; RETURN :p; END;
Built-In Comparison Functions We can define for each ADT two functions EQUAL and LESSTHAN that allow values of this ADT to participate in WHERE clauses involving =, <=, etc.
Example: A " Point" ADT CREATE TYPE Point ( x FLOAT, y FLOAT, FUNCTION EQUALS( :p Point, :q Point ) RETURNS BOOLEAN; BEGIN IF :p.x = :q.x AND :p.y = :q.y THEN RETURN TRUE ELSE RETURN FALSE; END;
Example: A " Point" ADT (II) FUNCTION LESSTHAN( :p Point, :q Point ) RETURNS BOOLEAN; BEGIN IF :p.x > :q.x THEN RETURN FALSE ELSIF :p.x < :q.x THEN IF :p.y <= :q.y THEN RETURN TRUE ELSE RETURN FALSE ELSE /* :p.x = :q.x IF :p.y < :q.y THEN RETURN TRUE ELSE RETURN FALSE END; );
Using the Comparison Functions Here is a query that computes the lower convex hull of a set of points. Assumes MyPoints(p) is a relation with a single column p of type Point. – SELECT p FROM MyPoints WHERE NOT p > ANY MyPoints;
Using the Comparison Functions (II)