1
CIS-552 Introduction
Object-Oriented Database
New Database Applications
Object-Oriented Data Models
Object-Oriented Languages
Persistent Programming Languages
Persistent C++ Systems
2
New Database Applications
Data models designed for data-processing-style applications are not adequate for newer application areas such as computer-aided design, computer-aided software engineering, multimedia and image databases, and document/hypertext databases.
These new applications require the database system to handle features such as:
– Complex data types
– Data encapsulation and abstract data structures
– Novel methods for indexing and querying
3
Object-Oriented Data Model
Loosely speaking, an object corresponds to an entity in the E-R model.
The object-oriented paradigm is based on encapsulating code and data related to an object into a single unit.
The object-oriented data model is a logical model (like the E-R model).
It is an adaptation of the object-oriented programming paradigm (e.g. Smalltalk, C++) to database systems.
4
Object Identity
An object retains its identity even if some or all of the values of its variables or the definitions of its methods change over time.
Object identity is a stronger notion of identity than the one found in programming languages or data models not based on object orientation. Forms of identity:
– Value – a data value is used for identity; used in relational systems.
– Name – a user-supplied name is used for identity; used for variables in procedures.
– Built-in – identity is built into the data model or programming language; no user-supplied identifier is required. This is the form of identity used in object-oriented systems.
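The difference between value-based and built-in identity can be sketched in plain C++. This is only an illustration: the Person type and both helper functions are hypothetical, and an in-memory address stands in for a built-in identity (it would not survive the end of the program, unlike a database OID).

```cpp
#include <cassert>
#include <string>

// Hypothetical record type used only for this illustration.
struct Person {
    std::string name;
    int age;
};

// Value-based comparison: two people are "the same" if all fields match.
bool same_value(const Person& a, const Person& b) {
    return a.name == b.name && a.age == b.age;
}

// Built-in identity: two references denote the same object only if they
// refer to the same storage, regardless of the field values.
bool same_identity(const Person& a, const Person& b) {
    return &a == &b;
}
```

Two Person objects with equal fields compare equal by value but not by identity, and an object keeps its identity after one of its fields is changed.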
5
Object Identifiers
Object identifiers are used to uniquely identify objects.
– Can be stored as a field of an object, to refer to another object.
– E.g., the spouse field of a person object may be an identifier of another person object.
– Can be system generated (created by the database) or external (such as a social-security number).
6
Object Containment
Each component in a design may contain other components.
This can be modeled as containment of objects. Objects containing other objects are called complex or composite objects.
Multiple levels of containment create a containment hierarchy: links are interpreted as is-part-of, not is-a.
Allows data to be viewed at different granularities by different users.
[Figure: containment hierarchy for a bicycle design. Node labels: bicycle; wheel, brake, frame, gear; rim, lever, cable, spokes, tire, pad.]
7
Object-Oriented Languages
Object-oriented concepts can be used as a design tool and then encoded into, for example, a relational database (analogous to modeling data with an E-R diagram and then converting to a set of relations).
The concepts of object orientation can also be incorporated into a programming language that is used to manipulate the database.
– Object-relational systems – add complex types and object orientation to relational languages.
– Persistent programming languages – extend an object-oriented programming language to deal with databases by adding concepts such as persistence and collections.
8
OO-DBMS
Save objects created by an OOP language to disk (make objects persistent).
Ensure that if an object is saved, all of the objects it references are saved.
Allow saved objects (and the objects they reference) to be retrieved from disk.
Provide transaction management and concurrency control to maintain data integrity.
9
Persistent Programming Language
Persistent programming languages:
– Allow objects to be created and stored in a database without any explicit format changes (format changes are carried out transparently).
– Allow objects to be manipulated in memory; they do not need to be explicitly loaded from or stored to the database.
– Allow data to be manipulated directly from the programming language without having to go through a data manipulation language like SQL.
Drawbacks:
– Because most programming languages are so powerful, it is easy to make programming errors that damage the database.
– The complexity of the languages makes automatic high-level optimization more difficult.
– They do not support declarative querying very well.
10
Persistence of Objects
Approaches to make transient objects persistent include establishing persistence by:
– Class – declare all objects of a class to be persistent; simple but inflexible.
– Creation – extend the syntax for creating transient objects to create persistent objects.
– Marking – an object that is to persist beyond program execution is marked as persistent before program termination.
– Reference – declare (root) persistent objects; objects are persistent if they are referred to (directly or indirectly) from a root object.
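Persistence by reference amounts to a reachability computation over the object graph. The sketch below is a minimal illustration under stated assumptions: the Object struct, its integer ids, and collect_reachable are hypothetical names, and a real system would walk typed references rather than a plain vector of pointers.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical in-memory object: an id plus references to other objects.
struct Object {
    int id;
    std::vector<const Object*> refs;
};

// Collect the ids of all objects reachable from `root`, directly or
// indirectly. Under persistence by reference, these are the objects
// that would be made persistent.
void collect_reachable(const Object* root, std::set<int>& persistent) {
    if (root == nullptr) return;
    if (!persistent.insert(root->id).second) return;  // already visited
    for (const Object* next : root->refs) {
        collect_reachable(next, persistent);
    }
}
```

The visited check also makes the traversal terminate on cyclic references, such as two person objects that name each other as spouse.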
11
Object Identity and Pointers
A persistent object is assigned a persistent object identifier.
Degrees of permanence of identity:
– Intraprocedure – identity persists only during the execution of a single procedure.
– Intraprogram – identity persists only during execution of a single program or query.
– Interprogram – identity persists from one program execution to another.
– Persistent – identity persists through program executions and structural reorganizations of data; required for object-oriented systems.
12
Object Identity and Pointers (Cont.)
In O-O languages such as C++, an object identifier is actually an in-memory pointer.
Persistent pointer – persists beyond program execution; can be thought of as a pointer into the database.
13
Storage and Access of Persistent Objects
How to find objects in the database:
Name objects (as you would name files); this cannot scale to a large number of objects.
– Names are typically given only to class extents and other collections of objects, but not to individual objects.
Expose object identifiers or persistent pointers to the objects; these can be stored externally.
– All objects have object identifiers.
14
Storage and Access of Persistent Objects (Cont.)
How to find objects in the database (Cont.):
Store collections of objects and allow programs to iterate over the collections to find required objects.
– Model collections of objects as collection types.
– Class extent – the collection of all objects belonging to the class; usually maintained for all classes that can have persistent objects.
15
Persistent C++ System
The C++ language allows support for persistence to be added without changing the language:
– Declare a class called Persistent_Object with attributes and methods to support persistence.
– Overloading – the ability to redefine standard function names and operators (e.g., +, -, the pointer dereference operator ->) when applied to new types.
Providing persistence without extending the C++ language is
– relatively easy to implement
– but more difficult to use
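As a rough illustration of how operator overloading lets a library add persistence without changing the language, the sketch below overloads the dereference operator -> on a handle class. FakeStore and PersistentRef are hypothetical names invented for this example (they are not part of any persistent C++ system), and a std::map stands in for the database.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a database: maps object ids to stored values.
struct FakeStore {
    std::map<int, std::string> disk;
    int loads = 0;  // counts simulated disk reads
};

// Hypothetical handle type: behaves like a pointer to std::string, but
// fetches the target from the store the first time it is dereferenced.
class PersistentRef {
public:
    PersistentRef(FakeStore& store, int id) : store_(store), id_(id) {}

    // Overloaded dereference operator: load on first use, then reuse.
    std::string* operator->() {
        if (!cached_) {
            cached_.reset(new std::string(store_.disk.at(id_)));
            ++store_.loads;
        }
        return cached_.get();
    }

private:
    FakeStore& store_;
    int id_;
    std::unique_ptr<std::string> cached_;
};
```

Client code writes ref->size() as if ref were an ordinary pointer; the load from the store happens behind the overloaded operator, and only once.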
16
ODMG C++ Object Definition Language
Standardized language extensions to C++ to support persistence.
The ODMG standard attempts to extend C++ as little as possible, providing most functionality via template classes and class libraries.
Template class Ref<class> is used to specify references (persistent pointers).
Template class Set<class> is used to define sets of objects; it provides methods such as insert_element and delete_element.
The C++ object definition language (ODL) extends the C++ type definition syntax in minor ways. Example: use the notation inverse to specify referential integrity constraints.
17
ODMG C++ ODL: Example
    class Person : public Persistent_Object {
    public:
        String name;
        String address;
    };

    class Customer : public Person {
    public:
        Date member_from;
        int customer_id;
        Ref<Branch> home_branch;
        Set<Ref<Account>> accounts inverse Account::owners;
    };
18
ODMG C++: Example (Cont.)
    class Account : public Persistent_Object {
    private:
        int balance;
    public:
        int number;
        Set<Ref<Customer>> owners inverse Customer::accounts;
        int find_balance();
        int update_balance(int delta);
    };
19
ODMG C++ Object Manipulation Language
Uses persistent versions of C++ operators such as new(db):
    Ref<Account> account = new(bank_db) Account;
new allocates the object in the specified database, rather than in memory.
The dereference operator ->, when applied to a Ref<Account> object, loads the referenced object into memory (if not already present) and returns an in-memory pointer to the object.
Constructor for a class – a special method to initialize objects when they are created; called automatically when new is executed.
Destructor for a class – a special method that is called when objects in the class are deleted.
20
ODMG C++ OML: Example
    int create_account_owner(String name, String address) {
        Database * bank_db;
        bank_db = Database::open("Bank-DB");
        Transaction Trans;
        Trans.begin();
        // Create both objects in the database rather than in memory.
        Ref<Account> account = new(bank_db) Account;
        Ref<Customer> cust = new(bank_db) Customer;
        cust->name = name;
        cust->address = address;
        // Link customer and account in both directions.
        cust->accounts.insert_element(account);
        account->owners.insert_element(cust);
        // … Code to initialize customer_id, account number, etc.
        Trans.commit();
    }
21
ODMG C++ OML: Example of Iterators
    int print_customers() {
        Database * bank_db;
        bank_db = Database::open("Bank-DB");
        Transaction Trans;
        Trans.begin();
        Iterator<Ref<Customer>> iter = Customer::all_customer.create_iterator();
        Ref<Customer> p;
        // next() stores the next element in p and returns false when done.
        while (iter.next(p)) {
            print_cust(p);
        }
        Trans.commit();
    }
The Iterator construct helps step through the objects in a collection.
22
Mapping of Objects to Files
Mapping objects to files is similar to mapping tuples to files in a relational system; object data can be stored using file structures.
Objects in O-O databases may lack uniformity and may be very large; such objects have to be managed differently from records in a relational system.
– Set fields with a small number of elements may be implemented using data structures such as linked lists.
– Set fields with a larger number of elements may be implemented as B-trees, or as separate relations in the database.
– Set fields can also be eliminated at the storage level by normalization.
23
Mapping of Objects to Files (Cont.)
Objects are identified by an object identifier (OID); the storage system needs a mechanism to locate an object given its OID.
– Logical identifiers do not directly specify an object's physical location; the system must maintain an index that maps an OID to the object's actual location.
– Physical identifiers encode the location of the object so the object can be found directly. Physical OIDs typically have the following parts:
1. a volume or file identifier
2. a page identifier within the volume or file
3. an offset within the page
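The three parts of a physical OID can be packed into one machine word. The sketch below is an illustration only: the field widths (16-bit volume, 32-bit page, 16-bit offset) and the names PhysicalOid, pack, and unpack are assumptions made for this example, not something the slides specify.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field widths: 16-bit volume, 32-bit page, 16-bit offset.
struct PhysicalOid {
    std::uint16_t volume;
    std::uint32_t page;
    std::uint16_t offset;
};

// Pack the three parts into a single 64-bit value: volume | page | offset.
std::uint64_t pack(const PhysicalOid& oid) {
    return (static_cast<std::uint64_t>(oid.volume) << 48) |
           (static_cast<std::uint64_t>(oid.page) << 16) |
           static_cast<std::uint64_t>(oid.offset);
}

// Recover the three parts from the packed 64-bit value.
PhysicalOid unpack(std::uint64_t bits) {
    PhysicalOid oid;
    oid.volume = static_cast<std::uint16_t>(bits >> 48);
    oid.page = static_cast<std::uint32_t>((bits >> 16) & 0xFFFFFFFFu);
    oid.offset = static_cast<std::uint16_t>(bits & 0xFFFFu);
    return oid;
}
```

Because the location is encoded in the identifier itself, no index lookup is needed to find the object; the price is that the object cannot be moved without invalidating its OID.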
24
Management of Persistent Pointers
Physical OIDs may also contain a unique identifier. This identifier is stored in the object as well and is used to detect references via dangling pointers.
[Figure: physical object identifier and object. (a) General structure (fields: Location, Unique-Id, Data). (b) Example of use (6.32.45608; good OID, bad OID).]
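The dangling-pointer check can be sketched as a comparison between the unique identifier carried in the OID and the one stored with the object. The types Oid and StoredObject, the lookup function, and the use of std::map as the storage are all hypothetical choices for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical OID: a storage location plus the unique id that the
// object had when the OID was created.
struct Oid {
    std::uint32_t location;
    std::uint32_t unique_id;
};

// Hypothetical stored object: carries its own unique id next to its data.
struct StoredObject {
    std::uint32_t unique_id;
    int data;
};

// Returns the object only if the unique ids match. A mismatch means the
// location now holds a different object, so the OID is dangling.
const StoredObject* lookup(const std::map<std::uint32_t, StoredObject>& storage,
                           const Oid& oid) {
    auto it = storage.find(oid.location);
    if (it == storage.end()) return nullptr;
    if (it->second.unique_id != oid.unique_id) return nullptr;
    return &it->second;
}
```

An OID whose unique id no longer matches the object at that location is rejected instead of silently returning whatever object has since been stored there.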
25
Management of Persistent Pointers (Cont.)
Implement persistent pointers using OIDs; persistent pointers are substantially longer than in-memory pointers.
Pointer swizzling cuts down on the cost of locating persistent objects that are already in memory.
Software swizzling (swizzling on pointer dereference):
– When a persistent pointer is first dereferenced, it is swizzled (replaced by an in-memory pointer) after the object is located in memory.
– Subsequent dereferences of the same pointer become cheap.
– The physical location of an object in memory must not change if swizzled pointers point to it; the solution is to pin pages in memory.
– When an object is written back to disk, any swizzled pointers it contains need to be unswizzled.
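Swizzling on dereference can be sketched as a pointer wrapper that starts out holding an OID and caches the in-memory address after the first lookup. Everything here is an assumption for illustration: Buffer stands in for a buffer manager, Node for a stored object, and SwizzledPtr for the persistent pointer type.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct Node { int value; };

// Hypothetical buffer manager: maps OIDs to objects resident in memory.
struct Buffer {
    std::unordered_map<std::uint64_t, Node> resident;
    int lookups = 0;
    Node* locate(std::uint64_t oid) {
        ++lookups;  // stands in for the expensive OID-to-address lookup
        return &resident.at(oid);
    }
};

// A pointer that starts as an OID and is swizzled on first dereference.
class SwizzledPtr {
public:
    explicit SwizzledPtr(std::uint64_t oid) : oid_(oid), ptr_(nullptr) {}

    Node& deref(Buffer& buf) {
        if (ptr_ == nullptr) ptr_ = buf.locate(oid_);  // swizzle once
        return *ptr_;
    }

    // Unswizzle before the containing object is written back to disk.
    std::uint64_t unswizzle() {
        ptr_ = nullptr;
        return oid_;
    }

    bool swizzled() const { return ptr_ != nullptr; }

private:
    std::uint64_t oid_;
    Node* ptr_;
};
```

The cached raw pointer is only safe while the target stays at the same address, which is the reason the slide calls for pinning pages that swizzled pointers refer to.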
26
Hardware Swizzling
Persistent pointers in objects need the same amount of space as in-memory pointers; extra storage external to the object is used to store the rest of the pointer information.
Uses the virtual memory translation mechanism to efficiently and transparently convert between persistent pointers and in-memory pointers.
All persistent pointers in a page are swizzled when the page is first read in.
– Thus programmers have to work with just one type of pointer, i.e. the in-memory pointer.
– Some of the swizzled pointers may point to virtual memory addresses that are currently not allocated any real memory.
27
Hardware Swizzling (Cont.)
A persistent pointer is conceptually split into two parts: a page identifier, and an offset within the page.
– The page identifier in a pointer is a short indirect pointer: each page has a translation table that provides a mapping from the short page identifiers to full database page identifiers.
– The translation table for a page is small (at most 1024 pointers in a 4096-byte page with 4-byte pointers).
– Multiple pointers in a page to the same page share the same entry in the translation table.
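The split pointer and the per-page translation table can be sketched as follows. The 20-bit/12-bit split of a 32-bit pointer, the constant names, and the use of std::map for the table are assumptions chosen to match the 4096-byte page and 4-byte pointer figures above; the slides do not fix a bit layout.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical layout of a 32-bit persistent pointer: the upper 20 bits hold
// a short page identifier, the lower 12 bits an offset in a 4096-byte page.
constexpr std::uint32_t kOffsetBits = 12;
constexpr std::uint32_t kPageSize = 1u << kOffsetBits;  // 4096 bytes
constexpr std::uint32_t kPointerSize = 4;               // bytes per pointer

// Upper bound on pointers stored in one page, and so on table entries.
constexpr std::uint32_t kMaxEntries = kPageSize / kPointerSize;

std::uint32_t short_page_id(std::uint32_t ptr) { return ptr >> kOffsetBits; }
std::uint32_t page_offset(std::uint32_t ptr) { return ptr & (kPageSize - 1); }

// Per-page translation table: short page id -> full database page id.
using TranslationTable = std::map<std::uint32_t, std::uint64_t>;

std::uint64_t full_page_id(const TranslationTable& table, std::uint32_t ptr) {
    return table.at(short_page_id(ptr));
}
```

Two pointers with the same short page identifier but different offsets resolve through a single table entry, which is why the table stays small even when a page holds many pointers.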
28
Hardware Swizzling (Cont.)
[Figure: page image when on disk (before swizzling), showing Object 1, Object 2, Object 3 and the translation table with columns PageID and FullPageID.]
29
Hardware Swizzling (Cont.)
When an in-memory pointer is dereferenced, if the operating system detects that the page it points to has not yet been allocated storage, a segmentation violation occurs.
A function is registered to be called on a segmentation violation (on Unix this is done with a signal handler, while the mmap call is used to allocate storage for pages in the virtual address space).
The function allocates storage for the page and reads in the page from disk.
Swizzling is then done for all persistent pointers in the page (located using object type information).
– If a pointer points to a page not already allocated a virtual memory address, a virtual memory address is allocated (preferably the address in the short page identifier, if it is unused). Storage is not yet allocated for the page.
– The page identifier in the pointer (and the translation table entry) is changed to the virtual memory address of the page.
30
Hardware Swizzling (Cont.)
[Figure: page image after swizzling, showing Object 1, Object 2, Object 3 and the translation table with columns PageID and FullPageID.]
The page with short page identifier 2395 was allocated address 5001. Observe the change in the pointers and the translation table.
The page with short page identifier 4867 has been allocated address 4867. No change in the pointers and the translation table.
31
Hardware Swizzling (Cont.)
After swizzling, all short page identifiers point to the virtual memory address allocated for the page.
– Functions accessing the objects need not know that they contain persistent pointers!
– Can reuse existing code and libraries that use in-memory pointers.
If all pages are allocated the same address as in the short page identifier, no changes are required in the page!
No need for deswizzling; the page after swizzling can be saved back directly to disk.
A process should not access more pages than the size of virtual memory; reuse of virtual memory addresses for other pages is expensive.
32
Disk versus Memory Structure of Objects
The format in which objects are stored in memory may be different from the format in which they are stored on disk in the database. Reasons are:
– software swizzling – the structures of persistent and in-memory pointers are different
– the database is accessible from different machines, with different data representations
Make the physical representation of objects in the database independent of the machine and the compiler.
Can transparently convert from the disk representation to the form required on the specific machine, language, and compiler, when the object (or page) is brought into memory.
33
Large Objects
Very large objects are called binary large objects (blobs) because they typically contain binary data. Examples include:
– text documents
– graphical data such as images and computer-aided designs
– audio and video data
Large objects may need to be stored in a contiguous sequence of bytes when brought into memory.
– If an object is bigger than a page, contiguous pages of the buffer pool must be allocated to store it.
– It may be preferable to disallow direct access to the data, and only allow access through a file-system-like API, to remove the need for contiguous storage.
34
Modifying Large Objects
Use B-tree structures to represent the object: this permits reading the entire object as well as updating, inserting, and deleting bytes from specified regions of the object.
Special-purpose application programs outside the database are used to manipulate large objects:
– Text data is treated as a byte string manipulated by editors and formatters.
– Graphical data is represented as a bit map or as a set of geometric objects; it can be managed within the database system or by special software (e.g. VLSI design).
– Audio/video data is typically created and displayed by separate application software and modified using special-purpose editing software.
– A checkout/checkin method is used for concurrency and version control.