ACTIVE DATABASE Traditional database systems are passive: they only respond to requests from applications. We may wish to extend a database system so that it can invoke applications in response to an event. For example, an inventory database system may initiate a re-order transaction if the inventory level falls below a specified level.
ACTIVE DATABASE An active database system (ADBS) is a DBS that monitors situations of interest and, when they occur, triggers an appropriate response in a timely manner. The desired behavior is expressed in production rules (also called event-condition-action rules), which are defined and stored in the DBS.
An active database is a database that includes active rules, mostly in the form of ECA rules. Active database systems enhance traditional database functionality with powerful rule-processing capabilities, providing a uniform and efficient mechanism for many database system applications. Among these applications are integrity constraints, views, authorization, statistics gathering, monitoring and alerting, knowledge-based systems, expert systems, and workflow management.
Active Databases and Triggers Triggers are a technique for specifying certain types of active rules. The Event-Condition-Action (ECA) model is used to specify active database rules. A rule in the ECA model has three components: The event is usually a database update operation, but it may also be a temporal event or another kind of external event. The condition determines whether the rule action should be executed. The action is a sequence of SQL statements, a transaction, or an external program that is executed automatically.
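The three components of an ECA rule can be sketched in a few lines of Python. This is only an illustration of the model, applied to the inventory re-order example; the class names, the event name "stock_updated", and the threshold REORDER_LEVEL are assumptions made for the sketch, not part of any real DBMS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    event: str                          # name of the triggering event
    condition: Callable[[Dict], bool]   # evaluated when the event occurs
    action: Callable[[Dict], None]      # executed only if the condition holds


class ActiveStore:
    """Toy in-memory inventory that checks its rules after every update."""

    def __init__(self) -> None:
        self.stock: Dict[str, int] = {}
        self.rules: List[Rule] = []
        self.reorders: List[str] = []

    def update_stock(self, item: str, level: int) -> None:
        self.stock[item] = level
        self._signal("stock_updated", {"item": item, "level": level})

    def _signal(self, event: str, ctx: Dict) -> None:
        # Event -> Condition -> Action, in that order.
        for rule in self.rules:
            if rule.event == event and rule.condition(ctx):
                rule.action(ctx)


REORDER_LEVEL = 10  # hypothetical threshold

store = ActiveStore()
store.rules.append(Rule(
    event="stock_updated",
    condition=lambda ctx: ctx["level"] < REORDER_LEVEL,
    action=lambda ctx: store.reorders.append(ctx["item"]),
))

store.update_stock("bolts", 25)   # condition false, no action
store.update_stock("nuts", 4)     # condition true, re-order recorded
print(store.reorders)
```

Only the second update satisfies the condition, so only "nuts" is re-ordered.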
Active Databases and Triggers Example: Consider a simplified version of the Company database. In this version, the TOTAL_SAL attribute is a derived attribute, whose value should be the sum of the salaries of all employees who are assigned to the particular department.
Active Databases and Triggers Maintaining the correct value for TOTAL_SAL can be done via an active rule. The events that may cause a change in the value of TOTAL_SAL are: Inserting new employee tuples. Changing the salary of existing employees. Changing the assignment of existing employees from one department to another. Deleting (one or more) employee tuples.
Active Databases and Triggers FIGURE 24.2 Specifying active rules as triggers in Oracle notation. Triggers for automatically maintaining the consistency of TOTAL_SAL of DEPARTMENT.
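The events that affect TOTAL_SAL can also be handled with ordinary SQL triggers in other systems. The sketch below uses SQLite through Python's sqlite3 module, so the syntax differs from the Oracle notation of the figure. Table, column, and trigger names are assumptions, and the two update events (salary change and department change) are folded into a single trigger to avoid counting a change twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dno       INTEGER PRIMARY KEY,
    total_sal INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE employee (
    ssn    TEXT PRIMARY KEY,
    salary INTEGER NOT NULL,
    dno    INTEGER REFERENCES department(dno)
);

-- Event: a new employee tuple is inserted.
CREATE TRIGGER total_sal_insert AFTER INSERT ON employee
FOR EACH ROW
BEGIN
    UPDATE department SET total_sal = total_sal + NEW.salary
    WHERE dno = NEW.dno;
END;

-- Events: a salary changes, or an employee moves to another department.
CREATE TRIGGER total_sal_update AFTER UPDATE OF salary, dno ON employee
FOR EACH ROW
BEGIN
    UPDATE department SET total_sal = total_sal - OLD.salary
    WHERE dno = OLD.dno;
    UPDATE department SET total_sal = total_sal + NEW.salary
    WHERE dno = NEW.dno;
END;

-- Event: an employee tuple is deleted.
CREATE TRIGGER total_sal_delete AFTER DELETE ON employee
FOR EACH ROW
BEGIN
    UPDATE department SET total_sal = total_sal - OLD.salary
    WHERE dno = OLD.dno;
END;
""")

conn.executemany("INSERT INTO department (dno) VALUES (?)", [(4,), (5,)])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("111", 30000, 5), ("222", 40000, 5), ("333", 25000, 4)],
)
conn.execute("UPDATE employee SET salary = 35000 WHERE ssn = '111'")
conn.execute("UPDATE employee SET dno = 4 WHERE ssn = '222'")
conn.execute("DELETE FROM employee WHERE ssn = '333'")

totals = dict(conn.execute("SELECT dno, total_sal FROM department"))
print(totals)
```

An employee with a NULL department matches no department row, so the triggers leave every total unchanged in that case.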
Active Databases and Triggers FIGURE 24.2 (continued) Specifying active rules as triggers in Oracle notation. (b) Trigger for comparing an employee’s salary with that of his or her supervisor.
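A rule that compares an employee's salary with the supervisor's salary can be approximated in SQLite as follows. This is a sketch under assumptions: SQLite has no stored procedures, so a hypothetical supervisor_alert table stands in for the external notification, and separate triggers are needed for the insert and update events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    ssn            TEXT PRIMARY KEY,
    salary         INTEGER NOT NULL,
    supervisor_ssn TEXT REFERENCES employee(ssn)
);
-- Hypothetical stand-in for an external notification procedure.
CREATE TABLE supervisor_alert (supervisor_ssn TEXT, employee_ssn TEXT);

CREATE TRIGGER inform_supervisor_ins AFTER INSERT ON employee
FOR EACH ROW
WHEN NEW.salary > (SELECT salary FROM employee
                   WHERE ssn = NEW.supervisor_ssn)
BEGIN
    INSERT INTO supervisor_alert VALUES (NEW.supervisor_ssn, NEW.ssn);
END;

CREATE TRIGGER inform_supervisor_upd
AFTER UPDATE OF salary, supervisor_ssn ON employee
FOR EACH ROW
WHEN NEW.salary > (SELECT salary FROM employee
                   WHERE ssn = NEW.supervisor_ssn)
BEGIN
    INSERT INTO supervisor_alert VALUES (NEW.supervisor_ssn, NEW.ssn);
END;
""")

conn.execute("INSERT INTO employee VALUES ('100', 50000, NULL)")
conn.execute("INSERT INTO employee VALUES ('200', 40000, '100')")  # no alert
conn.execute("UPDATE employee SET salary = 60000 WHERE ssn = '200'")  # alert

alerts = conn.execute("SELECT * FROM supervisor_alert").fetchall()
print(alerts)
```

When an employee has no supervisor, the subquery yields NULL, the comparison is not true, and the action is skipped.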
Active Databases and Triggers A trigger that includes the optional FOR EACH ROW clause is known as a row-level trigger (i.e., the rule is triggered separately for each tuple). If the FOR EACH ROW clause is left out, the trigger is known as a statement-level trigger (i.e., the rule is triggered once for each triggering statement). The keywords NEW and OLD can only be used with row-level triggers.
Active Databases and Triggers FIGURE 24.3 A syntax summary for specifying triggers in the Oracle system (main options only).
Design Issues for Active Databases The first issue concerns activation, deactivation, and grouping of rules. The deactivate command prevents a rule from being triggered by its event. The activate command makes the rule active again. The delete command deletes the rule from the system. The rule set option can be used to group rules (so the whole set of rules can be activated, deactivated, or dropped). The PROCESS RULES command can trigger a rule or rule set.
Design Issues for Active Databases The second issue concerns whether the triggered action should be executed before, after, or concurrently with the trigger event. A related issue is whether the action being executed should be considered as a separate transaction or whether it should be part of the same transaction that triggered the rule. The rule condition evaluation is also known as rule consideration.
Design Issues for Active Databases Three main possibilities for rule consideration: Immediate consideration: the condition is evaluated as part of the same transaction as the trigger event, and is evaluated immediately, in one of the following forms: before executing the trigger event; after executing the trigger event; instead of executing the trigger event. Deferred consideration: the condition is evaluated at the end of the transaction that includes the trigger event. Detached consideration: the condition is evaluated as a separate transaction.
Design Issues for Active Databases Difficulties with using active rules: To verify that a set of rules is consistent. To guarantee termination of a set of rules under all circumstances. FIGURE 24.4 An example to illustrate the termination problem for active rules.
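One conservative way to look for possible non-termination is to build a graph in which rule A points to rule B whenever A's action can raise B's event, and then check the graph for cycles. This technique is not named in the slides; the rule set below is hypothetical, and a cycle only means the rules might loop forever, not that they will.

```python
from typing import Dict, Set

# Hypothetical rules: each listens to one event; its action may raise others.
RULES = {
    "R1": {"on": "insert_table1", "raises": ["update_table2"]},
    "R2": {"on": "update_table2", "raises": ["insert_table1"]},
    "R3": {"on": "delete_table3", "raises": []},
}


def triggering_graph(rules: Dict[str, dict]) -> Dict[str, Set[str]]:
    """Edge a -> b when an event raised by rule a is the event of rule b."""
    graph: Dict[str, Set[str]] = {name: set() for name in rules}
    for a, rule_a in rules.items():
        for b, rule_b in rules.items():
            if rule_b["on"] in rule_a["raises"]:
                graph[a].add(b)
    return graph


def may_not_terminate(rules: Dict[str, dict]) -> bool:
    """Return True if the triggering graph contains a cycle."""
    graph = triggering_graph(rules)
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the stack, finished
    color = {name: WHITE for name in graph}

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:        # back edge: a cycle exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


without_r2 = {name: r for name, r in RULES.items() if name != "R2"}
print(may_not_terminate(RULES), may_not_terminate(without_r2))
```

Here R1 and R2 trigger each other, so the full set is flagged; removing R2 breaks the cycle.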
Potential applications for Active Databases To allow notification of certain conditions that occur (e.g., to monitor the temperature of an industrial furnace). To enforce integrity constraints by specifying the types of events that may cause the constraints to be violated. Automatic maintenance of derived data (e.g., the derived attribute TOTAL_SAL in the simplified version of Company schema).
Temporal Database Concepts Temporal databases encompass all database applications that require some aspect of time when organizing their information (e.g., reservation systems for hotels, airlines, etc.). For temporal databases, time is considered to be an ordered sequence of points at some granularity (e.g., seconds, minutes, days, etc.). In SQL2, the temporal data types include DATE, TIME, TIMESTAMP, INTERVAL, and PERIOD.
Temporal Database Concepts A temporal database will store information concerning when certain events occur. Types of temporal information: Point events: associated in the database with a single time point (e.g., 15/08/1998). Duration events: associated in the database with a specific time period (e.g., [15/08/1998, 15/08/2000]). A temporal database in which the time associated with each event is interpreted as the time at which the event holds in the real world is known as a valid time database.
Temporal Database Concepts A temporal database in which the time associated with each event is interpreted as the value of the system clock when the information was recorded is known as a transaction time database. Incorporating time in relational databases can be done by adding the attributes VST (VALID_START_TIME) and VET (VALID_END_TIME) to an ordinary relation. Each tuple, V, represents a version of the entity member that is valid in the interval [V.VST, V.VET]. The current version has the special value now as its VET.
Temporal Database Concepts FIGURE 24.7 Different types of temporal relational databases. Valid time database schema. Transaction time database schema. Bitemporal database schema.
Temporal Database Concepts The special value, now, is a temporal variable that implicitly represents the current time as time progresses. In order to update a tuple, a new version is created and the current version is closed (by changing its VET to the end time). In a proactive update, the update is recorded in the database before it becomes effective in the real world. In a retroactive update, the update is recorded in the database after it has become effective in the real world.
Temporal Database Concepts To delete a tuple, the current version is closed. To insert a new entity member, create the first tuple version and make it the current version (i.e., VST is the effective time, and VET = now). Note: in a valid time relation, the nontemporal key, such as SSN in the EMPLOYEE relation, is no longer unique, because the same entity can have several versions. The new relation key for EMP_VT is a combination of the nontemporal key and the valid start time attribute VST.
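The insert, update, and delete operations on a valid time relation can be sketched with tuple versioning in Python. The sketch assumes a granularity of one day, uses date.max as a stand-in for the special value now, and closes the old version one day before the new version starts; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

NOW = date.max                 # stand-in for the special value `now`
ONE_DAY = timedelta(days=1)    # assumed granularity


@dataclass
class EmpVersion:
    ssn: str
    salary: int
    vst: date                  # valid start time
    vet: date = NOW            # valid end time; NOW marks the current version


class EmpVT:
    """Valid time relation; its key is the pair (ssn, vst)."""

    def __init__(self) -> None:
        self.versions: List[EmpVersion] = []

    def current(self, ssn: str) -> Optional[EmpVersion]:
        for v in self.versions:
            if v.ssn == ssn and v.vet == NOW:
                return v
        return None

    def insert(self, ssn: str, salary: int, effective: date) -> None:
        if self.current(ssn) is not None:
            raise ValueError(f"{ssn} already has a current version")
        self.versions.append(EmpVersion(ssn, salary, effective))

    def update(self, ssn: str, salary: int, effective: date) -> None:
        cur = self.current(ssn)
        if cur is None:
            raise KeyError(ssn)
        cur.vet = effective - ONE_DAY          # close the current version
        self.versions.append(EmpVersion(ssn, salary, effective))

    def delete(self, ssn: str, effective: date) -> None:
        cur = self.current(ssn)
        if cur is None:
            raise KeyError(ssn)
        cur.vet = effective                    # close it; no new version

    def as_of(self, ssn: str, when: date) -> Optional[EmpVersion]:
        for v in self.versions:
            if v.ssn == ssn and v.vst <= when <= v.vet:
                return v
        return None


emp = EmpVT()
emp.insert("123456789", 25000, date(2002, 6, 15))
emp.update("123456789", 30000, date(2003, 6, 1))
emp.delete("123456789", date(2004, 1, 31))
print([(v.salary, v.vst, v.vet) for v in emp.versions])
```

No version is ever overwritten with new attribute values, so a query such as as_of can reconstruct the state that was valid on any past date.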
Temporal Database Concepts FIGURE 24.8 Some tuple versions in the valid time relations EMP_VT and DEPT_VT.
Temporal Database Concepts Attributes that change over time are called time-varying attributes. Attributes that do not change over time are called non-time-varying attributes. In the tuple versioning approach, whenever an attribute value is changed, a whole new tuple version is created. In the attribute versioning approach, a single complex object is used to store all the temporal changes of the object.
Temporal Database Concepts FIGURE 24.10 Possible ODL schema for a temporal valid time Employee_VT object class using attribute versioning
Multimedia Databases Spatial databases store objects that have spatial characteristics that describe them (e.g., cartographic databases store maps). The main extensions that are needed for spatial databases are models that can interpret spatial characteristics. The basic extensions needed are to include two-dimensional geometric concepts such as points, lines, circles, polygons, and arcs.
Multimedia Databases Typical types of spatial queries: Range query: finds the objects of a particular type that are within a given spatial area (e.g., finds all hospitals in a given suburb or city). Nearest neighbor query: finds an object of a particular type that is closest to a given location (e.g., finds the nearest shopping center). Spatial joins or overlays: joins the objects of two types based on some spatial conditions (e.g., finds all homes that are within two km of a lake).
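The three query types can be written as brute-force scans over small in-memory collections. This is only a sketch of what each query computes; the sample data is invented, every object (including the lake) is simplified to a single point, and a real spatial database would use an index instead of scanning.

```python
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical sample data, coordinates in km.
HOSPITALS: Dict[str, Point] = {
    "north": (2.0, 8.0), "central": (5.0, 5.0), "south": (6.0, 1.0),
}
HOMES: Dict[str, Point] = {"h1": (1.0, 1.0), "h2": (4.0, 4.5), "h3": (9.0, 9.0)}
LAKES: Dict[str, Point] = {"blue": (5.0, 4.0)}


def dist(a: Point, b: Point) -> float:
    return hypot(a[0] - b[0], a[1] - b[1])


def range_query(objects: Dict[str, Point], xmin: float, ymin: float,
                xmax: float, ymax: float) -> List[str]:
    """Objects whose position lies inside the given rectangle."""
    return sorted(name for name, (x, y) in objects.items()
                  if xmin <= x <= xmax and ymin <= y <= ymax)


def nearest_neighbor(objects: Dict[str, Point], location: Point) -> str:
    """The object closest to the given location."""
    return min(objects, key=lambda name: dist(objects[name], location))


def spatial_join(left: Dict[str, Point], right: Dict[str, Point],
                 max_dist: float) -> List[Tuple[str, str]]:
    """Pairs (l, r) whose positions are at most max_dist apart."""
    return sorted((l, r) for l, lp in left.items()
                  for r, rp in right.items() if dist(lp, rp) <= max_dist)


print(range_query(HOSPITALS, 4, 0, 10, 6))
print(nearest_neighbor(HOSPITALS, (1.0, 1.0)))
print(spatial_join(HOMES, LAKES, 2.0))
```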
Multimedia Databases One of the best-known techniques for answering spatial queries is the use of R-trees and their variations. R-trees group together objects that are in close spatial proximity on the same leaf nodes of a tree-structured index. Other spatial storage structures include quadtrees and their variations. Quadtrees divide each space into equally sized areas, and proceed with the subdivisions of each subspace to identify the positions of various objects.
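The subdivision idea behind quadtrees can be illustrated with a minimal point quadtree. This is one simple variant chosen for the sketch, not a description of any particular system: each node covers a rectangle, holds up to capacity points, and splits into four equal quadrants when it overflows. The parameters capacity and max_depth are assumptions.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


class QuadTree:
    """Node covering the half-open rectangle [xmin, xmax) x [ymin, ymax)."""

    def __init__(self, xmin: float, ymin: float, xmax: float, ymax: float,
                 capacity: int = 2, depth: int = 0, max_depth: int = 16):
        self.xmin, self.ymin, self.xmax, self.ymax = xmin, ymin, xmax, ymax
        self.mx, self.my = (xmin + xmax) / 2, (ymin + ymax) / 2
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points: List[Point] = []
        self.children: Optional[List["QuadTree"]] = None

    def insert(self, p: Point) -> bool:
        """Insert p; return False if p lies outside this node's rectangle."""
        if not (self.xmin <= p[0] < self.xmax and self.ymin <= p[1] < self.ymax):
            return False
        self._add(p)
        return True

    def _add(self, p: Point) -> None:
        if self.children is None:
            # Leaf: store here if there is room or we may not split further.
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append(p)
                return
            self._split()
        self._child_for(p)._add(p)

    def _split(self) -> None:
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(self.xmin, self.ymin, self.mx, self.my, *args),
            QuadTree(self.xmin, self.my, self.mx, self.ymax, *args),
            QuadTree(self.mx, self.ymin, self.xmax, self.my, *args),
            QuadTree(self.mx, self.my, self.xmax, self.ymax, *args),
        ]
        old, self.points = self.points, []
        for q in old:                      # push stored points down one level
            self._child_for(q)._add(q)

    def _child_for(self, p: Point) -> "QuadTree":
        return self.children[2 * (p[0] >= self.mx) + (p[1] >= self.my)]

    def query(self, xmin: float, ymin: float,
              xmax: float, ymax: float) -> List[Point]:
        """Points inside the closed query rectangle."""
        # Prune subtrees whose rectangle cannot overlap the query rectangle.
        if (xmax < self.xmin or xmin >= self.xmax
                or ymax < self.ymin or ymin >= self.ymax):
            return []
        found = [p for p in self.points
                 if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]
        for child in self.children or []:
            found.extend(child.query(xmin, ymin, xmax, ymax))
        return found


tree = QuadTree(0, 0, 16, 16)
for pt in [(1, 1), (2, 3), (12, 4), (13, 13), (3, 2), (9, 9)]:
    tree.insert(pt)
print(sorted(tree.query(0, 0, 4, 4)))
```

The range query visits only the quadrants that overlap the query rectangle, which is where the saving over a full scan comes from.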
Multimedia Databases Multimedia databases provide features that allow users to store and query different types of multimedia information (e.g., images, video clips, audio clips, and documents). The main types of database queries involve locating multimedia sources that contain objects of interest. These types of queries are called content-based retrieval queries. Identifying the contents of multimedia sources is a difficult and time-consuming task.
Multimedia Databases Two main approaches to content-based retrieval: Automatic analysis: uses different techniques depending on the type of multimedia source (e.g., image, text, video, or audio). Manual identification: depends on the objects and activities of interest in each multimedia source and on using this information to index the source. It requires a manual preprocessing phase where a person has to scan each multimedia source to identify and catalog the objects and activities it contains.
Deductive databases A deductive database system is a database system which can make deductions (i.e., conclude additional facts) based on rules and facts stored in the (deductive) database. Datalog is the language typically used to specify facts, rules and queries in deductive databases. Deductive databases have grown out of the desire to combine logic programming with relational databases to construct systems that support a powerful formalism and are still fast and able to deal with very large datasets. Deductive databases are more expressive than relational databases but less expressive than logic programming systems.
Deductive Databases A database system that includes capabilities to define (deductive) rules, which can deduce or infer additional information from the facts that are stored in the database, is called a deductive database. Rules are specified through a declarative language: we specify what to achieve rather than how to achieve it. The model used for deductive databases is related to logic programming and the Prolog language.
Deductive Databases A variation of Prolog called Datalog is used to define rules declaratively in conjunction with an existing set of relations. Two main types of specifications: Facts: similar to the way relations are specified, except that it is not necessary to include the attribute names (note that a tuple in a relation describes some real-world fact). Rules: similar to relational views, which are not actually stored but can be formed from the facts by applying inference mechanisms based on the rule specifications.
Prolog/Datalog Notation A logic program can be thought of as: A set of facts (assuming that these are all the facts in our modeled mini-world). A set of permissible deductions (proof rules). A method of deduction. A goal to prove (or a query to answer). An important difference between rules and facts is that rules specify things that are true if some conditions are satisfied.
Prolog/Datalog Notation A predicate has an implicit meaning and a fixed number of arguments. If the arguments are all constant values, the predicate states a fact. If the arguments contain variables, the predicate is either a query or part of a rule or constraint. Constant values in a predicate are either numeric or character strings; string constants start with a lowercase letter. Variable names always start with an uppercase letter.
Prolog/Datalog Notation A rule is of the form head :- body, where the head, or left-hand side (LHS), or conclusion, is a single predicate, and the body, or right-hand side (RHS), or premise, consists of one or more predicates. A predicate with only constants as arguments is called a ground or instantiated predicate.
Prolog/Datalog Notation FIGURE 24.11 (a) Prolog notation. (b) The supervisory tree.
Prolog/Datalog Notation A program contains a number of built-in predicates. Two main types of built-in predicates: The binary comparison predicates <, <=, >, >= (corresponding to less, less_or_equal, greater, greater_or_equal) over ordered domains. The comparison predicates =, /= (corresponding to equal, not_equal) over unordered domains. A program is built from basic objects called atomic formulas. Atomic formulas are literals of the form p(a1, a2, …, an), where p is the predicate name and n (the arity or degree of p) is the number of arguments.
Prolog/Datalog Notation A literal is either an atomic formula (in this case it is called a positive literal), or it is an atomic formula preceded by not (in this case it is called a negative literal). Prolog/Datalog has an internal inference engine that can be used to process and compute the results of queries. The Prolog inference engine returns one result to a query at a time, whereas Datalog returns results a set at a time.
Prolog/Datalog Notation Two interpretations of rules: Proof-theoretic: considers the facts and rules to be true statements, or axioms. Facts are ground axioms (they have no variables). Rules are deductive axioms (they can be used to deduce new facts). Model-theoretic: given a domain of constant values, assigns to a predicate every possible combination of values as arguments and specifies which combinations make the predicate true. An interpretation is called a model for a specific set of rules if the rules are always true under that interpretation.
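Set-at-a-time evaluation can be illustrated with a naive bottom-up loop that keeps applying the rules to the known facts until nothing new is derived. The supervise facts and the two superior rules below are an assumed example, and naive evaluation is just one simple strategy, not necessarily what a given Datalog engine does.

```python
from typing import Set, Tuple

Fact = Tuple[str, str]

# supervise(X, Y): X directly supervises Y (hypothetical sample facts).
SUPERVISE: Set[Fact] = {
    ("james", "franklin"), ("james", "jennifer"),
    ("franklin", "john"), ("franklin", "ramesh"),
    ("jennifer", "alicia"),
}


def superior(supervise: Set[Fact]) -> Set[Fact]:
    """Naive bottom-up evaluation of the rules

        superior(X, Y) :- supervise(X, Y).
        superior(X, Y) :- supervise(X, Z), superior(Z, Y).
    """
    result = set(supervise)                      # first rule
    while True:
        # Second rule: join supervise with the superior facts found so far.
        derived = {(x, y)
                   for (x, z) in supervise
                   for (z2, y) in result if z == z2}
        if derived <= result:                    # fixpoint: nothing new
            return result
        result |= derived


all_superior = superior(SUPERVISE)
# Query superior(james, Y): every answer is available at once, as a set.
print(sorted(y for (x, y) in all_superior if x == "james"))
```

The loop stops at a fixpoint, which is why this style of evaluation terminates even for recursive rules over a finite set of facts.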
Prolog/Datalog Notation FIGURE 24.12 Proving a new fact.
Prolog/Datalog Notation A model is called a minimal model for a set of rules if we cannot change any fact from true to false and still get a model for these rules. Two main methods of defining the truth values of predicates in Datalog programs: Fact-defined predicates (or relations): defined by listing all the combinations of values (the tuples) that make the predicate true. Rule-defined predicates (or views): defined by being the LHS of one or more Datalog rules. They correspond to virtual relations whose content can be inferred by the inference engine.
Prolog/Datalog Notation FIGURE 24.14 Fact predicates for part of the database from Figure 5.6.
Prolog/Datalog Notation FIGURE 24.15 Rule-defined predicates.
Prolog/Datalog Notation Many operations of the relational algebra can be specified in the form of Datalog rules. We do not need to specify the attribute names. However, the arity (degree) of each predicate and the domain (data type) of each attribute are important for operations such as UNION, INTERSECTION, and JOIN.
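The correspondence between relational operations and Datalog rules can be made concrete by writing each rule as a comment next to an equivalent Python set expression. The relation names and contents are hypothetical; attributes are identified by position only, as in Datalog.

```python
# Hypothetical relations, stored as sets of tuples.
REL_ONE = {(1, "a", 10), (2, "b", 20), (3, "c", 30)}    # rel_one(A, B, C)
REL_TWO = {(3, "c", 30), (4, "d", 40)}                  # rel_two(A, B, C)
REL_THREE = {(10, "x"), (30, "y")}                      # rel_three(C, D)

# SELECT:   sel(A, B, C) :- rel_one(A, B, C), C > 15.
sel = {(a, b, c) for (a, b, c) in REL_ONE if c > 15}

# PROJECT:  proj(A, B) :- rel_one(A, B, C).
proj = {(a, b) for (a, b, _c) in REL_ONE}

# UNION:    uni(A, B, C) :- rel_one(A, B, C).
#           uni(A, B, C) :- rel_two(A, B, C).
uni = REL_ONE | REL_TWO

# INTERSECTION: inter(A, B, C) :- rel_one(A, B, C), rel_two(A, B, C).
inter = {t for t in REL_ONE if t in REL_TWO}

# JOIN:     joined(A, B, C, D) :- rel_one(A, B, C), rel_three(C, D).
joined = {(a, b, c, d)
          for (a, b, c) in REL_ONE
          for (c2, d) in REL_THREE if c == c2}

print(sorted(joined))
```

UNION and INTERSECTION only make sense here because rel_one and rel_two have the same arity and matching attribute domains.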
Prolog/Datalog Notation FIGURE 24.16 Predicates for illustrating relational operations.