Transaction
Introduction to ARIES
ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) is a recovery algorithm. When the recovery manager is invoked after a crash, restart proceeds in three phases:
• Analysis phase: determines the earliest log record from which the next pass must start. It also scans the log forward from the checkpoint record to construct a snapshot of what the system looked like at the instant of the crash.
• Redo phase: repeats all actions, starting from an appropriate point in the log, and restores the database state to what it was at the time of the crash.
• Undo phase: undoes the actions of transactions that did not commit, so that the database reflects only the actions of committed transactions.
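The three phases can be illustrated with a toy restart routine. This is only a sketch under simplifying assumptions: the tuple-based log format and the function name restart are hypothetical, the database is a plain dictionary, and the LSNs, checkpoints and compensation log records that real ARIES relies on are left out.

```python
def restart(log, disk):
    """Toy three-pass restart over a list of log records (illustrative only).

    Assumed record shapes (not from the slides):
      ("update", txn, key, old_value, new_value)
      ("commit", txn)
    """
    db = dict(disk)

    # Analysis: find out which transactions committed before the crash.
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo: repeat every logged update in log order, whatever its outcome.
    for rec in log:
        if rec[0] == "update":
            _, _, key, _, new = rec
            db[key] = new

    # Undo: roll back updates of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, old, _ = rec
            db[key] = old
    return db


log = [
    ("update", "T1", "A", 100, 50),
    ("update", "T2", "B", 20, 70),
    ("commit", "T1"),
    ("update", "T2", "A", 50, 5),
]
recovered = restart(log, {"A": 100, "B": 20})
print(recovered)
```

Here T1 committed and T2 did not, so after restart only T1's change to A survives.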
In addition to the log, the following two tables contain important recovery-related information:
• Transaction table: contains one entry for each active transaction. The entry contains the transaction id, the status, and a field called lastLSN, which is the LSN (log sequence number) of the most recent log record for this transaction. The status of a transaction can be in progress, committed, or aborted.
• Dirty page table: contains one entry for each dirty page in the buffer pool, that is, each page with changes not yet reflected on disk. The entry contains a field recLSN, which is the LSN of the first log record that caused the page to become dirty.
During normal operation, these tables are maintained by the transaction manager and the buffer manager, respectively. During restart after a crash, they are reconstructed in the Analysis phase.
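A minimal sketch of how the Analysis phase might rebuild the two tables from the log. The LogRecord class, the record kinds ("update", "commit", "abort", "end") and the function name analysis are assumptions made for illustration; the sketch also assumes that an "end" record removes a transaction from the table and that the tables saved at the checkpoint (if any) are passed in by the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    kind: str                   # assumed kinds: "update", "commit", "abort", "end"
    txn: str
    page: Optional[str] = None  # only set for "update" records


def analysis(log, txn_table=None, dirty_pages=None):
    """Scan the log forward from the checkpoint and rebuild both tables."""
    txn_table = dict(txn_table or {})      # txn id -> {"status", "lastLSN"}
    dirty_pages = dict(dirty_pages or {})  # page id -> recLSN
    for rec in log:
        if rec.kind == "end":
            # The transaction is finished; it is no longer active.
            txn_table.pop(rec.txn, None)
            continue
        entry = txn_table.setdefault(
            rec.txn, {"status": "in progress", "lastLSN": rec.lsn}
        )
        entry["lastLSN"] = rec.lsn
        if rec.kind == "commit":
            entry["status"] = "committed"
        elif rec.kind == "abort":
            entry["status"] = "aborted"
        elif rec.kind == "update":
            # recLSN is the FIRST record that dirtied the page, so keep it.
            dirty_pages.setdefault(rec.page, rec.lsn)
    # Redo must start at the smallest recLSN in the dirty page table.
    redo_start = min(dirty_pages.values(), default=None)
    return txn_table, dirty_pages, redo_start


log = [
    LogRecord(10, "update", "T1", "P5"),
    LogRecord(20, "update", "T2", "P3"),
    LogRecord(30, "commit", "T2"),
    LogRecord(40, "end", "T2"),
    LogRecord(50, "update", "T1", "P3"),
]
txn_table, dirty_pages, redo_start = analysis(log)
print(txn_table, dirty_pages, redo_start)
```

In this run T1 is still in progress at the crash, so it is the only entry left in the transaction table and is the transaction the Undo phase would roll back.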
Distributed Database System
A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.
Implicit assumptions:
– Data is stored at a number of sites; each site logically consists of a single processor.
– Processors at different sites are interconnected by a computer network.
– A DDBMS is a collection of DBMSs, not a collection of files (it is not a remote file system).
Advantages:
• Higher reliability
• Improved performance
• Easier system expansion
• Transparency of distributed and replicated data
Advantages: higher reliability
• Replication of components
• No single point of failure; for example, a broken communication link or processing element does not bring down the entire system
• Distributed transaction processing guarantees the consistency of the database under concurrent access
Advantages: improved performance
– Reduces remote access delays
– Requires some support for fragmentation and replication
– Intra-query parallelism
Advantages: easier system expansion
• The issue is database scaling
– A network of workstations is much cheaper than a single mainframe computer
• Increasing database size
Advantages: transparency
• Refers to the separation of the higher-level semantics of the system from the lower-level implementation issues.
• A transparent system "hides" the implementation details from the users.
• A fully transparent DBMS provides high-level support for the development of complex applications.
Disadvantages
• Network connection problems: technical problems may arise when dissimilar machines have to be connected.
• Data security problems: security risks increase when data is located at multiple sites.
• Data integrity problems: because data is accessed from many locations by operations such as select and update, maintaining integrity is a major issue.
• Cost: a large communication network has to be maintained, so hardware and software implementation costs are high.
Distributed database design
The design of a distributed database introduces three new questions:
• How to partition the database into fragments?
• Which fragments to replicate?
• Where to locate those fragments?
Data fragmentation
• Allows a single object to be broken into two or more segments, or fragments.
• The object might be a database or a table.
• Each fragment can be stored at any site on a computer network.
• Data fragmentation information is stored in the distributed data catalog.
Data fragmentation can be classified as:
• Horizontal fragmentation
• Vertical fragmentation
• Mixed fragmentation
Horizontal fragmentation
• Refers to the division of a relation into subsets of tuples (rows).
• Each fragment is stored at a different node.
• The selection capability of SQL can be used to choose the rows of a table that a query should return.
Vertical fragmentation
• Refers to the division of a relation into subsets of attributes (columns).
• The projection capability of SQL can be used to choose the columns of a table that a query should return.
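A small sketch of horizontal and vertical fragmentation using Python's built-in sqlite3 module. The employee table, its columns and its rows are made up for illustration, and all fragments live in one in-memory database rather than at separate sites. The sketch also assumes that each vertical fragment keeps the primary key, which is what allows the original rows to be rebuilt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, city TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Pune", 50000), (2, "Ravi", "Delhi", 60000), (3, "Meera", "Pune", 55000)],
)

# Horizontal fragments: selection (WHERE) picks whole rows by a predicate.
pune = conn.execute("SELECT * FROM employee WHERE city = 'Pune' ORDER BY id").fetchall()
others = conn.execute("SELECT * FROM employee WHERE city <> 'Pune' ORDER BY id").fetchall()

# Vertical fragments: projection (the column list) picks columns.
# Both fragments keep id so the rows can be joined back together.
personal = conn.execute("SELECT id, name, city FROM employee ORDER BY id").fetchall()
payroll = conn.execute("SELECT id, salary FROM employee ORDER BY id").fetchall()

# Reconstruction: union the horizontal fragments, join the vertical ones on id.
original = conn.execute("SELECT * FROM employee ORDER BY id").fetchall()
from_horizontal = sorted(pune + others)
salary_by_id = dict(payroll)
from_vertical = [(i, name, city, salary_by_id[i]) for i, name, city in personal]

print(from_horizontal == original, from_vertical == original)
```

Both comparisons hold, which shows that neither kind of fragmentation loses information as long as the fragments together cover every row and every column.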
Mixed fragmentation
• Refers to the combination of horizontal and vertical fragmentation, that is, the division of a relation into subsets of both rows and columns.
• Each fragment is stored at a different node.
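As a rough illustration, a mixed fragment can be expressed as a query that applies selection and projection together. The employee table and its data are hypothetical, and Python's built-in sqlite3 module stands in for a real distributed system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, city TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Pune", 50000), (2, "Ravi", "Delhi", 60000), (3, "Meera", "Pune", 55000)],
)

# Mixed fragment: WHERE restricts the rows, the column list restricts the columns.
pune_payroll = conn.execute(
    "SELECT id, salary FROM employee WHERE city = 'Pune' ORDER BY id"
).fetchall()
print(pune_payroll)
```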
Data replication
• Refers to the storage of copies of data at multiple sites connected by a computer network.
• Each copy is stored at a different node.
• Data copies can help reduce communication and response time.
Suppose database F is divided into two fragments, F1 and F2. In a replicated distributed database, the following arrangement is possible:
• Fragment F1 is stored at sites S1 and S3.
• Fragment F2 is stored at sites S2 and S3.
So, with data replication, the same copy of data is available at more than one site.
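The placement of F1 and F2 described here can be written down as a small lookup table to see what replication buys in terms of availability. This is a sketch only; the names placement, sites_for and available are hypothetical and do not come from any particular DDBMS.

```python
# Fragment -> set of sites holding a copy, as in the F1/F2 example.
placement = {"F1": {"S1", "S3"}, "F2": {"S2", "S3"}}


def sites_for(fragment, failed=frozenset()):
    """Return the sites that can still serve a fragment, sorted by name."""
    return sorted(placement[fragment] - set(failed))


def available(failed):
    """True if every fragment still has at least one live copy."""
    return all(sites_for(fragment, failed) for fragment in placement)


print(sites_for("F1"))           # all copies of F1
print(sites_for("F1", {"S3"}))   # copies left if S3 fails
print(available({"S1", "S2"}))   # S3 alone holds both fragments
print(available({"S1", "S3"}))   # both copies of F1 are lost
```

With this placement the database survives the loss of any single site, but not the simultaneous loss of both sites holding the same fragment.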
Data replication can be performed in three ways:
• Fully replicated database: stores multiple copies of each database fragment at multiple sites. All database fragments are replicated.
• Partially replicated database: stores multiple copies of some database fragments at multiple sites. Most DDBMSs are able to handle a partially replicated database well.
• Unreplicated database: stores each database fragment at a single site.
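The three schemes can be told apart by counting the copies of each fragment. The sketch below follows the definitions given above; the function name classify and the placement dictionaries are hypothetical. Some textbooks use a stricter definition of full replication, in which every site holds a copy of the whole database, and the sketch does not check for that.

```python
def classify(placement):
    """Classify a placement (fragment -> set of sites) by its replication scheme."""
    copies = [len(sites) for sites in placement.values()]
    if all(n > 1 for n in copies):
        return "fully replicated"       # every fragment has more than one copy
    if any(n > 1 for n in copies):
        return "partially replicated"   # only some fragments have extra copies
    return "unreplicated"               # each fragment lives at a single site


print(classify({"F1": {"S1", "S3"}, "F2": {"S2", "S3"}}))
print(classify({"F1": {"S1", "S3"}, "F2": {"S2"}}))
print(classify({"F1": {"S1"}, "F2": {"S2"}}))
```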