Temporal Databases. Outline Spatial Databases Indexing, Query processing Temporal Databases Spatio-temporal ….

Slides:



Advertisements
Similar presentations
External Memory Hashing. Model of Computation Data stored on disk(s) Minimum transfer unit: a page = b bytes or B records (or block) N records -> N/B.
Advertisements

I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
Advanced Databases Temporal Databases Dr Theodoros Manavis
Chapter 11 Indexing and Hashing (2) Yonsei University 2 nd Semester, 2013 Sanghyun Park.
B+-Trees (PART 1) What is a B+ tree? Why B+ trees? Searching a B+ tree
B+-trees. Model of Computation Data stored on disk(s) Minimum transfer unit: a page = b bytes or B records (or block) N records -> N/B = n pages I/O complexity:
Multiversion Access Methods - Temporal Indexing. Basics A data structure is called : Ephemeral: updates create a new version and the old version cannot.
Temporal Indexing Snapshot Index. Transaction Time Environment Assume that when an event occurs in the real world it is inserted in the DB A timestamp.
B+-tree and Hashing.
Spatio-Temporal Databases
Temporal Databases. Outline Spatial Databases Indexing, Query processing Temporal Databases Spatio-temporal ….
Spatio-Temporal Databases. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases …..
I/O-Algorithms Lars Arge University of Aarhus February 13, 2005.
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Data Indexing Herbert A. Evans. Purposes of Data Indexing What is Data Indexing? Why is it important?
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Temporal Databases. Outline Spatial Databases Indexing, Query processing Temporal Databases Spatio-temporal ….
Time Chapter 10 © Worboys and Duckham (2004)
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part B Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Spatio-Temporal Databases. Introduction Spatiotemporal Databases: manage spatial data whose geometry changes over time Geometry: position and/or extent.
1 Geometric index structures April 15, 2004 Based on GUW Chapter , [Arge01] Sections 1, 2.1 (persistent B- trees), 3-4 (static versions.
B + -Trees (Part 1) Lecture 20 COMP171 Fall 2006.
Chapter 4: Transaction Management
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
Tirgul 6 B-Trees – Another kind of balanced trees Problem set 1 - some solutions.
B + -Trees (Part 1) COMP171. Slide 2 Main and secondary memories  Secondary storage device is much, much slower than the main RAM  Pages and blocks.
Chapter 12 Trees. Copyright © 2005 Pearson Addison-Wesley. All rights reserved Chapter Objectives Define trees as data structures Define the terms.
1 Database Tuning Rasmus Pagh and S. Srinivasa Rao IT University of Copenhagen Spring 2007 February 8, 2007 Tree Indexes Lecture based on [RG, Chapter.
Homework #3 Due Thursday, April 17 Problems: –Chapter 11: 11.6, –Chapter 12: 12.1, 12.2, 12.3, 12.4, 12.5, 12.7.
Spatio-Temporal Databases. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases …..
B + -Trees COMP171 Fall AVL Trees / Slide 2 Dictionary for Secondary storage * The AVL tree is an excellent dictionary structure when the entire.
CS4432: Database Systems II
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
 B+ Tree Definition  B+ Tree Properties  B+ Tree Searching  B+ Tree Insertion  B+ Tree Deletion.
Database Management 8. course. Query types Equality query – Each field has to be equal to a constant Range query – Not all the fields have to be equal.
1 B Trees - Motivation Recall our discussion on AVL-trees –The maximum height of an AVL-tree with n-nodes is log 2 (n) since the branching factor (degree,
Binary Trees, Binary Search Trees RIZWAN REHMAN CENTRE FOR COMPUTER STUDIES DIBRUGARH UNIVERSITY.
B + -Trees. Motivation An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations.
CMSC 341 B- Trees D. Frey with apologies to Tom Anastasio.
Starting at Binary Trees
Lecture 11COMPSCI.220.FS.T Balancing an AVLTree Two mirror-symmetric pairs of cases to rebalance the tree if after the insertion of a new key to.
3.1. Binary Search Trees   . Ordered Dictionaries Keys are assumed to come from a total order. Old operations: insert, delete, find, …
1 Multi-Level Indexing and B-Trees. 2 Statement of the Problem When indexes grow too large they have to be stored on secondary storage. However, there.
SPATIO-TEMPORAL DATABASES Temporal Databases. Temporal Data. Modeling Temporal Data Temporal Semantics Temporal density: the time is seen as being: 
Indexing Database Management Systems. Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files File Organization 2.
Temporal Data Modeling
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Spatio-Temporal Databases. Term Project Groups of 2 students You can take a look on some project ideas from here:
CPT-S Advanced Databases 11 Yinghui Wu EME 49.
Trees CSIT 402 Data Structures II 1. 2 Why Do We Need Trees? Lists, Stacks, and Queues are linear relationships Information often contains hierarchical.
8/3/2007CMSC 341 BTrees1 CMSC 341 B- Trees D. Frey with apologies to Tom Anastasio.
SPECIAL PURPOSE DATABASES 13/09/ Temporal Database Concepts  Time is considered ordered sequence of points in some granularity Use the term choronon.
Database Applications (15-415) DBMS Internals- Part III Lecture 13, March 06, 2016 Mohammad Hammoud.
Presenters : Virag Kothari,Vandana Ayyalasomayajula Date: 04/21/2010.
CSE 373 Data Structures Lecture 7
Data Indexing Herbert A. Evans.
Spatio-Temporal Databases
Multiway Search Trees Data may not fit into main memory
Temporal Indexing MVBT.
Temporal Indexing MVBT.
B+-Trees.
B+ Tree.
Database Applications (15-415) DBMS Internals- Part III Lecture 15, March 11, 2018 Mohammad Hammoud.
Spatio-Temporal Databases
B+-Trees and Static Hashing
Temporal Databases.
B+Trees The slides for this text are organized into chapters. This lecture covers Chapter 9. Chapter 1: Introduction to Database Systems Chapter 2: The.
Temporal Databases.
RUM Conjecture of Database Access Method
Trees CSE 373 Data Structures.
Presentation transcript:

Temporal Databases

Outline Spatial Databases Indexing, Query processing Temporal Databases Spatio-temporal ….

Temporal DBs – Motivation Conventional databases represent the state of an enterprise at a single moment of time Many applications need information about the past Financial (payroll) Medical (patient history) Government Temporal DBs: a system that manages time varying data

Comparison Conventional DBs: Evolve through transactions from one state to the next Changes are viewed as modifications to the state No information about the past Snapshot of the enterprise Temporal DBs: Maintain historical information Changes are viewed as additions to the information stored in the database Incorporate notion of time in the system Efficient access to past states

Temporal Databases Temporal Data Models: extension of relational model by adding temporal attributes to each relation Temporal Query Languages: TQUEL, SQL3 Temporal Indexing Methods and Query Processing

Taxonomy of time Transaction time databases Transaction time is the time when a fact is stored in the database Valid time databases: Valid time is the time that a fact becomes effective in reality Bi-temporal databases: Support both notions of time

Example Sales example: data about sales are stored at the end of the day Transaction time is different than valid time Valid time can refer to the future also! Credit card: 03/13-04/16

Transaction Time DBs Time evolves discretely, usually is associated with the transaction number: A record R is extended with an interval [t.start, t.end). When we insert an object at t1 the temporal attributes are updated -> [t1, now) Updates can be made only to the current state! Past cannot be changed “Rollback” characteristics T1 -> T2 -> T3 -> T4 ….

Transaction Time DBs Deletion is logical (never physical deletions!) When an object is deleted at t2, its temporal attribute changes from [t1, now)  [t1, t.t2) (lifetime) Object is “ alive ” from insertion to deletion time, ex. t1 to t2. If “ now ” then the object is still alive eidsalarystartend 1020K9/9310/ K4/94* 3330K5/946/ K1/95* time

Transaction Time DBs Database evolves through insertions and deletions id

Transaction Time DBs Requirements for index methods: Store past logical states Support addition/deletion/modification changes on the objects of the current state Efficiently access and query any database state

Transaction Time DBs Queries: Timestamp (timeslice) queries: ex. “ Give me all employees at 05/94 ” Range-timeslice: “ Find all employees with id between 100 and 200 that worked in the company on 05/94 ” Interval (period) queries: “ Find all employees with id in [100,200] from 05/94 to 06/96 ”

Valid Time DBs Time evolves continuously Each object is a line segment representing its time span (eg. Credit card valid time) Support full operations on interval data: Deletion at any time Insertion at any time Value change (modification) at any time (no ordering)

Valid Time DBs Deletion is physical: No way to know about the previous states of intervals The notion of “ future ”, “ present ” and “ past ” is relative to a certain timestamp t

Valid Time DBs The reality “ best know now ! ”

Valid Time DBs Requirements for an Index method: Store the latest collection of interval-objects Support add/del/mod changes to this collection Efficiently query the intervals in the collection Timestamp query Interval (period) query

Bitemporal DBs A transaction-time Database, but each record is an interval (plus the other attributes of the record) Keeping the evolution of a dynamic collection of interval-objects At each timestamp, it is a valid time database

Bitemporal DBs

Requirements for access methods: Store past/logical states of collections of objects Support add/del/mod of interval objects of the current logical state Efficient query answering

Temporal Indexing Straight-forward approaches: B+-tree and R-tree Problems? Transaction time: Snapshot Index, TSB-tree, MVB-tree, MVAS Valid time: Interval structures: Segment tree, even R-tree Bitemporal: Bitemporal R-tree

Temporal Indexing Lower bound on answering timeslice and range-timeslace queries: Space O(n/B), search O(log B n + s/B) n: number of changes, s: answer size, B page capacity

Transaction Time Environment Assume that when an event occurs in the real world it is inserted in the DB A timestamp is associated with each operation Transaction timestamps are monotonically increasing Previous transactions cannot be changed  we cannot change the past

Example Time evolving set of objects: employees of a company Time is discrete and described by a succession of non-negative integers: 1,2,3, … Each time instant changes may happen, i.e., addition, deletion or modification We assume only insertion & deletion : modifications can be represented by a deletion and an insertion

Records Each object is associated with: 1.An oid (key, time invariant, eid) 2.Attributes that can change (salary) 3.A lifespan interval [t.start, t.end) An object is alive from the time it inserted in the set, until it was deleted At insertion time deletion is unknown Deletions are logical: we change the now variable to the current time, [t1, now)  [t1, t2)

Evolving set The state S(t) of the evolving set at t is defined as: “ the collection of alive objects at t ” The number of changes n represents a good metric for the minimal space needed to store the evolution

Evolving sets A new change updates the current state S(t) to create a new state time a a,h S(t i ) a,f,g titi t1t1 t2t2

Transaction-time Queries Pure-timeslice Range-timeslice Pure-exact match

Temporal Indexing Straight-forward approaches: B+-tree and R-tree Problems? Transaction time: Snapshot Index, TSB-tree, MVB-tree, MVAS Valid time: Interval structures: Segment tree, even R-tree Bitemporal: Bitemporal R-tree

Temporal Indexing Lower bound on answering timeslice and range-timeslace queries: Space O(n/B), search O(log B n + s/B) n: number of changes, s: answer size, B page capacity

Snapshot Index Snapshot Index, is a method that answers efficiently pure-timeslice queries Based on a main memory method that solves the problem in O(a+log 2 n), O(n)space External memory: O(a/B + log B n)

MM solutions Copy approach: O(a + logn) but O(n 2 ) space Log approach: O(n) space but O(n) query time We should combine the fast query time with the small space (and update)

Assumptions Assumptions (for clarity) At each time instant there exist exactly one change Each object is named by its creation time

Access Forest A double linked list L. Each new object is appended at the right end of L A deleted object is removed from L and becomes the next child of its left sibling in L Each object stores a pointer to its parent object. Also a parent object points to its first and last children

AF example SP

Additional structures A hashing scheme that keeps pointers to the positions of the alive elements in L An array A that stores the time changes. For each time change instant, it keeps the pointer to the current last object in L

Properties of AF In each tree of the AF the start times are sorted in preorder fashion The lifetime of an object includes the lifetimes of its descendants The intervals of two consecutive children under the same parent may have the following orderings: s i < e i < s i+1 <e i+1 or s i < s i+1 <e i < e i+1

Searching Find all objects alive at t q Use A to find the starting object in the access forest L (the last object in L at time t q ) (O(logn)) Traverse the access forest and report all alive objects at t q O(a) using the properties

Searching in AF Given query time q: Use table A to find the time of the last object in L at time q. Say node Y. Starting from Y go up (if it has a parent) recursively For each node in the path from Y to the current node in list L (or Y itself if it is right now in L) recursively: Visit the rightmost child in its subtree Visit the left sibling Stop when you find a node non in the result

Disk based Solution Keep changes in pages as it was a Log Use hashing scheme to find objects by name (update O(1)) Acceptor : the current page that receives objects

Definitions A page is useful for the following time instants: I-useful: while this page was the acceptor block II-useful: for all time instants for which it contains at least uB “ alive ” records u is the usefulness parameter

Properties 1) Each time only 1 block is I-useful 2) Times that a block is II-useful are continuous 3) A block has exactly one I-useful and at most one II-useful intervals.

Meta-evolution From the actual evolution of objects, now we have the evolution of pages! meta- evolution The “ lifetime ” of a page is its usefulness!!

Searching Find all alive objects at t q  Find all useful pages at t q One I-useful at tq All II-useful at tq The search can be done in O(a/B + log B n) using the AF for the pages! SR is used to implement the AF!

Copying procedure To maintain the answer in few pages we need clustering: controlled copying If a page has less than uB alive objects, we artificially delete the remaining alive objects and copy them to the acceptor bock

Optimal Solution We can prove that the SI is optimal for pure-timeslice queries: O(n/B) space, O(a/B + log B n) query and O(1) update (expected, using hashing)

Space Need to prove that copying does not inflate the space overhead Lemma 1: total space is O(n/B) even with copying and update constant See SI paper

Query time Based on the properties of block usefulness and AF structure! See Lemma 2 and Theorem 1 Notice update is efficient using a hashing shceme Also, notice that we need to update AT on meta-changes!! (real changes that affect changes in the block useful intervals)