PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Slides:



Advertisements
Similar presentations
Temporal Constraints Time and Time Again: The Many Ways to Represent Time James F Allen.
Advertisements

Serializable Isolation for Snapshot Databases Michael J. Cahill, Uwe Röhm, and Alan D. Fekete University of Sydney ACM Transactions on Database Systems.
Advanced Databases Temporal Databases Dr Theodoros Manavis
By César Urdaneta.  Purpose ◦ Replicate records from different tables (for inserting / updating record), from a source database to a target one, keeping.
Data Design The futureERD - CardinalityCODINGRelationshipsDefinition.
Representing Data Elements Gayatri Gopalakrishnan.
CS240A: Databases and Knowledge Bases Time Ontology and Representations Carlo Zaniolo Department of Computer Science University of California, Los Angeles.
Temporal Databases. Outline Spatial Databases Indexing, Query processing Temporal Databases Spatio-temporal ….
Temporal Databases. Outline Spatial Databases Indexing, Query processing Temporal Databases Spatio-temporal ….
Concurrency Control by Validation (18.9) By: Pushkar Marathe Id: 217.
Spatio-Temporal GIS Philip Sargent May 25th 1998.
Data Warehouse View Maintenance Presented By: Katrina Salamon For CS561.
CS240A: Databases and Knowledge Bases Time Ontology and Representations Carlo Zaniolo Department of Computer Science University of California, Los Angeles.
Professor Michael J. Losacco CIS 1150 – Introduction to Computer Information Systems Databases Chapter 11.
Event Processing Symposium CEP for Customer Acquisition And Customer Service.
Introduction to Database Management. 1-2 Outline  Database characteristics  DBMS features  Architectures  Organizational roles.
Chapter Eight Database ( 資料庫 ) Applications and Implications.
Understanding SQL Server 2008 Change Data Capture Bret Stateham Training Manager Vortex Learning Solutions blogs.netconnex.com.
EOVSA Prototype Review MONITOR DATABASE EVOSA Project Review Meeting MONITOR DATABASE September GIL JEFFER.
Introduction to Computers Lesson 10B. home Database A collection of related data or facts.
USE OF TEMPORAL CONCEPTS IN TRANSACTIONAL DATABASES.
2-1 Skyline College Chapter Business Transactions The accounting process starts with the analysis of business transactions. A business transaction.
DAY 14: MICROSOFT ACCESS – CHAPTER 1 Madhuri Siddula October 1, 2015.
Quiz questions. 1 A data structure that is made up of fields and records? Table.
Music Festival Evaluating Leaflets. Is there an image? If so is it related to the theme? Are the name of the event and the acts which are playing Identified?
Intro to Databases Vocabulary Copyright © Texas Education Agency, All rights reserved.
CS240A: Databases and Knowledge Bases Time Ontology and Representations Carlo Zaniolo Department of Computer Science University of California, Los Angeles.
I NTRODUCTION TO DATABASE By: Afraa Sayah. I NTRODUCTION Data: - Known facts and can have implicit meaning. - Types: text, date, number… Database: - A.
Distributed Time Series Database
CS240A: Databases and Knowledge Bases Temporal Databases Carlo Zaniolo Department of Computer Science University of California, Los Angeles.
Temporal Data Modeling
Where does time go ?. Applications abound Temporal database systems provide built-in support for recording and querying time-varying information Application.
10 1 Chapter 10 - A Transaction Management Database Systems: Design, Implementation, and Management, Rob and Coronel.
1 DBS201: More on SQL Lecture 2. 2 Agenda Select command review How to create a table How to insert data into a table.
CS240A: Databases and Knowledge Bases TSQL2 Carlo Zaniolo Department of Computer Science University of California, Los Angeles Notes From Chapter 6 of.
DATA MANAGEMENT AND DATABASES. Data Management Data management is the process of controlling the information generated during a research project or transaction.
SWBAT… graph piecewise and absolute value functions Agenda 1. Warm Up (15 min) 2. Absolute value and piecewise functions (30 min) Warm-Up: 1. Take out.
CPT-S Advanced Databases 11 Yinghui Wu EME 49.
Database Overview What is a database? What types of databases are there? How are databases more powerful than spreadsheets?
SPECIAL PURPOSE DATABASES 13/09/ Temporal Database Concepts  Time is considered ordered sequence of points in some granularity Use the term choronon.
Presenters : Virag Kothari,Vandana Ayyalasomayajula Date: 04/21/2010.
Trigger used in PosgreSQL
Fundamentals of Databases
records Database Vocabulary It can be useful to collect information.
Database Management  .
Tips Need to Consider When Organizing a College Event
Database Vs. Data Warehouse
Chap 2: A Prelude to Parametric Data
Reporting An In-Depth Guide.
Temporal Databases.
ماجستير إدارة المعارض من بريطانيا
What is a Database? A collection of data organized in a manner that allows access, retrieval, and use of that data.
Typically data is extracted from multiple sources
12.4: Effects of Inflation 1. Decreasing value of the dollar
A practical IT based tool to improve distributors' quality of supply
Physical Storage materialized views indexes partitions ETL CDC.
Database systems Lecture 3 – SQL + CRUD
Temporal Databases.
Name of Event Name of Event
№96 сонли умумий ўрта мактабининг ўқитувчиси Эшанкулова феруза
Christopher M. Pascucci
Database Assignment Write down your answers in word document with file name highlighting your name, student number, and class. E.g “95002”+”_”+ “03 class”+”_”+”name”,
Author: Miran Miklič Comparison and advantages of the Bitemporal and Temporal databases over Tuple-versioning (point-in-time) database.
Thing / Person:____________________ Dates:_________________
CS240A: Databases and Knowledge Bases TSQL2
Record your QUESTIONS as your read.
Analytics, BI & Data Integration
Section 13.2: Understanding Inverse Functions
CS240A: Databases and Knowledge Bases A Taxonomy of Temporal DBs
Leveraging Analytics to Maximize Updater Efficiency
Presentation transcript:

PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009

Agenda What kind of temporal data do we need? What data types does PostgreSQL offer? Temporality Representations Time Travel, Transaction Tables, Serial Numbers

What kind of temporal data do we need? Databases store facts about objects and events Interesting times include When an event took place When the event was recorded When someone was charged for the event

More Interesting Times When you start recognizing income on the event When you end recognizing income on the event When an object state begins When an object state ends

PostgreSQL Data Types Date Problem: Pre-assumes evaluation of cutoff between days! Time with/without timezone Problem: Comparisons of Date+Time turn into hideous SQL Timestamp Combines Date + Time

PostgreSQL Data Types Timestamp with time zone Allows collecting time in ‘local times’ and recognizing that Interval Difference between two times/timestamps Very useful for indicating duration of time ranges

Operators time/timestamp/date +|- interval = time/timestamp/date timestamp - timestamp = interval (likewise for the others) timestamp, >= timestamp A BETWEEN B AND C A >= B and A = B and A <= C

Variations on “when is it???” NOW(), transaction_timestamp, current_timestamp all providing start of transaction statement_timestampclock_timestamp transaction commit timestamp - not available!

Commit Timestamp Useful representation: Tables record (serverID, ctid) At COMMIT time, if the transaction has used this, then insert (serverID, ctid, clock_timestamp) into timestamp table Eliminates Slony-I “SYNC” thread & simplifies queries Helpful for multimaster replication strategies Adds a table full of timestamps that needs cleansing :-(

PGTemporal PgFoundry project implementing (timestamp,timestamp) type + all logical operations First aspect: Supports inclusive & exclusive periods [ From, To ], ( From, To ), [ From, To ), ( From, To ] [ and ] indicate “inclusive” periods beginning and ending at the specified moment ( and ) indicate exclusive periods excluding endpoints

Inclusion & Exclusion Commonly, [From, To) is the ideal representation Today’s data easily characterized as [ , ) This month’s period: [ , ) Successive periods do not overlap [ , ),[ , ) Note that SQL “BETWEEN” is equivalent to [From,To]

A Veritable Panoply of Operators length(p), first(p), last(p), prior(p), next(p) contains(p, t), contains(p1, p2), contained_by(t, p), contained_by(p1,p2), overlaps(p1,p2), adjacent(p1,p2), overleft(p1,p2), overright(p1,p2), is_empty(p), equals(p1,p2), nequals(p1,p2), before(p1,p2), after(p1,p2) period(t), period(t1,t2), empty_period() period_intersect(p1,p2), period_union(p1,p2), minus(p1,p2)

Core???? Should PGTemporal be in core? What would be needed for it to head in?

Classical SQL Temporality Developing Time-Oriented Database Applications in SQL - Richard Snodgrass, available freely as PDF Uses periods much as in PGTemporal Standard SQL does not support periods, alas! Considerable attention to handling insertion of past/future history

Foreign Key Challenges Nontemporal tables: No temporality, No problem! Referencing table is temporal, referenced table isn’t: No problem! Referenced table is temporal Troublesome! Referential integrity may be violated simply via passage of time Referenced & referencing tables may vary independently!

PostgreSQL Time Travel Take a stateful table Add triggers to capture (From,To) timestamps on INSERT, UPDATE, DELETE Sadly, this breaks if you require referential integrity constraints pointing to this table :-(

Time Travel Actions On INSERT (NEW.From, NEW.To) = (NOW(), NULL) On DELETE (OLD.From, OLD.To) = (PrevValue, NOW()) On UPDATE Transforms into DELETE old, INSERT new

Pulling Specific State Current state: select * from table where endtime is NULL State at a particular time: Set Returning Function select * from table_at_time(ts) Pulls tuples effective at that time starttime <= ts endtime is null or endtime >= ts

Explicit Temporal Tables Accept that it’s temporal to begin with Not just a way to get “history for free” Enables Science Fiction: Declaring future state! At 9am next Wednesday, state will change Eliminates need for “batch jobs” May need to pre-record future-dated events!

Science Fiction....

Problems Foreign key references into temporal tables are problematic Overlap? Reference disappearing? Fixing problems requires “fabricating a historical story” not just “fixing the state”

Temporality via Tx References create table transactions ( tx_id integer primary key default nextval(‘tx_seq’), whodunnit integer not null references users(user_id), and_when timestamptz not null default NOW()); create table slightly_temporal_object ( object_id serial primary key, tx_id integer not null default currval(‘tx_seq’) references transactions(tx_id));

Getting More Temporal - I Add ON UPDATE trigger that updates tx_id to currval(‘tx_seq’)

More Temporal: History! Create a “past history” table Similar schema, but drop all data validation Add end_tx UPDATE/DELETE throw obsolete tuples into the “past history table” Data validation dropped because validation can change over time

Serial Number Temporality Used in DNS Sets of updates grouped together temporally A “bump of serial number” indicates common publishing at a common point in time

ObjectValueZoneFromTo ns1.abc.org org13 ns1.abc.org org3 ns2.abc.org org2 ns3.abc.org org13 ns1.abc.org info1719 ns2.abc.org info1418 ns2.abc.org info18 ns3.abc.org info19

Zone Representation Merits It’s fast. We extract multimillion record zones in minutes Arbitrary ability to roll back... Nicely supports DNS AXFR/IXFR operations Each serial # represents a sort of “Logical Commit”

Further Merits of this Rename “zone” to “module” and this is nice for configuration We already know it supports large amounts of data efficiently Configuration is smaller (we hope!)

Demerits of zone-like structure No way to specify a point of time in the future Serial numbers are intended to just keep rolling along HOWEVER.... With complex apps & configuration, fancier temporality looks like a misfeature

Conclusions 3 ways to represent temporal information Timestamps, Transaction IDs, Serial numbers PostgreSQL changes possible Should PGtemporal be added to “core”? Should we try to have temporal foreign key functionality in core?