Conceptual Architecture of PostgreSQL S-Queue-L Khurrum A Mujeeb, Adam Abu Hijleh, Adam Ali Stephen McDonald, Wisam Zaghal CISC 322 - Fall 2010
Overview What is Postgres? Research Methods Considered Alternatives Reference Architecture Conceptual Architecture Inside Subsystems – Query Processor Inside Subsystems – Storage Manager Inside Subsystems – Utilities Use Case Concurrency Control Design Trade-offs Limitations of Research Lessons Learned Summary Q & A
What is PostgreSQL? Open-Source database management system ‘Ingres Project’ at UC Berkeley First Postgres version released in 1997 Cross-Platform Written in C Used by organisations such as:- Yahoo MySpace Skype
Research Methods General understanding of PostgreSQL Developers guide PostgreSQL wiki page PostgreSQL manual Wikipedia Reference architecture for Database Management System Backbone of conceptual architecture Conceptual architecture for PostgreSQL Various available online documentation of Conceptual Architectures of PostgreSQL
Considered Alternatives 1. Client – Server 2. Client – Server w/ Pipe & Filter 3. Client – Server w/ Pipeline & Repository
Reference Architecture Figure. 1
Conceptual Architecture Client Communications Manager Legend Dependencies Utilities & Shared Components Server (Query Processor) Storage Manager Figure 2.
Query Processor Figure 3.
Inside Subsystems Query Processor Consists of :- Parser: syntax Traffic Cop : simple/complex Utility Command: simple queries Rewriter: rule augmentation Planner/Optimizer: optimal plan Executor: execute optimal plan Models a Pipe & Filter style Architecture Uses storage management & shared utilities
Inside Subsystems Storage Manager Legend Provides Shared memory for buffers & access to database. Suggests repository style Figure 4.
Inside Subsystems Utilities Legend Consists of : Utilities Catalog Access Methods Nodes/Lists Utilities are used by all sub-components of the query processor Figure 5.
Use Case – Select Query Figure 6.
Concurrency Control Postmaster spawns multiple server threads (process per request) Problem - overwriting or modifying data Solution… - MVCC – Multi-version concurrency control - Point in time DB snapshot - Locks – locks entire table from being altered/deleted
Design Trade-offs Reliability vs Performance Scalability vs Maintainability Security vs Performance
Limitations of Research Personal Knowledge as well as experience with architectures & databases Determining depth of research Sources are incomplete
Lessons Learned Cannot rely on one source for information, will have to go through several sources to build a complete picture Hard to decide on an architecture style The value of the reference architecture
Summary Hybrid Conceptual Architecture Design Attributes Client Server – front/back connection Pipe & Filter – back end processes Repository – storage management/access Design Attributes Reliable & Secure - data integrity, strict SQL compliance, user authentication Performance - slower and more complicated
Thank You! Questions?