Persistent Programming with ZODB
10th International Python Conference, Alexandria, Virginia, February 4, 2002
Jeremy Hylton and Barry Warsaw

Slide 2: What is Persistence? (©2001 Zope Corporation. All Rights Reserved.)
- Automatic management of object state; maintained across program invocation
- Frees programmer from writing explicit code to dump objects into files
- Allows programmer to focus on object model for application

Slide 3: ZODB Approach to Persistence
- Minimal impact on existing Python code (transparency)
- Serialization (pickle) to store objects
- Transactions to control updates
- Pluggable backend storages to write to disk

Slide 4: Alternatives to ZODB
- Many:
  - flat files, relational database, structured data (XML), BerkeleyDB, shelve
- Each has limitations
  - Seldom matches app object model
  - Limited expressiveness / supports few native types
  - Requires explicit app logic to read and write data

Slide 5: ZODB -- the Software
- Object database for Zope
  - Designed by Jim Fulton
  - Started as BoboPOS
- Extracted for non-Zope use
  - Andrew Kuchling
- Source release w/distutils from Zope Corp.
  - January 2002
- Wiki:
  - info central for ZODB

Slide 6: Software architecture
- StandaloneZODB packages
  - Persistence, ZODB, ZEO
  - ExtensionClass, sundry utilities
- ZODB contains
  - DB, Connection
  - Several storages
- Compatibility
  - Runs with Python 2.0 and higher
  - ExtensionClass has some limitations
    - No cycle GC, no weak refs, …

Slide 7: ZODB Architecture (1)
[Diagram; box labels: Persistent Application, Persistence, ZODB, Database, Transaction, Storage, ZEO]

Slide 8: Public Components
- Components with public APIs
  - Database
    - allows application to open connections
    - connection: app interface for accessing objects
  - Transaction: app interface for making changes permanent
  - Persistent base class
    - Logically distinct from ZODB

Slide 9: Internal Components
- Storage
  - manages persistent representation on disk
- ZEO
  - Shares storage among multiple processes, machines

Slide 10: Future ZODB Architecture
- ZODB4 will isolate components
  - Persistent, Transaction interfaces separate
  - Database, Storage stay in ZODB
- Advantages
  - Allows other databases, e.g. object-relational mapping
  - Use Persistence, Transaction without ZODB

Slide 11: Key ZODB Concepts
- Persistence by reachability
- Transactions
- Resource management
  - Multiple threads
  - Memory and caching

Slide 12: Persistence by Reachability
- All objects reachable from root stored in database
  - Root mapping provided by database
- Each persistent object stored independently
  - use pickle
  - all non-persistent attributes included
  - customize with __getstate__()
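The reachability rule can be pictured as a graph walk that starts at the root mapping. The sketch below is plain Python, not ZODB: `Node`, `children`, and `reachable` are hypothetical names chosen for illustration, and real ZODB discovers references while pickling rather than through an explicit traversal like this one.

```python
class Node:
    """Hypothetical stand-in for a persistent object."""

    def __init__(self, name):
        self.name = name
        self.children = []


def reachable(root):
    """Return the sorted names of every Node reachable from the root mapping."""
    seen = set()
    found = []
    stack = list(root.values())
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue  # already visited; also guards against cycles
        seen.add(id(obj))
        found.append(obj.name)
        stack.extend(obj.children)
    return sorted(found)


root = {}
a = Node("a")
b = Node("b")
orphan = Node("orphan")  # never linked from the root
a.children.append(b)
root["top"] = a

stored = reachable(root)
```

In this toy model `orphan` is never stored because nothing reachable from `root` refers to it.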

Slide 13: Transactions
- Coordinate update of objects
  - Modified objects associated with transaction
  - Commit makes modification persistent
  - Abort reverts to previous state
- Means to cope with failure
  - Conflicting updates
  - Something goes wrong with system

Slide 14: Resource Management
- Threads
  - One thread per transaction
  - Controlled sharing via transactions
- Memory
  - Database contains many objects
    - Too many to fit in memory
  - Objects moved in and out of memory
    - ZODB manages this automatically
    - Knobs exposed to applications

Slide 15: Writing Persistent Applications
- This section will:
  - Introduce a simple application
  - Show how to make it persistent

Slide 16: (Very) Simple Group Calendar
- Calendar which can display a whole month, or a single day, with events
- Can create new appointments with rendezvous information
- Can invite people to an appointment

Slide 17: Group Calendar Objects
- Calendar: holds appointments keyed by subject and date (sorted)
- Person: name, core hours; later updated to just username, realname
- Appointment: holds date, duration, subject, location, list of participants
- (a driver script)

Slide 18: Required imports
- Applications must import ZODB first, either explicitly or implicitly through a package reference
- Importing ZODB has side-effects (this will be fixed in ZODB4).

    import ZODB
    from ZODB.DB import DB
    from ZODB.FileStorage import FileStorage
    from BTrees.OOBTree import OOBTree
    # Works as side-effect of importing ZODB above
    from Persistence import Persistent

Slide 19: Creating persistent classes
- All persistent classes must inherit from Persistence.Persistent

    from Persistence import Persistent

    class Person(Persistent):
        # …

Slide 20: Application boilerplate
- Create a storage
- Create a database object that uses the storage
- Open a connection to the database
- Get the root object (and perhaps add app collections)

    fs = FileStorage('cal.fs')
    db = DB(fs)
    conn = db.open()    # open() is called on the DB instance
    root = conn.root()
    if not root.has_key(collectionName):
        root[collectionName] = OOBTree()
        get_transaction().commit()

Slide 21: Using transactions
- After making changes
  - Get the current transaction
  - Commit or abort it

    calendar = root['calendar']
    calendar.add_appointment(app)
    get_transaction().commit()
    # …or…
    get_transaction().abort()

Slide 22: Writing persistent classes
- Persistent if reachable from the root
- Persistency by storing/loading pickles
- ZODB must know when an object is accessed or changed
- Automatic (transparent) for attribute access
- Some common Python idioms require explicit interactions

Slide 23: Persistence by reachability
- Persistent object must be reachable from the root object, which ZODB creates automatically

    person = Person(name, hours)
    people = root['people']
    if not people.has_key(name):
        people[name] = person
        get_transaction().commit()

Slide 24: What state is saved?
- Objects to be stored in ZODB must be picklable.
- ZODB pickles all object attributes
  - Looks in __dict__
  - Loads pickled state into __dict__
- Classes can override behavior
  - via __getstate__() and __setstate__()

Slide 25: References to other objects
- Sub-objects are pickled by value except:
  - Persistent sub-objects are pickled by reference
  - Classes, modules, and functions are pickled by name
- Upon unpickling instances, __init__() is not called unless the class defined a __getinitargs__() method at pickle-time
- See the Python 2.2 pickle module documentation for more rules regarding extension types, etc.

Slide 26: Automatic notice of changes
- Changes to an object via attribute access are noticed automatically by the persistence machinery
  - Implemented as tp_getattr hook in C

    person.name = 'Barry Warsaw'
    get_transaction().commit()

Slide 27: Mutable attributes
- Mutable non-Persistent sub-objects, e.g. builtin types (list, dict), instances
- Changes not caught by ZODB
  - Attribute hook only works for parent
  - Must mark parent as changed (_p_changed)

    class Appointment:
        # …
        def add_person(self, person):
            self.participants.append(person)
            self._p_changed = 1
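Why an in-place list mutation escapes notice can be reproduced without ZODB. The sketch below is a toy model under stated assumptions: `Tracked` is a hypothetical pure-Python base class that sets `_p_changed` from `__setattr__`, standing in for the C-level attribute hook that the real `Persistent` class uses.

```python
class Tracked:
    """Toy stand-in for a persistent base class (not ZODB's Persistent)."""

    def __init__(self):
        # Bypass our own hook while setting up the bookkeeping flag.
        object.__setattr__(self, "_p_changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "_p_changed":
            # Any ordinary attribute assignment marks the object as changed.
            object.__setattr__(self, "_p_changed", True)


class Appointment(Tracked):
    def __init__(self):
        super().__init__()
        self.participants = []
        self._p_changed = False  # pretend the object was just committed

    def add_person_unmarked(self, person):
        # Mutates the list in place: no attribute is assigned on self.
        self.participants.append(person)

    def add_person(self, person):
        self.participants.append(person)
        self._p_changed = True  # explicit mark, as on the slide


apt = Appointment()
apt.add_person_unmarked("barry")
missed = apt._p_changed    # the hook never fired
apt.add_person("jeremy")
noticed = apt._p_changed   # set explicitly by add_person
```

The append goes through the list object, so the parent's attribute hook never runs; only the explicit assignment to `_p_changed` records the change.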

Slide 28: PersistentMapping
- Persistent, near-dictionary-like semantics
  - In StandaloneZODB, inherits from UserDict
- It fiddles with _p_changed for you:

    >>> person.contacts
    >>> person.contacts['Barry'] = barry
    >>> get_transaction().commit()

Slide 29: PersistentList
- Provides list-like semantics while taking care of _p_changed fiddling
- In StandaloneZODB only (for now)
  - Inspired by Andrew Kuchling's SourceForge project (zodb.sf.net)
- Inherits from UserList

Slide 30: Handling unpicklable objects

    class F(Persistent):
        def __init__(self, filename):
            self.fp = open(filename)
        def __getstate__(self):
            # Store only the file name; file objects cannot be pickled
            return self.fp.name
        def __setstate__(self, filename):
            # Reopen the file when the object is loaded
            self.fp = open(filename)

    >>> root['files'] = F('/etc/passwd')
    >>> get_transaction().commit()
    >>>

Slide 31: Volatile attributes
- Attributes not to be stored persistently should be prefixed with _v_

    class F(Persistent):
        def __init__(self, filename):
            self._v_fp = open(filename)

    >>> root['files'] = F('/etc/passwd')
    >>> get_transaction().commit()
    # later…
    >>> root['files'].__dict__
    {}
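The effect of the `_v_` prefix can be approximated in plain Python by filtering the state before it is pickled. This is only a sketch of the idea, assuming a hypothetical `Cached` class; ZODB's `Persistent` base class applies the filtering itself, so application classes do not need to write this `__getstate__`.

```python
import pickle


class Cached:
    """Plain class emulating the _v_ convention by hand (not ZODB)."""

    def __init__(self, path):
        self.path = path
        self._v_lines = None  # volatile cache, rebuilt on demand

    def __getstate__(self):
        # Drop every attribute whose name starts with _v_.
        return {k: v for k, v in self.__dict__.items()
                if not k.startswith("_v_")}


obj = Cached("/etc/passwd")
obj._v_lines = ["cached", "data"]

# Pickle only the state, then rebuild an instance without calling __init__.
data = pickle.dumps(obj.__getstate__())
clone = Cached.__new__(Cached)
clone.__dict__.update(pickle.loads(data))
```

After the round trip `clone` carries `path` but no `_v_lines`, so code that uses a volatile attribute has to be prepared to recreate it.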

Slide 32: Python special methods
- ExtensionClass has some limits
  - post methods
  - Reversed binops, e.g. __radd__
  - Comparisons with other types
  - Ported to Python 2.2, but not being actively maintained.
- Not fundamental to approach
  - Future implementation will not use E.C.

Slide 33: Managing object evolution
- Methods and data can change
  - Add or delete attributes
  - Methods can also be redefined
- Classes stored by reference
  - Instances get whatever version is imported
- __get/setstate__() can handle data
  - Provide compatibility with old pickles, or
  - Update all objects to new representation

Slide 34: __setstate__()

    class Person(Persistent):
        def __init__(self, name):
            self.name = name

    >>> barry = Person('Barry Warsaw')
    >>> root['people']['barry'] = barry
    >>> get_transaction().commit()

Slide 35: __setstate__() cont

    class Person(Persistent):
        def __init__(self, username, realname):
            self.username = username
            self.realname = realname
        def __setstate__(self, d):
            # Old pickles hold a single 'name'; derive the new attributes
            self.realname = name = d['name']
            username = name.split()[0].lower()
            self.username = username
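A migration hook usually has to accept both the old and the new layout, since objects saved after the class change no longer carry a `name` key. The sketch below shows one way to do that in plain Python, without ZODB; the `load` helper is a hypothetical stand-in for the unpickling step, which rebuilds an instance without calling `__init__`.

```python
class Person:
    """Plain class (no ZODB) showing state migration in __setstate__."""

    def __init__(self, username, realname):
        self.username = username
        self.realname = realname

    def __setstate__(self, state):
        if "name" in state:
            # Old layout: a single 'name' attribute.
            realname = state["name"]
            state = {
                "username": realname.split()[0].lower(),
                "realname": realname,
            }
        self.__dict__.update(state)


def load(cls, state):
    """Rebuild an instance from a state dict, bypassing __init__."""
    obj = cls.__new__(cls)
    obj.__setstate__(state)
    return obj


old = load(Person, {"name": "Barry Warsaw"})
new = load(Person, {"username": "jeremy", "realname": "Jeremy Hylton"})
```

Both objects end up with `username` and `realname`; whether to also rewrite the old records on disk is the second option listed on the object-evolution slide.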

Slide 36: Transactions and Persistence
- This section will:
  - Explain the purpose of transactions
  - Show how to add transactions to app

Slide 37: Using Transactions
- ZODB adds builtin get_transaction()
  - Side-effect of import ZODB
- Each thread gets its own transaction
  - get_transaction() checks thread id
- Threads are isolated
  - Each thread should use its own DB connection
  - Changes registered with conn that loaded object
  - Synchronization occurs at transaction boundaries

Slide 38: ACID properties
- Atomic
  - All updates performed, or none
- Consistent
  - Responsibility of application
  - Changes should preserve object invariants
- Isolated
  - Each transaction sees consistent state
  - Transactions occur in serializable order
- Durable
  - After a commit, change will survive crash

Slide 39: Optimistic concurrency control
- Two approaches to isolation
  - Locking: transaction locks object it modifies
  - Optimistic: abort transactions that conflict
- ZODB is optimistic
  - Assume conflicts are uncommon
  - If conflict occurs, abort later transaction
- Effect on programming style
  - Any operation may raise ConflictError
  - Wrap all code in try/except for this
  - Redo transaction if it fails

Slide 40: Transaction boundaries
- Under application control
- Transaction begin is implicit
  - Begins when object loaded or modified
- get_transaction().commit()
  - Make changes permanent
- get_transaction().abort()
  - Revert to previously committed state

Slide 41: Write conflicts
- Transactions must be serializable
- Two transactions change object concurrently
  - Only one change can succeed
  - Other raises ConflictError on commit()
- Handling ConflictError
  - Abort transaction, and retry
  - Application-level conflict resolution
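The write-conflict rule can be modelled with a serial number per object: a store succeeds only if the writer read the revision that is still current. The following is a toy sketch, not ZODB's storage code; `ToyStorage` and the locally defined `ConflictError` are hypothetical stand-ins.

```python
class ConflictError(Exception):
    """Local stand-in for ZODB.POSException.ConflictError."""


class ToyStorage:
    """Keeps one (serial, value) pair per object id."""

    def __init__(self):
        self.data = {}

    def load(self, oid):
        return self.data.get(oid, (0, None))

    def store(self, oid, serial_read, value):
        current, _ = self.load(oid)
        if serial_read != current:
            # Someone else committed a newer revision since we read.
            raise ConflictError(
                "serial was %d, now %d" % (serial_read, current))
        self.data[oid] = (current + 1, value)


storage = ToyStorage()
storage.store("cal", 0, ["standup"])

# Two transactions read the same revision of the calendar.
serial_a, value_a = storage.load("cal")
serial_b, value_b = storage.load("cal")

# The first commit wins; the second one is rejected.
storage.store("cal", serial_a, value_a + ["refrigerator policy"])
try:
    storage.store("cal", serial_b, value_b + ["curly braces"])
    outcome = "committed"
except ConflictError:
    outcome = "conflict"
```

The losing transaction has to abort and either retry against the new revision or hand the three states to application-level conflict resolution.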

Slide 42: Conflicts and Consistency
- New method on Calendar object

    def make_appointment(self, apt, attendees):
        self.add_appointment(apt)
        for person in attendees:
            if person.is_available(apt.date, apt.duration):
                person.add_appointment(apt)
                apt.add_person(person)

  - Guarantees appointments don't conflict
- Consider two calls at same time
  - Data race on is_available()?
  - Conflict raised when object commits

Slide 43: Conflict Example

    def update1(cal, attendees):
        apt = Appointment('refrigerator policy', Time('2/5/ :00'), Time('0:30'))
        cal.make_appointment(apt, attendees)

    def update2(cal, attendees):
        apt = Appointment('curly braces', Time('2/5/ :00'), Time('1:00'))
        cal.make_appointment(apt, attendees)

- Two calls at once results in one error

    Traceback (most recent call last):
      File, line 1, in ?
      File ZODB/Transaction.py, line 233, in commit
      File ZODB/Connection.py, line 347, in commit
      File ZODB/FileStorage.py, line 634, in store
    ConflictError: database conflict error (serial was e55d, now c1bdd)

Slide 44: Read conflicts (1)
- What if transaction never commits?
  - Operation is read-only
  - Must still have consistent view
- Always read current object revision
  - If another transaction modifies the object, the current revision is not consistent
  - ReadConflictError raised in this case

Slide 45: Read conflicts (2)
- Example with transactions T1, T2
  - Sequence of operations
    - T1: Read O1
    - T2: Read O1, O2
    - T2: Write O1, O2
    - T2: Commit
    - T1: Read O2 (ReadConflictError)
  - Can't provide consistent view
    - T1 already saw old revision of O1
    - Can't read new revision of O2

Slide 46: Multi-version concurrency control
- Planned for ZODB4
  - Allow transactions to proceed with old data
    - In previous example, T1 would see version of O2 from before T2
  - Eliminate conflicts for read-only transactions
- Limited solution exists now
  - Define _p_independent()
    - Return true if it's safe to read old revision

Slide 47: Example transaction wrapper

    from ZODB.POSException import ConflictError

    def wrapper(func, retry=1):
        while 1:
            try:
                func()
                get_transaction().commit()
            except ConflictError:
                # Discard the failed attempt before retrying or giving up
                get_transaction().abort()
                if not retry:
                    raise
                retry -= 1
            else:
                break
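The retry pattern can be exercised without a database by passing the commit and abort operations in as callables. This is a sketch under stated assumptions: `run_with_retries` and the simulated `commit`, `abort`, and `work` functions are hypothetical, and `ConflictError` is defined locally in place of `ZODB.POSException.ConflictError`.

```python
class ConflictError(Exception):
    """Local stand-in for ZODB.POSException.ConflictError."""


def run_with_retries(func, commit, abort, retries=1):
    """Run func and commit; on ConflictError abort and retry.

    Returns the number of attempts used. Re-raises ConflictError once
    the retry budget is exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            func()
            commit()
        except ConflictError:
            abort()
            if attempts > retries:
                raise
        else:
            return attempts


# Simulated transaction whose first commit conflicts, then succeeds.
log = []
pending_failures = [True]


def work():
    log.append("work")


def commit():
    if pending_failures:
        pending_failures.pop()
        raise ConflictError("simulated conflict")
    log.append("commit")


def abort():
    log.append("abort")


attempts = run_with_retries(work, commit, abort, retries=1)
```

Because the whole function is re-run after an abort, `func` should derive everything it needs from the database on each call rather than from values cached during the failed attempt.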

Slide 48: Application-level conflict resolution
- Objects can implement their own (write) conflict resolution logic
- Define _p_resolveConflict() method
  - Arguments (unpickled object states)
    - Original object state
    - Committed state for last transaction
    - State for transaction that conflicts
  - Returns new state or None or raises error
- Requires careful design
  - Can't access other objects at resolution time
  - Must store enough info in object state to resolve
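A counter is the classic case where the three states are enough to merge two concurrent updates. The sketch below is a plain class, not a `Persistent` subclass, and it calls the method by hand; in ZODB the storage layer would call `_p_resolveConflict` during commit. The parameter names are illustrative and follow the argument order listed on the slide.

```python
class Counter:
    """Plain class (no ZODB) with a three-way merge for its state."""

    def __init__(self):
        self.value = 0

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # Apply both transactions' increments on top of the common ancestor.
        resolved = dict(new_state)
        resolved["value"] = (saved_state["value"] + new_state["value"]
                             - old_state["value"])
        return resolved


counter = Counter()
old = {"value": 10}    # state both transactions started from
saved = {"value": 13}  # state committed by the first transaction (+3)
new = {"value": 12}    # state the conflicting transaction wants to write (+2)

merged = counter._p_resolveConflict(old, saved, new)
```

The merge works only because each state dictionary is self-contained, which is the design constraint the slide names: resolution cannot look at other objects.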

Slide 49: Conflicts and ZEO
- ZEO uses asyncore for I/O
  - Invalidation msgs arrive asynchronously
  - Processed when transaction commits
- Application must either
  - Start asyncore mainloop
  - Synchronize explicitly
    - Connection method sync()
    - Call when transaction begins

Slide 50: Subtransactions
- You can create subtransactions within a main transaction
  - individually commit and abort subtransactions
  - not truly committed until containing transaction is committed
- Primarily for reducing in-memory footprint

    >>> get_transaction().commit(1)

Slide 51: Practical Considerations
- This section will:
  - Help you select components
  - Discuss sys admin issues
  - Manage resources effectively

Slide 52: BTrees

    from BTrees.OOBTree import OOBTree

- Mapping type implemented as B-tree
  - Implemented in C for performance
  - Several flavors with object or int key/values
    - OOBTree, IIBTree, OIBTree, IOBTree
- Limited memory footprint
  - Dictionary keeps everything in memory
  - BTree divided into buckets
    - Not all buckets in memory at once

Slide 53: Pros and cons of various storages
- FileStorage
  - Widely used (the default)
  - Large in-memory index
    - StandaloneZODB has a smaller index
  - Stores everything in one big file
- BerkeleyDB storage
  - Uses transactional BerkeleyDB
  - Large blobs (pickles) may cause performance problems
- Others:
  - OracleStorage, MappingStorage, …

Slide 54: Object revisions
- Each update creates new revision
  - Storages (may) keep old revisions
  - Allows application to undo changes
  - Must pack storage to remove revisions

Slide 55: Packing
- Some storages store multiple object revisions
  - Used for undo
  - Eventually used for multi-version concurrency control
- Old object revisions consume space
- Pack storages to reclaim
  - Packed transactions can't be undone
  - Experimental garbage collection

Slide 56: Storage management
- FileStorage
  - Packing
  - Backing up
  - Recover
- Berkeley storage
  - Packing
  - Backing up
  - Berkeley maintenance
  - Tuning

Slide 57: When to use ZEO
- Storages may only be opened by a single process
  - Although it may be multithreaded
- ZEO allows multiple processes to open a storage simultaneously
- Processes can be distributed over a network
- ZEO cache provides read-only data if server fails

Slide 58: ZEO Cache
- Disk-based cache for objects
  - Server sends invalidations on update
  - Checks validity when client connects
- Persistent caching
  - Reuse cache next time client is opened
  - Default is not persistent
- ZEO connections
  - Attempts new connection in background when server fails

Slide 59: ZEO management
- StorageServer
  - Run as separate process
    - May want to run under init or rc.d
  - ZEO/start.py provided as startup script
- ClientStorage

    from ZEO.ClientStorage import ClientStorage
    s = ClientStorage(('server.host', 3400))

- Persistent cache

    s = ClientStorage(host_port, client='abc')

Slide 60: zLOG
- Zope logging mechanism
  - More flexible than writing to stderr
- Controlled by environment vars
  - STUPID_LOG_FILE
  - STUPID_LOG_SEVERITY
- Severity levels
  - 300 to -300; 0 is default
  - -100 provides more details
  - -300 provides enormous detail

Slide 61: Storage migration
- Storages can be migrated via the iterator protocol
  - High level interface
  - Low level interface

    src = FileStorage('foo.fs')
    dst = Full('BDB')  # Berkeley storage
    dst.copyTransactionsFrom(src)
    dst.close()

Slide 62: Advanced Topics
- This section will:
  - Describe ZODB internals
  - Discuss data structures
  - Introduce various advanced features

Slide 63: Internals: Object state
- Objects in memory
  - Four states
    - Unsaved
    - Up-to-date
    - Changed
    - Ghost
  - Ghost is placeholder
    - No attributes loaded
    - _p_deactivate()

Slide 64: Internals: Connection Cache
- Each connection has an object cache
  - Objects referenced by OID
  - Objects may be ghosts
- All loads go through cache
  - Cache access to recent objects
  - Prevent multiple copies of one object
  - Creates ghost unless attribute needed
- Note: dir() doesn't behave correctly with ghosts

Slide 65: Internals: Cache Size
- Controlled by DB & Connection
- Methods on DB object
  - Affects all conns associated with DB

    DB(…, cache_size=400, cache_deactivate_after=60)
    setCacheSize(size)
    setCacheDeactivateAfter(size)
    cacheFullSweep(age)
    cacheMinimize(age)

  - Size is objects; after/age is seconds

Slide 66: Advanced topics
- Undo
  - Transactional
  - Destructive
    - Deprecated, but used as fallback
- Versions

Slide 67: Transactional Undo
- Implemented by writing a new transaction
- Supports redo
- Supported by FileStorage and BerkeleyDB storage (Full)

    if db.supportsTransactionalUndo():
        db.undo(txn_id)
        get_transaction().commit()

Slide 68: undoLog()
- undoLog(start, stop [, func])
  - Returns a dictionary
  - Entries describe undoable transactions between start & stop time (sec. since epoch)
  - Optional func is a filter function
    - Takes a dictionary describing each txn
    - Returns true if txn matches criteria
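A filter function of the kind described above is an ordinary predicate over a transaction description. The sketch below shows the shape of such a predicate; the sample entries and their key names (`id`, `user_name`, `description`) are assumptions made for illustration, and the filter is applied by hand here rather than passed to `undoLog()`.

```python
def by_user(user_name):
    """Build a filter that keeps transactions made by one user."""
    def matches(txn):
        # txn is a dictionary describing one transaction.
        return txn.get("user_name") == user_name
    return matches


# Hypothetical transaction descriptions; the key names are assumptions.
entries = [
    {"id": "t3", "user_name": "barry", "description": "add appointment"},
    {"id": "t2", "user_name": "jeremy", "description": "add person"},
    {"id": "t1", "user_name": "barry", "description": "create calendar"},
]

keep = by_user("barry")
barry_only = [txn["id"] for txn in entries if keep(txn)]
```

Building the predicate with a closure keeps the criteria out of global state, so several filters can be used side by side.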

Slide 69: UndoError
- ZODB.POSException.UndoError
  - raised when undo() is passed a txn id for a non-undoable transaction
  - Packing can cause transactions to be non-undoable

Slide 70: Versions
- Like a long running, named transaction
- If a change to an object is made in a version, all subsequent changes must occur in that version until:
  - version is committed
  - version is aborted
- Otherwise, VersionLockError

Slide 71: Opening a version

    if db.supportsVersions():
        db.open(version='myversion')
        # Commit some changes, then
        db.commitVersion('myversion')
        # … or …
        db.abortVersion('myversion')

Slide 72: Database API
- open(version='', transaction=None, temporary=0)
- open() returns new Connection object
  - If version specified, work in a version
  - If transaction specified, close connection on commit
  - If temporary specified, do not use connection pool
    - DB keeps pool of connections (and their caches) to reuse
    - Default pool size is 7
    - Locking prevents more than 7 non-temporary connections
    - Separate pools for versions
- pack(t=None, days=0)
  - Pack revisions older than t (default is now)
  - Optional days subtracts days from t
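The two `pack()` arguments combine into a single cutoff time. The helper below is a sketch of that arithmetic only, assuming `t` is in seconds since the epoch as for `undoLog()`; `pack_cutoff` and its `now` parameter are hypothetical names, not part of the ZODB API.

```python
import time

SECONDS_PER_DAY = 86400


def pack_cutoff(t=None, days=0, now=time.time):
    """Return the time before which revisions would be packed."""
    if t is None:
        t = now()  # default: pack up to the present moment
    return t - days * SECONDS_PER_DAY


# Keep the last three days of history relative to a fixed timestamp.
cutoff = pack_cutoff(t=1012780800, days=3)
```

Passing a non-zero `days` is the usual way to keep a window of recent revisions available for undo after a pack.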