CS 440 Database Management Systems

Name: CS 440 Database Management Systems
Uploaded: 2017-12-02T14:52:59+00:00
Duration: PTM16S9
Channel: Calista Edds
Description: CS 440 Database Management Systems

CS 440 Database Management Systems
Lecture 14: NoSQL & NewSQL, Cont’d. Some slides due to Magda Balazinska

Scaling by partitioning & replication
Partition the data across machines
Replicate the partitions
Good: read queries can be spread across replicas
Bad: the replicas must be kept consistent after write queries
Ugly:
  difficult to scale transactions: two-phase commit is expensive
  difficult to scale complex operations
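The partition-then-replicate idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `partition_of`, `replicas_of`, and `Router` are hypothetical, there is one partition per machine, replicas sit on consecutive machines, and `copies` is at most the number of machines. None of this comes from a specific system.

```python
import hashlib


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition using a stable hash.

    hashlib is used because Python's built-in hash() for strings is
    randomized per process, which would break routing across machines.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def replicas_of(partition: int, num_machines: int, copies: int) -> list:
    """Place `copies` replicas of a partition on consecutive machines."""
    return [(partition + i) % num_machines for i in range(copies)]


class Router:
    """Hypothetical router: reads rotate over replicas, writes hit all of them."""

    def __init__(self, num_machines: int, copies: int):
        self.num_machines = num_machines
        self.copies = copies
        self._turn = {}  # partition -> number of reads routed so far

    def read_target(self, key: str) -> int:
        # Good: any single replica can serve a read, so reads spread out.
        p = partition_of(key, self.num_machines)
        replicas = replicas_of(p, self.num_machines, self.copies)
        turn = self._turn.get(p, 0)
        self._turn[p] = turn + 1
        return replicas[turn % len(replicas)]

    def write_targets(self, key: str) -> list:
        # Bad: a write has to reach every replica to keep them consistent.
        p = partition_of(key, self.num_machines)
        return replicas_of(p, self.num_machines, self.copies)
```

With `Router(num_machines=4, copies=3)`, three successive reads of the same key go to three different machines, while every write to that key must be sent to all three.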

NoSQL: Not Only SQL/ Not relational
Goals
  highly scalable data management system
  flexible data model: records with different schemas
They are willing to give up
  Complex queries, e.g., no joins
  ACID guarantees: weaker versions instead, e.g., eventual consistency
  Multi-object transactions
Not all NoSQL systems give up all of these properties

NoSQL key features
Scale simple operations horizontally
  key lookups, reads and writes of one record or a small number of records, simple selections
Replicate/distribute data over many servers
Simple call-level interface (contrast with SQL)
Weaker concurrency model than ACID
Efficient use of distributed indexes and RAM
Flexible schema

Different types of NoSQL
Taxonomy based on the data models:
  Key-value stores, e.g., Dynamo, Project Voldemort, Memcached
  Document stores, e.g., SimpleDB, CouchDB, MongoDB
  Extensible record stores, e.g., Bigtable, HBase, Cassandra
NewSQL: new type of RDBMSs, e.g., Megastore, VoltDB

Key-Value stores features
Data model: (key, value) pairs
  values are binary objects
  no further schema
Operations
  insert, delete, and lookup operations on keys
  no operation across multiple data items
Consistency
  replication with eventual consistency, e.g., vector clocks in Dynamo
  goal: NEVER reject any writes (rejecting is bad for business!)
  multiple versions with conflict resolution during reads
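To make the vector-clock idea concrete, here is a minimal sketch of how two versions of a value can be compared. It is a generic textbook-style illustration and not Dynamo's actual code; the function names and the dict-based clock representation are assumptions made for this example.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with `node`'s counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify the causal relationship between two versions."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"       # a supersedes b
    if b_ge:
        return "before"      # b supersedes a
    return "concurrent"      # conflict: both versions must be kept


def merge(a: dict, b: dict) -> dict:
    """Clock for a reconciled version: component-wise maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}
```

When `compare` reports `"concurrent"`, neither write is rejected; both versions are stored and the conflict is resolved later, during a read, which matches the "never reject writes" goal above.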

Key-Value stores features
Use replication to provide fault tolerance
Quorum replication in Dynamo
  Each update creates a new version of an object
  Vector clocks track causality between versions
Parameters:
  N = number of copies (replicas) of each object
  R = minimum number of nodes that must participate in a successful read
  W = minimum number of nodes that must participate in a successful write
Quorum: R + W > N
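The reason for the condition R + W > N is that every read set then shares at least one node with every write set, so a read always touches a replica holding the latest write. The sketch below checks this by brute force for small N; the function names are made up for this example.

```python
from itertools import combinations


def is_quorum(n: int, r: int, w: int) -> bool:
    """The quorum condition from the slide."""
    return r + w > n


def always_overlap(n: int, r: int, w: int) -> bool:
    """Brute force: does every read set of size r intersect every write set of size w?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )
```

For example, with N = 3, the setting R = 2, W = 2 satisfies the condition and every read overlaps every write, while R = 1, W = 1 does not: a read may hit a replica that the last write never reached.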

Key-Value stores internals
Only primary index: lookup by key
  No secondary indexes!
Data remains in main memory
  Most systems also offer a persistence option
Some offer ACID transactions, others do not
  Multiversion concurrency control or locking

Multiversion Concurrency Control
Idea: let writers make a “new” copy while readers use an appropriate “old” copy.
[Figure: MAIN SEGMENT (current versions of DB objects) and VERSION POOL (older versions that may be useful for some active readers), with versions O, O’, O’’]
Readers are always allowed to proceed, but may be blocked until the writer commits.

Multiversion CC (Contd.)
Each version of an object has its writer’s TS as its WTS, and the TS of the Xact that most recently read this version as its RTS.
Versions are chained backward; we can discard versions that are “too old to be of interest”.
Each Xact is classified as Reader or Writer.
  Writer may write some object; Reader never will.
  Xact declares whether it is a Reader when it begins.

Reader Xact
[Figure: WTS timeline running from old to new, with reader T placed at TS(T)]
For each object to be read:
  Finds the newest version with WTS < TS(T). (Starts with the current version in the main segment and chains backward through earlier versions.)
Assuming that some version of every object exists from the beginning of time, Reader Xacts are never restarted.
  However, a reader might block until the writer of the appropriate version commits.
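The reader protocol can be sketched as a walk down the backward version chain. This is a simplified illustration, not code from any real system: `Version`, `WouldBlock`, and `reader_read` are names invented here, blocking is modeled by raising an exception, and updating RTS with `max` is an assumption about how "most recently read" is tracked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: object
    wts: int                            # TS of the Xact that wrote this version
    rts: int                            # TS of the latest Xact that read it
    committed: bool = True
    prev: Optional["Version"] = None    # pointer to the next older version


class WouldBlock(Exception):
    """Stand-in for 'reader waits until the writer commits'."""


def reader_read(current: Version, ts: int):
    """Read the newest version with WTS < ts, starting from the current one."""
    v = current
    while v is not None and v.wts >= ts:
        v = v.prev                      # chain backward through older versions
    if v is None:
        raise LookupError("no version old enough for this reader")
    if not v.committed:
        raise WouldBlock("writer of the appropriate version has not committed")
    v.rts = max(v.rts, ts)              # assumption: keep the largest reader TS
    return v.value
```

A reader with TS 7 facing versions written at 0, 5, and 9 skips the version written at 9 and reads the one written at 5; it is never restarted, only possibly delayed.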

Writer Xact
To read an object, follows the reader protocol.
To write an object:
  Finds the newest version V s.t. WTS < TS(T).
  If RTS(V) < TS(T), T makes a copy CV of V, with a pointer to V, with WTS(CV) = TS(T), RTS(CV) = TS(T). (Write is buffered until T commits; other Xacts can see TS values but can’t read version CV.)
  Else, reject the write.
[Figure: timeline from old to new showing V, its copy CV at WTS = TS(T), and RTS(V)]
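The writer rule can be sketched the same way. The code below follows the condition exactly as stated on the slide (RTS(V) < TS(T)); the names `Version`, `WriteRejected`, and `writer_write` are hypothetical, and the sketch only returns the new copy CV without splicing it into the chain or handling commit, which a real system would also need to do.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: object
    wts: int                            # TS of the Xact that wrote this version
    rts: int                            # TS of the latest Xact that read it
    committed: bool = True
    prev: Optional["Version"] = None    # pointer to the next older version


class WriteRejected(Exception):
    """Raised when a later reader has already seen version V."""


def writer_write(current: Version, ts: int, value) -> Version:
    """Apply the slide's write rule for a writer Xact with timestamp ts."""
    v = current
    while v is not None and v.wts >= ts:
        v = v.prev                      # newest version V with WTS < TS(T)
    if v is None:
        raise LookupError("no version old enough for this writer")
    if v.rts < ts:
        # CV is buffered (uncommitted): others see its timestamps, not its value.
        return Version(value, wts=ts, rts=ts, committed=False, prev=v)
    raise WriteRejected(f"RTS(V)={v.rts} is not below TS(T)={ts}")
```

The rejection case protects a reader with a larger timestamp that has already read V: letting the write through would change what that reader should have seen.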

Check out DynamoDB!

Document stores
A “document” is a pointer-less object, e.g., JSON
  nested or not
  schema-less
They may have secondary indexes.
Scalability
  Replication (e.g., SimpleDB, CouchDB: the entire db is replicated)
  Sharding (MongoDB)
  Both

Amazon SimpleDB (1/3)
Partitioning
  Data is partitioned into domains: queries run within a domain
  Domains seem to be the unit of replication; limit of 10 GB
  Can use domains to manually create parallelism
Data model/schema
  No fixed schema
  Objects are defined with attribute-value pairs

Amazon SimpleDB (2/3)
Indexing
  Automatically indexes all attributes
Support for writing
  PUT and DELETE items in a domain
Support for querying
  GET by key
  Selection + sort: SELECT output_list FROM domain_name [where expression] [sort_instructions] [limit limit]
  A simple form of aggregation: count
  Query is limited to 5 s and 1 MB of output (but can continue)

Amazon SimpleDB (3/3)
Availability and consistency
  Data is stored redundantly across multiple servers
  It takes time for an update to propagate to all locations
  Eventually consistent, but an immediate read might not show the change
  Can choose between a consistent read and an eventually consistent read

Extensible record stores
Data model is rows and columns
Typical access: row ID, column ID, timestamp
Scalability by splitting rows and columns over nodes
  Rows: sharding on primary key
  Columns: "column groups" = indication of which columns should be stored together (e.g., customer name/address group, financial info group, login info group)

Google Bigtable
Distributed storage system
  Designed to store structured data
  Scale to thousands of servers
  Store up to several hundred terabytes (maybe even petabytes)
  Perform backend bulk processing
  Perform real-time data serving
To scale, Bigtable has a limited set of features

Bigtable data model
Sparse, multidimensional sorted map
  (row:string, column:string, time:int64) → string
Columns are grouped into families
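A toy in-memory version of this map helps show what "sparse", "sorted", and "versioned" mean in practice. This is only a sketch built on Python's standard `bisect` module; the class name `SparseSortedMap` and its methods are invented for this example and are not the Bigtable or HBase API.

```python
import bisect


class SparseSortedMap:
    """Toy (row, column, timestamp) -> value map kept in sorted key order."""

    def __init__(self):
        # Keys are (row, column, -timestamp) so that, within one cell,
        # the newest version sorts first.
        self._keys = []
        self._values = {}

    def put(self, row: str, column: str, timestamp: int, value: str) -> None:
        key = (row, column, -timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row: str, column: str, timestamp=None):
        """Newest value at or before `timestamp` (newest overall if None)."""
        bound = float("-inf") if timestamp is None else -timestamp
        i = bisect.bisect_left(self._keys, (row, column, bound))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None  # sparse: absent cells take no space and return nothing

    def scan_rows(self, start_row: str, end_row: str):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1
```

Because keys are kept in lexicographic order, a range scan over neighboring row keys touches one contiguous stretch of the map.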

Bigtable key features
Reads/writes of data under a single row key are atomic
  Only single-row transactions!
Data is stored in lexicographical order
  Improves data access locality
Horizontally partitioned into tablets
  Tablets are the unit of distribution and load balancing
Column families are the unit of access control
Data is versioned (old versions garbage collected)
  Ex: most recent three crawls of each page, with times

Bigtable API
Data definition
  Creating/deleting tables or column families
  Changing access control rights
Data manipulation
  Writing or deleting values
  Looking up values from individual rows
  Iterating over a subset of the data in the table
    Can select on rows, columns, and timestamps

HBase
Open source implementation of Bigtable
http://hbase.apache.org/

Scalable RDBMS: NewSQL
Means RDBMSs that offer sharding
Key difference: NoSQL systems make it difficult or impossible to perform large-scope operations and transactions (to ensure performance), while scalable RDBMSs do not preclude these operations; users pay a price only when they need them.
Examples: Megastore, VoltDB, MySQL Cluster, Clustrix, ScaleDB

Megastore
Implemented over Bigtable, used within Google
Megastore is a layer on top of Bigtable
  Transactions that span nodes
  A database schema defined in a SQL-like language
  Hierarchical paths that allow some limited joins
Megastore is made available through the Google App Engine Datastore

VoltDB
Main-memory RDBMS: no disk I/O, no buffer management!
Sharded across a shared-nothing cluster
  One transaction = one stored procedure
  So both the data and the processing are partitioned
Transaction processing
  SQL execution is single-threaded for each shard
  Avoids all locking and latching overheads
Synchronous multi-master replication for HA

Application 1
Web application that needs to display lots of customer information; the users' data is rarely updated, and when it is, you know when it changes because updates go through the same interface.
Use a key-value store

Application 2
Department of Motor Vehicles: look up objects by multiple fields (driver's name, license number, birth date, etc.); "eventual consistency" is OK, since updates are usually performed at a single location.
Document store

Application 3
eBay-style application. Cluster customers by country; separate the rarely changed “core” customer information (address, ) from frequently updated info (current bids).
Extensible record store

Application 4 Everything else (e.g. a serious DMV application)
Scalable RDBMS

Criticism (from Stonebraker, CACM 2011)
No ACID Equals No Interest in enterprises
  Screwing up mission-critical data is a no-no-no
Low-level Query Language is Death
  Before SQL
NoSQL means NoStandards
  One (typical) large enterprise has 10,000 databases. These need accepted standards
Scalable RDBMS
