Published by Clyde Adams. Modified over 9 years ago.
1
Getting Biologists off ACID
Ryan Verdon
3/13/12
2
Outline
Thesis idea
Specific database
Effects of losing ACID
What is a NoSQL database
Types of NoSQL databases with examples
3
Thesis Idea
The amount of biological data is growing exponentially
Hardware cannot keep up with the growth
4
Database Survey
In 2003, a survey was conducted of 111 biological databases. Of the 111 databases considered:
– 96 (i.e. 87%) have hypertext references to other databases
– 40 to 44 (i.e. 36% to 40%) are implemented as flat files
– 41 (or 42) (i.e. 37%) are implemented using a relational database management system
– 7 (i.e. 6%) use an object database management system
– 3 (i.e. 3%) use an object-relational database management system
– All databases collect data from different sources
5
Specific Database
6
FlyBase
Primary biology database on the insect family Drosophilidae (fruit flies)
Important research from fruit flies includes
– Genes
– Recombination
– Signaling networks (important for major diseases)
– Stem cells
– Growth control
7
Info on the FlyBase database
Single-server relational database
Over 135 tables
Uses the Chado schema
Gives data a “type”
Creates a graph that relates the different types
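The typing-plus-graph idea can be sketched with Python's built-in sqlite3 module. The table names below echo Chado's `cvterm`, `feature`, and `feature_relationship` tables, but the columns are heavily simplified and the sample rows are invented for illustration; this is not FlyBase's actual schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
CREATE TABLE feature_relationship (
    subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
    object_id INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
""")

# Types live in one vocabulary table; both nodes and edges point at it.
conn.executemany("INSERT INTO cvterm VALUES (?, ?)",
                 [(1, "gene"), (2, "mRNA"), (3, "part_of")])
# Hypothetical sample features, each tagged with a type.
conn.executemany("INSERT INTO feature VALUES (?, ?, ?)",
                 [(10, "example_gene", 1), (11, "example_transcript", 2)])
# One typed edge: the transcript is part_of the gene.
conn.execute("INSERT INTO feature_relationship VALUES (?, ?, ?)", (11, 10, 3))

# Walk the graph: subject (with type) --relationship--> object (with type).
rows = conn.execute("""
    SELECT s.name, st.name, rt.name, o.name, ot.name
    FROM feature_relationship r
    JOIN feature s ON s.feature_id = r.subject_id
    JOIN cvterm st ON st.cvterm_id = s.type_id
    JOIN cvterm rt ON rt.cvterm_id = r.type_id
    JOIN feature o ON o.feature_id = r.object_id
    JOIN cvterm ot ON ot.cvterm_id = o.type_id
""").fetchall()

for row in rows:
    print(row)
```

Even this tiny query needs five joins, which hints at why a heavily typed graph stored in relational tables can be costly to query at scale.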
8
Effects of losing ACID
9
ACID
Atomicity
Consistency
Isolation
Durability
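As a reminder of what would be given up, here is a minimal sketch of atomicity and consistency using Python's built-in sqlite3 module. The `account` table, the `transfer` helper, and the balances are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit to dst
        # made earlier in the same transaction has been rolled back too.
        return False


ok = transfer(conn, "a", "b", 30)
failed = transfer(conn, "a", "b", 500)  # would overdraw account "a"
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The failed transfer leaves no partial credit behind, which is the guarantee a non-ACID store may not provide.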
10
CAP Theorem
Consistency
Availability
Partition tolerance
11
BASE consistency model
Basically Available
Soft state
Eventually consistent
– Given enough time with no new changes, all replicas will see the same data
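Eventual consistency can be illustrated with a toy simulation. The `Replica` class, the integer version numbers, and the last-write-wins rule are assumptions made for this sketch; real systems often use more elaborate schemes to order and reconcile writes.

```python
class Replica:
    """One copy of the data; each key maps to (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the write only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Pull every entry from another replica and merge it in.
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


replicas = [Replica(), Replica(), Replica()]

# Two writes land on different replicas, so the replicas disagree.
replicas[0].write("cart", ["book"], version=1)
replicas[2].write("cart", ["book", "pen"], version=2)
before = [r.read("cart") for r in replicas]

# No further writes: pass updates around the ring a few times.
for _ in range(len(replicas)):
    for i, replica in enumerate(replicas):
        replica.sync_from(replicas[i - 1])

after = [r.read("cart") for r in replicas]
print(before)
print(after)
```

Between the writes and the sync rounds, a reader may see stale or missing data depending on which replica answers; that window is the "soft state".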
12
NoSQL
13
What is a NoSQL database
Any non-relational database
– Hierarchical
– Graph
– Object-oriented
– Etc.
14
Why were NoSQL databases created
To overcome limitations of relational databases
– Predefined layout
– Scaling
– SQL
– Large feature set
15
Major types of NoSQL databases
Key-value stores
Column-oriented databases
Document-based stores
16
Key-value stores
Stored values are indexed for retrieval by keys
Can store unstructured or structured data
Similar to distributed hash tables (DHTs)
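A minimal sketch of the idea, assuming an in-memory store whose keys are spread over a fixed number of nodes by hashing. The `PartitionedKVStore` class and the sample keys are hypothetical, and the simple modulo placement stands in for the more careful key placement real DHTs use.

```python
import hashlib


class PartitionedKVStore:
    """Toy key-value store; each node is just a Python dict."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key to pick the node that owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


store = PartitionedKVStore(node_count=4)
# The store does not care what the value looks like.
store.put("session:42", {"user": "alice", "cart": ["book"]})  # structured
store.put("blob:1", b"\x00\x01raw bytes")                     # unstructured

print(store.get("session:42"))
print(store.get("missing"))
```

The only access path is the key: there is no way to ask for "all sessions for alice" without scanning every node.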
17
Example: Dynamo
Created by Amazon, only used internally
Highly available even in the face of continual failures
Amazon realized many services only need primary-key access
– Best seller lists
– Shopping carts
– Session management
18
Column-oriented databases
Contain extendable columns of closely related data
Can greatly benefit from compression
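One reason columns compress well is that values of the same kind sit next to each other and often repeat. The sketch below uses run-length encoding as a simple stand-in for whatever compression a real column store applies; the sample rows and helper names are made up for illustration.

```python
from itertools import groupby


def rle_encode(column):
    """Run-length encode a list into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical rows as a row-oriented store would hold them.
rows = [
    {"id": 1, "species": "melanogaster"},
    {"id": 2, "species": "melanogaster"},
    {"id": 3, "species": "melanogaster"},
    {"id": 4, "species": "simulans"},
    {"id": 5, "species": "simulans"},
]

# Storing by column puts the repeated values side by side.
species_column = [row["species"] for row in rows]
encoded = rle_encode(species_column)

print(encoded)
print(rle_decode(encoded) == species_column)
```

Five stored strings shrink to two (value, count) pairs, and the gain grows with longer runs.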
19
Example: HBase
Apache open-source project
Based on Google’s Bigtable database
Ties in closely with Hadoop and MapReduce
20
Document-based stores
Data stored as a collection of documents
Documents can have any number of fields and any length
Documents are accessed via a unique key
Capabilities of the query language heavily depend on the implementation
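A minimal in-memory sketch of these properties. The `DocumentStore` class, its `find` method, and the sample documents are hypothetical; they do not correspond to any particular product's API.

```python
import uuid


class DocumentStore:
    """Toy document store: documents are dicts with arbitrary fields."""

    def __init__(self):
        self._docs = {}

    def insert(self, document):
        # Every document gets a unique key, returned to the caller.
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = dict(document)
        return doc_id

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, **criteria):
        # A very small "query language": exact match on the given fields.
        return [
            doc for doc in self._docs.values()
            if all(field in doc and doc[field] == value
                   for field, value in criteria.items())
        ]


store = DocumentStore()
# Documents in the same collection need not share fields.
gene_id = store.insert({"kind": "gene", "name": "example_gene"})
paper_id = store.insert(
    {"kind": "paper", "title": "Example title", "year": 2012,
     "authors": ["A. Author", "B. Author"]}
)

genes = store.find(kind="gene")
print(store.get(paper_id)["authors"])
print(genes)
```

How far beyond exact-match lookups a store goes (ranges, nested fields, indexes) is the implementation-dependent part.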
21
Example: MongoDB
Started by 10gen
Supports
– Replication
– Map-reduce
– Sharding
Used by many large companies including
– Disney
– Craigslist
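To show what map-reduce over a document collection means, here is a plain-Python sketch. It is not MongoDB's API; the `map_reduce` helper, the sample documents, and the field names are assumptions made for this example.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical documents, e.g. classified listings.
documents = [
    {"city": "Boston", "price": 10},
    {"city": "Boston", "price": 15},
    {"city": "Austin", "price": 7},
]


def emit_price_by_city(doc):
    # Map step: emit one (key, value) pair per document.
    yield doc["city"], doc["price"]


def total(key, values):
    # Reduce step: combine all values that share a key.
    return sum(values)


totals = map_reduce(documents, emit_price_by_city, total)
print(totals)
```

Because each document is mapped independently and each key is reduced independently, the work can be split across shards and the partial results combined.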
22
Questions?