In a Document-Oriented NoSQL Database { "name": "Andrew Liu", " ": "twitter": }

Presentation transcript:

NoSQL is a buzzword.
NoSQL is varied:
- Key-value
- Wide-column
- Document-oriented
- Graph

Document stores contain data objects that are inherently hierarchical, tree-like structures (most notably JSON).
- Built for scale and performance
- Great for hierarchical trees, logging, and telemetry

Perfect for these documents:

    {
      "name": "SmugMug",
      "permalink": "smugmug",
      "homepage_url": "…",
      "blog_url": "…",
      "category_code": "photo_video",
      "products": [
        { "name": "SmugMug", "permalink": "smugmug" }
      ],
      "offices": [
        {
          "description": "",
          "address1": "67 E. Evelyn Ave",
          "address2": "",
          "zip_code": "94041",
          "city": "Mountain View",
          "state_code": "CA",
          "country_code": "USA",
          "latitude": …,
          "longitude": …
        }
      ]
    }

Not these documents

Perfect for these documents (the SmugMug example above):
a schema-agnostic JSON store for hierarchical and de-normalized data at scale.

| Item | Author | Pages | Language |
|---|---|---|---|
| Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English |
| Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English |

| Item | Author | Pages | Language |
|---|---|---|---|
| Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English |
| Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English |
| Lenovo Thinkpad X1 Carbon | ??? | ??? | ??? |

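| Item | Author | Pages | Language | Processor | Memory | Storage |
|---|---|---|---|---|---|---|
| Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English | ??? | ??? | ??? |
| Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English | ??? | ??? | ??? |
| Lenovo Thinkpad X1 Carbon | ??? | ??? | ??? | Core i7 3.3 GHz | 8 GB | 256 GB SSD |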

| Item | Author | Pages | Language |
|---|---|---|---|
| Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English |
| Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English |

| Item | CPU | Memory | Storage |
|---|---|---|---|
| Lenovo Thinkpad X1 Carbon | Core i7 3.3 GHz | 8 GB | 256 GB SSD |

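| ProductId | Item |
|---|---|
| 1 | Harry Potter and the Sorcerer’s Stone |
| 2 | Game of Thrones: A Song of Ice and Fire |
| 3 | Lenovo Thinkpad X1 Carbon |

| ProductId | Attribute | Value |
|---|---|---|
| 1 | Author | J.K. Rowling |
| 1 | Pages | 309 |
| … | | |
| 2 | Author | George R.R. Martin |
| 2 | Pages | 864 |
| … | | |
| 3 | Processor | Core i7 3.3 GHz |
| 3 | Memory | 8 GB |
| … | | |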
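As a rough illustration of the alternative to the product tables above, the sketch below keeps the same three products as schema-free documents, each carrying only the attributes that apply to it. The `products` list and the `attributes_of` helper are hypothetical names introduced here; the field names are borrowed from the table headers.

```python
# Each product is a plain dict; no shared set of columns is required.
products = [
    {"id": 1, "item": "Harry Potter and the Sorcerer's Stone",
     "author": "J.K. Rowling", "pages": 309, "language": "English"},
    {"id": 2, "item": "Game of Thrones: A Song of Ice and Fire",
     "author": "George R.R. Martin", "pages": 864, "language": "English"},
    {"id": 3, "item": "Lenovo Thinkpad X1 Carbon",
     "processor": "Core i7 3.3 GHz", "memory": "8 GB", "storage": "256 GB SSD"},
]


def attributes_of(product):
    """Return the attribute names this document actually carries (hypothetical helper)."""
    return sorted(key for key in product if key not in ("id", "item"))


for product in products:
    print(product["item"], "->", attributes_of(product))
```

No document needs `???` placeholders, and no attribute/value side table is required to describe the laptop.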

Come as you are Data normalization ORM

Modeling data, the relational way

Modeling data, the document way

To embed, or to reference, that is the question
embed | reference

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together

    {
      id: "book1",
      covers: [
        {type: "front", artworkUrl: "…"},
        {type: "back", artworkUrl: "…"}
      ],
      index: "",
      chapters: [
        {id: 1, synopsis: "", pageCount: 24, wordCount: 1456},
        {id: 2, synopsis: "", pageCount: 18, wordCount: 960}
      ]
    }

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together
- The child is a dependent, e.g. Order Line depends on Order

    {
      id: "order1",
      customer: "customer1",
      orderDate: "…T23:14:…Z",
      lines: [
        {product: "13inch screen", price: …, qty: 50},
        {product: "Keyboard", price: 23.67, qty: 4},
        {product: "CPU", price: 87.89, qty: 1}
      ]
    }

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together
- The child is a dependent, e.g. Order Line depends on Order
- 1:1 relationship

    {
      id: "person1",
      name: "Mickey",
      creditCard: {
        number: "**** **** **** 4794",
        expiry: "06/2019",
        cvv: "868",
        type: "Mastercard"
      }
    }

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together
- The child is a dependent, e.g. Order Line depends on Order
- 1:1 relationship
- Similar volatility

    {
      id: "person1",
      name: "Mickey",
      contactInfo: [
        {…},
        {mobile: "…"},
        {twitter: "…"}
      ]
    }

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together
- The child is a dependent, e.g. Order Line depends on Order
- 1:1 relationship
- Similar volatility
- The set of values or sub-documents is bounded (1:few)

    {
      id: "task1",
      desc: "deliver an awesome #sqlbits",
      categories: ["conference", "talk", "workshop", "databases"]
    }

To embed, or to reference, that is the question

Embed when:
- Data from entities are queried together
- The child is a dependent, e.g. Order Line depends on Order
- 1:1 relationship
- Similar volatility
- The set of values or sub-documents is bounded (1:few)

Typically, denormalized data models provide better read performance.
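The read-side benefit of embedding can be sketched with the order example above. This is a minimal in-memory illustration, not a database client: the `orders` dict stands in for a collection, `order_total` is a hypothetical helper, and the screen line is left out because its price is not given in the slide.

```python
# In-memory stand-in for a document collection, keyed by id (assumption).
orders = {
    "order1": {
        "id": "order1",
        "customer": "customer1",
        "lines": [
            {"product": "Keyboard", "price": 23.67, "qty": 4},
            {"product": "CPU", "price": 87.89, "qty": 1},
        ],
    }
}


def order_total(order_id):
    """One lookup returns the order together with its lines; no join, no second read."""
    order = orders[order_id]
    return round(sum(line["price"] * line["qty"] for line in order["lines"]), 2)


print(order_total("order1"))
```

Because the order lines are dependents of the order and are always read with it, a single document fetch is enough to answer the query.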

To embed, or to reference, that is the question

Reference when:
- One-to-many relationships (unbounded)

    { id: "post1", author: "Mickey Mouse", tags: ["fun", "cloud", "develop"] }

    {id: "c1", postId: "post1", comment: "Coolest blog post"}
    {id: "c2", postId: "post1", comment: "Loved this post, awesome"}
    {id: "c3", postId: "post1", comment: "This is rad!"}
    …
    {id: "c10000", postId: "post1", comment: "You are the coolest cartoon character"}
    …
    {id: "c…", postId: "post1", comment: "Are we still commenting on this blog?"}

To embed, or to reference, that is the question

Reference when:
- One-to-many relationships (unbounded)
- Many-to-many relationships

    { id: "book1", name: "100 Secrets of Disneyland" }
    { id: "book2", name: "The best places to Disney" }

    { author-id: "author1", book-id: "book1" }
    { author-id: "author2", book-id: "book1" }

    { id: "author1", name: "Mickey Mouse" }
    { id: "author2", name: "Donald Duck" }

Look familiar? It should. It's the "relational" way.

To embed, or to reference, that is the question

Reference when:
- One-to-many relationships (unbounded)
- Many-to-many relationships

    { id: "book1", name: "100 Secrets of Disneyland", authors: ["author1", "author2"] }
    { id: "book2", name: "The best places to Disney", authors: ["author1"] }

    { id: "author1", name: "Mickey Mouse", books: ["book1", "book2"] }
    { id: "author2", name: "Donald Duck", books: ["book1"] }
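A small sketch of how the two-way references above might be used, assuming in-memory dicts in place of real collections. `author_names` and `links_consistent` are hypothetical helpers; the second one highlights the cost of this model, namely that each link is stored on both sides and the two sides have to agree.

```python
books = {
    "book1": {"id": "book1", "name": "100 Secrets of Disneyland",
              "authors": ["author1", "author2"]},
    "book2": {"id": "book2", "name": "The best places to Disney",
              "authors": ["author1"]},
}
authors = {
    "author1": {"id": "author1", "name": "Mickey Mouse", "books": ["book1", "book2"]},
    "author2": {"id": "author2", "name": "Donald Duck", "books": ["book1"]},
}


def author_names(book_id):
    """Follow the references stored on the book; no separate join document is needed."""
    return [authors[author_id]["name"] for author_id in books[book_id]["authors"]]


def links_consistent():
    """Each link lives in two documents, so both directions must describe the same pairs."""
    forward = {(b["id"], a) for b in books.values() for a in b["authors"]}
    backward = {(book_id, a["id"]) for a in authors.values() for book_id in a["books"]}
    return forward == backward


print(author_names("book1"), links_consistent())
```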

To embed, or to reference, that is the question

Reference when:
- One-to-many relationships (unbounded)
- Many-to-many relationships
- Related data changes frequently
- The referenced entity is a key entity used by many others

    { id: "1", author: "Mickey Mouse", stocks: ["dis", "msft"] }

    {
      id: "dis",
      opening: "52.09",
      numberOfTrades: 10000,
      trades: [{qty: 57, price: 53.97}, {qty: 5, price: 54.01}]
    }

To embed, or to reference, that is the question

Reference when:
- One-to-many relationships (unbounded)
- Many-to-many relationships
- Related data changes frequently
- The referenced entity is a key entity used by many others

Normalized data models can require more round trips to the server.
Typically, normalizing provides better write performance.
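The trade-off can be sketched with the blog post and comments from the earlier slide. This is an in-memory illustration only; `load_post_with_comments` and `add_comment` are hypothetical helpers, and the `reads` counter simply stands in for round trips to a server.

```python
posts = {"post1": {"id": "post1", "author": "Mickey Mouse"}}
comments = [
    {"id": "c1", "postId": "post1", "comment": "Coolest blog post"},
    {"id": "c2", "postId": "post1", "comment": "Loved this post, awesome"},
    {"id": "c3", "postId": "post1", "comment": "This is rad!"},
]


def load_post_with_comments(post_id, limit=2):
    """Two reads: the post itself, then a bounded page of comments that reference it."""
    reads = 0
    post = posts[post_id]
    reads += 1
    page = [c for c in comments if c["postId"] == post_id][:limit]
    reads += 1
    return post, page, reads


def add_comment(comment_id, post_id, text):
    """A new comment is one small write; the post document is never rewritten."""
    comments.append({"id": comment_id, "postId": post_id, "comment": text})
```

Reading costs an extra lookup, but each write touches only a small comment document, and the post cannot grow without bound.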

Publisher document:
    { id: "mspress", name: "Microsoft Press", books: [1, 2, ...] }

Book documents:
    { id: 1, name: "DocumentDB 101" }
    { id: 2, name: "DocumentDB for RDBMS Users" }

Publisher document:
    { id: "mspress", name: "Microsoft Press" }

Book documents:
    { id: 1, name: "DocumentDB 101", pub-id: "mspress" }
    { id: 2, name: "DocumentDB for RDBMS Users", pub-id: "mspress" }

    {
      "id": "product1",
      "type": "product",
      "name": "Microsoft Band 2 – Medium",
      "price": "174.99",
      "summary": "Continuous heart rate monitor tracks heart rate...",
      "images": [
        {"image1": "…"},
        {"image2": "…"}
      ],
      "reviews": {
        "averageStars": 4,
        "reviewCount": 313
      }
    }

    {
      "id": "product1",
      "type": "reviewSummary",
      "reviewBreakdown": [ {5:24}, {4:10}, {3:3}, {2:0}, {1:4} ],
      "topReview": {
        "rating": 4,
        "title": "More comfortable than Band 1: But New Size Scale!",
        "snippet": "I've been wearing the first Band since it…",
        "fullReviewLink": "…"
      }
    }

Getting the manager of any employee is trivial:

    SELECT manager FROM org WHERE id = "Susan"

    {id: "Jill"}
    {id: "Ben", manager: "Jill"}
    {id: "Susan", manager: "Jill"}
    {id: "Andrew", manager: "Ben"}
    {id: "Sven", manager: "Susan"}
    {id: "Thomas", manager: "Sven"}

Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.

Getting all employees whose manager is Jill is also easy:

    SELECT * FROM org WHERE manager = "Jill"

    {id: "Jill"}
    {id: "Ben", manager: "Jill"}
    {id: "Susan", manager: "Jill"}
    {id: "Andrew", manager: "Ben"}
    {id: "Sven", manager: "Susan"}
    {id: "Thomas", manager: "Sven"}

Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.

Getting all direct reports for Jill is easy:

    SELECT * FROM org WHERE id = "Jill"

    {id: "Jill", directs: ["Ben", "Susan"]}
    {id: "Ben", directs: ["Andrew"]}
    {id: "Susan", directs: ["Sven"]}
    {id: "Andrew"}
    {id: "Sven", directs: ["Thomas"]}
    {id: "Thomas"}

Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.

Finding the manager of an employee is possible:

    SELECT * FROM emp WHERE ARRAY_CONTAINS(emp.directs, "Ben")

    {id: "Jill", directs: ["Ben", "Susan"]}
    {id: "Ben", directs: ["Andrew"]}
    {id: "Susan", directs: ["Sven"]}
    {id: "Andrew"}
    {id: "Sven", directs: ["Thomas"]}
    {id: "Thomas"}

Org chart: Jill manages Ben and Susan; Ben manages Andrew; Susan manages Sven; Sven manages Thomas.
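The two ways of modeling the org chart can be compared side by side. The sketch below uses plain lists as stand-ins for a collection; the function names are hypothetical, and each docstring notes which of the slide queries it loosely mirrors.

```python
# Model A: every employee document references its manager.
parent_refs = [
    {"id": "Jill"},
    {"id": "Ben", "manager": "Jill"},
    {"id": "Susan", "manager": "Jill"},
    {"id": "Andrew", "manager": "Ben"},
    {"id": "Sven", "manager": "Susan"},
    {"id": "Thomas", "manager": "Sven"},
]

# Model B: every manager document lists its direct reports.
child_refs = [
    {"id": "Jill", "directs": ["Ben", "Susan"]},
    {"id": "Ben", "directs": ["Andrew"]},
    {"id": "Susan", "directs": ["Sven"]},
    {"id": "Andrew"},
    {"id": "Sven", "directs": ["Thomas"]},
    {"id": "Thomas"},
]


def manager_a(emp_id):
    """Equality lookup on id, like SELECT manager FROM org WHERE id = ..."""
    return next(d.get("manager") for d in parent_refs if d["id"] == emp_id)


def reports_a(manager_id):
    """Equality filter on manager, like WHERE manager = ..."""
    return [d["id"] for d in parent_refs if d.get("manager") == manager_id]


def reports_b(manager_id):
    """Single document read; the reports are already listed on the manager."""
    return next(d.get("directs", []) for d in child_refs if d["id"] == manager_id)


def manager_b(emp_id):
    """Array-membership scan, like ARRAY_CONTAINS(emp.directs, ...)."""
    return next((d["id"] for d in child_refs if emp_id in d.get("directs", [])), None)
```

Both models answer both questions; they differ in which question is a direct lookup and which needs a filter or array scan.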

    { id: "CDC101", title: "Fundamentals of database design", credits: 10 }

    {
      id: "CDC101",
      title: "The Fundamentals of Database Design",
      titleWords: ["database", "design", "database design"],
      credits: 10
    }

- Consider using a RegEx to transform words to lowercase and remove punctuation.
- Strip out stop words like "to", "the", "of", etc.
- Denormalize keywords into key phrases.

    SELECT books.title FROM books WHERE ARRAY_CONTAINS(books.titleWords, "database")
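The three preparation steps above could be sketched as follows. This is one possible approach, not the presenter's code: the stop-word list is illustrative, `title_words` is a hypothetical helper, and "key phrases" are assumed here to be adjacent pairs of the remaining keywords, so the result also contains entries the slide's `titleWords` example leaves out.

```python
import re

# Illustrative stop-word list (assumption); a real one would be longer.
STOP_WORDS = {"to", "the", "of", "a", "an", "and", "in", "for"}


def title_words(title):
    """Lowercase, strip punctuation, drop stop words, then add two-word phrases."""
    words = re.sub(r"[^a-z0-9\s]", " ", title.lower()).split()
    keywords = [w for w in words if w not in STOP_WORDS]
    phrases = [f"{first} {second}" for first, second in zip(keywords, keywords[1:])]
    return keywords + phrases


print(title_words("The Fundamentals of Database Design"))
```

Storing the resulting list in the document at write time is what lets the `ARRAY_CONTAINS` query match on a keyword without scanning the full title.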

{ id: "", timestamp: "...", reading: 123 }

{ id: "...", timestampMinute: "...", readings: [ {minute:0, reading:123}, {minute:1, reading:456},... {minute:59,reading:999} ] }
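The move from one document per reading to one document per time bucket might look like the sketch below. The slide does not state the bucket granularity, so this assumes one document per hour holding up to 60 per-minute readings; `bucket_readings` and the `bucketStart` field are hypothetical names.

```python
from collections import defaultdict
from datetime import datetime


def bucket_readings(readings):
    """Group (timestamp, value) pairs into one document per hour (assumed granularity)."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append({"minute": timestamp.minute, "reading": value})
    return [
        {
            "id": hour.isoformat(),
            "bucketStart": hour.isoformat(),
            "readings": sorted(items, key=lambda r: r["minute"]),
        }
        for hour, items in sorted(buckets.items())
    ]


sample = [
    (datetime(2016, 1, 1, 10, 1), 456),
    (datetime(2016, 1, 1, 10, 0), 123),
    (datetime(2016, 1, 1, 11, 59), 999),
]
for doc in bucket_readings(sample):
    print(doc["id"], len(doc["readings"]))
```

Bucketing keeps each document bounded in size while cutting the number of documents to read for a time range.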

{ id: "...", timestamp: "...", logData: {attr1: value1, attr2: value2,...} }

    { type: "book", bookId: "book1", authors: [authorId:1, authorId:2] ... }
    { type: "author", authorId: 1, authorName: "Andrew" ... }

    SELECT b.* FROM b WHERE b.type="book"

    { type: "book", bookId: "book1", authors: [authorId:1, authorId:2] ... }
    { type: "author", authorId: 1, authorName: "Andrew" ... }

    SELECT b.* FROM b WHERE ARRAY_CONTAINS(b.authorId,1 ) OR b.authorId = 1
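Keeping several entity types in one collection and telling them apart with a `type` property can be sketched in memory as below. The helper names are hypothetical, `book2` is an invented extra document, and `authors` is simplified to a plain list of author ids, which is an assumption rather than the slide's exact shape.

```python
# One collection holding two kinds of documents, distinguished by "type".
collection = [
    {"type": "book", "bookId": "book1", "authors": [1, 2]},
    {"type": "book", "bookId": "book2", "authors": [2]},  # hypothetical extra book
    {"type": "author", "authorId": 1, "authorName": "Andrew"},
]


def by_type(docs, doc_type):
    """Filter on the discriminator, like WHERE b.type = "book"."""
    return [d for d in docs if d["type"] == doc_type]


def related_to_author(docs, author_id):
    """Return the author document and every book that lists that author, in one pass."""
    return [
        d for d in docs
        if author_id in d.get("authors", []) or d.get("authorId") == author_id
    ]


print([d["bookId"] for d in by_type(collection, "book")])
```

A single filter over the shared collection returns an author together with their books, which is the point of co-locating the two types.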

    {
      "id": "1",
      "firstName": "Thomas",
      "lastName": "Andersen",
      "addresses": [
        {
          "line1": "100 Some Street",
          "line2": "Unit 1",
          "city": "Seattle",
          "state": "WA",
          "zip": …
        }
      ],
      "contactDetails": [
        { … },
        {"phone": "…", "extension": 5555}
      ]
    }

    { "id": "xyz", "username": "user xyz" }
    { "id": "address_xyz", "userid": "xyz", "address": { … } }
    { "id": "contact_xyz", "userid": "xyz", "…": "…", "phone": "…" }

Normalizing typically provides better write performance.

No magic bullet
Think about how your data is going to be written and read, and model accordingly.

    {
      "id": "1",
      "firstName": "Thomas",
      "lastName": "Andersen",
      "countOfBooks": 3,
      "books": [1, 2, 3],
      "images": [
        {"thumbnail": "…"},
        {"profile": "…"}
      ]
    }

    {
      "id": 1,
      "name": "DocumentDB 101",
      "authors": [
        {"id": 1, "name": "Thomas Andersen", "thumbnail": "…"},
        {"id": 2, "name": "William Wakefield", "thumbnail": "…"}
      ]
    }
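The hybrid model above trades extra work on write for cheap reads: the author carries a precomputed `countOfBooks`, and each book carries a copy of its authors' names. A minimal sketch of what a write then has to maintain, with `add_book` as a hypothetical helper operating on plain dicts:

```python
author = {
    "id": "1",
    "firstName": "Thomas",
    "lastName": "Andersen",
    "countOfBooks": 3,
    "books": [1, 2, 3],
}


def add_book(author_doc, book_id, book_name):
    """Create the book with a denormalized author name and update the author's aggregates."""
    book = {
        "id": book_id,
        "name": book_name,
        "authors": [{
            "id": author_doc["id"],
            "name": f'{author_doc["firstName"]} {author_doc["lastName"]}',
        }],
    }
    author_doc["books"].append(book_id)
    # The precomputed count must be kept in step with the reference list.
    author_doc["countOfBooks"] = len(author_doc["books"])
    return book
```

Each denormalized copy is one more place to update, which is why the read and write patterns should drive the choice.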

Understand the access patterns on your database:
- Read/write ratio
- Top queries, sprocs, and CRUD operations
- The life cycle of the data and the growth rate of documents

Use built-in properties:
- Use Id (id) to enforce a uniqueness constraint and for efficient querying
- Use TTL (ttl) to prune old data
- Use Timestamp (_ts) to check for incremental changes
- Use ETag (_etag) for optimistic concurrency and cache refresh semantics
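To make the ETag item concrete, here is a toy in-memory store that imitates the idea of optimistic concurrency: a replace succeeds only if the caller presents the `_etag` it last read. `TinyStore`, `ConflictError`, and the `if_match` parameter are hypothetical and do not represent any real SDK's API.

```python
import uuid


class ConflictError(Exception):
    """Raised when the document changed since the caller last read it."""


class TinyStore:
    """Toy document store that assigns a fresh _etag on every write."""

    def __init__(self):
        self._docs = {}

    def upsert(self, doc):
        stored = dict(doc, _etag=uuid.uuid4().hex)
        self._docs[doc["id"]] = stored
        return dict(stored)

    def read(self, doc_id):
        return dict(self._docs[doc_id])

    def replace(self, doc, if_match):
        # Reject the write if someone else replaced the document first.
        if self._docs[doc["id"]]["_etag"] != if_match:
            raise ConflictError(doc["id"])
        return self.upsert(doc)
```

Two readers that both load the same version can both attempt a replace, but only the first succeeds; the second gets a conflict and has to re-read before retrying.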
