CS422 Principles of Database Systems Introduction to NoSQL Chengyu Sun California State University, Los Angeles.


1 CS422 Principles of Database Systems Introduction to NoSQL Chengyu Sun California State University, Los Angeles

2 The Need for NoSQL: Big Data; Semi-structured Data; Large-scale Parallel Processing

3 Is CSNS Big? Database: 112 tables, 1.6 million records, 260 MB (including indexes). Files: 327,985 files, 71.4 GB. Data collected on 11/29/2015.

4 Some Data Are Definitely Big: Google processed 24 PB of data per day in 2009. Facebook had 1.5 PB of photos and 60 billion images in 2009. As of 7/1/2015, the Internet Archive Wayback Machine contained 23 PB of data and was growing at a rate of 50-60 TB per week.

5 How Big Is Big? Currently any data set over a few TB is considered big data: the data size is large enough to span several storage units, and a traditional RDBMS starts to show signs of stress.

6 Semi-Structured Data: data that has some structure but does not conform strictly to a schema. Data may be irregular or incomplete. Structure may change rapidly or unpredictably.

7 Semi-Structured Data Example: HTML Pages Many HTML pages have a structure: header, footer, menu, title, content … But many don’t And those that do implement the structure in all kinds of different ways HTML5 introduces many new tags New data, e.g. Same data now under different tags, e.g. vs.

8 The “Web Scale”: Google (2012), 3.3 billion searches per day; Twitter (2013), 500 million tweets per day; Facebook (4/2015), 936 million daily active users

9 Scalability: the ability of a system to increase throughput with the addition of resources to address load increases. Vertical scaling: faster CPUs, more memory, bigger hard drives … Horizontal scaling: add more nodes to server clusters.

10 Why Not RDBMS? Its strengths are also its weaknesses.
Schema. Strength: clearly defines data and relationships; ensures data quality and integrity. Weakness: not suitable for semi-structured data.
ACID. Strength: guarantees the correctness of the operations and the durability of the data. Weakness: makes it very difficult to scale.
SQL. Strength: one language for all data and all RDBMS. Weakness: impedes rapid application development (RAD) due to the mismatch between SQL and the application languages.

11 NoSQL No SQL, Not Only SQL, Not Relational … A term that describes a class of data storage and manipulation technologies and products that do not follow the RDBMS principles and focus on large datasets, performance, scalability, and agility.

12 Types of NoSQL Databases: Key-Value Stores; Column Family Stores; Graph Databases; Document Databases

13 Key-Value Stores: simple, fast, scalable
Product: used by
Redis: Twitter, GitHub, Snapchat, Craigslist
Dynamo: EA, New York Times, HTC
Cassandra: Facebook, Twitter, Reddit
Voldemort: LinkedIn
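To make the "simple" part concrete, here is a minimal in-memory sketch of the key-value model. The class name, method names, and `user:1` style keys are illustrative assumptions and do not correspond to the API of any product listed above.

```javascript
// A toy key-value store: opaque values looked up by key, nothing else.
// KeyValueStore, put, get and remove are hypothetical names for illustration.
class KeyValueStore {
  constructor() {
    this.data = new Map(); // key -> value; no schema, no relationships
  }
  put(key, value) {
    this.data.set(key, value);
  }
  get(key) {
    return this.data.get(key); // undefined when the key is absent
  }
  remove(key) {
    return this.data.delete(key); // true if the key existed
  }
}

const store = new KeyValueStore();
// Values are stored as opaque strings; the store never looks inside them.
store.put('user:1', JSON.stringify({ first_name: 'John', last_name: 'Doe' }));
store.put('user:2', JSON.stringify({ first_name: 'Jane' }));

const john = JSON.parse(store.get('user:1'));
```

Because every operation touches exactly one key, data can in principle be spread over many nodes by key, which is one reason this model is considered easy to scale.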

14 Column Family Stores: data is stored in a column-oriented way, as opposed to the row-oriented format in an RDBMS
Product: used by
BigTable: Google
HBase: Facebook, Yahoo, Hulu and others
Hypertable: Baidu, Rediff

15 Column and Column Family Columns: first_name, last_name, gender, occupation, zip_code. Column families: name (first_name, last_name), profile (gender, occupation), location (zip_code). Column families typically need to be predefined, while new columns can be added at any time.

16 Units of Data
row-key: 1; first_name: John; last_name: Doe; gender: male; zip_code: 10001
row-key: 2; first_name: Jane; zip_code: 10002

17 Data Storage
name: row-key: 1, first_name: John, last_name: Doe; row-key: 2, first_name: Jane
location: row-key: 1, zip_code: 10001; row-key: 2, zip_code: 10002
profile: row-key: 1, gender: male
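The way rows are split by column family can be sketched in a few lines of JavaScript. This is only an illustration of the grouping idea, using the families and sample rows from the slides; the function name `storeByFamily` and the `rowKey` property are assumptions, and real column family stores use their own on-disk formats.

```javascript
// Column-to-family mapping taken from the slides.
const families = {
  name: ['first_name', 'last_name'],
  profile: ['gender', 'occupation'],
  location: ['zip_code']
};

// The two sample rows; row 2 simply lacks most columns.
const rows = [
  { rowKey: 1, first_name: 'John', last_name: 'Doe', gender: 'male', zip_code: 10001 },
  { rowKey: 2, first_name: 'Jane', zip_code: 10002 }
];

// Split every row into one fragment per column family.
// A family that has no values for a row stores nothing for that row.
function storeByFamily(rows, families) {
  const storage = {};
  for (const family of Object.keys(families)) {
    storage[family] = [];
    for (const row of rows) {
      const fragment = { rowKey: row.rowKey };
      let hasData = false;
      for (const column of families[family]) {
        if (column in row) {
          fragment[column] = row[column];
          hasData = true;
        }
      }
      if (hasData) storage[family].push(fragment);
    }
  }
  return storage;
}

const storage = storeByFamily(rows, families);
```

Here `storage.profile` ends up with a single fragment (row 1), matching the slide: missing columns cost no space, and a query that only needs zip codes reads only the `location` family.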

18 Graph Databases: store vertices (i.e. entities) and edges (i.e. relationships between vertices); optimized for graph storage and processing
Product: used by
Neo4j: InfoJobs, Adidas
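As a rough picture of the vertex and edge model, the sketch below keeps a tiny graph in plain JavaScript structures. The data, the edge labels, and the `neighbors` helper are made up for illustration; this is not how Neo4j stores or queries data, and a real graph database indexes edges instead of scanning them.

```javascript
// Vertices (entities), keyed by an id.
const vertices = new Map([
  ['john', { type: 'user', name: 'John' }],
  ['jane', { type: 'user', name: 'Jane' }],
  ['cs422', { type: 'course', name: 'CS422' }]
]);

// Edges (relationships between vertices), each with a label.
const edges = [
  { from: 'john', to: 'jane', label: 'FRIEND_OF' },
  { from: 'john', to: 'cs422', label: 'ENROLLED_IN' },
  { from: 'jane', to: 'cs422', label: 'ENROLLED_IN' }
];

// Follow the outgoing edges with a given label from one vertex.
function neighbors(vertexId, label) {
  return edges
    .filter(e => e.from === vertexId && e.label === label)
    .map(e => vertices.get(e.to));
}

const johnsCourses = neighbors('john', 'ENROLLED_IN');
```

The point of the model is that relationships are first-class data: a question like "which courses is John enrolled in" is a traversal from one vertex rather than a join across tables.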

19 Document Databases: a document in a document database consists of a loosely structured set of key-value pairs
Product: used by
MongoDB: Facebook, Craigslist, Adobe
CouchDB: Apple, BBC

20 “Document” Example { 'first_name': 'John', 'last_name': 'Doe', 'age': 20, 'address': { 'street': '123 Main', 'city': 'Los Angeles', 'state': 'CA' } } It’s really an object!

21 Objects in JavaScript (I) A JavaScript object consists of a set of properties, which can be added dynamically: var car = new Object(); car.make = 'Honda'; car.model = 'Civic'; car.year = 2001; var owner = new Object(); owner.name = 'Chengyu'; car.owner = owner;

22 Objects in JavaScript (II) Object Literal: var car = { make: 'Honda', model: 'Civic', year: 2001, owner: { name: 'Chengyu' } };

23 Objects in JavaScript (III) JSON (JavaScript Object Notation): var car = { "make": "Honda", "model": "Civic", "year": 2001, "owner": { "name": "Chengyu" } };
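The relationship between an object and its JSON text can be shown with the standard `JSON.stringify` and `JSON.parse` functions. The snippet below is a small sketch that reuses the `car` object from the slides; the variable names `text` and `copy` are arbitrary.

```javascript
// An ordinary object literal, as on the previous slide.
const car = { make: 'Honda', model: 'Civic', year: 2001, owner: { name: 'Chengyu' } };

// Serialize the object to JSON text; property names come out double-quoted.
const text = JSON.stringify(car);

// Parse the text back into a new, independent object with the same structure.
const copy = JSON.parse(text);
copy.owner.name = 'Jane'; // changes the copy only, not the original
```

JSON is text, so it can be stored or sent over a network; the object it describes only exists again after parsing.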

24 NoSQL Database Example: MongoDB http://www.indeed.com/jobtrends

25 MongoDB: a server hosts one or more databases (DBs); each database contains collections.

26 MongoDB Shell: > mongo starts a command-line client that provides an interactive JavaScript interface to MongoDB.

27 Basic MongoDB Shell Commands: help; show dbs; use <db> (switches to database <db>, which won’t be created until some data is inserted into it); show collections; db.dropDatabase()

28 Some Collection Methods: db.<collection>.insert(), db.<collection>.update(), db.<collection>.save(), db.<collection>.find(), db.<collection>.remove() https://docs.mongodb.org/manual/reference/method/js-collection/

29 Basic CRUD Operations: create a database test1; create two documents (i.e. objects or records), John and Jane; save the two documents to a collection users; query the collection.
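One way these steps might look in the shell's JavaScript is sketched below. The field values for Jane and the function name `createAndQueryUsers` are assumptions made for illustration. The sketch uses `insert()` because it is the method listed on the earlier slide; newer shells favor `insertOne()` and `insertMany()`.

```javascript
// Two documents; they do not need to share the same set of fields.
const john = {
  first_name: 'John', last_name: 'Doe', age: 20,
  address: { street: '123 Main', city: 'Los Angeles', state: 'CA' }
};
const jane = { first_name: 'Jane', last_name: 'Doe', age: 25 };

// `database` is a database handle, e.g. db.getSiblingDB('test1') in the shell.
// Neither the database nor the collection has to exist beforehand:
// both are created when the first document is inserted.
function createAndQueryUsers(database) {
  const users = database.getCollection('users');
  users.insert(john);
  users.insert(jane);
  return users.find(); // no arguments: match every document
}
```

In the shell this would be invoked as `createAndQueryUsers(db.getSiblingDB('test1'))`, which is equivalent to typing `use test1` followed by the individual `db.users.insert(...)` and `db.users.find()` commands.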

30 Using find(): find( query, projection ). Both query and projection are documents in the form of { field1: <value>, field2: <value>, … } https://docs.mongodb.com/manual/tutorial/query-documents/

31 Query Examples: list all users; find the users whose first names are John; find the first names of the users whose last names are Doe; find the users who are older than 20; find the users who are older than 20 and younger than 30; find the users who are younger than 20 or older than 30; find the users who live in CA.
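The query and projection documents for these examples could be written as follows. This is a sketch that assumes the documents have the shape of the earlier “Document” example (`first_name`, `last_name`, `age`, and a nested `address.state`); the property names of the `queries` object are just labels chosen here.

```javascript
// Query documents, one per example on the slide.
const queries = {
  allUsers: {},                                   // empty query matches everything
  firstNameJohn: { first_name: 'John' },          // equality match
  lastNameDoe: { last_name: 'Doe' },
  olderThan20: { age: { $gt: 20 } },              // comparison operator
  between20And30: { age: { $gt: 20, $lt: 30 } },  // both conditions on one field
  under20OrOver30: { $or: [{ age: { $lt: 20 } }, { age: { $gt: 30 } }] },
  livesInCA: { 'address.state': 'CA' }            // dot notation for a nested field
};

// Projection document: return only first_name
// (the _id field is also returned unless it is explicitly excluded).
const firstNameOnly = { first_name: 1 };

// Shell usage, assuming a collection named users:
//   db.users.find(queries.allUsers)
//   db.users.find(queries.lastNameDoe, firstNameOnly)
//   db.users.find(queries.under20OrOver30)
```

Since queries are ordinary documents, they can be built, stored, and combined like any other JavaScript object before being passed to `find()`.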

32 Programming Language Support: drivers for various server-side programming languages, https://docs.mongodb.org/ecosystem/drivers/

33 Readings: Professional NoSQL by Shashank Tiwari; MongoDB Manual, https://docs.mongodb.org/manual/

