VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Aggregation.

Slides:



Advertisements
Similar presentations
Senior Solutions Architect, MongoDB James Kerr Security Features Preview Field Level Access Control.
Advertisements

What is a Database By: Cristian Dubon.
Concepts of Database Management Sixth Edition
Concepts of Database Management Seventh Edition
Chapter 11 Group Functions
Introduction to Structured Query Language (SQL)
Using Relational Databases and SQL Steven Emory Department of Computer Science California State University, Los Angeles Lecture 6: Set Functions.
Introduction to Structured Query Language (SQL)
A Guide to SQL, Seventh Edition. Objectives Retrieve data from a database using SQL commands Use compound conditions Use computed columns Use the SQL.
Microsoft Access 2010 Chapter 7 Using SQL.
WRITING BASIC SQL SELECT STATEMENTS Lecture 7 1. Outlines  SQL SELECT statement  Capabilities of SELECT statements  Basic SELECT statement  Selecting.
SQL Operations Aggregate Functions Having Clause Database Access Layer A2 Teacher Up skilling LECTURE 5.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Read Lecturer.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Write Lecturer.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Data Modeling.
MS Access: Database Concepts Instructor: Vicki Weidler.
Concepts of Database Management, Fifth Edition
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Document Type Definition.
Relational DBs and SQL Designing Your Web Database (Ch. 8) → Creating and Working with a MySQL Database (Ch. 9, 10) 1.
Chapter 3 Single-Table Queries
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall1 Exploring Microsoft Office Access Committed to Shaping the Next Generation.
MongoDB An introduction. What is MongoDB? The name Mongo is derived from Humongous To say that MongoDB can handle a humongous amount of data Document.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation An Introduction to XQuery.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation An Introduction to XQuery.
Lesson 1: Introduction to ABAP OBJECTS Todd A. Boyle, Ph.D. St. Francis Xavier University.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation eXist Update Lecturer.
Analyzing Data For Effective Decision Making Chapter 3.
Chapter 8 Cookies And Security JavaScript, Third Edition.
SQL: Data Manipulation Presented by Mary Choi For CS157B Dr. Sin Min Lee.
Arrays An array is a data structure that consists of an ordered collection of similar items (where “similar items” means items of the same type.) An array.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Query Data Model Lecturer.
1 Single Table Queries. 2 Objectives  SELECT, WHERE  AND / OR / NOT conditions  Computed columns  LIKE, IN, BETWEEN operators  ORDER BY, GROUP BY,
Concepts of Database Management Seventh Edition
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Identity Constraints.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Exam and Lecture Overview.
 Agenda 2/20/13 o Review quiz, answer questions o Review database design exercises from 2/13 o Create relationships through “Lookup tables” o Discuss.
6 1 Lecture 8: Introduction to Structured Query Language (SQL) J. S. Chou, P.E., Ph.D.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Architecture.
Concepts of Database Management Eighth Edition Chapter 3 The Relational Model 2: SQL.
Keyword Searching Weighted Federated Search with Key Word in Context Date: 10/2/2008 Dan McCreary President Dan McCreary & Associates
Advanced SELECT Queries CS 146. Review: Retrieving Data From a Single Table Syntax: Limitation: Retrieves "raw" data Note the default formats… SELECT.
Queries SELECT [DISTINCT] FROM ( { }| ),... [WHERE ] [GROUP BY [HAVING ]] [ORDER BY [ ],...]
DATA RETRIEVAL WITH SQL Goal: To issue a database query using the SELECT command.
Concepts of Database Management Seventh Edition Chapter 3 The Relational Model 2: SQL.
XML Query: xQuery Reference: Xquery By Priscilla Walmsley, Published by O’Reilly.
Access Queries Agenda 6/16/14 Review Access Project Part 1, answer questions Discuss queries: Turning data stored in a database into information for decision.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
SqlExam1Review.ppt EXAM - 1. SQL stands for -- Structured Query Language Putting a manual database on a computer ensures? Data is more current Data is.
Part:2.  Keywords are words with special meaning in JavaScript  Keyword var ◦ Used to declare the names of variables ◦ A variable is a location in the.
A Guide to SQL, Eighth Edition Chapter Four Single-Table Queries.
Lesson 4: Querying a Database. 2 Learning Objectives After studying this lesson, you will be able to:  Create, save, and run select queries  Set query.
Aggregator  Performs aggregate calculations  Components of the Aggregator Transformation Aggregate expression Group by port Sorted Input option Aggregate.
1 Chapter 3 Single Table Queries. 2 Simple Queries Query - a question represented in a way that the DBMS can understand Basic format SELECT-FROM Optional.
7 1 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 7.6 Advanced Select Queries SQL provides useful functions that.
Aggregator Stage : Definition : Aggregator classifies data rows from a single input link into groups and calculates totals or other aggregate functions.
LM 5 Introduction to SQL MISM 4135 Instructor: Dr. Lei Li.
Retrieving Information Pertemuan 3 Matakuliah: T0413/Current Popular IT II Tahun: 2007.
Concepts of Database Management, Fifth Edition Chapter 3: The Relational Model 2: SQL.
Lecturer : Dr. Pavle Mogin
Lecturer : Dr. Pavle Mogin
Aggregation Aggregations operations process data records and return computed results. Aggregation operations group values from multiple documents together,
SQL: Structured Query Language DML- Queries Lecturer: Dr Pavle Mogin
MongoDB Aggregations.
MongoDB Aggregations.
MongoDB Read/Write.
CS4222 Principles of Database System
Computing in COBOL: The Arithmetic Verbs and Intrinsic Functions
Section 4 - Sorting/Functions
MongoDB Aggregations.
Shelly Cashman: Microsoft Access 2016
Presentation transcript:

VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Aggregation Lecturer : Dr. Pavle Mogin

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 1 Plan for MongoDB Aggregation Aggregation Pipeline Map – Reduce Single Purpose Aggregation –Reedings: Have a look at Readings on the Home Page

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 2 Aggregation Operations in MongoDB Three classes of aggregation commands: –Aggregation Pipeline, –Map – Reduce, and –Single Purpose Aggregation All three commands process documents from a single collection Aggregation Pipeline uses native MongoDB operations and is the preferred aggregation method Map – Reduce: –Has two phases: a map phase that processes each input document and emits objects, and a reduce phase that combines the output of the map operation –It is based on custom JavaScript functions and aimed for very large collection

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 3 Single Purpose Aggregation (SPA) Single Purpose Aggregation is simple but limited in scope, everything supported by it can be done by Aggregation Pipeline In principle, SPA supports: count, distinct, and group operations db.myclasses.count() –Returns the number of all documents in myclasses db.myclasses.count({ 'course.code': {$regex:/^SWEN/}}) –Returns the number of SWEN documents in myclasses db.myclasses.distinct('course.code‘) –Returns the distinct course codes in myclasses

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 4 Aggregation Pipeline Aggregation pipeline consists of stages Each stage transforms the documents as they pass through the pipeline: –Some stages transform old into new documents, –Others filter out documents, –The same stage can appear multiple times in a pipeline For the aggregation pipeline, MongoDB provides the db.collection.aggregate() method in the mongo shell Pipeline stages appear in an array Documents pass through the stages in sequence db.collection.aggregate([{ },...])

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 5 Stage Operators(1) $group : –Groups all input documents by a specified identifier expression, –Applies the accumulator expression(s), if specified, to each group, –Outputs one document per each distinct group, –The output documents only contain the identifier field and, if specified, accumulated fields $limit : –Passes only the first n documents unmodified to the pipeline where n is the specified limit $match : –Filters the document stream to allow only matching documents to pass unmodified into the next pipeline stage, –Uses standard MongoDB queries

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 6 Stage Operators(2) $out : –Writes the resulting documents of the aggregation pipeline to a collection –To be used, must be the last stage in the pipeline $project : –Reshapes each document in the stream by adding new or removing existing fields, –For each input document, outputs one document $redact : –Reshapes each document in the stream by restricting the content for each document based on information stored in the documents themselves, –Incorporates the functionality of $project and $match, –For each input document, outputs at most one document

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 7 Stage Operators(3) $skip : –Skips the first n documents and passes the remaining documents unmodified to the pipeline $sort: –Reorders the document stream by a specified sort key(s) –For each input document, outputs one unmodified document $unwind : –Deconstructs an array field from the input documents to output a document for each array element –Each output document replaces the array with an element value –For each input document, outputs n documents where n is the number of array elements and can be zero for an empty array

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 8 Pipeline Expressions (1) Some pipeline stages take pipeline expressions as their operands Pipeline expressions specify the transformation to apply to the input documents Expressions have a document structure and can contain other expressions Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other documents: –Expression operations provide in-memory transformation of documents

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 9 Pipeline Expressions (2) All expressions, except accumulator expression are stateless and are only evaluated when seen by the aggregation process The accumulators, used with the $group pipeline operator, maintain their state (e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline Expression operators take an array of arguments and have the following form: { : [,... ] } If an operator accepts a single argument { : }

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 10 Expression Operators(1) Boolean Operators: $and, $or, and $not –In addition to the false boolean value, Boolean expressions evaluate as false the following: null, 0, and undefined values Arithmetic Operators: $add, $divide, $mod, $multiply, $subtract ¶ ¶ –Some arithmetic expression operators ( $add and $subtract ) can also support date arithmetic Comparison expression operators: $cmp, $eq, $gt, $gte, $lt, $lte, $ne $cmp: –Returns: 0 if the two values are equivalent, 1 if the first value is greater than the second, and -1 if the first value is less than the second.

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 11 Expression Operators(2) String Operators: $concat, $strcasecmp, $substr, $toLower, $toUpper Array Operators: $size –Returns the number of elements in the array Variable Operators: $let and $map $let: –Defines variables for use within the scope of a subexpression and returns the result of the subexpression $map: –Applies a subexpression to each element of an array and returns the array of resulting values in order

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 12 Set Expression Operators Set Operators perform set operation on single level arrays, treating arrays as sets –The operation filters out duplicates in the result –The order of the elements in the output array is unspecified Operators: –$allElementsTrue –$anyElementTrue –$setDifference –$setEquals –$setIntersection –$setIsSubset –$setUnion

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 13 Date Expression Operators The following operators return a number for a date: –$dayOfMonth, –$dayOfWeek, –$dayOfYear, –$hour, –$millisecond, –$minute, –$month, –$second, –$week, –$year

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 14 Accumulator Expression Operators (1) Accumulator Operators are available only for the $group stage Accumulators compute values by combining documents that share the same group key ( _id ) Group key has to be defined Accumulators: –Take as input a single expression, –Evaluate the expression once for each input document, and –Maintain their state for each group of documents

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 15 Accumulator Expression Operators (2) The $group stage has the following prototype form {$group: {_id:, : { : },...} } The _id field is mandatory; however, you can specify an _id value of null to calculate accumulated values for all the input documents as a whole The remaining computed fields are optional and if exist, are computed using the operators

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 16 Accumulator Expression Operators (3) $addToSet –Returns an array of unique expression values for each group $avg $first –Returns a value from the first document for each group $last –Returns a value from the last document for each group $max $min $push –Returns an array of expression values for each group $sum

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 17 Field Path to Access Fields Expressions in the $group and project stages use field path to access fields in the input documents by prefixing the field name with a dollar sign $ : – "$year" to specify the field path to year, or –"$course.title" to specify the field path to “course.title" The output from a db.collection.aggregate() having any of $group, $project, or $reshape stages contains only those fields that are defined using field path, or are generated by expressions

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 18 A Document Example { _id: ObjectId(“ ab01”) course: {code: “SWEN432”, title: “Advanced DB”}, year: 2014, coordinator: “Pavle”, no_of_st: 11, prerequisites: [“SWEN304”, “COMP261”, “NWEN304"], ratings: [1.5, 1.8, 1.3], last_modified: ISODate(" T02:14:37.948Z") }

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 19 Pipeline Aggregate Examples(1) Find the number of all students enrolled in year: 2015 courses db.myclasses.aggregate([{ $match: {year: 2015} }, { $group: {_id: null, total_enrollment: {$sum: "$no_of_st" }}} ])

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 20 Pipeline Aggregate Examples(2) Retrieve course codes by year and month –Take month from last_modified db.users.aggregate( [ {$match: {last_modified: {$exists: true}}}, {$project: {year: “$year”, month: {$month: "$last_modified"}, code: "$course.code", _id: 0}}, {$sort: {year: 1, month: 1} } ] ) The field last_modified has to exist in all documents that enter the $project stage

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 21 Pipeline Aggregate Examples(3) Find the number of courses per group (like: SWEN, NWEN, COMP, ECEN, ENGR) –Display in a descending order of number values db.myclasses.aggregate([ {$group: {_id : "$course.code”, number: {$sum : 1} }}, {$sort: {"number" : -1} } ]) The aggregate expression operator $sum counts courses by adding 1 to the current number

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 22 Pipeline Aggregate Examples(4) Find the five courses that appear most often as prerequisites –Display in descending order db.myclasses.aggregate([ {$unwind: "$prerequisites"}, {$group: {_id:"$prerequisites", number: {$sum : 1}}}, {$sort: {number: -1}}, {$limit: 5} ] ) The $unwind operator separates each value in the prerequisites array, and creates a new version of the source document for every element in the array.

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 23 Pipeline Aggregate Examples(5) Find groups (like: SWEN, NWEN, COMP, ECEN,...) having at least 500 enrolments in 2015 –Display in descending order of enrolment numbers db.myclasses.aggregate( {$match: {year: 2015}}, {$group: {_id: {$substr: ["$course.code", 0, 4]}, enrolled: { $sum : "$no_of_st"} } }, {$match: {enrolled: {$gte: 500} } } ) The $substr expression operator extracts the first four characters of the course.code field as the group identifier value

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 24 Pipeline Aggregate Examples(6) Return course codes of classes having the smallest and the greatest number of students enrolled within each group (like: SWEN, NWEN, COMP, ECEN,...) in 2015 db.myclasses.aggregate({$match: {year: 2015, no_of_st: {$exists: true}}}, {$sort: {no_of_st: -1} }, {$group: {_id : {$substr: ["$course.code", 0, 4]}, biggestClass: {$first: "$course.code"}, bigestEnrollment: {$first: "$no_of_st"}, smallestClass: {$last: "$course.code"}, smallestEnrollment: {$last: "$no_of_st"} } } )

Advanced Database Design and Implementation 2015 MongoDB_Aggregation 25 Summary Three ways to do aggregation: –Pipeline (preferred), –Map Reduce, and –Single purpose Aggregation Pipeline consists of stages Stage operators: –$match, $group, $project, $sort, $skip,… Each stage operator contains expression(s) that specify transformations of documents Expression operators: –Boolean, Arithmetic, Comparison, String, Set, Date, Accumulator,... Expressions in the $group and $project stages use field path to access fields in the input documents Examples show use cases