MongoDB Aggregations.


Intro
Last week we talked about CRUD (Create, Read, Update, Delete). Aggregations are very powerful: they let you get statistics about large amounts of data and create graphs to visualize that data.

What are aggregations?
From the MongoDB documentation: Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result.
Reference: https://docs.mongodb.com/manual/aggregation/
Aggregations let you look at massive amounts of data in a simplified way. Example: counting how many students are in a Students table.

MySQL recap of aggregations
GROUP BY was a way to aggregate data. For example, count the number of titles published by each artist (take a look at the SQL aggregation ppt for more review). The ArtistID column is qualified with its table name because it exists in both tables:
    SELECT Artists.ArtistID, COUNT(*) FROM Artists INNER JOIN Titles ON Artists.ArtistID = Titles.ArtistID GROUP BY Artists.ArtistID;

MongoDB Review
Remember that you can do READ operations like the ones below:
    db.collection.find();
    // use pretty() to pretty-print the results
    db.collection.find().pretty();
    // simplest form of aggregation: use count() to count the matching documents
    db.collection.find().count();
    // or use distinct() to get the unique values of a field
    // (distinct() is called on the collection, not on the cursor returned by find())
    db.collection.distinct("fieldName");

MongoDB Review
Projections limit the number of fields returned by a find() query:
    db.collection.find( <query filter>, <projection> )
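To make the effect of a projection concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB code: the helper applyProjection, the students collection named in the comment, and the sample document are all hypothetical, and the sketch only handles simple inclusion projections.

```javascript
// Plain JavaScript sketch (not the MongoDB API): mimics an inclusion projection.
// applyProjection and the sample document are hypothetical.
function applyProjection(doc, projection) {
  const out = {};
  // _id is returned by default unless the projection sets _id: 0
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, include] of Object.entries(projection)) {
    if (field !== "_id" && include === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const studentDocs = [
  { _id: 1, name: "Ada", major: "CS", gpa: 3.9 },
];

// Roughly what db.students.find({}, { name: 1, major: 1, _id: 0 }) would return
const projected = studentDocs.map((d) =>
  applyProjection(d, { name: 1, major: 1, _id: 0 })
);
console.log(projected);
```

Only the requested fields survive; gpa and _id are left out of the result.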

MongoDB Aggregations
There are three types: the aggregation pipeline, map-reduce, and single-purpose aggregation operations. We will only be going over the aggregation pipeline. The other two are very useful, but we will not have time to cover them. I highly recommend that you check out the other two: https://docs.mongodb.com/manual/aggregation/

Aggregation Pipeline
The aggregation pipeline separates data aggregation into a sequence of stages, where each stage passes its output to the next one. The previous diagram separates the data into a $match stage and a $group stage. Aggregation pipelines are not limited to just $match and $group stages: https://docs.mongodb.com/manual/reference/operator/aggregation/#aggregation-pipeline-operator-reference

Learn By Example
Download and mongoimport the zips.json file to follow along. Each document in the zipcodes collection has the following form:
    { "_id": "10280", "city": "NEW YORK", "state": "NY", "pop": 5574, "loc": [ -74.016323, 40.710537 ] }

Learn By Example
The aggregation below returns states with a population of 10 million or more. It has two stages, $group and $match. The $group stage groups the documents by the state field, sums the pop values within each group, and assigns the sum to totalPop. The $match stage filters the grouped documents to output only those whose totalPop is greater than or equal to 10 million.
    db.zipcodes.aggregate( [
      { $group: { _id: "$state", totalPop: { $sum: "$pop" } } },
      { $match: { totalPop: { $gte: 10 * 1000 * 1000 } } }
    ] )
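The two stages can also be traced by hand. The following is a minimal sketch in plain JavaScript, not MongoDB code, of what $group followed by $match computes; the three documents and their population figures are made up for illustration.

```javascript
// Plain JavaScript sketch of the $group + $match logic (not the MongoDB API).
// The documents and population figures are made up for illustration.
const zipDocs = [
  { _id: "00001", city: "A", state: "XX", pop: 6000000 },
  { _id: "00002", city: "B", state: "XX", pop: 5000000 },
  { _id: "00003", city: "C", state: "YY", pop: 3000000 },
];

// Stage 1, like $group: { _id: "$state", totalPop: { $sum: "$pop" } }
const totals = new Map();
for (const doc of zipDocs) {
  totals.set(doc.state, (totals.get(doc.state) || 0) + doc.pop);
}
const grouped = [...totals].map(([state, totalPop]) => ({ _id: state, totalPop }));

// Stage 2, like $match: { totalPop: { $gte: 10 * 1000 * 1000 } }
const bigStates = grouped.filter((g) => g.totalPop >= 10 * 1000 * 1000);
console.log(bigStates);
```

Note that the filter runs on the grouped output, not on the original documents, which is why it plays the role of HAVING rather than WHERE in SQL.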

Equivalent MySQL command
    SELECT state, SUM(pop) AS totalPop FROM zipcodes GROUP BY state HAVING totalPop >= (10 * 1000 * 1000);

More accumulator operators
Name          Description
$sum          Returns the sum of numerical values. Ignores non-numeric values.
$avg          Returns the average of numerical values. Ignores non-numeric values.
$first        Returns a value from the first document for each group. Order is only defined if the documents are in a defined order.
$last         Returns a value from the last document for each group. Order is only defined if the documents are in a defined order.
$max          Returns the highest expression value for each group.
$min          Returns the lowest expression value for each group.
$push         Returns an array of expression values for each group.
$addToSet     Returns an array of unique expression values for each group.
$stdDevPop    Returns the population standard deviation of the input values.
$stdDevSamp   Returns the sample standard deviation of the input values.
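As a rough illustration of what each accumulator computes, the sketch below applies the same calculations to the values of a single group in plain JavaScript. It is not MongoDB code, the variable names are hypothetical, and the input values are made up.

```javascript
// Plain JavaScript sketch (not the MongoDB API): each line mirrors one accumulator.
const groupValues = [4, 8, 8, 12]; // values of one group, in document order

const total = groupValues.reduce((a, b) => a + b, 0);   // $sum
const mean = total / groupValues.length;                // $avg
const firstValue = groupValues[0];                      // $first
const lastValue = groupValues[groupValues.length - 1];  // $last
const highest = Math.max(...groupValues);               // $max
const lowest = Math.min(...groupValues);                // $min
const pushed = [...groupValues];                        // $push keeps duplicates
const uniqueValues = [...new Set(groupValues)];         // $addToSet drops duplicates

// $stdDevPop divides the squared deviations by n, $stdDevSamp by n - 1
const squaredDiffs = groupValues.reduce((a, v) => a + (v - mean) ** 2, 0);
const stdDevPop = Math.sqrt(squaredDiffs / groupValues.length);
const stdDevSamp = Math.sqrt(squaredDiffs / (groupValues.length - 1));

console.log({ total, mean, firstValue, lastValue, highest, lowest });
console.log({ pushed, uniqueValues, stdDevPop, stdDevSamp });
```

One difference to keep in mind: this sketch preserves insertion order for the unique values, whereas MongoDB does not guarantee the order of the array returned by $addToSet.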

More Examples (Return average city population by state)
There are two $group stages. The first groups the documents by the combination of city and state, and uses the $sum accumulator to get the total population for each combination (a city can span several zip codes). The second $group stage groups those results by state, averages the city totals in each group, and assigns that value to the avgCityPop field.
    db.zipcodes.aggregate( [
      { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
      { $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } }
    ] )
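The reason for grouping twice is easier to see on a tiny data set. The following plain JavaScript sketch (not MongoDB code, with made-up documents) first totals the zip codes of each city and only then averages the city totals per state.

```javascript
// Plain JavaScript sketch of the two $group stages (not the MongoDB API).
// The documents are made up; city "A" spans two zip codes.
const zips = [
  { city: "A", state: "XX", pop: 100 },
  { city: "A", state: "XX", pop: 300 },
  { city: "B", state: "XX", pop: 200 },
  { city: "C", state: "YY", pop: 50 },
];

// Stage 1: total population per (state, city) combination
const cityTotals = new Map();
for (const z of zips) {
  const key = JSON.stringify([z.state, z.city]);
  cityTotals.set(key, (cityTotals.get(key) || 0) + z.pop);
}

// Stage 2: average of the city totals per state
const perState = new Map();
for (const [key, pop] of cityTotals) {
  const [state] = JSON.parse(key);
  const acc = perState.get(state) || { sum: 0, count: 0 };
  acc.sum += pop;
  acc.count += 1;
  perState.set(state, acc);
}
const stateAverages = [...perState].map(([state, acc]) => ({
  _id: state,
  avgCityPop: acc.sum / acc.count,
}));
console.log(stateAverages);
```

Averaging the raw documents directly would give the average zip code population (200 for state XX here) instead of the average city population (300).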

More Examples (Return largest and smallest cities by state)
    db.zipcodes.aggregate( [
      { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
      { $sort: { pop: 1 } },
      { $group: {
          _id: "$_id.state",
          biggestCity: { $last: "$_id.city" },
          biggestPop: { $last: "$pop" },
          smallestCity: { $first: "$_id.city" },
          smallestPop: { $first: "$pop" }
      } },
      // the following $project is optional, and
      // modifies the output format.
      { $project: {
          _id: 0,
          state: "$_id",
          biggestCity: { name: "$biggestCity", pop: "$biggestPop" },
          smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
      } }
    ] )

Return largest and smallest cities by state
The aggregation pipeline has a $group stage, a $sort stage, another $group stage, and then a $project stage. The first $group stage groups documents by the combination of city and state and calculates the sum of the population. The $sort stage orders the documents by the pop field value from smallest to largest. The second $group stage groups the sorted documents by the _id.state field and outputs a document for each state; because the input is sorted, $first picks the smallest city and $last picks the largest. The last $project stage renames the _id field to state and moves biggestCity, biggestPop, smallestCity and smallestPop into the biggestCity and smallestCity embedded documents.
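The sort-then-pick-first-and-last idea can be sketched in plain JavaScript as follows. This is not MongoDB code; the input plays the role of the first $group stage's output, and the city names and populations are made up.

```javascript
// Plain JavaScript sketch of $sort followed by $group with $first/$last
// (not the MongoDB API). Input mimics the output of the first $group stage.
const cityPops = [
  { _id: { state: "XX", city: "A" }, pop: 400 },
  { _id: { state: "YY", city: "C" }, pop: 50 },
  { _id: { state: "XX", city: "B" }, pop: 200 },
  { _id: { state: "YY", city: "D" }, pop: 75 },
];

// Like $sort: { pop: 1 }, smallest population first
const sortedCities = [...cityPops].sort((a, b) => a.pop - b.pop);

// Like the second $group plus the $project reshaping:
// the first document seen per state is the smallest city, the last is the biggest
const extremes = new Map();
for (const doc of sortedCities) {
  const state = doc._id.state;
  const entry = { name: doc._id.city, pop: doc.pop };
  if (!extremes.has(state)) {
    extremes.set(state, { state, biggestCity: entry, smallestCity: entry });
  } else {
    extremes.get(state).biggestCity = entry;
  }
}
const cityExtremes = [...extremes.values()];
console.log(JSON.stringify(cityExtremes, null, 2));
```

Without the sort, "first" and "last" would depend on whatever order the documents happened to arrive in, which is why the $sort stage has to come before the second $group.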

References
SQL aggregation to MongoDB aggregation comparison
Aggregation pipeline API documentation