Welcome to MongoDC: MD Edition


1 Welcome to MongoDC: MD Edition
Wi-fi: Webs Public, Password: webspublic
7:00 Announcements. 7:15 Presentation.

2 Caleb Harris
Engineering Manager, Webs, a Vistaprint company

3 Welcome to Webs
Enjoy: pizza, beer, your seat, and this space

4 Announcements
Who’s hiring?

5 Aggregation in MongoDB
get your hands dirty!

6 It’s a pipeline…

7 Who’s got some data to aggregate?
Anyone?

8 Here, let me Google that for you…
Don’t worry, we found some data for you to play with.
The gov’t has been building lots of APIs.
We picked US foreign aid data.
Thanks, Obama!

9 Magical mongoimport incantation
Downloads]$ mongoimport \
    --host :27017 \
    --db us_gov_data \
    --collection loans_and_grants \
    --type csv \
    --file us_foreign_assistance.csv \
    --fields "type,recipient,program,unit,year,obligations"

10 Did it work?
Fire up your mongo shell. Should look something like:
mongo :27017/us_gov_data
Try this:
> db.loans_and_grants.count()
Answer should be 73814

11 Did it work (part 2)?
> db.loans_and_grants.findOne()
{
    "_id" : ObjectId("53aa3390c6d7f17eed497c2e"),
    "type" : "Economic",
    "recipient" : "Afghanistan",
    "program" : "Child Survival and Health",
    "unit" : "Constant 2011 $US",
    "year" : 2002,
    "obligations" :
}

12 pipeline operator: $match
filters the input collection
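
To picture what that filtering means without a MongoDB server at hand, here is a minimal plain-JavaScript sketch. The helper name matchStage and the sample documents (including the amounts) are made up for illustration, and the helper only handles simple equality conditions, which is far less than the real $match supports.

```javascript
// Hypothetical sample documents; the amounts are invented, not real aid figures.
const matchDocs = [
  { recipient: "Afghanistan", year: 2001, obligations: 100 },
  { recipient: "Afghanistan", year: 2002, obligations: 250 },
  { recipient: "Albania", year: 2001, obligations: 40 },
];

// matchStage is a hypothetical stand-in for { $match: condition }.
// It keeps a document only if every field in `condition` is strictly equal.
function matchStage(input, condition) {
  return input.filter((doc) =>
    Object.keys(condition).every((field) => doc[field] === condition[field])
  );
}

// Roughly analogous to { $match: { year: 2001 } }.
const only2001 = matchStage(matchDocs, { year: 2001 });
console.log(only2001.map((doc) => doc.recipient));
```

Documents that fail the condition never reach later stages, which is why $match usually goes as early in the pipeline as possible.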

13 Example: select only 2001 data
> results = db.loans_and_grants.aggregate([ { $match: { year: NumberLong(2001) }}])
{
    "result" : [
        {
            "_id" : ObjectId("53aa3390c6d7f17eed497c52"),
            "type" : "Economic",
            "recipient" : "Afghanistan",
            "program" : "Development Assistance",
            "unit" : "Constant 2011 $US",
            "year" : 2001,
            "obligations" :
        },
        ...
    ],
    "ok" : 1
}

14 Activity 1: Create new collection with data only in constant 2011 dollars

15 Activity 1 Solution, Mongo 2.4
> constant_dollars = db.loans_and_grants.aggregate([
... { $match: { unit: "Constant 2011 $US" }}
... ])
// bunch of results
> constant_dollars.result.forEach(
... function(r) {
... delete(r._id);
... db.loans_and_grants_constant_dollars.save(r)
... })

16 Activity 1 Solution, Mongo 2.6
> constant_dollars = db.loans_and_grants.aggregate([
... { $match: { unit: "Constant 2011 $US" }},
... { $out: "loans_and_grants_constant_dollars" }
... ])
// bunch of results

17 Not to be a pain… but did it work?
> db.loans_and_grants_constant_dollars.count()
36907
> db.loans_and_grants_constant_dollars.find({unit: {$ne: "Constant 2011 $US"}})
//bupkis
//Yay!

18 pipeline operator: $group
groups documents with common field values and applies aggregate operations
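
As a rough mental model of grouping with a $sum, here is a plain-JavaScript sketch that runs without MongoDB. The helper name groupSum and the sample documents (including the amounts) are invented for illustration; the real $group supports many more accumulators and does not promise any output order.

```javascript
// Hypothetical sample documents; the amounts are invented, not real aid figures.
const groupDocs = [
  { recipient: "Afghanistan", year: 2001, obligations: 100 },
  { recipient: "Afghanistan", year: 2002, obligations: 250 },
  { recipient: "Albania", year: 2001, obligations: 40 },
];

// groupSum is a hypothetical stand-in for
// { $group: { _id: "$<keyField>", <sumField>: { $sum: "$<sumField>" } } }.
function groupSum(input, keyField, sumField) {
  const totals = new Map();
  for (const doc of input) {
    const key = doc[keyField];
    // Start each group at 0, then add this document's value to its group.
    totals.set(key, (totals.get(key) || 0) + doc[sumField]);
  }
  // One output document per distinct key, shaped like $group output.
  return Array.from(totals, ([key, total]) => ({ _id: key, [sumField]: total }));
}

const byRecipient = groupSum(groupDocs, "recipient", "obligations");
console.log(byRecipient);
```

Calling groupSum(groupDocs, "year", "obligations") instead groups the same documents by year.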

19 Example: Sum aid by country
> db.loans_and_grants_constant_dollars.aggregate([
... {$group: {_id: "$recipient", obligations: {$sum: "$obligations"}}}
... ])
{
    "result" : [
        {
            "_id" : "Zimbabwe",
            "obligations" : NumberLong(" ")
        },
        ...

20 Activity 2: Sum aid by year

21 Activity 2 Solution
> db.loans_and_grants_constant_dollars.aggregate([
... {$group: {_id: "$year", obligations: {$sum: "$obligations"}}}
... ])
{
    "result" : [
        {
            "_id" : 1948,
            "obligations" : NumberLong(" ")
        },
        {
            "_id" : 1982,
            "obligations" : NumberLong(" ")
        ...

22 pipeline operator: $sort
it… uh… sorts stuff

23 Example: Sort by year
> db.loans_and_grants_constant_dollars.aggregate([
... { $sort: {year: 1} }
... ])
{
    "result" : [
        {
            "_id" : ObjectId("53aa39d7066ec46e8f699d95"),
            "type" : "Economic",
            "recipient" : "Albania",
            "program" : "Inactive Programs",
            "unit" : "Constant 2011 $US",
            "year" : 1946,
            "obligations" : NumberLong( )
        },
        {
            "_id" : ObjectId("53aa39d7066ec46e8f69a44c"),
            "recipient" : "Belgium",
            "obligations" :
            ...

24 Example: Sort by year, then amount
> db.loans_and_grants_constant_dollars.aggregate([
... { $sort: {year: 1, obligations: -1} }
... ])
{
    "result" : [
        {
            "_id" : ObjectId("53aa39da066ec46e8f6a148c"),
            "type" : "Economic",
            "recipient" : "World (not specified)",
            "program" : "Voluntary Contributions to Multilateral Organizations, Total",
            "unit" : "Constant 2011 $US",
            "year" : 1946,
            "obligations" : NumberLong(" ")
        },
        {
            "_id" : ObjectId("53aa39d9066ec46e8f69d38a"),
            "recipient" : "Italy",
            "program" : "Inactive Programs",
            "obligations" : NumberLong(" ")
            ...

25 Activity 3: Sort by year, then country

26 Activity 3 Solution
> db.loans_and_grants_constant_dollars.aggregate([
... { $sort: { year: 1, recipient: 1 } }
... ])
{
    "result" : [
        {
            "_id" : ObjectId("53aa39d7066ec46e8f699d95"),
            "type" : "Economic",
            "recipient" : "Albania",
            "program" : "Inactive Programs",
            "unit" : "Constant 2011 $US",
            "year" : 1946,
            "obligations" : NumberLong( )
        },
        {
            "_id" : ObjectId("53aa39d7066ec46e8f69a44c"),
            "recipient" : "Belgium",
            "obligations" :
            ...

27 combo! $group, preceded by $sort, allows use of $first and $last operators
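
Why the sort has to come first: $first just takes whatever document arrives first in each group, so the sort order decides which one that is. A minimal plain-JavaScript sketch of the idea, runnable without MongoDB; the sample documents and amounts are invented for illustration, and the loop is only an approximation of what the real stages do.

```javascript
// Hypothetical sample documents; the amounts are invented, not real aid figures.
const comboDocs = [
  { recipient: "Albania", year: 1999, obligations: 40 },
  { recipient: "Afghanistan", year: 2001, obligations: 100 },
  { recipient: "Afghanistan", year: 2002, obligations: 250 },
  { recipient: "Albania", year: 2000, obligations: 75 },
];

// Step 1, like { $sort: { recipient: 1, obligations: -1 } }:
// recipients ascending, and within a recipient the largest amount first.
const sortedDocs = [...comboDocs].sort(
  (a, b) => a.recipient.localeCompare(b.recipient) || b.obligations - a.obligations
);

// Step 2, like $group with $first: keep only the first document seen per recipient.
const firstPerRecipient = new Map();
for (const doc of sortedDocs) {
  if (!firstPerRecipient.has(doc.recipient)) {
    firstPerRecipient.set(doc.recipient, { obligations: doc.obligations, year: doc.year });
  }
}

console.log(firstPerRecipient.get("Afghanistan")); // { obligations: 250, year: 2002 }
console.log(firstPerRecipient.get("Albania"));     // { obligations: 75, year: 2000 }
```

Flipping the obligations sort to ascending would make the same loop pick each recipient's smallest year instead, which is the $last result under the original sort.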

28 Example: Find highest aid year per country and program
> db.loans_and_grants_constant_dollars.aggregate([
... { $sort: {recipient: 1, program: 1, obligations: -1} },
... { $group:
... { _id: { country: "$recipient", program: "$program" },
... obligations: {$first: "$obligations"}, year: {$first: "$year"} }}
... ])
{
    "result" : [
        {
            "_id" : {
                "country" : "Zimbabwe",
                "program" : "Title I"
            },
            "obligations" : NumberLong( ),
            "year" : 1992
            ...

29 Activity 4: Find highest aid country by year and program

30 Activity 5: Find highest aid country by year across all programs

31 Activity 6: Find the country that has received the most foreign aid

32 Activity 7: Find the country that received the biggest increase between 2001 and 2002

33 Activity 8: Find the country that has received aid in the most years

34 Next Time
Wednesday, June 18, 6:30 to 9:00, Webs office, Silver Spring
Send me your ideas! @MongoDC

