Published by Hector May. Modified over 9 years ago.
1
Contextual Search and Exploration
RuSSIR 2015, Saint Petersburg, Russia
Charles L. A. Clarke, University of Waterloo, Canada
Jaap Kamps, University of Amsterdam, The Netherlands
Julia Kiseleva, Eindhoven University of Technology, The Netherlands
Grace Hui Yang, Georgetown University, USA
(with special thanks to Adriel Dean-Hall, Waterloo)
2
Part 3: Hackathon
You have two days (until Thursday at 18:00).
Do something interesting with our data (or just something interesting on this topic), ideally in a group.
Presentation and prizes on Thursday.
3
Hackathon
Basic task: take our profiles (and a bunch of training data and resources) and make recommendations for us.
This could be done in a variety of ways (including manually).
Or do something else with the data.
4
Presentations on Thursday
Everyone who participated gets up to five minutes to speak (with or without slides).
Tell us what you did.
Tell us the results.
5
The data: http://plg.uwaterloo.ca/~claclark/russir2015/
6
Directory “Data”
Everything you really need to do the task.
    contexts2015spb.csv
    collection_2015_batch_requests_spb.csv
    batch_requests_combined.json
    sample_batch_response_combined.json
    batch_validate.py
7
contexts2015spb.csv
The context file contains all locations/cities.
    id,city,state
    151,New York City,NY
    152,Chicago,IL
    ...
    421,Walla Walla,WA
    422,Lewiston,ID
    423,Saint Petersburg,Russia
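A minimal sketch of reading the context file with Python's standard csv module. The sample rows are copied from the slide; the function name `load_contexts` is an illustrative choice, not part of the distributed scripts.

```python
import csv
import io

# Sample rows copied from the slide; the real file has many more cities.
SAMPLE = """id,city,state
151,New York City,NY
152,Chicago,IL
423,Saint Petersburg,Russia
"""

def load_contexts(fileobj):
    """Map each numeric context id to a (city, state) pair."""
    reader = csv.DictReader(fileobj)
    return {int(row["id"]): (row["city"], row["state"]) for row in reader}

contexts = load_contexts(io.StringIO(SAMPLE))
print(contexts[423])
```

For the real file, pass `open("contexts2015spb.csv", newline="", encoding="utf-8")` instead of the in-memory sample; the UTF-8 encoding is an assumption.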
8
collection_2015_batch_requests_spb.csv
The collection file contains all venues (ID,ContextID,URL,title).
    TRECCS-00000005-418,418,http://www.greatfallsmt.net/people_offices/park_rec/gibson.php,"Gibson Park"
    TRECCS-00000007-418,418,http://www.bostons.com,"Bostons Restaurant Sports Bar"
    TRECCS-00000101-423,https://foursquare.com/v/vinostudia/51401b0ee4b052f64a18688c,"Vinostudia"
    TRECCS-00000102-423,https://foursquare.com/v/le-tour-de-vin/5370e6d8498e666a1bfe1c09,"Le Tour de Vin"
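A hedged sketch of parsing venue rows. In the rows shown on the slide, the Saint Petersburg entries appear without a separate ContextID column, so this example takes the URL and title from the last two fields and reads the context id from the suffix of the venue ID. That suffix convention is an assumption based only on the sample rows, and `load_collection` is a hypothetical helper name.

```python
import csv
import io

# Two rows from the slide: one with a ContextID column, one without.
SAMPLE = '''TRECCS-00000005-418,418,http://www.greatfallsmt.net/people_offices/park_rec/gibson.php,"Gibson Park"
TRECCS-00000102-423,https://foursquare.com/v/le-tour-de-vin/5370e6d8498e666a1bfe1c09,"Le Tour de Vin"
'''

def load_collection(fileobj):
    """Return {venue_id: {"context_id", "url", "title"}} for each row."""
    venues = {}
    for row in csv.reader(fileobj):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        doc_id = row[0]
        # Assumption: the part after the last dash in the ID is the context id.
        context_id = int(doc_id.rsplit("-", 1)[1])
        # URL and title are always the last two fields in the sample rows.
        venues[doc_id] = {"context_id": context_id, "url": row[-2], "title": row[-1]}
    return venues

venues = load_collection(io.StringIO(SAMPLE))
print(venues["TRECCS-00000102-423"]["title"])
```

Using csv.reader rather than splitting on commas keeps quoted titles that contain commas intact.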
9
batch_requests_combined.json
This is the main file: profiles and candidates in JSON.
    { "body" : {
        "group" : "Friends",
        "duration" : "Longer",
        "season" : "Autumn",
        "trip_type" : "Holiday",
        "person" : { … "id" : 1234568, "age" : "47", "gender" : "male" },
        "location" : { "id" : 423, "lat" : 59.95, "lng" : 30.3, "name" : "Saint Petersburg" }
      },
      "id" : 901,
      "candidates" : [ "TRECCS-00000001-423", … "TRECCS-00000102-423" ] }
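A minimal sketch of pulling the context fields out of one request. The sample below is a trimmed, hypothetical request that follows the field names on the slide, with an empty preferences list and only two candidates. How requests are laid out inside the real file (one object per line, or a single array) is not stated on the slide, so the example parses a single object from a string.

```python
import json

# Hypothetical trimmed request using the field names shown on the slide.
SAMPLE_REQUEST = """
{"id": 901,
 "body": {"group": "Friends", "duration": "Longer", "season": "Autumn",
          "trip_type": "Holiday",
          "person": {"id": 1234568, "age": "47", "gender": "male",
                     "preferences": []},
          "location": {"id": 423, "lat": 59.95, "lng": 30.3,
                       "name": "Saint Petersburg"}},
 "candidates": ["TRECCS-00000001-423", "TRECCS-00000102-423"]}
"""

def summarize_request(request):
    """Collect the trip context, city, and candidate count of one request."""
    body = request["body"]
    context_keys = ("group", "duration", "season", "trip_type")
    return {
        "request_id": request["id"],
        "city": body["location"]["name"],
        # .get() tolerates requests where a context field is absent.
        "context": {key: body.get(key) for key in context_keys},
        "n_candidates": len(request["candidates"]),
    }

summary = summarize_request(json.loads(SAMPLE_REQUEST))
print(summary)
```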
10
batch_requests_combined.json (profile)
Preferences elsewhere:
    "person" : {
      "preferences" : [
        { "documentId" : "TRECCS-00247656-160", "tags" : [ "Bar-hopping", "Clubbing" ], "rating" : "4" },
        { "documentId" : "TRECCS-00211603-161", "tags" : [ "Fast Food", "Restaurants" ], "rating" : "0" },
        …
      ],
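One possible way to turn these preferences into a profile is a per-tag average of the ratings. The sketch below is an illustration, not the organizers' method. It assumes ratings are numeric strings on a 0 to 4 scale, that 2 is a neutral midpoint, and that a negative rating means "not rated". The helper names and the candidate tags passed to `score_candidate` are hypothetical.

```python
from collections import defaultdict

# The two preferences shown on the slide.
preferences = [
    {"documentId": "TRECCS-00247656-160",
     "tags": ["Bar-hopping", "Clubbing"], "rating": "4"},
    {"documentId": "TRECCS-00211603-161",
     "tags": ["Fast Food", "Restaurants"], "rating": "0"},
]

def tag_profile(preferences, neutral=2.0):
    """Average (rating - neutral) per tag over the rated preferences."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pref in preferences:
        rating = float(pref["rating"])
        if rating < 0:  # assumption: negative means the venue was not rated
            continue
        for tag in pref.get("tags", []):
            totals[tag] += rating - neutral
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}

def score_candidate(candidate_tags, profile):
    """Sum the profile weights of a candidate's tags; unknown tags count as 0."""
    return sum(profile.get(tag, 0.0) for tag in candidate_tags)

profile = tag_profile(preferences)
print(profile)
print(score_candidate(["Clubbing", "Restaurants"], profile))
```

Sorting the candidates of a request by this score, highest first, gives a simple baseline ranking, provided tags can be obtained for the candidates.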
11
sample_batch_response_combined.json
Example of a valid response (plus a script to validate the format).
    { "groupid" : "demo",
      "runid" : "demoA",
      "id" : 901,
      "body" : {
        "suggestions" : [ "TRECCS-00000099-423", "TRECCS-00000006-423", … "TRECCS-00000079-423" ]
      }
    }
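A small sketch of assembling a response with the fields shown above. `build_response` and `find_problems` are hypothetical helpers. The check is only a quick sanity test on duplicates and unknown IDs and does not reproduce whatever batch_validate.py verifies.

```python
import json

def build_response(request_id, ranked_ids, groupid="demo", runid="demoA"):
    """Wrap a ranked list of candidate IDs in the structure from the slide."""
    return {
        "groupid": groupid,
        "runid": runid,
        "id": request_id,
        "body": {"suggestions": list(ranked_ids)},
    }

def find_problems(response, candidates):
    """Hypothetical sanity check; not a substitute for batch_validate.py."""
    allowed = set(candidates)
    suggestions = response["body"]["suggestions"]
    problems = []
    if len(suggestions) != len(set(suggestions)):
        problems.append("duplicate suggestions")
    if any(doc_id not in allowed for doc_id in suggestions):
        problems.append("suggestion not among the candidates")
    return problems

candidates = ["TRECCS-00000006-423", "TRECCS-00000079-423", "TRECCS-00000099-423"]
response = build_response(901, ["TRECCS-00000099-423", "TRECCS-00000006-423"])
print(json.dumps(response, indent=2))
print(find_problems(response, candidates))
```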
12
Again, “Data”
Everything you really need to do the task.
    contexts2015spb.csv
    collection_2015_batch_requests_spb.csv
    batch_requests_combined.json
    sample_batch_response_combined.json
    batch_validate.py
13
Directory “Evaluation”
Everything you need to evaluate on the U.S. (non-Spb) data.
    TRECCS15_Batch_Candidates_graded.qrels: crowdsourced judgments on the candidates
    batch_response_to_trec.py: turns a JSON response into TREC format
    trec_eval.8.1.tar.gz: evaluate with trec_eval
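For orientation, trec_eval reads run files with one line per ranked document, in the form "query_id Q0 doc_id rank score run_tag". The sketch below shows what such a conversion could look like. It is not the distributed batch_response_to_trec.py, and it assumes that the request id serves as the query id and that a score decreasing with rank is acceptable.

```python
def response_to_trec_lines(response):
    """Render one response as trec_eval run lines (illustrative only)."""
    suggestions = response["body"]["suggestions"]
    total = len(suggestions)
    lines = []
    for rank, doc_id in enumerate(suggestions, start=1):
        # Assumption: any score that decreases with rank preserves the order.
        score = total - rank + 1
        lines.append(
            f"{response['id']} Q0 {doc_id} {rank} {score} {response['runid']}"
        )
    return lines

response = {
    "groupid": "demo",
    "runid": "demoA",
    "id": 901,
    "body": {"suggestions": ["TRECCS-00000099-423", "TRECCS-00000006-423"]},
}
lines = response_to_trec_lines(response)
print("\n".join(lines))
```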
14
Directory “Crawl”
If you want the crawled URLs (WARC format).
    crawls_batch_requests_TRECCS.zip: all web pages of U.S. venues
    collection_2015_spb_nodesc.zip: all web pages of Spb venues
16
Additional information about USA attractions (directory “infoUS”)
    cat_dict.json: categories for each attraction id (from a commercial service)
    rating_dict.json: ratings for each attraction id (from a commercial service)
17
Full TREC collection of USA attractions (directory “TREC”)
    contexts2015.csv: mapping between numeric context ids and cities
    collection_2015.csv: triples mapping attraction id, context id, attraction URL
18
Discussion
Ideas? Groups?