Using the Web to Interactively Learn to Find Objects
  Mehdi Samadi, Thomas Kollar, Manuela Veloso
  National and Kapodistrian University of Athens
  Postgraduate course: Advanced Artificial Intelligence
  Anna Kokkinou, student ID (AM): M1253
INTRODUCTION
  AIM: Build mobile robots that can perform tasks intelligently.
  APPROACHES:
    Manually define the robot's knowledge.
    Enable the robot to actively query the World Wide Web.
RELATED WORK
  Yamauchi 1997 (baseline search)
  Stachniss, Grisetti and Burgard 2005 (focus on balancing the trade-off between localization quality and the need to explore new areas)
  Meger 2008 (describes an integrated robotic platform)
  Kollar and Roy 2009 (integration between robots and the Web)
  Sjöö 2009; Aydemir 2011; Velez 2011; Joho, Senk and Burgard 2011 (focus on visual object search that does not leverage the Web)
Object Eval (OE)
  An approach that addresses the challenges of the find-and-deliver task by querying the Web.
  Stages of the find-and-deliver task: receive the command, go to a location, ask for the object, get the object, deliver the object.
  PROCESS:
    Download and categorize a set of web pages.
    Learn the probability that a location (e.g. kitchen) will contain an object (e.g. coffee).
    Use this probability in the utility function.
FIND AND DELIVER TASK
UTILITY FUNCTION (part 1)
  Input: object name (e.g. papers) and destination room
  Output: a plan consisting of the locations that the robot should visit
  $$U(\mathrm{plan} \mid O) = \sum_{i} p(\mathrm{plan}_i \mid O) \times R(\mathrm{plan}_i \mid O) \qquad (1)$$
  U: utility function
  O: object name
  p(plan_i | O): probability of finding the object at the location visited in the i-th step
  R(plan_i | O): reward that the robot receives when it executes the i-th step
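Equation (1) can be illustrated with a minimal Python sketch. The function name `plan_utility` and the probabilities and rewards below are hypothetical values chosen for illustration, not figures from the paper.

```python
def plan_utility(step_probs, step_rewards):
    """Equation (1): sum over plan steps of p(plan_i | O) * R(plan_i | O)."""
    if len(step_probs) != len(step_rewards):
        raise ValueError("need exactly one probability and one reward per step")
    return sum(p * r for p, r in zip(step_probs, step_rewards))


# Hypothetical three-step plan for one object (illustrative numbers only).
probs = [0.5, 0.25, 0.125]    # p(plan_i | O) for i = 1, 2, 3
rewards = [1.0, 0.5, 0.5]     # R(plan_i | O) for i = 1, 2, 3
utility = plan_utility(probs, rewards)
print(utility)
```

Each step contributes its probability of success weighted by its reward, so a step that is unlikely to find the object adds little utility even when its reward is high.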
UTILITY FUNCTION (part 2)
  The problem can be formulated as finding the plan that maximizes the utility:
  $$\arg\max_{\mathrm{plan}} \; U(\mathrm{plan} \mid O) \qquad (2)$$
  $$R(\mathrm{plan}_i \mid O) = D(\mathrm{plan}_i) \times I(\mathrm{plan}_i) \times F(\mathrm{plan}_i) \qquad (3)$$
  R: reward that the robot receives when it executes the i-th step
  D: reward that depends on the distance
  I: reward that depends on the number of interactions
  F: reward that uses previous searches to help search for objects (F = 1, F = 0.5, or F = 0)
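Equations (2) and (3) can be sketched together in Python as follows. The helper names, the candidate plans, and all numeric values are hypothetical. Only the product form of the reward and the arg max over plans come from the slide.

```python
def step_reward(distance_reward, interaction_reward, history_reward):
    """Equation (3): R = D * I * F for a single plan step."""
    return distance_reward * interaction_reward * history_reward


def plan_utility(steps):
    """Equation (1) over steps given as (p, D, I, F) tuples."""
    return sum(p * step_reward(d, i, f) for p, d, i, f in steps)


def best_plan(candidates):
    """Equation (2): name of the candidate plan with the highest utility."""
    return max(candidates, key=lambda name: plan_utility(candidates[name]))


# Hypothetical candidates; each step is (p(plan_i | O), D, I, F).
candidates = {
    "kitchen_first": [(0.5, 1.0, 1.0, 1.0), (0.25, 0.5, 1.0, 0.5)],
    "office_first": [(0.25, 1.0, 1.0, 0.5), (0.5, 0.5, 1.0, 1.0)],
}
chosen = best_plan(candidates)
print(chosen, plan_utility(candidates[chosen]))
```

Because the three factors are multiplied, a value of F = 0 removes a step's contribution entirely, whatever the distance and interaction rewards are.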
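UTILITY FUNCTION (part 3)
  The probability is computed as
  $$p(\mathrm{plan}_i \mid O) \approx \Big[ \prod_{j=1}^{i-1} \big( 1 - p(l_j \mid O) \big) \Big] \times p(l_i \mid O) \qquad (4)$$
  that is, the object is not found at the locations visited before step i and is found at location l_i, where, for example,
  p(l_j = kitchen | O = coffee) = p(locationHasObject(kitchen, coffee))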
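A short Python sketch of equation (4), assuming the product runs over the locations visited before step i. The function name and the example probabilities are hypothetical.

```python
def step_probabilities(location_probs):
    """Equation (4): p(plan_i | O) for every step i of a plan.

    location_probs[i] holds p(l_i | O) for the location visited at step i.
    """
    result = []
    not_found_so_far = 1.0  # product of (1 - p(l_j | O)) over earlier steps
    for p in location_probs:
        result.append(not_found_so_far * p)
        not_found_so_far *= 1.0 - p
    return result


# Hypothetical p(l_i | O) for a three-location plan.
steps = step_probabilities([0.5, 0.5, 0.25])
print(steps)
```

The running product makes later steps progressively less likely to be the one where the object is found, which is why the order of locations in a plan matters.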
QUERYING THE WEB
  Objects that are present in a location are frequently mentioned together with it on the Web.
  Object Eval converts predicate instances into a search query (e.g. {papers, printer room}).
  The text on a web page that is most relevant to a predicate instance is near the search terms.
  OE extracts text snippets from each web page (up to 10 words before, after, and between the terms).
  Multiple text snippets from the same page are merged into one.
  Each text snippet is transformed into a feature vector.
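The snippet extraction step can be sketched in Python as follows. This is one possible reading of "up to 10 words before, after and between the terms": a pair of terms is kept when at most `window` words separate them, and the snippet adds up to `window` words on either side. The tokenization and all function names are assumptions, not the authors' implementation.

```python
import re


def tokenize(text):
    """Lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_term(tokens, term_tokens):
    """Start indices at which the (possibly multi-word) term occurs."""
    n = len(term_tokens)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == term_tokens]


def extract_snippet(page_text, object_name, location_name, window=10):
    """Merge all snippets around nearby (object, location) mentions on a page."""
    tokens = tokenize(page_text)
    obj, loc = tokenize(object_name), tokenize(location_name)
    pieces = []
    for o in find_term(tokens, obj):
        for l in find_term(tokens, loc):
            # Number of words strictly between the two terms.
            gap = l - (o + len(obj)) if o < l else o - (l + len(loc))
            if gap < 0 or gap > window:
                continue
            start = max(0, min(o, l) - window)
            end = min(len(tokens), max(o + len(obj), l + len(loc)) + window)
            pieces.append(" ".join(tokens[start:end]))
    # Snippets from the same page are merged into a single snippet.
    return " ".join(pieces)


page = "Please pick up the papers from the printer room today."
snippet = extract_snippet(page, "papers", "printer room", window=2)
print(snippet)
```

A page on which the two terms never occur close together yields an empty string, so it contributes no snippet.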
  To decide which location type should be assigned to a page, an SVM classifier is trained (input: feature vector; output: location type).
  OE only needs positive examples of the predicate.
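The classifier input can be illustrated with a bag-of-words feature vector. The slides do not say which features OE uses, so word counts over a fixed vocabulary are an assumption here, and the function names and example snippets are hypothetical. The SVM training itself is not shown.

```python
import re
from collections import Counter


def words(text):
    """Lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(snippets):
    """Map every distinct word in the training snippets to a vector index."""
    distinct = sorted({w for s in snippets for w in words(s)})
    return {w: i for i, w in enumerate(distinct)}


def to_feature_vector(snippet, vocabulary):
    """Word-count vector; words outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for word, count in Counter(words(snippet)).items():
        if word in vocabulary:
            vector[vocabulary[word]] = count
    return vector


# Hypothetical training snippets for two location types.
vocabulary = build_vocabulary(["coffee in the kitchen", "papers in the printer room"])
features = to_feature_vector("coffee coffee kitchen", vocabulary)
print(features)
```

Vectors of this kind, paired with the location type of the query that produced them, would form the training set for the classifier.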
OE EVALUATION: SIMULATION
  OE searches over candidate plans to maximize the utility function (beam search, W=10, N=10, no loops).
  At each step, a new plan_i is added to the overall plan.
  The Web is queried to predict the probability distribution p(l_i | O).
  A set of simulated commands was executed for three floors of an office building (80 object types).
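The plan search can be sketched as a beam search in Python. The sketch treats W as the beam width and N as the maximum number of plan steps, which is an assumption about the slide's notation. The reward callable, the location names, and the probabilities are hypothetical.

```python
def beam_search(location_probs, reward, beam_width=10, max_steps=10):
    """Search for a high-utility plan without revisiting locations.

    location_probs: dict mapping location -> p(l | O)
    reward: callable (plan_so_far, next_location) -> reward for that step
    """
    # Each beam entry: (utility so far, probability object not yet found, plan)
    beam = [(0.0, 1.0, ())]
    for _ in range(min(max_steps, len(location_probs))):
        candidates = []
        for utility, not_found, plan in beam:
            for loc, p in location_probs.items():
                if loc in plan:  # no loops: each location at most once
                    continue
                gain = not_found * p * reward(plan, loc)
                candidates.append((utility + gain, not_found * (1.0 - p), plan + (loc,)))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]  # keep only the W best partial plans
    best_utility, _, best = max(beam, key=lambda c: c[0])
    return list(best), best_utility


# Hypothetical p(l | O) values and a reward that decays with each extra step.
probs = {"kitchen": 0.5, "office": 0.25, "printer room": 0.125}
plan, utility = beam_search(probs, reward=lambda plan, loc: 1.0 / (1 + len(plan)))
print(plan, utility)
```

With a reward that shrinks as the plan grows, the search prefers to visit the most probable locations first.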
RESULTS
  OE achieves a high F1 value even with only a few training examples.
  The search for 20 objects is repeated 5 times, each time starting from a different initial location, to obtain 100 runs.
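For reference, F1 is the harmonic mean of precision and recall. A minimal Python sketch follows. The counts are hypothetical and are not results from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one location type (illustrative only).
score = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(round(score, 3))
```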
Thank you!