Distributed Watchpoints: Debugging Very Large Ensembles of Robots De Rosa, Goldstein, Lee, Campbell, Pillai 20 minutes = 15 minutes of talk + 5 minutes.

Distributed Watchpoints: Debugging Very Large Ensembles of Robots De Rosa, Goldstein, Lee, Campbell, Pillai 20 minutes = 15 minutes of talk + 5 minutes of questions Aug 19, 2006

Motivation Distributed errors are hard to find with traditional debugging tools Centralized snapshot algorithms Expensive Geared towards detecting one error at a time Special-purpose debugging code is difficult to write, may itself contain errors 8/19/2006 Distributed Watchpoints

Expressing and Detecting Distributed Conditions
“How can we represent, detect, and trigger on distributed conditions in very large multi-robot systems?” Generic detection framework, well suited to debugging Detect conditions that are not observable via the local state of one robot Support algorithm-level debugging (not code/HW debugging) Trigger arbitrary actions when condition is met Asynchronous, bandwidth/CPU-limited systems 8/19/2006 Distributed Watchpoints

Distributed/Parallel Debugging: State of the Art
Modes: Parallel: powerful nodes, regular (static) topology, shared memory Distributed: weak, mobile nodes Tools: GDB printf() Race detectors Declarative network systems with debugging support (ala P2) Gdb/printf-can’t detect many interesting conditions (without post processing), change run-ordering 8/19/2006 Distributed Watchpoints

Example Errors: Leader Election
Scenario: One Leader Per Two-Hop Radius 8/19/2006 Distributed Watchpoints

Example Errors: Token Passing
Scenario: If a node has the token, exactly one of it’s neighbors must have had it last timestep 8/19/2006 Distributed Watchpoints

Example Errors: Gradient Field
Scenario: Gradient Values Must Be Smooth 8/19/2006 Distributed Watchpoints

Expressing Distributed Error Conditions
Requirements: Ability to specify shape of trigger groups Temporal operators Simple syntax (reduce programmer effort/learning curve) A Solution: Inspired by Linear Temporal Logic (LTL) A simple extension to first-order logic Proven technique for single-robot debugging [Lamine01] Assumption: Trigger groups must be connected For practical/efficiency reasons Without connectivity: n^m With connectivity: n*c^m 8/19/2006 Distributed Watchpoints

Watchpoint Primitives
nodes(a,b,c); n(b,c) & (a.var > b.var) & (c.prev.var != 2) Modules (implicitly quantified over all connected sub-ensembles) Topological restrictions (pairwise neighbor relations) Boolean connectives State variable comparisons (distributed) Temporal operators 8/19/2006 Distributed Watchpoints

Distributed Errors: Example Watchpoints
nodes(a,b,c);n(a.b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1) nodes(a,b,c);n(a,b) & n(a,c) & (a.token == 1) & (b.prev.token == 1) & (c.prev.token == 1) nodes(a,b);(a.state - b.state > 1) 8/19/2006 Distributed Watchpoints

Watchpoint Execution √ . nodes(a,b,c)… 2 1 4 3 6 5 8 7 10 9 12 11 14
Explicitly state that matchers can terminate early 2 1 4 3 6 5 8 7 10 9 12 11 14 13 16 15 18 17 20 19 22 21 24 23 26 25 28 27 30 29 32 31 1 9 2 1 9 10 √ . 8/19/2006 Distributed Watchpoints

Performance: Watchpoint Size
1000 modules, running for 100 timesteps Simulator overhead excluded Application: data aggregation with landmark routing Watchpoint: are the first and last robots in the watchpoint in the same state? 8/19/2006 Distributed Watchpoints

Performance: Number of Matchers
Worst case behavior This particular watchpoint never terminates early Number of matchers increases exponentially Time per matcher remains within factor of 2 Details of the watchpoint expression more important than size 8/19/2006 Distributed Watchpoints

Performance: Periodically Running Watchpoints
Useful when we want to increase performance (when detecting persistent errors) at the cost of response time 8/19/2006 Distributed Watchpoints

Future Work Distributed implementation More optimization
User validation Additional predicates 8/19/2006 Distributed Watchpoints

Conclusions Simple, yet highly descriptive syntax
Able to detect errors missed by more conventional techniques Low simulation overhead 8/19/2006 Distributed Watchpoints

Thank You

Backup Slides 8/19/2006 Distributed Watchpoints

Optimizations Temporal span Early termination Neighbor culling
(one slide per) 8/19/2006 Distributed Watchpoints

Distributed Watchpoints: Debugging Very Large Ensembles of Robots De Rosa, Goldstein, Lee, Campbell, Pillai 20 minutes = 15 minutes of talk + 5 minutes.

Similar presentations

Presentation on theme: "Distributed Watchpoints: Debugging Very Large Ensembles of Robots De Rosa, Goldstein, Lee, Campbell, Pillai 20 minutes = 15 minutes of talk + 5 minutes."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Distributed Watchpoints: Debugging Very Large Ensembles of Robots De Rosa, Goldstein, Lee, Campbell, Pillai 20 minutes = 15 minutes of talk + 5 minutes.

Similar presentations

Presentation on theme: "Distributed Watchpoints: Debugging Very Large Ensembles of Robots De Rosa, Goldstein, Lee, Campbell, Pillai 20 minutes = 15 minutes of talk + 5 minutes."— Presentation transcript:

Similar presentations

About project

Feedback