Query Processing in Mobile P2P Databases IGERT Seminar Presentation Bo Xu joint work with Ouri Wolfson.


Talk outline: Introduction; System Model; The MARKET Algorithm; Evaluation; Extension to CTS; Conclusion and Future Work

Query Processing Environments. Motivation: a general-purpose query processing strategy for mobile, disconnected wireless ad hoc networks. Vehicular Sensor Network (VSN): GPS receiver, chemical spill detector, still/video camera, vibration sensor, acoustic detector.

Store-and-forward to deal with sparseness [Figure: a query (Q) and its answer (A) carried between nodes]

Issues with Store-and-forward: How to manage limited memory, power, and bandwidth? Which reports to save/transmit?

Difficulty of Store-and-forward: Assume that the trajectories of all nodes are known a priori at a central server. If memory, energy, and bandwidth are bounded at mobile nodes, then the problem of determining whether a set of data items can be disseminated to all the mobile nodes is NP-complete (case: each mobile node is interested in every data item). Mobile P2P: trajectories are unknown a priori, so heuristics are needed.


Mobile P2P Database: PDAs, cell phones, sensors, hotspots, and vehicles with short-range wireless capabilities. Applications coexist. Variable report sizes. A peer can be a producer, consumer, and broker. [Figure: peers A, B, and C, each with a local database of reports (report 1 through report 8) and a local query (query A, query B, query C)]

Queries: A query Q maps each report R to a match degree. Examples: top parking slots given my current location; profile with expertise children-periodontics; similarity between two images. $\mathrm{match}(R,Q) = e^{-\alpha t - \beta d}$
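A minimal sketch of the exponential match degree above. It assumes the exponent combines a report's age t and its distance d from the querying peer, each scaled by a decay coefficient. The names `alpha` and `beta` and their default values are hypothetical, chosen only for illustration.

```python
import math

def match_degree(age_s: float, distance_m: float,
                 alpha: float = 0.001, beta: float = 0.002) -> float:
    """Exponentially decaying match degree in (0, 1].

    alpha and beta are hypothetical decay coefficients; reading t as report
    age and d as distance is an assumption of this sketch.
    """
    if age_s < 0 or distance_m < 0:
        raise ValueError("age and distance must be non-negative")
    return math.exp(-alpha * age_s - beta * distance_m)

# A fresh, nearby parking report should outrank an old, distant one.
fresh_near = match_degree(age_s=30, distance_m=100)
old_far = match_degree(age_s=600, distance_m=900)
```

With this shape, a report that is both brand new and at distance zero gets the maximum match degree of 1, and the degree falls toward 0 as either quantity grows.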

Query/report Dissemination: Two peers within transmission range exchange queries and reports. The least relevant reports that do not fit in the local broker database are purged. The exchange is not necessarily synchronous (periodic broadcast).


Ranking Factors: The rank of a report R is determined by: Demand: what fraction of peers are querying R (the probability that a peer is interested in R); Supply: what fraction of peers already have R (the probability that a peer has R); Size of R.

Rank of a report: expected benefit $= \mathrm{demand}(R) \cdot (1 - \mathrm{supply}(R))$; $\mathrm{Rank}(R) = \dfrac{\mathrm{demand}(R) \cdot (1 - \mathrm{supply}(R))}{\mathrm{size}(R)}$ [Figure: reports database with example demand, supply, and benefit values]
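The ranking rule above can be sketched as follows. The `Report` fields and the greedy `purge_to_capacity` helper are hypothetical names for this illustration; the slides only say that the least relevant reports are purged when the database overflows, so the greedy fill is one possible reading.

```python
from dataclasses import dataclass

@dataclass
class Report:
    report_id: str
    size_bytes: int
    demand: float   # estimated probability that a peer is interested
    supply: float   # estimated probability that a peer already has it

def rank(r: Report) -> float:
    """Expected benefit per byte: demand * (1 - supply) / size."""
    return r.demand * (1.0 - r.supply) / r.size_bytes

def purge_to_capacity(reports, capacity_bytes):
    """Greedily keep the highest-ranked reports that fit; the rest are purged."""
    kept, used = [], 0
    for r in sorted(reports, key=rank, reverse=True):
        if used + r.size_bytes <= capacity_bytes:
            kept.append(r)
            used += r.size_bytes
    return kept

db = [
    Report("R1", 1000, demand=0.5, supply=0.8),
    Report("R2", 2000, demand=0.7, supply=0.5),
    Report("R3", 1500, demand=0.3, supply=0.4),
]
kept = purge_to_capacity(db, capacity_bytes=3500)
```

Here R2 and R3 are kept and R1 is purged: R1 has decent demand, but most peers are assumed to hold it already, so its expected benefit per byte is the lowest.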

Report Ranking: sampling demand. The queries relation is maintained in FIFO order.

Rank of Reports: Demand for R. The Q_i's are the members of the queries relation. The size of the queries relation is determined based on Hoeffding's inequality. E.g., if n=108, then with 95% chance the demand estimation error is smaller than 0.08.
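A small sketch of demand sampling over a FIFO queries relation. The demand formula itself did not survive on the slide, so this assumes demand is the average match degree of R over the stored queries; the `QueriesRelation` class and the `wants` helper are hypothetical.

```python
from collections import deque
from typing import Callable, Deque

Query = Callable[[dict], float]   # maps a report to a match degree in [0, 1]

class QueriesRelation:
    """Bounded FIFO sample of queries received from encountered peers."""

    def __init__(self, capacity: int):
        self._queries: Deque[Query] = deque(maxlen=capacity)

    def add(self, query: Query) -> None:
        # A full deque silently drops its oldest entry (FIFO replacement).
        self._queries.append(query)

    def demand(self, report: dict) -> float:
        # Assumed estimator: mean match degree over the sampled queries.
        if not self._queries:
            return 0.0
        return sum(q(report) for q in self._queries) / len(self._queries)

def wants(kind: str) -> Query:
    """Hypothetical query that matches reports of exactly one kind."""
    return lambda report: 1.0 if report["kind"] == kind else 0.0

relation = QueriesRelation(capacity=3)
for kind in ("parking", "parking", "traffic", "traffic"):
    relation.add(wants(kind))

parking_demand = relation.demand({"kind": "parking"})
```

After the fourth insertion the oldest parking query has been evicted, so one of the three remaining queries matches and the estimated demand for a parking report is one third.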

How does peer O determine supply(R)? A parametric formula giving the supply is beyond the state of the art. O machine-learns supply(R) based on metadata of R: age of R, number of times O sighted R from other peers, etc.

Computing Supply by Machine-learning: MAchine LEarning based Novelty rAnking (MALENA)
aro: the age rank order within O's reports database
fin: the number of times O has sighted the report from other peers
Reports database of O:
report-id   Report description   aro   fin
R1          …                    1
R4          …                    2     4
R2          …                    3     2
R7          …                    4     2

MALENA [Figure: peer B requests R2; positive and negative examples are created]

MALENA Implementation Considerations: Minimize overhead. No need to actually store examples. The model is built incrementally. Bayesian learning: a simple but effective method.
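The considerations above (no stored examples, incremental model, Bayesian learning) can be illustrated with a counts-only naive Bayes classifier. This is a sketch under assumptions, not the MALENA implementation: the features are taken to be discrete (aro, fin) pairs, the labeling rule is left to the caller, and `n_values` is a hypothetical smoothing parameter.

```python
from collections import defaultdict

class IncrementalNaiveBayes:
    """Counts-only naive Bayes over discrete features; no examples are stored."""

    def __init__(self, n_values: int = 10):
        self.n_values = n_values   # assumed number of distinct values per feature
        self.class_counts = {True: 0, False: 0}
        # feature_counts[label][feature_index][value] -> count
        self.feature_counts = {
            True: defaultdict(lambda: defaultdict(int)),
            False: defaultdict(lambda: defaultdict(int)),
        }

    def update(self, features, label: bool) -> None:
        """Fold one example into the counts, then forget it."""
        self.class_counts[label] += 1
        for i, value in enumerate(features):
            self.feature_counts[label][i][value] += 1

    def prob_positive(self, features) -> float:
        total = sum(self.class_counts.values())
        scores = {}
        for label in (True, False):
            # Laplace-smoothed prior and per-feature likelihoods.
            score = (self.class_counts[label] + 1) / (total + 2)
            for i, value in enumerate(features):
                count = self.feature_counts[label][i][value]
                score *= (count + 1) / (self.class_counts[label] + self.n_values)
            scores[label] = score
        return scores[True] / (scores[True] + scores[False])

model = IncrementalNaiveBayes()
# Features are (aro, fin); here a positive label is assumed to mean
# "the receiving peer did not have the report yet".
model.update((1, 0), True)
model.update((2, 0), True)
model.update((3, 4), False)
model.update((4, 2), False)

p_rarely_sighted = model.prob_positive((1, 0))
p_often_sighted = model.prob_positive((3, 4))
```

Each update touches only a handful of counters, which is what keeps the memory and computation overhead small on a mobile peer.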


Simulation setup: mobility model = random waypoint; average motion speed = 1 mile/hour; transmission range = 100 meters; mean reports database size = 100 Kbytes; queries database size = 100 queries; report size uniformly distributed between 1K and 2K bytes; 0.1 report produced per second. Comparison with RANDI (MDM07); RANDI = MARKET minus supply. MARKET is half as good as the ideal benchmark. MARKET is twice as good as RANDI. [Plots: 20 peers within transmission range; 1 peer within transmission range]

Comparison with LRU and LFU: mobility model = iMotes traces; mean reports database size = 150 Kbytes; queries database size = 10 queries; report size uniformly distributed between 2K and 20K bytes; 0.1 report produced per second; transmission size = 100 Kbytes (results obtained by Fatemeh Vafaee). [Plot axes: response-time bound (second); throughput (matches/peer)]

Evaluation of MALENA (TAAS09): turn-over: peers enter/exit the system; injection: number of peers that have a report initially. Setup: mobility model = iMotes traces; reports database size = 100 reports; 2 reports produced per second; transmission size = 10 reports. MALENA always follows the best indicator. [Plots: low-turn-over/low-injection; high-turn-over/high-injection]

Application: K-nearest-neighbors. Query: K-nearest-neighbors of a fixed location (query-point). Reports: current locations of mobile sensors. match(Q,R): in reverse proportion to the distance from the query point. [Figure labels: sink, query-point]
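Since the match degree falls as distance from the query point grows, picking the best-matching reports amounts to picking the K closest ones. A minimal sketch, assuming planar (x, y) locations and a hypothetical report layout:

```python
import heapq
import math

def knn_reports(query_point, reports, k):
    """Return the k reports whose locations are closest to query_point."""
    def distance(report):
        x, y = report["location"]
        return math.hypot(x - query_point[0], y - query_point[1])
    return heapq.nsmallest(k, reports, key=distance)

location_reports = [
    {"sensor": "s1", "location": (0.0, 0.0)},
    {"sensor": "s2", "location": (3.0, 4.0)},
    {"sensor": "s3", "location": (1.0, 1.0)},
    {"sensor": "s4", "location": (10.0, 10.0)},
]
nearest = knn_reports((0.0, 0.0), location_reports, k=2)
```

For the query point (0, 0) and K = 2, sensors s1 and s3 are returned, in that order.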

Itinerary-based KNN processing. Phase I: the query is delivered to the sensor closest to the query point. Phase II: the query traverses an itinerary to collect answers. Phase III: the answers are returned to the sink.

Simulation Results: mobility model = random waypoint; average motion speed = 1 mile/hour; transmission range = 100 meters; report size = 24 bytes; query size = 16 bytes; mean reports database size = 100 reports; one location report produced at each sensor per second. MARKET is especially suitable for sparse environments.


TrafficInfo: Disseminating Traffic Information in VANETs

What does relevance mean in TrafficInfo? A report is relevant if it changes the route. [Figure: alternative routes between A and B]

Which factors indicate the relevance of a report? Distance to the reported road segment; type of road segment; speed variance; …

Conceptual Learning Procedure: An example is created for each received report. The example is labeled positive if the report changes the route and negative otherwise. Individual vs. group: how to deal with aggregation?
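The labeling rule above can be sketched by recomputing the route before and after applying a report. This assumes routes are shortest paths by travel time and that a report carries a new travel time for one road segment; the graph layout and the report fields are hypothetical.

```python
import heapq

def shortest_route(graph, source, target):
    """Dijkstra over graph[u] = {v: travel_time}; returns a node list or None."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            new_cost = cost + w
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return None
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    return route[::-1]

def label_example(graph, source, target, report):
    """Return True (positive) if applying the report changes the route."""
    before = shortest_route(graph, source, target)
    (u, v), travel_time = report["segment"], report["travel_time"]
    updated = {node: dict(edges) for node, edges in graph.items()}
    updated.setdefault(u, {})[v] = travel_time
    after = shortest_route(updated, source, target)
    return before != after

roads = {"A": {"X": 5.0, "Y": 7.0}, "X": {"B": 5.0}, "Y": {"B": 7.0}}
jam_on_route = {"segment": ("X", "B"), "travel_time": 20.0}
jam_elsewhere = {"segment": ("Y", "B"), "travel_time": 9.0}
```

A jam on the current route (segment X to B) moves the driver to the route via Y, so it yields a positive example; a slowdown on the unused segment leaves the route unchanged and yields a negative one.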

Conclusion: Store-and-forward enables in-network processing in mobile disconnected networks. Ranking is important for dealing with memory, bandwidth, and energy constraints. [Figure labels: query processing; sensor-rich environment; short-range wireless; Mobile P2P]

Future Work: Multimedia reports; utilization of metadata; integration of stateless and stateful approaches; starvation/fairness.

Thanks! Questions?

802.11 Basics: 3 modes: transmitting, receiving, listening (in order of power consumption). When listening: if a message destined to the host is detected, switch to receive mode. Time is divided into slots of 20 microseconds each. Transmission: listen for 1 time slot; if the channel is free, start the broadcast (note that a collision is still possible). A broadcast may last for many time slots.

Energy Efficiency of a Broadcast: Throughput (Th) = (expected number of neighbors that successfully receive the broadcast) × (broadcast size). Power efficiency (PE) = (formula missing in source). [Figure: neighbors that successfully receive the broadcast from X; collisions occur at a neighbor]

Computation of Throughput: Conditions for successful reception at an arbitrary node Y: 1. No green node inside starts to broadcast in the same time slot as X. 2. No transmission from any purple node overlaps with that from X. [Figure: nodes X and Y with green and purple neighbors]

Energy Constraints: Energy consumed by an 802.11 network interface for transmitting a message of size M bytes: $En = f \cdot M + g$. For 802.11 broadcast, $g = 266 \times 10^{-6}$ Joule and $f = 5.27 \times 10^{-6}$ Joule/byte.
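A short worked example of the linear energy model, using the broadcast constants from the slide. The helper `max_message_bytes` is a hypothetical addition showing how an energy budget would cap the transmission size.

```python
F_JOULE_PER_BYTE = 5.27e-6   # f, per-byte cost for 802.11 broadcast
G_JOULE = 266e-6             # g, fixed per-message cost

def broadcast_energy(message_bytes: int) -> float:
    """Energy in joules to broadcast a message: En = f * M + g."""
    return F_JOULE_PER_BYTE * message_bytes + G_JOULE

def max_message_bytes(energy_budget_joule: float) -> int:
    """Largest message size whose broadcast fits in the budget (hypothetical helper)."""
    if energy_budget_joule < G_JOULE:
        return 0
    return int((energy_budget_joule - G_JOULE) / F_JOULE_PER_BYTE)

energy_1kb = broadcast_energy(1000)   # roughly 5.5 millijoules
budget_cap = max_message_bytes(0.01)  # bytes affordable with 10 millijoules
```

For a 1000-byte message the per-byte term dominates: 5.27 mJ of the total comes from the payload and only 0.266 mJ from the fixed cost.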

Experimental MP2P Projects (Pedestrians):
7DS – Columbia University (web pages)
iClouds – Darmstadt Univ. (incentives)
MoGATU – UMBC (specialized query processing, e.g., collaborative joins)
PeopleNet – NUS, IIS-Bangalore (mobile commerce, information type location bazaar)
MoB – Wisconsin, Cambridge (incentives, information resources, e.g., bandwidth)
Mobi-Dik – Univ. of Illinois, Chicago (brokering, physical resources, bandwidth/memory/power management)

Vehicular Projects
Inter-vehicle Communication and Intelligent Transportation:
CarTALK 2000 is a European project.
VICS (The Vehicle Information and Control System) is a government-sponsored system in Japan with an 11-year track record.
FleetNet, an inter-vehicle communications system, is being developed by a consortium of private companies and universities in Germany.
IVI (Intelligent Vehicle Initiative) and VII (Vehicle Infrastructure Integration), the US DOT.
MP2P provides data management capabilities on top of these communication systems.
Grassroots, TrafficView, SOTIS, V3 – P2P dissemination of traffic info to reduce travel times.

RANk-based DIssemination (RANDI): Ranking of reports. Bandwidth/energy aware. An exchange enhances both consumer functionality and broker functionality. Consumer: answer the local query (pull). Broker: transmit the reports most likely to be requested by future-encountered peers (push). Transmission triggers: an encounter; new reports.

RANDI: When two peers meet they conduct a two-phase exchange. Phase 1: exchange queries and receive answers (pull); the local query is satisfied as a consumer. Phase 2: exchange more reports using available energy/bandwidth (push); the peer is enhanced as a broker. Combination of unicast (thin line) and broadcast (thick lines) to enable overhearing. [Figure: Phase 1 and Phase 2 of the exchange]
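A simplified sketch of the two-phase exchange, under several assumptions: a query is reduced to a single wanted topic, each report carries a precomputed rank, and the push phase is bounded by a byte budget standing in for the available energy/bandwidth. The `Peer` class and all field names are hypothetical, and unicast versus broadcast is not modeled.

```python
from dataclasses import dataclass, field

@dataclass
class Peer:
    name: str
    wanted_topic: str                             # stands in for the local query
    reports: dict = field(default_factory=dict)   # report_id -> (topic, size_bytes, rank)

def exchange(a: Peer, b: Peer, push_budget_bytes: int) -> None:
    # Snapshot both databases so each direction sees the pre-encounter state.
    a_before, b_before = dict(a.reports), dict(b.reports)
    for sender_db, receiver in ((a_before, b), (b_before, a)):
        # Phase 1 (pull): answer the receiver's query.
        for rid, report in sender_db.items():
            if report[0] == receiver.wanted_topic:
                receiver.reports.setdefault(rid, report)
        # Phase 2 (push): highest-ranked remaining reports within the budget.
        remaining = push_budget_bytes
        extras = [(rid, r) for rid, r in sender_db.items()
                  if rid not in receiver.reports]
        for rid, report in sorted(extras, key=lambda item: item[1][2], reverse=True):
            if report[1] <= remaining:
                receiver.reports[rid] = report
                remaining -= report[1]

peer_a = Peer("A", "parking", {
    "T1": ("traffic", 1000, 0.9),
    "S1": ("sale", 600, 0.5),
    "S2": ("sale", 900, 0.8),
})
peer_b = Peer("B", "traffic", {
    "P1": ("parking", 800, 0.7),
    "M1": ("music", 5000, 0.2),
})
exchange(peer_a, peer_b, push_budget_bytes=1000)
```

In this run B pulls T1 as a consumer and additionally receives S2 as a broker; S1 no longer fits the remaining budget. A pulls P1, while the large music report M1 exceeds the push budget and is not transferred.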

RANDI (Contd): To solve the problem with static peers: two interaction modes which combine pull and push. Query-response: triggered by the discovery of new neighbors. Relay: triggered by the receipt of new reports; new reports are disseminated to existing neighbors.

7DS: P2P mode: each node periodically broadcasts its query and receives reports from neighboring peers. No strategy to determine query frequency and transmission size. Cache management based on web-page expiration time. [Figure labels: query, reports]

PeopleNet: Reports are randomly selected for exchanging and saving upon encountering. [Figures: random-spread and random-swap, each showing Peer A and Peer B before and after the exchange]



Mobile Local Search: Applications
transportation: announce sudden stop, malfunctioning brake light, patch of ice; floating car data; dissemination of multimedia traffic information (picture, video, voice); search for close-by taxi customer, parking slot, ride-share
social networking (wearable website): personal profile of interest at a convention; singles matchmaking; floating BBS
mobile electronic commerce: sale on an item of interest at a mall; music-file exchange
emergency response: search for victims in rubble
asset management and tracking: sensors on containers exchange security information => remote checkpoints
tourist and location-based services: closest ATM

Applications – Common features: Mobile/stationary peers. Resources of interest in a limited geographic area. Short time duration. Can be solved by fixed servers, but: an unlikely solution; the proposed MP2P paradigm can enhance a fixed solution (reliability, performance, coverage).

MARKET: When two peers meet they conduct a two-phase exchange. Phase 1: exchange subscriptions and receive answers (pull); the local query is satisfied as a consumer. Phase 2: exchange more publications using available energy/bandwidth (push); the peer is enhanced as a broker. Combination of unicast (thin line) and broadcast (thick lines) to enable overhearing. [Figure: Phase 1 and Phase 2 of the exchange]

MARKET (Contd): To solve the problem with static peers: two interaction modes which combine pull and push. Query-response: triggered by the discovery of new neighbors. Relay: triggered by the receipt of new publications; new publications are disseminated to existing neighbors.

Query in static disconnected network: In-network query processing may not be possible. [Figure: query Q cannot reach the answer nodes A]

Query in static connected sensor network: Data transmission delay is 0. The answer can be obtained instantaneously. [Figure: query Q propagates to the answer nodes A and the answers return]


Query in mobile disconnected network (one-hop case): Query processing is enabled by mobility and store-and-forward. [Figure: a node carrying query Q meets an answer node A]

Query in mobile disconnected network (multi-hop case): First stage: the query is disseminated during encounters. The answer is disseminated only after an answer node receives the query. The query can be processed in-network, but it is delayed. The query processing algorithm doesn't control motion. [Figure: query Q relayed across several nodes to answer node A]

