Adaptive Cleaning for RFID Data Streams
RFID: Radio Frequency IDentification
RFID data is dirty. A simple experiment: 2 RFID-enabled shelves, 10 static tags, 5 mobile tags.
RFID Data Cleaning. RFID data has many dropped readings. Typically, a smoothing filter is used to interpolate over them:
SELECT DISTINCT tag_id FROM RFID_stream [RANGE '5 sec'] GROUP BY tag_id
[Figure: raw readings and smoothed output over time, before and after the smoothing filter.]
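A minimal Python sketch of what such a fixed-window smoothing filter does, assuming readings arrive as (epoch, tag_id) pairs and the window is measured in epochs rather than seconds. The function name `smooth_fixed_window` and the sample trace are hypothetical, chosen only for illustration.

```python
def smooth_fixed_window(readings, window, num_epochs):
    """Report a tag as present at epoch t if it was read at least once
    in the last `window` epochs (t - window + 1 .. t)."""
    # Group raw readings by epoch.
    seen = {}
    for epoch, tag_id in readings:
        seen.setdefault(epoch, set()).add(tag_id)

    output = {}
    for t in range(num_epochs):
        present = set()
        for e in range(max(0, t - window + 1), t + 1):
            present |= seen.get(e, set())
        output[t] = present
    return output


# Hypothetical trace: tag "A" is in range for epochs 0..4,
# but the reading at epoch 2 is dropped.
raw = [(0, "A"), (1, "A"), (3, "A"), (4, "A")]
no_smoothing = smooth_fixed_window(raw, window=1, num_epochs=6)
smoothed = smooth_fixed_window(raw, window=2, num_epochs=6)
```

With a window of 2 epochs the dropped reading at epoch 2 is filled in, but the tag is also still reported at epoch 5, after its last reading.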
Smoothing filter in the middleware: clean RFID data. Two concerns: completeness (read all tags in range) and tag dynamics.
RFID Data Cleaning (continued). But how should the size of the window be set?
Window Size for RFID Smoothing. Need to balance completeness vs. capturing tag movement.
[Figure: timelines for "Fido moving" and "Fido resting", comparing reality, raw readings, small-window output, and large-window output.]
Truly Declarative Smoothing. Problem: the window size is non-declarative. The application wants a clean stream of data; the window size is how to get it. Solution: adapt the window size in response to the data.
RFID. 1. Interrogation cycle. 2. Epoch (read cycle). Tag list (for Alien readers) with columns Epoch, TagID, ReadRate.
[Figure: antenna and reader with Tags 1 to 4, and a timeline of epochs labeled E0 to E9.]
Controlled conditions vs. real conditions.
SMURF: Statistical Smoothing for Unreliable RFID Data. Adapts the window based on statistical properties of the data. Mechanisms for per-tag and multi-tag cleaning.
Per-Tag Smoothing: Model and Background. Epoch t; tag population N_t. p_{i,t}: per-epoch sampling probability of tag i, i.e. the response count of tag i in the epoch over the total number of interrogation cycles. Tag list columns: Epoch, TagID, ReadRate.
Per-Tag Smoothing: Model and Background. Smoothing window size: w_i epochs. Per-epoch sampling probability: p_i. The number of successful observations of tag i in the window follows a binomial distribution B(w_i, p_i).
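Use a binomial sampling model. Each epoch in the smoothing window w_i is a Bernoulli trial with success probability p_i (the read rate of tag i). S_i: the set of epochs where tag i can be seen. p_i^avg: the average read rate of tag i.
[Figure: p_i (0 to 1) against time in epochs, E0 to E9, with the smoothing window marked.]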
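A small Python sketch of the binomial model B(w_i, p_i), assuming a constant per-epoch read probability inside the window. The helper names (`binomial_pmf`, `expected_reads`, `prob_missed`) are hypothetical; they only spell out the quantities the model gives.

```python
from math import comb


def binomial_pmf(k, w, p):
    """Probability of exactly k successful epochs out of w Bernoulli trials."""
    return comb(w, k) * p**k * (1 - p) ** (w - k)


def expected_reads(w, p):
    """Mean of B(w, p): expected number of epochs in which the tag is read."""
    return w * p


def prob_missed(w, p):
    """Probability that the tag is not read in any of the w epochs."""
    return (1 - p) ** w


# Hypothetical numbers: a 10-epoch window and a 30% read rate.
w, p = 10, 0.3
mean_reads = expected_reads(w, p)
miss = prob_missed(w, p)
```

A tag with a low read rate is missed entirely with non-negligible probability unless the window is long, which is what motivates the completeness requirement.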
Per-Tag Smoothing: Completeness. We want to ensure that there are enough epochs in the window W_i that tag i is observed, if it is within the reader's range.
If the tag is there, read it with high probability: want a large window. When readings come in with a low p_i, expand the window.
[Figure: p_i (0 to 1) against time in epochs, E0 to E9.]
Per-Tag Smoothing: Completeness
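Desired window size for tag i:
$$ w_i \ge \left\lceil \frac{1}{p_i^{avg}} \ln\frac{1}{\delta} \right\rceil $$
Here $1/p_i^{avg}$ is the expected number of epochs needed to read the tag, and the $\ln(1/\delta)$ factor ensures the tag is read with probability at least $1-\delta$.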
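A Python sketch of the completeness window size, assuming the bound w_i >= ceil((1 / p_i^avg) * ln(1 / delta)) for a miss probability of at most delta. The function name `completeness_window` and the example values are illustrative assumptions.

```python
import math


def completeness_window(p_avg, delta):
    """Smallest window (in epochs) satisfying w >= (1 / p_avg) * ln(1 / delta)."""
    if not 0 < p_avg <= 1:
        raise ValueError("p_avg must be in (0, 1]")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    return math.ceil(math.log(1 / delta) / p_avg)


# Hypothetical read rates: a well-read tag and a poorly-read tag.
w_strong = completeness_window(p_avg=0.5, delta=0.05)
w_weak = completeness_window(p_avg=0.1, delta=0.05)

# Sanity check of the bound: the chance of missing the tag in every epoch
# of the window, (1 - p)^w, should not exceed delta.
miss_weak = (1 - 0.1) ** w_weak
```

The bound holds because (1 - p)^w <= exp(-p * w), which is at most delta once w >= ln(1 / delta) / p. A lower read rate therefore calls for a proportionally larger window.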
Per-Tag Smoothing: Transitions. Detect transitions as statistically significant changes in the data. On a statistically significant difference, flag a transition and shrink the window: the tag has likely left by this point.
[Figure: p_i (0 to 1) against time in epochs, E0 to E9, with the significant difference marked.]
Look for a significant difference between the observed sample size |S_i| (the number of successful epochs in a window) and its expected size (the binomial mean). An observation more than 2σ from the mean is treated as an outlier.
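Per-Tag Smoothing: Transitions. Is the difference between the number of observed readings and the number of expected readings statistically significant?
$$ \left|\, |S_i| - w_i\, p_i^{avg} \,\right| > 2\sqrt{w_i\, p_i^{avg}\,(1 - p_i^{avg})} $$
Here $|S_i|$ is the number of observed readings, $w_i\, p_i^{avg}$ the number of expected readings, and the right-hand side is two binomial standard deviations.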
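A Python sketch of the transition test, assuming the 2σ rule is applied to the binomial B(w_i, p_i^avg): mean w_i * p_i^avg and standard deviation sqrt(w_i * p_i^avg * (1 - p_i^avg)). The function name `transition_detected` and the numbers are hypothetical.

```python
import math


def transition_detected(observed, w, p_avg):
    """True if the observed number of successful epochs deviates from the
    binomial mean by more than two standard deviations."""
    expected = w * p_avg
    sigma = math.sqrt(w * p_avg * (1 - p_avg))
    return abs(observed - expected) > 2 * sigma


# Hypothetical window of 20 epochs with an average read rate of 0.8:
# 16 reads expected, 2 * sigma is roughly 3.6.
steady = transition_detected(observed=15, w=20, p_avg=0.8)
dropped = transition_detected(observed=8, w=20, p_avg=0.8)
```

One missing reading is within normal sampling noise, while seeing only half the expected readings is flagged as a likely transition.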
Algorithm
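The content of this slide did not survive extraction. As a rough Python sketch of how the completeness and transition pieces could fit together for one tag: grow the window toward the completeness size, halve it when a transition is flagged, and reset it when the window holds no readings. The growth step of 2 epochs, the halving rule, the input format (one read rate per epoch, 0.0 when the tag was not read), and the name `smurf_per_tag` are all assumptions made for illustration.

```python
import math


def smurf_per_tag(read_rates, delta=0.05):
    """Return, for each epoch, whether the tag is reported present."""
    w = 1  # current window size in epochs
    present = []
    for t in range(len(read_rates)):
        window = read_rates[max(0, t - w + 1): t + 1]
        seen = [p for p in window if p > 0]
        if not seen:
            # No reading anywhere in the window: report absent, reset window.
            present.append(False)
            w = 1
            continue
        present.append(True)

        n = len(window)
        p_avg = sum(seen) / len(seen)
        w_star = math.ceil(math.log(1 / delta) / p_avg)  # completeness size
        sigma = math.sqrt(n * p_avg * (1 - p_avg))

        if w_star > w:
            # Window too small for completeness: grow it (assumed step of 2).
            w = max(min(w + 2, w_star), 1)
        elif abs(len(seen) - n * p_avg) > 2 * sigma:
            # Statistically significant drop: likely transition, shrink.
            w = max(min(w // 2, w_star), 1)
    return present


# Hypothetical trace: read rate 0.9 while in range, one dropped reading
# at epoch 2, and the tag leaves after epoch 5.
trace = [0.9, 0.9, 0.0, 0.9, 0.9, 0.9, 0.0, 0.0, 0.0, 0.0]
result = smurf_per_tag(trace)
```

On this trace the dropped reading at epoch 2 is smoothed over, and the window is shrunk once the missing readings become statistically significant, so the tag stops being reported shortly after it leaves.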
SMURF in Action. Experiments with real and simulated data show similar results.
[Figure: SMURF output for "Fido moving" and "Fido resting".]
[Figure labels: normal sliding window, completeness, transition.]