Audio Location: Accurate Low-Cost Location Sensing. James Scott, Intel Research Cambridge. Boris Dragovic, intern in 2004 at Intel Research Cambridge, studying for PhD at University of Cambridge.
Overview of talk Audio for fine-grained location Prototype: detecting human sounds Evaluation Application area: 3D user interfaces
Background Fine-grained location systems have been built using ultrasound e.g. Bats (AT&T), Cricket (MIT) Achieving ~3cm 3D accuracy 95% of the time Many end-user devices have audible-range I/O built in, but few have ultrasound Can we use integrated/off-the-shelf audio hardware for location?
Audio-based location with off-the-shelf hardware Many tx Speakers in environment as tx Mobile phones or PDAs as rx Privacy-preserving Many rx Computer microphones as rx Can use mobile devices for tx BUT can also use human sounds for tx
Locating human sounds Need coverage from at least 4 mics Unknowns: X,Y,Z,t (t=time of sound) More is better since occlusion happens Users do not need special “tag” No per-user setup required Lowers costs and increases simplicity User identity is not provided Many apps do not need identity Anonymity is good for privacy Could fuse with identity e.g. from RFID
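A sketch of why four microphones is the minimum, written as a worked equation. The symbols p, m_i, t_i, c and N are introduced here for illustration and are not from the slides: p = (X, Y, Z) is the unknown sound position, m_i the surveyed position of microphone i, t_i the arrival time measured at microphone i, t the unknown emission time, and c the speed of sound.

```latex
\lVert \mathbf{p} - \mathbf{m}_i \rVert = c\,(t_i - t), \qquad i = 1, \dots, N
```

Each microphone that hears the sound contributes one such equation. With four unknowns (X, Y, Z, t), at least N = 4 equations are needed; extra microphones over-determine the system, which helps when some are occluded.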
Aims of prototype Fine-grained location sensing with hardware accessible to end users What accuracy can we obtain for locating human sounds, e.g. finger clicking, hand clapping? Application area: 3D user interfaces using human sounds
Prototype Use standard PC Add 6 PCI sound cards and 6 mics Total cost of sound hardware ~£100 From dabs.com Fedora Core 2 Linux distribution Java software
System Architecture (diagram: Signal Detection blocks, Timing, Positioning)
Signal Detection Problem: identify the same part of the same sound in audio streams from multiple mics Amplitude-threshold algorithm Keep track of current noise floor Mark sample as “significant” when amplitude is at least F times noise floor (F ≈ 2.5) Properties Very good at detecting sharp sounds Equally important: ignores other sounds Robust to noisy environments and cheap mics
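A minimal sketch of an amplitude-threshold detector of the kind described above. The slide does not say how the noise floor is tracked, so the exponential moving average, the `alpha` smoothing constant, the `initial_floor` value, and the function name `detect_onsets` are all assumptions made for illustration; only the factor F ≈ 2.5 comes from the talk.

```python
def detect_onsets(samples, factor=2.5, alpha=0.01, initial_floor=1.0):
    """Return indices of samples whose amplitude is >= factor * noise floor.

    Assumption: the noise floor is an exponential moving average of the
    absolute amplitude of non-significant samples. The talk only says that
    a noise floor is tracked, not how.
    """
    floor = initial_floor
    significant = []
    for i, sample in enumerate(samples):
        amplitude = abs(sample)
        if amplitude >= factor * floor:
            significant.append(i)
        else:
            # Only quiet samples update the floor, so a sharp sound
            # does not raise the threshold for itself.
            floor = (1.0 - alpha) * floor + alpha * amplitude
    return significant


# Usage: steady background of amplitude 1 with one sharp spike at index 100.
background = [1, -1] * 50
stream = background + [10] + background
print(detect_onsets(stream))
```

Because the threshold is relative to the running floor, a steadily loud room raises the floor and only sounds that stand out sharply against it are marked, which matches the "ignores other sounds" property claimed on the slide.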
Timing Need time sync for sound streams 1ms error ≈ 30cm in space Problem: Linux/Java introduce delays Buffering and scheduling result in variable delays of >1ms Solution: hacked sound driver Timestamp taken at interrupt made available to Java app (via /proc) Does not account for interrupt delay Around 200 lines of C code
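The "1ms error ≈ 30cm" figure follows from the speed of sound. A small sketch of the arithmetic, assuming c = 343 m/s (dry air at about 20 °C) and, for the per-sample figure, a 44.1 kHz sample rate; neither value is stated in the slides.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value, not from the slides


def timing_error_to_distance_cm(error_seconds):
    """Distance sound travels during a given timing error, in centimetres."""
    return error_seconds * SPEED_OF_SOUND_M_PER_S * 100.0


def one_sample_distance_cm(sample_rate_hz=44100):
    """Distance sound travels during one audio sample (assumed sample rate)."""
    return timing_error_to_distance_cm(1.0 / sample_rate_hz)


print(timing_error_to_distance_cm(0.001))  # about 34 cm for 1 ms
print(one_sample_distance_cm())            # under 1 cm per sample
```

Under these assumptions a single sample period is worth less than a centimetre, so millisecond-scale buffering and scheduling delays, not the sample rate, dominate the error unless timestamps are taken close to the hardware.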
Positioning Survey of microphone positions is currently done manually See orthogonal work on self-surveying Use well-studied Levenberg-Marquardt technique to find 3D position (and sound generation time)
1D evaluation First evaluated 1D performance for relative distance Use two mics and a 6x7 grid of test points 20 hand claps and 20 finger clicks at each point (Diagram: microphones and X/Y test grid; label: 60cm)
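One way the 1D "relative distance" error could be computed at a test point, as a sketch. It assumes relative distance means the difference between the sound's distances to the two microphones, and a speed of sound of 343 m/s; the function names and coordinates are hypothetical.

```python
import math

C = 343.0  # assumed speed of sound in m/s


def expected_relative_distance(point, mic_a, mic_b):
    """Ground truth: (distance to mic A) minus (distance to mic B), in metres."""
    dist_a = math.hypot(point[0] - mic_a[0], point[1] - mic_a[1])
    dist_b = math.hypot(point[0] - mic_b[0], point[1] - mic_b[1])
    return dist_a - dist_b


def measured_relative_distance(arrival_a, arrival_b):
    """Estimate from the two detection timestamps (seconds)."""
    return C * (arrival_a - arrival_b)


# Usage: a test point 0.5 m from mic A and 0.4 m from mic B.
truth = expected_relative_distance((0.3, 0.4), (0.0, 0.0), (0.3, 0.0))
estimate = measured_relative_distance(0.5 / C, 0.4 / C)
print(abs(estimate - truth))  # 1D error in metres
```

The unknown emission time cancels in the subtraction, which is why two microphones are enough for this 1D measure.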
1D results: hand clapping
1D results: finger clicking
Implications of 1D results Our mics are usable ~60º either side of axis Our mics have a maximum range of 4m Drops to ~2m in very noisy conditions Implications for deployment Density of microphones required to sense location in a space Finger clicking has median 1D error <5cm At least some of this due to human error
3D experiment setup 20 finger clicks at 4x4x3 test points on 60cm grid Total clicks: ~1000. Very sore fingers. Microphones at 2 heights, and much more spread in X,Y than in Z This might be typical for real deployments (Diagram: X/Y layout, label: 60cm. Key: microphone at 60cm high; microphone at 120cm high)
Lollipops!
3D distance error
What do I think it’s good for? 3D user interfaces When I click here in future, do this Extend computer input beyond desk/lap Situated interfaces Add a light switch by the bed Remote control without a losable device Inspiration: SPIRIT (AT&T) which allows Bats to be used as 3D pointers
What do you think it’s good for? Accessible user interfaces Elderly, disabled Activity inferencing Fusion of location with sound recognition Performance art Spotlights follow sounds Tracking planes in an air show! Well, maybe not…
Visualisation To help deploy and demo it UbiComp, Mobisys Allows placement of mics and creation of “buttons” in 3D By mouse or finger Used to create an mp3 player demo
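A sketch of how a located click might be matched against 3D "buttons". The slides do not describe the button shape or matching rule, so spherical buttons, the nearest-centre tie-break, the 0.3 m radius and the button names are all assumptions for illustration.

```python
import math


def hit_button(click, buttons):
    """Return the name of the nearest button whose sphere contains the click.

    `buttons` maps a name to (centre, radius) in metres; returns None on a miss.
    """
    best_name, best_dist = None, None
    for name, (centre, radius) in buttons.items():
        dist = math.sqrt(sum((c - b) ** 2 for c, b in zip(click, centre)))
        if dist <= radius and (best_dist is None or dist < best_dist):
            best_name, best_dist = name, dist
    return best_name


# Usage: two hypothetical mp3-player buttons placed 1 m apart.
buttons = {
    "play": ((1.0, 1.0, 1.0), 0.3),
    "next": ((2.0, 1.0, 1.0), 0.3),
}
print(hit_button((1.1, 1.0, 1.05), buttons))
print(hit_button((0.0, 0.0, 0.0), buttons))
```

A radius comparable to the system's 3D error keeps most clicks aimed at a button inside it, and spacing buttons well beyond that radius avoids ambiguous hits.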
Demo video – Accuracy
Demo video – User Interface
Conclusions 3D location sensing for under £100 of consumer sound peripherals Accuracy: better than 28cm (3D) for 90% of finger clicks Improves to 10cm for 2D and repeated clicks Sound-based user interfaces Happy to provide source and specs