Privacy Protection for Life-log Video. Jayashri Chaudhari, Sen-ching S. Cheung, M. Vijay Venkatesh. Department of Electrical and Computer Engineering, Center for Visualization and Virtual Environment, University of Kentucky, Lexington, KY. SAFE 2007 (11-13 April), Washington, DC.
Outline: Motivation and Background; Proposed Life-Log System; Privacy Protection Methodology (face detection and blocking, voice segmentation and distortion); Experimental Results; Conclusion.
What is a Life-Log System? “A system that records everything, at every moment and everywhere you go.” Applications include: law enforcement, police questioning, tourism, medical questioning, journalism. Existing systems/work: 1) “MyLifeBits” project at Microsoft Research; 2) “WearCam” project at the University of Toronto (Steve Mann); 3) “Cylon Systems” in the UK (a portable body-worn surveillance system).
Technical Challenges: Security and Privacy; Information Management and Storage; Information Retrieval; Knowledge Discovery; Human-Computer Interface.
Why Privacy Protection? Privacy is a fundamental right of every citizen. There are no clear and uniform rules and regulations regarding video recording. Emerging technologies threaten privacy rights. People are resistant toward technologies such as life-logging. Without tackling these issues, the deployment of such emerging technologies is impossible.
Research Contributions: a practical audio-visual privacy protection scheme for life-log systems; performance measurement (audio) of privacy protection and usability.
Proposed Life-log System: “A system that protects the audiovisual privacy of the persons captured by a portable video recording device.”
Privacy Protection Scheme. Design objectives: Privacy: hide the identity of the subjects being captured. Privacy versus usefulness: the recording should still convey sufficient information to be useful. Speed: the protection scheme should work in real time. [Figure: three example cases labeled (√ usefulness, × privacy), (× usefulness, √ privacy), (√ usefulness, √ privacy)]
System Overview. Audio path: audio -> audio segmentation -> audio distortion. Video path: video -> face detection and blocking. Both paths -> synchronization and multiplexing -> storage. S: Subject (the person who is being recorded). P: Producer (the person who is the user of the system).
Voice Segmentation and Distortion. Flowchart: windowed power computation (P_k); threshold tests P_k < T_S and P_k < T_U; state update State_k = State_{k-1}, Subject, or Producer; pitch shifting; storage. We use the PitchSOLA time-domain pitch-shifting method.* (* “DAFX: Digital Audio Effects” by U. Z. et al.)
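The flowchart on this slide can be read as a small state machine driven by windowed power. The sketch below is one possible reading, not the authors' code: it assumes that a window with power below T_S is too quiet to decide and keeps the previous state, that power between T_S and T_U indicates the Subject, and that power at or above T_U indicates the Producer (for example because the producer is closer to the microphone). The function names, the non-overlapping windows, and the initial state are all hypothetical choices made for illustration.

```python
def windowed_power(samples, window):
    """Mean squared amplitude P_k of each non-overlapping window."""
    return [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


def segment(powers, t_s, t_u, initial="Producer"):
    """Label each window as 'Subject' or 'Producer'.

    Assumed decision rule (not confirmed by the slide):
      P_k < T_S         -> keep State_{k-1}
      T_S <= P_k < T_U  -> Subject
      P_k >= T_U        -> Producer
    """
    states = []
    state = initial
    for p in powers:
        if p < t_s:
            pass  # too quiet to decide: state stays State_{k-1}
        elif p < t_u:
            state = "Subject"
        else:
            state = "Producer"
        states.append(state)
    return states
```

Calling `segment(windowed_power(samples, window), t_s, t_u)` yields one label per window; windows labeled Subject would then be the ones passed to pitch shifting.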
Pitch Shifting Algorithm. Steps: 1) Time stretching by a factor of α using a window of size N and step size Sa. 2) Resampling by a factor of 1/α to change the pitch. [Figure labels: input audio; windows X1(n) and X2(n) of size N, taken Sa apart; X2(n) repositioned at α*Sa; Km: offset of maximum correlation, to preserve formant; mixing]
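The two steps on this slide can be sketched in plain Python. This is a simplified illustration of a SOLA-style pitch shifter, not the PitchSOLA reference implementation: the linear crossfade, the unnormalized correlation, the linear-interpolation resampler, and the `search` parameter (the range of offsets Km that is tried) are assumptions. The defaults N=2048 and Sa=256 are taken from the "Distortion 1" setting in the results; `search=128` is a hypothetical value.

```python
import math


def sola_time_stretch(x, alpha, n, sa, search):
    """Step 1: stretch x by about alpha using overlap-add with correlation alignment."""
    ss = round(alpha * sa)       # synthesis step (alpha*Sa on the slide)
    cmp_len = n - ss - search    # overlap length that is always available
    if ss < search or cmp_len <= 0:
        raise ValueError("need search <= alpha*Sa and alpha*Sa + search < N")
    out = list(x[:n])
    k = 1
    while k * sa + n <= len(x):
        grain = x[k * sa:k * sa + n]   # next window, taken Sa further into the input
        target = k * ss                # nominal output position
        best_km, best_corr = 0, -math.inf
        for km in range(search + 1):   # offset Km with maximum correlation
            corr = sum(out[target + km + i] * grain[i] for i in range(cmp_len))
            if corr > best_corr:
                best_km, best_corr = km, corr
        start = target + best_km
        overlap = len(out) - start
        for i in range(overlap):       # mixing: linear crossfade over the overlap
            w = (i + 1) / (overlap + 1)
            out[start + i] = (1 - w) * out[start + i] + w * grain[i]
        out.extend(grain[overlap:])
        k += 1
    return out


def resample(y, step):
    """Step 2: read y every `step` samples (linear interpolation).

    With step = alpha the length shrinks by a factor of 1/alpha.
    """
    out = []
    i = 0
    while int(i * step) + 1 < len(y):
        t = i * step
        j = int(t)
        frac = t - j
        out.append((1 - frac) * y[j] + frac * y[j + 1])
        i += 1
    return out


def pitch_shift(x, alpha, n=2048, sa=256, search=128):
    """Stretch by alpha, then resample back: duration is kept, pitch scales by alpha."""
    return resample(sola_time_stretch(x, alpha, n, sa, search), alpha)
```

For a list of samples `x`, `pitch_shift(x, 1.5)` returns a signal of roughly the same length whose pitch is raised by a factor of about 1.5. The pure-Python loops are slow and are meant to show the structure of the method rather than to run in real time.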
Face Detection and Blocking. Pipeline: camera -> face detection -> face tracking -> subject selection -> selective blocking. Face detection is based on Viola & Jones. Additional input shown in the diagram: audio segmentation results (subject talking, producer talking).
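The last stage of this pipeline, selective blocking, can be illustrated independently of the detector. The sketch below assumes that detection, tracking, and subject selection have already produced a list of face rectangles in (x, y, width, height) form and that a frame is a 2D list of grayscale pixel values. Both representations, and the function name `block_faces`, are hypothetical choices for illustration.

```python
def block_faces(frame, boxes, fill=0):
    """Return a copy of `frame` with every (x, y, w, h) box filled with `fill`.

    Boxes that extend past the frame border are clipped to the frame.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    blocked = [row[:] for row in frame]  # leave the original frame untouched
    for x, y, w, h in boxes:
        for r in range(max(0, y), min(height, y + h)):
            for c in range(max(0, x), min(width, x + w)):
                blocked[r][c] = fill
    return blocked
```

Filling with a constant is the simplest form of blocking; passing only the boxes of selected subjects is what would make the blocking selective.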
Experimental Results. Three types of experiments: analysis of the segmentation algorithm; analysis of the audio distortion algorithm: 1) accuracy in hiding identity, 2) usability after distortion.
Segmentation Experiment. Experimental data: interview scenario in a quiet meeting room; three interview recordings, each about 1 minute and 30 seconds long. [Figure: example timeline of transitions among P, S, and silence] S: Subject speaking. P: Producer speaking.
Segmentation Results. Table columns: Meeting #; Transitions # (ground truth); Correctly identified transitions #; Falsely detected transitions #; Precision; Recall.
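The precision and recall columns can be derived from the three count columns. The sketch below assumes the usual definitions (precision = correct / (correct + false detections), recall = correct / ground-truth transitions); the slide does not state the formulas, and the counts in the example call are made up for illustration, not results from the talk.

```python
def precision_recall(ground_truth, correct, false_detected):
    """Precision and recall of detected transitions for one meeting."""
    detected = correct + false_detected
    precision = correct / detected if detected else 0.0
    recall = correct / ground_truth if ground_truth else 0.0
    return precision, recall


# Hypothetical counts for illustration only.
p, r = precision_recall(ground_truth=10, correct=8, false_detected=2)
print(f"precision={p:.2f} recall={r:.2f}")
```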
Speaker Identification Experiment. Experimental data: 11 test subjects, 2 voice samples from each subject; one voice sample is used for training and the other for testing. Public-domain speaker recognition software. Script 1 (train): used for training the speaker recognition software. Script 2 (test): used to test the performance of audio distortion in hiding the identity.
Speaker Identification Results. Table columns: Person ID; Without distortion (person ID identified); Distortion 1 (person ID identified); Distortion 2 (person ID identified); Distortion 3 (person ID identified). Error rate: without distortion 0%; Distortion 1 100%; Distortion 2 90.9%; Distortion 3 100%. Distortion 1: (N=2048, Sa=256, α=1.5). Distortion 2: (N=2048, Sa=300, α=1.1). Distortion 3: (N=1024, Sa=128, α=1.5).
Usability Experiments. Experimental data: 8 subjects, 2 voice samples from each subject; one sample is used without distortion and the other is distorted. Manual transcription (5 human testers). 1.wav (transcription 1): “This transcription is of undistorted voice --- stored in one dot wav file.” 2.wav (transcription 2): “This transcription is of distorted voice sample --- in two dot wav ---.” [Figure labels: manual transcription; unrecognized words]
Usability after Distortion. Word Error Rate (WER): standard measure of word recognition error for speech recognition systems. WER = (S + D + I) / N, where S = # substitutions, D = # deletions, I = # insertions, N = # words in the reference sample. Tool used: NIST tool SCLITE.
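To make the WER formula concrete, the sketch below computes it with a word-level edit distance, whose minimum cost equals S + D + I for the best alignment. It is an illustration under simple assumptions (whitespace tokenization, exact word matching) and is not a reimplementation of SCLITE, whose text normalization and alignment options may give different scores.

```python
def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N using a word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the transcription "the cat sit on mat" involves one substitution and one deletion over six reference words, so the function returns 2/6.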
Example Video
Conclusions: proposed a real-time implementation of voice distortion and face blocking for privacy protection in life-log video; analysis of audio distortion for usability; analysis of audio distortion for privacy protection. Future work: improvement in segmentation and face blocking; expanding to a larger dataset; expanding to noisy environments.
Acknowledgment: people at the Center for Visualization and Virtual Environment; Department of Homeland Security. Thank you!