WISDM Activity Recognition & Biometrics Applications of Classification


WISDM Activity Recognition & Biometrics
Applications of Classification
WISDM = Wireless Sensor Data Mining
Last modified 1/3/19

Both of these projects started as undergraduate research projects. The Activity Recognition project was first; then we realized we could do biometrics using the same data.

What is Activity Recognition?
Identifying a user’s physical activity based on sensor data. In the WISDM case, this is mobile sensor data from the smartphone and/or smartwatch accelerometer and gyroscope. How would you formulate this as a classification task? It is not so obvious if you have not read the paper, since the time dimension complicates things.

What is Biometrics?
Identifying a subject based on some physical or behavioral characteristic. Physical: fingerprint, iris, etc. In the WISDM case, subjects are identified using motion sensor data from a smartphone/smartwatch. There are two classification tasks: authentication and identification. Which is more typical?

Motion-Based Biometrics: Example
What blockbuster movie from the last 10 years featured motion-based (gait) biometrics? https://www.youtube.com/watch?v=0iZ-nQ4yFn4 (skip to 0:58)

Why Do We Care?
Biometrics: the obvious use is security. Why is it better than passwords? The goal is to avoid or eliminate passwords; biometrics are more convenient and harder to fake.
Activity recognition: context-sensitive “smart” devices, fitness and health applications, and tracking what we do for other purposes.

The Data
WISDM data is collected at 20 Hz from both the phone and the watch (Android). Each sensor (accelerometer and gyroscope) yields a timestamped sequence of numbers for each of its 3 dimensions.

Walking Data (plots: watch gyroscope, phone accelerometer)

Phone Accelerometer (Jogging)

Phone Accelerometer (Standing)

WISDM Activity Recognition Studies
2010 study: used only smartphones. Good results, but only 6 basic activities (29 subjects). More refined studies followed over the next few years, including work on the impact of personal models.
2017 study: 18 activities and 51 test subjects, including eating activities. Sensors: evaluates the accelerometer and gyroscope on the watch and on the phone (4 sensors), and also evaluates 5 fused-sensor combinations.

The 2016 Smartwatch Activities
General activities: Walking*, Jogging*, Climbing Stairs*, Sitting*, Standing*, Kicking Soccer Ball.
General activities (hand-oriented): Dribbling Basketball, Playing Catch with Tennis Ball, Typing, Handwriting, Clapping, Brushing Teeth, Folding Clothes.
Eating activities (hand-oriented): Eating Pasta, Eating Soup, Eating Sandwich, Eating Chips, Drinking from a Cup.
* These were used in the 2010 smartphone study.
These activities and the associated data were also used for the biometrics study.

Formulation as Classification
Take the raw time-series sensor data in non-overlapping 10-second chunks and create one example per chunk. (A sliding window with overlap could have been used instead.) Use higher-level features to describe behavior over the 10-second period. This is data transformation (also aggregation): it maps the data to a very different representation. It is needed because most classification algorithms assume examples (record format, fixed number of features), not time-series data.
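The chunking step can be sketched as follows. This is only an illustration, not the WISDM code: the function name `segment`, the `(x, y, z)` tuple layout, and the choice to drop a trailing partial chunk are all assumptions. The 20 Hz rate and 10-second chunk length come from the slides.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # one (x, y, z) sensor reading; assumed layout


def segment(samples: Sequence[Sample], rate_hz: int = 20,
            window_s: int = 10) -> List[List[Sample]]:
    """Split a time-ordered stream into non-overlapping fixed-length chunks.

    A trailing partial chunk is dropped (an assumption of this sketch).
    """
    size = rate_hz * window_s  # 200 samples for 10 seconds at 20 Hz
    n_full = len(samples) // size
    return [list(samples[i * size:(i + 1) * size]) for i in range(n_full)]


# 25 seconds of made-up readings: two full chunks, the last 5 seconds are dropped.
stream = [(0.0, 9.8, 0.1)] * 500
windows = segment(stream)
print(len(windows), len(windows[0]))  # 2 200
```

Each chunk then becomes one example once it is summarized by a fixed number of features.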

High Level Features: 43 Total
Average[3]: average acceleration (per axis).
Standard Deviation[3]: SD per axis.
Average Absolute Difference[3]: per axis.
Average Resultant Acceleration[1]: average of the square root of the sum of squares of the 3 axis values.
Time Between Peaks[3]: per axis.
Binned Distribution[30]: for each axis, take the range (max - min), divide it into 10 equal-sized bins, and record the fraction of values in each bin.
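A rough sketch of how most of these features could be computed for one 10-second chunk is shown below. It covers 40 of the 43 features; Time Between Peaks is left out because the slides do not say how peaks are detected. Several details are assumptions made for illustration: the names `window_features` and `binned_fractions`, the use of the population standard deviation, and reading "average absolute difference" as the mean absolute difference from the axis mean.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # one (x, y, z) reading; assumed layout


def binned_fractions(values: Sequence[float], n_bins: int = 10) -> List[float]:
    """Fraction of values falling in each of n_bins equal-width bins over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # A constant signal has zero range, so everything goes in bin 0.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def window_features(window: Sequence[Sample]) -> List[float]:
    """Return 40 features: 3 means, 3 SDs, 3 AADs, 1 resultant, 30 bin fractions."""
    axes = list(zip(*window))  # three tuples: all x, all y, all z
    means = [mean(a) for a in axes]
    feats: List[float] = list(means)
    feats += [pstdev(a) for a in axes]
    feats += [mean(abs(v - m) for v in a) for a, m in zip(axes, means)]
    feats.append(mean(math.sqrt(x * x + y * y + z * z) for x, y, z in window))
    for a in axes:
        feats += binned_fractions(a)
    return feats


# Made-up chunk: x cycles through 0, 1, 2, 3; y and z are constant.
window = [(float(i % 4), 1.0, -1.0) for i in range(200)]
feats = window_features(window)
print(len(feats))  # 40
```

Whatever the exact definitions, the point is that every chunk maps to a vector of the same length, which is the record format that standard classifiers expect.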

Types of Activity Recognition Models
Impersonal models: generated using data from a panel of other users. Build a model from 50 subjects and test it on the 51st; repeat 51 times so that every subject is evaluated using a model built from all other subjects.
Personal models: generated using data from the intended user. This requires generating 51 models and carefully partitioning the data for each subject.
Which do you think performs best?
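The two evaluation schemes differ only in how the data is partitioned, which the sketch below tries to make concrete. The example layout `(subject_id, features, activity)` and both function names are hypothetical. The slides do not say how each subject's data was partitioned for personal models; splitting each activity's examples in order is an assumption made here so that every activity appears in both parts.

```python
from collections import defaultdict
from typing import Iterator, List, Tuple

Example = Tuple[str, List[float], str]  # (subject_id, features, activity); assumed layout
Split = Tuple[str, List[Example], List[Example]]  # (subject, train, test)


def impersonal_splits(examples: List[Example]) -> Iterator[Split]:
    """Hold out one subject at a time; train on everyone else."""
    for held_out in sorted({e[0] for e in examples}):
        train = [e for e in examples if e[0] != held_out]
        test = [e for e in examples if e[0] == held_out]
        yield held_out, train, test


def personal_splits(examples: List[Example],
                    train_fraction: float = 0.5) -> Iterator[Split]:
    """Train and test on the same subject, using disjoint examples per activity."""
    by_subject = defaultdict(lambda: defaultdict(list))
    for e in examples:
        by_subject[e[0]][e[2]].append(e)
    for subject in sorted(by_subject):
        train, test = [], []
        for items in by_subject[subject].values():
            cut = int(len(items) * train_fraction)
            train += items[:cut]
            test += items[cut:]
        yield subject, train, test


# Made-up data: 3 subjects x 2 activities x 4 examples each.
examples = [(s, [0.0], a) for s in ("s1", "s2", "s3")
            for a in ("walk", "jog") for _ in range(4)]
impersonal = list(impersonal_splits(examples))
personal = list(personal_splits(examples))
print(len(impersonal), len(personal))  # 3 3
```

With 51 subjects, `impersonal_splits` would yield the 51 train/test pairs described above, and `personal_splits` the 51 per-subject partitions.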

Activity Recognition Results

2010 Study using Impersonal Model (IB3 Method): 72.4% accuracy
Confusion matrix (rows: actual class; columns: predicted class)
Classes: Walking, Jogging, Stairs, Sitting, Standing, Lying Down
2209 46 789 2 4 45 1656 148 1 412 54 869 3 10 47 553 30 241 8 57 6 448 5 7 301 13 131

2010 Study using Personal Model (IB3 Method): 98.4% accuracy
Confusion matrix (rows: actual class; columns: predicted class)
Classes: Walking, Jogging, Stairs, Sitting, Standing, Lying Down
3033 1 24 4 1788 42 1292 870 2 6 5 11 509 8 7 442

2010 Study Accuracy Results (% of Records Correctly Classified)
Columns: Personal (IB3, J48, NN); Impersonal (IB3, J48, NN); Straw Man
Walking 99.2 97.5 99.1 72.4 77.3 60.6 37.7
Jogging 99.6 98.9 99.9 89.5 89.7 89.9 22.8
Stairs 96.5 91.7 98.0 64.9 56.7 67.6 16.5
Sitting 98.6 97.6 97.7 62.8 78.0 10.9
Standing 96.8 96.4 97.3 85.8 92.0 93.6 6.4
Lying Down 95.9 95.0 96.9 28.6 26.2 60.7 5.7
Overall 98.4 96.6 98.7 74.9 71.2
What do you think is meant by straw man? What might the straw man represent in this case?

Personal Model Accuracy (RF)
RF = Random Forest

Impersonal Model Accuracy (RF)

Accuracy of Different Classification Algorithms

Personal Model Learning Curves
The x-axis represents the amount of training data per activity.

Impersonal Model Learning Curves
The x-axis represents the amount of training data per activity per panelist (50 panelists).

Personal Model Learning Curves (RF with varying sensors)

Impersonal Model Learning Curves (RF with varying sensors)

Impersonal Model Learning Curves (varying number of panelists)

Hybrid Models: A Big Problem
Much related work uses hybrid models rather than personal or impersonal models. A single data set is collected and then split into training and test sets, so data from any subject can be in both the training set and the test set. The methodology is easy: it can use random sampling or cross-validation. The resulting model is not impersonal, but not personal either (hence hybrid). Hybrid models are nevertheless discussed as if they were impersonal models, but they are not! This is cheating, and many researchers make this methodological mistake. It is not the same as using the same data for training and testing, but it is similar. One might assume that a small overlap is not a problem (with 50 people in the dataset, each test subject accounts for only about 2% of the training data). Our prior research shows that it is a huge problem: hybrid models perform like personal models.
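The leak is easy to see in a small sketch that compares a random split of all examples with a split that holds out whole subjects. The data, the helper `shared_subjects`, and the split sizes are made up for illustration.

```python
import random
from typing import List, Set, Tuple

Example = Tuple[str, int]  # (subject_id, window_index); hypothetical layout


def shared_subjects(train: List[Example], test: List[Example]) -> Set[str]:
    """Return the subjects that contribute examples to both sets."""
    return {s for s, _ in train} & {s for s, _ in test}


# Made-up data: 50 subjects with 20 windows each.
examples = [(f"s{i}", k) for i in range(50) for k in range(20)]

# Hybrid: shuffle all windows together, then hold out 150 of them.
shuffled = examples[:]
random.Random(0).shuffle(shuffled)
hybrid_train, hybrid_test = shuffled[150:], shuffled[:150]

# Impersonal: hold out whole subjects instead of individual windows.
held_out = {"s0", "s1", "s2", "s3", "s4"}
imp_train = [e for e in examples if e[0] not in held_out]
imp_test = [e for e in examples if e[0] in held_out]

# 150 is not a multiple of 20, so at least one subject must be split across sets.
print(len(shared_subjects(hybrid_train, hybrid_test)) > 0)  # True
print(shared_subjects(imp_train, imp_test))                 # set()
```

Checking that the set of shared subjects is empty before reporting "impersonal" results is a cheap safeguard against this mistake.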

Actitracker
The phone-based research was incorporated into a deployed app/system called Actitracker. The development effort to handle real-time activity recognition was substantial. Actitracker is no longer supported.

WISDM Biometrics
Uses the same data as activity recognition. The initial research focused only on the walking activity (gait), but was eventually extended to all 18 activities. Uses the same transformation process to generate 10-second examples. Three classification algorithms were used: kNN, DT, and Random Forest (RF does best again). Identification uses 10-fold cross-validation.

Identification Results
Columns: single sensor (Phone Accel, Phone Gyro, Watch Accel, Watch Gyro); fused sensors (Phone, Watch, Accel, Gyro, All)
Walking 96.1 94.7 75.1 67.0 96.8 78.9 96.5 95.3 97.4
Jogging 92.5 75.0 74.3 96.0 82.1 95.7 95.2 98.0
Stairs 90.8 81.2 52.4 39.2 92.7 58.7 92.6 80.9 95.1
Sitting 90.1 56.3 70.4 30.1 91.5 69.3 93.1 55.9 92.0
Standing 85.8 47.1 64.1 27.0 86.8 61.2 90.5 46.6 89.9
Typing 94.8 71.7 51.2 94.6 84.2 95.6 76.5
Teeth 92.2 69.5 70.0 93.7 76.1 74.5 95.4
Soup 94.3 56.5 74.1 50.4 95.8 76.6 96.3 66.9 96.6
Chips 93.3 56.8 62.6 38.7 93.2 62.4 66.3 94.9
Pasta 94.1 56.9 67.2 38.1 94.0 71.6 61.1
Drinking 93.9 57.4 63.9 41.3 93.8 65.3 60.6
Sandwich 92.9 62.8 61.9 37.6 62.1 95.9 68.5
Kicking 87.4 54.3 38.3 88.6 59.8 92.1 72.7
Catch 90.0 69.1 71.3 90.3 75.4 82.0
Dribbling 88.3 66.0 72.3 74.8 89.5 80.3 94.4
Writing 92.8 79.6 47.6 79.1 94.2 73.0
Clapping 72.8 83.4 73.9 85.3 86.1 96.7
Folding 90.7 65.8 60.0 38.8 63.0 93.6
Avg. 67.3 68.7 49.8 73.2

Majority Voting Strategy
The results just displayed are based on classifying a single test example, i.e., 10 seconds of data. For biometrics, however, we can assume that consecutive sensor data comes from the same person, so we can use more than 10 seconds. Our majority voting uses 5 examples (50 seconds). This should yield improved results.
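A minimal sketch of the voting step, assuming the per-example predictions arrive in time order and are grouped into non-overlapping runs of 5. The function name, the grouping scheme, and the tie-breaking rule (whichever label `Counter` lists first) are assumptions, not details from the study.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str], group_size: int = 5) -> List[str]:
    """Replace each full group of consecutive predictions with its most common label."""
    voted = []
    for start in range(0, len(predictions) - group_size + 1, group_size):
        group = predictions[start:start + group_size]
        voted.append(Counter(group).most_common(1)[0][0])
    return voted


# Made-up per-example identity predictions for ten 10-second examples.
per_window = ["amy", "amy", "bob", "amy", "cat",
              "bob", "bob", "bob", "amy", "bob"]
voted = majority_vote(per_window)
print(voted)  # ['amy', 'bob']
```

Individual mistakes (the stray "bob" and "cat" in the first group) are outvoted, which is why voting over 50 seconds should help.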

Identification Results (Voting)
Columns: single sensor (Phone Accel, Phone Gyro, Watch Accel, Watch Gyro); fused sensors (Phone, Watch, Accel, Gyro, All)
Walking 100.0 94.1 80.4 90.2
Jogging 90.0 88.0 98.0
Stairs 70.0 43.8 96.0 75.0 91.7
Sitting 62.7 88.2 33.3 86.3 64.7
Standing 39.2 82.4 20.0 84.0 50.0
Typing 89.8 94.0 95.9
Teeth 96.1
Soup 66.7 62.0 80.0
Chips 76.0 41.2
Pasta 56.0 48.0 71.4
Drinking 58.8 60.8
Sandwich 68.0 38.0 82.0 73.5
Kicking 68.6 32.0
Catch 78.0 85.7 91.8
Dribbling
Writing
Clapping
Folding 76.5 78.4
Avg. 98.8 74.4 85.8 55.8 98.9 88.3 99.7 83.2 99.6

Continuous Biometrics
Idea: biometrics using motion data from normal (unstructured) daily tasks. We can only approximate this, since we have only 18 activities and an even distribution of each. The next set of results merges all 18 activities.
Without label: activities are merged with no activity label.
Predicted label: an activity recognition model predicts the activity, and that predicted label is then used.
With label: the known label is used. This is not realistic, but it serves as an upper bound.
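One way to picture the three settings is as three ways of building the example that is handed to the identification model. The slides do not say how the label is "used"; appending it as an extra feature is purely an assumption of this sketch, as are the names `build_example` and `toy_activity_model`.

```python
from typing import Callable, List, Optional, Sequence, Union

Feature = Union[float, str]


def build_example(features: Sequence[float], mode: str,
                  true_activity: Optional[str] = None,
                  predict_activity: Optional[Callable[[Sequence[float]], str]] = None
                  ) -> List[Feature]:
    """Build the identification example for one of the three label settings."""
    example: List[Feature] = list(features)
    if mode == "without":
        return example
    if mode == "predicted":
        if predict_activity is None:
            raise ValueError("mode 'predicted' needs an activity model")
        return example + [predict_activity(features)]
    if mode == "with":
        if true_activity is None:
            raise ValueError("mode 'with' needs the known activity")
        return example + [true_activity]
    raise ValueError(f"unknown mode: {mode}")


def toy_activity_model(features: Sequence[float]) -> str:
    """Stand-in for a trained activity recognition model (made-up rule)."""
    return "walking" if features[0] > 1.0 else "sitting"


window = [1.7, 0.3]  # made-up feature vector
print(build_example(window, "without"))                                      # [1.7, 0.3]
print(build_example(window, "predicted", predict_activity=toy_activity_model))  # [1.7, 0.3, 'walking']
print(build_example(window, "with", true_activity="jogging"))                # [1.7, 0.3, 'jogging']
```

Only the first two settings are available in a deployed system, since the true activity is not known at prediction time.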

Continuous Biometrics Results (Identification only)
Columns: Without Label, Predicted Label, With Label
Phone Accel 96.8 96.0 97.6
Phone Gyro 61.6 63.1 65.1
Watch Accel 76.0 75.4 77.3
Watch Gyro 39.8 42.4 43.9
Phone 97.0 96.2 97.5
Watch 77.1 80.6 77.9
Accel 99.2 98.9 99.3
Gyro 72.3 72.9 73.0
All 99.1
Average 79.9 80.5 81.2

Authentication Experiments
Binary classification problem: “you” or “imposter”. One model per user (51 models given 51 users). “Imposters” in the test set should not be in the training set. The main evaluation metric is the Equal Error Rate (EER), which balances two types of errors: the false acceptance rate and the false rejection rate. Don’t worry about understanding this metric.
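For the curious, the EER is the error rate at the decision threshold where the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below is one simple way to estimate it from model scores, assuming higher scores mean "more likely the genuine user". With finite data the two rates rarely match exactly, so it takes the threshold where they are closest and averages them; that choice, the function names, and the scores are all illustrative assumptions.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float], imposter: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Error rates when a sample is accepted if its score >= threshold."""
    far = sum(s >= threshold for s in imposter) / len(imposter)  # imposters accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine user rejected
    return far, frr


def equal_error_rate(genuine: Sequence[float], imposter: Sequence[float]) -> float:
    """Approximate the EER by scanning every observed score as a threshold."""
    best_gap, eer = None, 0.0
    for t in sorted(set(genuine) | set(imposter)):
        far, frr = far_frr(genuine, imposter, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


genuine = [0.9, 0.8, 0.75, 0.6, 0.4]   # made-up scores for the legitimate user
imposter = [0.7, 0.5, 0.3, 0.2, 0.1]   # made-up scores for other people
eer = equal_error_rate(genuine, imposter)
print(round(eer, 3))  # 0.2
```

Here a threshold of 0.6 accepts one of five imposters and rejects one of five genuine samples, so both rates are 20%. Lower EER is better.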

Authentication EER (%) without Voting (RF)
Columns: single sensor (Phone Accel, Phone Gyro, Watch Accel, Watch Gyro); fused sensors (Phone, Watch, Accel, Gyro, All)
Walking 11.2 11.3 17.5 18.8 9.3 16.1 12.6 10.2 7.9
Jogging 11.5 13.2 18.1 19.3 10.3 15.1 13.8 9.8
Stairs 12.3 16.4 24.3 26.1 11.8 21.6 13.9 16.5 13.5
Sitting 13.6 26.3 21.8 33.4 12.8 22.3 10.7 27.2 13.0
Standing 14.7 26.0 22.6 33.3 15.6 23.0 11.9 27.9 15.4
Typing 19.4 16.8 26.2 18.0 10.4 19.0 8.7
Teeth 19.7 18.6 22.7 12.1 17.2 11.4 19.9 12.2
Soup 9.6 22.4 17.6 24.6 10.1 8.6 21.7
Chips 23.3 19.2 29.5 11.7 20.3 20.4
Pasta 12.4 18.4 28.8 14.4 10.9
Drinking 12.0 24.2 20.0 30.1 12.9 20.1
Sandwich 24.1 30.2 22.1 23.6
Kicking 12.5 18.5 26.7 21.1 16.7 14.0
Catch 10.8 20.6 20.8 13.4
Dribbling 18.9 21.0 12.7 17.9 15.7
Writing 13.3 15.3 27.1
Clapping 20.5 15.8 9.7 14.6 10.6
Folding 16.6 19.6 24.7 17.1 8.3 17.0
Avg. 20.2 19.5 25.8

Authentication EER (%) with Voting (RF)
Columns: single sensor (Phone Accel, Phone Gyro, Watch Accel, Watch Gyro); fused sensors (Phone, Watch, Accel, Gyro, All)
Walking 9.4 9.8 13.2 17.2 8.8 13.9 11.3 10.0 6.8
Jogging 7.8 10.8 16.2 15.2 9.7 12.7 9.0 11.2 8.3
Stairs 13.4 12.5 19.3 23.9 9.3 18.9 8.4 14.1 6.9
Sitting 10.4 23.7 14.5 32.1 17.0 21.1 10.2
Standing 12.1 22.1 16.7 31.6 10.9 21.5 7.7
Typing 15.4 13.0 20.7 8.9 14.0 8.6 13.3
Teeth 10.1 20.0 14.4 14.9 8.2
Soup 7.3 19.2 22.3 6.1 17.5 8.0
Chips 9.9 14.7 25.9 10.3 18.1 8.5
Pasta 14.3 26.6 18.5 19.6 5.4
Drinking 16.6 25.1 19.9 8.1
Sandwich 17.9 25.7 11.4 17.7
Kicking 10.6 19.4 21.0 24.1 11.0 18.8
Catch 16.3 15.5
Dribbling 16.4 16.1 11.8 11.5
Writing 8.7 15.7 10.7 21.3 9.2 11.6 16.0
Clapping 12.9 14.8
Folding 7.9 18.6 23.4 17.3 7.1
Avg. 17.6 15.6 22.4 9.6 15.3

Some Conclusions
For both AR and biometrics, performance is better when using both the phone and the watch. Results are similar whether all 4 sensors or just the accelerometers on both devices are used. The accelerometer is much better than the gyroscope when each is used alone. For biometrics, clapping and typing could be useful given that they are practical. Personal models perform best for activity recognition. Majority voting improves biometrics.

Room for Additional Research
The WISDM Lab has completed most of the activity recognition research, but some biometrics work is still going on. Let me know if you are interested. Possible topics for (challenging) course projects: true continuous biometrics; biometric authentication using only the positive class (the current problem is that building an authentication model uses lots of training data from imposters); class imbalance.