1
CS 2750 Project Report Jason D. Bakos
2
Project Goals. Data: sensor readings from 11 different people walking in a controlled environment. An accelerometer records floor vibration data from footfalls. A microphone records sounds from footfalls. This data is recorded 10 times for each person.
3
Project Goals. Use this data to perform multi-way classification (human gait analysis). Eventually want to determine whether a person is in duress. Most important aspect: learn the nature of the data to determine how best to classify it.
4
Data Preprocessing. Data size: data is collected at 15 kHz for approximately 10 seconds, giving about 150,000 samples per recording. Must get the data out of the time domain. Must capture a “walk” as a single data point (time series => cross-sectional).
5
Data Preprocessing. Extract the largest-intensity step from the data (the one closest to the sensors). Transform the data to the frequency domain with a Fourier transform: used MATLAB FFT (output is real array), integrated over time. Bin the resulting data into bins; these are now the features.
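The transform-and-bin step can be sketched as follows. This is a minimal illustration in Python, not the MATLAB code used in the project. It assumes the features are sums of spectrum magnitudes over equal-width frequency bins, which the slides do not state. The names `magnitude_spectrum` and `bin_features` are hypothetical, and a naive DFT stands in for a real FFT so the example needs only the standard library.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Naive O(n^2) DFT; returns magnitudes for frequencies 0..n/2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc))
    return mags

def bin_features(mags, n_bins):
    """Sum the magnitudes into n_bins roughly equal-width frequency bins."""
    features = [0.0] * n_bins
    for i, m in enumerate(mags):
        features[min(i * n_bins // len(mags), n_bins - 1)] += m
    return features

# Toy stand-in for an extracted footstep: 5 sine cycles over 64 samples.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
mags = magnitude_spectrum(signal)
features = bin_features(mags, n_bins=4)  # one feature vector per "walk"
```

With real recordings the bin count would be much larger (the slides later mention 50-bin data) and an FFT routine would replace the naive DFT.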
6
Data Preprocessing: extracting the footstep. Method 1: find the max value in the time domain and center a fixed window around it (window sizes 2000, 4000, 6000). Method 2: actively find the footstep. Create a new vector by recording a sliding absolute-“mean” window, extract the largest hill (using gradient descent and a threshold), then index from the mean array into the data array (mean window sizes 1000, 2000, 3000).
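Both extraction methods can be sketched in a few lines. This is an illustration under assumptions, not the project's code. The function names are hypothetical. The threshold is assumed to be a fraction of the peak mean (`threshold_ratio`, an invented parameter). A simple downhill walk from the peak stands in for the gradient-descent step mentioned on the slide.

```python
def fixed_window(data, width):
    """Method 1: window of `width` samples centered on the largest sample."""
    peak = max(range(len(data)), key=lambda i: data[i])
    start = max(0, min(peak - width // 2, len(data) - width))
    return data[start:start + width]

def sliding_abs_mean(data, width):
    """Mean of |x| over every length-`width` window, via a running sum."""
    total = sum(abs(x) for x in data[:width])
    means = [total / width]
    for i in range(width, len(data)):
        total += abs(data[i]) - abs(data[i - width])
        means.append(total / width)
    return means

def largest_hill(data, width, threshold_ratio=0.5):
    """Method 2: find the peak of the mean array, then walk downhill on
    both sides until the mean falls below the (assumed) threshold."""
    means = sliding_abs_mean(data, width)
    peak = max(range(len(means)), key=means.__getitem__)
    threshold = threshold_ratio * means[peak]
    lo = peak
    while lo > 0 and means[lo - 1] >= threshold:
        lo -= 1
    hi = peak
    while hi < len(means) - 1 and means[hi + 1] >= threshold:
        hi += 1
    # means[i] summarizes data[i:i + width], so map back to data indices.
    return data[lo:hi + width]

# Toy recording: silence, one short "footstep", silence.
data = [0.0] * 20 + [1.0, -2.0, 3.0, -2.0, 1.0] + [0.0] * 20
step_fixed = fixed_window(data, 5)
step_hill = largest_hill(data, 5)
```

Method 1 always returns exactly `width` samples, while Method 2 returns a span whose length depends on how wide the hill is.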
7
Data Preprocessing Mean window of 1000
8
Data Preprocessing Mean window of 2000
9
Data Preprocessing Mean window of 3000
10
Analysis of Preprocessed Data. Cluster analysis (unsupervised learning) in 3 steps: distance calculation, linkage analysis, clustering.
11
Analysis of Preprocessed Data: distance calculation. 4 distance measures. Euclid: standard distance. Standardized Euclid: each coordinate difference is scaled by that feature's variance, so differences along high-variance features count for less. City Block: similar to Euclid, used for comparison. Minkowski: another way to measure distance, used for comparison. The result is an array holding the distance from each point to every other point.
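The four distance measures can be written out directly. The sketch below is a plain-Python illustration with hypothetical function names, not the MATLAB routine used in the project. The Minkowski exponent `p=3` is an arbitrary assumption, since the slides do not give one.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardized_euclid(a, b, variances):
    # Each squared difference is divided by that feature's variance.
    return math.sqrt(sum((x - y) ** 2 / v
                         for x, y, v in zip(a, b, variances)))

def city_block(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p=3):
    # p=1 gives city block, p=2 gives Euclid.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def pairwise(points, dist):
    """Full matrix of distances from each point to every other point."""
    n = len(points)
    return [[dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = pairwise(points, euclid)
```

Any of the four functions can be passed to `pairwise`; `standardized_euclid` needs its `variances` argument bound first, for example with `functools.partial`.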
12
Analysis of Preprocessed Data: linkage analysis. Hierarchically link data points. Methods. Shortest distance: uses the closest pair of points between two clusters. Average distance: uses the average distance over all pairs of points between two clusters. Centroid distance: uses the distance between the center points of the clusters. Incremental sum-of-squares: merges the clusters that add least to the within-cluster sum of squares; similar to centroid, used for comparison. The result is a matrix.
13
Analysis of Preprocessed Data: clustering. Force the data points into a fixed number of clusters. The result is a cluster vector and a dendrogram.
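Linkage and clustering can be illustrated together. The sketch below uses only the shortest-distance linkage method and stops merging once a fixed number of clusters remains. It is a simple illustration with a hypothetical function name, it produces only the cluster vector (no dendrogram), and it is not the routine used in the project.

```python
def single_linkage_clusters(dist, k):
    """Repeatedly merge the two closest clusters until only k remain.

    `dist` is a full pairwise distance matrix. Cluster-to-cluster
    distance is the shortest distance between any two members.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j]
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a is stable
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Toy 1-D data with three obvious groups.
values = [0.0, 1.0, 10.0, 11.0, 50.0]
dist = [[abs(u - v) for v in values] for u in values]
labels = single_linkage_clusters(dist, k=3)
```

The returned list plays the role of the cluster vector: entry `i` is the cluster assigned to data point `i`.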
14
Analysis of Preprocessed Data. How to judge how well the clustering worked? My answer: since there are exactly 10 samples from each of 11 people, define “uniformity” as a metric.
15
Analysis of Preprocessed Data
16
Checked all 12 charts: fix2000, fix4000, fix6000, win1000, win2000, win3000, each for vibration and audio. Euclid/sum-of-squares is best for both vibration and audio. win3000 is best for vibration; fix2000 is best for audio.
17
Analysis of Preprocessed Data
18
Indirect Learning. Used a parametric Naïve Bayes model to do multi-way classification (11 classes) on the 50-bin data. Assumed the data was multivariate Gaussian. Chose the class with the maximum posterior of C. Used multiple train/test splits to train 3 models with bagging (voting).
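A Gaussian Naïve Bayes classifier with a maximum-posterior decision rule can be sketched as below. This is an illustration, not the project's model. It assumes one independent Gaussian per feature and class and omits the bagging step. The function names are hypothetical, and `var_floor` is an invented parameter that guards against zero variance.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log prior plus per-feature mean and variance per class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(zip(*rows), means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def predict(model, x):
    """Return the class with the maximum (log) posterior for sample x."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)

# Toy two-class data standing in for the 11-class, 50-bin features.
X = [[1.0, 2.0], [1.2, 1.8], [5.0, 6.0], [5.2, 6.1]]
y = ["a", "a", "b", "b"]
model = fit_gaussian_nb(X, y)
guess = predict(model, [1.1, 2.0])
```

Working in log space avoids underflow when many per-feature likelihoods are multiplied, which matters with 50 features.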
19
Indirect Learning
20
Bad results: worse than a random predictor. Conclusion: the data is not Gaussian.
21
Direct Learning. Trained a neural network with the same data. Used a softmax network to perform multi-way classification: 1000 epochs, log sigmoid, gradient descent. Tried different parameters for the neural network.
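The forward pass of such a network can be sketched as follows: log-sigmoid hidden units feeding a softmax output layer. This is an illustration only. The weights are made-up numbers, the helper names are hypothetical, a single hidden layer is assumed, and training by gradient descent is not shown.

```python
import math

def logsig(x):
    """Log-sigmoid (logistic) activation: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Turn raw output scores into class probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden layer of logsig units followed by a softmax output."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(out_w, out_b)]
    return softmax(scores)

# Made-up weights: 2 inputs, 2 hidden units, 3 output classes.
probs = forward(
    x=[0.5, -1.0],
    hidden_w=[[1.0, 0.0], [0.0, 1.0]], hidden_b=[0.0, 0.0],
    out_w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], out_b=[0.0, 0.0, 0.0],
)
predicted_class = probs.index(max(probs))
```

The predicted class is the index of the largest probability; with 11 people the output layer would have 11 units.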
22
Direct Learning [figure panels: Vibration, Audio]
23
Direct Learning. No improvement beyond 50 neurons per level (vibration and audio). 4 levels is best (including the output level). Results are terrible for test sets.
24
Conclusion. Need better feature extraction and better classifiers. Or… maybe different sensors are needed (video).