Published by Evan Harvey. Modified over 9 years ago.
1
Team Members: Mohammed Hoque, Troy Tancraitor, Jonathan Lobaugh, Lee Stein, Joseph Mallozi
Pennsylvania State University
2
Presentation Overview
- Problem Statement
- Architectural design
  - Robotic layer
  - Processing layers
  - Communications layer
- Testing and Results
- Realization of Requirements
3
The Big Picture
We built a robot with vision and hearing capabilities.
Objectives:
- The robot must be able to detect a specific user from a known set of users, based on audio and video information.
- The system should facilitate an improved platform for human-computer interaction.
Limitations:
- Recognition is limited to the five group members and the project advisors.
4
High-level Architectural Design
[Diagram. Labels: Robotic interface; Audio processing; Image processing; Neural network; Communications layer; Processing Layer; Robotic Layer]
5
Robotic Layer Diagram
[Diagram. Layers: I/O Layer; Controller Layer; Interface Layer. Labels: Motor controller; Sensor controller; Visual controller; System controller; Motor #; LCD display; Audio input device #; LED array; Interface controller; Visual input device; Environ. input device #; USB 2.0]
6
Processing Layer Diagram
[Diagram. Labels: Interface; Image correction; Audio fingerprinting; Neural networks; Image parsing; USB 2.0; Real-time control system; Image processing; Audio processing]
7
Image Layer Diagram
[Diagram. Labels: Face detection; Image extraction; Image processing; Real-time control system; Neural network]
8
Facial Vector Extraction Process
[Diagram. Labels: Input image I(x,y); Template image; Correlation; Output image O(x,y); Data (x,y)]
The locations of the distinctive features of the face are identified using template matching, and the resulting position vectors are fed to the neural network.
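The slide does not say which correlation measure was used. As a minimal sketch, assuming normalized cross-correlation on grayscale images stored as nested lists, template matching could look like the following. The function name `match_template` and the brute-force search are illustrative choices, not the team's implementation.

```python
import math

def match_template(image, template):
    """Return (row, col, score) of the best normalized cross-correlation match.

    image and template are 2-D lists of grayscale values. (row, col) is the
    top-left corner of the best-matching window in the image.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])

    # Mean-removed template and its norm are computed once.
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    best = (0, 0, -1.0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            w_vals = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if w_norm == 0 or t_norm == 0:
                continue  # flat window or flat template: correlation undefined
            score = sum(a * b for a, b in zip(w_dev, t_dev)) / (w_norm * t_norm)
            if score > best[2]:
                best = (y, x, score)
    return best
```

Running this once per feature template (for example one eye template and one nose template, both hypothetical here) would give the (row, col) positions that make up the facial vector passed to the network.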
9
Template Matching Feature Extraction Example
10
Face Processing Results
11
Eye and nose profiles
12
Audio layer
- Finds the normalized highest amplitude of the waveform (which should be less than or equal to 1).
- Looks for amplitudes greater than 80% of the highest amplitude.
- Creates an array of 40 amplitude points, starting at the first point that crosses the 80% boundary.
13
Audio Layer (cont)
- Pads the secondary array with 984 trailing zeros to get a 1024-point array.
- Performs a Fast Fourier Transform on the 1024 points.
- Sends the absolute values of the first 400 FFT points to the neural network for processing.
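The audio steps on these two slides can be sketched as follows. This is a hedged illustration, not the team's code: the function names are hypothetical, the 80% test is assumed to apply to absolute amplitude, and a small recursive radix-2 FFT stands in for whatever FFT routine was actually used.

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out

def audio_features(samples, threshold=0.8, window=40, fft_size=1024, keep=400):
    """Return the magnitudes of the first `keep` FFT points of the audio window."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("silent input: nothing to normalize")
    norm = [s / peak for s in samples]  # highest amplitude becomes 1

    # First sample whose amplitude exceeds 80% of the peak (assumed: absolute value).
    start = next(i for i, s in enumerate(norm) if abs(s) > threshold)
    segment = norm[start:start + window]

    # Zero-pad to 1024 points (984 zeros when the segment holds 40 points).
    padded = segment + [0.0] * (fft_size - len(segment))
    spectrum = fft(padded)
    return [abs(c) for c in spectrum[:keep]]
```

The returned list of 400 magnitudes corresponds to the audio input vector described for the neural network.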
14
Audio results
15
Neural network layer
- Feed-forward, back-propagation, 3-layer network.
- Neural networks provide a way of allowing the system to generalize the input data and determine the output.
- Two separate networks: one trained for audio data, the other trained for image data.
- Inputs
  - Vectors containing facial feature position information (eyes and nose)
  - Audio vector of 400 known FFT points
- Output
  - Percent similarity to known users based on audio/imaging
- Utilized Matlab’s neural network toolbox to train the system weights.
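The team trained its weights with Matlab's toolbox. Purely as an illustration of what a 3-layer feed-forward network with back-propagation does, here is a minimal sketch in plain Python. The class name, layer sizes, sigmoid activation, squared-error loss, and learning rate are all assumptions, not details from the slides.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class ThreeLayerNet:
    """Input layer, one hidden layer, and an output layer of sigmoid units."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        """One gradient-descent step; returns the squared error before the update."""
        h, y = self.forward(x)
        # Output deltas for squared error with sigmoid units.
        d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
        # Hidden deltas: push output deltas back through w2 before w2 changes.
        d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(list(x) + [1.0]):
                row[i] -= lr * d_hid[j] * v
        return sum((yk - tk) ** 2 for yk, tk in zip(y, target)) / 2
```

With one output unit per known user, each output in (0, 1) can be read as a similarity score for that user, which is one plausible reading of the "percent similarity" output on the slide.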
16
Communications Layer
[Diagram. Labels: USB Class; Command Class; Buffer Class; Communication Package; Robot; RTS]
17
Lock and Key Algorithm
[Diagram. Labels: read; write; P; L; Init(); pop; push]
- Set up in the image and audio arrays.
- Developed to prevent data collisions.
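The slide gives no detail on how the lock and key scheme works beyond the `Init()`, `push`, and `pop` labels. As a rough sketch of the general idea only, assuming a mutual-exclusion lock guards each shared array so that a reader and a writer never touch it at the same time, a guarded buffer might look like this. The class name `LockedBuffer` is hypothetical.

```python
import threading
from collections import deque

class LockedBuffer:
    """FIFO buffer whose push and pop are serialized by a single lock."""

    def __init__(self):
        self._items = deque()
        # Only the thread holding the lock (the "key") may touch the data.
        self._lock = threading.Lock()

    def push(self, item):
        with self._lock:
            self._items.append(item)

    def pop(self):
        """Return the oldest item, or None when the buffer is empty."""
        with self._lock:
            return self._items.popleft() if self._items else None
```

One such buffer per data stream (image and audio) would let the robotic side push frames while the processing side pops them, without either seeing a half-written entry.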
18
Communications Layer (cont.)
- Controls the flow of data between the robotic layer and the processing layer.
- Communicates with the devices.
- The Command class interfaces with all of the software packages.
- Interprets and sends commands to the robotic layer.
19
Real-time control system
- Manages input and output of data to the interface layer.
- Determines the robot’s actions during idle processing time.
- Controls the robot to follow a face once detected.
- Controls programmed responses to the robotic interface: audio, optical, and mechanical.
20
Realization of Requirements
Ease of Use
- Centralized control
- Minimize connections needed (connections <= 2)
- Standardized adaptors
Cost
- School funded expenses = $102.00
- Team funded expenses = $420.00
- Previously owned = $200.00
- Total = $722.00
Efficiency
- Performs audio and image processing in less than 0.5 seconds.
21
Realization of Constraints (cont)
Economical
- Total expenses exceeded the initial budget of $250.
Environmental
- The audio system is sensitive to background noise.
- The imaging system works best in ideal lighting conditions. It also works best when the user looks directly at the camera from a reasonable distance (3-4 feet).
Accuracy
- The overall accuracy rate of our system is close to 80%, which is the target we initially set.
22
Conclusions / Questions
The next evolutionary step would be to expand the interaction between humans and machines to a social interface. This system was created to help bridge the gap between humans and computers. Someday this may change what computers mean in our everyday lives.