Adaptive Voice Filtering for a Conference Call Setting

Group 6
Authors: Leon Hermans EE '09, Tsanyu Jay Huang EE '08
Advisors: Dr. Saleem A. Kassam
Special Thanks: Dr. Xinyu Liu, Siddarth Deliwala
Demo Times: Thursday, April 24, 2008, 10:30 AM to 12:00 PM and 1:30 to 3:00 PM
University of Pennsylvania, Dept. of Electrical and Systems Engineering
See report for image citations.

Abstract
In today's expanding business environment, conference call technology has become an integral tool for communication between remote offices. A limitation of conference calls, however, is that listeners cannot distinguish individual voices when several speakers talk simultaneously through the conference phone, which results in confusion and inefficiency. The goal of this project is to implement an adaptive noise filtering system capable of isolating the voices of two speakers in a conference room setting in real time with little to no delay. This will be accomplished through blind source separation (BSS), a digital signal processing technique capable of performing the voice separation. It is anticipated that the system will be able to process two live speakers and separate their individual voices in approximately real time, regardless of their orientation to the microphones.

[Figure: Top Level System Block Diagram]

Adaptive BSS Model: EASI Algorithm
The objective of the EASI algorithm is to calculate a separating matrix B_n that converges to a matrix B such that B*A is very close to the identity matrix. This cannot be verified in real time without prior knowledge of the mixing matrix A.

[Figure: BSS Block Diagram. Signal path: Source Signals s[n], Mixing matrix A, Received Signals x[n], Separating matrix B_n, Estimated Signals y[n]]
[Update equation for B_n, with terms annotated: Previous Separating Matrix; Measures output signal correlation; Measures output signal independence]

The vector x = [x_1, x_2, …, x_n] represents the observed signal mixtures. In our demo, x contains the additive signal mixtures collected with the microphones.
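The mixing model in the block diagram (received signals x[n] formed from source signals s[n] through the mixing matrix A) can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the function name `mix`, the 2x2 matrix values, the 8 kHz rate, and the sinusoidal stand-in sources are all assumptions chosen for the example.

```python
import math

def mix(A, s):
    """Return x = A s for one sample instant.

    Each entry of x is one microphone's signal: a weighted sum of all sources.
    """
    return [sum(a_ij * s_j for a_ij, s_j in zip(row, s)) for row in A]

# Hypothetical 2x2 mixing matrix: each microphone hears its own speaker
# at full level and the other speaker at a reduced level.
A = [[1.0, 0.6],
     [0.5, 1.0]]

# Toy sources: two sinusoids standing in for two speakers (assumed values).
fs = 8000  # assumed sample rate in Hz, for illustration only
sources = [
    [math.sin(2 * math.pi * 220 * n / fs), math.sin(2 * math.pi * 330 * n / fs)]
    for n in range(4)
]

# One mixture vector x[n] per sample instant.
mixtures = [mix(A, s) for s in sources]
```

In the blind setting described here, only `mixtures` would be available to the separation algorithm; `A` and `sources` are unknown and appear only so the example can generate data.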
The recovered components approximating the source signals are contained in the vector y = [y_1, y_2, …, y_n]. The linear transformation matrix B maps the observed vector x to the vector y, whose components are as statistically independent as possible.

[Figure labels: Conference Room, Remote User, Conference Phone, CPU, DSP; panels: With Filtering, Without Filtering]

The conference room will have one microphone placed in front of each local user. Because the microphones are close to every sound source, sound produced by any user is captured by all microphones, which produces mixtures when multiple users speak simultaneously. One important factor considered in the design phase was the delay observed between the microphones due to the finite speed of sound (approximately 344 m/s). This delay was calculated to be at most 15 ms between the ends of a five-meter-long conference table, well below the industry standard of 150 ms of acceptable delay for most user applications in the telecommunications field.

MATLAB Implementation
An EASI BSS implementation was coded in MATLAB and run with pre-recorded, artificially mixed inputs. The mixtures were generated by combining WAV files according to a randomly generated matrix. This matrix was stored in order to compute the final B*A coefficients but was not used in calculating the separating matrix B. The output was presented as a collection of separated WAV files and a final B matrix. Apart from differences in scale factor, the results closely matched the expected B*A = I (identity matrix). Each row of B*A corresponds to a source that the EASI algorithm separated from the given mixtures. If n input mixtures are used, the algorithm produces n outputs.

The TMS320C6713 is a high-performance floating-point digital signal processor suitable for multi-channel and multi-function applications. The board operates at 225 MHz and takes 24-bit samples at 96 kHz from up to two mono input channels or one stereo input channel.
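As a quick check of the delay figure quoted in the conference-room description, the worst-case inter-microphone delay can be recomputed and, using the 96 kHz sampling rate stated above, expressed in samples. This is a small sketch; the variable names are hypothetical, and the conversion to samples is an added illustration rather than a figure from the poster.

```python
SPEED_OF_SOUND_M_S = 344.0   # value quoted on the poster
TABLE_LENGTH_M = 5.0         # end-to-end conference table length from the poster
SAMPLE_RATE_HZ = 96_000      # board sampling rate stated above
ACCEPTABLE_DELAY_MS = 150.0  # telecom delay budget quoted on the poster

# Time for sound to travel the full length of the table.
delay_s = TABLE_LENGTH_M / SPEED_OF_SOUND_M_S
delay_ms = delay_s * 1000.0

# The same delay expressed in sample periods at the board's rate.
delay_samples = round(delay_s * SAMPLE_RATE_HZ)

print(f"worst-case delay: {delay_ms:.1f} ms ({delay_samples} samples at 96 kHz)")
print("within budget:", delay_ms < ACCEPTABLE_DELAY_MS)
```

The result is about 14.5 ms, consistent with the "at most 15 ms" stated in the text.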
The addition of the AUDIO_4 Daughtercard expands the board to accept four synchronized 16-bit mono channels (or two stereo channels) simultaneously, using the daughtercard's onboard oscillator. The daughtercard samples at a reduced rate of 48 kHz and includes selectable 20 dB preamps for microphone input sources.

[Plot: Single row of B*A coefficients plotted over the entire sample, using a non-optimal step size (lambda) and g function]
[Plot: Single row of B*A coefficients plotted over the entire sample, using the optimal step size (lambda) and g function. Optimality was determined from multiple trial runs across several g functions with varying step sizes.]
[Figure: TI C6713 DSK Digital Signal Processing Board & AUDIO_4 Daughtercard]
[Figure: I/O Data Flow Diagram for AUDIO_4 Daughter Card (one of two available codecs is shown)]
[Figure: I/O Port Mapping for AUDIO_4 Daughter Card]

C6713 Implementation
The EASI algorithm was coded in C for easy conversion and implementation onto the TI C6713 DSP board. Different procedures were developed for testing the success of the algorithm:

LINE IN: Mixtures are prerecorded, artificially mixed two-channel WAV files.
Advantages: Accurate output, steady b values.
Disadvantages: No more than two sources unless a multi-channel (3+) mixer is used. Signals must be premixed, so this procedure cannot be used in real time.

MIC IN: Live recording of multiple sound files played through speakers.
Advantages: The number of sources is scalable, given the necessary hardware and software support and enough speakers and microphones.
Disadvantages: Very susceptible to outside noise; poor performance on non-dominant speakers.
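For illustration, one per-sample EASI update can be sketched as below. This is written in Python rather than the authors' C or MATLAB, and it assumes the standard form of the update from the EASI literature, B_{n+1} = B_n - lambda * [y y^T - I + g(y) y^T - y g(y)^T] * B_n, which appears to match the annotated terms (previous separating matrix, correlation measure, independence measure). The function name `easi_step`, the cubic g, the step size, and the sample values are assumptions; the poster states that several g functions and step sizes were tried and does not say which were selected.

```python
def easi_step(B, x, lam, g=lambda u: u ** 3):
    """Apply one EASI update for a single mixture sample x.

    B   : current separating matrix (list of rows)
    x   : observed mixture vector for this sample
    lam : step size (lambda)
    g   : nonlinearity; the cubic default is an assumed, common choice
    Returns (updated B, estimated sources y = B x).
    """
    n = len(x)
    y = [sum(B[i][k] * x[k] for k in range(n)) for i in range(n)]
    gy = [g(v) for v in y]

    # H = (y y^T - I) + (g(y) y^T - y g(y)^T)
    # first part: output correlation; second part: output independence.
    H = [[y[i] * y[j] - (1.0 if i == j else 0.0)
          + gy[i] * y[j] - y[i] * gy[j]
          for j in range(n)] for i in range(n)]

    # B_new = B - lam * H B (multiplicative update on the previous matrix).
    HB = [[sum(H[i][k] * B[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    B_new = [[B[i][j] - lam * HB[i][j] for j in range(n)] for i in range(n)]
    return B_new, y


# Usage sketch: stream a few hypothetical two-channel mixture samples.
B = [[1.0, 0.0], [0.0, 1.0]]          # start from the identity matrix
samples = [[0.3, -0.1], [0.5, 0.4], [-0.2, 0.6]]  # made-up mixture values
outputs = []
for x in samples:
    B, y = easi_step(B, x, lam=0.01)
    outputs.append(y)
```

In a test with known mixing, as in the LINE IN procedure, the product B*A could then be compared against a scaled identity matrix; with live microphone input, A is unknown and that check is not available.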