Education and Research in the Center for Signal and Image Processing http://www.eedsp.gatech.edu/
CSIP Summary. Our Ph.D. graduates have impact worldwide in DSP education and research. Distinguished faculty: 17 faculty (7 IEEE Fellows, 2 National Academy members); co-authors of over 25 books on DSP and its applications. Over 80 current Ph.D. students. Located in the GCATT building with excellent, modern facilities. Support from the Georgia Research Alliance has provided outstanding, well-equipped labs.
Beowulf Cluster 26 dual processors 1 Gbyte memories
CSIP Faculty: Yucel Altunbasak, David V. Anderson, Thomas P. Barnwell, Mark A. Clements, Faramarz Fekri, Monson H. Hayes, Joel R. Jackson, Fred Juang, Aaron Lanterman, Chin Lee, Vijay K. Madisetti, Francois Malassenet, James H. McClellan, Russell M. Mersereau, Ronald W. Schafer, Douglas B. Williams, G. Tong Zhou
Past and Present Funding. Industry: Texas Instruments, Intel, BAE Systems, Hewlett-Packard, Mathworks, National Semiconductor, Analog Devices, Lucent, Harris, Hughes, Prentice-Hall. Federal: NSF, U.S. Army, DARPA, ONR, NASA, MPO. State: Georgia Research Alliance. Private Foundation: John and Mary Franklin Foundation. Total Funding: Current funding from government and industry totals about $6.5M.
Current Research Areas - I. Speech Processing: robust automatic speech recognition; new architectures for speech recognition; high-quality low-bit-rate speech coding for voice over IP; blind separation of speech signals. Audio Signal Processing: music analysis and synthesis; compressed-domain processing of audio. Acoustic Signal Processing: noise and reverberation removal; microphone array processing; spatialization.
Current Research Areas - II. Video Signal Processing: target tracking in video; video streaming with error concealment and MDC; graphics streaming for the Internet; automated analysis of video; video indexing for smart VCR; super-resolution of video; face recognition; video compression. Image Processing: image-based graphical rendering; image interpolation for digital color cameras.
Current Research Areas - III. Multimedia & Multi-modal Signal Processing: “Intelligent Environments”; automatic storage/retrieval of speech and audio; audio-visual speech recognition; speech-driven facial animation; application of multimedia processing in education. Communications Signal Processing: chaos in wireless communication systems; space-time coding and OFDM; compensation for selective fading effects; finite field wavelet transforms and applications to error control coding and cryptography; compensation of nonlinear power amps.
Current Research Areas - IV. Signal Modeling: multi-scale sinusoidal modeling. Biological Signal Processing: automated measurement and modeling of behavior in biological systems. Military Signal Processing: buried mine detection using GPR, seismic & EMI; target tracking in sensor networks; hyperspectral imaging and target classification; SAR imaging. Medical Signal Processing: segmentation of cardiac MRI images. DSP for hand-held communication devices.
Industrial Partnership Examples. Texas Instruments Leadership Univ. Program: members with MIT and Rice U.; seven projects with 7 faculty and 7 Ph.D. students; topics: wireless video, CFA interpolation, speech coding, speech recognition, chaotic systems, face recognition, MIMO communication systems. Hewlett Packard Laboratories: four faculty and six students; focus on PDAs: low-power analog front-ends, structured audio, applications in education; also 3D video for video conferencing and an HP Labs researcher in residence.
Linearization of RF Power Amplifiers. G. Tong Zhou, J. Stevenson Kenney. Power amplifiers (PAs) are inherently nonlinear. Desire: high-efficiency PAs, leading to low cost. Downside of high efficiency: high nonlinearity. Nonlinearity causes (1) high bit error rate and (2) adjacent channel interference, which must satisfy FCC requirements. Approach: DSP-based predistortion linearization. Challenging issue: nonlinear memory effects in high-power amplifiers (e.g., base station PAs). The indirect learning architecture adapts to changing PA characteristics. [Figure: RF testbed]
Indirect Learning Architecture. [Figure: block diagram of the indirect learning architecture, including an A/D stage] Advantage: No need to model or identify the PA.
8-Tone Test Result. 8-tone, 1.2 MHz signal; Siemens CGY0819 dual-band PA. Plot legend: purple: without predistortion (PD); green: with memoryless PD (K=7); cyan: with memory polynomial PD (K=7, Q=10). Result: 35 dB of spectral regrowth suppression with memory polynomial PD.
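The memory polynomial predistorter and the indirect learning idea from the preceding slides can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: `toy_pa` is a hypothetical amplifier model invented for the demo, the least-squares fit stands in for whatever adaptation the testbed uses, and the orders K=5, Q=2 are chosen for speed rather than the K=7, Q=10 quoted on the slide.

```python
import numpy as np

def memory_polynomial_basis(x, K, Q):
    """Columns x(n-q) * |x(n-q)|**(k-1) for q = 0..Q and k = 1..K."""
    x = np.asarray(x, dtype=complex)
    columns = []
    for q in range(Q + 1):
        # Delay by q samples, padding the start with zeros.
        delayed = np.concatenate([np.zeros(q, dtype=complex), x[:len(x) - q]])
        for k in range(1, K + 1):
            columns.append(delayed * np.abs(delayed) ** (k - 1))
    return np.column_stack(columns)

def toy_pa(x):
    """Hypothetical PA (assumption): unit gain, cubic compression, one-sample memory."""
    x = np.asarray(x, dtype=complex)
    prev = np.concatenate([np.zeros(1, dtype=complex), x[:-1]])
    return (x - 0.10 * x * np.abs(x) ** 2
            + 0.05 * prev - 0.02 * prev * np.abs(prev) ** 2)

def train_predistorter(pa_input, pa_output, K, Q, gain=1.0):
    """Indirect learning: fit a post-inverse mapping the gain-normalized PA
    output back to the PA input; its coefficients are reused as the predistorter."""
    basis = memory_polynomial_basis(np.asarray(pa_output) / gain, K, Q)
    target = np.asarray(pa_input, dtype=complex)
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return coeffs

def predistort(x, coeffs, K, Q):
    """Apply the memory polynomial with the trained coefficients."""
    return memory_polynomial_basis(x, K, Q) @ coeffs

def nmse_db(reference, measured):
    """Normalized mean squared error in dB."""
    err = np.sum(np.abs(measured - reference) ** 2)
    return 10 * np.log10(err / np.sum(np.abs(reference) ** 2))

def demo(K=5, Q=2, n=4000, seed=0):
    rng = np.random.default_rng(seed)
    # Random complex baseband test signal with bounded amplitude.
    x = rng.uniform(0.0, 0.8, n) * np.exp(2j * np.pi * rng.uniform(size=n))
    coeffs = train_predistorter(x, toy_pa(x), K, Q)
    before = nmse_db(x, toy_pa(x))                            # PA alone
    after = nmse_db(x, toy_pa(predistort(x, coeffs, K, Q)))   # PD followed by PA
    return before, after

before_db, after_db = demo()
print(f"NMSE without PD: {before_db:.1f} dB, with PD: {after_db:.1f} dB")
```

Note that the fit never builds a model of the PA itself: only the PA's input and output samples are used, which is the advantage the slide attributes to the indirect learning architecture.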
Video Resolution Enhancement. Y. Altunbasak and R. Mersereau. Future broadcasting will be all digital, and high-definition displays will dominate the market. However, most programming is expected to be in SDTV format. There is a clear need and technical opportunity to design systems that enhance the quality of the SDTV signal so that it matches the quality and capabilities of high-definition displays. [Figure: multi-frame spatial video resolution enhancement; labels: NTSC set, SDTV, HDTV, PC, PC monitor]
Applications - Digital Cameras. [Figure: multiple successive pictures (JPEG format) and the reconstructed high-resolution picture] Also applicable to high-quality printing from video sources such as DVD players, set-top boxes, TV sets, software MPEG players, and camcorders; this requires a resolution-enhancing print driver.
Face Recognition Monson H. Hayes Major problem is lighting and pose variations.
Results and Next Step. We have developed a new face recognition system based on a segmented linear subspace model. It is robust to varying illumination and tolerant of different poses, has recognition accuracy (>99%) equaling or exceeding other state-of-the-art systems, and has a fraction of the complexity. Next Step: Face Recognition from Video. Face detection (patent awarded); pose detection (find the best frontal view); face recognition (robust to varying illumination, poses, and facial expressions). The Intriguing Question: How can we incorporate the multitude of images extracted from video to enhance the recognition system?
Finite Field Wavelet Transforms. F. Fekri and D. Williams. Goal: Establishment of a new research field that brings together researchers from signal processing, error control coding, data security, and multicarrier signaling systems. [Figure: Finite Field Wavelets linked to Error Control Coding, OFDM Modulation, and Security Coding] Research plan: finite field wavelets play a central role in coding, data security, and multiuser access, and combine them under a unifying theory. The proposed program will systematically explore these application areas, and collaboration with other faculty members is encouraged and welcome.
New Research Directions in Data Security. [Figure: two-dimensional wavelet decomposition applied row-wise and column-wise, producing LL, LH, HL, and HH subbands] New Research Directions in Error Control Coding.
Passive Radar Systems. Aaron Lanterman. Exploit “illuminators of opportunity” such as commercial TV and FM radio broadcasts for covert operation. [Figure: passive radar system block diagram; labels: Target Tracking (Positions, Velocities), Radar Imaging, Radar Cross Section, Target Classification, Signature Prediction via Computational EM, Target Library]
Imaging With 100.0 on Your FM Dial. [Figure: panels labeled Target Shape, Formatted Raw Data, and Image Formed Via Processing, for the Falcon-100 and VFY-218]
Detection of Obscured Targets. Jim McClellan & Waymond Scott. Landmines: No single sensor has proven capable of reliable detection across many types of “targets.” Can multiple sensors be used cooperatively to produce a system with robust performance? A three-sensor experiment: Electromagnetic Induction (EMI) sensor, Ground Penetrating Radar (GPR) sensor, seismic sensor. Multimodal processing: imaging & inversion; cooperative fusion of multiple sensors.
EMI Sensor and GPR. EMI Sensor: 0.6 - 60 kHz. GPR: 500 MHz – 8 GHz. [Figure: sensor head with Tx and Rx elements; dimension label 4.5”] [Table: sensor response by Physical Properties of Target; headings: Permittivity Contrast, Low Conductivity (Dielectric), High Conductivity (Metal), Mechanical Contrast; rows: EMI, GPR, Seismic; surviving cell values in extraction order: EMI No Weak Yes GPR Yes* Seismic]
Seismic Sensor: Surface Waves Man-made items often resonate
Comparison of EMI, GPR and Seismic Responses: VS-1.6, 6.5 cm deep. [Figure: EMI, GPR, and seismic response plots; axis labels: x, y, depth, t]
Comparison of EMI, GPR & Seismic Responses: Uncrushed Aluminum Can, 2 cm deep. [Figure: EMI, GPR, and seismic response plots; axis labels: x, y, depth, t]
Cooperative Analog/Digital Signal Processing. D. Anderson and P. Hasler. Target: Complex signal processing functionality with extremely low power. Approach: Perform substantial amounts of the processing in programmable analog VLSI. [Figure: two signal chains; conventional chain labels: Real world (analog), A/D Converter, DSP Processor, Computer (digital); cooperative chain labels: Real world (analog), ASP IC, Specialized A/D, DSP Processor, Computer (digital)]
Cooperative Analog/Digital Signal Processing. Advantages of CADSP: better problem “fit”; orders-of-magnitude improvement in power consumption / efficiency; simpler A/D converter requirements; smaller size. Current applications include: audio noise suppression; audio source localization / beam-steering; focal plane image / video processing; speech recognition; field-programmable analog processor arrays.
Digital Media Asset Management. Mark Clements. Sam Nunn Archives: cooperative effort between CSIP, IMTC, GT and Emory Libraries. Fast searching of audio based on phonetic content. Typical search speed: 72,000x real time (20 hours of content searched in 1 elapsed second). Basis for the startup company Fast-Talk, which has received over $10M in venture funding. New results demonstrate rapid searching of music by lyrics and melodies using the same approach. The speech part can be the basis for voice-mail management, data mining, call center monitoring and alerting, market research, and distance learning tools. The music part can be used for indexing content by melody, detecting copyright infringement, and accessing music by “humming a tune.”
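The 72,000x figure follows directly from the numbers on the slide: 20 hours is 72,000 seconds of audio, searched in 1 second. A small sketch of that arithmetic (the helper names are illustrative, not part of any Fast-Talk API):

```python
def search_speedup(content_hours, elapsed_seconds):
    """Ratio of audio duration searched to wall-clock search time."""
    return content_hours * 3600.0 / elapsed_seconds

def search_time_seconds(content_hours, speedup):
    """Wall-clock time needed to search an archive at a given speedup."""
    return content_hours * 3600.0 / speedup

print(search_speedup(20, 1))             # 72000.0
print(search_time_seconds(1000, 72000))  # 50.0
```

At the quoted rate, a hypothetical 1,000-hour archive would take about 50 seconds to search.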
An Integrated Auditory-Cognitive Model. [Figure: block diagram; labels: speech (input), Auditory Model, Neural Transduction Model, 3-D Cortical Representation (axes s, f), Cortical Scene Analysis (Phonemic Detection), Cortical Scene Analysis (Phonological Tracking), Multi-target tracking, Reinforcement, Language Model, Sound Units, Segment Units, Semantics & Schema, Syntactic & Semantic Analysis (error correcting), Re-generation, Recognition results, Understanding results, Enhancement results]
Immersive Telecollaboration. Presentation: capturing, transmission, and reconstruction of audio and visual information (conventional view); projection and rendering of the interaction in a 3-dimensional space (virtual view). Participation: coexistence of all participants in a shared virtual space (“shared reality”); control and manipulation of shared virtual objects (“virtual collaboration” for hands-on experience).
Perceptual Spatialization. Sound spatialization makes talker-tracking easier in multi-party conferencing environments, resulting in more effective communication. Binaural Hearing & Cocktail Party Effect: spatial separation plays a role (compare mono with stereo); stream segregation also plays a role (compare one talker (m1+m2) with two (m1+f2)). Stereophonic Conferencing Demonstration (m1, m2, f2).
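Multi-channel Source Separation. [Figure: two sources s1, s2 pass through the mixing system (room impulse responses) to give microphone signals x1, x2; un-mixing filters W11, W12, W21, W22 produce the outputs s'1, s'2] One possible approach (Ikram of Gatech and Morgan of Bell Labs): with the mixing model $x = H s$, the observation covariance $R' = x x^H$, and the un-mixed output $s' = W x$, find the un-mixing filter matrix $W$ such that the covariance of $s'$, $W R' W^H$, is diagonalized, by minimizing the squared Frobenius norm of its off-diagonal part.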
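The diagonalization criterion on this slide can be illustrated by evaluating the off-diagonal cost for a candidate un-mixing matrix. The sketch below makes simplifying assumptions that are not in the slide: an instantaneous (non-convolutive) 2x2 mixture, a made-up mixing matrix `H`, and exact covariances built from assumed source powers over three time blocks. It only evaluates the cost and does not perform the minimization.

```python
import numpy as np

def offdiag_cost(W, covariances):
    """Sum over blocks of the squared Frobenius norm of the off-diagonal
    part of W R W^H, where R is the observation covariance of that block."""
    total = 0.0
    for R in covariances:
        C = W @ R @ W.conj().T
        off = C - np.diag(np.diag(C))     # zero out the diagonal
        total += np.linalg.norm(off, "fro") ** 2
    return total

def demo():
    # Hypothetical instantaneous mixing matrix (assumption for illustration).
    H = np.array([[1.0, 0.5],
                  [0.3, 1.0]])
    # Assumed source powers in three blocks; independent sources give
    # diagonal source covariances, so R = H diag(p) H^H.
    source_powers = [(1.0, 0.2), (0.3, 1.5), (2.0, 0.7)]
    covs = [H @ np.diag(p) @ H.conj().T for p in source_powers]
    cost_unseparated = offdiag_cost(np.eye(2), covs)        # W = I
    cost_separated = offdiag_cost(np.linalg.inv(H), covs)   # W = H^{-1}
    return cost_unseparated, cost_separated

unseparated, separated = demo()
print(unseparated, separated)
```

With W equal to the inverse of the mixing matrix the cost drops to numerical zero, while leaving the mixture untouched (W = I) gives a clearly positive cost. A separation algorithm would search for W that drives this cost down without knowing H.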
Sound Source Localization. 1. Time Delay Estimation. 2. Source Location Estimation. Various methods: triangulation - solve a set of hyperbolic equations; spherical intersection - solve a set of linearized spherical equations; spherical interpolation - similar to spherical intersection, but with reduced constraint; one-step least squares - transforms the problem into an estimation/minimization problem; works the best. [Figure labels: talker; Further challenge] Applications: conferencing with participant tracking; improved sound and sight pickup. Developed at Bell Labs & Georgia Tech.
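Step 1, time delay estimation, can be sketched with a plain cross-correlation peak search between two microphone signals. This is an assumed baseline technique for illustration, not necessarily the estimator used in the work described; the function names, the synthetic white-noise source, and the 343 m/s speed of sound are assumptions.

```python
import numpy as np

def estimate_delay_samples(reference, delayed):
    """Integer-sample delay of `delayed` relative to `reference`,
    taken as the lag of the cross-correlation peak."""
    reference = np.asarray(reference, dtype=float)
    delayed = np.asarray(delayed, dtype=float)
    xcorr = np.correlate(delayed, reference, mode="full")
    # Output index 0 of mode="full" corresponds to lag -(len(reference) - 1).
    lags = np.arange(-(len(reference) - 1), len(delayed))
    return int(lags[np.argmax(xcorr)])

def delay_to_range_difference(delay_samples, sample_rate_hz, speed_of_sound=343.0):
    """Convert a delay in samples to a path-length difference in meters."""
    return delay_samples / sample_rate_hz * speed_of_sound

def demo(true_delay=7, n=2000, seed=0):
    rng = np.random.default_rng(seed)
    source = rng.standard_normal(n)                      # synthetic talker signal
    mic1 = source + 0.05 * rng.standard_normal(n)
    shifted = np.concatenate([np.zeros(true_delay), source[:n - true_delay]])
    mic2 = shifted + 0.05 * rng.standard_normal(n)
    return estimate_delay_samples(mic1, mic2)

print(demo())  # estimated delay in samples between the two microphones
```

Each microphone pair yields one range difference, which constrains the source to a hyperboloid; step 2 combines several such constraints (by triangulation, spherical intersection, spherical interpolation, or one-step least squares) to estimate the source location.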
Summary. The premier academic program in the country in the signal processing field is in the Georgia Tech School of Electrical and Computer Engineering. We have many outstanding graduate students (internships, long-term contributors). We have lots of outstanding technology waiting to be developed. We have a demonstrated capability to work with industry. Contact jim.mcclellan@ece.gatech.edu if you want to come for a visit.