Education and Research in the Center for Signal and Image Processing

Education and Research in the Center for Signal and Image Processing http://www.eedsp.gatech.edu/

CSIP Summary. Our Ph.D. graduates have impact worldwide in DSP education and research. Distinguished faculty: 17 faculty (7 IEEE Fellows, 2 National Academy members); co-authors of over 25 books on DSP and its applications. Over 80 current Ph.D. students. Located in the GCATT building with an excellent, modern facility. Support from the Georgia Research Alliance has provided outstanding, well-equipped labs.

Beowulf Cluster: 26 dual processors, 1 Gbyte memories.

CSIP Faculty: Yucel Altunbasak, David V. Anderson, Thomas P. Barnwell, Mark A. Clements, Faramarz Fekri, Monson H. Hayes, Joel R. Jackson, Fred Juang, Aaron Lanterman, Chin Lee, Vijay K. Madisetti, Francois Malassenet, James H. McClellan, Russell M. Mersereau, Ronald W. Schafer, Douglas B. Williams, G. Tong Zhou.

Past and Present Funding. Industry: Texas Instruments, Intel, BAE Systems, Hewlett-Packard, Mathworks, National Semiconductor, Analog Devices, Lucent, Harris, Hughes, Prentice-Hall. Federal: NSF, U.S. Army, DARPA, ONR, NASA, MPO. State: Georgia Research Alliance. Private foundation: John and Mary Franklin Foundation. Total funding: current funding from government and industry totals about $6.5M.

Current Research Areas - I. Speech Processing: robust automatic speech recognition; new architectures for speech recognition; high-quality low-bit-rate speech coding for voice over IP; blind separation of speech signals. Audio Signal Processing: music analysis and synthesis; compressed-domain processing of audio. Acoustic Signal Processing: noise and reverberation removal; microphone array processing; spatialization.

Current Research Areas - II. Video Signal Processing: target tracking in video; video streaming with error concealment and MDC; graphics streaming for the Internet; automated analysis of video; video indexing for smart VCR; super-resolution of video; face recognition; video compression. Image Processing: image-based graphical rendering; image interpolation for digital color cameras.

Current Research Areas - III. Multimedia & Multi-modal Signal Processing: “Intelligent Environments”; automatic storage/retrieval of speech and audio; audio-visual speech recognition; speech-driven facial animation; application of multimedia processing in education. Communications Signal Processing: chaos in wireless communication systems; space-time coding and OFDM; compensation for selective fading effects; finite field wavelet transforms and applications to error control coding and cryptography; compensation of nonlinear power amps.

Current Research Areas - IV. Signal Modeling: multi-scale sinusoidal modeling. Biological Signal Processing: automated measurement and modeling of behavior in biological systems. Military Signal Processing: buried mine detection using GPR, seismic & EMI; target tracking in sensor networks; hyperspectral imaging and target classification; SAR imaging. Medical Signal Processing: segmentation of cardiac MRI images. DSP for hand-held communication devices.

Industrial Partnership Examples. Texas Instruments Leadership Univ. Program: members with MIT and Rice U.; seven projects (7 faculty and 7 Ph.D. students) covering wireless video, CFA interpolation, speech coding, speech recognition, chaotic systems, face recognition, and MIMO communication systems. Hewlett Packard Laboratories: four faculty and six students; focus on PDAs (low-power analog front-ends, structured audio, applications in education); also 3D video for video conferencing and an HP Labs researcher in residence.

Linearization of RF Power Amplifiers (G. Tong Zhou, J. Stevenson Kenney). Power amplifiers (PAs) are inherently nonlinear. Desire: high-efficiency PAs, leading to low cost. Downside of high efficiency: high nonlinearity. Nonlinearity causes (1) a high bit error rate and (2) adjacent channel interference, which must satisfy FCC requirements. Approach: DSP-based predistortion linearization. Challenging issue: memory nonlinear effects in high-power amplifiers (e.g., base station PAs). The Indirect Learning Architecture adapts to changing characteristics. Figure label: RF testbed.

Indirect Learning Architecture. Advantage: No need to model or identify the PA.

8-Tone Test Result. 8-tone, 1.2 MHz signal, Siemens CGY0819 dual-band PA. Purple: w/o PD; green: w/ memoryless PD (K=7); cyan: w/ memory polynomial PD (K=7, Q=10). 35 dB of spectral regrowth suppression w/ memory polynomial PD.
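A memory polynomial predistorter is commonly written as z(n) = sum over k = 1..K and q = 0..Q of a_kq x(n-q) |x(n-q)|^(k-1), where K is the nonlinearity order and Q the memory depth. The following Python sketch evaluates that form, assuming this common formulation matches the one behind the slide. The function name and the coefficient values are hypothetical, and the step that estimates the coefficients (the indirect learning fit) is not shown.

```python
def memory_polynomial(x, coeffs):
    """Evaluate z[n] = sum_k sum_q coeffs[k][q] * x[n-q] * |x[n-q]|**k.

    x      : list of complex baseband samples
    coeffs : coeffs[k][q] is an illustrative coefficient for nonlinearity
             order k+1 (k = 0..K-1) and memory tap q (q = 0..Q)
    Samples before the start of x are treated as zero.
    """
    out = []
    for n in range(len(x)):
        acc = 0j
        for k, taps in enumerate(coeffs):
            for q, a in enumerate(taps):
                if n - q >= 0:
                    s = x[n - q]
                    acc += a * s * abs(s) ** k
        out.append(acc)
    return out


# Identity predistorter: only the linear, zero-delay coefficient is non-zero.
identity = [[1.0, 0.0], [0.0, 0.0]]
signal = [1 + 1j, 0.5j, -0.25]
predistorted = memory_polynomial(signal, identity)
```

With the identity coefficients the output equals the input; a real predistorter would use K x (Q+1) fitted coefficients, for example K=7 and Q=10 as in the test above.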

Video Resolution Enhancement (Y. Altunbasak and R. Mersereau). Future broadcasting will be all digital. High definition displays will dominate the market. However, most programming is expected to be in SDTV format. There is a clear need and technical opportunity to design systems to enhance the quality of the SDTV signal so that it matches the quality and capabilities of high definition displays. Diagram labels: SDTV, NTSC set, multi-frame spatial video resolution enhancement, PC, PC monitor, HDTV.

Applications - Digital Cameras. Figure labels: subsequent multiple pictures (JPEG format); reconstructed high-resolution picture. Also applicable to high-quality printing from video sources such as DVD players, set-top boxes, TV sets, software MPEG players and camcorders. Requires a resolution-enhancing print driver.

Face Recognition (Monson H. Hayes). The major problem is lighting and pose variations.

Results and Next Step. We have developed a new face recognition system based on a segmented linear subspace model that is robust to varying illuminations and tolerant to different poses, has recognition accuracy equaling or exceeding (>99%) other state-of-the-art systems, and has a fraction of the complexity. Next step: face recognition from video. Face detection (patent awarded). Pose detection (find best frontal view). Face recognition (robust to varying illuminations, poses, facial expressions). The intriguing question: how can we incorporate the multitude of images that are extracted from video to enhance the recognition system?

Finite Field Wavelet Transforms (F. Fekri and D. Williams). Goal: Establishment of a new research field that brings together researchers from signal processing, error control coding, data security and multicarrier signaling systems. I would like to conclude with this slide, which summarizes my research plan: the finite field wavelet plays a central role in coding, data security and multiuser access, combining them under a unifying theory. I intend to propose a program that will systematically explore these application areas, and I encourage and welcome collaboration with other faculty members. Diagram labels: Error Control Coding, Finite Field Wavelets, OFDM Modulation, Security Coding.

New Research Directions in Data Security. New Research Directions in Error Control Coding. Figure labels: row-wise and column-wise decomposition into LL, LH, HL and HH subbands.

Passive Radar Systems (Aaron Lanterman). Exploit “illuminators of opportunity” such as commercial TV and FM radio broadcasts for covert operation. Diagram labels: Passive Radar System; Target Tracking (positions, velocities); Radar Imaging; Radar Cross Section; Target Classification; Signature Prediction via Computational EM; Target Library.

Imaging With 100.0 on Your FM Dial. Figure labels: target shape; formatted raw data; image formed via processing. Targets: Falcon-100, VFY-218.

Detection of Obscured Targets (Jim McClellan & Waymond Scott). Landmines: no single sensor has proven capable of reliable detection across many types of “targets.” Can multiple sensors be used cooperatively to produce a system with robust performance? A three-sensor experiment: Electromagnetic Induction (EMI) sensor, Ground Penetrating Radar (GPR) sensor, seismic sensor. Multimodal processing: imaging & inversion; cooperative fusion of multiple sensors.

EMI Sensor and GPR. EMI sensor: 0.6 - 60 kHz. GPR: 500 MHz – 8 GHz. Figure labels: Tx, Rx, 4.5”. Table residue: sensor response (EMI, GPR, seismic) versus physical properties of the target (permittivity contrast, low conductivity (dielectric), high conductivity (metal), mechanical contrast); surviving entries: EMI: No, Weak, Yes; GPR: Yes*.

Seismic Sensor: Surface Waves Man-made items often resonate

Comparison of EMI, GPR and Seismic Responses: VS-1.6, 6.5 cm deep. Plot axes: x, y, depth, t.

Comparison of EMI, GPR & Seismic Responses: Uncrushed Aluminum Can, 2 cm deep. Plot axes: x, y, depth, t.

Cooperative Analog/Digital Signal Processing (D. Anderson and P. Hasler). Target: complex signal processing functionality with extremely low power. Approach: perform substantial amounts of the processing in programmable analog VLSI. Diagram labels, conventional chain: real world (analog), A/D converter, DSP processor, computer (digital). Diagram labels, cooperative chain: real world (analog), ASP IC, specialized A/D, DSP processor, computer (digital).

Cooperative Analog/Digital Signal Processing. Advantages of CADSP: better problem “fit”; orders of magnitude improvement in power consumption / efficiency; simpler A/D converter requirements; smaller size. Current applications include: audio noise suppression; audio source localization / beam-steering; focal plane image / video processing; speech recognition; field programmable analog processor arrays.

Digital Media Asset Management (Mark Clements). Sam Nunn Archives: cooperative effort between CSIP, IMTC, GT and Emory Libraries. Fast searching of audio based on phonetic content. Typical speed of search: 72,000x real time (20 hours of content searched in 1 elapsed second). Basis for startup company Fast-Talk, which has received over $10M in venture funding. New results demonstrate rapid searching of music by lyrics and melodies using the same approach. The speech part can be the basis for voice-mail management, data mining, call center monitoring and alerting, market research, and distance learning tools. The music part can be used for indexing content by melody, detecting copyright infringement, and accessing music by “humming a tune.”
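The quoted search speed can be checked with one line of arithmetic: 20 hours of content is 72,000 seconds, so searching it in 1 elapsed second is 72,000 times real time. A small Python check, with variable names chosen only for illustration:

```python
hours_of_content = 20
elapsed_seconds = 1

# Speed-up relative to real time = content duration / search duration.
content_seconds = hours_of_content * 3600
speedup = content_seconds / elapsed_seconds
print(speedup)  # 72000.0
```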

An Integrated Auditory-Cognitive Model. Block diagram labels: speech; Auditory Model; Neural Transduction Model; 3-D Cortical Representation (s, f); Cortical Scene Analysis (Phonemic Detection); Cortical Scene Analysis (Phonological Tracking); Multi-target tracking; Language Model; Sound Units; Segment Units; Semantics & Schema; Reinforcement; Syntactic & Semantic Analysis (error correcting); Re-generation. Outputs: understanding results, enhancement results, recognition results.

Immersive Telecollaboration. Presentation: capturing, transmission and reconstruction of audio and visual information (conventional view); projection and rendering of the interaction in a 3-dimensional space (virtual view). Participation: coexistence of all participants in a shared virtual space (“shared reality”); control and manipulation of shared virtual objects (“virtual collaboration” for hands-on experience).

Perceptual Spatialization. Sound spatialization makes talker-tracking easier in multi-party conferencing environments, resulting in improved effectiveness in communication. Binaural hearing and the cocktail party effect: spatial separation plays a role (compare mono with stereo); stream segregation also plays a role (compare one talker (m1+m2) with two (m1+f2)). Stereophonic conferencing demonstration.

Multi-channel Source Separation. Sources $s_1, s_2$ are mixed by the room impulse responses into the observations $x_1, x_2$; the un-mixing filters $W_{11}, W_{12}, W_{21}, W_{22}$ produce the estimates $s'_1, s'_2$. One possible approach (Ikram of Gatech and Morgan of Bell Labs): $x = H s$, $R_x = x x^H$, $s' = W x$. Find the un-mixing filter matrix $W$ such that $R_{s'} = W R_x W^H$ is diagonalized, by minimizing the squared Frobenius norm of the off-diagonal part of $R_{s'}$.
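The diagonalization criterion can be illustrated numerically. The sketch below assumes an instantaneous (non-convolutive) 2x2 mixture with uncorrelated unit-variance sources, so the observation covariance is H H^H; the real method works with room impulse responses, which this toy case ignores. The helper names and the matrix values are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def hermitian(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def offdiag_cost(w, r_x):
    """Squared Frobenius norm of the off-diagonal part of W R_x W^H."""
    r_s = matmul(matmul(w, r_x), hermitian(w))
    return sum(abs(r_s[i][j]) ** 2
               for i in range(len(r_s))
               for j in range(len(r_s)) if i != j)


# Illustrative mixing matrix and the resulting observation covariance.
h = [[1.0, 0.5], [0.5, 1.0]]
r_x = matmul(h, hermitian(h))

no_unmixing = [[1.0, 0.0], [0.0, 1.0]]
# Exact inverse of h: (1 / 0.75) * [[1, -0.5], [-0.5, 1]].
ideal_unmixing = [[4 / 3, -2 / 3], [-2 / 3, 4 / 3]]

cost_before = offdiag_cost(no_unmixing, r_x)
cost_after = offdiag_cost(ideal_unmixing, r_x)
```

Here `cost_before` is 2.0 because the mixed signals are correlated, while `cost_after` is zero up to rounding. An adaptive algorithm would search for such a W by descending this cost, with an extra constraint to rule out the trivial solution W = 0.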

Sound Source Localization. 1. Time delay estimation. 2. Source location estimation. Various methods: triangulation (solve a set of hyperbolic equations); spherical intersection (solve a set of linearized spherical equations); spherical interpolation (similar to SI, but with reduced constraint); one-step least squares (transforms the problem into an estimation/minimization problem; works the best). Further challenge: more than one talker. Applications: conferencing with participant tracking; improved sound and sight pickup. Developed at Bell Labs & Georgia Tech.
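The slide does not say how the time delay in step 1 is estimated. A common choice is to pick the lag that maximizes the cross-correlation between two microphone signals, and the Python sketch below shows that choice under that assumption. The function name, the test signals, and the sampling rate are hypothetical.

```python
def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) that maximizes sum_n ref[n] * sig[n + lag]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(ref[n] * sig[n + lag]
                    for n in range(len(ref))
                    if 0 <= n + lag < len(sig))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Second microphone hears the same pulse three samples later.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag = estimate_delay(mic_a, mic_b, max_lag=5)

# Convert to a range difference, assuming 8 kHz sampling and 343 m/s sound speed.
range_difference_m = lag / 8000 * 343.0
```

Each microphone pair yields one such range difference, which defines one of the hyperbolic equations that step 2 then solves for the source position.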

Low Complexity Rate-Distortion Optimal Coding Mode Selection (Hyungjoon Kim and Yucel Altunbasak). Proposed rate-distortion model, based on the General Gaussian R-D model: $D = \sigma^2 e^{-\alpha R}$, where $R$ is the rate for DCT coefficients, $D$ is the distortion, $\sigma$ is the standard deviation, and $\alpha$ is an adaptive model parameter. Model-based mode selection: for each candidate mode (Mode 0, Mode 1, ..., Mode N), the R-D cost is calculated, and the mode k with minimum cost is the best mode. Calculation of D has low computational complexity. Provides 10-15% bit-rate savings. Patented, licensed, and commercialized. Experimental results: 1. Distortion-based coding mode selection gives low compression efficiency, especially at low bit-rate. 2. We developed low-complexity model-based R-D optimal coding mode selection. 3. The model-based approach improves coding efficiency and visual quality significantly with a small increase in computation over TM5. Compared approaches: R-D model-based approach (proposed) and distortion-based approach (TM5+Rho).
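The mode selection loop can be sketched in a few lines. This sketch assumes an exponential model of the form D = sigma^2 * exp(-alpha * R), which is one reading of the formula on the slide, and it assumes a Lagrangian cost J = D + lambda * R, which the slide does not state. All names and numbers are hypothetical.

```python
import math


def model_distortion(sigma, rate, alpha):
    """Assumed model: D = sigma**2 * exp(-alpha * rate)."""
    return sigma ** 2 * math.exp(-alpha * rate)


def best_mode(modes, alpha, lam):
    """Pick the mode with minimum assumed cost J = D + lam * R.

    modes maps a mode name to (sigma, rate), where sigma is the standard
    deviation of that mode's residual and rate is its bit cost.
    """
    costs = {name: model_distortion(sigma, rate, alpha) + lam * rate
             for name, (sigma, rate) in modes.items()}
    return min(costs, key=costs.get), costs


candidates = {"mode0": (10.0, 2.0), "mode1": (4.0, 3.0)}
chosen, costs = best_mode(candidates, alpha=1.0, lam=1.0)
```

Because the distortion comes from a closed-form model instead of actually encoding and decoding each candidate, the per-mode cost is a single exponential evaluation, which is where the low complexity comes from.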

R-D Optimized Multi-Server Streaming (Ali C. Begen and Yucel Altunbasak). Goal: developing media-aware and network-adaptive packet delivery and error recovery mechanisms for multipoint-to-point networks. Approach: client-driven rate-distortion optimized streaming. Suitable for: multi-homed clients, wireless systems, CDNs.

Mobile Video Streaming (Umut Demircin and Yucel Altunbasak). Challenges: fluctuating wireless channel error rate and bandwidth; video error propagation. Solution approaches: video- and channel-aware FEC code rate and link-layer ARQ adaptation; rate reduction and error-resiliency video transcoding; R-D optimized packet scheduling; diversity methods. Figure labels: video rate; bandwidth available; error propagation and frame freeze.

Video Resolution Enhancement (Yucel Altunbasak). Sequence of limited dynamic range images; composite image with higher dynamic range and resolution. Compressed-domain resolution enhancement. Bit-depth and contrast enhancement. Resolution enhancement for face video. Three patents, one licensed.

Hyper-Spectral Super-Resolution. Image types: panchromatic, multi-spectral, hyper-spectral. Hyper-spectral images offer huge amounts of data: the spectrum is sampled at more than 200 wavelengths. Spatial resolution is the key parameter in many related applications. To improve spatial resolution we combine a precise physical model of the imaging process with the intrinsic low dimensionality of hyper-spectral data. The result is an efficient and noise-robust super-resolution (SR) method. Compared results: bilinear interpolation, separate band SR, our method.

Demosaicing (Yucel Altunbasak). Digital cameras use a single sensor array with a color filter array (CFA) to sample different spectral components. At each pixel location, only one color sample is taken, and the other colors must be interpolated. This color plane interpolation is known as demosaicing. Diagram labels: scene, optical system, CFA, sensors.
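The simplest form of this interpolation is bilinear: a missing green value is the average of the neighbouring green samples. The Python sketch below shows that baseline for the green plane only, assuming an RGGB Bayer layout and an image of at least 2x2 pixels. It is a generic baseline, not the method developed by the group, and the function name is hypothetical.

```python
def interpolate_green(raw):
    """Bilinear green-plane estimate for a Bayer mosaic (assumed RGGB layout).

    raw is a list of rows of sensor values; the image must be at least 2x2.
    """
    h, w = len(raw), len(raw[0])
    green = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if (y + x) % 2 == 1:
                # Green sample site in RGGB: keep the measured value.
                green[y][x] = float(raw[y][x])
            else:
                # Red or blue site: average the in-bounds green neighbours.
                nbrs = [raw[j][i]
                        for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= j < h and 0 <= i < w]
                green[y][x] = sum(nbrs) / len(nbrs)
    return green


# Flat scene: every green site reads 20, red sites 10, blue sites 30.
mosaic = [[10, 20, 10, 20],
          [20, 30, 20, 30],
          [10, 20, 10, 20],
          [20, 30, 20, 30]]
green_plane = interpolate_green(mosaic)
```

For this flat scene every entry of `green_plane` is 20.0. On real images plain averaging blurs edges and produces color artifacts, which is why more elaborate demosaicing methods exist.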

Advanced Collaborative Systems (Fred Juang and Ghassan AlRegib). 3D Collaboration System with a shared virtual space. Shared Reality: allows the virtual world to coexist and interact with the real world seamlessly; sensors capture users’ motions in the real world and control objects in the virtual world synchronously. Smart Objects: a new data structure creates smart objects and introduces efficient usability. Details Count: the multimodal micro-tool supports dexterous, real-time control of remote virtual objects; speech-enabled commands facilitate micro-level manipulation and control. Diagram labels: display; 3D networking; participant in one location; real-time registration; kinematics control. Current system built at GT.

Distributed Processing & Communications Protocols for Distributed Sensor Systems (Ghassan AlRegib). Topics: distributed detection; parameter estimation (which sensor to send and what to send); communication protocols (how sensors communicate among each other). Diagram labels: analog waveform; digitized observations/decisions; data models; application; data processing; data communication. Distributed parameter estimation, system overview: distributed sensors observe, quantize and transmit their observations; the fusion center performs the final estimation based on the received messages. Goal: minimize the estimation MSE under a constrained total bit rate.
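The observe, quantize, transmit, and fuse pipeline can be sketched as follows. The sketch assumes a uniform quantizer over a known range and a fusion center that simply averages the reconstructed values; both are illustrative choices that the slide does not specify, and all names are hypothetical.

```python
def quantize(value, bits, lo, hi):
    """Uniform quantizer over [lo, hi]; returns the integer cell index sent by a sensor."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    clipped = min(max(value, lo), hi)
    return min(int((clipped - lo) / step), levels - 1)


def dequantize(index, bits, lo, hi):
    """Reconstruct the mid-point of the quantization cell."""
    step = (hi - lo) / 2 ** bits
    return lo + (index + 0.5) * step


def fuse(messages, lo, hi):
    """Fusion center: average the reconstructed values.

    messages is a list of (index, bits) pairs, one per reporting sensor.
    """
    values = [dequantize(index, bits, lo, hi) for index, bits in messages]
    return sum(values) / len(values)


# Two sensors observe a parameter in [0, 1] and each sends 2 bits.
observations = [0.3, 0.6]
messages = [(quantize(v, 2, 0.0, 1.0), 2) for v in observations]
estimate = fuse(messages, 0.0, 1.0)
total_bits = sum(bits for _, bits in messages)
```

The bit allocation problem is then to choose which sensors report and how many bits each one uses so that the estimation MSE is as small as possible while `total_bits` stays within the budget.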

Bit Allocation for Textured 3D Models (Ghassan AlRegib). Target: best display quality of the 3D model during progressive streaming. Approach: optimal bit allocation between geometry and texture in the bitstream. Figure panels: original model, Case I, Case II.

Streaming Meshes over Lossy Networks (Ghassan AlRegib). Setting: a sender streams 3D data to a receiver over a best-effort service with packet losses and delay constraints. Interactive 3D applications: large amounts of data; real-time interactivity; high-resolution visualization. Technical problems: packet losses; stringent delay constraints; best-effort network (bandwidth bottleneck, congestion…). Server side: multi-resolution compression. Network side: FEC-based packet loss protection (RS: Reed-Solomon code); feedback-based retransmission; congestion control. Result: from 28.35 dB to 37.76 dB.
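The benefit of FEC-based protection can be estimated with a short calculation. A Reed-Solomon RS(n, k) erasure code lets the receiver rebuild a block of k data packets from any k of the n transmitted packets. The Python sketch below computes the probability that a block is recoverable, assuming independent packet losses with a fixed loss probability; that loss model, the function name, and the numbers are illustrative assumptions.

```python
from math import comb


def block_recovery_probability(n, k, loss):
    """P(at least k of n packets arrive) under independent loss probability `loss`."""
    return sum(comb(n, r) * (1 - loss) ** r * loss ** (n - r)
               for r in range(k, n + 1))


# 10% packet loss: 4 data packets sent bare versus protected by 2 parity packets.
unprotected = block_recovery_probability(4, 4, 0.1)
protected = block_recovery_probability(6, 4, 0.1)
```

With these numbers `protected` is larger than `unprotected`, at the cost of 50% more packets. Choosing n for each resolution layer is the kind of trade-off that loss protection under a bandwidth bottleneck has to make.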

Summary. The premier academic program in the country in the signal processing field is in the Georgia Tech School of Electrical and Computer Engineering. We have many outstanding graduate students (internships, long-term contributors). We have lots of outstanding technology waiting to be developed. We have a demonstrated capability to work with industry. Contact jim.mcclellan@ece.gatech.edu if you want to come for a visit.