Published by Alexia Harrington. Modified over 9 years ago.
1
EE 225D: Audio Signal Processing in Humans and Machines. Prof. N. Morgan and friends. MW 4:00-5:30. http://www.icsi.berkeley.edu/eecs225d/spr14/overview.html http://www.icsi.berkeley.edu/eecs225d/spr14/slides/
2
Textbook: Speech and Audio Signal Processing, by Gold, Morgan, and Ellis. Wiley & Sons, 2nd edition, 2011.
3
Prerequisites: EE123 or equivalent, and Stat 200A or equivalent; or grad standing and consent of instructor.
4
Speech and audio signal processing: why does this material matter? Speech without visual vs. visual without speech; requires DSP and machine learning; multidisciplinary tasks are good training; many applications!
5
What should we be able to do (automatically)? The human example suggests plenty: what was said; who said it; when they said it; what it meant; how to respond.
6
Why is it hard? Speaker variability (within and between speakers); noise, reverberation, channel; confusable vocabulary; meaning and tone.
7
Course Philosophy I: People can do these tasks effortlessly; include psychoacoustics and physiology; also some acoustics; but of course, also DSP and machine learning.
8
Course Philosophy II: The first part of the course is basic material; the rest is applications; much of the course grade is based on an original project; some practice in oral presentation; the middle of the course has students presenting the material (slides from previous classes can help).
9
Section I: Broad background. Synthesis/vocoding history (chaps 2 & 3); recognition history (chap 4); machine recognition basics (chap 5); human recognition basics (chap 18).
10
Section II: Scientific background. Pattern classification (chaps 8 and 9); acoustics (chaps 10 and 13); linguistic sound categories (chap 23). (Auditory neurophysiology comes late in the course.)
11
Section IIIa: Engineering Apps, speech recognition. Signal processing “front end” (chaps 19-22); deterministic sequence recognition (chap 24); statistical modeling and inference (chaps 25, 26); discriminant methods and adaptation (chaps 27, 28); speech recognition and understanding (chap 29).
12
Section IIIb: Engineering Apps, other speech applications. Speech synthesis (chap 30); speaker verification (chap 41).
13
Section IIIc: Engineering Apps, other audio applications. Perceptual audio coding (chap 35); music signal analysis (chap 37); source separation (chap 39).
14
Section IV: Hearing [presented by Prof. Oded Ghitza, Boston University]. Auditory physiology (chap 14); psychoacoustics (chaps 15, 16).
15
Section V: Student Projects. Project proposal: by spring break, iterate on proposed project. Last week of class: students present their projects, modeled after ICASSP or Interspeech. Finals week: submit written version of project, schedule demos. Any topic in speech/music/general audio is potentially OK, including tutorial or original research.
16
Course grading: quizzes/homeworks (for first half): 20%; student presentations/participation: 20%; project proposal: 10%; project oral presentation: 20%; project write-up & results: 30%.
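As a quick check on the weights listed on the grading slide, here is a minimal sketch of how a weighted course score could be computed. The component names, the 0-100 score scale, and the function course_grade are illustrative assumptions; the slides give only the percentages, not how scores are combined or mapped to letter grades.

```python
# Weights copied from the grading slide; the dictionary keys are
# hypothetical labels chosen for this sketch.
WEIGHTS = {
    "quizzes_homeworks": 0.20,
    "presentations_participation": 0.20,
    "project_proposal": 0.10,
    "project_oral": 0.20,
    "project_writeup": 0.30,
}


def course_grade(scores):
    """Return a weighted score, assuming each component is scored 0-100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# The five weights should account for the whole grade.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
```

Under these assumptions the three project components together carry 60% of the score, which matches the slide's emphasis on the original project.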
17
Course location: After today, 6th floor of ICSI, 1947 Center Street, between Milvia and MLK. Class will start at 4:15 instead of 4:10 (15 minute walk from Cory). Office hour: one hour before each class.