EE 225D: Audio Signal Processing in Humans and Machines. Prof. N. Morgan and friends. MW 4:00-5:30.


Textbook: Speech and Audio Signal Processing, by Gold, Morgan, and Ellis. Wiley & Sons, 2nd edition, 2011.

Prerequisites: EE123 or equivalent, and Stat 200A or equivalent; or grad standing and consent of instructor.

Speech and audio signal processing: why does this material matter?
- Speech without visual vs. visual without speech
- Requires DSP and machine learning
- Multidisciplinary tasks are good training
- Many applications!

What should we be able to do (automatically)? The human example suggests: plenty.
- What was said
- Who said it
- When they said it
- What it meant
- How to respond

Why is it hard?
- Speaker variability (within and between speakers)
- Noise, reverberation, channel
- Confusable vocabulary
- Meaning and tone

Course Philosophy I
- People can do these tasks effortlessly
- Include psychoacoustics and physiology
- Also some acoustics
- But of course, also DSP and machine learning

Course Philosophy II
- The first part of the course is basic stuff
- The rest is applications
- Much of the course grade is based on an original project
- Some practice in oral presentation
- In the middle of the course, students present the material (slides from previous classes can help)

Section I: Broad background
- Synthesis/vocoding history (chaps 2 and 3)
- Recognition history (chap 4)
- Machine recognition basics (chap 5)
- Human recognition basics (chap 18)

Section II: Scientific background
- Pattern classification (chaps 8 and 9)
- Acoustics (chaps 10 and 13)
- Linguistic sound categories (chap 23)
- (Auditory neurophysiology comes late in the course)

Section IIIa: Engineering Apps (speech recognition)
- Signal processing “front end” (chaps 19-22)
- Deterministic sequence recognition (chap 24)
- Statistical modeling and inference (chaps 25, 26)
- Discriminant methods and adaptation (chaps 27, 28)
- Speech recognition and understanding (chap 29)

Section IIIb: Engineering Apps (other speech applications)
- Speech synthesis (chap 30)
- Speaker verification (chap 41)

Section IIIc: Engineering Apps (other audio applications)
- Perceptual audio coding (chap 35)
- Music signal analysis (chap 37)
- Source separation (chap 39)

Section IV: Hearing [presented by Prof. Oded Ghitza, Boston University]
- Auditory physiology (chap 14)
- Psychoacoustics (chaps 15, 16)

Section V: Student Projects
- Project proposal: by spring break, iterate on the proposed project
- Last week of class: students present their projects, modeled after ICASSP or Interspeech
- Finals week: submit the written version of the project, schedule demos
- Any topic in speech/music/general audio is potentially OK, including tutorial or original research

Course grading
- Quizzes/homeworks (for first half): 20%
- Student presentations/participation: 20%
- Project proposal: 10%
- Project oral presentation: 20%
- Project write-up & results: 30%
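
The grading weights sum to 100%, with the project (proposal, oral presentation, write-up) accounting for 60% of the total. As a rough illustration of how the components combine, here is a minimal Python sketch. The component names, the 0-100 score scale, and the sample scores are assumptions made for the example; the slides do not say how individual components are scored or rounded.

```python
# Weights copied from the grading slide. The dictionary keys are
# illustrative names, not official component labels.
WEIGHTS = {
    "quizzes_homeworks": 0.20,
    "presentations_participation": 0.20,
    "project_proposal": 0.10,
    "project_oral": 0.20,
    "project_writeup": 0.30,
}


def weighted_grade(scores):
    """Combine per-component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {
        "quizzes_homeworks": 90,
        "presentations_participation": 85,
        "project_proposal": 80,
        "project_oral": 95,
        "project_writeup": 88,
    }
    print(round(weighted_grade(example), 1))
```

Raising an error on a missing component is a design choice for the sketch, so that a forgotten score is not silently counted as zero.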

Course location
- After today: 6th floor, ICSI, 1947 Center Street, between Milvia and MLK
- Class will start at 4:15 instead of 4:10 (15-minute walk from Cory)
- Office hour: one hour before each class
