Deep Factorization for Speech Signal

Lantian Li, Dong Wang, Yixiang Chen, Ying Shi, Zhiyuan Tang, Thomas Fang Zheng
Center for Speech and Language Technologies (CSLT), RIIT, Tsinghua University, China
http://cslt.riit.tsinghua.edu.cn

Introduction
  Speech signal
    Multiple informative factors.
    Various speech processing tasks.
  Difficulty of these tasks
    Factor blending
    Uncertainties
  An intuitive idea
    Speech factorization, such as JFA

Cascaded deep factorization
  From significant to less significant
  Frame-level conditional training
  From IDF to CDF

Spectrum reconstruction
  Convolutional view
  Architecture of spectrum recovery

Experiments (1)
  ASR by IDF on WSJ
    WER: 9.16%
  SRE by IDF and CDF on Fisher
    1k spks, top-1 IDR (%)

Experiments (2)
  AER by IDF and CDF on CHEAVD

Conclusions
  Frame-level CDF for speech signal
  Valuable for learning less significant factor
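
The cascaded, frame-level conditional training named in the poster can be pictured with a minimal data-flow sketch. This is an illustration under stated assumptions, not the authors' implementation. The factor order (linguistic, then speaker, then emotion) is inferred from the ASR, SRE and AER tasks listed in the experiments. The function names, the feature dimensions, and the single random tanh layer standing in for each trained network are all hypothetical. The only point shown is that each later factor is extracted from the spectrum frames concatenated with the factors already inferred, frame by frame.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_tanh(x, out_dim, rng):
    """Hypothetical stand-in for a trained frame-level network:
    a single randomly initialised tanh layer."""
    w = rng.standard_normal((x.shape[1], out_dim)) * 0.1
    return np.tanh(x @ w)

def cascade(spectrum, factor_dims, rng):
    """Extract factors one after another. Each extractor sees the
    spectrum frames plus every factor inferred before it."""
    factors = []
    for dim in factor_dims:
        # Condition on earlier factors by frame-wise concatenation.
        inputs = np.concatenate([spectrum] + factors, axis=1)
        factors.append(dense_tanh(inputs, dim, rng))
    return factors

# 200 frames of a 40-dimensional spectral feature (hypothetical sizes).
frames = rng.standard_normal((200, 40))

# Assumed order: more significant factors first.
linguistic, speaker, emotion = cascade(frames, [42, 40, 32], rng)
print(linguistic.shape, speaker.shape, emotion.shape)
```

In a real system each extractor would be trained on its own task before the next one is trained on top of its output. The sketch only fixes the conditioning pattern.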