Academic Seminar by Research Fellow 曹昱 (Yu Tsao), 109/06/01

  • 2020-05-25
  • 杨 文敏

Department of Statistics, National Chengchi University
     
Speaker: Research Fellow 曹昱 (Yu Tsao), Research Center for Information Technology Innovation, Academia Sinica
Title: Speech Signal Processing for Assistive Hearing and Speaking Devices
Time: Monday, June 1, 2020 (ROC year 109), 1:30 PM
Venue: Room 050101, 逸仙楼, National Chengchi University
Abstract:
           With rapid advances in speech processing technologies and a deeper understanding of the human speech perception mechanism, significant improvements have been made in the design of assistive hearing devices [assistive listening devices (ALDs), hearing aids (HAs), and cochlear implants (CIs)], benefiting speech communication for millions of hearing-impaired patients and, in turn, enhancing their quality of life. However, many technical challenges remain, such as designing noise-suppression algorithms tailored to ALD, HA, and CI users, deriving optimal compression strategies, improving music appreciation, and optimizing speech processing strategies for users who speak tonal languages, to name a few. In the first part of my talk, I will present our recent research achievements in using machine learning and signal processing to improve speech perception for ALD, HA, and CI users. In the second part, I will present our recent progress in developing machine-learning-based assistive speaking devices. Oral cancer ranks among the top five cancers in Taiwan. Treating oral cancer often requires surgery that removes parts of the patient’s articulators. As a result, the patient’s speech may be distorted and difficult to understand. To overcome this problem, we propose two voice conversion (VC) approaches: the first is joint dictionary training non-negative matrix factorization (JD-NMF), and the second is an end-to-end generative adversarial network (GAN)-based unsupervised VC model. Experimental results show that both approaches can convert the distorted speech into clearer, more intelligible speech.