Wavelet Denoising of Speech Signals for Use in Speech Transcription Systems

 

Goal: Denoise speech signals to increase the effectiveness of speech transcription systems



Graphical Results of Wavelet Denoising

Specifications/Research:

      As a final project for UCSD's ECE 251c: Wavelets and Filter Banks, two colleagues of mine, Benjia Zhang and YungYi Sun, and I decided to investigate the use of the wavelet transform to denoise speech signals in order to improve the efficacy of speech transcription systems. The project entailed applying varying intensities of stationary and non-stationary noise to a sample audio signal, performing various wavelet transforms and coefficient thresholding methods, and then validating the results in two ways: by the mean squared error (MSE) against the original signal, and by the improvement in word recognition using OpenAI's Whisper automatic speech recognition. 
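The pipeline described above (add noise, transform, threshold the coefficients, reconstruct, measure MSE) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the code used in the project: it uses a hand-written Haar wavelet, a soft threshold at the universal level sigma * sqrt(2 ln N), a synthetic sine wave standing in for speech, stationary Gaussian noise with a known sigma, and hypothetical function names. The Whisper validation step is not shown.

```python
import math
import random

def haar_dwt(x):
    """One level of the orthonormal Haar DWT (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert one level of the orthonormal Haar DWT."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft(c, t):
    """Soft threshold: shrink the magnitude by t, zeroing anything below t."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x, levels, sigma):
    """Multi-level Haar denoising; len(x) must be divisible by 2**levels."""
    # Assumption: universal threshold with a known noise level sigma.
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append([soft(c, t) for c in d])  # threshold detail coefficients only
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def demo(n=1024, sigma=0.3, seed=0):
    """Return (MSE of noisy signal, MSE of denoised signal) on a synthetic tone."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
    noisy = [c + rng.gauss(0.0, sigma) for c in clean]
    denoised = denoise(noisy, levels=4, sigma=sigma)
    return mse(noisy, clean), mse(denoised, clean)

if __name__ == "__main__":
    noisy_mse, denoised_mse = demo()
    print(f"MSE noisy: {noisy_mse:.4f}  MSE denoised: {denoised_mse:.4f}")
```

In practice a library such as PyWavelets would replace the hand-written Haar transform and give access to longer filters; the Haar case is used here only because it fits in a few lines.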

Final Report for the Project

Conclusion:

    This project was quite rewarding: not only did we gain a much deeper understanding of the wavelet transform, but we also succeeded in our initial goal of signal denoising. Through a review of the current literature as well as rigorous experimentation, we explored the advantages and disadvantages of many different wavelet filters and of their variants with different numbers of vanishing moments. We also implemented both the standard soft and hard thresholds and less common hyperbolic tangent-based and decomposition level-based ones. This all culminated in a dramatic improvement in SNR as well as improved automatic speech recognition performance. 
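To make the thresholding rules mentioned above concrete, here is a small sketch comparing them on a single coefficient. The hard and soft rules are the standard definitions. The tanh-based rule and the level-dependent threshold are assumed forms chosen for illustration; the exact formulas used in the project are in the final report and may differ. All function names and the sharpness parameter k are hypothetical.

```python
import math

def hard_threshold(c, t):
    """Keep the coefficient unchanged if it exceeds t in magnitude, else zero it."""
    return c if abs(c) > t else 0.0

def soft_threshold(c, t):
    """Zero coefficients below t and shrink the rest toward zero by t."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def tanh_threshold(c, t, k=4.0):
    """Assumed smooth rule: a tanh gain that rises from 0 to 1 around |c| = t.

    Small coefficients are suppressed and large ones pass almost unchanged,
    without the discontinuity of the hard rule. k controls the sharpness.
    """
    if t <= 0.0:
        return c
    return c * math.tanh((abs(c) / t) ** k)

def level_threshold(sigma, n, level):
    """Assumed level-dependent threshold: universal threshold scaled by 1/ln(level + 1).

    level = 1 is the finest decomposition level, so the threshold is largest
    where noise typically dominates and relaxes at coarser levels.
    """
    return sigma * math.sqrt(2.0 * math.log(n)) / math.log(level + 1)

if __name__ == "__main__":
    t = 1.0
    for c in (0.5, 1.5, 4.0):
        print(c, hard_threshold(c, t), soft_threshold(c, t), tanh_threshold(c, t))
```

The trade-off the sketch illustrates: the hard rule preserves large coefficients exactly but is discontinuous at t, the soft rule is continuous but biases every surviving coefficient by t, and a smooth rule of the tanh kind sits between the two.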


