Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech
Citation: Sahidullah, Md. Gonzalez Hautamäki, Rosa. Lehmann, Thomsen Dennis Alexander. Kinnunen, Tomi. Tan, Zheng-Hua. Hautamäki, Ville. Parts, Robert. Pitkänen, Martti. (2016). Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech. Proceedings of the 17th Annual Conference of the International Speech Communication Association, 1720-1724. DOI: 10.21437/Interspeech.2016-1153.
The accuracy of automatic speaker verification (ASV) systems degrades severely in the presence of background noise. In this paper, we study the use of additional side information provided by a body-conducted sensor, the throat microphone. The throat microphone signal is much less affected by background noise than the acoustic microphone signal, which makes throat microphones potentially useful for feature extraction or speech activity detection. First, we propose a new prototype system for simultaneous acquisition of acoustic and throat microphone signals. Second, we study the use of this additional information for speech activity detection, feature extraction, and fusion of the acoustic and throat microphone signals. We collect a pilot database of 38 subjects that includes both clean and noisy sessions. We carry out speaker verification experiments using a Gaussian mixture model with universal background model (GMM-UBM) system and an i-vector based system. We achieve considerable improvement in recognition accuracy even in highly degraded conditions.