Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones
Self-archived version: final draft
Citation: Sahidullah, Md.; Thomsen, Dennis Alexander Lehmann; Gonzalez Hautamäki, Rosa; Kinnunen, Tomi; Tan, Zheng-Hua; Parts, Robert; Pitkänen, Martti (2017). Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones. IEEE/ACM Transactions on Audio, Speech, and Language Processing, [Manuscript submitted Dec 15, 2016], 1-13.
Automatic speaker verification (ASV) systems have a wide range of applications but are vulnerable to spoofing attacks, in particular replay attacks, which are effective and easy to implement. Most prior work on detecting replay attacks uses audio from a single acoustic microphone (AM) only, which makes it difficult to detect high-end replay attacks that are nearly indistinguishable from live human speech. In this paper, we study the use of a special body-conducted sensor, the throat microphone (TM), for combined voice liveness detection (VLD) and ASV in order to improve both the robustness and the security of ASV against replay attacks. First, we investigate the possibility and methods of attacking a TM-based ASV system, followed by a pilot data collection. Second, we study the use of spectral features for VLD using both single-channel and dual-channel ASV systems. We carry out speaker verification experiments using Gaussian mixture model with universal background model (GMM-UBM) and i-vector based systems on a dataset of 38 speakers that we collected. The dual-microphone setup yields a considerable improvement in recognition accuracy. In experiments with noisy test speech, the false acceptance rate (FAR) of the dual-microphone GMM-UBM based system for recorded speech decreases from 69.69% to 18.75%. The FAR in the replay condition drops further to 0% when this dual-channel ASV system is integrated with the new dual-channel voice liveness detector.
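To make the FAR figures above concrete, the following minimal sketch shows how a false acceptance rate is computed over replay trials, and how gating an ASV decision with a liveness decision can lower it. The abstract does not say how the two systems are integrated, so the simple AND rule here is an assumption. The function names, thresholds, and scores are also hypothetical and are not taken from the paper.

```python
def false_acceptance_rate(nontarget_scores, threshold):
    """Fraction of non-target (impostor or replay) trials that are accepted."""
    accepted = sum(1 for score in nontarget_scores if score >= threshold)
    return accepted / len(nontarget_scores)


def cascade_accept(asv_score, vld_score, asv_threshold, vld_threshold):
    """Assumed integration rule: accept only if the liveness detector
    and the speaker verifier both accept the trial."""
    return vld_score >= vld_threshold and asv_score >= asv_threshold


# Invented toy scores for four replay trials (not data from the paper).
replay_asv_scores = [2.1, 1.4, -0.3, 0.9]   # higher = more like the claimed speaker
replay_vld_scores = [-1.2, -0.8, -2.0, 0.4]  # higher = more likely live speech

ASV_THRESHOLD = 0.5  # hypothetical operating points
VLD_THRESHOLD = 0.0

# FAR of the verifier alone on replay trials.
far_asv_only = false_acceptance_rate(replay_asv_scores, ASV_THRESHOLD)

# FAR when each trial must also pass the liveness check.
decisions = [
    cascade_accept(a, v, ASV_THRESHOLD, VLD_THRESHOLD)
    for a, v in zip(replay_asv_scores, replay_vld_scores)
]
far_cascade = sum(decisions) / len(decisions)

print(f"FAR, ASV only:  {far_asv_only:.2%}")
print(f"FAR, VLD + ASV: {far_cascade:.2%}")
```

In this toy setting a replayed recording is falsely accepted only if it passes both checks, so the combined FAR can never exceed that of the verifier alone. The trade-off is that a strict liveness threshold may also reject genuine live speech, which this sketch does not measure.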