Towards Controlling False Alarm - Miss Trade-Off in Perceptual Speaker Comparison via Non-Neutral Listening Task Framing
Self-archived version: final draft
Date: 2020
Unique identifier: 10.1109/ASRU46091.2019.9003978
Citation
Gonzalez Hautamäki, Rosa; Kinnunen, Tomi (2020). Towards Controlling False Alarm - Miss Trade-Off in Perceptual Speaker Comparison via Non-Neutral Listening Task Framing. 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU): Proceedings, 749-756. 10.1109/ASRU46091.2019.9003978.
Abstract
Speaker comparison by listening is a valuable resource, for instance, in human voice discrimination studies and in evaluations of voice conversion (VC) systems. Usually, listeners are provided with application-neutral guidelines that encourage retaining overall high speaker discrimination accuracy. Nonetheless, listeners are subject to misses (declaring a same-speaker trial as different-speaker) and false alarms (vice versa), with possibly non-symmetric outcomes. In automatic speaker verification (ASV) applications, the consequences of a miss and a false alarm are rarely equal, and the decision-making policy is adjusted towards a given application with a desired miss/false alarm trade-off. We study whether listener decisions could similarly be controlled to provoke more accept (or reject) decisions, by framing the voice comparison task in different ways. Our neutral, forensic, user-convenient bank and secure bank scenarios are played by disjoint panels (recruited through Amazon's Mechanical Turk), all judging the same speaker trials originating from RedDots and 2018 Voice Conversion Challenge (VCC 2018) data. Our results indicate that listener decisions can be influenced by modifying the task framing. Because the task is subjective, the challenge is how to drive the panel decisions in the desired direction (to reduce the miss rate or the false alarm rate). Our preliminary results suggest potential for novel, application-directed speaker discrimination designs.
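As an illustration of the two error types defined in the abstract, the following minimal Python sketch computes a panel's miss rate and false alarm rate from ground-truth trial labels and listener decisions. The function name `miss_false_alarm_rates`, the "same"/"different" string encoding, and the toy data are assumptions made for this example; they are not taken from the paper and do not reproduce its results.

```python
def miss_false_alarm_rates(labels, decisions):
    """Return (miss_rate, false_alarm_rate) for one listener panel.

    labels:    ground truth per trial, "same" or "different" (assumed encoding)
    decisions: listener decision per trial, "same" or "different"
    """
    if len(labels) != len(decisions):
        raise ValueError("labels and decisions must have equal length")

    # Split the decisions by the true trial type.
    same_trials = [d for l, d in zip(labels, decisions) if l == "same"]
    diff_trials = [d for l, d in zip(labels, decisions) if l == "different"]
    if not same_trials or not diff_trials:
        raise ValueError("need at least one trial of each type")

    # Miss: a same-speaker trial declared different-speaker.
    miss_rate = sum(d == "different" for d in same_trials) / len(same_trials)
    # False alarm: a different-speaker trial declared same-speaker.
    false_alarm_rate = sum(d == "same" for d in diff_trials) / len(diff_trials)
    return miss_rate, false_alarm_rate


# Made-up toy data, for illustration only (not from the study).
labels = ["same", "same", "same", "same",
          "different", "different", "different", "different"]
decisions = ["same", "same", "same", "different",
             "different", "different", "different", "same"]

miss, fa = miss_false_alarm_rates(labels, decisions)
print(f"miss rate = {miss:.2f}, false alarm rate = {fa:.2f}")
```

Computing the same two rates separately for each framing (neutral, forensic, user-convenient bank, secure bank) would show whether a framing shifts decisions towards fewer misses or fewer false alarms.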