Voice Conversion Challenge 2020 database v1.0
Date
2020-12-18
Author(s)
Unique identifier
10.5281/zenodo.4345689
Research data
Citation
Zhao Yi, National Institute of Informatics, Japan. Wen-Chin Huang, Nagoya University, Japan. Xiaohai Tian, National University of Singapore, Singapore. Junichi Yamagishi, National Institute of Informatics, Japan. Rohan Kumar Das, National University of Singapore, Singapore. Tomi Kinnunen, University of Eastern Finland, Finland. Zhenhua Ling, University of Science and Technology of China, P.R. China. Tomoki Toda, Nagoya University, Japan. Voice Conversion Challenge 2020 database v1.0, 2020, 10.5281/zenodo.4345689.
Abstract
Voice conversion (VC) is a technique for transforming the speaker identity in a source speech waveform into a different one while preserving the linguistic information of the source speech.
In 2016, we launched the Voice Conversion Challenge (VCC) 2016 [1][2] at Interspeech 2016. The objective of the 2016 challenge was to better understand different VC techniques built on a freely available common dataset and aimed at a common goal, and to share views about unsolved problems and challenges faced by current VC techniques. VCC 2016 focused on the most basic VC task: constructing VC models that automatically transform the voice identity of a source speaker into that of a target speaker using a parallel clean training database, in which source and target speakers read out the same set of utterances in a professional recording studio. Seventeen research groups participated in the 2016 challenge. The challenge was successful and established a new standard evaluation methodology and protocols for benchmarking the performance of VC systems.
In 2018, we launched the second edition, VCC 2018 [3]. In the second edition, we revised three aspects of the challenge. First, we reduced the amount of speech data used to construct participants' VC systems by half. This was based on feedback from participants in the previous challenge and is also essential for practical applications. Second, we introduced a more challenging task, referred to as the Spoke task, in addition to a task similar to that of the first edition, which we call the Hub task. In the Spoke task, participants needed to build their VC systems using a non-parallel database in which source and target speakers read out different sets of utterances. We then evaluated both parallel and non-parallel voice conversion systems via the same large-scale crowdsourcing listening test. Third, we also attempted to bridge the gap between the automatic speaker verification (ASV) and VC communities.
Since new VC systems developed for VCC 2018 may be strong candidates for enhancing the ASVspoof 2015 database, we also assessed the spoofing performance of the VC systems based on anti-spoofing scores.
In 2020, we launched the third edition, VCC 2020 [4][5]. In this third edition, we constructed and distributed a new database for two tasks: intra-lingual semi-parallel VC and cross-lingual VC. The dataset for intra-lingual VC consists of a smaller parallel corpus and a larger nonparallel corpus, both in the same language. The dataset for cross-lingual VC consists of a corpus of the source speakers speaking in the source language and another corpus of the target speakers speaking in the target language. As a more challenging task than the previous ones, we focused on cross-lingual VC, in which the speaker identity is transformed between two speakers speaking different languages, which requires handling completely nonparallel training over different languages.
This repository contains the training and evaluation data released to participants, the target speakers' speech data in English for reference purposes, and the transcriptions of the evaluation data. For more details about the challenge and the listening test results, please refer to [4] and the README file.
[1] Tomoki Toda, Ling-Hui Chen, Daisuke Saito, Fernando Villavicencio, Mirjam Wester, Zhizheng Wu, and Junichi Yamagishi. "The Voice Conversion Challenge 2016." Proc. Interspeech 2016, San Francisco.
[2] Mirjam Wester, Zhizheng Wu, and Junichi Yamagishi. "Analysis of the Voice Conversion Challenge 2016 Evaluation Results." Proc. Interspeech 2016.
[3] Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, and Zhenhua Ling. "The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods." Proc. Speaker Odyssey 2018, June 2018.
[4] Yi Zhao, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, and Tomoki Toda. "Voice Conversion Challenge 2020: Intra-lingual Semi-parallel and Cross-lingual Voice Conversion." Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 80-98. DOI: 10.21437/VCC_BC.2020-14.
Keywords
Link to the original item
https://zenodo.org/record/4345689
Collections
Related items
Showing items related by title, author, creator and subject.
-
Plant size, latitude, and phylogeny explain within-population variability in herbivory
Wetzel, William (Montana State University); Hahn, Philip (University of Florida); Inouye, Brian (Florida State University); Underwood, Nora (Florida State University); Whitehead, Susan (Virginia Tech); Abbott, Karen (Case Western Reserve University); Bruna, Emilio (University of Florida); Cacho, N. Ivalu (National Autonomous University of Mexico); Dyer, Lee (University of Nevada Reno); (2023)
Interactions between plants and herbivores are central in most ecosystems, but their strength is highly variable. The amount of variability within a system is thought to influence most aspects of plant-herbivore biology, ...
Dataset
-
Role of iodine oxoacids in atmospheric aerosol nucleation: data resources
He, Xu-Cheng, University of Helsinki; Tham, Yee Jun, University of Helsinki; Dada, Lubna, University of Helsinki; Wang, Mingyi, Carnegie Mellon University; Finkenzeller, Henning, University of Colorado Boulder; Stolzenburg, Dominik, University of Helsinki; Iyer, Siddharth, University of Helsinki; Simon, Mario, Goethe University Frankfurt; Kürten, Andreas, Goethe University Frankfurt; (2020)
Data for manuscript "Role of iodine oxoacids in atmospheric aerosol nucleation"
Dataset
-
FLUXNET-CH4: A global, multi-ecosystem dataset and analysis of methane seasonality from freshwater wetlands (Appendix B and Figure 3)
Delwiche, Kyle B., Department of Earth System Science, Stanford University, Stanford, California; Knox, Sarah Helen, Department of Geography, The University of British Columbia, Vancouver, British Columbia, Canada; Malhotra, Avni, Department of Earth System Science, Stanford University, Stanford, California; Fluet-Chouinard, Etienne, Department of Earth System Science, Stanford University, Stanford, California; McNicol, Gavin, Department of Earth System Science, Stanford University, Stanford, California; Feron, Sarah, Department of Earth System Science, Stanford University, Stanford, California; Department of Physics, University of Santiago de Chile, Santiago, Chile; Ouyang, Zutao, Department of Earth System Science, Stanford University, Stanford, California; Papale, Dario, Dipartimento per la Innovazione nei Sistemi Biologici, Agroalimentari e Forestali, Università degli Studi della Tuscia, Largo dell'Universita, Viterbo, Italy; euroMediterranean Center on Climate Change CMCC, Lecce, Italy; Trotta, Carlo, euroMediterranean Center on Climate Change CMCC, Lecce, Italy; (2021)
This dataset contains metadata for methane flux sites in Version 1.0 of FLUXNET-CH4. The dataset also has seasonality parameters for select freshwater wetlands, which were extracted from the raw datasets published at ...
Dataset