Open Speech and Language Resources



LibriTTS-R

Identifier: SLR141

Summary: A restored version of the LibriTTS corpus with improved sound quality. LibriTTS is a large-scale corpus of English speech designed for TTS use.

Category: Speech

License: CC BY 4.0

Downloads (use a mirror closer to you):
doc.tar.gz [318K]   (Documents of LibriTTS-R)   Mirrors: [US]   [EU]   [CN]
dev_clean.tar.gz [1.3G]   (Development set, clean speech)   Mirrors: [US]   [EU]   [CN]
dev_other.tar.gz [975M]   (Development set, more challenging speech)   Mirrors: [US]   [EU]   [CN]
test_clean.tar.gz [1.2G]   (Test set, clean speech)   Mirrors: [US]   [EU]   [CN]
test_other.tar.gz [1.0G]   (Test set, more challenging speech)   Mirrors: [US]   [EU]   [CN]
train_clean_100.tar.gz [8.1G]   (Training set derived from the original materials of the train-clean-100 subset of LibriSpeech)   Mirrors: [US]   [EU]   [CN]
train_clean_360.tar.gz [28G]   (Training set derived from the original materials of the train-clean-360 subset of LibriSpeech)   Mirrors: [US]   [EU]   [CN]
train_other_500.tar.gz [46G]   (Training set derived from the original materials of the train-other-500 subset of LibriSpeech)   Mirrors: [US]   [EU]   [CN]
libritts_r_failed_speech_restoration_examples.tar.gz [106K]   (Lists of files where speech restoration failed)   Mirrors: [US]   [EU]   [CN]
md5sum.txt [509 bytes]   (Checksums of the individual files)   Mirrors: [US]   [EU]   [CN]
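
The archives are large, so it is worth checking each download against md5sum.txt before unpacking. The following is a minimal sketch, not an official tool. It assumes md5sum.txt uses the usual md5sum layout (one "<hex digest>  <file name>" pair per line), which this page does not confirm. The function names and the download directory are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_text(text):
    """Parse md5sum-style lines into {file name: digest}.

    Assumption: each line looks like "<hex digest>  <file name>".
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        # md5sum marks binary mode with a leading "*" on the file name.
        expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify_downloads(md5sum_path, download_dir):
    """Map each listed file to True (match), False (mismatch) or None (not present)."""
    expected = parse_md5sum_text(Path(md5sum_path).read_text())
    results = {}
    for name, digest in expected.items():
        candidate = Path(download_dir) / name
        results[name] = md5_of_file(candidate) == digest if candidate.is_file() else None
    return results
```

For example, verify_downloads("md5sum.txt", ".") checks whichever archives are in the current directory and reports None for those not yet downloaded. Reading in chunks keeps memory use flat even for the 46G training archive.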

About this resource:

LibriTTS-R [1] is a restored version of the LibriTTS corpus (http://www.openslr.org/60/) with improved sound quality. LibriTTS, published in 2019, is a multi-speaker corpus of approximately 585 hours of read English speech at a 24 kHz sampling rate. The constituent samples of LibriTTS-R are identical to those of LibriTTS; only the sound quality has been improved. The audio was restored with Miipher [2], a speech restoration model proposed by Koizumi et al.

For more information, refer to the paper [1]. If you use the LibriTTS-R corpus in your work, please cite the dataset paper [1] where it was introduced.

Ground-truth audio samples and TTS-generated samples are available on the demo page: https://google.github.io/df-conformer/librittsr/

[1] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, and Ankur Bapna, "LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus," arXiv, 2023.
[2] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, and Michiel Bacchiani, "Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations," arXiv, 2023.