MobvoiHotwords
Identifier: SLR87
Summary: Chinese hotword detection dataset, provided by Mobvoi CO.,LTD
Category: Speech
License: Apache License v.2.0
Downloads (use a mirror closer to you):
mobvoi_hotword_dataset.tgz [17G] (Wave files of keyword and non-keyword data) Mirrors:
[US]
[EU]
[CN]
mobvoi_hotword_dataset_resources.tgz [6.3M] (Label, speaker and channel information for the above wave files) Mirrors:
[US]
[EU]
[CN]
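Once an archive has been downloaded, it can be unpacked with standard tools. The following is a minimal Python sketch, assuming only that the .tgz files are gzip-compressed tar archives. The function name `extract_and_list_wavs` and the destination directory are illustrative, and the internal directory layout of the archive is not documented on this page, so the sketch searches recursively instead of assuming a path.

```python
import tarfile
from pathlib import Path


def extract_and_list_wavs(archive_path, dest_dir):
    """Extract a .tgz archive and return the sorted paths of the .wav files found.

    Hypothetical helper: the name and the recursive search are assumptions,
    not part of the dataset's own tooling.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" opens a gzip-compressed tar archive, which is what .tgz denotes.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest)
    # Search recursively because the archive's internal layout is not assumed.
    return sorted(dest.rglob("*.wav"))
```

For example, `extract_and_list_wavs("mobvoi_hotword_dataset.tgz", "mobvoi_hotword_dataset")` would unpack the wave-file archive into a local directory and return the list of wave files it contains. Extraction needs at least the archive's 17G of free disk space, and likely more.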
About this resource:
The keyword data consists of utterances containing either 'Hi xiaowen' or 'Nihao Wenwen', with about 36k utterances for each keyword. All keyword data was collected from 788 subjects, aged 3 to 65, at different distances from the smart speaker (1, 3 and 5 meters). Different noises (typical home environment noises such as music and TV) were played in the background at varying sound pressure levels during collection. The keyword data is identical to the keyword data used in the paper below:
@article{DBLP:journals/spl/HouSOHX19,
  author  = {Jingyong Hou and Yangyang Shi and Mari Ostendorf and Mei{-}Yuh Hwang and Lei Xie},
  title   = {Region Proposal Network Based Small-Footprint Keyword Spotting},
  journal = {{IEEE} Signal Process. Lett.},
  volume  = {26},
  number  = {10},
  pages   = {1471--1475},
  year    = {2019},
  url     = {https://doi.org/10.1109/LSP.2019.2936282},
  doi     = {10.1109/LSP.2019.2936282}
}

There are also ~220 hours of non-keyword data, collected from the same smart speaker, that can be used as negative training samples.