Open Speech and Language Resources


Identifier: SLR1

Summary: Sixty recordings of one individual saying yes or no in Hebrew; each recording is eight words long.

Category: Speech

License: No formal license but free to use for any purpose.

Downloads (use a mirror closer to you):
waves_yesno.tar.gz [4.7M] (This is the entire dataset.) Mirrors: [US] [EU] [CN]

About this resource:

This dataset was created for the Kaldi project by a contributor who prefers to remain anonymous. Its main purpose is to provide a quick, easy, and free way to test the Kaldi scripts.

The archive "waves_yesno.tar.gz" contains 60 .wav files, sampled at 8 kHz. All were recorded by the same male speaker, in Hebrew. In each file, the speaker says 8 words; each word is the Hebrew for either "yes" or "no", so each file is a random sequence of 8 "yes" and "no" words. No separate transcription is provided; the sequence is encoded in the filename, with 1 for yes and 0 for no, for instance:

# tar -xvzf waves_yesno.tar.gz
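
Because the transcription exists only in the filenames, a script has to recover the labels from them. The following is a minimal Python sketch of that step, not part of the dataset or of Kaldi. It assumes each filename stem consists of eight 0/1 digits plus some separator characters; the exact separator is not stated here, so the code simply ignores anything that is not a 0 or 1. The helper names and the example filename are hypothetical.

```python
from pathlib import Path

# Mapping given in the description: 1 means "yes", 0 means "no".
WORDS = {0: "no", 1: "yes"}


def labels_from_filename(path):
    """Return the list of 0/1 labels encoded in a recording's filename."""
    stem = Path(path).stem  # drop the directory and the ".wav" extension
    # Keep only the 0/1 characters, so the separator between digits
    # (an assumption on our side) does not matter.
    labels = [int(ch) for ch in stem if ch in "01"]
    if len(labels) != 8:
        raise ValueError(f"expected 8 labels in {path!r}, found {len(labels)}")
    return labels


def transcript_from_filename(path):
    """Return the word-level transcript, e.g. for building a text file."""
    return " ".join(WORDS[label] for label in labels_from_filename(path))


if __name__ == "__main__":
    # Hypothetical filename, used only to illustrate the encoding.
    example = "waves_yesno/0_1_0_0_1_1_0_1.wav"
    print(labels_from_filename(example))
    print(transcript_from_filename(example))
```

Applying `transcript_from_filename` to each of the 60 extracted files would give one transcript per recording, which can then be written out in whatever format the downstream tooling expects.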

External URL: