Open Speech and Language Resources


Identifier: SLR136

Summary: An emotive single-speaker dataset for narrative storytelling. EMNS is a dataset containing transcriptions, emotion labels, emotion intensity, and descriptions of acted speech.

Category: Speech, text-to-speech, automatic speech recognition

License: Apache 2.0

Downloads (use a mirror closer to you):
raw_webm.tar.xz [192M]   ( Unprocessed raw recordings )   Mirrors: [US]   [EU]   [CN]  
raw_alignment.tar.xz [470K]   ( Alignment for raw audio recordings )   Mirrors: [US]   [EU]   [CN]  
cleaned_alignment.tar.xz [440K]   ( Alignment for processed audio recordings )   Mirrors: [US]   [EU]   [CN]  
cleaned_webm.tar.xz [42M]   ( Processed recordings with silence trimmed from the start and end )   Mirrors: [US]   [EU]   [CN]  
metadata.csv [311K]   ( Pipe-separated CSV containing the transcription, description, emotion, emotion intensity and path to the audio recording. )   Mirrors: [US]   [EU]   [CN]  
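
Because metadata.csv is pipe-separated rather than comma-separated, it needs an explicit delimiter when it is read. The following is a minimal sketch using Python's standard csv module. It assumes the file starts with a header row; the column names in the sample (audio_path, transcription, description, emotion, intensity) and the sample rows are hypothetical placeholders, not the names or values used in the real file, so check the actual header before relying on them.

```python
import csv
import io
from collections import Counter


def load_metadata(source):
    """Read pipe-separated rows into a list of dicts keyed by the header row."""
    reader = csv.DictReader(source, delimiter="|")
    return list(reader)


# Hypothetical sample data; the real header names in metadata.csv may differ.
sample = io.StringIO(
    "audio_path|transcription|description|emotion|intensity\n"
    "cleaned_webm/0001.webm|Hello there.|A warm greeting.|Happy|2\n"
    "cleaned_webm/0002.webm|Go away!|A sharp command.|Angry|3\n"
)

rows = load_metadata(sample)

# Count how many utterances carry each emotion label.
emotion_counts = Counter(row["emotion"] for row in rows)
```

To read the downloaded file instead of the in-memory sample, open it with open("metadata.csv", newline="", encoding="utf-8") and pass the file object to load_metadata. If transcriptions contain unbalanced double quotes, passing quoting=csv.QUOTE_NONE to DictReader may be needed.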

About this resource:

The Emotive Narrative Storytelling (EMNS) corpus is a single-speaker British English speech dataset with high-quality labelled utterances, tailored to drive interactive experiences with dynamic and expressive language. Each audio-text pair is reviewed for artefacts and quality. Furthermore, we extract critical features using natural language descriptions, including word emphasis, level of expressiveness and emotion.

EMNS data collection tool:

EMNS cleaner:

You can cite the data using the following BibTeX entry:
  title={{EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels}},
  author={Kari, Noriy and Xiaosong, Yang and Jian, Zhang},