openslr.org

Open Speech and Language Resources

EMNS

Identifier: SLR136

Summary: An emotive single-speaker dataset for narrative storytelling. EMNS is a dataset containing transcriptions, emotion, emotion intensity, and descriptions of acted speech.

Category: Speech, text-to-speech, automatic speech recognition

License: Apache 2.0

Downloads (use a mirror closer to you):
raw_webm.tar.xz [192M] ( Unprocessed raw recordings ) Mirrors: [EU] [EU] [CN]
raw_alignment.tar.xz [470K] ( Alignment for raw audio recordings ) Mirrors: [EU] [EU] [CN]
cleaned_alignment.tar.xz [440K] ( Alignment for processed audio recordings ) Mirrors: [EU] [EU] [CN]
cleaned_webm.tar.xz [42M] ( Processed recordings with silence trimmed from the start and end ) Mirrors: [EU] [EU] [CN]
metadata.csv [311K] ( Pipe-separated CSV containing transcription, description, emotion, emotion intensity and path to the audio recording ) Mirrors: [EU] [EU] [CN]
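
Because metadata.csv is pipe-separated rather than comma-separated, most CSV readers need the delimiter set explicitly. The following is a minimal sketch of reading such a file with Python's standard csv module. It assumes the file begins with a header row. The column names (audio_path, utterance, description, emotion, level) and the two sample rows are hypothetical placeholders invented for illustration, not taken from the dataset, so check the real header before relying on them.

```python
import csv
import io
from collections import Counter


def load_metadata(source):
    """Read pipe-separated rows into a list of dicts keyed by the header row.

    `source` is any iterable of text lines, e.g. an open file object.
    Assumes the first line is a header; missing trailing fields become "".
    """
    reader = csv.DictReader(source, delimiter="|")
    rows = []
    for row in reader:
        rows.append(
            {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        )
    return rows


# Hypothetical sample in the assumed layout (NOT real EMNS data or column names).
SAMPLE = (
    "audio_path|utterance|description|emotion|level\n"
    "cleaned_webm/0001.webm|Once upon a time.|Spoken softly.|Happy|3\n"
    "cleaned_webm/0002.webm|The door slammed shut.|Emphasis on slammed.|Angry|7\n"
)

rows = load_metadata(io.StringIO(SAMPLE))

# Count utterances per emotion label.
emotion_counts = Counter(row["emotion"] for row in rows)
print(emotion_counts)

# With the downloaded file, the call would look like this:
# with open("metadata.csv", newline="", encoding="utf-8") as f:
#     rows = load_metadata(f)
```

Reading into plain dictionaries keeps the sketch dependency-free; the same delimiter setting applies if a dataframe library is preferred.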

About this resource:

The Emotive Narrative Storytelling (EMNS) corpus is a single-speaker British English speech dataset with high-quality labelled utterances, tailored to drive interactive experiences with dynamic and expressive language. Each audio-text pair is reviewed for artefacts and quality. Furthermore, we extract critical features using natural language descriptions, including word emphasis, level of expressiveness and emotion.

EMNS data collection tool: https://github.com/knoriy/EMNS-DCT

EMNS cleaner: https://github.com/knoriy/EMNS-cleaner

You can cite the data using the following BibTeX entry:

@Unpublished{EMNS_corpus,
  title={{EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels}},
  author={Kari, Noriy and Xiaosong, Yang and Jian, Zhang},
  month={march},
  year={2023},
}