DreamVoiceConversion

📝 DreamVoiceDB: Voice Timbre Dataset

The DreamVoiceDB dataset is an integral part of our research on voice timbre and its diverse characteristics. This page provides a detailed overview of the dataset creation process, the criteria for speaker selection, the methodology for keyword annotation, and our approach to enhancing the dataset’s richness and authenticity.

Overview

Voice timbre is shaped by a myriad of factors, including age, gender, physical properties of the vocal tract and vocal cords, and perceptual characteristics. To effectively encapsulate the multifaceted nature of voice timbre, we employed a comprehensive three-stage process.

Figure: Survey Method

Figure 1: Schematic diagram of DreamVoiceDB survey method.

Details

The sample survey page used for data collection is available here: DreamVoiceDB Survey Link

Keywords and Descriptors

The DreamVoiceDB dataset utilizes a variety of keywords and descriptors to annotate voice timbre. These are grouped into relatively objective and subjective categories, along with additional aspects related to suitability for various voice-related professions.

Relatively Objective Keywords [Multiple Options, Single Choice]

Reference audio can be found in the following PowerPoint presentation: DreamVoiceDB Reference Audio

Relatively Subjective Keywords [Multiple Options, Multiple Choice]

Additional Aspects

This includes evaluating the voice’s fit for professions such as singing, acting, public speaking, and other vocations where vocal qualities are paramount. Specific examples include:

Each keyword and descriptor plays a crucial role in accurately capturing and conveying the unique characteristics of each voice sample in the dataset.
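The difference between single-choice and multiple-choice keywords can be made concrete with a small validation sketch. Everything named here is hypothetical: the `TimbreAnnotation` layout, the `validate` helper, and the option lists, which reuse only the keyword values that appear in the examples on this page. In particular, which group "Dark", "Bright", and "Strong" belong to in the actual survey is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TimbreAnnotation:
    """One annotator's response for one speaker (hypothetical layout)."""
    speaker_id: int
    # Single choice: exactly one option per question.
    objective: Dict[str, str] = field(default_factory=dict)
    # Multiple choice: any number of options.
    subjective: Set[str] = field(default_factory=set)


def validate(annotation: TimbreAnnotation,
             objective_options: Dict[str, Set[str]],
             subjective_options: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the response is valid."""
    problems = []
    for question, allowed in objective_options.items():
        choice = annotation.objective.get(question)
        if choice is None:
            problems.append(f"missing answer for {question!r}")
        elif choice not in allowed:
            problems.append(f"{choice!r} is not an option for {question!r}")
    for keyword in sorted(annotation.subjective - subjective_options):
        problems.append(f"unknown subjective keyword {keyword!r}")
    return problems


# Illustrative option lists only (assumption): the real survey may differ.
OBJECTIVE_OPTIONS = {
    "gender": {"Male", "Female"},
    "age": {"Teenager", "Adult", "Senior"},
}
SUBJECTIVE_OPTIONS = {"Dark", "Bright", "Strong"}

example = TimbreAnnotation(2562, {"gender": "Male", "age": "Adult"}, {"Dark"})
print(validate(example, OBJECTIVE_OPTIONS, SUBJECTIVE_OPTIONS))
```

Storing single-choice answers in a dictionary and multiple-choice answers in a set mirrors the two question types directly, so a missing or duplicated answer is caught by the data structure rather than by extra bookkeeping.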

Analysis and Enhancement

Generated Descriptor Examples:

| Speaker ID | WAV File | Keywords | Prompt |
| --- | --- | --- | --- |
| 2562 | File 2562 | Male, Adult, Dark | An adult male voice, with a dark timbre. |
| 30 | File 30 | Female, Teenager, Bright | A teenage girl’s voice, radiating brightness and energy. |
| 2156 | File 2156 | Male, Senior, Strong | A senior male voice, characterized by strength and power. |
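
To illustrate how a keyword set relates to a prompt, here is a minimal sketch that fills a fixed sentence template from three keywords. This is not the method used to produce the prompts above, which this page does not describe; the `keywords_to_prompt` function and its template are assumptions for illustration only. The template happens to reproduce the first example row, while the other rows are clearly phrased more freely than a single template allows.

```python
def keywords_to_prompt(gender: str, age: str, timbre: str) -> str:
    """Compose a one-sentence description from three keywords.

    The sentence template is an assumption, not the dataset's actual method.
    """
    age_word = age.lower()
    # Pick "An" before a vowel sound (approximated by the first letter).
    article = "An" if age_word[0] in "aeiou" else "A"
    return f"{article} {age_word} {gender.lower()} voice, with a {timbre.lower()} timbre."


# Keyword triples taken from the example rows above.
rows = [
    (2562, "Male", "Adult", "Dark"),
    (30, "Female", "Teenager", "Bright"),
    (2156, "Male", "Senior", "Strong"),
]
for speaker_id, gender, age, timbre in rows:
    print(speaker_id, keywords_to_prompt(gender, age, timbre))
```

A fixed template keeps prompts consistent but loses the varied wording seen in the examples, which is one reason a more flexible generation step may be preferable in practice.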

Dataset Statistics

Figures: distribution heatmaps and keyword distribution.

Access to Dataset and Analysis