Partial Additive Speech

This is a PyTorch implementation of the Partial Additive Speech (PAS) data augmentation method.

Title - PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification (poster presentation at CKAIA2023)
Authors - Wonbin Kim, Hyun-seo Shin, Ju-ho Kim, Jungwoo Heo, Chan-yeong Lim, Ha-Jin Yu

Abstract

Background noise reduces speech intelligibility and quality, making speaker verification (SV) in noisy environments a challenging task. To improve the noise robustness of SV systems, additive noise data augmentation method has been commonly used. In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environments. The experimental results demonstrate that PAS outperforms traditional additive noise in terms of equal error rates (EER), with relative improvements of 4.64% and 5.01% observed in SE-ResNet34 and ECAPA-TDNN. We also show the effectiveness of proposed method by analyzing attention modules and visualizing speaker embeddings.
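For context, the conventional additive-noise augmentation that the abstract uses as its baseline can be sketched as mixing a noise clip into a speech clip at a chosen signal-to-noise ratio (SNR). This is a minimal, dependency-free illustration of that baseline only, not the PAS method itself and not the code in this repository; the function name `mix_at_snr` and the list-of-floats waveform representation are assumptions made for the example.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise so that the mixture has the given SNR in dB.

    Hypothetical helper for illustration; waveforms are plain lists of floats.
    """
    if len(noise) < len(speech):
        raise ValueError("noise clip must be at least as long as the speech clip")
    noise = noise[:len(speech)]  # crop the noise to the speech length

    # Average power of each signal.
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        return list(speech)  # silent noise clip: nothing to add

    # Choose the gain so that speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    speech = [math.sin(0.05 * i) for i in range(16000)]    # stand-in for a speech clip
    noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]  # stand-in for a MUSAN clip
    noisy = mix_at_snr(speech, noise, snr_db=10.0)
    print(len(noisy))
```

In a real pipeline the same arithmetic would normally be applied to tensors rather than Python lists. How PAS differs from this baseline is described in the paper.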

Preprocessing

Before running training and testing, the datasets must be prepared. To do so, edit preprocess.py and run it. The only change required is setting the path variables so that they point to the VoxCeleb1 dataset and the MUSAN noise data.

Run

1. With Docker
Change the paths in launch.sh and run it.

/data/vox1_musan - root folder of the MUSAN noise data
/data/voxceleb1 - root folder of the VoxCeleb1 data. This folder should contain the subdirectories train and test.
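Before launching, it may help to confirm that the folders are laid out as described above. The following is a small sketch of such a check; `check_dataset_layout` is a hypothetical helper that is not part of this repository, and the two paths in the usage example are simply the ones listed above, so substitute your own.

```python
from pathlib import Path


def check_dataset_layout(voxceleb1_root, musan_root):
    """Return a list of problems found; an empty list means the layout looks right.

    Hypothetical helper: it only checks that the directories exist,
    not that they contain valid audio.
    """
    problems = []
    vox = Path(voxceleb1_root)
    musan = Path(musan_root)

    if not musan.is_dir():
        problems.append(f"MUSAN root not found: {musan}")

    if not vox.is_dir():
        problems.append(f"VoxCeleb1 root not found: {vox}")
    else:
        # The VoxCeleb1 root is expected to contain train and test subdirectories.
        for sub in ("train", "test"):
            if not (vox / sub).is_dir():
                problems.append(f"missing VoxCeleb1 subdirectory: {vox / sub}")

    return problems


if __name__ == "__main__":
    for problem in check_dataset_layout("/data/voxceleb1", "/data/vox1_musan"):
        print(problem)
```

If the script prints nothing, both roots exist and the VoxCeleb1 root has the expected train and test subdirectories.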

2. Without Docker
Change the paths in config.py directly and run main.py.

Citation

@misc{kim2023pas,
      title={PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification}, 
      author={Wonbin Kim and Hyun-seo Shin and Ju-ho Kim and Jungwoo Heo and Chan-yeong Lim and Ha-Jin Yu},
      year={2023},
      eprint={2307.10628},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
