
End-to-end speaker identification research based on multi-scale SincNet and CGAN

  • 02-08-2023
  • Review

Abstract

The article discusses challenges and advances in speaker identification with deep neural networks. It introduces an end-to-end recognition system based on multi-scale SincNet and conditional generative adversarial networks (CGANs). The system addresses recognition from short speech segments, data scarcity, and overfitting by capturing important narrowband speaker features and generating high-quality synthetic samples. The method is validated through experiments on the public LibriSpeech and TIMIT datasets, where it is reported to outperform traditional methods. The article also highlights the potential of the approach on real-world data, such as the VoxCeleb corpus, and suggests directions for further optimization.
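
To make the idea of narrowband, multi-scale filtering more concrete, the following is a minimal sketch of a windowed-sinc band-pass filter bank applied at several kernel lengths. It is not the authors' implementation: this page does not describe the architecture, so the function names, the kernel lengths, the band edges, and the reading of "multi-scale" as "several kernel lengths" are all assumptions made for illustration. In SincNet the cutoff frequencies are learned parameters of the first convolutional layer; here they are fixed constants.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, kernel_len, sample_rate=16000):
    """Windowed-sinc band-pass FIR kernel between f_low and f_high (Hz)."""
    # Symmetric time axis centred on zero, in seconds.
    n = (np.arange(kernel_len) - (kernel_len - 1) / 2) / sample_rate
    # A band-pass response is the difference of two low-pass sinc filters.
    # np.sinc(x) computes sin(pi*x)/(pi*x), hence the argument 2*f*n.
    low = 2 * f_low * np.sinc(2 * f_low * n)
    high = 2 * f_high * np.sinc(2 * f_high * n)
    # The Hamming window reduces ripple caused by truncating the sinc.
    kernel = (high - low) * np.hamming(kernel_len)
    # Scale by the sample spacing so the passband gain is close to one.
    return kernel / sample_rate

def multi_scale_features(signal, bands, kernel_lens=(51, 101, 251)):
    """Filter `signal` with every band at every kernel length (hypothetical)."""
    feats = []
    for k in kernel_lens:
        for lo, hi in bands:
            # mode="same" keeps the output as long as the input signal.
            feats.append(np.convolve(signal, sinc_bandpass(lo, hi, k), mode="same"))
    return np.stack(feats)

# Usage: a 1 kHz tone, 0.1 s at 16 kHz, passed through two illustrative bands.
t = np.arange(1600) / 16000
tone = np.sin(2 * np.pi * 1000 * t)
features = multi_scale_features(tone, bands=[(800, 1200), (3000, 3400)])
```

Each row of `features` is one band at one kernel length. Shorter kernels give coarser frequency resolution but finer time resolution, which is one plausible reason to combine several scales when the input utterance is short.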

Title
End-to-end speaker identification research based on multi-scale SincNet and CGAN
Authors
Guangcun Wei
Yanna Zhang
Hang Min
Yunfei Xu
Publication date
02-08-2023
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 30/2023
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-023-08906-1