Approaches for Automatic Speaker Recognition in a Binaural Humanoid Context

Abstract

This paper presents two methods for Automatic Speaker Recognition (ASkR). ASkR has been studied extensively over recent decades, but mostly in single-microphone or microphone-array contexts. Our systems are placed in a binaural humanoid context, where the signals captured by both ears of a humanoid robot are exploited to perform ASkR. Both methods use Mel-Frequency Cepstral Coefficients (MFCC) as features; one performs the classification with Predictive Neural Networks (PNN), the other with Gaussian Mixture Models (GMM). Tests are carried out on a database simulating the functioning of the human ears, and they study the influence of noise, reverberation and speaker spatial position on the recognition rate.
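
As a rough illustration of the GMM branch only, the sketch below trains one diagonal-covariance GMM per speaker with EM and assigns a test utterance to the speaker whose model gives the highest average log-likelihood. It is not the authors' implementation. The function names, the number of mixture components, the synthetic feature vectors standing in for MFCC frames, and the choice to concatenate left-ear and right-ear features into a single vector are all assumptions made for the example; the abstract does not say how the two ear signals are combined.

```python
import numpy as np

def log_component_densities(X, weights, means, variances):
    """Log of weight * diagonal Gaussian density, per frame and component."""
    diff = X[:, None, :] - means[None, :, :]            # (frames, comps, dims)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss                  # (frames, comps)

def fit_diag_gmm(X, n_components=2, n_iter=30, seed=0, floor=1e-6):
    """Fit a diagonal-covariance GMM to frames X (n_frames, n_dims) with EM."""
    rng = np.random.default_rng(seed)
    n_frames = X.shape[0]
    # Initialise means on random frames, variances on the global variance.
    means = X[rng.choice(n_frames, n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + floor, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each frame.
        log_p = log_component_densities(X, weights, means, variances)
        resp = np.exp(log_p - np.logaddexp.reduce(log_p, axis=1, keepdims=True))
        # M-step: re-estimate weights, means and (floored) variances.
        nk = resp.sum(axis=0) + floor
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, floor)
    return weights, means, variances

def average_log_likelihood(X, gmm):
    """Mean per-frame log-likelihood of frames X under a fitted GMM."""
    log_p = log_component_densities(X, *gmm)
    return float(np.mean(np.logaddexp.reduce(log_p, axis=1)))

def identify_speaker(X, speaker_models):
    """Return the speaker whose GMM scores the frames highest."""
    return max(speaker_models,
               key=lambda name: average_log_likelihood(X, speaker_models[name]))

# Synthetic stand-in for binaural MFCC frames (assumption: 3 coefficients
# per ear, left and right concatenated into one 6-dimensional vector).
rng = np.random.default_rng(1)

def fake_binaural_frames(center, n_frames):
    left = rng.normal(center, 1.0, size=(n_frames, 3))
    right = rng.normal(center + 0.5, 1.0, size=(n_frames, 3))
    return np.hstack([left, right])

models = {"A": fit_diag_gmm(fake_binaural_frames(0.0, 200)),
          "B": fit_diag_gmm(fake_binaural_frames(3.0, 200))}
print(identify_speaker(fake_binaural_frames(0.0, 50), models))
```

In a real system the synthetic frames would be replaced by MFCC vectors computed from the recorded ear signals, and the number of components would be tuned on held-out data.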

Publication
in 19th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN)