From monaural to binaural speaker recognition for humanoid robots

Abstract

This paper addresses speaker recognition in a binaural context. A binaural sensor is naturally well suited to humanoid robotics, as it requires only two microphones embedded in artificial ears. However, the state of the art shows that, in contrast to monaural and multi-microphone approaches, binaural systems have received little attention for the specific task of automatic speaker recognition; they are mostly used for speech recognition or speaker localization. This study therefore focuses on the benefits of the binaural context compared with monaural techniques. The proposed approach is first evaluated in simulation, using an HRTF database that reproduces the head shadowing effect together with a 10-speaker database. The method is then assessed on an experimental binaural 15-speaker database recorded in our own almost-anechoic room under various SNR conditions. Results show that the speaker positions used during the learning step of the proposed approach strongly influence the recognition rates.

Publication
10th IEEE-RAS International Conference on Humanoid Robots, 2010