Binaural speaker recognition for humanoid robots

Abstract

This paper presents an original study of a binaural speaker identification system. The state of the art shows that, unlike monaural and multi-microphone approaches, binaural systems have received little attention for the specific task of automatic speaker recognition; they are mostly used for speech recognition or speaker localization. This study focuses on the benefits of the binaural context compared with monaural techniques, and demonstrates the value of the binaural systems typically used in humanoid robotics. The system is first tested with monaural signals and then with a binaural sensor, across a range of signal-to-noise ratios, speech durations, and speaker directions. The binaural configuration improves the recognition rate of 23 ms frames by up to 11 percent. The database is a set of audio tracks recorded for 10 speakers and filtered by head-related transfer functions (HRTFs) to obtain binaural signals in the directions of interest, for both the binaural training and testing steps. In this way, we study the sensitivity of the system to the speaker's location in an environment where at most 10 speakers are present.
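
The data preparation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the HRTFs are available as time-domain impulse responses (HRIRs) of equal length for the left and right ears, that the noise is white Gaussian, and that the sampling rate is 16 kHz. The function names `binauralize`, `add_noise`, and `frame` are hypothetical, and none of these choices is stated in the abstract.

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Filter a mono track with a left/right HRIR pair for one direction.

    Assumes both impulse responses have the same length.
    Returns an array of shape (2, len(mono) + len(hrir_left) - 1).
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target SNR in dB (assumed noise model)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def frame(signal, sample_rate, frame_ms=23.0):
    """Split a 1-D signal into non-overlapping frames of about frame_ms milliseconds."""
    n = int(round(sample_rate * frame_ms / 1000))
    n_frames = len(signal) // n
    return signal[: n_frames * n].reshape(n_frames, n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 16000                        # assumed sampling rate
    mono = rng.standard_normal(sample_rate)    # stand-in for one second of speech
    hrir_l = np.array([1.0, 0.0, 0.0])         # toy impulse responses, not measured HRIRs
    hrir_r = np.array([0.0, 0.5, 0.0])
    binaural = binauralize(mono, hrir_l, hrir_r)
    noisy = add_noise(binaural, snr_db=10.0, rng=rng)
    left_frames = frame(noisy[0], sample_rate)
    print(binaural.shape, left_frames.shape)
```

In this toy setup the right ear receives a delayed, attenuated copy of the left-ear signal, which is the kind of interaural difference a binaural front end can exploit; measured HRIRs for each direction of interest would replace the toy arrays.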

Publication
in 2010 (11th) International Conference on Control Automation Robotics & Vision