A binaural sound source localization method using auditive cues and vision

Abstract

A fundamental task for a robotic audition system is sound source localization. This paper addresses the localization problem in a humanoid robot context and proposes a novel learning algorithm that uses binaural cues to determine the position of a sound source. Sound signals are recorded at the ears of a humanoid robot, and binaural cues computed from them serve as inputs to a neural network. The network outputs the pixel coordinates of the sound source in a camera image. This learning approach gives good localization performance, reaching very small errors in the azimuth and elevation angle estimates.
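To make the cue-extraction step concrete, here is a minimal sketch of how two common binaural cues could be computed from a pair of ear signals. The abstract does not say which cues the paper uses, so the choice of interaural time difference (ITD, from the cross-correlation peak) and interaural level difference (ILD, from the energy ratio) is an assumption, as are the function name `binaural_cues` and its parameters. The neural network that maps such cues to pixel coordinates is not shown.

```python
import math


def binaural_cues(left, right, sample_rate, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length channels.

    Hypothetical helper for illustration only; not the paper's code.
    A positive ITD means the sound reached the left ear first.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    n = len(left)

    # ITD: find the lag (in samples) that maximizes the cross-correlation.
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        score = sum(left[i] * right[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    itd = best_lag / sample_rate

    # ILD: ratio of channel energies in decibels (eps avoids log of zero).
    eps = 1e-12
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    ild = 10.0 * math.log10((e_left + eps) / (e_right + eps))
    return itd, ild


# Synthetic example: the right channel is the left channel delayed by
# 3 samples and attenuated by half.
left = [0.0] * 32
right = [0.0] * 32
left[10] = 1.0
right[13] = 0.5
itd, ild = binaural_cues(left, right, sample_rate=16000, max_lag=8)
```

In a pipeline like the one described, a vector of such cues (possibly computed per frequency band) would form the network input, with the pixel coordinates of the source in the camera image as the target output.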

Publication
In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)