world-admin/Sound-Source-Localization-for-Spoken-Conversation-Humanoid
This project proposal aims to develop the modules needed for a humanoid that can navigate autonomously in a specific environment. The humanoid should also be able to converse with patients and elderly people in their native language and carry out limited electro-mechanical activities. The main idea is to obtain an off-the-shelf (ready-made) humanoid and develop independent modules for it: Automatic Speech Recognition (ASR) interfaced to a chatbot, Text-to-Speech synthesis (TTS), Sound Source Localization (SSL), computer vision, artificial intelligence (AI), and the necessary computer networking and security applications. Although the proposed application targets health care and the aging population, it can easily be extended to other areas.

The proposed work can be broadly divided into a hardware stack and a software stack. The idea is to keep the hardware common and change only the software stack to meet different objectives. This architecture allows different applications to be built rapidly with easy customization.

Assistive robots, such as spoken-conversation humanoids that guide, assist, and look after the elderly, can be perceived in two main ways: as tools or as partners. In past and recent research, assistive humanoids designed to give physical assistance to elderly people are often conceived under the tool analogy.

The ability of conversational robots to orient themselves to face their interlocutors is essential for natural and efficient Human-Robot Interaction (HRI). Knowing where a sound comes from is an important skill for a robot because it plays a key role during interaction, for instance when calling a person over or assisting them while they do their work. This assistive humanoid can detect the direction of a user and orient itself towards him/her in a complex acoustic environment, using only voice and a 4-microphone system.
This functionality is integrated into spoken Human-Robot Interaction through dialogue modules and a theoretical architecture.
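
To illustrate how a direction can be derived from microphone signals, here is a minimal sketch of one common approach: estimating the time difference of arrival (TDOA) between a single pair of microphones by cross-correlation, then converting it to a bearing with a far-field model. This is an assumption for illustration only; the description above does not say which localization method the project uses. The function names, the speed-of-sound constant, and the parameters are all hypothetical. A 4-microphone array offers several such pairs whose estimates could be combined.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 C (assumed value)


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a.

    Brute-force time-domain cross-correlation over lags in
    [-max_lag, max_lag]; the lag with the highest score wins.
    """
    n = min(len(sig_a), len(sig_b))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_delay(lag_samples, sample_rate, mic_spacing):
    """Far-field angle of arrival in degrees, measured from broadside.

    Positive angles mean the source is nearer to microphone A.
    """
    tau = lag_samples / sample_rate              # delay in seconds
    ratio = SPEED_OF_SOUND * tau / mic_spacing   # equals sin(angle)
    ratio = max(-1.0, min(1.0, ratio))           # clamp so asin stays defined
    return math.degrees(math.asin(ratio))
```

With two sample buffers `a` and `b` recorded at 16 kHz from microphones 0.1 m apart, `bearing_from_delay(estimate_delay(a, b, 8), 16000, 0.1)` gives the angle the robot would need to turn through. The brute-force correlation keeps the sketch dependency-free; a practical system would more likely use an FFT-based correlation and a method that is more robust to noise and reverberation.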