Prasanta Kumar Ghosh and his team decode the intricacies of human speech
Gone are the days when electrical engineering just meant long trails of wires and beeping machines. In the age of Siri and Alexa, the field has grown to encompass diverse domains. One such area of research is the study of human speech. Researchers working in this area collect, process and interpret not just sound, but also other cues from a speaker. This has been the focus of the SPIRE lab at the Department of Electrical Engineering (EE).
Headed by Prasanta Kumar Ghosh, Associate Professor, the lab is keenly interested in the orchestra of organs that generate the majority of sounds in our body – the respiratory and the vocal systems. These organs not only enable us to speak, but the sounds and cues from them also contain tremendous information about their health. Combine these vocal signals with the facial expressions and body movements we make when we talk, and the result is a rich set of data that allows scientists to ask and answer some challenging research questions.
One of the challenges in speech research is evaluating a speaker’s English pronunciation, particularly when the speaker is from India, a country with a great diversity of languages. For example, as Prasanta points out, a Gujarati and a Kannadiga speaking the same English sentence will sound different. “This is because of the influence of the accent of their mother tongues, and you can tell that I am a Bengali the same way,” he says. Since English is a must for most jobs, ideally the speaker should have a neutral accent so that customers can understand what they are saying. Often, some training is needed to reduce the influence of native language accents, he adds. “English Gyani”, a software tool being developed in the SPIRE lab, aims to do just that. Using automatic speech recognition algorithms, it acts as a personalised tutor that gives users interactive feedback based on their voice recordings.
Another major research focus is tracking the movements of body parts involved in articulating speech – the jaws, lips, teeth and tongue. For this, the researchers use a specialised device called the Electromagnetic Articulograph. Special sensors placed on these body parts record where and how they move during speech. This data can help doctors detect, diagnose and monitor several neurodegenerative disorders, including Amyotrophic Lateral Sclerosis (ALS) and Parkinson’s Disease (PD), in which difficulty speaking is a typical sign of disease progression. Towards this goal, the researchers have recently launched an app called “Neurocompanion”, which enables the telemonitoring of patients using their voice recordings. This research is being conducted in collaboration with the National Institute of Mental Health and Neurosciences (NIMHANS) and is funded by the Department of Science and Technology, Government of India.
The SPIRE lab also collaborates with the All India Institute of Speech and Hearing (AIISH) in Mysore to develop tools for several types of voice therapy. One goal is to quantify the extent of damage to the voice box, using data from high-speed videos of the vocal fold vibrations while a person speaks. Another is to develop a tool to help people with severely damaged or absent voice boxes, by converting their whisper-like speech into normal-sounding speech.
Prasanta’s interest in pursuing a career in this area was seeded when he was a Master’s student at IISc. “Here, I got a flavour for doing research for the first time. And the teachers were really great, in terms of not just teaching, but helping us to see beyond what was taught in class,” he says. In particular, TV Sreenivas from the Department of Electrical Communication Engineering and YV Venkatesh from EE at IISc inspired him greatly, he adds.
After IISc, brief stints as a researcher at Microsoft and IBM helped him choose academia over industry. “Everything else was great [in the industry], but what was lacking was the freedom to define my own research path. In a company, I can’t really create my own tree of research, with all of the branches that I’m interested in. I wanted to create a research environment where I can solve problems that are impactful for society,” he says.
Prasanta has also received recognition as a teacher, including the Centre for Excellence in Teaching award from the EE department at the University of Southern California (USC) where he did his PhD, and the Professor Priti Shankar Teaching Award from IISc in 2017. He believes that the goal of teaching should be to train students to develop the ability to weave new ideas out of the concepts that they are taught. His teaching strategies are strongly influenced by those of Bart Kosko, who taught him at USC. He also encourages his students to give talks at other institutions, or intern at companies to get a feel for the work culture there.
Prasanta strongly believes that to mature as a researcher, students should be involved in pursuing a variety of questions. “The depth and breadth of research … that’s important. While the breadth helps a student to gain and share knowledge across different problems, the depth helps him or her to dig deeper into their own research,” he says. For that purpose, he ensures that each student is involved part-time in other projects as well. This also helps build a social network within the lab; if a student is stuck on a particular topic, they are in touch with their colleagues through meetings for various other projects that they are involved in, and don’t feel isolated or depressed, Prasanta explains.
“Pre-pandemic, we went on a lot of lab outings – had lunch outside and did other activities together. That connects all of us as a family, and helps us to grow socially as well,” he says. “Otherwise, we will all become like robots.”