Researchers in 麻豆村's Language Technologies Institute have created an AI tool that could help fix a child's speech and preserve their identity and personality by allowing them to hear the reconstructed speech in their own voice.
This visual shows the original sound wave of a child's speech alongside the AI-reconstructed version. The new tool lets children with speech disorders hear corrected speech that still sounds like their own voice.

Speaking Their Language: 麻豆村's New AI Lets Kids Hear Their Own Corrected Voice

Media Inquiries
Aaron Aupperlee, School of Computer Science

Children with speech disorders, such as a lisp, often struggle to be understood by family, teachers and friends, making school situations and everyday communication harder. And with too few speech-language pathologists nationwide to meet the demand, many kids don't get the consistent support they need.

Researchers at 麻豆村's Language Technologies Institute are working to fill that gap with an AI tool designed specifically for children. Unlike most speech-reconstruction technologies, which are typically built for adults, 麻豆村's system generates corrected audio using the child's own voice.

That distinction matters: The researchers said children learn speech targets more effectively when they can hear how they would sound saying the word correctly, rather than listening to an adult or neutral synthetic voice.

Children's Reconstructed Speech for Speech Sound Disorders (ChiReSSD) combines machine learning with human speech to generate audio clips of corrected speech that sound like the child. For example, if a child struggles with pronouncing double-r words, like "curry," the tool can generate an audio clip of that child saying the word correctly using only a clip of the child talking and text input.

"The potential clinical applications are really significant," said David Mortensen, an assistant research professor in 麻豆村's Language Technologies Institute (LTI). "The idea that a child could hear how they would say something in their voice, except with the sound of the disordered pronunciation removed, could be transformative."

Mortensen's interest in creating technology to assist children with speech disorders started with his daughter. He said the speech-language pathologist who worked at her school was so overloaded that his daughter was only seen once or twice. Mortensen knew that his daughter would have benefited from technologies that could help speech-language pathologists treat children more efficiently.

Professor Carlos Busso and Ph.D. student Karen Rosero, both in the LTI, see ChiReSSD as a critical step toward developing both audio and video tools that can address children's speech disorders. While ChiReSSD focuses on audio generation, Rosero and Busso developed video-based AI tools in previous work to analyze speech articulation after cleft lip and palate repair surgery.

"The big idea we are working toward is to generate speech that sounds like the kids and generate facial images that look like the kids," Busso said. "These audio and video clips can be combined to compare and contrast disordered and reconstructed speech. Then, we can localize the errors the children are making and create more targeted interventions, like particular words that address the specific speech issue."

ChiReSSD only needs an audio clip of the child to generate reconstructed speech, and it can be of the child saying anything. The tool separates a child's voice identity (their pitch or acoustic patterns) from the phonetic content of their speech, or what they're saying. The AI-based model learns from speech representations of the child's vocal identity. The system then identifies and corrects the mispronunciations based on the phonetic content. Finally, using the understanding of the child's vocal identity and a text input, like the words "chicken curry" or "rabbit," ChiReSSD generates a corrected audio clip that sounds like the child saying these target words.
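The data flow described above can be illustrated with a toy sketch. This is not the ChiReSSD implementation, and it involves no real audio or machine learning: every name here (`VoiceIdentity`, `Utterance`, `TOY_LEXICON`, `separate`, `reconstruct`, `find_mispronunciations`) is a hypothetical stand-in, phonemes are plain strings, and the comparison step assumes substitution errors only. It shows just the shape of the idea: keep the speaker identity, take the target sounds from the text input, and compare the attempt with the reconstruction.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceIdentity:
    """Hypothetical stand-in for a learned representation of the child's voice."""
    mean_pitch_hz: float


@dataclass(frozen=True)
class Utterance:
    """Toy utterance: who is speaking plus which sounds are spoken."""
    identity: VoiceIdentity
    phonemes: tuple[str, ...]


# Assumed toy pronunciation lexicon; a real system would derive targets from text.
TOY_LEXICON = {"rabbit": ("r", "ae", "b", "ih", "t")}


def separate(utterance: Utterance) -> tuple[VoiceIdentity, tuple[str, ...]]:
    """Split an utterance into voice identity and phonetic content."""
    return utterance.identity, utterance.phonemes


def reconstruct(identity: VoiceIdentity, text: str) -> Utterance:
    """Pair the child's voice identity with the target sounds for the text input."""
    return Utterance(identity, TOY_LEXICON[text])


def find_mispronunciations(spoken, target):
    """List (position, spoken, target) mismatches; assumes equal-length sequences."""
    if len(spoken) != len(target):
        raise ValueError("this toy comparison handles substitutions only")
    return [(i, s, t) for i, (s, t) in enumerate(zip(spoken, target)) if s != t]


child = VoiceIdentity(mean_pitch_hz=280.0)
attempt = Utterance(child, ("w", "ae", "b", "ih", "t"))  # "wabbit"
identity, spoken = separate(attempt)
corrected = reconstruct(identity, "rabbit")
errors = find_mispronunciations(spoken, corrected.phonemes)
```

In this toy run, `corrected` keeps the same `identity` as the child's attempt while carrying the target sounds, and `errors` points at the single substituted sound, which mirrors the compare-and-localize idea Busso describes.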

"Psychological studies demonstrate that having the same voice as a reference benefits the patient," Rosero said. "For children, if the text-to-speech tool provides an adult or a standard plain voice, it may not be as beneficial as having their own voice as a reference for what to target in pronunciation."

Busso said this work makes significant strides in audio speech correction. The team's next step will be to focus on making the same impact in video.

Along with the LTI researchers, the team included Eunjung Yeo, a visiting scholar previously in SCS; Courtney Van'T Slot, a speech-language pathologist; and Rami Hallac, an associate professor from the University of Texas Southwestern Medical Center.

Key Takeaways

  • 麻豆村 researchers created a speech reconstruction tool designed specifically for children's disordered speech.
  • The tool separates a child's pitch or acoustic patterns from what they say to correct mispronunciations in a voice that sounds like their own.
  • The work could improve diagnosis and targeted interventions for pediatric speech disorders.


David Mortensen

Carlos Busso
