
Echo
A 3D model and AI-based application designed for children with Apraxia of Speech.
Role
UX Designer & researcher
Duration
Nov 23 - Jan 24
Context
Overview
Echo is a mobile platform designed for children with Apraxia of Speech to practice their speech anytime, anywhere. While not a substitute for therapy, Echo helps children maintain progress between sessions, especially when in-person visits are difficult due to factors like travel or busy schedules.
Problem
In-person speech therapy for children with Childhood Apraxia of Speech (CAS) can be hard for parents to sustain, which often delays progress. Without proper intervention, CAS can lead to muteness.
Goal
The project's goal is to sustain or improve progress in children with CAS by allowing them to practice anytime, anywhere.
Outcome and results
User testing revealed the need for better feedback and a more natural 3D model, as some found it unsettling. Experts suggested improving visual design with better color contrast and adding personalization features, like naming the character, to enhance user connection. Continuous collaboration with therapists was also recommended to ensure effective practice and build confidence in children.
Understanding Childhood Apraxia of Speech (CAS)
2 out of every 1,000 children have CAS.
Without treatment, it can lead to permanent muteness.
Childhood Apraxia of Speech (CAS) is a neurological disorder in which the brain's messages to the mouth are not transmitted accurately. Picture a child trying to say "puppy" but struggling to coordinate their lips and tongue, so that the word comes out closer to "pu-ee." The child knows the intended word, but the imprecise instructions from the brain keep the muscles from making the correct movements.
The relatively low incidence doesn't diminish the problem's significance, as reduced speech practice can lead to muteness in the long term.

Research insights
The following challenges, compounded by the difficulty of maintaining therapy frequency, present a substantial barrier to consistent progress in a child's speech development.
#speech therapy challenge
Speech therapy requires a child to attend sessions 4-5 times a week.
#speech therapy challenge
Practical challenges like time constraints, costs, and distance make it difficult for working parents to meet therapy frequency requirements.
#speech therapy challenge
Teaching parents to help with home practice is common, but misidentifying speech issues can unintentionally reinforce incorrect patterns.
#how might we
help children with Childhood Apraxia of Speech (CAS) to receive quality therapy with optimal frequency (4–5 sessions a week), regardless of their location?
Devising the solution
The solution is a 3D avatar that helps children practice speech through a mobile application. It aims to leverage mobile phones as a tool for supplementary therapy, with the goal of increasing overall therapy session frequency. The design integrates Dynamic Temporal and Tactile Cueing (DTTC), a technique endorsed by speech-language pathologists (SLPs) that has a positive impact on retention rates among children. Additionally, an SLP can guide the child by selecting exercises for practice and entering them into the app for interactive sessions with the 3D avatar, fostering a dynamic and engaging learning experience.

#the four-stage process integrated with the 3D avatar
Simultaneous production
Avatar helps in word learning through imitation & multi-sensory cues.
Immediate repetition
Optimizes speech for fluency and adaptability with reduced cues.
Repetition after delay
Uses gamification and introduces a delay for adaptability.
Spontaneous production
Uses minimal cues and encourages autonomous speech generation.
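
The four stages can be read as a small progression model in which cues are reduced as the child succeeds and restored when they struggle. The sketch below is only an illustration of that idea, not Echo's actual implementation: the `WordProgress` class and the `ADVANCE_AFTER` and `FALL_BACK_AFTER` thresholds are hypothetical, and in practice an SLP would decide when a child moves between stages.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """The four stages named above, ordered from most to least cueing."""
    SIMULTANEOUS_PRODUCTION = 1
    IMMEDIATE_REPETITION = 2
    REPETITION_AFTER_DELAY = 3
    SPONTANEOUS_PRODUCTION = 4


# Hypothetical thresholds, chosen only for illustration.
ADVANCE_AFTER = 3    # consecutive correct attempts before cues are reduced
FALL_BACK_AFTER = 2  # consecutive incorrect attempts before cues are restored


@dataclass
class WordProgress:
    """Tracks which stage a child has reached for a single practice word."""
    word: str
    stage: Stage = Stage.SIMULTANEOUS_PRODUCTION
    correct_streak: int = 0
    error_streak: int = 0

    def record_attempt(self, correct: bool) -> Stage:
        """Update the streaks and move one stage up or down if a threshold is met."""
        if correct:
            self.correct_streak += 1
            self.error_streak = 0
            if (self.correct_streak >= ADVANCE_AFTER
                    and self.stage < Stage.SPONTANEOUS_PRODUCTION):
                self.stage = Stage(self.stage + 1)
                self.correct_streak = 0
        else:
            self.error_streak += 1
            self.correct_streak = 0
            if (self.error_streak >= FALL_BACK_AFTER
                    and self.stage > Stage.SIMULTANEOUS_PRODUCTION):
                self.stage = Stage(self.stage - 1)
                self.error_streak = 0
        return self.stage
```

With these assumed thresholds, three correct attempts in a row move a word from simultaneous production to immediate repetition, and two misses in a row move it back one stage.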


Data collection and feedback mechanism
Video and audio of a child's online therapy sessions are recorded and analyzed using automatic speech recognition (ASR) and facial landmarking. This data serves a dual purpose: providing constructive feedback to the child and enabling therapists to make informed adjustments to the therapy approach.
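
As a rough sketch of the therapist-facing side of this data, the example below groups recorded attempts by word and summarizes them. Everything in it is an assumption made for illustration: the `Attempt` fields, the idea of a 0.0-1.0 `mouth_shape_score` derived from facial landmarks, and the exact-match comparison against the ASR transcript are hypothetical and do not describe how Echo processes recordings.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    """One recorded speech attempt (all fields are hypothetical)."""
    target_word: str          # word the child was asked to say
    asr_transcript: str       # what the speech recognizer heard
    mouth_shape_score: float  # assumed 0.0-1.0 score from facial landmarks


def summarize_for_therapist(attempts):
    """Group attempts by target word and compute simple per-word statistics."""
    by_word = defaultdict(list)
    for attempt in attempts:
        by_word[attempt.target_word].append(attempt)

    summary = {}
    for word, items in by_word.items():
        # Count an attempt as a hit when the transcript matches the target word.
        hits = sum(a.asr_transcript.strip().lower() == word.lower() for a in items)
        summary[word] = {
            "attempts": len(items),
            "accuracy": hits / len(items),
            "mean_mouth_shape_score": mean(a.mouth_shape_score for a in items),
        }
    return summary
```

A summary like this could give a therapist a quick view of which words need more work before they adjust the exercise list.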

Knowledge of performance
A concept used by SLPs that focuses on guiding the child in positioning the speech muscles and making corrections as needed.

Knowledge of results
A concept employed by SLPs that entails giving feedback on the outcome of the child's speech attempts, which could include positive reinforcement.
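
The two feedback types can be sketched as separate messages built from a single attempt. This is a minimal illustration under stated assumptions: `build_feedback`, the 0.0-1.0 `mouth_shape_score`, the `kp_threshold` value, and the message wording are all hypothetical and not taken from Echo.

```python
def build_feedback(target_word, asr_transcript, mouth_shape_score, kp_threshold=0.7):
    """Return both feedback types for a single speech attempt.

    mouth_shape_score is an assumed 0.0-1.0 value describing how closely the
    child's mouth movement matched the avatar's; kp_threshold is hypothetical.
    """
    correct = asr_transcript.strip().lower() == target_word.lower()

    # Knowledge of results: tell the child how the attempt turned out.
    if correct:
        results_message = f"Great job! You said '{target_word}'."
    else:
        results_message = f"Nice try! Let's say '{target_word}' again."

    # Knowledge of performance: guide the movement itself when it was off.
    performance_message = None
    if mouth_shape_score < kp_threshold:
        performance_message = "Watch the avatar's lips and try to match their shape."

    return {
        "correct": correct,
        "knowledge_of_results": results_message,
        "knowledge_of_performance": performance_message,
    }
```

Keeping the two messages separate lets the app encourage the child even after a miss while still pointing at the movement that needs correcting.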



2. List of words that a child has to practice.

3. Practice session with a 3D model.

4. Feedback mechanism.


5. Games to keep children engaged throughout the practice session.
Improvements after evaluation
The solution was evaluated through expert reviews, sessions with adults affected by CAS, and a heuristic evaluation. These methods highlighted areas for improvement as well as one major setback.
#challenge during the process
Testing an AI-based product may not always yield accurate results, so the evaluation relied on testing methods that could still provide valuable insights into the solution.
#improvement
Uncanniness of the 3D model.
Addressing concerns about the 3D model's naturalness and optimizing facial landmarking for feedback are priorities for app refinement.
#improvement
Enhancing the feedback mechanism.
Speech therapists recommend enhancing feedback within the app.
#improvement
Clear feedback
Clear feedback, especially regarding voice recording, is crucial for user engagement, particularly for children.
#improvement
Connection with the 3D model.
Building a strong connection with the 3D model is crucial. It can be achieved through personal touches such as letting children name the character and using the phone's front-facing camera for customized training.
#improvement
Early feedback
Continuous communication with therapists via video calls, together with tailored exercises, ensures early correction during speech practice and fosters user confidence.
#improvement
Color contrast
Expert evaluators recommend subtle visual design improvements, emphasizing enhanced color contrast for improved quality and intuitiveness.

ECHO doesn't hit the spot

Perfect Lip-Syncing:
Achieving natural synchronization is complex due to nuanced sounds and subtle facial movements.
Personalization:
Customizing for different accents or speech impairments requires advanced AI and large datasets.
High-Quality Realism:
Realistic appearance and facial movements demand significant computational resources and 3D modeling expertise.

Technological Limitations:
Perfect lip-syncing and realistic facial movements are still challenging to achieve in real-time.
High Computational Needs:
Creating detailed 3D avatars requires significant computational resources.
Customization Issues:
Accurate personalization for different accents or speech impairments needs advanced AI and diverse datasets.

Accessibility:
Making the technology affordable and accessible for everyday use, especially in education, is still a challenge.
Cost and Accessibility:
The technology can be expensive, making it hard to offer at an affordable price for widespread use.