OpenAI Rolls Out Advanced Voice Mode with New Features and Voices
OpenAI has announced the rollout of its Advanced Voice Mode (AVM) for ChatGPT, a feature aimed at enhancing the conversational experience with the AI model. The update will initially be available to paying customers in the Plus and Team tiers, with broader access for Enterprise and Edu users planned to begin next week.
New Features and Voices
Advanced Voice Mode introduces several notable enhancements, including the ability for users to interrupt responses mid-sentence and for the AI to react to emotional cues in the user’s tone of voice. As OpenAI stated, “the updated version responds to your emotions and tone of voice and allows you to interrupt it mid-sentence.” This functionality is expected to deliver the more natural interaction experience that many users have been eagerly anticipating.
In addition to these interactive features, AVM also includes five new voices: Arbor, Maple, Sol, Spruce, and Vale. This brings the total number of voices to nine, nearly matching competitors like Google’s Gemini Live. The names of these new voices reflect a nature theme, aligning with OpenAI’s goal of making the AI interaction feel more organic.
Design and Accessibility
The design of AVM has also been revamped: the feature is now represented by an animated blue sphere, replacing the animated black dots shown previously. Users will be notified through a pop-up in the ChatGPT app when AVM becomes available to them. According to OpenAI, “you’ll also notice a new design for Advanced Voice Mode with an animated blue sphere.”
Access is being rolled out gradually, with all Plus and Team users expected to have the feature by the end of fall. Geographic limitations apply, however: AVM is currently unavailable in the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein.
Safety and Development
The company has emphasised its commitment to safety in deploying the new feature. OpenAI says it conducted extensive testing with external experts to ensure that AVM functions appropriately across diverse linguistic and cultural backgrounds. A spokesperson noted, “we tested the model’s voice capabilities with external red teamers, who collectively speak a total of 45 different languages, and represent 29 different geographies.”
Despite this careful approach, the development of AVM has not been without controversy. The initial demonstration of the voice mode in May drew criticism when one of the showcased voices, named Sky, was found to bear a striking resemblance to the voice of actress Scarlett Johansson. Following the backlash, OpenAI promptly removed the voice, stating that it had not intended for any of its voices to mimic those of real individuals.
Conclusion
OpenAI’s Advanced Voice Mode represents a significant step toward more human-like interaction with ChatGPT. With new voices, emotional responsiveness, and an upgraded design, users can expect a more engaging and personalised experience. As OpenAI continues the rollout, these features could set a new standard for AI-driven conversation in the industry.