OpenAI’s Advanced Voice Feature Is Now Available to Select Users
OpenAI has begun the phased rollout of its new voice assistant feature for ChatGPT, marking a significant enhancement in AI interactivity. The alpha version of this feature, named Advanced Voice Mode, is now available to a select group of ChatGPT Plus subscribers, with plans to extend access to all premium users in the autumn of 2024.
Controversy Over Voice Similarity
The introduction of Advanced Voice Mode has been highly anticipated following a controversial demo in May, which showcased a voice option named “Sky.” The voice drew significant backlash for its striking resemblance to that of actress Scarlett Johansson, who voiced the AI assistant in the film “Her.” Johansson said the voice sounded eerily similar to her own. She sought legal counsel, saying that OpenAI CEO Sam Altman had previously approached her about using her voice.
OpenAI has since clarified that the voice used in the demo was not Johansson’s but belonged to a different professional actress. Nevertheless, the company decided to exclude the “Sky” voice from the alpha release and apologised to Johansson.
Features and Privacy Measures
Advanced Voice Mode is built on OpenAI’s GPT-4o model and produces hyper-realistic audio responses with latency low enough for real-time conversation. It addresses earlier obstacles to natural AI interaction; for example, it can handle being interrupted mid-sentence.
To prevent misuse, OpenAI has limited the system to four preset voices: Juniper, Breeze, Cove, and Ember. These voices were created in collaboration with paid voice actors. The company has implemented measures to ensure that ChatGPT cannot impersonate specific individuals or public figures, aiming to avoid the creation of deceptive deepfakes. Additionally, OpenAI has added filters to block requests for generating copyrighted audio, responding to the growing concerns about AI’s impact on intellectual property.
Testing and Future Plans
OpenAI has tested GPT-4o’s voice capabilities with more than 100 external testers who collectively speak 45 languages. A detailed report on these safety efforts will be released in early August. The company is rolling out the feature gradually so that it can monitor usage and ensure compliance with privacy and copyright standards.
The broader rollout of Advanced Voice Mode will occur in the fall as OpenAI continues refining its technology and addressing emerging issues.
Conclusion
OpenAI’s launch of the Advanced Voice Mode for ChatGPT marks a significant step forward in AI communication technology. Despite early controversies surrounding voice similarities and intellectual property concerns, the company’s new feature aims to enhance user interaction while incorporating rigorous privacy and security measures.