OpenAI has introduced its much-anticipated Advanced Voice Mode for ChatGPT, adding a new layer to how users can interact with the AI.

This feature allows for more natural, real-time conversations, making interactions feel smoother, less robotic, and closer to speaking with another person.

[Image: Choosing a voice (uses system settings – Dutch)]

[Image: Testing the Advanced Voice Mode]

Powered by GPT-4o, this update integrates text, vision, and audio capabilities, providing faster and more emotionally responsive dialogues. The new voice features enable ChatGPT to recognize non-verbal cues like tone and pacing, and to adjust its responses accordingly.

Here’s a look at some of the key aspects of the update:

Key Features of Advanced Voice Mode

  1. Dynamic Conversations: Users can now interrupt the AI mid-sentence, giving the feel of a more fluid and natural dialogue. This real-time interaction aims to reduce the mechanical nature of typical AI responses.
  2. Emotional Responsiveness: ChatGPT can detect users’ tone and adjust its replies to be more empathetic or enthusiastic, depending on the context, making the interaction more personalized.
  3. Variety of Voices: OpenAI has expanded its selection to nine distinct voices, allowing users to choose from personalities like “Arbor” (easygoing), “Cove” (direct), and “Vale” (inquisitive), to name a few.
  4. Multimodal Functionality: The advanced voice mode leverages multimodal capabilities, meaning it can handle various types of inputs (text, speech) in a unified manner. This makes interactions more dynamic.
  5. Gradual Rollout: As of now, the Advanced Voice Mode is available for Plus and Team subscribers in the U.S., with plans to expand to Enterprise users. Unfortunately, users in the EU, Switzerland, and other regions will have to wait for this feature to be made available.
  6. Future Plans: OpenAI has teased future updates, such as video and screen-sharing functionalities, which would further enhance ChatGPT’s interactivity and usefulness in business and personal contexts.

FAQs and User Information

  • Starting a Voice Chat: Initiating a voice conversation is simple. Users can tap the voice icon on the app’s bottom-right corner. Advanced Voice Mode features a blue orb, while the standard voice mode uses a black circle.
  • Voice Options: Users can choose from nine voices, including Breeze, Ember, and Juniper, each offering a unique style and tone.
  • Background Use: Users can keep conversations running in the background while using other apps, making multitasking easier.
  • Resuming Conversations: Advanced voice chats can be resumed, although they cannot yet support multimedia elements like images. Standard voice mode, on the other hand, can continue across different devices seamlessly.
  • Privacy and Audio Retention: Audio clips from advanced voice chats are retained alongside transcriptions in your chat history. Clips are deleted 30 days after you delete the chat. Users have the option to share audio clips to improve the model but can opt out at any time.

Limitations and Access

  • Availability: Advanced Voice Mode is currently available only in the iOS and Android mobile apps (version 1.2024.261 or later). It has not yet been fully rolled out in regions such as the European Union.
  • Voice Conversations and GPTs: Advanced voice conversations are not yet available in custom GPTs. Custom GPTs do support standard voice conversations, which use the “Shimmer” voice.

Takeaways

  • Advanced Voice Mode: Allows for real-time, fluid conversations, with the ability to interrupt the AI for more dynamic interaction.
  • Emotionally Responsive AI: Adjusts tone and responses based on user input, enhancing personalization.
  • Nine Distinct Voices: A range of voices to choose from, giving users more control over the interaction style.
  • Gradual Rollout: Available to U.S. ChatGPT Plus and Team subscribers, with more regions and user tiers to come.
  • Privacy Controls: Users have control over whether their audio is retained and can opt out of sharing audio clips for model training.

References:

  • Business Insider – OpenAI ChatGPT Advanced Voice Mode (https://www.businessinsider.com/openai-chatgpt-advanced-voice-mode-heres-what-to-expect-2024-9)
  • OpenAI Help Center – Voice Mode FAQ
  • XDA Developers – ChatGPT’s Advanced Voice Mode
  • MIT Technology Review – OpenAI’s Advanced Voice Mode