OpenAI has significantly expanded the capabilities of ChatGPT, its generative AI chatbot. ChatGPT can now hold voice conversations and see and understand images.
In short, ChatGPT can now hear and speak to its users and see what they show it.
Here’s how ChatGPT’s new features work.
Voice Conversations
Users can now hold back-and-forth spoken conversations with the assistant. Whether you’re on the move, want a bedtime story for your family, or need to settle a dinner-table debate, you can simply ask ChatGPT out loud.
To initiate voice interactions, navigate to the Settings menu in the mobile app, select “New Features,” and opt into voice conversations. Once activated, tap the headphone icon in the top-right corner of the home screen to choose from five distinct voices.
OpenAI created each of these voices in collaboration with professional voice actors so that they sound human-like. In addition, Whisper, OpenAI’s open-source speech recognition system, transcribes the user’s spoken words into text.
Images and ChatGPT
Users can now present one or more images to ChatGPT for troubleshooting, content exploration, or complex data analysis. Whether you’re attempting to diagnose why your grill won’t start, plan a meal based on the contents of your fridge, or decode a data graph for work, ChatGPT is here to assist.
To use this feature, tap the photo button to capture or select an image. On iOS or Android, tap the plus button first. You can also include multiple photos or use the drawing tool to guide your assistant.
These image capabilities are powered by multimodal versions of GPT-3.5 and GPT-4, which apply their language reasoning skills to a wide range of visual content, including photos, screenshots, and documents containing both text and images.
Safety and Responsiveness
Voice and image capabilities will roll out in phases to Plus and Enterprise users over the next two weeks. Voice is available on iOS and Android, where users opt in through the settings, while image capabilities will be available on all platforms.
These advanced capabilities also carry potential risks. For voice, OpenAI is restricting the technology to a specific use case, voice chat, and developed the voices in collaboration with voice actors to ensure authenticity and safety.
Regarding image input, OpenAI has taken measures to limit ChatGPT’s ability to analyze and make direct statements about people, in order to respect their privacy. The company expects real-world usage and user feedback to help it improve these safeguards while keeping the tool useful.