OpenAI on Monday launched its latest language model, an upgrade that enables ChatGPT to hold human-like conversations with users.
Dubbed GPT-4o ("o" stands for "omni"), the latest iteration promises faster and more dynamic capabilities than the existing Voice Mode, enabling all users, including those on the free tier, to interact with it through visual, audio, and text inputs.
Users can let GPT-4o "see" through their phone camera, read written text, and detect emotional cues.
Users can also interrupt the AI assistant during interactions for more fluid and natural conversations.
In a livestream demo, GPT-4o guided a user through a real-time tutorial on how to take deep breaths, successfully picking up on the user’s emotional state when they were breathing too fast.
Another demo saw the AI model narrate a story in different styles, from dramatic to robotic tones.
GPT-4o also assisted a user in solving a math equation by analyzing the problem visually rather than simply providing an answer.
> This demo is insane.
>
> A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.
>
> Imagine giving this to every student in the world.
>
> The future is so, so bright. pic.twitter.com/t14M4fDjwV
>
> — Mckay Wrigley (@mckaywrigley) May 13, 2024
“This is the first time that we are really making a huge step forward when it comes to the ease of use,” OpenAI CTO Mira Murati said during a livestream presentation held at OpenAI's San Francisco office.
“This interaction becomes much more natural and far, far easier.”
Before Monday's livestream, rumors spread that OpenAI's next launch would involve the integration of its own search engine into ChatGPT to rival Google and Perplexity.
OpenAI CEO Sam Altman was quick to debunk these rumors in an X post: "not gpt-5, not a search engine, but we’ve been hard at work on some new stuff we think people will love! feels like magic to me."
GPT-4o's Standout Features
ChatGPT's latest model boasts a wide range of enhanced capabilities, including:
- Improved accuracy and comprehension to better grasp context, nuances, and user intent
- Faster response times for quicker processing and real-time translation
- An expanded knowledge base for more dependable coverage of current events
- Better conversational memory for personalized and contextually relevant responses
- Additional customization options to tailor its behavior and tone
These GPT-4o features will be rolled out in the coming weeks, with the model's text and image capabilities already available to both free and paid users.
OpenAI also announced the release of a desktop version for ChatGPT subscribers on Mac.
For many companies, this latest version of ChatGPT enables seamless integration of AI that has the potential to improve customer relations.
"This technology has the potential to transform the chatbots used by brands on their apps and websites, making AI interactions much more human-like, and mimicking personal shoppers or customer support representatives," William Chen, director of product management, AI and emerging tech at Agora, shared.
However revolutionary GPT-4o is, it might also deter some brands that are already hesitant about adopting the technology, according to AI startup Rembrand CMO Cory Treffiletti.
"People can get the AI to say what they want," said Treffiletti. "And brands want more control over their interaction with consumers."
Editing by Katherine 'Makkie' Maclang