OpenAI recently introduced a significant upgrade to its popular ChatGPT platform, unveiling GPT-4o during a live stream event featuring Mira Murati, OpenAI’s Chief Technology Officer. This enhanced model represents a notable leap forward from GPT-4, showcasing improved speed and capabilities across text, vision, and audio modalities.
GPT-4o is now available to all users at no cost, while paid users receive capacity limits up to five times higher than those of free users. OpenAI plans to roll out GPT-4o's additional features iteratively, starting with text and image capabilities and gradually adding voice and video functionality.
Unlike its predecessor, GPT-4o is fully multimodal, meaning it can process and generate content from various inputs including text, voice, and images. According to OpenAI CEO Sam Altman, developers will have access to an API that is twice as fast and half the cost of GPT-4 Turbo for experimenting with GPT-4o.
In its current state, ChatGPT's voice mode has limitations: it responds to only one prompt at a time and relies solely on audio input. Upcoming features, however, will enable ChatGPT to function as a voice assistant similar to the AI depicted in the film "Her" (2013), interacting in real time and observing its surroundings through a camera.
A demonstration video showcases the capabilities of GPT-4o, including two models engaging in natural conversation and even singing together. This upgrade marks a significant step forward in AI technology, enhancing ChatGPT’s utility and versatility across different mediums.