Microsoft announced the launch of GPT-4o, OpenAI’s new flagship model, on Azure AI. This groundbreaking multimodal model integrates text, vision, and audio capabilities, setting a new standard for generative and conversational AI experiences. GPT-4o is available now in Azure OpenAI Service, where you can try it in preview.
GPT-4o is a step towards much more natural human-computer interaction. It accepts any combination of text, audio, image, and video as input and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.
GPT-4o marks a shift in how AI models handle multimodal inputs, delivering a richer, more engaging user experience by seamlessly combining text, images, and audio.
Azure OpenAI Service customers can explore GPT-4o’s extensive capabilities through a preview playground in Azure OpenAI Studio, starting in two US regions. This initial release focuses on text and vision inputs to provide a glimpse of the model’s potential, paving the way for further capabilities such as audio and video.
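For developers who want to experiment beyond the playground, a GPT-4o deployment can also be reached through the standard Azure OpenAI chat completions API. Below is a minimal sketch in Python using the official openai package; the endpoint, API key, API version, and deployment name are placeholders that depend on your own Azure OpenAI resource.

```python
# Minimal sketch: sending a combined text-and-image prompt to a GPT-4o
# deployment on Azure OpenAI Service. The endpoint, key, API version, and
# deployment name are placeholders; substitute values from your own resource.
import os
from openai import AzureOpenAI  # pip install "openai>=1.0"

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com/
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # assumed; use the version listed for your resource
)

response = client.chat.completions.create(
    model="gpt-4o",  # the name of your GPT-4o deployment, not the model family
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is happening in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

Note that the initial preview covers text and vision inputs only, consistent with the playground release described above; audio and video inputs are expected in later updates.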
GPT-4o is engineered for speed and efficiency. Its advanced ability to handle complex queries with minimal resources can translate into cost savings and performance gains.
The introduction of GPT-4o opens numerous possibilities for businesses through enhanced customer service, advanced analytics, and content innovation across various sectors.
Microsoft is eager to share more about GPT-4o and other Azure AI updates at Microsoft Build 2024 to help developers further unlock the power of generative AI.