At its DevDay 2024 developer event, OpenAI presented the new Realtime API, which supports natural speech-to-speech conversations in six different voices.
During its developer event in San Francisco, OpenAI unveiled four major API updates for developers. One of the most important is the Realtime API, which supports natural speech-to-speech conversations based on six preset voices and is available to developers as a public beta.
Realtime API
The Realtime API lets developers build functionality into their own applications similar to ChatGPT’s Advanced Voice Mode, with responses spoken in one of six preset voices.
According to OpenAI, the Realtime API streamlines the process of building voice assistants. Previously, developers had to chain separate models for speech recognition, text processing, and text-to-speech conversion; with the new API, a single call handles the entire pipeline.
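To make the idea more concrete, here is a minimal Python sketch of a Realtime API session over WebSocket. The endpoint URL, the `OpenAI-Beta` header, the event names, and the "alloy" voice are taken from OpenAI's beta launch materials and may change during the public beta; treat this as an assumption-laden outline, not a definitive client.

```python
# Minimal sketch of a text-in / audio-out exchange over the Realtime API (beta).
# Assumptions: the wss endpoint, event names, and "alloy" voice match the
# launch documentation; check the current docs before relying on them.
import asyncio
import json
import os

import websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Configure the session with one of the preset voices.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {"voice": "alloy", "modalities": ["text", "audio"]},
        }))
        # Ask the model to respond; audio is streamed back as deltas.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"instructions": "Say hello to the listener."},
        }))
        async for message in ws:
            event = json.loads(message)
            if event["type"] == "response.audio.delta":
                ...  # event["delta"] holds a base64-encoded audio chunk
            elif event["type"] == "response.done":
                break

asyncio.run(main())
```

A production client would additionally stream microphone input to the server and play the audio deltas back as they arrive; the point here is only that one WebSocket session replaces the former speech-to-text, text, and text-to-speech round trips.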
OpenAI also introduced two features to help developers balance performance and cost when building AI applications: Model Distillation, which lets developers refine smaller models using the output of more advanced models, and Prompt Caching, which speeds up inference by reusing frequently repeated prompts. Finally, Vision Fine-tuning allows developers to customize GPT-4o with their own images and text.
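As a rough illustration of how Prompt Caching affects request structure, the Python sketch below keeps the long, unchanging instructions at the front of every request so repeated calls share the same prefix. The automatic prefix-matching behavior is an assumption based on OpenAI's announcement, and "ExampleCorp" and the helper function are hypothetical placeholders.

```python
# Sketch: structuring requests so Prompt Caching can take effect.
# Assumption (per OpenAI's announcement): caching is applied automatically
# to repeated prompt prefixes, so the static part should come first and the
# per-request part last. "ExampleCorp" is a hypothetical placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

STATIC_INSTRUCTIONS = (
    "You are a support assistant for ExampleCorp. "
    "Answer concisely and cite the relevant help-center article."
    # ...imagine many more lines of unchanging policy text here
)

def answer(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": STATIC_INSTRUCTIONS},  # cacheable prefix
            {"role": "user", "content": question},               # varies per call
        ],
    )
    return response.choices[0].message.content

# Repeated calls reuse the cached prefix, which should reduce latency and cost.
print(answer("How do I reset my password?"))
```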
Developer event
OpenAI’s invitation-only annual developer event took place in San Francisco on Monday. This year, CEO Sam Altman opted for a global approach: instead of a single conference, the one-day event is being held in several locations. The next stops are London (October 30) and Singapore (November 21).