
Thursday, May 30, 2024


    OpenAI unveiled a new artificial intelligence model called GPT-4o.


    OpenAI, an American artificial intelligence (AI) startup, announced its latest model, GPT-4o, on Monday, saying it is far faster than prior models.

    GPT-4o, with the “o” standing for “omni,” is a step toward more natural human-computer interaction: it accepts any combination of text, audio, and image inputs and generates any combination of text, audio, and image outputs, according to the company.

    “It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation,” it stated.

    Furthermore, the company claims that GPT-4o outperforms previous models at visual and audio understanding, as well as at reasoning across audio, vision, and text in real time.

    Whereas GPT-4 loses a great deal of information because it cannot directly observe tone, multiple speakers, or background sounds, and cannot output laughter, singing, or emotion, GPT-4o processes all inputs and outputs with a single neural network.

    According to Microsoft-backed OpenAI, GPT-4o was also developed in close collaboration with more than 70 external experts in disciplines such as social psychology, bias and fairness, and disinformation, in order to identify risks introduced by the newly added modalities.
