GPT-4o System Card

Abstract

GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models. In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House, we are sharing the GPT-4o System Card, which includes our Preparedness Framework evaluations. In this System Card, we provide a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories, focusing on speech-to-speech while also evaluating text and image capabilities, and measures we've implemented to ensure the model is safe and aligned. We also include third-party assessments on dangerous capabilities, as well as discussion of potential societal impacts of GPT-4o's text and vision capabilities.

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| optical-character-recognition-on-ocrbench-v2-english | GPT-4o | Accuracy: 47.6 |
| spatial-reasoning-on-6-dof-spatialbench | GPT-4o | Orientation-abs: 25.8; Orientation-rel: 44.2; Position-abs: 28.4; Position-rel: 49.4; Total: 36.2 |
| video-question-answering-on-tvbench | GPT4o 8 frames | Average Accuracy: 39.9 |
| visual-question-answering-vqa-on-vlm2-bench | GPT-4o | Average Score on VLM2-bench (9 subtasks): 60.36; GC-mat: 37.45; GC-trk: 39.27; OC-cnt: 80.62; OC-cpr: 74.17; OC-grp: 57.50; PC-VID: 66.75; PC-cnt: 90.50; PC-cpr: 50.00; PC-grp: 47.00 |
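
As a quick sanity check on the figures listed above, the sketch below recomputes the VLM2-bench average from its nine subtask scores. It assumes the reported average is an unweighted mean, which the arithmetic supports but the page does not state. The variable names are illustrative and are not part of any benchmark tooling. The same check on the 6-DoF SpatialBench sub-scores gives a plain mean of about 36.95 rather than the reported Total of 36.2, so that Total is presumably aggregated differently (for example, weighted by question count, which is only a guess).

```python
from statistics import mean

# VLM2-bench subtask scores for GPT-4o, copied from the listing above.
vlm2_subtasks = {
    "GC-mat": 37.45,
    "GC-trk": 39.27,
    "OC-cnt": 80.62,
    "OC-cpr": 74.17,
    "OC-grp": 57.50,
    "PC-VID": 66.75,
    "PC-cnt": 90.50,
    "PC-cpr": 50.00,
    "PC-grp": 47.00,
}

# Assumption: the reported average is the unweighted mean of the nine subtasks.
vlm2_average = round(mean(vlm2_subtasks.values()), 2)

# 6-DoF SpatialBench sub-scores for GPT-4o, also from the listing above.
spatial_subscores = {
    "Orientation-abs": 25.8,
    "Orientation-rel": 44.2,
    "Position-abs": 28.4,
    "Position-rel": 49.4,
}

# Plain mean of the four sub-scores, for comparison with the reported Total.
spatial_plain_mean = mean(spatial_subscores.values())

print(f"VLM2-bench recomputed average: {vlm2_average}")
print(f"SpatialBench plain mean: {spatial_plain_mean:.2f} (reported Total: 36.2)")
```

The recomputed VLM2-bench value matches the reported 60.36, which is consistent with equal weighting across the nine subtasks.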
