Top performing AI systems in coding, math, and language-based knowledge tests

Coding performance is measured with the APPS benchmark, math performance with the MATH benchmark, and performance on language-based knowledge tests with the MMLU benchmark.
