This is an automated archive.
The original was posted on /r/singularity by /u/canthony on 2023-08-18 20:24:39+00:00.
Here is a graph of AI performance on benchmarks over time:
And here is a graph of those rolled up into capabilities:
There is no sign that progress is stopping (except on benchmarks that have already neared the maximum possible score). In fact, over the past five years AI performance has generally exceeded expectations and predictions.
This chart is an attempt, made in 2020, to predict future AI performance on certain benchmarks:
And here is where performance actually stood midway through 2023 (X marks actual performance):
People will point to models' failures and say "See this mistake?" or "They still aren't good at reasoning." That's to be expected: if you look at the reasoning-based metrics on that top chart, like MMLU, models still haven't quite reached human performance. But they are getting there, very, very fast.
If you still aren't convinced, I suggest you come up with your own bar. Pick a benchmark that you like, or a quantifiable metric that measures what you care about. But then set your goalposts, don't move them, and see when AI has met your criteria.
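To make that concrete, here's a minimal sketch of what "set a goalpost and don't move it" could look like in Python: record each reported score with its date against a fixed threshold, then check when the threshold is first crossed. The threshold and all scores below are made-up placeholders, except the 84.3 MATH figure mentioned in the edit at the end.

```python
from datetime import date

# Fixed target score (%) on your chosen benchmark.
# Pick it once, up front, and don't move it. (Illustrative value.)
GOALPOST = 90.0

# (date reported, score) pairs collected over time.
# Placeholder values, except the 84.3 figure from the edit below.
reported_scores = [
    (date(2022, 1, 1), 67.0),
    (date(2023, 3, 1), 78.5),
    (date(2023, 8, 15), 84.3),
]

def first_date_goalpost_met(scores, goalpost):
    """Return the first date on which a reported score met the
    goalpost, or None if it hasn't been met yet."""
    for when, score in sorted(scores):
        if score >= goalpost:
            return when
    return None

met = first_date_goalpost_met(reported_scores, GOALPOST)
if met:
    print(f"Goalpost of {GOALPOST} first met on {met}")
else:
    print(f"Goalpost of {GOALPOST} not met yet")
```

The point of fixing the threshold in advance is that you can't retroactively decide a crossed goalpost "didn't count."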
EDIT:
As pointed out by u/agcuevas, a score of 84.3 on MATH has now been reported as of Aug 15th.