ARC-AGI-2 Leaderboard
o3 is qualitatively different to all other models …
If you have AGI (Year) in your flair, what specific AGI definition are you using?
o3 scores <5% on ARC-AGI-2 (but the test looks ... harder?)
Truth
New/updated models by Google soon
DeepSeek releases new V3 checkpoint (V3-0324)
The mysterious "Halfmoon" image generation model was revealed to be made by a company called Reve and gets #1 in the Artificial Analysis text-to-image leaderboard
New DeepSeek V3 vs R1 (first is V3)
DeepSeek V3 model has completed a minor version upgrade. Early testing from some users shows incredible performance in mathematics and frontend tasks as well
How long until OpenAI releases their writing model?
All usage tiers now have UNLIMITED Sora generations, removing the credit system entirely
My kid is never going to grow up smarter than AI...and that'll be natural - Sam Altman
Which one are you going to 1st for a highly technical research deep dive?
Qwen2.5-Omni Incoming? Hugging Face Transformers PR 36752
o1-pro sets a new record on the Extended NYT Connections benchmark with a score of 81.7, easily outperforming the previous champion, o1 (69.7)
Why do people often make blanket claims about AI just because they dislike particular aspects of it?
OpenAI released GPT-4.5 and o1-pro via their API and it looks like a weird decision.
"Sam Altman is probably not sleeping well" - Kai-Fu Lee
Midjourney is surveying their user base on Discord about their upcoming video generation model
Creative writing under 15B
With the insane prices of recent flagship models like GPT-4.5 and o1-pro, is OpenAI trying to limit DeepSeek's use of its API for training?
o1-pro now has an API - what are your predictions for its LiveBench score?
o1-pro has arrived
This 1 question cost me $1.74 with o1-pro in the API...