The ARC-AGI-3 Benchmark Stumps Frontier AI
The artificial intelligence industry's favorite talking point, the imminent arrival of Artificial General Intelligence (AGI), has hit a massive roadblock. François Chollet's ARC Prize Foundation has officially released the ARC-AGI-3 benchmark, a rigorous interactive reasoning test designed to evaluate agentic intelligence.
Unlike previous standardized tests, the ARC-AGI-3 benchmark features 135 novel mini-games and nearly 1,000 levels. Agents are dropped into game-like scenarios with zero instructions, forcing them to discover rules and form strategies completely from scratch.
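The interaction loop described above can be caricatured in a few lines: the agent receives no rules, only a score after each action, and must infer what works by trying things. The environment, action names, and scoring below are invented purely for illustration and are not ARC-AGI-3's actual interface:

```python
class UnknownGame:
    """Toy stand-in for an ARC-AGI-3-style environment: the agent receives
    no instructions, only a score per action. (Hypothetical illustration,
    not the benchmark's real API.)"""

    def __init__(self, winning_action):
        self._winning_action = winning_action  # hidden rule

    def step(self, action):
        # The agent never sees the rule, only the resulting score.
        return 1 if action == self._winning_action else 0


def discover_rule(env, actions):
    """Zero-instruction exploration: try each available action once
    and keep whichever one scored best."""
    scores = {a: env.step(a) for a in actions}
    return max(scores, key=scores.get)


env = UnknownGame(winning_action="B")
best = discover_rule(env, ["A", "B", "C"])
print(best)  # → B
```

Humans do this kind of trial-and-error rule discovery effortlessly; the benchmark's premise is that frontier models, stripped of human-written scaffolding, largely cannot.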
While human testers solve 100% of these environments on first contact, frontier models have failed spectacularly on the ARC-AGI-3 benchmark. Google's Gemini Pro currently leads the pack with a score of just 0.37%. Other models fare no better: GPT 5.4 High at 0.26%, Opus 4.6 at 0.25%, and Grok-4.20 at an absolute 0%.
"Today's models only perform well when humans build elaborate scaffolding around them. The scaffolding is the human intelligence; the model is just executing it." — François Chollet on the ARC-AGI-3 benchmark.
Manus Founders Detained Amid Geopolitical Tensions
Global regulatory scrutiny is intensifying around AI acquisitions. The co-founders of AI firm Manus, Xiao Hong and Ji Yichao, have been restricted from leaving China. Multiple outlets report that authorities are reviewing the company's $2.5 billion sale to Meta. The startup had relocated most of its China-based employees to a Singapore entity to facilitate the acquisition, sparking concerns among local officials about unauthorized corporate flight.
Meanwhile, the United States is actively working to counter Chinese AI dominance. Reflection, an Nvidia-backed startup dubbed the "DeepSeek of the West", is in talks to raise $2.5 billion at a $25 billion valuation. The company aims to build a robust network of freely available, open-source AI models.
Platform Integrity and Research Economics
As automated traffic threatens to surpass human users by 2027, Reddit CEO Steve Huffman has outlined a massive crackdown on Reddit AI bots. Accounts running authorized automation will soon carry mandatory [App] labels. Sub-communities will have the power to flag suspicious users for verification using passkeys or Sam Altman's World ID scanner.
On the research front, the economics of model training are shifting. Recent analysis shows that final training runs account for only a minority of total R&D compute spending. The majority of compute burns during exploration—running experiments, generating synthetic data, and testing ideas.
| Industry Trend | Key Development | Impact |
|---|---|---|
| Open vs Closed Source | Declining monetizable spread | Open models reaching parity; frontier labs' premium value dropping. |
| Quantization | 16-bit to 8-bit efficiency | Near-zero quality penalty, allowing models to run natively on edge systems. |
| Government Advisory | New US tech panel | Mark Zuckerberg, Larry Ellison, and Jensen Huang tapped to shape AI regulation. |
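The quantization trend in the table refers to storing weights in 8 bits rather than 16, trading a tiny rounding error for roughly half the memory. A minimal sketch of symmetric int8 round-trip quantization follows; the scheme and tolerances are illustrative, not any particular framework's implementation:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map floats to int8 codes."""
    scale = np.max(np.abs(weights)) / 127.0  # one step size for the whole tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Round-to-nearest keeps the per-weight error within half a quantization step.
max_err = float(np.max(np.abs(w - w_hat)))
print(max_err <= scale / 2 + 1e-6)  # → True
```

The "near-zero quality penalty" claim rests on this bounded rounding error being small relative to the noise networks already tolerate, which is why int8 weights can run on edge hardware with little accuracy loss.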
In a milestone for automated research, Sakana AI's "AI Scientist" became the first autonomous pipeline to invent research ideas, run experiments, write papers, and successfully pass peer review at a top machine learning conference. As models become more capable, platforms are moving beyond pure training; Surge AI has reportedly reached $1.2 billion in revenue simply by managing the reinforcement learning environments where AI learns to execute real-world work.
As the industry digests the punishing ARC-AGI-3 benchmark results, the focus remains on whether scaling current architectures will ever yield genuine adaptability.