The week of 24 March 2026 will be studied in AI history. Not because AGI was achieved — the data suggests it wasn't — but because the gap between what AI companies say and what AI systems can actually do became impossible to ignore.

Here is exactly what happened, why it matters, and what Indian developers and businesses should take from it.

What Jensen Huang actually said

On the Lex Fridman Podcast, the Nvidia CEO was asked how close we are to Artificial General Intelligence. He didn't pause. He said: "I think it's now. I think we've achieved AGI."

He was careful to clarify he wasn't referring to a secret Nvidia project — he was describing the collective capability of the current AI ecosystem. When Fridman pushed further and asked whether AI could run an entire company autonomously, Huang said it was "possible."

"I think it's now. I think we've achieved AGI."
— Jensen Huang, Nvidia CEO, Lex Fridman Podcast, March 2026

Huang was not alone. Sam Altman has said OpenAI has "basically built AGI." Microsoft is already marketing a lab focused on what comes after AGI. Arm named a new data centre chip the "AGI CPU." The hype machine was running at full speed.

Then ARC-AGI-3 dropped

Two days after Huang's statement, the ARC Prize Foundation — led by AI researcher François Chollet and engineer Mike Knoop — released ARC-AGI-3. It is the most rigorous AGI test ever built. And it was specifically designed so that AI labs cannot cheat their way to a high score.

Previous benchmarks died quickly. Labs would throw compute and training data at them until the scores were saturated. ARC-AGI-1 fell to test-time training. ARC-AGI-2 lasted about a year before being cracked. Version 3 was built differently: 135 original interactive game environments, created from scratch by an in-house game studio, with 110 kept entirely private. There is no dataset to memorise. You cannot brute-force novel game logic you have never seen.

The results:

ARC-AGI-3 Benchmark Results · March 2026
Every frontier AI model. One benchmark. The verdict.
Humans (baseline): 100%
Google Gemini 3.1 Pro (best AI): 0.37%
OpenAI GPT-5.4: 0.26%
Anthropic Claude Opus 4.6: 0.25%
xAI Grok-4.20: 0.00%

To be clear about what ARC-AGI-3 is testing: it is not trivia. It is not coding. It is not PhD-level questions that can be memorised. The benchmark tests genuine generalisation — the ability to encounter a completely novel situation and figure it out, just as any human would. That is the "G" in AGI.

So has AGI been achieved?

The honest answer: it depends entirely on how you define it — and that is precisely the problem.

"AGI" has no agreed definition. The term is being stretched until it means whatever is commercially convenient. Anthropic President Daniela Amodei called AGI an "outdated" term. Anthropic CEO Dario Amodei said he has "always disliked" it. Microsoft CEO Satya Nadella called any self-declared AGI achievement "benchmark hacking."

Chollet's position is simpler and more useful: if a normal human with no instructions can do something, and your AI system cannot, you do not have AGI — you have very expensive autocomplete that needs a lot of help.

By that definition — which is the most intellectually honest one — no, AGI has not been achieved. The gap between human generalisation and current AI is not narrowing slowly. On ARC-AGI-3, it is a chasm: 100% vs 0.37%.

LLMTools.in Verdict
AI in 2026 is genuinely remarkable — and genuinely not AGI. The models available today are the most powerful tools ever built for specific, well-defined tasks. But true generalisation — the ability to encounter anything novel and figure it out — remains a human capability. The gap is wider than the headlines suggest.

What this means for Indian developers and businesses

The AGI debate might feel abstract, but its practical implications for India are concrete:

Do not let the hype change your roadmap. Products built on today's AI — Claude, GPT, Gemini — are powerful but narrow. They perform brilliantly on well-defined tasks and struggle badly on novel ones. Design your systems accordingly. Do not assume the next model release will magically solve your hardest problems.
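The "powerful but narrow" point above can be sketched as a simple routing guardrail: send well-defined task types to the model, and escalate anything novel to a human instead of assuming the model will cope. Everything here — the task names, `call_llm`, `escalate_to_human` — is an illustrative assumption, not a real API.

```python
# Hypothetical guardrail for systems built on today's narrow-but-powerful models.
# Task types the team has tested and trusts the model on:
KNOWN_TASKS = {"summarise_contract", "draft_email", "classify_ticket"}

def call_llm(task_type: str, payload: str) -> str:
    """Placeholder for a real model call (Claude, GPT, Gemini, etc.)."""
    return f"llm:{task_type}"

def escalate_to_human(task_type: str, payload: str) -> str:
    """Placeholder for a human-review queue."""
    return f"human:{task_type}"

def route(task_type: str, payload: str) -> str:
    """Well-defined tasks go to the model; novel ones go to a person."""
    if task_type in KNOWN_TASKS:
        return call_llm(task_type, payload)
    # Novel situation: exactly where ARC-AGI-3 shows current models fail.
    return escalate_to_human(task_type, payload)
```

The design choice is the point: the system's scope is an explicit allowlist you extend deliberately, not an open-ended bet that the model generalises.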

Benchmarks matter more than marketing. When AI companies announce a new model, look for third-party benchmark results — especially on benchmarks the companies did not train against. ARC-AGI-3 is the gold standard. A model that scores 0.37% on it is still extraordinarily useful. But it is not AGI.

India's AI advantage is in application, not AGI research. The country building AGI first will likely be the US or China. India's competitive advantage is in applying current AI — which is already transformative — to sectors where the country has structural depth: legal, healthcare, education, agriculture, vernacular language. That opportunity is enormous and does not require waiting for AGI.

Watch Anthropic closely. The same week as the AGI debate, Anthropic confirmed it is testing a new model described internally as a "step change" in capabilities. If that model scores meaningfully higher on ARC-AGI-3, it will be the most significant AI story of the year.

The best AI tools available right now

While the AGI debate continues, these are the frontier tools Indian developers and businesses are using today:

Claude by Anthropic
Best for long-context reasoning, document analysis and nuanced writing. Scores 0.25% on ARC-AGI-3 — and still the best tool for most professional tasks.
Try Claude →
Perplexity AI
Best for staying up to date on the AGI debate and AI news in real time. Cited, source-linked answers sharply reduce (though do not eliminate) hallucinations on factual queries.
Try Perplexity →
Google Gemini
Led ARC-AGI-3 at 0.37% — still the top-scoring frontier model. Best for multimodal tasks and Google Workspace integration.
Try Gemini →
OpenRouter
Access all frontier models through one API. When a model that actually dents ARC-AGI-3 is released, you will be positioned to switch instantly.
Try OpenRouter →
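As a concrete sketch of the OpenRouter point: its API is OpenAI-compatible, so switching frontier models is a one-string change. The endpoint below is OpenRouter's real chat-completions URL; the model identifiers are illustrative placeholders (check openrouter.ai/models for current IDs), and the request is only constructed, not sent.

```python
import json

# OpenRouter's OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> tuple[dict, bytes]:
    """Assemble headers and JSON body for a chat completion call.
    Swapping models later means changing only the `model` string."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # e.g. "google/gemini-pro-1.5" — placeholder ID
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

# Same code path, different model: one string changes.
headers, body = build_request("google/gemini-pro-1.5", "Summarise this clause.", "sk-...")
```

Keeping the model name in config rather than code is what makes the "switch instantly" claim practical.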

Frequently asked questions about AGI

What is Artificial General Intelligence (AGI)?
AGI refers to an AI system capable of performing any intellectual task a human can — reasoning, learning, adapting, and applying knowledge across completely new domains. Unlike today's narrow AI, which excels only at specific tasks it was trained on, AGI would generalise across all tasks without prior training.
Has AGI been achieved in 2026?
By the most rigorous definition — the ability to generalise to novel tasks as any human would — no. ARC-AGI-3 results show every frontier AI model scoring below 1% on tasks humans solve at 100%. Nvidia CEO Jensen Huang's claim of AGI relies on a much looser definition of the term.
What is ARC-AGI-3 and why does it matter?
ARC-AGI-3 is the toughest AGI benchmark ever created. Built by François Chollet's ARC Prize Foundation using 135 original interactive game environments, it cannot be beaten by memorisation or brute-force compute. It tests genuine generalisation — the defining property of true AGI. It matters because it is the only benchmark that cannot be gamed by AI labs training specifically against it.
When will AGI actually be achieved?
Predictions range widely. Forecasting platforms put the probability of AGI by 2027 at around 9%. Ray Kurzweil predicts AGI by 2029. Most AI researchers place it between 2030 and 2050. The honest answer: nobody knows, and the definition of AGI itself keeps shifting as labs get closer to previous definitions.
Does AGI matter for Indian businesses right now?
Practically, no — current AI is already transformative for Indian businesses without needing AGI. The tools available today (Claude, Gemini, GPT-5.4) handle well-defined tasks with remarkable capability. India's competitive advantage lies in applying these tools to its unique sectors: legal, healthcare, vernacular language, agriculture and education.

Sources: Lex Fridman Podcast (March 2026), ARC Prize Foundation ARC-AGI-3 results, Yahoo Tech, Gizmodo, KaaShiv InfoTech, Stanford HAI. LLMTools.in editorial analysis.