RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations

ARC Prize launches its toughest AI benchmark yet: ARC-AGI-2

By GT | March 26, 2025 | AI


ARC Prize has launched ARC-AGI-2, its most demanding benchmark to date, and announced its 2025 competition with $1 million in prizes.

As AI progresses from performing narrow tasks to demonstrating general, adaptive intelligence, the ARC-AGI-2 challenges aim to uncover capability gaps and actively guide innovation.

“Good AGI benchmarks act as useful progress indicators. Better AGI benchmarks clearly discern capabilities. The best AGI benchmarks do all this and actively inspire research and guide innovation,” the ARC Prize team states.

ARC-AGI-2 sets out to belong in that “best” category.

Beyond memorisation

Since its introduction in 2019, ARC-AGI has served as a “North Star” for researchers striving toward AGI, and ARC Prize aims to keep creating enduring benchmarks.

ARC-AGI-1 leaned into measuring fluid intelligence (i.e., the ability to adapt learning to new, unseen tasks). It represented a clear departure from datasets that reward memorisation alone.

ARC Prize’s mission is also forward-thinking, aiming to accelerate timelines for scientific breakthroughs. Its benchmarks are designed not just to measure progress but to inspire new ideas.

Researchers observed a critical shift with the debut of OpenAI’s o3 in late 2024, evaluated using ARC-AGI-1. Combining deep learning-based large language models (LLMs) with reasoning synthesis engines, o3 marked a breakthrough where AI transitioned beyond rote memorisation.

Yet, despite progress, systems like o3 remain inefficient and require significant human oversight during training processes. To challenge these systems for true adaptability and efficiency, ARC Prize introduced ARC-AGI-2.

ARC-AGI-2: Closing the human-machine gap

The ARC-AGI-2 benchmark is tougher for AI yet retains its accessibility for humans. While frontier AI reasoning systems continue to score in single-digit percentages on ARC-AGI-2, humans can solve every task in under two attempts.
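To make the “two attempts” framing concrete, here is a minimal sketch of how a single task might be scored. It assumes that a task counts as solved when any of up to two predicted output grids matches the expected grid exactly, and that grids are lists of integer rows. The function names and the grid representation are illustrative assumptions, not ARC Prize’s official scoring code.

```python
from typing import List, Sequence

Grid = List[List[int]]  # assumed representation: rows of integer colour codes


def task_solved(attempts: Sequence[Grid], expected: Grid, max_attempts: int = 2) -> bool:
    """Return True if any of the first `max_attempts` attempts matches exactly."""
    return any(attempt == expected for attempt in attempts[:max_attempts])


def benchmark_score(results: Sequence[bool]) -> float:
    """Percentage of tasks solved; an empty result list scores 0."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


expected = [[0, 1], [1, 0]]
first_try = [[1, 1], [1, 0]]   # wrong in the top-left cell
second_try = [[0, 1], [1, 0]]  # exact match

solved = task_solved([first_try, second_try], expected)
print(solved)  # True: the second attempt matches
print(benchmark_score([solved, False, False, False]))  # 25.0
```

Under this assumed rule there is no partial credit: a grid that is wrong in a single cell counts the same as a completely wrong one.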

So, what sets ARC-AGI apart? Its design philosophy chooses tasks that are “relatively easy for humans, yet hard, or impossible, for AI.”

The benchmark includes datasets with varying visibility, and its tasks probe the following weaknesses in current AI systems:

Symbolic interpretation: AI struggles to assign semantic significance to symbols, instead focusing on shallow comparisons like symmetry checks.

Compositional reasoning: AI falters when it needs to apply multiple interacting rules simultaneously.

Contextual rule application: Systems fail to apply rules differently based on complex contexts, often fixating on surface-level patterns.
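As a toy illustration of the last two points (this is not an actual ARC-AGI-2 task), the sketch below applies two hypothetical rules to a small grid: a left-to-right mirror, and a recolouring that only fires in rows containing a marker value. The marker value, the rules, and the function names are all invented for the example.

```python
from typing import List

Grid = List[List[int]]

MARKER = 9  # hypothetical "context" symbol, chosen only for this example


def mirror(grid: Grid) -> Grid:
    """Rule 1: flip every row left to right."""
    return [list(reversed(row)) for row in grid]


def recolour_in_context(grid: Grid) -> Grid:
    """Rule 2: in rows containing MARKER, turn 1s into 2s; copy other rows unchanged."""
    out = []
    for row in grid:
        if MARKER in row:
            out.append([2 if cell == 1 else cell for cell in row])
        else:
            out.append(list(row))
    return out


def solve(grid: Grid) -> Grid:
    """Compose both rules: mirror first, then recolour by context."""
    return recolour_in_context(mirror(grid))


grid = [
    [1, 0, 9],
    [1, 0, 0],
]
print(solve(grid))  # [[9, 0, 2], [0, 0, 1]]
```

A solver that latches onto a surface pattern such as “every 1 becomes 2” would get the second row wrong, because that row has no marker and its 1 must stay a 1.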

Most existing benchmarks focus on superhuman capabilities, testing advanced, specialised skills at scales unattainable for most individuals. 

ARC-AGI flips the script and highlights what AI can’t yet do: specifically, the adaptability that defines human intelligence. When the set of tasks that are easy for humans but difficult for AI eventually shrinks to zero, AGI can be declared achieved.

However, achieving AGI isn’t limited to the ability to solve tasks; efficiency – the cost and resources required to find solutions – is emerging as a crucial defining factor.

The role of efficiency

Measuring cost per task is essential to gauging intelligence not just as problem-solving capability but as the ability to solve problems efficiently.

Real-world examples are already showing efficiency gaps between humans and frontier AI systems:

Human panel: Passes ARC-AGI-2 tasks with 100% accuracy at $17 per task.

OpenAI o3: Early estimates suggest a 4% success rate at an eye-watering $200 per task.
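One way to compare the two figures above is to fold accuracy into cost and ask how much is spent, on average, per correctly solved task. The sketch below does that by dividing cost per task by success rate. This derived metric and the function name are assumptions made for illustration; the article does not say that ARC Prize reports this number.

```python
def cost_per_solved_task(cost_per_task: float, success_rate: float) -> float:
    """Average spend per correct solution: cost per attempted task / success rate."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


# Figures quoted in the article; the o3 numbers are described as early estimates.
human = cost_per_solved_task(17.0, 1.00)
o3 = cost_per_solved_task(200.0, 0.04)

print(f"Human panel: ${human:,.0f} per solved task")
print(f"OpenAI o3:   ${o3:,.0f} per solved task")
print(f"Ratio:       about {o3 / human:.0f}x")
```

On these numbers, the human panel spends $17 per solved task while o3 spends roughly $5,000, a gap far larger than the raw $17 versus $200 comparison suggests.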

These metrics underline disparities in adaptability and resource consumption between humans and AI. ARC Prize has committed to reporting on efficiency alongside scores across future leaderboards.

The focus on efficiency prevents brute-force solutions from being considered “true intelligence.”

Intelligence, according to ARC Prize, encompasses finding solutions with minimal resources, a quality that is distinctly human but still elusive for AI.

ARC Prize 2025

ARC Prize 2025 launches on Kaggle this week, promising $1 million in total prizes and showcasing a live leaderboard for open-source breakthroughs. The contest aims to drive progress toward systems that can efficiently tackle ARC-AGI-2 challenges. 

Among the prize categories, which have increased from 2024 totals, are:

Grand prize: $700,000 for reaching 85% success within Kaggle efficiency limits.

Top score prize: $75,000 for the highest-scoring submission.

Paper prize: $50,000 for transformative ideas contributing to solving ARC-AGI tasks.

Additional prizes: $175,000, with details pending announcements during the competition.
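As a quick arithmetic check, the four categories listed above can be summed to confirm that they account for the full $1 million pool. The dictionary below simply restates the figures from the article; the variable names are illustrative.

```python
# Prize amounts in US dollars, as listed in the article.
prizes = {
    "Grand prize": 700_000,
    "Top score prize": 75_000,
    "Paper prize": 50_000,
    "Additional prizes": 175_000,
}

total = sum(prizes.values())

for name, amount in prizes.items():
    print(f"{name}: ${amount:,} ({amount / total:.1%} of the pool)")
print(f"Total: ${total:,}")
```

The categories add up to exactly $1,000,000, with the grand prize making up 70% of the pool.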

These incentives ensure fair and meaningful progress while fostering collaboration among researchers, labs, and independent teams.

Last year, ARC Prize 2024 drew 1,500 competing teams and produced 40 papers that proved influential across the industry. This year’s increased stakes aim to nurture even greater success.

ARC Prize believes progress hinges on novel ideas rather than merely scaling existing systems. The next breakthrough in efficient general systems might not originate from current tech giants but from bold, creative researchers embracing complexity and curious experimentation.

(Image credit: ARC Prize)

