The belief that the US holds a durable lead in AI model performance is not well supported by the data, and that is just one of many uncomfortable findings in Stanford University’s 2026 AI Index Report, published this week.

The report, produced by Stanford’s Institute for Human-Centred Artificial Intelligence, is a 423-page annual assessment of where artificial intelligence stands. It covers research output, model performance, investment flows, public sentiment, and responsible AI. The headline findings are striking.

But the more consequential insights sit in the sections most coverage has skipped, notably on AI safety, where the gap between what models can do and how rigorously they are evaluated for harm has not closed but widened.

That said, three findings deserve more attention than they are getting.

The US-China model performance gap has effectively closed

The framing that the US leads China in AI development needs updating. According to the report, US and Chinese models have traded the top performance position several times since early 2025. In February 2025, DeepSeek-R1 briefly matched the top US model. As of March 2026, Anthropic’s top model leads by just 2.7%.

The US still produces more top-tier AI models – 50 models in 2025 to China’s 30 – and retains higher-impact patents. But China now leads in publication volume, citation share, and patent grants. China’s share of the top 100 most-cited AI papers grew from 33 in 2021 to 41 in 2024. South Korea, notably, leads the world in AI patents per capita.

The practical implication is that the assumption of a durable US technological lead in AI model performance is not well supported by the data. The gap that existed two years ago has closed to a margin that shifts with each major model release.

There is a further structural vulnerability the report identifies. The US hosts 5,427 data centres – more than ten times any other country – but a single company, TSMC, fabricates almost every leading AI chip inside them. The entire global AI hardware supply chain runs through one foundry in Taiwan, although a TSMC expansion in the US began operations in 2025.

AI safety benchmarking is not keeping pace, and the numbers show it

Almost every frontier model developer reports results on capability benchmarks. The same is not true for responsible AI benchmarks, and the 2026 Index documents the gap with some precision.

The report’s benchmark table for safety and responsible AI shows that most entries are simply empty. Only Claude Opus 4.5 reports results on more than two of the responsible AI benchmarks tracked. Only GPT-5.2 reports StrongREJECT. Across benchmarks measuring fairness, security and human agency, the majority of frontier models report nothing.

Capability benchmarks are reported consistently across frontier models. Responsible AI benchmarks, covering safety, fairness, and factuality, are largely absent. Source: Stanford HAI 2026 AI Index Report

This doesn’t mean frontier labs are doing no internal safety work. The report acknowledges that red-teaming and alignment testing happen, but that “these efforts are not often disclosed using a common, externally comparable set of benchmarks.” The effect is that external comparison on AI safety dimensions is effectively impossible for most models.

Documented AI incidents rose to 362 in 2025, up from 233 in 2024, according to the AI Incident Database. The OECD’s AI Incidents and Hazards Monitor, which uses a broader automated pipeline, recorded a peak of 435 monthly incidents in January 2026, with a six-month moving average of 326.

Documented AI incidents rose to 362 in 2025, up from 233 the previous year and under 100 annually before 2022. Source: AI Incident Database (AIID), via Stanford HAI 2026 AI Index Report

The governance response at the organisational level is struggling to keep up. According to a survey conducted by the AI Index and McKinsey, the share of organisations rating their AI incident response as “excellent” dropped from 28% in 2024 to 18% in 2025. Those reporting “good” responses also fell, from 39% to 24%. Meanwhile, the share experiencing three to five incidents rose from 30% to 50%.

The report also identifies a structural problem in responsible AI improvement itself: gains in one dimension tend to reduce performance in another. Improving safety can degrade accuracy, or improving privacy can reduce fairness, for example. There is no established framework for managing such trade-offs, and in several dimensions, including fairness and explainability, the standardised data needed to track progress over time does not yet exist.

Public anxiety rises with adoption, and the expert-public gap

Globally, 59% of people surveyed say AI’s benefits outweigh its drawbacks, up from 55% in 2024. At the same time, 52% say AI products and services make them nervous, an increase of two percentage points in one year. Both figures are moving upward simultaneously, which reflects a public that is using AI more while becoming more uncertain about where it leads.

The expert-public divide on AI’s employment effects is particularly sharp. According to the report, 73% of AI experts expect AI to have a positive impact on how people do their jobs, compared with just 23% of the general public – a 50-point gap. On the economy, the gap is 48 points (69% of experts are positive versus 21% of the public). On medical care, experts are considerably more optimistic at 84%, against 44% of the public.

These gaps matter because public trust shapes regulatory outcomes, and regulatory outcomes shape how AI is deployed. On that dimension, the report flags something striking: the US reported the lowest level of trust in its own government to regulate AI responsibly of any country surveyed, at 31%. The global average was 54%. Southeast Asian countries were the most trusting, with Singapore at 81% and Indonesia at 76%.

Globally, the EU is trusted more than the US or China to regulate AI effectively. Among 25 countries in Pew Research Centre’s 2025 survey, a median of 53% trusted the EU to regulate AI, compared to 37% for the US and 27% for China.

The report closes its public opinion chapter by noting that Southeast Asian countries remain among the world’s most optimistic about AI. In China, Malaysia, Thailand, Indonesia, and Singapore, more than 80% of respondents say AI will profoundly change their lives in the next three to five years. Malaysia posted the largest increase in this view from 2024 to 2025.


