The 2025 Foundation Model Transparency Index

2025 AI Transparency Backslide: Average Score Falls to 40, IBM Takes the Crown, xAI and Midjourney Bring Up the Rear

As AI models evolve at a staggering pace, we seem to know very little about how these “black boxes” actually work. The latest 2025 Foundation Model Transparency Index (FMTI) released by Stanford University and other institutions reveals a worrying trend: despite the rapid advancement of AI technology, the industry’s overall transparency is deteriorating sharply.

ArXiv URL: http://arxiv.org/abs/2512.10169v1

This heavyweight annual report not only gave a “checkup” to long-established giants like OpenAI and Google, but also included Chinese companies such as Alibaba and DeepSeek in the evaluation for the first time. The results were astonishing: the average score plunged from 58 last year to 40, even lower than in 2023.

The “Winter” of Transparency: Who Is Exposed, and Who Is Leading?

This year’s FMTI report evaluated 13 of the world’s top foundation model developers. The research team designed an assessment framework with 100 indicators, covering the entire process from upstream data and model construction to downstream impact.


The contrast between the winners and the laggards was stark: IBM topped the rankings, while xAI and Midjourney brought up the rear.

The debut of Chinese companies:

Chinese companies evaluated for the first time this year, including Alibaba and DeepSeek, showed mixed performance. Although their overall scores fell in the lower-middle range (DeepSeek, Meta, and Alibaba averaged 30 points), their inclusion marks a step toward a more complete global map of AI transparency evaluation.

The Truth Behind the Score Collapse: Stricter Standards and Deliberate Concealment

Why did the average score fall from 58 to 40 this year? It is not just because lower-scoring new companies were added; many established players also “regressed” on key metrics.

1. The “black-boxing” of core resources

Companies are most secretive about “upstream resources.” Training Data and Training Compute are the two biggest black holes.

2. A tougher upgrade to the evaluation standards

FMTI 2025 substantially revised its indicators in an effort to “separate the real from the fake.”


Technical Breakdown: How Is Transparency Quantified?

To measure transparency scientifically, the research team divided the 100 indicators into three core areas:

  1. Upstream: Focuses on the resources needed to build the model.
  2. Model: Focuses on the model’s own properties and release.
  3. Downstream: Focuses on the model’s use and impact.
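To make the scoring structure concrete, here is a minimal sketch of how an FMTI-style score could be tallied. The indicator names below are invented for illustration (the real index defines 100 specific indicators); only the shape of the computation, binary indicators grouped into three domains and averaged to a 0–100 scale, follows the report's description.

```python
from collections import defaultdict

# Hypothetical indicators; each belongs to one of the three domains.
# A developer either satisfies an indicator (1) or does not (0).
INDICATORS = {
    "data_sources_disclosed": "upstream",
    "compute_reported": "upstream",
    "model_architecture_described": "model",
    "capabilities_evaluated": "model",
    "usage_policy_published": "downstream",
    "user_data_protection_stated": "downstream",
}

def score(answers: dict[str, int]) -> dict[str, float]:
    """Return per-domain scores and an overall score on a 0-100 scale."""
    by_domain = defaultdict(list)
    for indicator, domain in INDICATORS.items():
        by_domain[domain].append(answers.get(indicator, 0))
    result = {d: 100 * sum(v) / len(v) for d, v in by_domain.items()}
    result["overall"] = (
        100 * sum(answers.get(i, 0) for i in INDICATORS) / len(INDICATORS)
    )
    return result

# A developer that discloses compute and a usage policy, but nothing else:
print(score({"compute_reported": 1, "usage_policy_published": 1}))
```

A company that is open downstream but silent upstream, the "strategic opacity" pattern the report criticizes, would show a high downstream sub-score and a near-zero upstream one even with a middling overall number.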


An Interesting Finding: Can AI Agents Replace Human Evaluators?

During this year’s evaluation, the research team ran an interesting experiment: using AI Agents to help collect transparency information from each company.

The results showed that AI Agents can indeed improve the efficiency of information retrieval, but they are still far from fully replacing humans. Agents are prone to hallucination or being misled by surface-level information (false positives), and they can also miss key details buried in technical documents (false negatives). In the end, all information still had to be manually verified by the FMTI team.
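The agent errors the team describes map directly onto standard retrieval metrics. The sketch below (not the FMTI pipeline; claim names and sets are hypothetical) shows how agent-collected claims could be audited against a human-verified ground truth in exactly these false-positive/false-negative terms.

```python
def audit(agent_claims: set[str], verified: set[str]) -> dict[str, float]:
    """Compare agent-collected claims to human-verified ground truth."""
    true_positives = agent_claims & verified    # agent found, humans confirmed
    false_positives = agent_claims - verified   # agent asserted, humans rejected
    false_negatives = verified - agent_claims   # humans found, agent missed
    precision = len(true_positives) / len(agent_claims) if agent_claims else 0.0
    recall = len(true_positives) / len(verified) if verified else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "false_positives": len(false_positives),
        "false_negatives": len(false_negatives),
    }

# Hypothetical run: the agent surfaces three claims, one a hallucination,
# and misses one indicator the human reviewers later found.
print(audit(
    {"data_disclosed", "compute_reported", "license_posted"},
    {"data_disclosed", "compute_reported", "eval_results_posted"},
))
```

Hallucinations and surface-level misreads lower precision; details buried in technical documents that the agent never surfaces lower recall, which is why manual verification remained the final step.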

Conclusion: Transparency Is a Choice, Not a Technical Problem

The most important takeaway from the 2025 FMTI report is that differences in transparency are driven mainly by corporate willingness, not by technical or structural barriers.

The high scores of IBM, Writer, and AI21 Labs prove that even commercial companies can achieve a high degree of transparency while remaining competitive. By contrast, some companies score extremely high on downstream application policies (such as terms of use for downloads) yet score zero on training data disclosure; this sharp contrast reveals their strategic opacity.

As global policymakers, such as those behind the EU AI Act, begin to mandate certain types of transparency, this report is not only a record of the current state of affairs, but also a guide to future policy intervention. If market competition cannot deliver transparency, then more aggressive policy intervention may become inevitable.