The Foundation Model Transparency Index

A comprehensive assessment of the transparency of foundation model developers


Context. Foundation models like GPT-4 and Llama 2 are used by millions of people. While the societal impact of these models is rising, transparency is on the decline. If this trend continues, foundation models could become just as opaque as social media platforms and other previous technologies, replicating their failure modes.
Design. We introduce the Foundation Model Transparency Index to assess the transparency of foundation model developers. We design the Index around 100 transparency indicators, which codify transparency for foundation models, the resources required to build them, and their use in the AI supply chain.
Execution. For the 2023 Index, we score 10 leading developers against our 100 indicators. This provides a snapshot of transparency across the AI ecosystem. All developers have significant room for improvement, which we aim to track in future versions of the Index.


Key Findings

  • The top-scoring developer scores only 54 out of 100. No major foundation model developer comes close to providing adequate transparency, revealing a fundamental lack of transparency in the AI industry.
  • The mean score is just 37 out of 100. Yet 82 of the 100 indicators are satisfied by at least one developer, meaning developers could significantly improve transparency by adopting best practices from their competitors.
  • Open foundation model developers lead the way. Two of the three open foundation model developers get the two highest scores. Both allow their model weights to be downloaded. Stability AI, the third open foundation model developer, is a close fourth, behind OpenAI.
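The arithmetic behind these findings is straightforward to reproduce. Below is a minimal sketch, using made-up binary indicator data rather than the actual Index results, of how an overall score, the mean across developers, and the number of indicators satisfied by at least one developer could be computed:

```python
# Hypothetical indicator data: each developer maps to a list of binary
# indicator results (1 = satisfied, 0 = not). The real Index uses 100
# indicators and 10 developers; this toy example uses 5 and 3.
scores = {
    "DevA": [1, 0, 1, 1, 0],
    "DevB": [0, 1, 1, 0, 0],
    "DevC": [1, 1, 0, 0, 1],
}

# Overall score per developer: the count of indicators satisfied.
overall = {dev: sum(results) for dev, results in scores.items()}

# Mean score across developers.
mean_score = sum(overall.values()) / len(overall)

# Number of indicators satisfied by at least one developer -- the pool of
# existing best practices that any developer could adopt.
num_indicators = len(next(iter(scores.values())))
satisfied_by_any = sum(
    any(results[i] for results in scores.values())
    for i in range(num_indicators)
)

print(overall)
print(mean_score)
print(satisfied_by_any)
```

With the real Index data, `satisfied_by_any` is 82 out of 100, which is the basis for the second finding above.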
Overall scores for the 10 foundation model providers.


Indicators

We define 100 indicators that comprehensively characterize transparency for foundation model developers. We divide our indicators into three broad domains: upstream (the resources required to build the model, such as data, labor, and compute), model (the model itself), and downstream (the model's distribution and use).

Scores for the 10 foundation model providers, broken down by domain.


Scores by subdomain

In addition to the top-level domains (upstream, model, and downstream), we also group indicators together into subdomains. Subdomains provide a more granular and incisive analysis, as shown in the figure below. Each of the subdomains in the figure includes three or more indicators.
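The subdomain breakdown amounts to partitioning indicators into named groups and scoring each group separately. A minimal sketch, with a hypothetical grouping and one developer's made-up binary results (the real Index defines its own subdomains within upstream, model, and downstream):

```python
# Hypothetical subdomains mapping names to indicator indices; real
# subdomains in the Index each contain three or more indicators.
subdomains = {
    "data": [0, 1],
    "compute": [2],
    "capabilities": [3, 4],
}

# One developer's binary indicator results (1 = satisfied, 0 = not).
developer_results = [1, 0, 1, 1, 0]

# Fraction of indicators satisfied within each subdomain.
subdomain_scores = {
    name: sum(developer_results[i] for i in idxs) / len(idxs)
    for name, idxs in subdomains.items()
}

print(subdomain_scores)
```

Scoring at this finer granularity is what surfaces, for example, that closed developers lose most of their points on upstream subdomains.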

Scores for the 10 foundation model providers, broken down by the 13 subdomains that each include three or more indicators. Analysis at the subdomain level reveals actionable insight into which types of transparency or opacity drive the findings above.


Open vs. Closed models

One of the most contentious policy debates in AI today is whether AI models should be open or closed. While release strategies are not binary, for the analysis below we label models whose weights are broadly downloadable as open. Open models lead the way: we find that two of the three open models (Meta's Llama 2 and Hugging Face's BLOOMZ) score at least as high as the best closed model (as shown in the figure on the left), with Stability AI's Stable Diffusion 2 right behind OpenAI's GPT-4. Much of this disparity is driven by the lack of transparency of closed developers on upstream issues such as the data, labor, and compute used to build the model (as shown in the figure on the right).

Open models (Meta's Llama-2, Hugging Face's BLOOMZ, and Stability AI's Stable Diffusion 2) lead the way.
The disparity between open and closed models is driven by upstream indicators, such as details about the data, labor, and compute used to develop the model.


Methodology



About us

The 2023 Foundation Model Transparency Index was created by a group of eight AI researchers from Stanford University's Center for Research on Foundation Models (CRFM) and Institute for Human-Centered Artificial Intelligence (HAI), MIT Media Lab, and Princeton University's Center for Information Technology Policy. The group came together around a shared interest in improving the transparency of foundation models. See author websites below.

Acknowledgments. We thank Alex Engler, Anna Lee Nabors, Anna-Sophie Harling, Arvind Narayanan, Ashwin Ramaswami, Aspen Hopkins, Aviv Ovadya, Benedict Dellot, Connor Dunlop, Conor Griffin, Dan Ho, Dan Jurafsky, Deb Raji, Dilara Soylu, Divyansh Kaushik, Gerard de Graaf, Iason Gabriel, Irene Solaiman, John Hewitt, Joslyn Barnhart, Judy Shen, Madhu Srikumar, Marietje Schaake, Markus Anderljung, Mehran Sahami, Peter Cihon, Peter Henderson, Rebecca Finlay, Rob Reich, Rohan Taori, Rumman Chowdhury, Russell Wald, Seliem El-Sayed, Seth Lazar, Stella Biderman, Steven Cao, Toby Shevlane, Vanessa Parli, Yann Dubois, Yo Shavit, and Zak Rogoff for discussions on the topics of foundation models, transparency, and/or indexes that informed the Foundation Model Transparency Index. We especially thank Loredana Fattorini for her extensive work on the visuals for this project, as well as Shana Lynch for her work in publicizing this effort.