Context. The October 2023 Foundation Model Transparency Index scored 10 major foundation model developers, such as Google and OpenAI, on 100 transparency indicators. The Index showed that developers are generally quite opaque: the average score was 37 and the top score was 54 out of 100.
Design. To understand how the landscape has changed, we conduct a follow-up six months later, scoring developers on the same 100 indicators. In contrast to the October 2023 FMTI, we request that developers submit transparency reports that affirmatively disclose their practices for each of the 100 indicators.
Execution. For the May 2024 Index, 14 developers submitted transparency reports that they validated and approved for public release. Given their disclosures, we scored each developer to better understand the new status quo for transparency, changes since October 2023, and areas of sustained and systemic opacity for foundation models.
Key Findings
The mean score is 58 and the top score is 85 out of 100. This is a 21-point improvement in the mean over the October 2023 FMTI, though developers still have significant room for improvement.
Compared to the October 2023 FMTI, we see considerable improvement: the top score rose by
31 points and the bottom score rose by 21 points. All eight developers scored in both the
October 2023 and May 2024 FMTI have improved their scores. Of the 100 transparency indicators,
96 are satisfied by at least one developer and 89 are satisfied by multiple developers.
Developers proactively release transparency reports, in contrast to our previous approach of the FMTI team collecting information from the internet.
Developers disclosed an average of 17 indicators' worth of new information in their reports.
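To make the scoring arithmetic behind these findings concrete, here is a minimal sketch in Python of how binary indicator results translate into developer scores and the summary statistics we report. The developer names and indicator values are hypothetical placeholders, not actual FMTI data.

```python
# Minimal sketch of FMTI-style scoring with hypothetical data.
# Each developer is assessed on 100 binary indicators; a developer's score
# is the count of indicators satisfied (out of 100).

# Hypothetical indicator results: developer -> list of 100 booleans.
indicator_results = {
    "DeveloperA": [True] * 85 + [False] * 15,
    "DeveloperB": [True] * 58 + [False] * 42,
    "DeveloperC": [True] * 33 + [False] * 67,
}

scores = {dev: sum(indicators) for dev, indicators in indicator_results.items()}
mean_score = sum(scores.values()) / len(scores)
top_score = max(scores.values())

# Indicator coverage: how many of the 100 indicators are satisfied by
# at least one developer, and by more than one developer.
satisfied_counts = [
    sum(results[i] for results in indicator_results.values()) for i in range(100)
]
satisfied_by_any = sum(1 for c in satisfied_counts if c >= 1)
satisfied_by_multiple = sum(1 for c in satisfied_counts if c >= 2)

print(f"mean={mean_score:.1f}, top={top_score}")
print(f"satisfied by >=1 developer: {satisfied_by_any}, by >=2: {satisfied_by_multiple}")
```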
Changes from October 2023
For the October 2023 Foundation Model Transparency Index, we searched through the documentation of the 10 companies we examined to see whether they disclosed information related to each of the 100 transparency indicators. The big change in our process for the May 2024 Index is that we asked companies to provide us with a report that proactively discloses information about these transparency indicators. We reached out to 19 companies, and 14 agreed to participate in our study.
Companies disclosed information they had never previously made public: all 14 companies disclosed new information, covering 16.6 indicators on average.
We observed a significant improvement in scores for each of the 8 companies scored in both the 2023 and 2024 versions. AI21 (+50), Hugging Face (+32), and Amazon (+29) made the greatest strides. The average improvement was 19 points.
Developers generally improved or held their subdomain scores constant, though there were minor regressions in certain subdomains.
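The change-over-time computation is equally simple: for developers that appear in both editions, the improvement is the difference between their two scores. The sketch below uses made-up scores to illustrate the calculation, not actual FMTI results.

```python
# Hypothetical October 2023 and May 2024 scores for developers present
# in both editions of the Index (placeholder values, not real data).
oct_2023 = {"DeveloperA": 25, "DeveloperB": 37, "DeveloperC": 12}
may_2024 = {"DeveloperA": 75, "DeveloperB": 69, "DeveloperC": 41}

# Per-developer improvement and the average improvement across the
# developers scored in both editions.
improvements = {dev: may_2024[dev] - oct_2023[dev] for dev in oct_2023}
average_improvement = sum(improvements.values()) / len(improvements)

print(improvements)
print(f"average improvement: {average_improvement:.1f} points")
```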
Scores by subdomain
Disparity between the most and least transparent subdomains. The subdomains with the most and the least transparency are separated by 86 percentage points. In the figure below, we show the 13 major subdomains of transparency across the 14 developers we evaluate.
Sustained opacity on specific issues. While overall trends indicate significant improvement in the
status quo for transparency, some areas have seen no real headway: information about data (copyright,
licenses, and PII), how effective companies' guardrails are (mitigation evaluations), and the downstream
impact of foundation models (how people use models and how many people use them in specific regions) all
remain quite opaque.
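A subdomain-level view like the one above can be computed by grouping indicators by subdomain and reporting, for each subdomain, the share of (developer, indicator) cells that are satisfied. Below is a hedged sketch with hypothetical subdomains, indicator names, and results; it illustrates the aggregation, not the Index's actual data.

```python
# Sketch of subdomain-level transparency rates with hypothetical data.
# A subdomain's transparency is the share of (developer, indicator) cells
# satisfied among the indicators in that subdomain.

# Hypothetical mapping from subdomain to the indicators it contains.
subdomains = {
    "Methods": ["compute", "hardware", "energy"],
    "Data": ["copyright", "licenses", "pii"],
}

# Hypothetical per-developer indicator results (True = disclosed).
results = {
    "DeveloperA": {"compute": True, "hardware": True, "energy": True,
                   "copyright": False, "licenses": False, "pii": False},
    "DeveloperB": {"compute": True, "hardware": False, "energy": True,
                   "copyright": False, "licenses": True, "pii": False},
}

rates = {}
for name, indicators in subdomains.items():
    cells = [results[dev][ind] for dev in results for ind in indicators]
    rates[name] = 100 * sum(cells) / len(cells)

# Disparity between the most and least transparent subdomains.
disparity = max(rates.values()) - min(rates.values())
print(rates)
print(f"disparity: {disparity:.0f} percentage points")
```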
Foundation Model Transparency Reports
As part of the May 2024 version of the FMTI, developers prepared reports including information related to the FMTI's 100 transparency indicators. We hope that these reports provide a model for how companies can regularly disclose important information about their foundation models.
In particular, developers' reports include a substantial amount of information that was not public before the FMTI v1.1 process began: on average, each company disclosed previously unreleased information for 16.6 indicators.
For example, four or more companies shared previously undisclosed information regarding the compute, energy, and synthetic data used to build their flagship foundation models.
These transparency reports provide a wealth of information that other researchers can analyze to learn more
about the AI industry.
Next steps
Developers. We recommend that developers publish transparency reports for every major foundation model, in line with voluntary codes of conduct from the White House and the G7.
Researchers. The Index has centralized key
information on major companies across a range of
areas (e.g. data practices, compute expenditure, model evaluations, acceptable use policies), which
researchers can use to build collective understanding.
Policymakers. By identifying areas of sustained opacity (no improvement from October 2023 to May
2024) and systemic opacity (all companies disclose little or no information), we make clear where policy
intervention may be necessary for transparency. By specifying precise indicators, we demonstrate how
policymakers can define clear disclosure requirements.
The Index. For this version, we changed the process to include proactive reporting but kept the
100 transparency indicators the same. Moving forward, we may solicit feedback to reflect changing
priorities and update indicators.
Board
The FMTI advisory board will work directly with the Index team, advising on the design, execution, and presentation of subsequent iterations of the Index. Concretely, the Index team will meet regularly with the board to discuss key decision points: How is transparency best measured? How should companies disclose the relevant information publicly? How should scores be computed and presented? How should findings be communicated to companies, policymakers, and the public? The Index aims to measure transparency in order to bring about greater transparency in the foundation model ecosystem; the board's collective wisdom will guide the Index team in achieving this goal.
Board members
Arvind Narayanan is a professor of computer science at Princeton University and the director of the Center for Information Technology Policy. He co-authored a textbook on fairness and machine learning and is currently co-authoring a book on AI snake oil. He led the Princeton Web Transparency and Accountability Project to uncover how companies collect and use our personal information. His work was among the first to show how machine learning reflects cultural stereotypes, and his doctoral research showed the fundamental limits of de-identification. Narayanan is a recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE).
Daniel E. Ho is the William Benjamin Scott and Luna M. Scott Professor of Law, professor of political science, professor of computer science (by courtesy), senior fellow at the Stanford Institute for Human-Centered Artificial Intelligence (HAI), senior fellow at the Stanford Institute for Economic Policy Research, and director of the Regulation, Evaluation, and Governance Lab (RegLab). Ho serves on the National Artificial Intelligence Advisory Committee (NAIAC), advising the White House on AI policy, as senior advisor on Responsible AI at the U.S. Department of Labor, and as special advisor to the ABA Task Force on Law and Artificial Intelligence. His scholarship focuses on administrative law, regulatory policy, and antidiscrimination law. With the RegLab, his work has developed high-impact demonstration projects of data science and machine learning in public policy.
Danielle Allen is James Bryant Conant University Professor at Harvard University. She is a professor of political philosophy, ethics, and public policy and director of the Democratic Knowledge Project and of the Allen Lab for Democracy Renovation. She is also a seasoned nonprofit leader, democracy advocate, national voice on AI and tech ethics, distinguished author, and mom. A past chair of the Mellon Foundation and Pulitzer Prize Board, and former Dean of Humanities at the University of Chicago, she is a member of the American Academy of Arts and Sciences and the American Philosophical Society. Her many books include the widely acclaimed Talking to Strangers: Anxieties of Citizenship Since Brown v. Board of Education; Our Declaration: A Reading of the Declaration of Independence in Defense of Equality; Cuz: The Life and Times of Michael A.; Democracy in the Time of Coronavirus; and Justice by Means of Democracy. She writes a column on constitutional democracy for the Washington Post. She is also a co-chair of the Our Common Purpose Commission and founder and president of Partners In Democracy, where she advocates for democracy reform to create greater voice and access in our democracy and to drive progress toward a new social contract that serves and includes us all.
Daron Acemoglu is an Institute Professor of Economics in the Department of Economics at the Massachusetts Institute of Technology and is also affiliated with the National Bureau of Economic Research and the Center for Economic Policy Research. His research covers a wide range of areas within economics, including political economy, economic development and growth, human capital theory, growth theory, innovation, search theory, network economics, and learning. He is an elected fellow of the National Academy of Sciences, the British Academy, the American Philosophical Society, the Turkish Academy of Sciences, the American Academy of Arts and Sciences, the Econometric Society, the European Economic Association, and the Society of Labor Economists.
Rumman Chowdhury is the CEO and co-founder of Humane Intelligence, a tech nonprofit that creates methods for public evaluation of AI models, as well as a Responsible AI affiliate at Harvard's Berkman Klein Center for Internet and Society. She is also a research affiliate at the Minderoo Centre for Technology and Democracy at Cambridge University and a visiting researcher at the NYU Tandon School of Engineering. Previously, Dr. Chowdhury was the director of the META (ML Ethics, Transparency, and Accountability) team at Twitter, leading a team of applied researchers and engineers to identify and mitigate algorithmic harms on the platform. She was named one of BBC's 100 Women, recognized as one of the Bay Area's top 40 under 40, and is a member of the British Royal Society of the Arts (RSA). She has also been named by Forbes as one of Five Who are Shaping AI.