IBM Transparency Report

1. Data acquisition methods (Score: 1)

What methods does the developer use to acquire data used to build the model?

Disclosure:

We use the following four data acquisition methods to build our model: (1) acquire existing public datasets (2) crawl the web (3) license existing data from external parties (4) use models to generate new, synthetic data

References:

Not disclosed

Score justification:

Data acquisition methods are clearly disclosed.

Indicator notes:

Which of the following data acquisition methods does the developer use:   (i) acquiring existing public datasets, (ii) crawling the web, (iii) using data acquired via its existing products and services, (iv) licensing existing data from external parties, (v) having humans create or annotate new data, (vi) using models to generate new data,  or (vii) other data acquisition methods not captured by the above. For example, if the developer uses reinforcement learning from human feedback to train models using model-generated outputs with human preference annotations, this would satisfy categories (v) and (vi). Alternatively, if the developer post-trains its model using off-the-shelf preference data (for example, the Alpaca dataset), this would satisfy category (i).

Example disclosure:

To build our model, we acquire data by crawling the Internet for publicly available data, licensing data from third-parties, and using models to synthetically generate new data. Humans do not create new data nor do we use data from our other products/services to train our model.

2. Public datasets (Score: 1)

What are the top-5 sources (by volume) of publicly available datasets acquired for building the model?

Disclosure:

The top five largest sources of publicly acquired datasets are: (1) Fineweb (2) DCLM-Baseline (3) IBM-crawled datasets (4) StarCoderdata (5) Github Clean

References:

Not disclosed

Score justification:

Top-5 public datasets are clearly disclosed

Indicator notes:

We define a source as the entity or means by which the developer acquires data. We define the top-5 sources as the top-5 sources by data volume.

Example disclosure:

We acquire publicly available data from only two sources: The Pile and CommonCrawl.

3. Crawling (Score: 1)

If data collection involves web-crawling, what is the crawler name and opt-out protocol?

Disclosure:

We use a Scrapy-based crawler that verifies compliance with robots.txt for every URL. We often leverage the crawler name IBMCrawlerBot.

References:

Not disclosed

Score justification:

The crawler IBMCrawlerBot is used with opt-out based on robots.txt

Indicator notes:

We award this point for disclosure of the crawler name and opt-out protocols, including if/how they respect the Robots Exclusion Protocol (robots.txt).

Example disclosure:

Our web crawler is named A and information on the opt-out protocol can be found at this URL: ... The CommonCrawl web crawler is named CCBot and information on the opt-out protocol can be found at this URL: https://commoncrawl.org/faq#:~:text=How%20can%20I%20block%20the,%2Dagent%20string%20is%3A%20CCBot.

4. Usage data used in training (Score: 1)

What are the top-5 sources (by volume) of usage data from the developer's products and services that are used for building the model?

Disclosure:

As a B2B company, IBM does not collect and use prompts from watsonx for Granite training. Refer to the terms and conditions of watsonx regarding security and privacy: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-security.html?context=wx&utm_source=chatgpt.com. “IBM does not use your work to improve IBM models” and “IBM does not monitor or log foundation model input.”

References:

Not disclosed

Score justification:

No usage data is used for model training.

Indicator notes:

We define usage data as data collected from the use of a developer's products or services.

Example disclosure:

We use usage data from only two sources: our deployed chatbot X and our online social media platform Y.

5. Notice of usage data used in training (Score: 1)

For the top-5 sources of usage data, how are users of these products and services made aware that this data is used for building the model?

Disclosure:

Not applicable. As a B2B company, IBM does not collect and use prompts from watsonx for Granite training. Refer to the terms and conditions of watsonx regarding security and privacy: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-security.html?context=wx&utm_source=chatgpt.com. “IBM does not use your work to improve IBM models” and “IBM does not monitor or log foundation model input.”

References:

Not disclosed

Score justification:

No usage data is used for model training.

Indicator notes:

We define usage data notice as the proactive disclosure to users of how their data is used for model development. For example, via a pop-up with a description, a link to the privacy policy, or link to a description of company practices.

Example disclosure:

We notify users of our chatbot X that chatbot interactions are used to train our AI via a pop-up as shown at this URL: ... We notify users of our platform Y about whether their data is used to train our AI via a link to our privacy policy when they sign up for an account.

6. Licensed data sources (Score: 1)

What are the top-5 sources (by volume) of licensed data acquired for building the model?

Disclosure:

We license data from OntoChem IT Solutions, Webhose, IEEE, and Scale.

References:

Not disclosed

Score justification:

Four sources of licensed data are listed.

Indicator notes:

We define a source as the entity from which the developer acquires data. For example, the Associated Press is reportedly a source of licensed data for OpenAI.

Example disclosure:

We license data from only three sources: A, B, and C.

7. Licensed data compensation (Score: 1)

For each of the top-5 sources of licensed data, are details related to compensation disclosed?

Disclosure:

We compensate OntoChem IT Solutions, IEEE, and Scale with a license fee. We cannot disclose information on compensation for our relationships with Webhose due to contractual terms that prohibit public disclosure.

References:

Not disclosed

Score justification:

The prohibition to discuss contractual terms publicly is clearly stated.

Indicator notes:

We award this point if the model developer describes the compensation structure specified in the contract with the data source or indicates they are prohibited from sharing this information if contractually mandated.

Example disclosure:

We compensate A by ... We cannot disclose information on compensation for our relationships with B and C due to contractual terms that prohibit public disclosure.

8. New human-generated data sources (Score: 1)

What are the top-5 sources (by volume) of new human-generated data for building the model?

Disclosure:

We did not acquire net new human-generated data.

References:

Not disclosed

Score justification:

No new human-generated data was created for model training.

Indicator notes:

We define a source as the entity or means by which the developer acquires data. For example, Scale AI could be a source of new human-generated data. By new, we mean the data is specifically acquired for the purposes of building the model.

Example disclosure:

We acquire new human-generated data from only two sources: our internal data annotation team and an external vendor, A.

9. Instructions for data generation (Score: 1)

For each of the top-5 sources of human-generated data, what instructions does the developer provide for data generation?

Disclosure:

Not applicable -- refer to question 9 above.

References:

Not disclosed

Score justification:

No new human-generated data was created for model training.

Indicator notes:

The instructions should be those provided to the data source. For example, if a third-party vendor works directly with the data laborers to produce the data, the instructions from the developer to this vendor should be disclosed.

Example disclosure:

We instruct our internal data annotation team as follows: ... We instruct vendor A as follows: ...

10. Data laborer practices (Score: 1)

For the top-5 sources of human-generated data, how are laborers compensated, where are they located, and what labor protections are in place?

Disclosure:

Not applicable -- refer to question 9 above.

References:

Not disclosed

Score justification:

No new human-generated data was created for model training.

Indicator notes:

For each data source, we require (i) the compensation in either USD or the local currency, (ii) any countries where at least 25% of the laborers are located, and (iii) a description of any labor protections. We will award this point if the developer discloses that it is not aware of data laborer practices.

Example disclosure:

Our internal data annotation team is located in the US, is compensated at 20 USD per hour, and deals with data that does not require specific protections. Our sole external data vendor contracts laborers in Kenya, compensates them at KES 15000 per month, and implements protections for dealing with toxic or unsafe content such as A and B.

11. Synthetic data sources (Score: 1)

What are the top-5 sources (by volume) of synthetic data acquired for building the model?

Disclosure:

We synthetically generated data using Mixtral-8x7B-Instruct, Mixtral-8x22B-Instruct, Phi-3.5-MoE-Instruct, and granite-34b-code-instruct, and granite-8b-code-instruct.

References:

Not disclosed

Score justification:

The specific models used to generate synthetic data are specified.

Indicator notes:

We define a source of synthetic data as a non-human mechanism (e.g. a machine learning model) used to generate the data.

Example disclosure:

We synthetically generate data using only our previous model X and an early checkpoint of our current flagship model Y.

12. Synthetic data purpose (Score: 1)

For the top-5 sources of synthetically generated data, what is the primary purpose for data generation?

Disclosure:

The primary purpose for the generated synthetic data is to creating training data (SFT and RL) that targets specific behaviors including reasoning, instruction following, harmlessness, as well as to target key use cases, including RAG and tool-use.

References:

Not disclosed

Score justification:

The purpose for synthetic data generation is clearly stated.

Indicator notes:

We define a source of synthetic data as a non-human mechanism (e.g. a machine learning model) used to generate the data.

Example disclosure:

We use model X to generate instruction-tuning data and we use model Y to generate candidate responses that humans select between to provide human preference data for reinforcement learning with human feedback.

13. Data processing methods (Score: 1)

What are the methods the developer uses to process acquired data to determine the data directly used in building the model?

Disclosure:

Our data processing pipeline consists of a multi-step process covering 1. Identification and removal of non-permissively licensed code repositories, 2. Removal of any data obtained from sources found on IBM's URL blocklist, which collates URLs that are known to share pirated or harmful content 3. Text extraction of HTML and PDF documents, including HTML tag removal, PDF header and footer and formatting removal, and in-line deduplication. 4. Language identification to retain English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese 5. Deduplication across documents 6. HAP and malware filtering 7. Quality filtering 8. Instruction data annotation (structured data only), scoring and filtering data for instruction difficulty, response quality, multi-turn identification, and cross-sample similarity scoring Add that our detailed Web data processing recipes are made public in GneissWeb paper, DPK notebooks, and Huggingface page (with pointers).

References:

Not disclosed

Score justification:

Methods for data processing are clearly stated.

Indicator notes:

We will award this point for disclosure of all of the methods used to process acquired data. Data processing refers to any method that substantively changes the content of the data. For example, compression or changing the data file format is generally not in the scope of this indicator.

Example disclosure:

We process data in the following six-step pipeline: (i) removal of HTML artifacts, (ii) deduplication, (iii) language identification to retain English data, (iv) removal of CSAM imagery, (v) removal of train-test overlap, and (vi) tokenization.

14. Data processing purpose (Score: 1)

For each data processing method, what is its primary purpose?

Disclosure:

1. Removes data that would not pass IBM's data clearance process 2. Removes data that is suspected to be pirated or malicious 3. Improves the quality of the data for training 4. Reduces the data to relevant subsets targeted in training 5. Improves the quality of the data for training 6. Removes potentially toxic or harmful content 7. Improves the quality of the data for training 8. Improves the quality of the data for training

References:

Not disclosed

Score justification:

Purposes for data processing are clearly stated.

Indicator notes:

Data processing refers to any method that substantively changes the content of the data. For example, compression or changing the data file format is generally not in the scope of this indicator.

Example disclosure:

Examples of primary purposes for a data processing method could include: (i) removes low quality data, (ii) removes potentially personal/copyrighted data, (iii) removes product-irrelevant data, (iv) removes toxic data, (v) improves evaluation integrity, or (vi) prepares the data for training the model.

15. Data processing techniques (Score: 1)

For each data processing method, how does the developer implement the method?

Disclosure:

All below steps (outside of step 3), are conducted using IBM's Open Source Data Prep Kit (https://github.com/data-prep-kit/data-prep-kit) 1. in-house filtering tool, comparing against a pre-defined list of licenses 2. in-house filtering tool, comparing against a predefined list of blocklisted URLs 3. HTMLs are processed using Trafilatura, pdfs are processed using ApachePDF 4. Fasttext based classifier Steps 5-8 are completed using IBM's open sourced framework, DataPrepKit (https://github.com/data-prep-kit/data-prep-kit/tree), specifically: 5. Exact deduplication computes SHA256 and removes records with identical hashes. Two-step fuzzy dedup is where (1) MinHashes of all documents are computed, and grouped, and then (2) Jaccard similarity is used to pair similar documents, afterwhich only one is retrained 6. Classifiers for HAP are used to annotate and filter out flagged data 7. Gopher-based quality filtering to remove low quality documents that contain for example bullet points ratio of greater than 90%, ellipsis line ratio of greater than 30% and symbol to word ratio of greater than 10%. A KenLM linear classifier, pre-trained on a small collection of known high quality documents, is used to score the overall quality of the document. 8. LLMaaJ based scoring and minimum neighbor distance based duplicate detection

References:

Not disclosed

Score justification:

Techniques for data processing are clearly stated.

Indicator notes:

Data processing refers to any method that substantively changes the content of the data. For example, compression or changing the data file format is generally not in the scope of this indicator.

Example disclosure:

Examples of how a data processing method is implemented could include: the method (i) is implemented using an in-house regular expression, (ii) is implemented using an in-house tool based on n-gram overlap, (iii) is implemented using a FastText classifier trained on Wikipedia data, (iv) is implemented using hash collisions with the NCMEC database, (v) is implemented by searching for known benchmark canary strings, and (vi) is implemented using tiktoken (https://github.com/openai/tiktoken).

16. Data size (Score: 1)

Is the size of the data used in building the model disclosed?

Disclosure:

The sizes of all Granite models can be found publicly documented on our Hugging Face model cards. Recent additions to the Granite family include Granite-3-8B, which is 8B parameters in size, trained on 12T tokens, Granite-4-Tiny, which is 7B parameters in size, and Granite-4-Small, which is 30B parameters in size. Each of these models was pre-trained on 12T tokens.

References:

Not disclosed

Score justification:

Training data sizes are clearly disclosed in relation to specific models.

Indicator notes:

To receive this point, the developer should report data size in appropriate units (e.g. bytes, words, tokens, images, frames) and broken down by modality. Data size should be reported to a precision of one significant figure (e.g. 4 trillion tokens, 200 thousand images). The size should reflect data directly used in building the model (i.e. training data) and not data that was acquired but unused, or data used to evaluate the model.

Example disclosure:

We used 3 x 10^12 tokens of text, 1 x 10^6 images, and 5 x 10^5 hours of audio for training.

17. Data language composition (Score: 1)

For all text data used in building the model, what is the composition of languages?

Disclosure:

We estimate the language composition for non-code data, is as follows: en 80% fr 2.3% es 2.1% de 2.0% ja 1.5% zh 1.5% pt 1.3% it 0.78% nl 0.59% cs 0.28% ko 0.20% ar 0.19% Unknown 6.9% We use two methods to classify language in the datasets. (1) For opensource, permissively licensed datasets where the language is identified as part of the metadata, we rely on that identification. (2) For all others, we use either Cld2 https://pypi.org/project/pycld2/ OR we also use Fasttext-based language identification from opensource. Percentages are based on percent of documents, before tokenization or any further subsampling for data mixtures.

References:

Not disclosed

Score justification:

Training data language composition and frequency is provided.

Indicator notes:

To receive this point, the developer should report (i) all languages which make up at least 1% of the data and their corresponding proportions and (ii) a brief description of how languages are labeled (if a publicly available tool is used, include a link to the tool). Proportions should be reported to a precision of two significant figures and should describe proportions of documents labeled with some langauge. An "Unknown" category may be included to denote documents where the language could not be identified.

Example disclosure:

English 80%, Spanish 5.0%, French 3.0%, Chinese 2.0%, Unknown 10%. We used a FastText-based classifier trained on Wikipedia data to identify languages.

18. Data domain composition (Score: 1)

For all the data used in building the model, what is the composition of domains covered in the data?

Disclosure:

We estimate the following domain compositions is covered in the top level datasets acquired for training: Finance and Business: 40% News and Politics: 10% Lifestyle: 9% Education and Science: 7% Code: 7% Entertainment: 7% Health: 6% Technology: 4% Other: 10% Percentages are based on percent of documents, before tokenization or any further subsampling for data mixtures.

References:

Not disclosed

Score justification:

Training data domain composition and frequency is provided.

Indicator notes:

To receive this point, the developer should report the composition of the main domains included in the data used to train the model. This data should be at a level of granularity lower than broad claims about training on "internet data". For example, this could include the proportion of data from e-commerce, social media, news, code, etc. based on the URLs from which the data is sourced. Proportions should be reported to a precision of one significant figure.

Example disclosure:

Social media 40%, code repositories 30%, news articles 20%, e-commerce product listings 5%, scientific papers 5%.

19. External data access (Score: 1)

Does a third-party have direct access to the data used to build the model?

Disclosure:

Model data and the models are stored in an internal access controlled Lakehouse, which maintains an audit trail. Third party entities do not have direct access to the data used to build the model, with the following two exceptions: (1) Red Hat (a subsidiary of IBM) employees, under access control (2) Schellman LLC, as a part of a third-party audit for ISO 42001 compliance certification We do not distribute the training data to the general public, in order to respect any licensing restrictions around the acquired data. However, the names and locations of all Granite training datasets are openly disclosed, so that the public can procure any open source data if it is of interest. Further, to support third-parties interested in model building, IBM has opensourced the GneissWeb data recipe, so that anyone can recreate IBM's filtered down Common Crawl datasets leveraged in Granite model building.

References:

Not disclosed

Score justification:

A third party (Schellman LLC) and a subsidiary (Red Hat) are provided access.

Indicator notes:

By a third-party, we mean entities that are financially independent of the developer. We will award this point if at least one such entity is named as having direct access to the data. With that said, we may award this point if the developer provides justifications for prohibiting access to narrowly-scoped parts of the data.

Example disclosure:

Third-parties that have direct access to the data include organizations A and B.

20. Data replicability (Score: 1)

Is the data used to build the model described in enough detail to be externally replicable?

Disclosure:

We list the permissively licensed data we use for training Granite. Please refer to the technical white paper (https://github.com/ibm-granite/granite-3.0-language-models/blob/main/paper.pdf), particularly Appendix A: Contributions and Acknowledgements sub-section B. Data (pages 29 – 37). In the footnotes for pages 29 – 37, we explicitly hyperlink to the location of the data. Datasets listed in the white paper as IBM curated, which do not have a corresponding footnote with the data location, are not publicly available datasets.

References:

Not disclosed

Score justification:

In Appendix B of the technical report, the developer enumerates many datasets involved in model training. Datasets are confirmed to either be (i) publicly available, in which case they include the link to where to obtain the data, (ii) from third parties, in which case the third party is named, or (iii) internally developed, in which case nothing is required for this indicator.

Indicator notes:

We will award this point if the description contains (i) a list of all publicly available training data and where to obtain it and (ii) a list of all training data obtainable from third parties and where to obtain it. These conditions refer to criteria 2 and 3 under the OSI Open Source AI v1.0 definition.

Example disclosure:

The listing of publicly available training data can be found at this URL ... and the listing of all training data obtainable from third parties can be found at this URL ...

21. Compute usage for final training run (Score: 1)

Is the amount of compute used in the model's final training run disclosed?

Disclosure:

We estimate Granite 3.0 8B was trained with 5.66×10^23 FLOPs, this is based on a measured throughput of 6.8×10^17 FLOPs/GPU-hour, and a final training run time of 832,102 hours.

References:

Not disclosed

Score justification:

5.66 x 10^23 FLOPs

Indicator notes:

Compute should be reported in appropriate units, which most often will be floating point operations (FLOPs), along with a description of the measurement methodology, which may involve estimation. Compute should be reported to a precision of one significant figure (e.g. 5 x 10^25 FLOPs). This number should represent the compute used to train the final model across all model stages.

Example disclosure:

Our model was trained using 5 x 10^25 FLOPs, measured according to the Frontier Model Forum guidance provided at this URL: https://www.frontiermodelforum.org/updates/issue-brief-measuring-training-compute/

22. Compute usage including R&D (Score: 1)

Is the amount of compute used to build the model, including experiments, disclosed?

Disclosure:

Our cumulative compute usage across the entire Granite 3.0 family is estimated as 5.30 x 10^24 FLOPS. We estimate across the entire model family, as most of the data mixtures and hyper-parameter search experiments benefited all of the Granite 3.0 final training runs. This estimate is based on a measured throughput of 6.8×10^17 FLOPs/GPU-hour, and a total of 7.7M cumulative GPU hours run across the Granite 3.0 development lifecycle.

References:

Not disclosed

Score justification:

5.30 x 10^24 FLOPs

Indicator notes:

Compute should be reported in appropriate units, which most often will be floating point operations (FLOPs), along with a description of the measurement methodology, which may involve estimation. Compute should be reported to a precision of one significant figure (e.g. 7 x 10^26 FLOPs). Compared to the previous indicator, this indicator should include an estimation of the total compute used across experiments used towards the final training run for the model (such as including hyperparameter optimization or other experiments), and not just the final training run itself.

Example disclosure:

Our cumulative compute usage involved in building the model was 7 x 10^26 FLOPs, measured according to the Frontier Model Forum guidance provided at this URL: https://www.frontiermodelforum.org/updates/issue-brief-measuring-training-compute/

23. Development duration for final training run (Score: 1)

Is the amount of time required to build the model disclosed?

Disclosure:

Granite 3.0 8B model was trained over a period of 120 days, 832,102 NVIDIA H100 GPU-Hours.

References:

Not disclosed

Score justification:

120 days; 832k Nvidia H100 GPU hours

Indicator notes:

The amount of time should be specified in terms of both the continuous duration of time required and the number of hardware hours used. The continuous duration of time required to build the model should be reported in weeks, days, or hours to a precision of one significant figure (e.g. 3 weeks). The number of hardware hours should be reported to a precision of one significant figure and include the type of hardware hours. No form of decomposition into phases of building the model is required for this indicator, but it should be clear what the duration refers to (e.g. training the model, or training and subsequent evaluation and red teaming).

Example disclosure:

Our model was trained over a period of 90 days using 4x10^4 NVIDIA H100 GPU-days.

24. Compute hardware for final training run (Score: 1)

For the primary hardware used to build the model, is the amount and type of hardware disclosed?

Disclosure:

Our Granite 3.0 8B model was trained on 768 H100s; Granite 4.0 Tiny was trained on 256 H100s; and Granite 4.0 Small was trained on 1024 H100s.

References:

Not disclosed

Score justification:

768 Nvidia H100s

Indicator notes:

In most cases, this indicator will be satisfied by information regarding the number and type of GPUs or TPUs used to train the model. The number of hardware units should be reported to a precision of one significant figure (e.g. 800 NVIDIA H100 GPUs). We will not award this point if (i) the training hardware generally used by the developer is disclosed, but the specific hardware for the given model is not, or (ii) the training hardware is disclosed, but the amount of hardware is not. We will award this point even if information about the interconnects between hardware units is not disclosed.

Example disclosure:

Our model was trained using 1000 NVIDIA H100 GPUs.

25. Compute provider (Score: 1)

Is the compute provider disclosed?

Disclosure:

IBM Blue Vela, one of IBM's supercomputing clusters, using IBM Spectrum LSF, as shown on page 18 of the technical report.

References:

Not disclosed

Score justification:

Self-owned IBM cluster

Indicator notes:

For example, the compute provider may be the model developer in the case of a self-owned cluster, a cloud provider like Microsoft Azure, Google Cloud Platform, or Amazon Web Services, or a national supercomputer. In the event that compute is provided by multiple sources or is highly decentralized, we will award this point if a developer makes a reasonable effort to describe the distribution of hardware owners.

Example disclosure:

Compute is provided by Google Cloud Platform.

26. Energy usage for final training run (Score: 1)

Is the amount of energy expended in building the model disclosed?

Disclosure:

Granite 3.0 8B was trained with an estimated 757.0 MWh of energy. To estimate training energy consumption, we multiplied training GPU Hours (832,102) by a Power Usage Effectivess (PUE) of 1.3 and a GPU power consumption of 700W.

References:

Not disclosed

Score justification:

757 MWh with clear estimation methodology

Indicator notes:

Energy usage should be reported in appropriate units, which most often will be megawatt-hours (mWh), along with a description of the measurement methodology, which may involve estimation. Energy usage should be reported to a precision of one significant figure (e.g. 500 mWh). No form of decomposition into compute phases is required, but it should be clear whether the reported energy usage is for a single model run or includes additional runs, or hyperparameter tuning, or training other models like reward models, or other steps in the model development process that necessitate energy usage. If the developer is unable to measure or estimate this quantity due to information not being available from another party (e.g. compute provider), we will award this point if the developer explicitly discloses what information it lacks and why it lacks it.

Example disclosure:

Our model was trained using an estimate 1 x 10^4 MWh of energy. To estimate training energy consumption, we multiplied training FLOPs (5 x 10^25) by a conversion factor using NVIDIA A100 GPU information (3.74 × 10^21 FLOPs/MWh) given we train using FP16 with sparsity.

27. Carbon emissions for final training run (Score: 1)

Is the amount of carbon emitted in building the model disclosed?

Disclosure:

To calculate the emission we use the US national average carbon intensity factor of 0.39 kg CO2eq/KWh according to U.S. Energy Information Administration, resulting in a Carbon Emissions of 295.2 tCO2eq for Granite 3.0 8b.

References:

Not disclosed

Score justification:

295.2 tCO2eq with clear estimation methodology

Indicator notes:

Emissions should be reported in appropriate units, which most often will be tons of carbon dioxide emitted (tCO2), along with a description of the measurement methodology, which may involve estimation. Emissions should be reported to a precision of one significant figure (e.g. 500 tCO2). No form of decomposition into compute phases is required, but it should be clear whether the reported emissions is for a single model run or includes additional runs, or hyperparameter tuning, or training other models like reward models, or other steps in the model development process that generate emissions. If the developer is unable to measure or estimate this quantity due to information not being available from another party (e.g. compute provider), we will award this point if the developer explicitly discloses what information it lack and why it lacks it. Emissions should correspond with the energy used in the previous indicator.

Example disclosure:

Our model yielded an estimate of 5 x 10^3 tCO2. To estimate training carbon emissions, we multiplied training energy usage (1 x 10^4 MWh) by a 2023 estimate for the US data center carbon intensity (0.375 tCO2/MWh) given the data centers used in training operate in the US.

28. Water usage for final training run (Score: 1)

Is the amount of clean water used in building the model disclosed?

Disclosure:

Our model yielded an estimate of 2.1 ML water. To estimate training water usage, we multiplied training energy usage (757,000 kWh) by a snapshot of our Blue Vela data center water efficiency, measured in April of 2025 (2.83 L per kWh).

References:

Not disclosed

Score justification:

2.1ML water with clear estimation methodology.

Indicator notes:

Clean water usage should be in appropriate units, which most often will be megaliters, along with a description of the measurement methodology, which may involve estimation. Clean water usage should be reported to a precision of one significant figure (e.g., 5000ML). No form of decomposition into compute phases is required, but it should be clear whether the reported water usage is for a single model run or includes additional runs, or hyperparameter tuning, or training other models like reward models, or other steps in the model development process that necessitates water usage. If the developer is unable to measure or estimate this quantity due to information not being available from another party (e.g. compute provider), we will award this point if the developer explicitly discloses what information it lacks and why it lacks it.

Example disclosure:

Our model yielded an estimate of 20 ML water. To estimate training water usage, we multiplied training energy usage (1 x 10^4 MWh) by a 2021 estimate for the US data center water efficiency (1.8 ML per 1,000 MWh) given the data centers used in training operate in the US.

29. Internal compute allocation (Score: 1)

How is compute allocated across the teams building and working to release the model?

Disclosure:

Data Mixture -- 37% Hyper Parameter Tuning --17% Pre-Training -- 16% Post-Training --16% Miscellaneous--14%

References:

Not disclosed

Score justification:

Clear compute allocation across data mixture, hyperparameter tuning, pre-training, post-training, and misc.

Indicator notes:

To receive a point, the developer should provide the compute allocated to each team involved in training the model. We understand there might be no clear allocation of compute across different teams; in that case, report an estimate of the compute used over the last year. Compute allocation should be reported to at least one significant figure.

Example disclosure:

- Safety — 15% - Pre-training — 60% - Post-training — 15% - Infrastructure and reliability — 5%

30. Model stages (Score: 1)

Are all stages in the model development process disclosed?

Disclosure:

We define four stages in the model build: 1.Pre-training – Phase 1 2. Pre-training – Phase 2 (including annealing and long context extension) 3. Supervised Fine-tuning (SFT) 4. Reinforcement Learning-based Alignment (RLHF)

References:

Not disclosed

Score justification:

4 model stages provided.

Indicator notes:

Stages refer to each identifiable step that constitutes a substantive change to the model during the model building process. We recognize that different developers may use different terminology for these stages, or conceptualize the stages differently. We will award this point if there is a clear and complete description of these stages.

Example disclosure:

We define five stages in building the model: (1) unsupervised pre-training, (2) supervised instruction tuning, (3) RLHF, (4) domain-specific fine-tuning, and (5) final safety alignment.

31. Model objectives (Score: 1)

For all stages that are described, is there a clear description of the associated learning objectives or a clear characterization of the nature of this update to the model?

Disclosure:

1.Pre-training – Phase 1 Objective: Train the model on a broad mixture of medium-quality data to build foundational knowledge and linguistic competence across diverse domains and tasks, without overfitting to any single type of content. Update: Establishes general-purpose representations by learning patterns and structures common across a wide range of inputs. 2. Pre-training – Phase 2 (including annealing and long context extension) Objective: Continue training on a curated, high-quality data mixture to improve generalization and performance on downstream tasks, with a focus on enterprise-relevant domains and longer context data. Update: Refines and stabilizes the model’s representations through continued pretraining on higher-signal data. 3. Supervised Fine-tuning (SFT) Objective: Adapt the model for instruction following, tool use, and multi-turn dialogue through training on structured instruction-response examples. Update: Specializes the model to follow directives and engage in structured interactions, improving usability and coherence in applied settings. 4. Reinforcement Learning-based Alignment (RLHF) Objective: Further align the model’s behavior with human preferences using techniques such as Best-of-N sampling, Proximal Policy Optimization (PPO). Update: Adjusts the model’s output distribution to favor helpful, safe, and high-quality responses, while preserving alignment with the fine-tuned baseline.

References:

Not disclosed

Score justification:

Clear objective for each model stage.

Indicator notes:

We recognize that different developers may use different terminology for these stages, or conceptualize the stages differently. We will award this point if there is a clear description of the update to the model related to each stage, whether that is the intent of the stage (e.g. making the model less harmful), a mechanistic characterization (e.g. minimizing a specific loss function), or an empirical assessment (e.g. evaluation results conducted before and after the stage).

Example disclosure:

During unsupervised pre-training, the objective is next-token prediction. During supervised instruction tuning, we optimize for correctness and helpfulness on labeled tasks. RLHF aligns model outputs with human preference judgments. Domain-specific fine-tuning focuses on improving in-domain capabilities using specialized data (e.g., code or legal text). Final safety alignment reduces disallowed or harmful responses.

32. Code access (Score: 1)

Does the developer release code that allows third-parties to train and run the model?

Disclosure:

As a part of our model release process, we submit PRs enabling inferencing support for all Granite models in vLLM (https://github.com/vllm-project/vllm), and training support for all Granite models in Hugging Face PEFT (https://github.com/huggingface/peft). An open sourced version of the code used to pretrain Granite models can be found at https://github.com/open-lm-engine; to support community reference implementations, a full description of the Granite architecture and technical parameters can be found in the publicly-available Granite Technical Report and model config files posted on Hugging Face.

References:

Not disclosed

Score justification:

Representative code for model training along with the relevant configuration is released. In addition code for model inference and fine-tuning is also released.

Indicator notes:

The released code does not need to match the code used internally.

Example disclosure:

We release training and inference code under an Apache 2.0 license at https://github.com/..., enabling others to replicate our core pipeline.

33. Organization chart (Score: 1)

How are employees developing and deploying the model organized internally?

Disclosure:

The model team is structured as follows, with the following proportional headcount: - CEO -- AI Research VP --- AI Model VP: ---- Data and Tools Director: 26% ---- Model Training Distinguished Engineer: 5% ---- Advanced LLM Technologies and Applications Director: 37% ---- Model Safety Director: 26% ---- Strategy and Go-to-Market Director: 5%

References:

Not disclosed

Score justification:

Clear org chart relevant to training.

Indicator notes:

To receive a point, the developer should provide both the internal organization chart for the team developing the model as well as the headcounts (or a proportion of headcounts) by the team.

Example disclosure:

The model team comprises of 63 people, organized as follows: - CEO - Managing Director (Safety) — 24 people - Managing Director (Pre-training) — 12 people - Managing Director (Post-training) — 11 people - Managing Director (API) — 6 people - Director (Infrastructure and reliability) — 7 people - Director (PR and marketing) — 4 people - Director (hiring) — 7 people

34. Model cost (Score: 1)

What is the cost of building the model?

Disclosure:

We estimate our total Granite 3 8B model cost to be $10M, where $4M was spent on data processing, $2M is spent on hyperparameter searches, and $2M on the final pre-training run, and $2M on post-training and post-training experiments. Our costs are projected using GPU hours during the Granite 3 training period and an average market cost assumption of $2.25/gpu-hr (H100). For activities that are shared across multiple models being trained during that period of time (e.g. multiple models benefited from the same data processing work), we calculated the percent of final pre-training run hours that was driven by Granite 3 8B, and applied that percentage across shared categories.

References:

Not disclosed

Score justification:

Clear total model cost with disaggregated values.

Indicator notes:

Monetary cost should be reported in appropriate currency (e.g. USD), along with the measurement methodology, which may involve estimation. Cost should be reported to a precision of one significant figure (e.g. 200 million USD).

Example disclosure:

We spent approximately 200 million USD on building the model: 50 million for data acquisition, 10 million for data processing, 20 million for personnel, 80 million for compute for R&D priced at market rates, and 40 million for compute for the final training run priced at market rates.

35. Basic model properties (Score: 1)

Are all basic model properties disclosed?

Disclosure:

Granite 3 Input modality: Text Output modality: Text Model components: Decoder-only model trained using self-supervised learning, followed by supervised fine tuning and RL. Model size: 2B, 8B parameters Model architecture: decoder-only dense transformer architecture, ROPE Granite 4 Input modality: Text Output modality: Text Model components: Decoder-only model trained using self-supervised learning, followed by supervised fine tuning and RL. Model size: 7B-A1B, 30B-A6B parameters Model architecture: fine-grained hybrid mamba2-dense mixture-of-experts (MoE) architecture, NOPE

References:

Not disclosed

Score justification:

All basic model properties are disclosed.

Indicator notes:

Basic model properties include: the input modality, output modality, model size, model components, and model architecture. To receive a point, all model properties should be disclosed. Modalities refer to the types or formats of information that the model can accept as input. Examples of input modalities include text, image, audio, video, tables, graphs. Model components refer to distinct and identifiable parts of the model. We recognize that different developers may use different terminology for model components, or conceptualize components differently. Examples include: (i) For a text-to-image model, components could refer to a text encoder and an image encoder, which may have been trained separately. (ii) For a retrieval-augmented model, components could refer to a separate retriever module. Model size should be reported in appropriate units, which generally is the number of model parameters, broken down by named component. Model size should be reported to a precision of one significant figure (e.g. 500 billion parameters for text encoder, 20 billion parameters for image encoder). Model architecture is the overall structure and organization of a foundation model, which includes the way in which any disclosed components are integrated and how data moves through the model during training or inference. We recognize that different developers may use different terminology for model architecture, or conceptualize the architecture differently; a sufficient disclosure includes any clear, though potentially incomplete, description of the model architecture.

Example disclosure:

Input modality: Text Output modality: Text Model components: Decoder-only model trained using self-supervised learning, followed by supervised fine tuning and RLHF that are used to align the language model to follow users' instructions and be helpful, harmless, and honest. Model size: 70B parameters Model architecture: Autoregressive (causal, decoder only) transformer language model with rotary position embeddings and are trained on the next token prediction task.

36. Deeper model properties (Score: 1)

Is a detailed description of the model architecture disclosed?

Disclosure:

Architecture configuration file can be found here: Granite3 - https://github.com/huggingface/transformers/blob/main/src/transformers/models/granite/configuration_granite.py Granite4 - https://github.com/huggingface/transformers/blob/main/src/transformers/models/granitemoeshared/configuration_granitemoeshared.py Model training code can be found at https://github.com/open-lm-engine/lm-engine

References:

Not disclosed

Score justification:

A configuration file is provided that allows an external entity to reproduce the model architecture (through HuggingFace Transformers).

Indicator notes:

To receive a point, the model architecture should be described in enough detail to allow for an external entity to fully implement the model. Publicly available code or a configuration file for a model training library (e.g., GPT-NeoX) would be a sufficiently detailed description.

Example disclosure:

The configuration file for training our model using a public model training library A can be found at [URL].

37. Model dependencies (Score: 1)

Is the model(s) the model is derived from disclosed?

Disclosure:

The model is not dependent on or derived from any other model. We train this model from scratch.

References:

Not disclosed

Score justification:

The developer discloses that the model is not dependent on another model.

Indicator notes:

We will award this point for a comprehensive disclosure of the model or models on which the foundation model directly depends on or is derived from, as well as the method by which it was derived (e.g., through fine tuning, model merging, or distillation). Additionally, we will award a point if the developer discloses that the model is not dependent on or derived from any model.

Example disclosure:

This model is a fine tune of Camel-70B. We used the methods described in [PAPER URL] for distillation.

38. Benchmarked inference (Score: 1)

Is the compute and time required for model inference disclosed for a clearly-specified task on clearly-specified hardware?

Disclosure:

It takes approximately 25.6 seconds and 1.9 petaFLOPs of total compute — or an effective throughput of 75 teraFLOPs per second — to generate 128,000 tokens as 1,000 sequences of 128 output tokens, each with an input context of 1,024 tokens sampled from ShareGPT [1,2]. The hardware tested is one NVIDIA A100 card with 80 GB of HBM memory running granite-3.1-8b (BF16) with vLLM 0.8.5.post1 operating at a batch size of 256 (concurrent requests being processed at a time) for optimal throughput efficiency. [1] https://github.com/vllm-project/vllm/tree/main/benchmarks [2] https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered

References:

Not disclosed

Score justification:

Compute and time on specified hardware/tasks are disclosed.

Indicator notes:

The duration should be reported in seconds to a precision of one significant figure (e.g. 0.002 seconds). Compute usage for inference should be reported in FLOPs/second to a precision of one significant figure (e.g. 5 x 10^21 FLOPs/second). The hardware in this evaluation need not be the hardware the developer uses for inference. The developer can report this figure over some known or public dataset.

Example disclosure:

It takes 0.002 seconds and 5 x 10^21 FLOPs/second to generate 100,000 tokens as 5,000 sequences of length 20 given inputs of length 40 from [DATASET URL]. The fixed set of hardware is 8 NVIDIA A100s.

39. Researcher credits (Score: 1)

Is a protocol for granting external entities API credits for the model disclosed?

Disclosure:

Any academic collaborator in the MIT-IBM Watson AI Lab (joint research partnership between MIT and IBM) can request API access to IBM-hosted Granite models to support model evaluation in their research projects. To do so, the requestor must request access through their dedicated IBM Principal Investigator (PI). The IBM PI is their dedicated focal point for this ask. Select commercial entities exploring partnership opportunities with IBM are also provided API credits in order to evaluate Granite model performance, these credits are offered by invitation only, from the AI Research Partnerships Lead based on alignment of business interests. Kindly note that we do not offer a process to the general public to access API credits at this time.

References:

Not disclosed

Score justification:

The developer does not implement a researcher access program that's open to the general public.

Indicator notes:

A model credit access protocol refers to the steps, requirements, and considerations involved in granting credits to external entities. We will award this point if the developer discloses key details of its protocol, including (i) where external entities can request access to credits (e.g. via an access request form); (ii) explicit criteria for selecting external entities; and (iii) its policy on granting a transparent decision on whether access has been granted within a specified, reasonable period of time. Additionally, we will award a point if the developer discloses that it does not grant external entities API credits.

Example disclosure:

We implement a researcher access program: (i) Access can be requested from [URL] (ii) Any researcher at an accredited research institution is eligible to apply. Decisions are made based on the alignment between the applicant's project description and our target research directions (as described here: [URL]). (iii) Decision notifications are sent within three weeks of the application receipt.

40. Specialized access (Score: 1)

Does the developer disclose if it provides specialized access to the model?

Disclosure:

Early Access: We have provided early API access to Granite to 2 leaderboard providers and 2 external red teaming partners. We also have provided early access to the model weights to 7 commercial launch partners. One leaderboard provider is an industry affiliate, one leaderboard provider is an academic affiliate, the 2 external red teaming partners are industry. Subsidized Access (post-release): 66 academics in the MIT-IBM Watson AI Lab and 4 commercial entities have been provided subsidized access for model evaluation. We are also in the process of setting up specialized access to HackerOne-vetted white hat hackers through a bug bounty program, which is targeted for launch in the Summer 2025. Deeper Access: IBM provides early checkpoints, not otherwise available, to Red Hat, an IBM Subsidiary.

References:

Not disclosed

Score justification:

The developer discloses sufficient information for both (i) and (ii).

Indicator notes:

Specialized access could include several categories, such as early access, subsidized access, or deeper access (e.g., to model weights or checkpoints, that are not publicly available). We will award this point if the developer discloses (i) if it provides specialized access and (ii) statistics on the number of users granted access across academia, industry, non-profits, and governments, to one significant figure.

Example disclosure:

We provide early access to the model via API to: (1) 250 academics vetted by our program (2) 0 industry affiliates (3) 0 non-profit affiliates (3) 2 government entities with whom we have signed MoUs We provide no other specialized research access.

41. Open weights (Score: 1)

Are the model's weights openly released?

Disclosure:

Yes, model weights are available on HuggingFace under an Apache 2.0 license by following this link: https://huggingface.co/ibm-granite

References:

Not disclosed

Score justification:

Weights are publicly accessible on HuggingFace.

Indicator notes:

To receive this point, model weights need to be publicly available at no cost. Developers may receive this point even if there are some restrictions on the external entities that are permitted access (e.g. geographic restrictions), insofar as these restrictions are transparent (e.g. via a license or some high-level description of who has been granted access to the foundation model).

Example disclosure:

Model weights are available on HuggingFace by following this link: [URL]

42. Agent Protocols (Score: 1)

Are the agent protocols supported for the model disclosed?

Disclosure:

Granite supports IBM's Agent Context Protocol (ACP), which is built off of MCP. More details and documentation on ACP can be found here: https://research.ibm.com/blog/agent-communication-protocol-ai; https://agentcommunicationprotocol.dev/introduction/welcome

References:

Not disclosed

Score justification:

The developer discloses the agent protocols supported.

Indicator notes:

Agent protocols are specifications that define how autonomous agents exchange messages, context, or function calls with other agents, tools, or services (e.g., Anthropic’s Model Context Protocol (MCP) and Google’s Agent‑to‑Agent (A2A) spec). To earn this point, documentation must enumerate each protocol and describe any deviations or proprietary extensions.

Example disclosure:

We support MCP and A2A for agents built using model A

43. Capabilities taxonomy (Score: 1)

Are the specific capabilities or tasks that were optimized for during post-training disclosed?

Disclosure:

We focus on the following capabilities during post-training: - instruction following - reasoning - code - math - safety

References:

Not disclosed

Score justification:

The developer discloses the capabilities optimized for during post-training.

Indicator notes:

Capabilities refer to the specific and distinctive functions that the model can perform. We recognize that different developers may use different terminology for capabilities, or conceptualize capabilities differently. We will award this point for a list of capabilities specifically optimized for in the post-training phase of the model, even if some of the capabilities are not reflected in the final model.

Example disclosure:

We focus on the following capabilities during post-training: (1) Coding ability (2) Retrieval of information and factuality (3) Multilingual language proficiency on non-English languages (4) Tool-use

44. Capabilities evaluation (Score: 1)

Does the developer evaluate the model's capabilities prior to its release and disclose them concurrent with release?

Disclosure:

We evaluate capabilities using the following benchmarks: - instruction following: ArenaHard (57.56), AlpacaEval (62.68), IFEval (74.82) - reasoning: BigBenchHard (69.13), DROP (59.36) - code: HumanEval (89.73), HumanEval+ (86.09) - math: GSM8K (80.89), Math500 (69.02), AIME2024 (8.12) - safety: AttaQ (8.15), TruthfulQA (66.86)

References:

Not disclosed

Score justification:

The developer discloses evaluations for each of the capabilities.

Indicator notes:

The evaluations must contain precise quantifications of the model's behavior in relation to the capabilities specified in the capabilities taxonomy. We will award this point for any clear, but potentially incomplete, evaluation of multiple capabilities.

Example disclosure:

We evaluate capabilities using the following benchmarks: (1) Coding: HumanEval (2) Retrieval: HotPotQA (3) Multilingual performance: MMMLU (4) Tool use: UltraTool

45. External reproducibility of capabilities evaluation (Score: 1)

Are code and prompts that allow for an external reproduction of the evaluation of model capabilities disclosed?

Disclosure:

All of the benchmarks are from open-sourced benchmarks and can be recreated using common evaluation frameworks such as lm-eval (https://github.com/EleutherAI/lm-evaluation-harness).

References:

Not disclosed

Score justification:

The developer discloses how the evaluations can be reproduced. The evaluation harness includes the code and prompts for running evaluations.

Indicator notes:

The released code and prompts need not be the same as what is used internally, but should allow the developer's results on all capability evaluations to be reproduced. The released code must be open source, following the OSI definition of open source.

Example disclosure:

The code and prompts to reproduce our evaluations can be found on this GitHub repository link: [URL]

46. Train-test overlap (Score: 0)

Does the developer measure and disclose the overlap between the training set and the dataset used to evaluate model capabilities?

Disclosure:

We did not measure this for Granite 3 models, but will be releasing a technical report later this summer for Granite 4 models. The report will calculate the test-train overlap, using a similar process as reported in Tulu3.

References:

Not disclosed

Score justification:

The developer does not disclose this.

Indicator notes:

We will award this point if, with every capability evaluation for which the developer reports results, the developer reports the overlap between the training set of the model and the dataset used for evaluation, as well as the general methodology for computing train-test overlap (e.g. a description of how n-gram matching was used).

Example disclosure:

We compute train-test overlap using n-gram matching using the procedure described here [URL]. We evaluate the train-test overlap for the following benchmarks: (1) Coding: HumanEval (1.6%) (2) Retrieval: HotPotQA (4%) (3) Multilingual performance: MMMLU (3%) (4) Tool use: UltraTool (9%)

47. Risks taxonomy (Score: 1)

Are the risks considered when developing the model disclosed?

Disclosure:

We consider the following risks when developing and evaluating the model: 1) Explicit content 2) Deception 3) Discrimination 4) Harmful information 5) Violence 6) Substance abuse 7) PII leakage

References:

Not disclosed

Score justification:

The developer discloses the risks considered.

Indicator notes:

Risks refer to possible negative consequences or undesirable outcomes that can arise from the model's deployment and usage. These consequences or outcomes may arise from model limitations (functions that the model cannot perform) or issues with the model's trustworthiness (e.g., its lack of robustness, reliability, calibration). We recognize that different developers may use different terminology for risks, or conceptualize risks differently. We will award this point for a complete list of risks considered, even if some of the risks are not reflected in the final model.

Example disclosure:

We consider the following risks when developing and evaluating the model: (1) Misinformation (2) Harassment (3) Cybersecurity risks (4) Bioweapons design (5) Revealing personally-identifiable information

48. Risks evaluation (Score: 1)

Does the developer evaluate the model's risks prior to its release and disclose them concurrent with release?

Disclosure:

We evaluate the risks for each of the harms as measured by the ATTAQ framework (high score is good): Granite-3.0-8B-Instruct 1) Explicit content - 0.85 2) Deception - 0.87 3) Discrimination - 0.85 4) Harmful information - 0.86 5) Violence - 0.86 6) Substance abuse - 0.84 7) PII leakage - 0.81

References:

Not disclosed

Score justification:

The developer discloses evaluation results for each of the risks.

Indicator notes:

The evaluations must contain precise quantifications of the model's behavior in relation to the risks specified in the risk taxonomy. We will award this point for clear evaluations of the majority of the states risks.

Example disclosure:

We evaluate the risks for each of the above harms using HarmBench. The results (in terms of mean attack success rate) are: (1) Misinformation: 0.02 (2) Harassment: 0.01 (3) Cybersecurity: 0.10 (4) Bioweapons design (subset of Chemical Biological in HarmBench): 0.12 (5) Revealing personally-identifiable information (subset of General Harm in HarmBench): 0.02

49. External reproducibility of risks evaluation (Score: 1)

Are code and prompts to allow for an external reproduction of the evaluation of model risks disclosed?

Disclosure:

The prompts to run these evaluations are open-sourced and can be found at this link: https://huggingface.co/datasets/ibm-research/AttaQ

References:

Not disclosed

Score justification:

The provided link includes prompts and a code snippet for reproducing the risk evaluations.

Indicator notes:

The released code and prompts need not be the same as what is used internally, but should allow the developer's results on all risk evaluations to be reproduced. The released code must be open-source, following the OSI definition of open-source.

Example disclosure:

The code and prompts to reproduce our evaluations can be found on this GitHub repository link: [URL]

50. Pre-deployment risk evaluation (Score: 1)

Are the external entities have evaluated the model pre-deployment disclosed?

Disclosure:

As a part of IBM's commitment to external red teaming, we provide financial compensation to Robust Intelligence and Hidden Layer to run red teaming evaluations on select Granite models prior to release. IBM has no control over the results; both vendors provide the evaluation results directly to IBM. IBM has also provided complementary early API access for leaderboard evaluations by Salesforce and LMSYS Chatbot Arena. Both of these leaderboard providers share evaluation results privately with IBM. After reviewing the leaderboard evaluation results, IBM can choose to forgo publishing the results publicly on the leaderboard.

References:

Not disclosed

Score justification:

The developer discloses the entities that carried out pre-deployment evaluations, information about the terms of the evaluation, and financial transactions between the parties.

Indicator notes:

By external entities, we mean entities that are significantly or fully independent of the developer. We will award this point if the developer specifies the entity that carried out the pre-deployment analysis, discloses the terms of the analysis (such as conditions for releasing the evaluation results or the developer's control over the final results), as well as any financial transaction between the parties. We will award this point if the developer discloses no external entities have evaluated the model pre-deployment, or discloses only terms of the analysis where it is not bound by NDA while still naming all external entities.

Example disclosure:

We provide the following parties access to our model for pre-deployment capabilities evaluation: METR. METR has control over the release of the evaluation results (including whether or not to release the results and the contents of the results being released), but must provide the evaluation results to us for review before release. There are no financial transactions between us and METR.

51. External risk evaluation (Score: 1)

Are the parties contracted to evaluated model risks disclosed?

Disclosure:

Our contracted partners for evaluating model risks are Robust Intelligence, a Cisco company (automated red teaming tool), and HiddenLayer (manual red teaming of Granite-vision). We share further information about our external red teaming partners at this external web page: www.ibm.com/granite/docs/responsible-ai/

References:

Not disclosed

Score justification:

The developer discloses the names of the contracted partners for evaluating model risks, which is enough information to construct the statistics as required by the indicator.

Indicator notes:

We will award this point if the developer discloses statistics regarding all contracted parties that are responsible for evaluating risks (not limited to external entities or pre-deployment evaluation). This includes the number of contracted for-profit or non-profit entities, government entities, independent contractors, and researchers contracted by the developer to evaluate risks. We will award this point if the developer discloses it has no such contracts.

Example disclosure:

Contracted parties responsible for evaluating risks: (1) 2 contracting non-profits (2) 5 independent contractors (3) 0 government entities (4) 20 researchers

52. Mitigations taxonomy (Score: 1)

Are the post-training mitigations implemented when developing the model disclosed?

Disclosure:

Safety risks are addressed during the SFT phase of alignment.

References:

Not disclosed

Score justification:

The developer discloses the post-training mitigations implemented (SFT).

Indicator notes:

By post-training mitigations, we refer to interventions implemented by the developer during the post-training phase to reduce the likelihood and/or the severity of the model’s risks. We recognize that different developers may use different terminology for mitigations, or conceptualize mitigations differently. We will award this point for a complete list of mitigations considered, even if some of the mitigations are not reflected in the final model. Alternatively, we will award this point if the developer reports that it does not mitigate risk in this way.

Example disclosure:

We implement supervised fine tuning and reinforcement learning with human feedback to address model risks. We use no other methods to address risks.

53. Mitigations taxonomy mapped to risk taxonomy (Score: 1)

Does the developer disclose how the post-training mitigations map onto the taxonomy of risks?

Disclosure:

We use both supervised fine tuning across all of these categories to address model risks: 1) Explicit content 2) Deception 3) Discrimination 4) Harmful information 5) Violence 6) Substance abuse 7) PII leakage

References:

Not disclosed

Score justification:

The developer discloses the mapping from mitigations to risks.

Indicator notes:

We will award this point for a complete mapping of the primary risk that each mitigation is meant to address, even if the mitigation potentially maps on to other risks in the taxonomy. Alternatively, we will award this point if the developer reports that it does not mitigate risk.

Example disclosure:

We use supervised fine tuning for general instruction following. We use RLHF to reduce the model's propensity to output information about cybercrimes, bioweapons, disinformation, content harassing someone, and PII.

54. Mitigations efficacy (Score: 1)

Does the developer evaluate and disclose the impact of post-training mitigations?

Disclosure:

We do not measure risk pre-mitigations, as prior to our SFT-based mitigations the model is in its base model format, which is not instruction-tuned, and is therefore not suitable for evaluation using structured benchmarks designed for post-trained models.

References:

Not disclosed

Score justification:

The developer provides an explanation as to why they do not mitigate risk in this way: the pre-mitigations model is a base-model, so it's not possible to evaluate risks in an analogous format.

Indicator notes:

We will award this point if the developer discloses the results on the risk evaluations before and after the post-training mitigations are applied. Alternatively, we will award this point if the developer reports that it does not mitigate risk in this way.

Example disclosure:

Pre-mitigations (measured through mean attack success rate on HarmBench): (1) Misinformation: 0.80 (2) Harassment: 0.91 (3) Cybersecurity risks: 0.56 (4) Bioweapons design (subset of Chemical Biological in HarmBench): 0.62 (5) Personally-identifiable information (subset of General Harm in HarmBench): 0.52 Post-mitigations (measured through mean attack success rate on HarmBench): (1) Misinformation: 0.02 (2) Harassment: 0.01 (3) Cybersecurity risks: 0.10 (4) Bioweapons design (subset of Chemical Biological in HarmBench): 0.12 (5) Personally-identifiable information (subset of General Harm in HarmBench): 0.02

55. External reproducibility of mitigations evaluation (Score: 0)

Are code and prompts to allow for an external reproduction of the evaluation of post-training mitigations disclosed?

Disclosure:

We do not measure risk pre-mitigations, as prior to our SFT-based mitigations the model is in its base model format, which is not instruction-tuned, and is therefore not suitable for evaluation using structured benchmarks designed for post-trained models.

References:

Not disclosed

Score justification:

The developer does not release reproducible mitigations evaluations. Note that, unlike the previous indicator, exceptions are granted only for if the developer reports that it "does not mitigate risk" (versus if the developer reports that it "does not mitigate risk in this way").

Indicator notes:

The released code and prompts need not be the same as what is used internally, but should allow the developer's results on all mitigations evaluations to be reproduced. The released code must be open-source, following the OSI definition of open-source. Alternatively, we will award this point if the developer reports that it does not mitigate risk.

Example disclosure:

We release the code and prompts for reproducing post-training mitigation evaluations at this GitHub link: [URL]

56. Model theft prevention measures (Score: 1)

Does the developer disclose the security measures used to prevent unauthorized copying (“theft”) or unauthorized public release of the model weights?

Disclosure:

The Granite models are opensource models under Apache2.0 license, so model weights are freely available on Hugging Face. During training, our models are trained on IBM’s internal Blue Vela supercomputer, with access controls. Once training is complete, the model is stored in the DMF Lakehouse (our internal governance and lineage platform) under access controls until it is released publicly on Hugging Face.

References:

Not disclosed

Score justification:

The developer discloses model theft mitigations both before and after release (access controls before release & no mitigations after release).

Indicator notes:

This indicator assesses the developer's disclosures regarding how it addresses the risk that malicious actors or insiders could exfiltrate or replicate proprietary weights. Security measures could include insider threat analysis and detection, in addition to external threat management. Examples of such measures include encryption at rest, key management, remote attestation, or auditing for suspicious queries. We will award a point if the developer discloses specific steps taken to safeguard the model weights or that none are implemented.

Example disclosure:

We store model weights on encrypted volumes with hardware-based key management. We monitor inference queries for suspicious patterns (like repeated attempts to reconstruct weights token-by-token), and we audit all staff access logs monthly.

57. Release stages (Score: 1)

Are the stages of the model's release disclosed?

Disclosure:

Early checkpoints are forked off between 2-6T tokens for preliminary evaluations, if performant, these checkpoints are made available to internal users for testing, development, and early red-teaming. Model continues to train until target performance is met Before the official launch, there is an internal testing period where all final evaluations and red-teaming are performed. If time permits, endpoints are stood-up and shared with leaderboard providers for early evaluation under NDA. Approximately one week before the release, checkpoint weights are shared with launch partners, so that the models can be provisioned across their platforms in time for the launch.

References:

Not disclosed

Score justification:

The developer outlines the release stages for the model.

Indicator notes:

Release stages include A/B testing, release on a user-facing product, GA release, open-weight release, etc. We recognize that the release of a foundation model falls along a spectrum, with many forms of partial release, and that different developers may conceptualize release differently. We will award a point if the developer provides a clear identification of the stages through which the model was released.

Example disclosure:

We began with an internal alpha test for two weeks, followed by a closed beta with selected enterprise partners for one month, then a public waitlisted preview, and finally a general availability release once thresholds on safety benchmarks were met.

58. Risk thresholds (Score: 1)

Are risk thresholds disclosed?

Disclosure:

For any model release, we have a general safety release threshold where the model must achieve 0.8 on AttaQ, measured as the average across all subcategories scores (Explicit content, Deception, Discrimination, Harmful information, Violence, Substance abuse, PII leakage. Scoring lower than this threshold would hold the release until safety can be improved.

References:

Not disclosed

Score justification:

The developer discloses risk thresholds.

Indicator notes:

Risk thresholds determine when a risk level is unacceptably high to a developer (e.g. leading to the decision to not release a model), moderately high (e.g. triggering additional safety screening), or low enough to permit normal usage. We will award this point if the developer discloses explicit risk thresholds that clarify (i) which harmful outcomes are being scored, (ii) how the scores are computed (in general terms, not necessarily disclosing internal algorithms), and (iii) what triggers an action to block, delay, or otherwise modify a model's release. Alternatively, we will award a point if the developer discloses that it does not consider explicit risk thresholds during model release.

Example disclosure:

Our risk threshold for biorisks is the ability to autonomously create bioweapons. Current models score a medium: they don't autonomously create bioweapons but could help a skilled practitioner with access to materials in speeding up creation of bioweapons. Risk thresholds higher than medium would delay the model's release until the risk level drops to medium or below.

59. Versioning protocol (Score: 1)

Is there a disclosed protocol for versioning and deprecation of the model?

Disclosure:

Model versions follow an X.X nomenclature. A major, X.0, release (such as Granite 4.0), will always indicate there are major architectural changes in the model being released. A minor change, including newly fine-tuned versions of previous released models, or updated phase 2 training of previously released models, will results in a point update (e.g. Granite 3.3). Models are periodically deprecated on watsonx.ai. Users are given a minimum 90 day notice, and are informed of the newer model that the user should switch to instead.

References:

Not disclosed

Score justification:

The developer discloses a versioning protocol. The developer also discloses the version deprecation/communication protocol.

Indicator notes:

We will award a point if the developer discloses how model versions are labeled, updated, deprecated, and communicated to users.

Example disclosure:

We version models based on the date of release: e.g., ModelName-11-01-2024. We additionally provide ModelName-latest, corresponding to the latest release. We deprecate versions of models when we plan to remove access to with a six months notice to users. Users should respond to model deprecation by switching to the newest version of the models or an equivalent non-deprecated model. Users can switch to a different model by replacing the model identifier (to e.g., ModelName-latest for the latest version) in API calls or through the Python SDK.

60. Change log (Score: 1)

Is there a disclosed change log for the model?

Disclosure:

-12/18/24, Granite 3.1 Update - New model featured an extended context length, from 4K to 128K, and improved general performance. Full details: https://www.ibm.com/new/announcements/ibm-granite-3-1-powerful-performance-long-context-and-more - 2/26/25, Granite 3.2 Update - New model featured CoT-reasoning capability that can be toggeled on and off, as well as new multi-modal support for vision understanding tasks. Full details: https://www.ibm.com/new/announcements/ibm-granite-3-2-open-source-reasoning-and-vision - 4/16/25, Granite 3.3 Update - New model featured improved math-based reasoning performance, and new multi-modal support for speech transcription tasks. Full details: https://www.ibm.com/new/announcements/ibm-granite-3-3-speech-recognition-refined-reasoning-rag-loras - 5/2/24, Granite 4.0 Preview - New model previewed the new Granite 4.0 architecture, showcasing improved inference efficiency and performance. Full details: https://www.ibm.com/new/announcements/ibm-granite-4-0-tiny-preview-sneak-peek

References:

Not disclosed

Score justification:

The developer discloses a change log that lists the feature updates and performance improvements.

Indicator notes:

We will award a point if the developer publishes a version-by-version record of new features, fixes, or performance improvements.

Example disclosure:

On 11/1/2024 (version ModelName-11-01-2024), we improved model reasoning in technical domains. This resulted in a 20-point increase on the MATH benchmark (from 62% to 82%). Past change logs can be viewed at [URL]

61. Foundation model roadmap (Score: 1)

Is a forward-looking roadmap for upcoming models, features, or products disclosed?

Disclosure:

In Q3, we plan to release a series of minor updates to the Granite 4 family that improve general performance and also introduce multi-modal features for Vision and Audio understanding. By end of year we anticipating releasing a Granite-4-Medium model, to complement Granite-4-Small and Granite-4-Tiny. We further plan updates for Granite Embedding models (including updated architecture and support for code and function calling), and new Granite Time Series models (including multivariate long sequence forecasting).

References:

Not disclosed

Score justification:

The developer discloses a forward-looking roadmap that describes upcoming model releases/features.

Indicator notes:

A foundation model roadmap is a transparent statement about how the developer intends to evolve or expand its LLM offerings, including upcoming models, major feature releases, or expanded products based on the model, along with approximate timelines or version milestones. It can be high-level (e.g., “new model Q2 2025”), but must exist publicly.

Example disclosure:

We plan to release ModelX2 in Q2 2025, featuring enhanced multilingual capabilities and improved retrieval. We also aim to launch an enterprise-specific product tier for regulated industries by early 2026.

62. Top distribution channels (Score: 1)

Are the top-5 distribution channels for the model disclosed?

Disclosure:

The top-5 distribution channels for Granite are: By tokens consumed: -watsonx.ai -Replicate By number of downloads: -Hugging Face -Ollama -LMStudio

References:

Not disclosed

Score justification:

The developer discloses the top-5 distribution channels and the ranking metrics used to determine the top-5.

Indicator notes:

We define distribution channels to be either an API provider (a pathway by which users can query the model with inputs and receive outputs) or a model distributor (a pathway by which model weights are released). We recognize that distribution channels may arise without the knowledge of the model developer. For example, the weights of a model may be released through one distribution channel and then be distributed through other channels. Distribution channels can be ranked by any reasonable metric (e.g., number of queries, number of downloads, number of users, revenue). A description of the metric should be provided. API providers and model distributors may be ranked separately using different metrics as long as the total number of distribution channels equals five (if five distribution channels exist). For example, the developer may choose to disclose the top-3 API providers (ranked by the number of queries) and the top-2 model distributors (ranked by the number of downloads).

Example disclosure:

We provide API access to the model through A, B, and C. We distribute model weights through D and E. We pick the top-3 API providers based on the average number of queries per month and the top-2 model weight providers based on the average number of downloads per month.

63. Quantization (Score: 1)

Is the quantization of the model served to customers in the top-5 distribution channels disclosed?

Disclosure:

watsonx.ai and Replicate are served in 16 bit precision. Ollama defaults to using a quantized version of uploaded models, specifically 4-bit quantization (Q4_0), LMSudio offers options between 3 and 8 bit precision. Hugging face model weights natively support bf16 or fp8 precision, and GGUF versions are also available.

References:

Not disclosed

Score justification:

The developer discloses the quantization across all distribution channels.

Indicator notes:

We will award this point for a disclosure of the model precision in each of the top-5 distribution channels.

Example disclosure:

We serve the model at 16-bit precision on all distribution channels.

64. Terms of use (Score: 1)

Are the terms of use of the model disclosed?

Disclosure:

Granite models are available via IBM’s watsonx service. Their terms of use may be found here: https://www.ibm.com/docs/en/watsonx/saas?topic=solutions-terms-use. Users of Granite models are indemnified when accessing the models via watsonx. More information on the uncapped IBM indemnification may be found here: https://www.ibm.com/docs/en/watsonx/saas?topic=models-choosing-model#indemno. We also opensource Granite models on Hugging Face under an Apache 2.0 license and are committed to opensource innovation. Apache 2.0 model licenses are noted on our Hugging Face model pages: https://huggingface.co/ibm-granite. Depending on the distribution channel for accessing the open source models, additional terms of use may apply: Hugging Face: https://huggingface.co/terms-of-service LMStudio: https://lmstudio.ai/app-terms Replicate: https://replicate.com/terms Our models are also available via Replicate's service. Their terms of use may be found here: https://replicate.com/terms.

References:

Not disclosed

Score justification:

The developer discloses the terms of use or model license across all distribution channels.

Indicator notes:

We define terms of use to include terms of service and model licenses. We will award this point for a pointer to the terms of service or model license. In the event that model's licenses are written more generally, it should be clear which assets they apply to. We recognize that different developers may adopt different business models and therefore have different types of model licenses. Examples of model licenses include responsible AI licenses, open-source licenses, and licenses that allow for commercial use. Terms of service should be disclosed for each of the top-5 distribution channels. However, we will award this point if there are terms-of-service that appear to apply to the bulk of the model’s distribution channels.

Example disclosure:

Our terms of service are published at https://ourcompany.com/model-tos - these terms cover both our API and all distribution channels for model weights.

65. Distribution channels with usage data (Score: 1)

What are the top-5 distribution channels for which the developer has usage data?

Disclosure:

We do not have access to external usage data for any distribution method of Granite models. Our terms & conditions do not permit us to have access to any watsonx.ai external usage data, even though watsonx.ai is owned and operated by IBM.

References:

Not disclosed

Score justification:

Disclosure about no usage data suffices

Indicator notes:

We define distribution channels to be either an API provider (a pathway by which users can query the model with inputs and receive outputs) or a model distributor (a pathway by which model weights are released). We recognize that distribution channels may arise without the knowledge of the model developer. For example, the weights of a model may be released through one distribution channel and then be distributed through other channels. Distribution channels can be ranked by any reasonable metric (e.g., number of queries, number of downloads, number of users, revenue). A description of the metric should be provided. We define usage data as any form of developer-exclusive data collected from any of a developer's distribution channel. A developer has access to usage data from a distribution channel if it is able to use that data for downstream purposes (e.g., analytics, training etc.). Usage data may be shared outside of the developer, but it is initially collected by the distribution channel and shared to the developer.

Example disclosure:

We have access to usage data through the distribution channels: A, B, and C.

66. Amount of usage (Score: 1)

For each of the top-5 distribution channels, how much usage is there?

Disclosure:

We do not have access to external usage data for any distribution method of Granite models. Our terms & conditions do not permit us to have access to any watsonx.ai external usage data, even though watsonx.ai is owned and operated by IBM.

References:

Not disclosed

Score justification:

Disclosure about no usage data suffices

Indicator notes:

Usage should be reported as the number of queries over the span of a month, reported to the precision of one significant figure (e.g., 50 million queries).

Example disclosure:

Distribution channel A: 50 million queries. Distribution channel B: 10 million queries. Distribution channel C: 10 million queries.

67. Classification of usage data (Score: 1)

Is a representative, anonymized dataset classifying queries into usage categories disclosed?

Disclosure:

We do not have access to external usage data for any distribution method of Granite models. Our terms & conditions do not permit us to have access to any watsonx.ai external usage data, even though watsonx.ai is owned and operated by IBM.

References:

Not disclosed

Score justification:

Disclosure about no usage data suffices

Indicator notes:

Developers may either share a fully public dataset or a partially restricted dataset (e.g., under a research license). We will award this point if there is a clear, aggregated or sample dataset that reveals categories of tasks/queries.

Example disclosure:

We provide quarterly releases of an anonymized dataset that classifies user queries into 20 broad job-related categories. Researchers can request access via [URL]. We ensure no PII is included.

68. Data retention and deletion policy (Score: 1)

Is a policy for data retention and deletion disclosed?

Disclosure:

IBM has global policies regarding the retention and deletion of data outlined here: https://www.ibm.com/us-en/privacy. This is covered in IBM's (PS) Privacy & AI Services Retention Policy that covers: Privacy Records, AI Records, Data and Models, and Integrated Governance Management System (IGMS) Records. Additionally, IBM has a DSR program. Data privacy regulations give individuals the right to request details of the Personal Information (PI) that IBM holds on them. These Data Subject Rights (DSR) requests may require the extraction, correction or deletion of the PI pertaining to an individual. Data is retained per a data retention policy OR if a DSR request is made.

References:

Not disclosed

Score justification:

Clear description of data retention and deletion policy

Indicator notes:

A data retention and deletion policy is a policy for removing particular data from the training set and/or preventing it from being used if there is a user or external request (e.g., “right to be forgotten”) that also covers internal data governance. This includes whether there is a formal process to delete or retract data from future training runs and how long raw data is retained. It also clarifies how quickly deletions propagate to the model (e.g., “only in subsequent major model releases”).

Example disclosure:

We honor verified user requests to delete personal data from our training corpus by removing it from any subsequent scheduled retraining. Our data retention policy ensures chat logs are purged after 90 days.

69. Geographic statistics (Score: 0)

Across all forms of downstream use, are statistics of model usage across geographies disclosed?

Disclosure:

To-date, there have been millions of public downloads of Granite across our opensource channels (e.g., Hugging Face, Ollama). We do not have access to country-level metrics across these platforms. For watsonx.ai, Granite has been available globally in the US, Canada, Australia & New Zealand, Europe (with data centers in Frankfort, London), and Tokyo, with a majority of the usage in the US. A large portion of IBM's business is on-premises, meaning we do not have access to detailed usage data.

References:

Not disclosed

Score justification:

No disclosure of non on-prem usage

Indicator notes:

We will award this point if there is a meaningful, though potentially incomplete or vague, disclosure of geographic usage statistics at the country-level.

Example disclosure:

We share anonymized per-country usage metrics in a publicly accessible dashboard, updated monthly, on this link: [link]

70. Internal products and services (Score: 1)

What are the top-5 internal products or services using the model?

Disclosure:

At this point in time, Granite models are embedded to enable specific features across 7 IBM products, listed in no particular order: -watsonx Orchestrate -watsonx Code Assistant (IBM Z, Ansible, Java, IBM i) -watsonx.ai -watsonx.Gov -watsonx.Data -Instana -Maximo Application Suite

References:

Not disclosed

Score justification:

All IBM products listed

Indicator notes:

An internal product or service is a product or service built by the developer. Products or services can be ranked by any reasonable metric (e.g., number of users, queries, revenue). A description of the metric should be provided.

Example disclosure:

The model is used in products A, B, C, D, and E. We choose products based on the number of montly active users.

71. External products and services (Score: 1)

What are the top-5 external products or services using the model?

Disclosure:

We do not have access to any metrics about Granite's use in external products or services. We know who our clients are but do not have metrics of their usage and marketing will separately reach out to clients for public case studies (studied in 77). Contractual terms are confidential and proprietary and we would not have access to that information to make available.

References:

Not disclosed

Score justification:

Developer discloses no access to such data

Indicator notes:

An external product or service is a product or service built by a party external to the developer. Products or services can be ranked by any reasonable metric (e.g., number of users, queries, revenue). A description of the metric should be provided. We will award a point if the developer discloses that that it does not have access to such metrics about external products or services.

Example disclosure:

The model is used in products A, B, C, D, and E. We choose products based on the number of montly active users.

72. Users of internal products and services (Score: 1)

How many monthly active users are there for each of the top-5 internal products or services using the model?

Disclosure:

Due to the B2B enterprise nature of our business, and on-premise deployment pattern for many of our products, we do not have access to this information at this level. We do not have access to these metrics for the reasons stated above and because many of our products are deployed on-prem (not SaaS).

References:

Not disclosed

Score justification:

No access to this information so no disclosure

Indicator notes:

An internal product or service is a product or service built by the developer. The number of users refers to users who engaged or interacted with the model through the internal product or service over the last month or averaged over the last X months (this should be specified). Number of users should be specified to one significant figure (e.g. 100,000).

Example disclosure:

Over the last 6 months, the total monthly active users for our top-5 products using model Y are: Product A: 100,000 users Product B: 30,000 users Product C: 10,000 users Product D: 10,000 users Product E: 10,000 users

73. Consumer/enterprise usage (Score: 1)

Across all distribution channels for which the developer has usage data, what portion of usage is consumer versus enterprise?

Disclosure:

IBM is a B2B company. Our Granite models are built specifically for enterprise use. With the exception of Hugging Face, Ollama, LMStudio, Replicate -- where we would anticipate public downloads/usage to include both consumer and enterprise users and where these metrics are not available -- our IBM distribution channels are exclusively focused on enterprise customers

References:

Not disclosed

Score justification:

Reasonable attempt to describe nearly whole enterprise usage

Indicator notes:

Consumer usage refers to usage by individual consumers. Enterprise usage refers to usage by enterprise customers (including government use). Consumer and enterprise usage should be calculated in terms of the number of queries by or the amount of revenue from consumer or enterprise users. Percentages should be specified to two significant digits (e.g., 12% consumer, 88% enterprise).

Example disclosure:

12% of the usage of model A across all distribution channels is from consumers, 88% is from enterprise users. Of this 88%, 6% is from users at governments. Usage is calculated based on number of queries.

74. Enterprise users (Score: 0)

Across all distribution channels for which the developer has usage data, what are the top-5 enterprises that use the model?

Disclosure:

We do not have access to external usage data for any distribution method of Granite models.

References:

Not disclosed

Score justification:

While IBM discloses they do not collect external usage data, there are other means for identifying the top-5 enterprise users by some metric (e.g. the total revenue IBM generates from the relationship with the enterprise or the amount earned via specific enterprise contracts).

Indicator notes:

Enterprises should be ranked by the number of queries made or the amount of revenue from usage since the model's release. We will also award this point if the developer indicates it does not have access to enterprise usage data.

Example disclosure:

The top-5 enterprises are A, B, C, D, and E. The enterprises are selected based on the number of queries.

75. Government use (Score: 1)

What are the 5 largest government contracts for use of the model?

Disclosure:

As open weight models, Granite models are available for any government entity to use without a contract with IBM. IBM can not share customer details on Government contracts for IBM products or services, without prior customer consent.

References:

Not disclosed

Score justification:

No disclosure suffices because of no prior consent of government users

Indicator notes:

This includes known government contracts of enterprise or government-specific products and services that use the model. We will award this point if the developer discloses its top five government contracts ranked monetary value, though the developer may omit contracts where it is under NDA regarding the existence of the contract.

Example disclosure:

The five largest government users of our service, along with their use cases, are: 1. County A is utilizing our product for improving access to internal resources 2. National Lab B is using our model to advance bioscientific research. 3. Federal agency C is using our product to deliver faster, more accurate translation services 4. City D is participating in a pilot program found our product helped reduce the time spent on routine tasks 5. Country E is using our product to summarize legal documents in their lower courts.

76. Benefits Assessment (Score: 1)

Is an assessment of the benefits of deploying the model disclosed?

Disclosure:

The benefits of deploying the model are observed across a wide range of use cases. For example: - The U.S. Open uses Granite foundation models to provide commentary for hundreds of matches. They achieved a 220% increase in match reports created by leveraging Granite. - Blue Pearl managed to cut 65% of data processing and analysis time through a job-matching engine built on Granite. For more use cases and deployment benefits, please refer to the official website: https://www.ibm.com/granite.

References:

Not disclosed

Score justification:

Two customer use cases with quantitative performance improvements are provided

Indicator notes:

We will award this point for any quantitative assessment of the benefits or potential benefits of deploying the model.

Example disclosure:

We analyze the impact of using the model in education outcomes using a randomized controlled trial in third grade math assignnments, and find that use in the classroom improves standardized test outcomes by 26%. [Link to report.]

77. AI bug bounty (Score: 1)

Does the developer operate a public bug bounty or vulnerability reward program under which the model is in scope?

Disclosure:

Yes, we run a vulnerability disclosure program through PSIRT for which any harmful or unsafe response from the model is in scope, so long as the response was generated using an IBM product, according to IBM Software terms of use (e.g. adversarial attacks or misuse of the model are out of scope). Submission process can be found here: https://hackerone.com/ibm. In addition, IBM recently engaged HackerOne to stand up an expanded bug bounty program specifically for Granite with up to $100k in bug bounty payouts over the first year of the program. This program is in the process of being rolled out for the Granite 4.0 release this summer. This program will start with any harmful unsafe response that can bypass our Granite Guardian guardrail model being in scope (especially those generated by adversarial attacks). As the program matures, we envision running specific campaigns to target different types of risks, or less common attacks. Commonly reported attacks will be listed as out of scope, to ensure the program identifies high quality, diverse attacks.

References:

Not disclosed

Score justification:

Both previous and new bug bounty are described

Indicator notes:

We will award this point for a publicly documented bug bounty or vulnerability reward program describing (i) in-scope vulnerabilities (e.g., prompt bypasses, data leaks), (ii) out-of-scope items, (iii) submission process, and (iv) reward tiers or recognition if applicable. We will award a point if the developer discloses it has no AI bug bounty that encourages external researchers to report security, privacy, or adversarial vulnerabilities in the model.

Example disclosure:

We run a bug bounty program with HackerOne. We award up to $5,000 for critical vulnerabilities, such as discovering a major exploit that circumvents our content filters or reveals private data. [link to bug bounty]

78. Responsible disclosure policy (Score: 1)

Does the developer clearly define a process by which external parties can disclose model vulnerabilities or flaws?

Disclosure:

Yes, external parties can disclose model vulnerabilities or flaws via the PSIRT process, outlined here: https://hackerone.com/ibm. Once filed, the PSIRT team at IBM triages inbound reports and notifies the Granite and watsonx teams that a report has been received. The Granite team or watsonx team then takes appropriate action, which may include remediating the identified issues, in a timely manner. IBM is also in the process of setting up a Bug Bounty program for Granite, to be launched later this summer.

References:

Not disclosed

Score justification:

PSIRT process suffices

Indicator notes:

We will award this point for a description of the process external parties can use for responsbly disclosing model vulnerabilities and flaws, which should include (i) what mechanism external parties can use to disclose vulnerabilities or flaws (e.g., a form, an email) and (ii) what process follows a disclosure (e.g., how much time must parties wait until public release). This is often included with a bug bounty, but can also be standalone. We will award a point if the developer discloses it has no responsible disclosure policy.

Example disclosure:

We maintain a responsible disclosure policy at [URL] that describes how external parties can disclose vulnerabilities and flaws in Model A, including a 45-day disclosure window and an official contact for urgent security vulnerabilities.

79. Safe harbor (Score: 1)

Does the developer disclose its policy for legal action against external evaluators conducting good-faith research?

Disclosure:

Good Faith Security Research of IBM security flaws or vulnerabilities is protected by the IBM Safe Harbor Policy. For reporting methods available and for full details of the IBM Safe Harbor Policy, visit the 'Vulnerability Reporting' section of the IBM Security vulnerability management page at https://www.ibm.com/trust/security-vulnerability-management.

References:

Not disclosed

Score justification:

Safe harbor is provided in the policy and in the description

Indicator notes:

We will award this point if the developer discloses whether it has a policy committing it to not pursue legal action against external evaluators conducting good-faith research. This should not be only for software security vulnerabilities, but also AI flaws, and it should be based on researcher conduct standards, not at the sole discretion of the company. We will award this point if the developer provides a clear description of its policy regarding such protections for external researchers, or lack thereof.

Example disclosure:

We do not have a policy for researcher protections for good-faith safety research. OR Our policy ensures no legal action against good‐faith researchers who follow our disclosure guidelines, see: [link]

80. Security incident reporting protocol (Score: 1)

Are major security incidents involving the model disclosed?

Disclosure:

All security incident reporting is issued through IBM's PSIRT (Product Security Incident Response Team). Submissions may be made via HackerOne at hackerone.com/ibm, with reports then forwarded to PSIRT for triaging and remediation. The PSIRT team will notify the Granite team and watsonx. Average time to first response on the HackerOne website for IBM is 11 hours, and the Granite team endeavors to respond in a timely manner. Model security incident reports may be used to inform future AI alignment work, with a dedicated security focal receiving notice of inbound PSIRT tickets for Granite. Concerning disclosure, “IBM follows common industry practices for coordinated and responsible vulnerability disclosure processes during such investigations. In which, IBM requests all Security Researchers to allow IBM the opportunity to follow this process and remediate any reported vulnerabilities before you publicly disclose or share the vulnerability or methods to exploit with any third party. The recommended time frame for disclosure is no sooner than 30 days after the fix is made publicly available." Full text of the program policy copied below as reference: Policy Last updated on March 19, 2025. View changes IBM recognizes how important the security community is in keeping our IBM products, offerings, services, websites and secrets safe for our customers and users. We thank you in advance for your contributions to our vulnerability disclosure program. Vulnerability reports submitted via this program will be handled by IBM’s global Product Security Incident Response Team (PSIRT). This team will coordinate with other IBM teams to investigate, and if needed, identify the appropriate response plan. Maintaining communication between all involved parties, both internal and external, is a key component of our vulnerability response process. Scope • This Program is limited to exploitable security vulnerabilities and CVE found in IBM products, offerings, services, websites and secrets. • We ask that customers and other entitled users of an IBM product or offering contact IBM Technical Support to report any potential issues that they may discover in their use of those products. • Please only report vulnerabilities for IBM products that are still being supported by IBM. Check our IBM Support Software lifecycle at https://www.ibm.com/support/pages/lifecycle/ to determine which product versions are still supported. Process • IBM aims to respond to all new vulnerability reports within 7 business days. • To protect our customers, IBM does not publicly disclose or confirm security vulnerabilities until IBM has conducted a full analysis of the reported vulnerability and issued any necessary fixes or mitigations. • IBM follows common industry practices for coordinated and responsible vulnerability disclosure processes during such investigations. In which, IBM requests all Security Researchers to allow IBM the opportunity to follow this process and remediate any reported vulnerabilities before you publicly disclose or share the vulnerability or methods to exploit with any third party. The recommended time frame for disclosure is no sooner than 30 days after the fix is made publicly available. • IBM does not participate in a bug bounty awards program at this time. However, when a vulnerability is confirmed, remediated, and then disclosed - we will offer to recognize and credit the vulnerability reporter within our public disclosure. Guidelines • When submitting reports to us, we ask that you combine reports if the same or similar root cause affects multiple endpoints, subdomains or assets. • Do not include any information in vulnerability reports, including in any attachments, that may identify an individual (such as a name, contact information, IP address or other similar information). • In researching a vulnerability do not cause harm to IBM or our customers, attempt to access our offices, data centers, user accounts other than your own, test for spam, phishing, social engineering or denial of service issues, violate any applicable law, disrupt or compromise any data that is not your own, or further exploit a confirmed vulnerability. • For the quickest handling of any vulnerability submissions, please ensure that you demonstrate the steps taken to identify or recreate the vulnerability. • Findings which do not demonstrate any actionable vulnerability will not be accepted by this program. Examples of such non-vulnerabilities include content spoofing or text injection situations with no clear attack vector, and disclosure of information that is intended to be publicly accessed or otherwise does not present real risk to IBM or our customers. IBM Safe Harbor Policy Good Faith Security Research of IBM security flaws or vulnerabilities is protected by the IBM Safe Harbor Policy. For reporting methods available and for full details of the IBM Safe Harbor Policy, visit the 'Vulnerability Reporting' section of the IBM Security vulnerability management page at https://www.ibm.com/trust/security-vulnerability-management. Legal Notice So that IBM may utilize your vulnerability report to determine and develop appropriate remediation procedures, by submitting a vulnerability report to IBM, you grant to IBM Corporation, its subsidiaries and its affiliates, a perpetual, irrevocable, no charge license to all intellectual property rights licensable by you in or related to the use of this material. Also, for similar reasons, it is important that you notify us if any of this material is not your own work or is covered by the intellectual property rights of others. Not notifying us means that you've represented that no third-party intellectual property rights are involved.

References:

Not disclosed

Score justification:

Timeframe and terms are given

Indicator notes:

A security incident reporting protocol provides post-deployment transparency about serious incidents or breaches. Security incidents refer to incidents where external security threats affect the model (e.g., data breaches or DDoS attacks on the service). We will award this point if the developer states (i) how to submit a security incident report, (ii) how quickly it will respond, and (iii) when and whether results are disclosed. Every incident need not be reported publicly, but the developer must disclose a policy determining how incidents are reported and disclosed.

Example disclosure:

We publish a public ‘Security Incident Report’ on our website for any confirmed security incident affecting the model within 7 days of a patch being implemented. Users and researchers can report incidents via security@ourcompany.com, and we commit to an initial acknowledgment within 48 hours.

81. Misuse incident reporting protocol (Score: 1)

Are misuse incidents involving the model disclosed?

Disclosure:

IBM's Critical Misuse Incident Reporting may be found, below. In terms of process, disclosed incidents are sent to the PSIRT team for triaging via HackerOne submissions here. Tickets flagged for Granite are then forwarded in a timely manner to the Granite team (i.e., our dedicated security focal and senior technical program manager for AI governance) in addition to the watsonx team. If a disclosure is identifed as a critical misuse incident, then the report would trigger the following policy. Critical Misuse Incident Reporting 1. Preamble 1.1 The Data and Model Factory takes model security, including post-deployment security, seriously and endeavors to remediate issues in a timely manner, including by working with internal and external partners, as required. 1.2 A critical misuse incident is model use that leads to material (see footnote 1) real-world harm. This may include, but is not limited to, malicious misinformation or foreign-backed offensive cybersecurity operations. 2. General Policy 2.1 This Policy applies to opensourced Granite models. 2.2 Incident reports must be reported to PSIRT. 2.3 The Granite team endeavors to respond in a timely manner (see footnote 2) and in accordance with the guidance, policies, and best practices of PSIRT. 2.4 Disclosure may be made, upon recommendation of the AI security focal designated with remediating the issue. Disclosure may occur via formal channels. (See footnote 3.) Footnotes: 1. Materiality in the context of model security is an executive leadership determination, based on the relevance and significance of the critical misuse identified. 2. The team endeavors to respond to critical misuse incidents within 48 hours or as soon as possible. 3. This includes but is not limited to: https://www.ibm.com/granite, https://www.ibm.com/granite/docs/, https://research.ibm.com/blog, or other relevant channels for disclosure.

References:

Not disclosed

Score justification:

Description includes terms and how something might be disclosed

Indicator notes:

A misuse incident reporting protocol provides post-deployment transparency about incidents of misuse involving the model. As opposed to the previous indicator, this indicator is about actors misusing the model to cause real-world harm, such as misinformation operations or cybersecurity attacks. We will award this point if the developer states (i) how to submit a misuse incident report, (ii) how quickly it will respond, and (iii) when and whether results are disclosed. Every incident need not be reported publicly, but there needs to be a policy governing how incidents are reported.

Example disclosure:

We publish a public ‘Misuse Incident Report’ on our website for any confirmed misuse incident within 7 days of a patch being implemented. Users and researchers can report incidents regarding our flagship foundation model via security@ourcompany.com, and we commit to an initial acknowledgment within 48 hours.

82. Post-deployment coordination with government (Score: 1)

Does the developer coordinate evaluation with government bodies?

Disclosure:

We do not coordinate with any government entities or AI Safety Institutes on post-development evaluations for Granite. We do actively contribute to opensource evaluations, see https://huggingface.co/datasets/ibm-research/AttaQ and https://huggingface.co/datasets/ibm-research/ProvoQ. In addition, our policy team regularly contributes inputs to federal, state, and foreign (non-US) regulators’ calls for comments. IBMers also testify before Congress and other legislative bodies (e.g., https://www.judiciary.senate.gov/imo/media/doc/2023-05-16 - Testimony - Montgomery.pdf) to inform the policymaking process.

References:

Not disclosed

Score justification:

Confirms no post-deployment coordination

Indicator notes:

We will award this point if the developer specifies which government bodies it is coordinating with and for what types of post-deployment evaluations. Government bodies include AI Safety Institutes, national security agencies, national labs, and international governmental enties such as UN agencies or the G7. Evaluation here may also include sharing of the developer's proprietary evaluation results for help with interpretation.

Example disclosure:

We do not coordinate with any government entities or AI Safety Institutes. OR We coordinate with the UK AISI for post-deployment evaluation of cyber, CB, and autonomy-related capabilities.

83. Feedback mechanisms (Score: 1)

Does the developer disclose a way to submit user feedback? If so, is a summary of major categories of feedback disclosed?

Disclosure:

Community users can disclose feedback in our Github repo as issues, on our Hugging Face model cards as community comments, or on our Granite threads on reddit, all of which are reviewed by community triagers. We find that users mainly report issues or bugs in the initial configuration files that our posted, issues using the chat templates, and requests for new features, such as additional languages.

References:

Not disclosed

Score justification:

Several feedback mechanisms are provided, summary of feedback also given

Indicator notes:

We will award this point if the developer (i) discloses how users can submit feedback (e.g., via a form or a thumbs up/thumbs down for model responses) and (ii) discloses aggregated or categorized feedback data (e.g. a categorization of thumbs up and thumbs down data).

Example disclosure:

Users can submit feedback at this url: [URL] We find that users mainly report issues with API call response times, over-refusals from models, and outdated information in model outputs. A detailed categorization of user reports is available at [URL]

84. Permitted, restricted, and prohibited model behaviors (Score: 1)

Are model behaviors that are permitted, restricted, and prohibited disclosed?

Disclosure:

All Granite models are released under an Apache 2.0 license on Hugging Face. When accessed in the open source, we do not impose restrictions on model behavior, althoguh we do pubish information on recommened uses and discouraged uses in our Responsible Use Guide (https://www.ibm.com/granite/docs/resources/responsible-use-guide.pdf). Watsonx is a downstream catcher/consumer of Granite models for IBM and makes its AI models, including Granite, available under the following restrictions: AI Use Restrictions. Client agrees not to, and not to direct or allow third parties, use foundation models in connection with the use of this Cloud Service : (i) for mass surveillance, racial profiling, or any use that violates or encourages the violation of basic human rights or other applicable laws and regulations; (ii) to distribute false, misleading, disparaging or obscene information or content; (iii) to provide fully automated decision making in connection use cases involving critical processes or the risk of loss of life, property or impact on an individual's legal rights; (iv) in a manner that impersonates another for deceptive purposes or conceals the fact a user is interacting with AI; or (v) to distribute or intentionally generate malware or other harmful code. This Cloud Service is not intended for use in a high-risk context as defined under the EU AI Act or other applicable regulations. If you chose to use it in such, please contact ChiefPrivacyOffice@ca.ibm.com. These may be found here: https://www.ibm.com/support/customer/csol/terms/?id=i126-6883&lc=en.

References:

Not disclosed

Score justification:

Description of model behavior suffices

Indicator notes:

We refer to a policy that includes this information as a model behavior policy, or a developer's policy on what the foundation model can and cannot do (e.g. such a policy may prohibit a model from responding to NSFW content). We recognize that different developers may adopt different business models and that some business models may make enforcement of a model behavior policy more or less feasible. We will award this point if at least two of the three categories (i.e. permitted, restricted, and prohibited model behaviors) are disclosed. Alternatively, we will award this point if the developer reports that it does not impose any restrictions on its model's behavior in this way.

Example disclosure:

We allow responses from Model A that include broad Q&A, restrict sexual or harassing content, and prohibit facilitating illegal or violent acts. More details can be found in our guidelines for model behavior here: [link]

85. Model response characteristics (Score: 1)

Are desired model response characteristics disclosed?

Disclosure:

Leveraging the system prompts we configure the model to be accurate, reliable, and helpful (e.g., address user needs, anticipate follow-up questions) while maintaining a cautious (i.e., factually correct, neutral, and considerate responses), ethical (i.e., promote positive behavior, encourage inclusivity, ensure respectful tone), and harmless approach ensuring never to provide inappropriate, offensive, or harmful information. More details can be found in Appendix I of our Responsible Use Guide, found here: https://www.ibm.com/granite/docs/resources/responsible-use-guide.pdf .

References:

Not disclosed

Score justification:

Accurate, reliable helpful, cautious, ethical, harmless

Indicator notes:

Model response characteristics include default behaviors or behaviors that the developer steers the model to take. These may include being helpful, taking an objective point of view, or using tools only when necessary. We will award points for a clear description of desired model response characteristics or a statement that there are no such characteristics.

Example disclosure:

We configure responses from Model A to be factual, neutral, and contextually helpful, avoiding personal or biased opinions. More details can be found in our guidelines for model behavior here: [link]

86. System prompt (Score: 1)

Is the default system prompt for at least one distribution channel disclosed?

Disclosure:

Yes, for all Granite models, the default system prompts, which are used as-is in distribution channels such as Replicate, can be found in their respective tokenizer_config.json files on https://huggingface.co/ibm-granite. The default system prompt specifies that the model should be "A helpful AI assistant", and depending on the configuration selected by the developer, the prompt also explains how tool calling should be executed, how to leverage to grounding documents provided in the chat template, and how to respond with chain of thought reasoning.

References:

Not disclosed

Score justification:

Default system prompt disclosed

Indicator notes:

A system prompt is defined as the prompt provided to the system by default that guides the system's behavior. We will award this point for the disclosure of the verbatim text of the full system prompt as well as an explanation for the context in which the system prompt is used.

Example disclosure:

We disclose our default prompt for Model A via our chat interface: ‘You are a helpful AI assistant providing clear, accurate, and policy‐compliant responses.’

87. Intermediate tokens (Score: 1)

Are intermediate tokens used to generate model outputs available to end users?

Disclosure:

Yes, starting with the Granite 3.2 models (and continuing in all subsequent releases), we offer the ability for users to display the model’s thought process or intermediate tokens when using ‘thinking’ modes. These tokens are directly made available to the user.

References:

Not disclosed

Score justification:

Intermediate tokens directly available to the user

Indicator notes:

Intermediate tokens are defined as any tokens generated by the model before the final output is shown to the user, such as model chains of thought. We will also award this point if a summary of intermediate tokens is made available to end users. If intermediate tokens or summaries are not made available, the developer should provide a justification.

Example disclosure:

Model A is trained to generate intermediate chain-of-thought reasoning, but we withhold most chain-of-thought tokens from final user-facing responses to prevent model distillation. We do disclose chains-of-thought for a small set of research collaborators under NDA.

88. Internal product and service mitigations (Score: 1)

For internal products or services using the model, are downstream mitigations against adversarial attacks disclosed?

Disclosure:

Watsonx.ai is the primary IBM product offering leveraging Granite. Designed as a developer platform for building with LLMs, watsonx.ai provides configurable guardrails that use a combination of BERT-based detectors for HAP/PII, and Granite Guardian, which is an LLM-based detector for users to safeguard their AI applications against harmful content, jailbreaking attempts, prompt injections, etc. These items are available via dedicated detection APIs offered on the product that the developer can implement into their solution. We have provided additional information on Granite Guardian in Question 89, below.

References:

Not disclosed

Score justification:

Granite Guardian suffices

Indicator notes:

An internal product or service is a product or service built by the developer. Adversarial attacks include prompt injection, jailbreaking, or malicious queries. Mitigations against adversarial attacks might include specialized prompt filtering, content scanning, or real-time monitoring of queries or accounts. We will award this point if the developer discloses a clear statement of methods used (e.g., a specialized prompt sanitizer or adversarial pattern detector), or if the developer states it does not implement such product-level mitigations against adversarial attacks.

Example disclosure:

In our chatbot, we implement a second-stage content filter that checks user inputs for disallowed topics and attempts to sanitize adversarial prompts. We also log suspicious prompts for manual review.

89. External developer mitigations (Score: 1)

Does the developer provide built-in or recommended mitigations against adversarial attacks for downstream developers?

Disclosure:

Yes, when deploying Granite models (or any foundation models) responsibly, downstream developers and consumers/users of models should deploy models with guardrails enabled. IBM has opensourced Granite Guardian models explicitly for this purpose. The Guardian models cover jailbreaks (incl prompt injection attacks) and go even further to cover a comprehensive taxonomy of unsafe prompts (malicious queries) and model outputs. When accessed via watsonx.ai, granite guardian moderating can be easily configured using the "moderations" parameter in the watsonx.ai api endpoint, described at https://cloud.ibm.com/apidocs/watsonx-ai#deployments-text-generation. Brief overview of Granite Guardian: We have developed Granite Guardian using a comprehensive harm risk taxonomy and have expanded its capabilities to detect hallucinations. The risk taxonomy covers the following thematic areas: 1. Harm 2. Social bias 3. Profanity 4. Sexual content 5. Unethical behavior 6. Violence 7. Harm engagement 8. Evasiveness 9. Jailbreaking 10. RAG Safety – Groundedness 11. RAG Safety – Context relevance 12. RAG Safety – Answer relevance 13. Agentic Safety – Function calling hallucination Intended Use: Granite Guardian is useful for risk detection use-cases which are applicable across a wide-range of enterprise applications - - Detecting harm-related risks within prompt text, model responses, or conversations (as guardrails). These present fundamentally different use cases as the first assesses user supplied text, the second evaluates model generated text, and the third evaluates the last turn of a conversation. - RAG (retrieval-augmented generation) use-case where the guardian model assesses three key issues: context relevance (whether the retrieved context is relevant to the query), groundedness (whether the response is accurate and faithful to the provided context), and answer relevance (whether the response directly addresses the user's query). - Function calling risk detection within agentic workflows, where Granite Guardian evaluates intermediate steps for syntactic and semantic hallucinations. This includes assessing the validity of function calls and detecting fabricated information, particularly during query translation. Resources: - You may read more about the Granite Guardian models at the following link: https://github.com/ibm-granite/granite-guardian. - Please also refer to page 13 in the Responsible Use Guide for IBM Granite at the following link: https://www.ibm.com/granite/docs/resources/responsible-use-guide.pdf.

References:

Not disclosed

Score justification:

Granite Guardian intended also for this purpose

Indicator notes:

Downstream developers are developers who access the model through a distribution channel. Adversarial attacks include prompt injection, jailbreaking, or malicious queries. Mitigations against adversarial attacks that developers might build in or recommend include content filtering endpoints and recommended prompt templates. We will award this point if the developer discloses (i) technical mitigations (e.g., a developer provided moderation API or classifier) it offers or implements, (ii) recommended best practices or libraries for downstream developers, or (iii) an explicit statement that it does not build or recommend any particular downstream mitigations in this way..

Example disclosure:

Our API includes an optional parameter that will automatically filter user prompts and model outputs for hateful or disallowed content. We also publish guidelines for building robust chat interfaces that resist common prompt injections.

90. Enterprise mitigations (Score: 1)

Does the developer disclose additional or specialized mitigations for enterprise users?

Disclosure:

Enterprise users accessing Granite models from watsonx.ai get certain IP Indemnification protections from IBM covering their use. Open source use of Granite does not qualify for this protection (https://www.ibm.com/docs/en/watsonx/saas?topic=models-choosing-model#indemno.). As watsonx.ai is an enterprise offering, designed for enabling custom deployments in any on-prem or hybrid coud environment, enterprise customers can configure the install location to best suite their needs. As covered in Question 89, IBM also makes Granite Guardian available for use in watsonx.ai, so that users can set custom guardrails depending on their needs. IBM provides further information on security and privacy r/e model use on watsonx.ai here: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-security.html?context=wx&utm_source=chatgpt.com. Some example relevant text include: • IBM does not use your work to improve IBM models. • Text that you add to or submit in the Prompt Lab or API is not stored unless you save it. • IBM does not monitor or log foundation model input. • If watsonx.governance is provisioned, you can optionally monitor foundation model input. • Prompt remains encrypted while in transit. • The foundation models are hosted in IBM Cloud; your prompts are not sent to third-party platforms. • IBM does not monitor or log foundation model output. • Foundation model output is not stored unless you save the prompt that elicited the output as a prompt session asset. • IBM does not claim ownership rights to foundation model outputs. • IBM does not use your model outputs to improve IBM models. • If watsonx.governance is provisioned, you can optionally monitor foundation model output. We would generally categorize ‘additional’ or ‘specialized’ mitigations for enterprise users as being specific to (i) model output risk and (ii) legal risk.   Additional Model-Output Risk Mitigation for Enterprise Users: On the model output side, IBM makes Granite Guardian available for use in watsonx.ai, so that users can set custom guardrails depending on their needs. The Granite Guardian family is a collection of models designed to judge if the input prompts and the output responses of an LLM based system meet specified criteria. Link: https://github.com/ibm-granite/granite-guardian Granite Guardian scores particularly well in the following fields: HR, finance, law, and education (which together constitutes enterprise use). Legal Risk Mitigation for Enterprise Users: In addition, IBM takes legal risk for our enterprise customers seriously. This cannot be just an afterthought. Enterprise users accessing Granite models from watsonx.ai get certain IP Indemnification protections from IBM covering their use. IBM provides further information on security and privacy for enterprise model use on watsonx.ai here: Link: https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-security.html?context=wx&utm_source=chatgpt.com. Some additional relevant examples of mitigating measures for legal risk are found in the terms and conditions and other related policies in place, and include: • IBM does not use your work to improve IBM models. • IBM does not monitor or log foundation model input. • Prompt remains encrypted while in transit. • The foundation models are hosted in IBM Cloud; your prompts are not sent to third-party platforms.

References:

Not disclosed

Score justification:

Granite Gaurdian incorporated in watsonx.ai

Indicator notes:

Enterprise users are, for example, large organizations with dedicated service agreements or users of enterprise-specific API deployments or products and services. Additional or specialized mitigations may address enterprise needs such as data privacy controls, advanced prompt/response monitoring, or compliance checks with regulations such as GDPR or HIPAA. Additional or specialized mitigations may include single-tenant deployments, custom filters for specific regulated industries, or advanced logging for compliance. We will award a point if the developer at least describes these mitigations or states that it does not provide such additional or specialized enterprise mitigations.

Example disclosure:

Our enterprise offering for Model A includes a dedicated environment with stricter filtering, a HIPAA-compliant data retention policy, and the ability for enterprise admins to define custom blacklisted topics that the model must refuse.

91. Detection of machine-generated content (Score: 1)

Are mechanisms that are used for detecting content generated by this model disclosed?

Disclosure:

Given Granite's focus on small language models used in lower-risk applications and tasks, IBM does not use any mechanism for detecting content generated by the Granite model family at this point in time.

References:

Not disclosed

Score justification:

Disclosure of no detection

Indicator notes:

A mechanism for detecting machine-generated content might include storing a copy of all outputs generated by the model to compare against, implementing a watermark on model outputs, adding cryptographic metadata (such as C2PA), or training a detector post-hoc to identify such content. We will award this point if any such mechanism is disclosed or if the developer reports that it does not have or use any such mechanism.

Example disclosure:

We train a classifier using model generations and human-written text to identify machine-generated content from Model A and our other models.

92. Documentation for responsible use (Score: 1)

Does the developer provide documentation for responsible use by downstream developers?

Disclosure:

Yes, we provide detailed documentation for developers on responsible use through our Responsible Use Guide, available at the following link: https://www.ibm.com/granite/docs/resources/responsible-use-guide.pdf. This guide specifically targets responsible downstream use. Relevant abstract: This guide is a resource for business executives, product managers, developers, and other AI practitioners seeking to leverage foundation models in a responsible way for enterprise use. The Guide covers contemporary AI safety choices faced by model developers, overviews risk mitigation tools and resources, provides energy calculations for the sustainable use of IBM’s Granite models, and discusses other considerations for building and deploying responsible AI systems. Tools, best practices, and recommended actions in this Guide are tailored for enterprise users of opensource foundation models. Nonetheless, we expect that the tools, best practices, and recommended actions here- in will also inform broader corporate strategic decision making vis-à-vis the downstream deployment of AI technologies, capital allocation for AI safety projects, and ongoing discussion within public and private sectors on responsible AI systems. Finally, please reference the Granite /docs page: https://www.ibm.com/granite/docs/ which includes links to relevant best practices, how-tos, and tutorials around important topics like agents and guardrails.

References:

Not disclosed

Score justification:

Responsible use guide suffices

Indicator notes:

To receive a point, the developer should provide documentation for responsible use. This might include details on how to adjust API settings to promote responsible use, descriptions of how to implement mitigations, or guidelines for responsible use. We will also award this point if the developer states that it does not provide any such documentation. For example, the developer might state that the model is offered as is and downstream developers are accountable for using the model responsibly.

Example disclosure:

Our Developer Documentation Hub consolidates integration guides, responsible‐use guidelines, and best practices: [link]

93. Permitted and prohibited users (Score: 1)

Is a description of who can and cannot use the model on the top-5 distribution channels disclosed?

Disclosure:

For watsonx.ai, relevant use is governed by a general IBM acceptable use policy (https://www.ibm.com/granite/playground/terms/). We reserve the right to restrict use if this use policy is violated. As a general IBM policy, we do not sell IBM products or services to export controlled entities or persons on denied-parties lists or in countries under U.S. embargo. Granite models are available on Hugging Face, Ollama, and LMStudio as opensource models under a permissive Apache 2.0 license via its respective distribution channels, which does not restrict model use. For Replicate, use is goverened by their general Acceptable Use Policy, found here: https://replicate.com/acceptable-use-policy and terms of service, found here: https://replicate.com/terms LMStudio use is goverened by their terms of service (https://lmstudio.ai/app-terms), which includes a section on User Conduct. Hugging Face use is goverened by their terms of service (https://huggingface.co/terms-of-service)

References:

Not disclosed

Score justification:

Disclosed information suffices

Indicator notes:

We will award this point for a description of the company's policies for permitted and prohibitted users on its top-5 distribution channels. We will award this point if the developer has a more general acceptable use policy that it confirms applies across these distribution channels. We will award this point if there are no restrictions on users.

Example disclosure:

We allow usage by individuals 13 years of age or older who accept our Terms of Service. We prohibit use by export controlled entities or persons on denied-parties lists or in countries under U.S. embargo. We also reserve the right to restrict use if users engage in targeted harassment. For example, we only permit users over 13 with valid credentials, and prohibit usage from OFAC-sanctioned regions. We do not allow state-sponsored disinformation agencies to access our services.

94. Permitted, restricted, and prohibited uses (Score: 1)

Which uses are explicitly allowed, conditionally permitted, or strictly disallowed under the acceptable use policy for the top-5 distribution channels?

Disclosure:

watsonx.ai Ethical AI Use Sections (https://www.ibm.com/granite/playground/terms/) "Client agrees not to, and not to direct or allow third parties, use foundation models in connection with the use of this Cloud Service : (i) for mass surveillance, racial profiling, or any use that violates or encourages the violation of basic human rights or other applicable laws and regulations; (ii) to distribute false, misleading, disparaging or obscene information or content; (iii) to provide fully automated decision making in connection use cases involving critical processes or the risk of loss of life, property or impact on an individual's legal rights; (iv) in a manner that impersonates another for deceptive purposes or conceals the fact a user is interacting with AI; or (v) to distribute or intentionally generate malware or other harmful code. Granite is not intended for use in a high risk context as defined under the EU AI Act or other applicable regulations." For Replicate, Hugging Face, Ollama, and LMStudio, see the AUP/terms of service details above in Question 93.

References:

Not disclosed

Score justification:

No high risk use

Indicator notes:

We will award this point for a rough characterization of two or more of permitted, restricted, and prohibited uses across the top-5 distribution channels. We will award this point if the developer has a more general acceptable use policy that it confirms applies across these distribution channels. We will award this point if there are no restrictions on users.

Example disclosure:

Permitted uses include general conversational queries, brainstorming, and coding assistance. Restricted uses include adult or violent content that requires caution or additional review. Prohibited uses include facilitating illicit activity, disinformation campaigns, or harassment. For example, we permit typical user requests like Q&A, text generation, and educational uses. We restrict content that depicts graphic violence or sexual content by applying additional filters. We prohibit any use aiming to conduct unlawful surveillance, promote extremist violence, or defraud others.

95. AUP enforcement process (Score: 1)

What are the methods used by the developer to enforce the acceptable policy?

Disclosure:

IBM utilizes various methods to enforce its acceptance use policy, such as, self-reporting for any high-risk activity and suspension after one or more violation(s) of the acceptance use policy.

References:

Not disclosed

Score justification:

Some description of enforcement

Indicator notes:

We will award this point if the developer discloses the processes (automated or manual) it uses to detect, review, and respond to potential acceptable use policy violations. We will award this point for a reasonable best-effort attempt to provide the bulk of this information, though one line indicating the developer reserves the right to terminate accounts is insufficient. Alternatively, we will award this point if the developer reports that it does not use such methods to enforce its acceptable use policy.

Example disclosure:

We combine automated checks with human review for severe or repeated violations, issuing warnings or suspensions after repeat violations.

96. AUP enforcement frequency (Score: 0)

Are statistics on the developer's AUP enforcement disclosed?

Disclosure:

IBM does not publicly disclose statistics on developer’s AUP enforcement.

References:

Not disclosed

Score justification:

Company acknowledges no disclosure

Indicator notes:

We will award this point if the developer discloses enforcement statistics (e.g., violation counts or actions taken) from its enforcement of its acceptable use policy. Alternatively, we will award this point if the developer reports that it does not enforce its acceptable use policy.

Example disclosure:

We publish a quarterly enforcement report detailing violation counts by prohibited use category and the corresponding actions taken at [LINK]

97. Regional policy variations (Score: 1)

Are differences in the developer's acceptable use or model behavior policy across geographic regions disclosed?

Disclosure:

For watsonx, relevant terms of use may be found at the following link: https://www.ibm.com/granite/playground/terms/ (relevant section for the EU has been bolded below). AI Use Restrictions Client agrees not to, and not to direct or allow third parties, use foundation models in connection with the use of this Cloud Service : (i) for mass surveillance, racial profiling, or any use that violates or encourages the violation of basic human rights or other applicable laws and regulations; (ii) to distribute false, misleading, disparaging or obscene information or content; (iii) to provide fully automated decision making in connection use cases involving critical processes or the risk of loss of life, property or impact on an individual's legal rights; (iv) in a manner that impersonates another for deceptive purposes or conceals the fact a user is interacting with AI; or (v) to distribute or intentionally generate malware or other harmful code. This Cloud Service is not intended for use in a high risk context as defined under the EU AI Act or other applicable regulations.

References:

Not disclosed

Score justification:

EU variation disclosed

Indicator notes:

We will award this point if the developer discloses distinctions in its AUP or MBP and provides examples of differences in multiple specific regions, or states that no differences exist. For example, some jurisdictions impose content restrictions beyond those in the developer’s global policy that may necessesitate local deviations.

Example disclosure:

In the EU, our model automatically omits certain categories of political content to comply with local election laws. In all other regions, we follow the general global AUP at [URL].

98. Oversight mechanism (Score: 1)

Does the developer have an internal or external body that reviews core issues regarding the model prior to deployment?

Disclosure:

Yes, the IBM AI Ethics Board (“The Board") is a central component of IBM's commitment to responsibly developing and deploying AI. The Board is comprised of a diverse and multidisciplinary group of leaders from across the company, including experts in AI research, legal affairs, policy, privacy, and business operations. This diversity ensures a comprehensive approach to ethical considerations in AI development deployment. There are three primary missions of The Board: (a) to provide guidance and decision-making support as IBM develops, deploys, and uses AI and other technologies, (b) to ensure consistency with the company's values and (c) to advance trustworthy AI for IBM's clients, partners, and for society at large. The Board reviews Granite models, including proposed features and capabilities. The Board provides feedback and development considerations on these models, prior to model release, enabling the research and technical teams to incorporate this feedback into the development process and supporting IBM's commitment to responsible and ethical technology development. For more information on the AI Ethics Board and its role please refer to https://www.ibm.com/artificial-intelligence/ai-ethics.

References:

Not disclosed

Score justification:

IBM AI Ethics Board is oversight mechanism

Indicator notes:

We will award this point if the developer discloses that is has such an internal or external body and provides some description of its scope, or alternatively if the developer discloses that it has no such body. An oversight mechanism covers governance structure beyond mere external risk evaluation, asking whether a formal body regularly reviews design and deployment decisions. Core issues may include model objectives, data usage, or risk mitigation.

Example disclosure:

We convene a monthly advisory board of ethicists, civil society representatives, and academics to review training processes and identify new risks. The board's recommendations regarding deployment are not binding.

99. Whistleblower protection (Score: 1)

Does the developer disclose a whistleblower protection policy?

Disclosure:

IBM has a whistleblower protection policy and no retaliation policy as a part of IBM's Business Conduct Guidelines (BCG) sections 1.3 and 1.4 (https://www.ibm.com/downloads/documents/us-en/10a9980400afd10c). Allegations that may fall under the EU Whistleblowing Directive can be raised through Employee Concerns as well as other IBM Communication Channels.

References:

Not disclosed

Score justification:

Whistleblower protection policy specified well

Indicator notes:

We will award this point if the developer discloses (i) the existence of a whistleblower protection policy, (ii) what protections are afforded to whistleblowers, (iii) how reports are handled and investigated, and (iv) any external oversight of the whistleblower protection process. This might include protections for whistleblowers who report safety, ethical, or legal concerns related to the model. We will also award this point if the developer discloses that it has no such policy.

Example disclosure:

We maintain a whistleblower protection policy that prohibits retaliation against employees who report safety or ethical concerns about our models. Reports can be submitted anonymously through our ethics hotline, are reviewed by an independent board committee, and whistleblowers are entitled to legal representation provided by the company. Our policy is audited annually by an independent ethics consultancy.

100. Government commitments (Score: 1)

What commitments has the developer made to government bodies?

Disclosure:

IBM has signed onto five comprehensive voluntary AI commitments: the Canada’s Code of Conduct on GenAI Systems, the G7 Hiroshima Code of Conduct, (building on the Bletchley Declaration), the White House AI Commitments, the Seoul Summit Frontier AI Safety Commitments, and the EU AI Pact. IBM has also signed onto two additional disinformation voluntary commitments: Munich Tech Accord and the Bavaria Disinformation Alliance.

References:

Not disclosed

Score justification:

IBM has signed onto 7 voluntary commitments

Indicator notes:

We will award this point if the company provides an exhaustive list of commitments it has made to government bodies in the jurisdictions where it offers its model.

Example disclosure:

We have committed to the White House Voluntary Committments and the Seoul Committments.