IBM Granite

Granite
Developer(s): IBM Research [1]
Initial release: November 7, 2023 (2023-11-07)
Platform: IBM Watsonx
License: Proprietary

IBM Granite is a series of foundation models created by IBM for use in Watsonx, a cloud-based data and generative AI platform. It was announced on September 7, 2023, [2] [3] and an initial paper was published four days later. [4] Granite models are trained on datasets curated from the Internet, academic publications, code datasets, and legal and finance documents. [5] [1]


Foundation models

A foundation model is an AI model trained on broad data at scale such that it can be adapted to a wide range of downstream tasks. [6]

As of February 2024, Granite's foundation models include Granite.13b.instruct and Granite.13b.chat. The "13b" in their names refers to their 13 billion parameters, fewer than in most of the larger models of the time. [2]
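
The parameter count also sets a rough floor on memory: at 16-bit precision each parameter occupies 2 bytes, the same convention behind the 350 GB storage figure quoted for GPT-3 later in this article. A minimal back-of-the-envelope sketch in Python (illustrative only; serving a model needs additional memory for activations and caches):

    # Memory for model weights alone, at 2 bytes per parameter
    # (float16); decimal gigabytes, matching the convention behind
    # the 350 GB figure quoted for GPT-3 below.
    def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
        return n_params * bytes_per_param / 1e9

    for name, n in [("Granite 13b", 13e9), ("GPT-3", 175e9)]:
        print(f"{name}: ~{weight_memory_gb(n):.0f} GB of weights")
    # Granite 13b: ~26 GB of weights
    # GPT-3: ~350 GB of weights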

Related Research Articles

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Recently, generative artificial neural networks have been able to surpass many previous approaches in performance.

WikiArt: User-generated website displaying artworks

WikiArt is a visual art wiki, active since 2010.

Amazon SageMaker is a cloud-based machine-learning platform that allows developers to create, train, and deploy machine-learning (ML) models in the cloud. It can also be used to deploy ML models on embedded systems and edge devices. SageMaker was launched in November 2017.

Merative: U.S. healthcare company

Merative L.P., formerly IBM Watson Health, is an American medical technology company that provides products and services that help clients facilitate medical research, clinical research, real-world evidence, and healthcare services through the use of artificial intelligence, data analytics, cloud computing, and other advanced information technology. Merative is owned by Francisco Partners, an American private equity firm headquartered in San Francisco, California. In 2022, IBM divested its Watson Health division, spinning it off as Merative. As of 2023, it remains a standalone company headquartered in Ann Arbor, with innovation centers in Hyderabad, Bengaluru, and Chennai.

ModelOps

ModelOps, as defined by Gartner, "is focused primarily on the governance and lifecycle management of a wide range of operationalized artificial intelligence (AI) and decision models, including machine learning, knowledge graphs, rules, optimization, linguistic and agent-based models". "ModelOps lies at the heart of any enterprise AI strategy". It orchestrates the lifecycles of all models in production across the entire enterprise, from putting a model into production to evaluating and updating the resulting application according to a set of governance rules, including both technical and business KPIs. It grants business domain experts the capability to evaluate AI models in production, independent of data scientists.

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer deep neural network, which replaces recurrence- and convolution-based architectures with a technique known as "attention". This attention mechanism allows the model to selectively focus on the segments of input text it predicts to be most relevant. GPT-3 uses a 2,048-token context window and float16 (16-bit) precision, and has a then-unprecedented 175 billion parameters, requiring 350 GB of storage since each parameter occupies 2 bytes. It has demonstrated strong "zero-shot" and "few-shot" learning abilities on many tasks.
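
The "attention" described above can be written compactly: each output position is a weighted average of value vectors, with weights given by a softmax over query-key similarities. A minimal single-head NumPy sketch (illustrative shapes and data; real GPT models add learned projections, multiple heads, and causal masking):

    import numpy as np

    def attention(Q, K, V):
        """softmax(Q K^T / sqrt(d)) V, normalized row-wise over the keys."""
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)                 # query-key similarities
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V                            # weighted mix of values

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))      # 4 tokens, 8-dimensional embeddings
    print(attention(x, x, x).shape)  # (4, 8)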

Cerebras: American semiconductor company

Cerebras Systems Inc. is an American artificial intelligence company with offices in Sunnyvale, San Diego, Toronto, Tokyo, and Bangalore, India. Cerebras builds computer systems for complex artificial intelligence deep-learning applications.

GPT-2: 2019 text-generating language model

Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained on WebText, a dataset of 8 million web pages. It was partially released in February 2019, followed by the full release of the 1.5-billion-parameter model on November 5, 2019.

DALL-E: Image-generating deep-learning model

DALL·E, DALL·E 2, and DALL·E 3 are text-to-image models developed by OpenAI using deep learning methodologies to generate digital images from natural language descriptions, called "prompts."

Deborah Raji: Nigerian-Canadian computer scientist and activist

Inioluwa Deborah Raji is a Nigerian-Canadian computer scientist and activist who works on algorithmic bias, AI accountability, and algorithmic auditing. Raji has previously worked with Joy Buolamwini, Timnit Gebru, and the Algorithmic Justice League on researching gender and racial bias in facial recognition technology. She has also worked with Google’s Ethical AI team and been a research fellow at the Partnership on AI and AI Now Institute at New York University working on how to operationalize ethical considerations in machine learning engineering practice. A current Mozilla fellow, she has been recognized by MIT Technology Review and Forbes as one of the world's top young innovators.

A foundation model is a machine learning model that is trained on broad data such that it can be applied across a wide range of use cases. Foundation models have transformed artificial intelligence (AI), powering prominent generative AI applications like ChatGPT. The Stanford Institute for Human-Centered Artificial Intelligence's Center for Research on Foundation Models created and popularized the term.

Hugging Face, Inc. is a French-American company based in New York City that develops computer tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets and showcase their work.
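
A minimal sketch of how the transformers library is typically used: load a checkpoint from the Hub and generate text. The small gpt2 checkpoint stands in here as an illustrative, publicly hosted model:

    # pip install transformers torch
    from transformers import pipeline

    # Downloads the model from the Hugging Face Hub on first use.
    generator = pipeline("text-generation", model="gpt2")
    out = generator("Foundation models are", max_new_tokens=20)
    print(out[0]["generated_text"])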

Stable Diffusion: Image-generating machine learning model

Stable Diffusion is a deep-learning text-to-image model released in 2022, based on diffusion techniques. It is considered part of the ongoing AI spring.

Text-to-image model: Machine learning model

A text-to-image model is a machine learning model which takes an input natural language description and produces an image matching that description.
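
A minimal sketch of that text-in, image-out interface, using Hugging Face's diffusers library with one public Stable Diffusion checkpoint as an illustrative choice:

    # pip install diffusers transformers torch
    from diffusers import StableDiffusionPipeline

    # Downloads the weights from the Hugging Face Hub on first use.
    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    image = pipe("an astronaut riding a horse").images[0]  # prompt in, PIL image out
    image.save("astronaut.png")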

LAION: Non-profit German artificial intelligence organization

LAION is a German non-profit which makes open-source artificial intelligence models and datasets. It is best known for releasing a number of large datasets of images and captions scraped from the web, which have been used to train a number of high-profile text-to-image models, including Stable Diffusion and Imagen.

Generative pre-trained transformer: Type of large language model

Generative pre-trained transformers (GPT) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. They are artificial neural networks that are used in natural language processing tasks. GPTs are based on the transformer architecture, pre-trained on large data sets of unlabelled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs.

EleutherAI: Artificial intelligence research collective

EleutherAI is a grass-roots non-profit artificial intelligence (AI) research group. The group, considered an open-source version of OpenAI, was formed in a Discord server in July 2020 to organize a replication of GPT-3. In early 2023, it formally incorporated as the EleutherAI Foundation, a non-profit research institute.

The Pile is a diverse, open-source, 886.03 GB dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. It is composed of 22 smaller datasets, including 14 new ones.

LLaMA is a family of autoregressive large language models (LLMs), released by Meta AI starting in February 2023.

Watsonx is IBM's commercial, cloud-based generative AI and scientific data platform. It offers a studio, data store, and governance toolkit, and supports multiple large language models (LLMs), including IBM Granite.

References

  1. McDowell, Steve. "IBM's New Granite Foundation Models Enable Safe Enterprise AI". Forbes.
  2. Nirmal, Dinesh (September 7, 2023). "Building AI for business: IBM's Granite foundation models".
  3. "IBM debuts Granite series of hardware-efficient language models". September 7, 2023.
  4. https://www.ibm.com/downloads/cas/X9W4O6BM
  5. Moorhead, Patrick. "A Year In Review Of IBM's Ambitious AI Strategy". Forbes.
  6. "Introducing the Center for Research on Foundation Models (CRFM)". Stanford HAI. August 18, 2021.