
IBM Unveils Granite 4: Efficient Open-Source Language Models

IBM's Granite 4 models promise greater efficiency with a mix of Transformer and Mamba architectures. They're coming to popular AI platforms soon.

IBM has unveiled Granite 4, the latest generation of its open-source language models. The models promise improved efficiency over their predecessors thanks to a hybrid of two neural network architectures: Transformer and Mamba.

The Granite 4 series spans roughly 3 to 32 billion parameters, catering to a range of computational budgets. The largest model, Granite-4.0-H-Small, has 32 billion parameters and uses a mixture-of-experts design; Granite-4.0-Micro, by contrast, is built on the Transformer architecture alone.
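For readers who want to try one of the models, a minimal sketch of loading it with the Hugging Face transformers library might look like the following. The repository identifier below is an assumption based on IBM's naming and is not confirmed by this article; check the ibm-granite organization on Hugging Face for the actual ids.

```python
# Minimal sketch: load a Granite 4 model for text generation.
# "ibm-granite/granite-4.0-micro" is an ASSUMED repository id, inferred
# from the model names in the article; verify it before use.
# Requires: pip install transformers accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-micro"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain the difference between Transformer and Mamba layers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```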

The Mamba architecture integrated into Granite 4 is a state-space model design that markedly reduces memory usage, especially on long input prompts: instead of a key-value cache that grows with the sequence, a Mamba layer carries a fixed-size recurrent state. Pairing this hardware-efficient structure with Transformer layers balances efficiency against output quality. IBM attributes the improvements not just to the new architecture but also to enhanced training methods and a larger training set.
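To see why a fixed-size state matters for long prompts, here is a back-of-envelope sketch comparing a Transformer's growing key-value cache with a Mamba layer's constant recurrent state. The dimensions are illustrative placeholders, not Granite's actual configuration.

```python
# Back-of-envelope memory comparison with ILLUSTRATIVE dimensions
# (not Granite-specific): a Transformer's KV cache grows linearly
# with prompt length, while a Mamba layer keeps a fixed-size state.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_elem=2):
    # Keys and values: 2 tensors per layer, each [seq_len, n_kv_heads, head_dim].
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_elem

def mamba_state_bytes(n_layers=32, d_inner=4096, d_state=16, bytes_per_elem=2):
    # One fixed [d_inner, d_state] SSM state per layer, independent of seq_len.
    return n_layers * d_inner * d_state * bytes_per_elem

for seq_len in (1_000, 32_000, 128_000):
    kv = kv_cache_bytes(seq_len) / 2**20   # MiB
    ssm = mamba_state_bytes() / 2**20      # MiB
    print(f"{seq_len:>7} tokens: KV cache ~ {kv:8.1f} MiB, "
          f"Mamba state ~ {ssm:5.1f} MiB")
```

Under these assumed dimensions the cache grows from roughly 125 MiB at 1,000 tokens to about 16 GiB at 128,000 tokens, while the Mamba state stays at about 4 MiB, which is the efficiency argument behind the hybrid design.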

IBM plans to make the Granite 4 models accessible through popular platforms such as Amazon SageMaker JumpStart and Microsoft Azure AI, a move aimed at easing their adoption across a wide range of applications. The development team, whose members IBM has not named, expresses confidence in the models' performance, crediting both the architectural advances and the improved training regimen.
