IBM Unveils Granite 4: Efficient Open-Source Language Models
IBM has unveiled Granite 4, the latest generation of its open-source language models. The models promise improved efficiency over their predecessors thanks to a hybrid of two neural network architectures: Transformer and Mamba.
The Granite 4 series spans roughly 3 to 32 billion parameters, catering to a range of computational budgets. The largest model, Granite-4.0-H-Small, has 32 billion parameters and employs a mixture-of-experts design, activating only a subset of those parameters for each token. Notably, Granite-4.0-Micro relies on the Transformer architecture alone.
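To illustrate the mixture-of-experts idea in general terms, here is a minimal sketch of a generic top-k routing layer; the expert count, dimensions, and top_k value are illustrative assumptions, not details of IBM's implementation:

```python
# Generic top-k mixture-of-experts layer (illustrative sketch; the
# expert count, dimensions, and top_k are assumptions, not Granite 4's
# actual configuration).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim=512, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])
        self.router = nn.Linear(dim, num_experts)  # scores each token per expert
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        scores = self.router(x)                         # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # route each token to its top_k experts
        weights = F.softmax(weights, dim=-1)            # normalize the selected scores
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e                   # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(16, 512)
print(TopKMoE()(tokens).shape)  # torch.Size([16, 512]); only 2 of 8 experts run per token
```

The payoff of this design is that total parameter count can grow without a proportional increase in per-token compute, since only the routed experts execute.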
The Mamba architecture, newly integrated into Granite 4, significantly reduces memory usage, particularly for long input prompts: unlike a Transformer's attention cache, which grows with context length, a Mamba layer maintains a fixed-size recurrent state no matter how long the input is. Interleaving these hardware-efficient layers with Transformer layers balances efficiency against performance. IBM attributes the improvement not only to the new architecture but also to enhanced training methods and a larger training set.
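A back-of-the-envelope comparison makes the memory argument concrete. The sketch below contrasts the linearly growing key-value cache of standard attention with the fixed-size recurrent state of a Mamba-style state-space layer; all layer counts and dimensions are illustrative assumptions, not Granite 4's actual configuration:

```python
# Back-of-the-envelope memory comparison (all sizes are illustrative
# assumptions, not Granite 4's actual dimensions).

def kv_cache_bytes(context_len, layers=32, heads=32, head_dim=128, dtype_bytes=2):
    # Transformer attention caches a key and a value vector per token,
    # per head, per layer: memory grows linearly with context length.
    return context_len * layers * heads * head_dim * 2 * dtype_bytes

def ssm_state_bytes(layers=32, channels=4096, state_dim=16, dtype_bytes=2):
    # A Mamba-style state-space layer keeps one fixed-size recurrent
    # state per layer: memory is constant regardless of context length.
    return layers * channels * state_dim * dtype_bytes

for n in (1_000, 32_000, 128_000):
    print(f"{n:>7} tokens: KV cache {kv_cache_bytes(n) / 1e9:6.2f} GB, "
          f"SSM state {ssm_state_bytes() / 1e9:6.3f} GB")
```

In this toy configuration the attention cache reaches tens of gigabytes at a 128,000-token context, while the recurrent state stays around four megabytes, which is the intuition behind the long-prompt savings.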
IBM plans to make the Granite 4 models available through popular platforms such as Amazon SageMaker JumpStart and Microsoft Azure AI, a move intended to ease adoption across a wide range of applications. The IBM development team expresses confidence in the models' performance, crediting both the architectural advances and the improved training regimen.
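As a minimal usage sketch, assuming the checkpoints also appear on the Hugging Face Hub (the repository id below is a hypothetical placeholder, not confirmed by this article), loading one of the smaller models with the standard transformers API could look like this:

```python
# Minimal generation sketch using the Hugging Face transformers API.
# The repository id is an assumption for illustration; check the
# hosting platform's catalog for the actual model names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-micro"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Summarize the benefits of hybrid Mamba/Transformer models.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```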