IBM Granite 4.0: Hyper-efficient, high-performance hybrid models for India
India is the most populous country in the world, home to nearly 1.4 billion people and one of the most linguistically diverse societies on the planet. The Indian Constitution recognizes 22 official languages, but the linguistic richness extends much further: More than 1,500 languages and dialects are spoken across different states, communities, and regions. These languages span multiple families — including Indo-Aryan, Dravidian, Tibeto-Burman, and Austroasiatic — and exhibit deeply varied linguistic structures.
For AI systems and large language models (LLMs), India presents both an extraordinary opportunity and a unique challenge.
Indic languages are particularly challenging for LLMs. They have rich morphology that generates many word forms from a single root, complex script systems with conjuncts and context-dependent rendering, and diverse orthographic norms that vary across regions and writing conventions. In addition, high-quality training data for most Indic languages is limited and often noisy compared to data for languages like English. Together, these factors mean that building for Indic languages requires specialized tokenization, modeling techniques, and curated datasets to achieve strong performance across the linguistic diversity of the Indian subcontinent.
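As a small illustration of why Indic scripts complicate tokenization, the sketch below (plain Python, no model involved) decomposes a single Devanagari word into its underlying code points. Conjunct consonants are stored with explicit virama (halant) marks, so one rendered syllable can span several code points that a naive character- or byte-level tokenizer would split apart:

```python
import unicodedata

# The Hindi word "क्षेत्र" (kshetra, "region") renders as roughly three
# glyph clusters, yet it is stored as seven Unicode code points.
word = "क्षेत्र"

print(len(word))  # 7 code points for ~3 rendered clusters
for ch in word:
    # Each VIRAMA (halant) fuses the consonants around it into a conjunct.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Rich morphology compounds the effect: each root yields many inflected forms, so subword vocabularies trained mostly on English tend to fragment Indic text into disproportionately long token sequences, which is one reason specialized tokenization and curated Indic data matter here.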
The Granite 4.0 language models
Granite 4.0 features a new hybrid Mamba/transformer architecture that greatly reduces memory requirements without sacrificing performance. The models can be run on significantly cheaper GPUs, dramatically lowering costs compared to conventional LLMs while also achieving much faster inference speeds at scale.
The new Granite 4.0 models, open sourced under a standard Apache 2.0 license, are the world's first open models to receive ISO 42001 certification. They are also cryptographically signed, confirming their adherence to internationally recognized best practices for security, governance, and transparency.
The launch of Granite 4.0 kicked off a new era for IBM's family of enterprise-ready large language models, doubling down on small, efficient language models that provide competitive performance at reduced cost and latency.
For the Indian subcontinent, Granite 4.0 also delivers breakthrough capabilities in Indian languages within each model size category. The models achieve strong performance across benchmarks of Indian-language knowledge and skills, at reduced cost and latency compared to earlier models.
Pre-training and post-training Indic data
The Granite 4.0 models have been trained on approximately 100 billion tokens of Indian-language data during pre-training and around 1.5 million post-training instances. The pre-training corpus was sourced entirely from publicly available Indian-language datasets. For post-training, we used a combination of English supervised fine-tuning (SFT) datasets translated into major Indian languages, as well as synthetically generated multi-turn conversations. Both the pre-training and post-training datasets underwent rigorous filtering to ensure that only high-quality, reliable examples were included.
Granite 4.0 Indic performance
We compared the performance of Granite 4.0 models against other multilingual models, such as the Llama and Gemma series of LLMs, on both general reasoning and Indian culture-specific datasets.
The Granite 4.0 models consistently rank among the top performers in both the small (<7B parameters) and large (>7B parameters) model categories. While the Sarvam-m model surpasses Granite-4.0-h-small, it does so at a substantially higher computational cost due to its dense architecture. In the small-model group, the Granite-4.0-h-tiny model achieves the highest overall score, a performance advantage driven by its Mixture-of-Experts (MoE) architecture, while the dense Granite-4.0-micro model also remains highly competitive.
In the large-model category, Granite-4.0-h-small (30B) leads the benchmarks, outperforming all non-Granite alternatives except Sarvam-m. Overall, these results demonstrate that Granite 4.0 models set a new standard of excellence through their effective use of both dense and highly efficient MoE designs.
Detailed results
Granite 4.0’s development incorporated Indian languages throughout both the pre-training and SFT stages. While the model has benefited from extensive alignment and instruction tuning in English, the post-training process has not yet been fully optimized for the linguistic diversity of Indian languages. As future work, we plan to strengthen Granite 4.0’s post-training pipeline by more deeply integrating Indic languages, expanding coverage across dialects, and refining instruction-following, reasoning, and conversational capabilities in these languages. This effort will ensure that Granite 4.0 becomes more culturally and linguistically robust for India’s diverse language landscape.
You can get started with Granite on multiple platforms through our partners, as well as try it out now on the Granite Playground and watsonx.ai.