IBM Granite 4.0: Hyper-efficient, high-performance hybrid models for India
India is the most populous country in the world, home to nearly 1.4 billion people and one of the most linguistically diverse societies on the planet. The Indian Constitution recognizes 22 official languages, but the linguistic richness extends much further: More than 1,500 languages and dialects are spoken across different states, communities, and regions. These languages span multiple families — including Indo-Aryan, Dravidian, Tibeto-Burman, and Austroasiatic — and exhibit deeply varied linguistic structures.
For AI systems and large language models (LLMs), India presents both an extraordinary opportunity and a unique challenge.
Indic languages are particularly challenging for LLMs. They have rich morphology that generates many word forms from a single root, complex script systems with conjuncts and context-dependent rendering, and diverse orthographic norms that vary across regions and writing conventions. In addition, high-quality training data for most Indic languages is limited and often noisy compared to data for languages like English. Together, these factors mean that building for Indic languages requires specialized tokenization, modeling techniques, and curated datasets to achieve strong performance across the linguistic diversity of the Indian subcontinent.
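As a small illustration of why Indic scripts complicate tokenization, the sketch below (plain Python, no model involved) decomposes a single Devanagari word into its underlying code points. Conjunct consonants are stored with explicit virama (halant) marks, so one rendered syllable can span several code points that a naive character- or byte-level tokenizer would split apart:

```python
import unicodedata

# The Hindi word "क्षेत्र" (kshetra, "region") renders as roughly three
# glyph clusters, yet it is stored as seven Unicode code points.
word = "क्षेत्र"

print(len(word))  # 7 code points for ~3 rendered clusters
for ch in word:
    # Each VIRAMA (halant) fuses the consonants around it into a conjunct.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Rich morphology compounds the effect: each root yields many inflected forms, so subword vocabularies trained mostly on English tend to fragment Indic text into disproportionately long token sequences, which is one reason specialized tokenization and curated Indic data matter here.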
The Granite 4.0 language models
Granite 4.0 features a new hybrid Mamba/transformer architecture that greatly reduces memory requirements without sacrificing performance. The models can be run on significantly cheaper GPUs, dramatically lowering costs compared to conventional LLMs while also achieving much faster inference speeds at scale.
The new Granite 4.0 models, open sourced under a standard Apache 2.0 license, are the world's first open models to receive ISO 42001 certification. They are also cryptographically signed, confirming their adherence to internationally recognized best practices for security, governance, and transparency.
The launch of Granite 4.0 kicked off a new era for IBM's family of enterprise-ready large language models, doubling down on small, efficient language models that provide competitive performance at reduced cost and latency.
For the Indian subcontinent, Granite 4.0 also delivers breakthrough capabilities in Indian languages within each model size category. The models achieve strong performance across benchmarks of Indian-language knowledge and skills, at reduced cost and latency compared to earlier models.
Pre-training and post-training Indic data
The Granite 4.0 models have been trained on approximately 100 billion tokens of Indian-language data during pre-training and around 1.5 million post-training instances. The pre-training corpus was sourced entirely from publicly available Indian-language datasets. For post-training, we used a combination of English supervised fine-tuning (SFT) datasets translated into major Indian languages, as well as synthetically generated multi-turn conversations. Both the pre-training and post-training datasets underwent rigorous filtering to ensure that only high-quality, reliable examples were included.
Granite 4.0 Indic performance
We compared the performance of Granite 4.0 models against other multilingual models, such as the Llama and Gemma series of LLMs, on both general reasoning and Indian culture-specific datasets.
The Granite 4.0 models consistently rank among the top performers in both the small (<7B parameters) and large (>7B parameters) model categories. While the Sarvam-m model surpasses Granite-4.0-h-small, it does so at a substantially higher computational cost due to its dense architecture. In the small-model group, the Granite-4.0-h-tiny model achieves the highest overall score, a performance advantage driven by its Mixture-of-Experts (MoE) architecture, while the dense Granite-4.0-micro model also remains highly competitive.
In the large-model category, Granite-4.0-h-small (30B) leads the benchmarks, outperforming all non-Granite alternatives except Sarvam-m. Overall, these results demonstrate that Granite 4.0 models set a new standard of excellence through their effective use of both dense and highly efficient MoE designs.
Detailed results
Granite 4.0’s development incorporated Indian languages throughout both the pre-training and SFT stages. While the model has benefited from extensive alignment and instruction tuning in English, the post-training process has not yet been fully optimized for the linguistic diversity of Indian languages. As future work, we plan to strengthen Granite 4.0’s post-training pipeline by more deeply integrating Indic languages, expanding coverage across dialects, and refining instruction-following, reasoning, and conversational capabilities in these languages. This effort will ensure that Granite 4.0 becomes more culturally and linguistically robust for India’s diverse language landscape.
You can get started with Granite on multiple platforms through our partners, as well as try it out now on the Granite Playground and watsonx.ai.