
Scaling to meet the future of India’s AI needs

The generative AI explosion has been a revolution for countless industries around the world. But many of the influential models released to date are built for English-speaking users and support few other languages. For the 1.5 billion people who speak Indic languages originating from the Indian subcontinent, these models fall short.

That’s something that BharatGen, an initiative funded by the Indian Department of Science and Technology, has looked to change. Since its inception last year, the team, housed at IIT Bombay, has been working to build models that can support India’s national and commercial needs. To date, they’ve built initial models for the 14 most popular Indic languages and plan to go well beyond the 22 scheduled languages of India. The goal is to bring the AI revolution to the underserved linguistic and cultural diversity of India, and IBM will be helping BharatGen do just that.

Today, IBM and BharatGen announced a new partnership to help drive the adoption of AI in India. Using the multimodal AI and large language models that BharatGen is working on, the two will work together to develop and scale the unique tools needed to serve Indian language speakers.  

IBM has a long history of turning cutting-edge AI concepts into enterprise-ready AI solutions robust enough to be trusted by some of the largest organizations in the world. The company’s expertise with data for AI training, model governance, and training technologies will be key to helping BharatGen achieve its goal of creating and deploying efficient AI models for Indian languages. The partnership grew out of an early encounter, when some of the founding members of BharatGen worked with IBM at an AI Alliance event at IBM Research India. IIT Bombay was a founding member of the AI Alliance, and a team from IBM Research India was presenting at the event, showing how IBM’s InstructLab tool could be used to fine-tune small models for Indic languages.

Standing from left to right: Ramesh Karwani, Head Technology Policy, Global Regulatory Affairs, IBM India; Jaikrishnan Hari, Strategy and Business Development, IBM Research India; Dr. Amith Singhee, Director, IBM Research India; Sandip Patel, Managing Director, IBM India and South Asia; Shri Abhay Karandikar, Secretary, Department of Science and Technology (DST), Government of India; Prof Ganesh Ramakrishnan, Principal Investigator, BharatGen; Dr. Ekta Kapoor, Scientist-G and Head, Frontier and Futuristic Technologies (FFT) Division, DST; Prof Aditya Maheshwari, IIM Indore and Consortium Member, BharatGen; and other officials from DST.

That presentation was a first step that eventually led to a joint demonstration of the idea on India’s Bharat Nyaya Samhita, and now to today’s announcement. The BharatGen team will be looking to the IBM Research team based in India to collaborate on model technologies like these, as well as on data preparation and on scaling data prep work for complex, governed pipelines. “They want to create solution templates that integrate their sovereign models, which we will together integrate with IBM's platforms and software to create scalable AI pipelines that serve the needs of the country,” said Amith Singhee, the director of IBM Research India and the CTO of IBM India and South Asia.

The collaboration between the two organizations will focus on expanding BharatGen’s applications across a range of industries, starting with education, agriculture, banking, healthcare, and citizen services. They’ll also integrate with IBM’s growing family of Granite models, and build use case templates for those industries with IBM watsonx and Red Hat OpenShift AI. The two will work together to build data and AI pipelines on open-source technologies that have been enhanced to work reliably for Indic languages, and to build a governance framework, using IBM’s deep research and development into governance to guide the process. They’ll also work together to build benchmarks specifically tailored for India and Indic languages, to ensure the work they’re doing meets “India’s national and commercial needs,” Singhee said.

The languages and tools that BharatGen has built so far are just the beginning. There are 22 scheduled languages, but over 120 recognized languages are spoken in India, along with hundreds of dialects. BharatGen’s mandate is to build AI tools that serve the entire country, “ensuring broader digital participation and equity,” Singhee said, which means there is still much to do to meet people where they are.

“This collaboration aligns with BharatGen’s vision of enabling scalable and inclusive AI innovation for India,” said Ganesh Ramakrishnan, a professor at IIT Bombay and the principal investigator at BharatGen. “With support from IBM, we will accelerate the deployment of BharatGen’s foundational Indic models, development of high-performance solutions and strengthen our open research ecosystem for AI.” 
