February 25, 2024

The BharatGPT group has introduced ‘Hanooman,’ a series of large language models (LLMs) that can respond in 11 Indian languages, including Hindi, Tamil, and Marathi, with more languages planned. The initiative, announced on Tuesday, is led by IIT Bombay in collaboration with seven other Indian engineering institutes and is backed by Reliance Industries Ltd and the Department of Science and Technology.

Understanding Hanooman:

  • Hanooman is intended as a versatile tool for domains such as healthcare, governance, financial services, and education. Unlike traditional chatbots, Hanooman is a multimodal AI system capable of generating text, speech, videos, and more across multiple Indian languages.
  • One of its customized versions, VizzhyGPT, is tailored specifically for healthcare applications and draws on extensive medical data.
  • The Hanooman models vary widely in size, from 1.5 billion to 40 billion parameters.

Challenges and Solutions:

  • During the unveiling ceremony, Vishnu Vardhan, the Founder of Seetha Mahalaxmi Healthcare (SML), underscored the challenges posed by the quality of datasets in Indian languages. He pointed to the prevalence of synthetic datasets derived from translations, which can introduce inaccuracies or distortions. Improving dataset quality is therefore central to the reliability of AI models like Hanooman.

Other Endeavors in Indian Language Models:

  • While BharatGPT leads the development of Hanooman, other startups, including Sarvam and Krutrim, supported by prominent VC investors like Lightspeed Venture Partners and Vinod Khosla’s fund, are also building AI models tailored for the Indian market. This points to growing interest and investment in using AI to address India's linguistic and cultural diversity.

Understanding Large Language Models (LLMs):

  • Large language models, such as Hanooman, use deep learning techniques to process large volumes of text, which lets them learn the structure and meaning of language. LLMs in general are commonly trained on vast datasets such as Wikipedia, OpenWebText, and the Common Crawl Corpus. Through this training they learn semantic nuances and relationships within language, which underpins natural language processing and generation.

In conclusion, the unveiling of Hanooman marks a significant milestone for AI technology tailored to the Indian linguistic landscape. With its multimodal capabilities and potential applications across sectors, Hanooman reflects the growing link between AI innovation and the preservation and accessibility of indigenous languages.

