Phi-3-mini: Microsoft’s Smallest AI Model: What are Large Language Models (LLMs) and Small Language Models (SLMs)?


April 29, 2024


Why in News? Microsoft claims that its latest small language model has outperformed several AI models of comparable and even larger size. Microsoft also said that India’s ITC has leveraged the new Phi-3-mini.

A few days after Meta unveiled its Llama 3 Large Language Model (LLM), Microsoft on Tuesday (April 23) unveiled the latest version of its ‘lightweight’ AI model – the Phi-3-Mini. Microsoft has described the Phi-3 as a family of open AI models that are the most capable and cost-effective small language models (SLMs) available.

What is Phi-3-mini?

Phi-3-Mini is believed to be the first of three small models that Microsoft is planning to release. It has reportedly outperformed models of the same size and the next size up across a variety of benchmarks, in areas like language, reasoning, coding, and maths.

What are LLM & SLM?

  • Essentially, language models are the backbone of AI applications like ChatGPT, Claude, Gemini, etc. These models are trained on existing data to solve common language problems such as text classification, answering questions, text generation, document summarisation, etc.

The ‘Large’ in LLMs refers to two things: the enormous size of the training data, and the parameter count. In the field of Machine Learning, where machines learn from data without being explicitly programmed, parameters are the memories and knowledge that a machine acquires during model training. They define the skill of the model in solving a specific problem.
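As a rough illustration (not from the article), a parameter is simply a number the model learns. For a single fully connected neural-network layer, the parameter count is just its weights plus its biases, so even a toy model's total adds up quickly:

```python
# Illustration: counting learned parameters in a toy language-model layer.
# Every weight and bias is one "parameter" that training adjusts.

def linear_layer_params(in_features: int, out_features: int) -> int:
    """One weight per input-output pair, plus one bias per output."""
    return in_features * out_features + out_features

# Hypothetical toy sizes (chosen only for illustration):
# embedding width 512, hidden width 2048, vocabulary of 32,000 tokens.
embed_to_hidden = linear_layer_params(512, 2048)
hidden_to_vocab = linear_layer_params(2048, 32000)
total = embed_to_hidden + hidden_to_vocab
print(total)  # ~66.6 million parameters for just two layers
```

Real LLMs stack many such layers, which is how counts climb into the billions.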

A large language model (LLM) is a type of artificial intelligence (AI) program that can recognize and generate text, among other tasks. LLMs are trained on huge sets of data — hence the name “large.” LLMs are built on machine learning: specifically, a type of neural network called a transformer model.

  • In simpler terms, an LLM is a computer program that has been fed enough examples to be able to recognize and interpret human language or other types of complex data. Many LLMs are trained on data that has been gathered from the Internet — thousands or millions of gigabytes’ worth of text. But the quality of the samples impacts how well LLMs will learn natural language, so an LLM’s programmers may use a more curated data set.

What are Small Language Models?

  • Small language models are all about challenging the notion that bigger is always better in natural language processing.
  • While models like GPT-4 or Gemini Advanced boast hundreds of billions of parameters (variables that a model learns during training), SLMs range from ‘only’ a few million to a few billion parameters.
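A back-of-the-envelope sketch shows why this gap matters in practice. The figures below (a 3.8-billion-parameter small model versus a 175-billion-parameter large one, stored at 16-bit precision) are illustrative assumptions, not numbers from the article:

```python
# Rough memory footprint of a model's parameters alone.
# At 16-bit floating point, each parameter occupies 2 bytes.

def model_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Gigabytes needed just to hold the parameters in memory."""
    return num_params * bytes_per_param / 1e9

slm = model_memory_gb(3.8e9)   # an SLM of a few billion parameters
llm = model_memory_gb(175e9)   # an LLM of hundreds of billions
print(round(slm, 1), round(llm, 1))  # 7.6 vs 350.0 GB
```

A few gigabytes fits on a laptop or phone; hundreds of gigabytes demands data-centre hardware, which is the practical case for SLMs.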



