Naming Conventions of LLM Models
Introduction
When we see LLM names like GPT-4o, Claude 3 Sonnet, or LLaMA-2-7B-chat, we may wonder why companies give their models such odd names. But these names carry a lot of meaning: they tell us about the model's family, generation, size, and intended use.
- Common Patterns:
Suffix → Meaning
Turbo → Optimised for speed + cost
Mini → Smaller + cheaper
Pro → High capability
Flash → Ultra-fast
Instruct → Fine-tuned to follow instructions
Chat → Optimised for conversations
rlhf → Trained with human feedback
- Size Hierarchy:
xxl > xl > large > base > small
- Size Indicators:
7B, 13B, 70B → Number of parameters (B = billion)
- Versioning:
v0.1, v1, v2 → Iteration of fine-tuning
There are mainly two types of LLMs: paid models and open-source models. Let's look at the naming convention of each, one by one.
- Paid Models
Paid models are mainly aimed at businesses and consumers, so their naming conventions focus on simplicity, branding, and positioning.
General Pattern in paid models:
[Model Family] + [Version] + [Variant / Capability Tier]
Let's see some examples:
Example 1: GPT-4o
Breakdown:
GPT → Model family
4 → Generation (improvement over GPT-3.5)
o (omni) → Multimodal capability
Meaning:
A 4th-gen model capable of handling text, images, audio, etc.
Example 2: Gemini 1.5 Pro
Breakdown:
Gemini → Model family
1.5 → Incremental upgrade
Pro → High capability
Other variants:
Flash → Faster, cheaper
Ultra → Most powerful
Paid model names are designed to be easy for non-technical users to understand, and to support marketing tiers and product differentiation.
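
The general pattern for paid models can be sketched as a small parser. This is only an illustration: `PaidModelName` and `parse_paid_name` are hypothetical names, and the regular expression assumes the family comes first, followed by a numeric version and then an optional variant. Names that do not fit that shape return `None`.

```python
import re
from typing import NamedTuple, Optional


class PaidModelName(NamedTuple):
    family: str
    version: str
    variant: str


# Assumed shape: letters (family), a number such as 4 or 1.5 (version),
# then whatever remains (variant / capability tier).
_PAID_PATTERN = re.compile(
    r"^(?P<family>[A-Za-z]+)[\s-]*(?P<version>\d+(?:\.\d+)?)[\s-]*(?P<variant>.*)$"
)


def parse_paid_name(name: str) -> Optional[PaidModelName]:
    """Split a name like 'Gemini 1.5 Pro' into family, version, and variant."""
    match = _PAID_PATTERN.match(name.strip())
    if match is None:
        return None
    return PaidModelName(
        family=match.group("family"),
        version=match.group("version"),
        variant=match.group("variant").strip(),
    )


for example in ("GPT-4o", "Gemini 1.5 Pro", "Claude 3 Sonnet"):
    print(example, "->", parse_paid_name(example))
```

Because vendors are free to change their naming at any time, a parser like this is best treated as a convenience for reading names, not as a reliable source of model metadata.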
- Open-Source Models
Open-source model naming is more technical and architecture-oriented.
General Pattern in open source models:
[organization]/[model-family]-[version]-[size]-[variant]-[format]
Not every name includes every part, and the order of the parts can vary, as the examples show.
Example 1: meta-llama/Llama-2-7b-chat-hf
Breakdown:
meta-llama → Organization
Llama-2 → Model family + version
7b → 7 billion parameters
chat → Fine-tuned for conversation
hf → Hugging Face format
Meaning:
A 7B-parameter, chat-optimized Llama 2 model
Example 2: mistralai/Mistral-7B-Instruct-v0.1
Breakdown:
mistralai → Organization
Mistral-7B → Base model (7 billion parameters)
Instruct → Instruction-following
v0.1 → Version of fine-tuning
Meaning:
Instruction-tuned version of Mistral 7B (early release)
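
Since the two examples put their parts in different orders, a parser for open-source names works better if it classifies each token rather than relying on position. The sketch below is a rough illustration under that assumption: `parse_repo_id`, `VARIANTS`, and `FORMATS` are hypothetical names, and the vocabularies contain only the terms used in this article.

```python
import re

# Assumed vocabularies; only the terms mentioned in this article are listed.
VARIANTS = {"chat", "instruct"}
FORMATS = {"hf"}


def parse_repo_id(repo_id: str) -> dict:
    """Classify the parts of a name like 'meta-llama/Llama-2-7b-chat-hf'."""
    if "/" in repo_id:
        organization, name = repo_id.split("/", 1)
    else:
        organization, name = None, repo_id

    parsed = {
        "organization": organization,
        "family": None,
        "version": None,
        "size": None,
        "variant": None,
        "format": None,
    }
    for token in name.split("-"):
        lower = token.lower()
        if re.fullmatch(r"\d+(\.\d+)?b", lower):      # e.g. 7b, 13B, 70b
            parsed["size"] = token
        elif re.fullmatch(r"v?\d+(\.\d+)*", lower):   # e.g. 2, v1, v0.1
            parsed["version"] = token
        elif lower in VARIANTS:
            parsed["variant"] = token
        elif lower in FORMATS:
            parsed["format"] = token
        elif parsed["family"] is None:                # first unclassified token
            parsed["family"] = token
        # Any further unrecognised tokens are ignored in this sketch.
    return parsed


print(parse_repo_id("meta-llama/Llama-2-7b-chat-hf"))
print(parse_repo_id("mistralai/Mistral-7B-Instruct-v0.1"))
```

Both examples parse correctly even though the version sits in a different position in each. Fields that a name does not mention, such as the format in the Mistral example, are left as `None`.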
Final Thoughts:
– Paid models are named like products
– Open-source models are named like engineering artifacts
Understanding this difference helps us select a suitable model for our specific requirements.
