The AI Family Tree — Every Major AI Model, Explained

Foundation — 2017

The Transformer

Vaswani et al., Google Brain — "Attention Is All You Need"
arXiv:1706.03762 · June 2017

Every major AI language model today is built on this architecture.

The major AI labs and their lineage

🤖

OpenAI

San Francisco · Founded 2015

2018–2020

GPT-1 / GPT-2 / GPT-3

The original GPT series. GPT-3 (175B params) demonstrated emergent capabilities that surprised researchers.

Nov 2022

ChatGPT

GPT-3.5 + RLHF fine-tuning. Reached 1M users in 5 days. The product that made AI mainstream.

2023–2024

GPT-4 / GPT-4o

Multimodal (text + image input). GPT-4o adds voice, vision, and real-time capabilities.

2024–2025

o1 / o3 (reasoning models)

Extended "thinking" before responding. Significantly better on maths, science, and logical reasoning.

Also: DALL-E · Sora · Whisper · Codex

ChatGPT full guide →

🧠

Anthropic

San Francisco · Founded 2021 (ex-OpenAI)

2022

Constitutional AI (CAI)

Training method where the model critiques its own outputs against a set of principles. Key Anthropic innovation.

2023

Claude 1 / Claude 2

First consumer releases. Claude 2 shipped with a 100k-token context window, far larger than competitors offered at the time.

2024

Claude 3 (Haiku / Sonnet / Opus)

Three-tier model family. At launch, Anthropic reported that Claude 3 Opus outperformed competing models on most common benchmarks.

2024–2025

Claude 3.5 / Claude 4

Claude 3.5 Sonnet became the default model. Claude 4 was released in 2025 with extended thinking and deeper reasoning.

Also: MCP (Model Context Protocol)

Claude full guide →

🌐

Google DeepMind

London / Mountain View · Founded 2010 (merged 2023)

2018–2022

BERT / T5 / PaLM

BERT (2018) transformed NLP. PaLM (2022) demonstrated few-shot learning at scale. Foundational research.

Dec 2023

Gemini 1.0

Google's first natively multimodal model. Trained on text, images, audio, and video simultaneously.

2024

Gemini 1.5 / 2.0

Gemini 1.5 Pro introduced a 1M-token context window. Gemini 2.0 added agentic capabilities and the Multimodal Live API.

Also: Gemini Advanced · Gemini in Workspace · Imagen · Veo

Gemini Advanced guide →

📱

Meta AI

Menlo Park · FAIR founded 2013

2023

LLaMA / Llama 2

Released as open weights (downloadable model files). Triggered the open-source AI ecosystem. LLaMA came in 7B–65B sizes, Llama 2 in 7B–70B.

April 2024

Llama 3

Significant quality jump. 8B and 70B open weights. The 405B version, released as Llama 3.1 in July 2024, was competitive with GPT-4.

2024–2025

Meta AI (consumer product)

Built into WhatsApp, Instagram, Facebook, Messenger. 3B+ potential users. Free.

Also: Emu (image gen) · SeamlessM4T (translation)

Meta AI full guide →

🇪🇺

Mistral AI

Paris · Founded April 2023 (ex-DeepMind, ex-Meta)

Sept 2023

Mistral 7B

Outperformed Llama 2 13B at 7B parameters using sliding window attention. Raised €105M before shipping a product.

2024–2025

Mixtral / Mistral Large / Le Chat

Mixture-of-Experts architecture (Mixtral 8x7B). Le Chat = consumer product. Mistral Large for enterprise API.

European open-source AI. Apache 2.0 weights for Mistral 7B and Mixtral.

Mistral guide →

𝕏

xAI / Grok

San Francisco · Founded 2023 (Elon Musk)

Nov 2023

Grok-1

Built into X (Twitter). Distinguishing feature: real-time access to X posts and current news. Trained on internet + X data.

2024–2025

Grok-2 / Grok-3

Grok-2 added image generation (Aurora). Grok-3 released 2025 — xAI claimed leading performance on reasoning benchmarks.

Available on X Premium / standalone app

Grok guide →

🎨

Black Forest Labs

Founded Aug 2024 (ex-Stability AI)

2022 (prior work)

Stable Diffusion (at CompVis/Stability AI)

The team that created Stable Diffusion — the foundational open-source image generation model (Rombach et al., arXiv:2112.10752).

Aug 2024

Flux.1 (Pro / Dev / Schnell)

12B-parameter flow-matching model. Black Forest Labs claimed state-of-the-art prompt adherence at launch. Schnell is Apache 2.0 open source.

Flux full guide →

🔍

Perplexity AI

San Francisco · Founded 2022

Model approach

Not a model company — a product company

Perplexity uses third-party models (GPT-4, Claude) alongside its own Sonar models, and adds real-time web search with source attribution. The product is the interface and retrieval layer.

Own model work

Sonar models

Perplexity's own models fine-tuned for search and citation tasks. Available via the Perplexity API.

Perplexity full guide →

Key concepts in the AI family tree

Transformer architecture

The neural network design introduced by Google in 2017 that all major LLMs are built on. The "attention mechanism" lets the model weigh the relevance of every token (a word or word fragment) to every other token in the input.
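To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not a production implementation: real transformers apply learned projections to produce the query, key, and value vectors and run many attention heads in parallel, whereas this sketch feeds the same made-up token vectors in for all three roles.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy 2-dimensional token vectors (made-up numbers, for illustration only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and says how much one token "attends" to each of the others. In this toy input the first token puts more weight on itself and the third token than on the second, because their vectors overlap with its own.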

RLHF — Reinforcement Learning from Human Feedback

Training method where human raters rank model outputs, teaching the model to produce responses humans prefer. Used by OpenAI (ChatGPT), Anthropic (Claude), and most consumer AI products.
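One common way to turn human rankings into a training signal is a pairwise preference loss on a reward model. The sketch below assumes a Bradley-Terry style loss and uses hand-picked reward scores; the function name and the numbers are illustrative, and it is not a description of any specific lab's pipeline.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the reward model scores the human-preferred
    response higher than the rejected one, and large when it does not.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) is algebraically equal to log(1 + exp(-m)).
    return math.log(1.0 + math.exp(-margin))

# Hypothetical reward scores for two responses to the same prompt.
agrees_with_rater = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_rater = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
```

Minimising this loss over many ranked pairs trains the reward model to agree with human raters. A reinforcement learning step then tunes the language model to produce responses that the reward model scores highly.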

Open weights vs open source

Open weights = model parameters are publicly downloadable (Llama, Flux Dev). Open source = code and weights released under permissive licence (Flux Schnell, Mistral). Closed = no access to model internals (GPT-4, Claude).

Mixture of Experts (MoE)

Architecture where multiple "expert" sub-networks exist, but only a subset activates for any given input. Allows much larger effective model capacity at lower inference cost. Used by Mixtral (Mistral) and GPT-4 (reportedly).
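The routing idea can be sketched in a few lines of Python. This is a toy example under stated assumptions: the "experts" are simple functions rather than neural sub-networks, the gate scores are hard-coded rather than produced by a learned router, and the top-2 choice is an illustrative setting, not a claim about how any particular model is configured.

```python
import math

def top_k_route(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights."""
    ranked = sorted(
        range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True
    )
    chosen = ranked[:k]
    # Softmax over the chosen experts only, so their weights sum to 1.
    exps = [math.exp(gate_scores[i]) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_layer(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, [i for i, _ in routes]

# Four toy "experts" (hypothetical stand-ins for expert sub-networks).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
gate_scores = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one input
output, active = moe_layer(3.0, experts, gate_scores)
```

Only two of the four experts run for this input, which is where the inference saving comes from: the model stores all the experts' parameters but pays the compute cost of just the selected ones.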

Context window

The maximum amount of text a model can process at once, measured in tokens (one token is roughly 3/4 of an English word). GPT-4: 128k. Claude: 200k. Gemini Advanced: 1M. A larger window means the model can handle longer documents and conversations.
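As a rough worked example, the 3/4-word rule of thumb can be turned into a quick estimate of whether a document fits in a given context window. This is a back-of-the-envelope sketch: real tokenizers differ by model and language, and the function names below are made up for illustration.

```python
TOKENS_PER_WORD = 4 / 3  # inverse of "one token is roughly 3/4 of a word"

def estimate_tokens(text):
    """Estimate the token count from a whitespace word count."""
    return round(len(text.split()) * TOKENS_PER_WORD)

def fits_in_context(text, context_window_tokens):
    """Check the estimate against a context window size in tokens."""
    return estimate_tokens(text) <= context_window_tokens

# A synthetic 150,000-word document, about the length of a long novel.
document = "word " * 150_000
estimated = estimate_tokens(document)
fits_128k = fits_in_context(document, 128_000)
fits_1m = fits_in_context(document, 1_000_000)
```

By this estimate the 150,000-word document comes to about 200,000 tokens, so it would not fit in a 128k window but would fit comfortably in a 1M window. For anything that matters, count tokens with the tokenizer of the model in question rather than relying on the rule of thumb.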

Multimodal AI

Models that understand multiple types of input — text, images, audio, video. GPT-4o, Gemini, and Claude 3+ are multimodal. Earlier LLMs were text-only.

The timeline — key moments

2017

Transformer architecture published

Vaswani et al., Google Brain. "Attention Is All You Need". The foundation of every major AI model that follows.

2018

BERT released by Google

Bidirectional transformer for language understanding. Transformed NLP benchmarks. The precursor to modern search and language AI.

2020

GPT-3 released by OpenAI

175 billion parameters. Demonstrated emergent capabilities — the model could do tasks it wasn't explicitly trained on. Sent shockwaves through AI research.

2021

Anthropic founded

Dario Amodei, Daniela Amodei, and colleagues leave OpenAI to found Anthropic, focused on AI safety research.

2022

Stable Diffusion released

Rombach et al. release open-source image generation. The first high-quality image model freely available to run on consumer hardware.

Nov 2022

ChatGPT launches

GPT-3.5 with RLHF. Reaches 1 million users in 5 days, 100 million in 2 months. AI goes mainstream.

Feb 2023

Llama released by Meta as open weights

Meta releases LLaMA's weights, initially to researchers under a non-commercial licence. With Llama 2 in July 2023, anyone can download and run a capable LLM. Triggers the open-source AI ecosystem.

Mar 2023

GPT-4 released

Multimodal, significantly more capable than GPT-3.5. OpenAI reported passing scores on a simulated bar exam and on medical licensing exam questions. Sets new benchmark standard.

Jun 2023

Mistral AI raises €105M seed round

Founded in April 2023 by ex-DeepMind and Meta researchers, Mistral raises €105M before shipping a model. European AI research powerhouse.

Sept 2023

Mistral 7B released

Outperforms Llama 2 13B at 7B parameters. Open source. Signals that parameter count isn't everything.

Dec 2023

Gemini 1.0 and Grok-1

Google releases its first natively multimodal model. Grok-1 from Elon Musk's xAI, announced in November, rolls out to subscribers inside X (Twitter).

2024

Claude 3, Llama 3, Gemini 1.5

Major capability improvements across all labs. Gemini 1.5 introduces 1M token context. Arms race accelerates.

Aug 2024

Flux.1 released

Black Forest Labs (ex-Stability AI team) releases Flux.1 — new standard for image generation quality and prompt adherence.

2025

Reasoning models, agentic AI

OpenAI o1/o3, Claude 4. Extended thinking becomes standard. Agentic AI frameworks (LangChain, CrewAI) reach mainstream adoption.

April 2026

AI embedded everywhere

Meta AI is built into apps used by 3B+ people (WhatsApp, Instagram, Facebook, Messenger). Copilot is integrated into Windows and Office. Gemini is integrated into Google Workspace. AI is infrastructure.
