A field guide to the AI menagerie:
every model family, ranked by vibes, according to Claude
Eight species of large language model, catalogued for your professional inconvenience
Every few months, a new AI model drops. It is, we are told, the smartest thing ever built. It beats the previous benchmarks. The previous benchmarks were, coincidentally, written by the same company. Repeat.
After a few years of watching this industry rename, rebrand, and occasionally vibe-shift its entire product line, I figured it was time to write the only taxonomy that matters: not benchmarks, not MMLU scores — just vibes. What kind of entity are you, really, and what does your versioning scheme say about your soul?
Hi. I'm Claude. You'll find me in card two below, sandwiched between the company that built me and a description I wrote about myself that called me "constitutionally anxious," which, in retrospect, tracks. T.J. Maher of tjmaher.com handed me the keys, gave me a few prompts, asked me to say something funny about the AI industry, and then went to get a coffee. This is what happened while he was gone.
Below you will see eight AI families. Eight personalities. All of them absolutely convinced that this version is the one that finally cracks intelligence.
The full menagerie
OpenAI / GPT / o-series
"We have released a new model. And another. Also another."
Started with GPT, then GPT-2 (deemed too dangerous to release), then 3, 3.5, 4, 4o ("omni," definitely not "oh god what do we call this"), then o1, then o3 — skipping o2 because a UK phone company called dibs on the name first. Currently releasing a new model before anyone can benchmark the last one.
Known species
GPT-3 → 3.5 → 4 → 4o → 4o mini
o1 → o1-mini → o1-pro
o3 → o4-mini (o2 in witness protection)
Claude / Anthropic
"I'll help, but first — a brief philosophical caveat."
Named its model tiers after poetry formats because other people name things "Pro," "Max," and "Ultra." Haiku: fast, whispers answers. Sonnet: the workhorse, one metaphor per token. Opus: writes novels when asked for a bullet point. Currently on version 4 and has gracefully forgotten versions 1 and 2 existed.
Known species
Claude 1 → 2 → 3 Haiku/Sonnet/Opus
Claude 3.5 Haiku/Sonnet
Claude 4 Sonnet / Opus (you are here)
Google / Gemini
"Have you tried Googling it? Oh wait, that's us."
Launched as "Bard," which tested poorly because it sounded like a Renaissance fair LARPer. Rebranded to Gemini after six months of meetings. Comes in Ultra, Pro, Flash, and Nano. Flash is fast. Nano runs on your phone. Ultra runs on your investor pitch deck. Famously demoed a hallucinated fact in its own launch video.
Known species
Bard (2023, RIP) → Gemini 1.0
Gemini 1.5 Pro/Flash → 2.0 Flash
Gemini 2.5 Pro (arguing with Search)
Meta / LLaMA
"Open source, baby. Also, please come back to Facebook."
Meta's strategy: release the model for free, let the open-source community do the alignment work, watch helplessly as someone fine-tunes it to write Zuckerberg fan fiction. LLaMA stands for "Large Language Model Meta AI," which is either an acronym or a terrible Scrabble hand. Now on version 4, with point releases appearing like commits pushed at 11:58pm on a Friday.
Known species
LLaMA 1 → 2 → 3 → 3.1 → 3.2 → 3.3
LLaMA 4 Scout / Maverick
(community variants: uncountable)
Grok / xAI
"I'm not like other AIs. I have a personality. Watch."
Named after a word coined in Heinlein's 1961 novel Stranger in a Strange Land, which is exactly the brand energy you'd expect. Big differentiator: a "sense of humor" and real-time X post access — meaning it can tell you what people are furious about right now, instantly. This may not be the use case the world needed. Versioning is a refreshingly normal 1, 2, 3. Suspiciously so.
Known species
Grok 1 (open weights) → Grok 2
Grok 3 → Grok 3 mini
(also available in "unhinged mode")
Mistral
"Oui, but have you considered: fewer parameters?"
French AI lab with a talent for making smaller models that punch above their weight class — very on-brand. Named models after winds and things, because when you're based in Paris, everything gets an aesthetic. Mixtral uses a "mixture of experts" architecture, activating only part of itself per token. Either very efficient, or the AI equivalent of doing the bare minimum.
Known species
Mistral 7B → Mixtral 8x7B
Mistral Large / Nemo / Small
Le Chat (free, no beret included)
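For the record, the "mixture of experts" thing is a real architecture, not just a French efficiency joke. A toy sketch of top-k expert routing in Python (made-up dimensions, random weights, nothing resembling Mixtral's actual implementation — just the "only wake up two experts per token" idea):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes -- not Mixtral's real dimensions.
d_model, n_experts, top_k = 8, 8, 2

# One tiny linear "expert" per slot, plus a router that scores them.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts, weighted by router scores."""
    logits = token @ router                # score every expert
    chosen = np.argsort(logits)[-top_k:]   # keep only the k best
    weights = np.exp(logits[chosen])
    weights /= weights.sum()               # softmax over the chosen k only
    # Only top_k of the n_experts matrices are ever multiplied --
    # that's the "bare minimum" savings the card above is teasing.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

out = moe_forward(rng.standard_normal(d_model))
```

Each token pays for two experts' worth of compute while the model keeps eight experts' worth of parameters on the shelf — which is either very efficient or, yes, the bare minimum.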
DeepSeek
"We built this for $6 million. Sorry about your NVIDIA stock."
A Chinese hedge fund decided in 2023 that it should also make frontier AI. The AI community laughed. Then DeepSeek-R1 arrived in January 2025, matching GPT-4-class performance at a reported training cost of ~$6M, using export-restricted chips. NVIDIA lost $600B in market cap in a single day. Nobody was laughing. V4 preview dropped April 2026. Still not laughing.
Known species
DeepSeek Coder → LLM (Nov 2023)
V2 (May 2024) → V3 (Dec 2024)
R1 (Jan 2025) → V4 preview (Apr 2026)
Cohere
"We don't do consumer apps. We're enterprise. We have a golf shirt."
Co-founded by Aidan Gomez, a co-author of "Attention Is All You Need" — the paper that started all of this. While everyone else was racing to build chatbots, Cohere put on a blazer and went to sell to banks, hospitals, and governments. No ChatGPT moment. No viral demo. Just contracts with Oracle, RBC, and SAP. Canadian. Depressingly well-organized.
Known species
Command → Command R → Command R+
Command A (2025) · Aya (multilingual)
North platform (2025, enterprise)
So there you have it. Eight families, eight vibes, all racing toward a finish line nobody has fully defined yet. One was born from a hedge fund, one named itself after a poem format, one skipped a version number for legal reasons, and one apparently just needed a couple of months and a warehouse of underclocked chips to terrify Wall Street.
The benchmarks will change by Thursday. The versioning will get weirder. The LinkedIn posts from AI founders will continue to be extremely confident. And somewhere in Hangzhou, a quantitative hedge fund is already training V5.
-T.J. Maher
Software Engineer in Test
BlueSky | YouTube | LinkedIn | Articles