Claude in 2026: Models, Apps, Claude Code, and the API

Matthias Meyer

Most people who use Claude have only seen one tenth of it. They open the chat window, type a question, get an answer, and close the tab. That is Claude the way a phone is a calculator. It works, but it misses the point.

Claude in 2026 is four things wearing one brain. There is the chat product at claude.ai. There is Claude Code, an agent that lives in your terminal and edits real files. There is the developer API that lets you build Claude into your own software. And under all three sits one family of models, available not just from Anthropic but on Amazon, Google, and Microsoft clouds too. The skill that actually matters is not "prompting." It is knowing which of these four surfaces to reach for when. This guide walks through all four, with the real numbers, so you can stop guessing.

I run a design and AI studio on Mallorca, and we use every one of these surfaces daily, on our own work and on client systems. What follows is the map I wish someone had handed me.

The model family, in one table

Everything starts with the models. As of June 2026 there are three current ones, and they are genuinely different tools, not just sizes of the same thing.

	Claude Opus 4.8	Claude Sonnet 4.6	Claude Haiku 4.5
API model ID	`claude-opus-4-8`	`claude-sonnet-4-6`	`claude-haiku-4-5`
Best at	hardest reasoning, long agentic coding	speed plus intelligence	fastest, near-frontier
Context window	1M tokens	1M tokens	200K tokens
Max output	128K tokens	64K tokens	64K tokens
Price (input / output per 1M tokens)	$5 / $25	$3 / $15	$1 / $5
SWE-bench Verified	88.6%	79.6%	strong, lower

Opus 4.8 shipped on 28 May 2026 and is the flagship. It scores 88.6% on SWE-bench Verified, the standard benchmark for fixing real GitHub issues, up from 87.6% for Opus 4.7. On Terminal-Bench 2.1 it hits 90.1%. The number Anthropic leaned on hardest at launch was not a benchmark though: Opus 4.8 is roughly four times less likely than its predecessor to let a flaw in its own code pass unremarked. That honesty improvement matters more in practice than a point of SWE-bench, because the failure mode of a coding agent is rarely "can't solve it" and usually "solved it wrong and told you it was fine."

A 1M-token context window means Opus 4.8 and Sonnet 4.6 can hold roughly 555,000 to 750,000 words at once, an entire mid-sized codebase or a stack of contracts. Haiku stays at a still-large 200K. One nuance worth knowing: a big context window does not mean you should fill it. Performance degrades as context fills up, a problem people now call context rot. The window is headroom, not a target.
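
To make "headroom, not a target" concrete, here is a rough sketch of a fit check. The tokens-per-word ratios are simply back-derived from the 555,000 to 750,000 words range above, and the 50% comfort threshold is my own illustrative assumption, not an Anthropic recommendation. Real token counts depend on the text and the tokenizer.

```python
# Rough fit check for a context window. The ratios come from the
# "1M tokens is roughly 555,000 to 750,000 words" range in the text;
# the comfort threshold is an illustrative assumption.
TOKENS_PER_WORD_LOW = 1_000_000 / 750_000   # about 1.33
TOKENS_PER_WORD_HIGH = 1_000_000 / 555_000  # about 1.80


def estimate_tokens(word_count: int) -> tuple[int, int]:
    """Return a (low, high) token estimate for a given word count."""
    low = round(word_count * TOKENS_PER_WORD_LOW)
    high = round(word_count * TOKENS_PER_WORD_HIGH)
    return low, high


def fits_comfortably(word_count: int, window_tokens: int, comfort: float = 0.5) -> bool:
    """True if even the high estimate stays under `comfort` of the window."""
    _, high = estimate_tokens(word_count)
    return high <= window_tokens * comfort
```

With these assumptions, a 100,000-word document sits comfortably in a 1M window but not in a 200K one.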

Pricing scales the way you would expect: Opus costs five times as much as Haiku on both input and output. The practical rule we use: Haiku for high-volume, well-defined work like classification, extraction, and routing; Sonnet as the everyday workhorse for most chat and coding; Opus when the task is genuinely hard, long-running, or expensive to get wrong. There is also a fast mode on Opus 4.8 at $10 input and $50 output per million tokens, for when latency matters more than cost.
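
A quick way to feel the price gap is to run the arithmetic on a workload. This is a minimal sketch using only the list prices from the table; the workload itself (10,000 requests at 2,000 tokens in and 500 out) is a made-up example, and the function ignores caching, batch discounts, and fast mode.

```python
# Prices in USD per 1M tokens, copied from the table in this guide.
PRICES = {
    "claude-opus-4-8": {"input": 5.00, "output": 25.00},
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00},
    "claude-haiku-4-5": {"input": 1.00, "output": 5.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at list price (no caching, no batch discount)."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# Hypothetical workload: 10,000 requests, each 2,000 tokens in and 500 out.
for model in PRICES:
    total = 10_000 * request_cost(model, 2_000, 500)
    print(f"{model}: ${total:,.2f}")
```

On that hypothetical workload the totals come out at about $225 for Opus, $135 for Sonnet, and $45 for Haiku, which is why routing simple work to the smaller model adds up quickly.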

All three read text and images, speak dozens of languages, and run on the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. So if your company is locked into AWS or Azure procurement, you can still use the same models through the cloud you already pay for. One caveat: on Microsoft Foundry, Opus 4.8 currently runs with a 200K window rather than the full 1M.

Surface one: claude.ai, the chat product

This is the part everyone knows, and it has quietly grown into a serious workspace. Worth understanding before you pay for anything.

Projects are the feature most people miss. A Project is a container with its own instructions and uploaded knowledge. Drop your brand guide, your API docs, and your tone rules into a Project once, and every conversation inside it inherits that context automatically. For a small business this is the difference between re-explaining your company every morning and never explaining it again.

Artifacts turn a chat into a live workspace. Ask for a small web app, a chart, or a document, and Claude renders it next to the conversation where you can preview and iterate. Artifacts now hold persistent storage up to 20MB each, can call APIs, can talk to external services through MCP, and can refresh with live data when you reopen them. People are shipping genuinely useful internal tools this way without touching a code editor.

Then there is the connective tissue. Connectors are how claude.ai plugs into the outside world, and they run on the Model Context Protocol, the open standard Anthropic released for connecting AI to tools and data. Through MCP, Claude reaches Gmail, Google Drive, Slack, GitHub, Notion, Stripe, and hundreds of other services. As a non-developer you add a connector with a few clicks and Claude can suddenly read your calendar or triage your inbox. This is the same MCP that developers build servers for, which is the elegant part: the protocol is one thing, exposed at every level.

Two more additions broaden where Claude lives. Claude in Chrome, in beta for paid plans since late April 2026, puts Claude in a browser side panel where it can see the page and click through it with you. And the Claude desktop app added Cowork, which reached general availability across paid plans in April 2026 and lets Claude read, edit, and create files in a folder you choose, running multi-step work on its own. Voice mode on mobile is now free for everyone.

The plans, with real prices:

Plan	Price	For
Free	$0	trying it, light use
Pro	$20/mo ($17 annual)	individuals, daily use
Max 5x	$100/mo	heavy users, more usage
Max 20x	$200/mo	power users, Claude Code
Team	$25-30/seat/mo	small teams
Enterprise	custom	larger orgs, ~70-seat floor

The honest version: Free is fine to evaluate. Pro is the right tier for one person doing real work. Max exists because the people who run Claude Code all day kept hitting limits, and the higher Max tier and the Team Premium seat are where Claude Code usage is most comfortable.

Surface two: Claude Code, the agent in your terminal

This is the one that changes how work feels, and it is the most misunderstood. Claude Code is not a chatbot in a terminal. Anthropic describes it as an agentic coding tool that reads your codebase, edits files, runs commands, and integrates with your development tools. You tell it what you want in plain language and it does the work across many files, runs the tests, and shows you the diff.

Installing it is one line. On macOS, Linux, or WSL:

curl -fsSL https://claude.ai/install.sh | bash

There is also Homebrew (brew install --cask claude-code) and WinGet for Windows. Then you run claude inside any project and log in. It does not stay in the terminal either. The same engine runs in a VS Code and Cursor extension, JetBrains IDEs, a desktop app with side-by-side sessions, the browser at claude.ai/code, and the iOS app. Your config follows you across all of them.

The single highest-leverage thing in Claude Code is a file called CLAUDE.md. Anthropic's own best-practices doc calls it the agent's constitution. It is a markdown file in your project root that Claude reads at the start of every session. You put your coding standards, your architecture decisions, your preferred libraries, and your review checklist in it, and you stop repeating yourself. On top of that, Claude Code now builds auto memory as it works, saving things like your build command and debugging insights across sessions without you writing them down.

From there it gets powerful in layers. These are the pieces worth knowing by name, because they are the words you use to make Claude configure itself:

MCP servers. The same Model Context Protocol from claude.ai, here in your terminal. claude mcp add wires up a server, and now Claude Code can read your Jira tickets, query your database, or use any tool you give it. Servers are configured per user, per project, or locally, and a shared .mcp.json checks the project's servers into git for the whole team.
Subagents. A subagent is a separate Claude session with its own context window, spawned to handle a noisy or parallelizable task and report back only a summary. The main conversation stays clean. Type /agents to manage them. This is how you run several streams of work at once.
Hooks. Event-driven shell commands that fire deterministically when something happens, like PreToolUse, PostToolUse, or SessionStart. Unlike a prompt, a hook always runs. People use them to auto-format after every edit, block dangerous commands, or re-index a codebase. The newest addition lets a hook call an MCP tool directly, not just a shell command.
Skills. A Skill is a SKILL.md file in .claude/skills/ that packages a repeatable workflow, invoked as /your-skill or automatically when Claude judges it relevant. Unlike a subagent, a Skill runs in the current conversation, no new context, no spawning. Good for codifying a recipe your team reuses.
Plugins. A versioned bundle that ships skills, subagents, slash commands, hooks, output styles, and MCP servers together as one installable unit, shareable through a marketplace. If a Skill is a recipe card, a plugin is the whole cookbook.
Plan mode. Claude reads and proposes a written plan without touching anything. You approve it, then it executes. The discipline this enforces is the whole game.
Checkpoints. Claude Code tracks your session so you can rewind to an earlier state if a change went wrong.
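
Of the pieces above, hooks are the easiest to show in code. Here is a minimal sketch of a PreToolUse hook that blocks dangerous shell commands. It assumes the hook contract as I understand it: Claude Code passes the event as JSON on stdin with `tool_name` and `tool_input` fields, and an exit code of 2 blocks the call, with stderr fed back to Claude. Check the current hooks reference before relying on those details. The deny-list is my own illustration, not a vetted policy.

```python
import json
import sys
from typing import Optional

# Illustrative deny-list; these patterns are examples, not a complete policy.
BLOCKED_SUBSTRINGS = ("rm -rf /", "git push --force", "DROP TABLE")


def blocked_reason(event: dict) -> Optional[str]:
    """Return a reason if the pending Bash command should be blocked, else None."""
    if event.get("tool_name") != "Bash":
        return None  # only shell commands are inspected in this sketch
    command = event.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED_SUBSTRINGS:
        if pattern in command:
            return f"Blocked by hook: command contains {pattern!r}"
    return None


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read the hook event from stdin and return the exit code (2 means block)."""
    event = json.load(stdin)
    reason = blocked_reason(event)
    if reason:
        print(reason, file=stderr)
        return 2
    return 0
```

To use it as a real hook you would end the file with `sys.exit(main())` and register the script under PreToolUse in your Claude Code settings. Substring matching is deliberately crude here; it shows the mechanism, and a real policy would want something stricter.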

Underneath all of it is a workflow Anthropic recommends and that genuinely works: explore, plan, code, commit. Let Claude read the relevant files first in plan mode, have it write down what it will change and in what order, then let it implement against that plan, then commit with a clear message. Skipping the explore and plan steps is the most common reason a session goes sideways.

And because it follows the Unix philosophy, you can script it. The -p flag runs Claude headless, so you can pipe into it:

git diff main --name-only | claude -p "review these changed files for security issues"

That single pattern, Claude in a pipe, is what turns it from an assistant into infrastructure. We run reviews, translations, and audits this way in CI.
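
If your CI steps are Python rather than shell, the same pipe can be sketched with `subprocess`. The only Claude Code detail used is the `-p` flag from the command above; the function names and the injectable `run` parameter are my own, added so the logic can be tested without calling the CLI.

```python
import subprocess


def build_command(prompt: str) -> list[str]:
    """Argument list for a headless run; -p is the flag shown in the text."""
    return ["claude", "-p", prompt]


def review_changed_files(base: str = "main", run=subprocess.run) -> str:
    """Pipe the names of changed files into a headless Claude run and return its output."""
    diff = run(
        ["git", "diff", base, "--name-only"],
        capture_output=True, text=True, check=True,
    )
    if not diff.stdout.strip():
        return ""  # nothing changed, so skip the model call entirely
    review = run(
        build_command("review these changed files for security issues"),
        input=diff.stdout, capture_output=True, text=True, check=True,
    )
    return review.stdout
```

Skipping the call when the diff is empty is a small design choice that saves a pointless request on no-op commits.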

Surface three: the API and the Agent SDK, for builders

When you want Claude inside your own product, you drop to the API. The core is the Messages API, with official SDKs for Python, TypeScript, and Go. A minimal call looks like this:

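from anthropic import Anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,  # upper bound on the length of the reply
    messages=[{"role": "user", "content": "Summarize this contract."}],
)
# The reply is a list of content blocks; here the first one holds the text.
print(message.content[0].text)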

From that foundation, a handful of features do most of the heavy lifting, and knowing they exist saves you money and rebuilds.

Tool use, also called function calling, lets Claude decide when to call functions you define and with what arguments. It is the basis of every agent. You hand Claude a list of tools with JSON schemas, and it returns structured calls you execute.
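
The round trip is easier to see in code. This is a sketch of the local half only: a tool definition in the shape the Messages API expects (name, description, JSON Schema), plus a dispatcher that turns a `tool_use` block into the `tool_result` you send back. The invoice tool and its data are invented for illustration, and the blocks are shown as plain dicts, whereas the SDK hands you objects with the same fields.

```python
# A tool definition: a name, a description, and a JSON Schema for the
# arguments. The tool itself is hypothetical.
TOOLS = [
    {
        "name": "get_invoice_total",
        "description": "Look up the total amount of an invoice by its ID.",
        "input_schema": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
    }
]

# Stand-in data so the sketch runs without a real backend.
FAKE_INVOICES = {"INV-001": 1250.0}


def get_invoice_total(invoice_id: str) -> str:
    total = FAKE_INVOICES.get(invoice_id)
    return "not found" if total is None else f"{total:.2f}"


HANDLERS = {"get_invoice_total": get_invoice_total}


def run_tool_call(block: dict) -> dict:
    """Execute one tool_use block and build the tool_result to send back."""
    handler = HANDLERS.get(block["name"])
    if handler is None:
        return {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": f"unknown tool: {block['name']}",
            "is_error": True,
        }
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": handler(**block["input"]),
    }
```

In a real loop you would pass `TOOLS` as the `tools` argument, run every `tool_use` block in the reply through a dispatcher like this, and return the results in the next user message. The model never executes anything itself; your code does.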

Prompt caching is the one that pays for itself. You mark a stable chunk of your prompt, a long system prompt or a big document, with cache_control, and subsequent calls reuse it. Cache reads are charged at roughly ten percent of the normal input rate, so repeated context gets up to ninety percent cheaper. The default cache lives five minutes, with a one-hour option. If you are sending the same instructions on every request and not caching, you are overpaying.
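
Here is a sketch of what that looks like in a request, with the arithmetic next to it. The first function builds keyword arguments for `client.messages.create` with a `cache_control` marker on the system block; the second applies the roughly ten percent read rate from above. It ignores whatever the first, cache-writing call costs, so treat the numbers as the steady-state case, not a full bill.

```python
def cached_request(model: str, stable_context: str, question: str) -> dict:
    """Keyword arguments for client.messages.create with a cacheable system block."""
    return {
        "model": model,
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": stable_context,
                # Marks the prompt up to and including this block as cacheable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }


def input_cost(tokens: int, price_per_million: float, cached: bool = False) -> float:
    """Input cost in USD; cache reads are taken as 10% of the normal rate."""
    rate = price_per_million * (0.10 if cached else 1.0)
    return tokens * rate / 1_000_000
```

For a 50,000-token system prompt on Sonnet at $3 per million, that is $0.15 per request uncached against about $0.015 per cache hit. Keep the cached block byte-for-byte stable and put the changing part, the user question, after it.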

Extended thinking lets the model reason before answering, with a budget_tokens you control:

# Reuses the client from the minimal call above.
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,  # must be larger than the thinking budget
    thinking={"type": "enabled", "budget_tokens": 2000},
    messages=[{"role": "user", "content": "Plan a database migration."}],
)
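
One practical consequence: with thinking enabled, the reply contains thinking blocks as well as text blocks, so reading `content[0].text` is no longer safe. A minimal sketch of separating the two follows. It uses stand-in objects instead of a live response so it runs without an API key; the helper names are mine.

```python
from types import SimpleNamespace


def check_budget(max_tokens: int, budget_tokens: int) -> None:
    """Guard: the thinking budget must leave room for the answer."""
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")


def split_response(content) -> tuple[str, str]:
    """Separate reasoning from the final answer in a response's content blocks."""
    thinking = [b.thinking for b in content if b.type == "thinking"]
    answer = [b.text for b in content if b.type == "text"]
    return "\n".join(thinking), "\n".join(answer)


# Stand-in for response.content, shaped like the SDK's block objects.
fake_content = [
    SimpleNamespace(type="thinking", thinking="Order: schema first, then backfill."),
    SimpleNamespace(type="text", text="1. Add the new column. 2. Backfill. 3. Switch reads."),
]
reasoning, answer = split_response(fake_content)
print(answer)
```

Filtering by block type rather than by position keeps the same code working whether thinking is on or off.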

The rest of the platform is a toolbox you pull from as needed. The Batch API runs large jobs asynchronously within 24 hours at a flat 50% discount, ideal for bulk work that is not time-sensitive. The Files API handles documents and images you reference across calls. Citations make Claude point to the exact sentences it used, which is how you build trustworthy, checkable output. There are first-party tools too: a web search tool for current data, a code execution tool that runs Python in the call, and an MCP connector that lets Claude reach any remote MCP server without you writing client code. A memory tool for long-running agents is in public beta.
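
For the Batch API, most of the work is shaping the request list. The sketch below builds one entry per document, each with a `custom_id` you choose and the usual message parameters, and applies the flat 50% discount to a list-price figure. The document IDs, the prompt, and the choice of Haiku are illustrative assumptions.

```python
def build_batch_requests(documents: dict[str, str], model: str = "claude-haiku-4-5") -> list[dict]:
    """One batch entry per document, keyed by a custom_id you choose."""
    return [
        {
            "custom_id": doc_id,
            "params": {
                "model": model,
                "max_tokens": 512,
                "messages": [{"role": "user", "content": f"Summarize:\n\n{text}"}],
            },
        }
        for doc_id, text in documents.items()
    ]


def batch_cost(list_price_cost: float) -> float:
    """Apply the flat 50% batch discount mentioned in the text."""
    return list_price_cost * 0.5
```

With the Python SDK, a list like this is what you would hand to `client.messages.batches.create(requests=...)`. The `custom_id` is how you match results back to inputs, since batch results are not guaranteed to arrive in the order you sent them.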

When you want to build a full agent rather than make single calls, there is the Claude Agent SDK, formerly the Claude Code SDK. It gives you the same agent loop, tool handling, and context management that power Claude Code, in Python and TypeScript, with full control over orchestration and permissions. The mental model: the API is for calls, the Agent SDK is for agents.

Which surface for which job

Put the four together and the decision gets simple.

Reach for claude.ai when a human is in the loop and the work is thinking, writing, analysis, or a quick tool you build in Artifacts. Reach for Claude Code when the work touches files, a codebase, or your local machine, and you want an agent that acts, not just answers. Reach for the API when Claude needs to live inside software you ship to other people. And pick the cloud version, Bedrock, Vertex, or Foundry, when procurement or data residency makes that the path of least resistance.

For the model inside any of those: Haiku for volume, Sonnet for most things, Opus when it is hard or costly to be wrong.
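
If you route requests in code, that rule of thumb fits in a few lines. This is only a sketch: the two boolean inputs are my simplification, and letting "hard or costly to get wrong" win over "high volume" is an assumption about precedence that you may want to flip for your own workload.

```python
def pick_model(high_volume: bool, hard_or_costly_if_wrong: bool) -> str:
    """Encode the rule of thumb: Opus for hard work, Haiku for volume, Sonnet otherwise."""
    if hard_or_costly_if_wrong:
        return "claude-opus-4-8"
    if high_volume:
        return "claude-haiku-4-5"
    return "claude-sonnet-4-6"
```

A classification queue would call `pick_model(high_volume=True, hard_or_costly_if_wrong=False)` and land on Haiku; a contract review would land on Opus.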

The traps nobody puts on the box

A guide that only lists features is marketing. Here is what actually bites.

The first is confident wrong code. Claude will sometimes invent an API method that sounds right but was renamed two versions ago, or cite a library function that does not exist. The fix is not to trust less in some vague way; it is to give the model current facts. Point it at real documentation, and in Claude Code, give it tools that read your actual code instead of guessing. Opus 4.8 is meaningfully better here, but better is not zero.

The second is context rot. As a session grows long, the model's attention thins out and quality drops. The 1M window invites you to dump everything in, and then the answer gets worse. Keep sessions scoped. Use subagents to push noisy work into separate context. Start fresh when a thread has wandered.

The third is the one to take seriously if you connect Claude to your tools: the lethal trifecta. Security researchers use this term for three conditions that, together, make an agent exploitable: access to private data, exposure to untrusted content, and a way to send data out. Any one alone is fine. All three at once means a malicious instruction hidden in, say, a web page or an email can make your agent leak what it can see. MCP makes Claude far more capable and, in the same motion, widens this attack surface. The defenses are unglamorous and real: run only MCP servers you trust, use the permission allowlist so the agent cannot run anything it likes, and follow least privilege. Do not give an agent both your secrets and an open door at the same time.
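
The trifecta is simple enough to turn into a review check. This sketch models an agent setup as three booleans and flags the combination; the class and field names are hypothetical, and deciding which of your connectors counts as "private", "untrusted", or "outbound" is still a human judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSetup:
    reads_private_data: bool       # e.g. inbox, internal docs, secrets
    sees_untrusted_content: bool   # e.g. web pages, inbound email
    can_send_data_out: bool        # e.g. HTTP requests, outbound mail


def has_lethal_trifecta(setup: AgentSetup) -> bool:
    """True only when all three risk conditions are present at once."""
    return (
        setup.reads_private_data
        and setup.sees_untrusted_content
        and setup.can_send_data_out
    )
```

An inbox triage agent that can also browse the web and send mail trips the check; remove any one capability and it no longer does. That is the least-privilege argument in executable form.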

None of these are reasons to avoid Claude. They are the difference between using it well and getting surprised.

Where to start

If you have never gone past the chat window, do three things this week. Make one Project and load it with your real context. Install Claude Code and write a ten-line CLAUDE.md for one repo. Add one MCP connector to something you actually use. That is the whole arc of this guide in three moves, from chat to agent to connected.

The reason any of this matters for a business is that the gap between islands of tools is where time leaks, and Claude across these four surfaces is, more than anything, a way to close those gaps. We connect Claude to client systems through MCP for exactly that, and the StudioMeyer Memory server is our own answer to the part this guide keeps circling, giving Claude a memory that survives between sessions.

If you want the terminal side on its own, our walkthrough on setting up Claude Code without the jargon is the natural next read.

Claude is no longer one window. Learn the four surfaces and the model stops being a clever toy and starts being something you build on.