The CMS Vocabulary War Has Started

Every major CMS has picked up a new label in the past six months. The wording varies a little: "AI Content Operating System", "AI-ready content infrastructure", "the platform agents reach for first". The label is the easy part. What sits underneath the label, and how an AI agent actually executes against it, is the part that decides which of these vendors is still standing in 2027.

Everyone is an operating system now

Sanity has relaunched as an "AI Content Operating System". Adobe is positioning Experience Manager as the AI-ready enterprise platform of choice. Contentful talks about "content infrastructure" rather than a CMS. Notion is quietly becoming the place AI pipelines reach for when they need a working surface for notes, briefs, and intermediate state. Every vendor in the content category has, in the same window, picked up the same vocabulary.

The reason for the rebrand is real. Structured content is the fuel that AI agents run on. Whoever owns the layer the agents read from owns the workflow that follows. Search distribution shifted in the late nineties. Social distribution shifted in the late noughties. Agent distribution is shifting now, and the vendors who run their planning cycle on quarterly earnings can already see the line. They are racing to claim the first-mover position before the customer has finished evaluating the question.

That is fine, as far as it goes. The trouble is that the new label is the easy part. What lies underneath is the difficult part, and what lies underneath, in most of these cases, is the same SaaS product the vendor was selling last year with a small adapter glued on the front.

We have seen this script before

In 2015 every B2B software company became a "platform". The word arrived because Salesforce had just taught the market that platform companies traded at higher multiples than product companies, so every SaaS vendor adopted the label. Some of them were genuinely platforms, in the sense that third parties could build self-contained businesses on top of them, and what those businesses ran on belonged to anyone who wrote against the API. Most just added an integrations page and rewrote the homepage.

The platform claims that survived the cycle were the ones where what the customer ran on actually belonged to the customer. On the open web that turned out to be an HTML page on a domain the publisher owned. On iOS and Android it was a binary that could be archived, re-signed, and installed elsewhere when the store relationship turned. With Salesforce, it was a schema customers could mostly export and rebuild on other infrastructure. The platform claims that did not survive the cycle were the ones where, when the vendor relationship ended, the customer was left with nothing they could run anywhere else.

The same line is being drawn in 2026. The "AI Content Operating System" claims that will survive are the ones where what an agent runs against belongs to the publisher. The ones that will not are the ones where the vendor's database is still the only place that content can be read and validated.

What does an agent actually run against?

You can describe every operating system by naming what it actually runs. On Unix that is a process. On the web that is an HTTP request against a URL. In Schema.org's vocabulary it is a typed thing with properties. Whatever the smallest item of execution turns out to be, the rest of the system is convenience built around it.

If a vendor sells you an "AI Content Operating System", the first question is straightforward: what does an agent actually execute against? Is it a content record in the vendor's CMS? A schema definition stored against the vendor's GraphQL endpoint? A prompt template that lives inside the vendor's authoring tool? An MCP tool call that the vendor's server resolves? Each of those choices carries a different commitment, different exit costs, and a different kind of long-term risk.

Three pressures will pull on whichever choice the vendor makes. First, whatever the agent runs against has to be portable, because an agent cannot guarantee it will speak only one vendor's protocol throughout a multi-step workflow. Second, it has to be self-describing, because an agent cannot afford to round-trip to a remote service every time it needs to know whether a record is current, in scope, or licensed for the use it is about to make. Third, it has to be verifiable on receipt, because an agent that cites a record without checking provenance produces hallucinations the user has no way to audit.

If the vendor's answer cannot meet those three tests, the operating-system label is decoration. The customer is still buying a product. They will discover this when they try to take their content to a second agent platform and find that whatever the agent fetches only resolves inside the vendor's product walls.

SaaS with an MCP server bolted on is not an operating system

The most common pattern in the new wave of "AI Content Operating System" launches is the same SaaS CMS the vendor shipped last year, plus an MCP server at the front door. The MCP server lets an AI agent call the vendor's existing API in a slightly more polite way. That is genuinely useful work. In many of these announcements it is also the only thing that has actually changed. The CMS still owns the content. The agent is still a guest. The vendor still owns the database, the auth boundary, and the contractual terms.

An MCP server is an interface, not a runtime. It is the modern equivalent of putting an OAuth-secured REST endpoint in front of a database in 2010 and calling the database a platform. The interface is welcome. What runs underneath has not changed.

Three smell tests separate marketing from architecture. First, can the customer run whatever the agent fetches offline, against an unrelated agent, without the vendor's authentication service in the loop? If not, that content lives in the vendor's database. Second, does it carry its own license terms, scope of use, audience, and status, or do those terms live in the vendor's admin UI? If the terms live in the admin UI, an agent that retrieves the content downstream has no way to apply them. Third, when the content leaves the vendor's premises, does anything sign it as authentic? If nothing signs it, the agent has the content but cannot tell whether it is current or fabricated.
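The three tests reduce to boolean checks over whatever the agent fetches. A minimal sketch, assuming the record arrives as plain text with key: value headers; the field names here are illustrative, not any vendor's published schema.

```python
# Three smell tests applied to a fetched content record, run entirely
# offline. Field names are illustrative; the point is that every check
# reads the text the agent already holds, with no auth service, SDK,
# or network round trip in the loop.

REQUIRED_TERMS = ("license", "scope", "audience", "status")

def passes_smell_tests(text: str) -> dict:
    return {
        # 1. Readable offline, against an unrelated agent.
        "offline_readable": len(text.strip()) > 0,
        # 2. Carries its own terms instead of pointing at an admin UI.
        "carries_terms": all(f"{field}:" in text for field in REQUIRED_TERMS),
        # 3. Something signed it before it left the vendor's premises.
        "signed": "signature:" in text,
    }

record = """license: CC-BY-4.0
scope: public-docs
audience: ai-agents
status: current
signature: sha256:9f2a1c...

Items may be returned within 30 days of delivery.
"""

print(passes_smell_tests(record))
```

A record that fails any of the three checks is, by this framing, still living in the vendor's database no matter what the homepage calls it.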

Apply those three tests to most of the recent rebrands and the operating-system claim falls apart. The product is fine. The product was always fine. The "operating system" label is just oversold.

Cog files

I have been working through this question for two years, in conversations with publishers, AI vendors, accessibility regulators, and the AI agents themselves. The answer that keeps surviving contact with reality is the cog file.

A cog file is a portable, machine-readable piece of content that an agent can read, validate, and act on without a human in the loop. It carries its own contract. License terms, audience, status, scope, provenance, and integrity signature live inside the file rather than in a remote database the agent has to query. It is delivered as markdown plus YAML frontmatter, a format that every modern AI tool, every static-site engine, and every content pipeline already parses without any vendor-specific software in the path.
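Because the format is markdown plus flat YAML frontmatter, reading one takes nothing but a standard library. A sketch, with frontmatter fields mirroring the contract listed above; the exact field names are illustrative, not the published schema.

```python
# Minimal sketch of reading a cog file with only the standard library.
# The frontmatter fields mirror the contract described in the text
# (license, audience, status, scope, provenance, signature); the exact
# field names are an assumption for illustration.

COG = """---
type: info-cog
license: CC-BY-4.0
audience: public
status: current
scope: product-docs
provenance: https://example.com/docs/returns
signature: ed25519:7f3b9d...
---
# Returns policy

Items may be returned within 30 days of delivery.
"""

def read_cog(text: str):
    """Split flat key: value frontmatter from the markdown body."""
    assert text.startswith("---\n"), "not a cog file"
    header, _, body = text[4:].partition("\n---\n")
    contract = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        contract[key.strip()] = value.strip()
    return contract, body.strip()

contract, body = read_cog(COG)
print(contract["license"], contract["status"])  # the terms travel with the file
```

The parse needs no vendor SDK and no network call, which is the whole point: the contract arrives with the content.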

Cog files come in two forms that mirror what an agent actually needs from content. An info-cog describes something: a product page, a policy, a glossary entry, a service description, a knowledge-base article, a manuscript chapter. An action-cog does something: it carries a runbook an agent can execute, a prompt template that names its own model and tools, a workflow with declared inputs and outputs. The information class and the action class are explicit, so the agent does not have to infer intent from prose.
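Because the class is declared rather than inferred, an agent's routing logic collapses to a field lookup. A sketch, assuming a `type` field carries the discriminator; the real field name is whatever the spec defines.

```python
# Routing on an explicit content class instead of inferring intent from
# prose. The 'type' field name and values are assumptions for illustration.

def route(contract: dict) -> str:
    kind = contract.get("type", "")
    if kind == "info-cog":
        return "index"    # describes something: retrieve, cite, quote
    if kind == "action-cog":
        return "execute"  # does something: run the declared workflow
    return "reject"       # unknown class: do not guess from prose

print(route({"type": "info-cog"}))    # index
print(route({"type": "action-cog"}))  # execute
```

The third branch matters as much as the first two: an undeclared class is rejected, not guessed at.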

That answers the question of what an agent runs against. It runs against a cog file. The interface is markdown plus YAML, which has a two-decade head start on every proprietary content format. The verification layer is REGINALD, which signs and indexes cog files so an agent can check, before quoting, that the file is current and from the named source.
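The shape of that pre-quote check is worth pinning down, even without REGINALD's actual signing scheme, which is not specified here. In this sketch an HMAC over the body stands in for the real signature purely to show where the check sits in the agent's loop.

```python
# The shape of the pre-quote check: recompute a digest of the file body
# and compare it to the signature the file carries. REGINALD's actual
# scheme is not described in this article; a keyed HMAC stands in for it
# purely to show where verification sits in the agent's loop.
import hashlib
import hmac

PUBLISHER_KEY = b"demo-key"  # stand-in for real publisher key material

def sign(body: str) -> str:
    return hmac.new(PUBLISHER_KEY, body.encode(), hashlib.sha256).hexdigest()

def verify_before_quoting(body: str, signature: str) -> bool:
    return hmac.compare_digest(sign(body), signature)

body = "Items may be returned within 30 days of delivery."
sig = sign(body)

print(verify_before_quoting(body, sig))                       # True: safe to quote
print(verify_before_quoting(body + " No exceptions.", sig))   # False: tampered
```

An agent that runs this check before citing can refuse to quote anything it cannot verify, which is the behavior the three pressures above demand.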

This is the two-pillar pattern the rest of the MX work has been pointing at. MX makes content machine-readable. REGINALD makes it machine-trustworthy. The combined effect: fewer hallucinations, because agents cite attested facts rather than inferences; lower inference cost and energy use, because agents do not have to reconstruct meaning from unstructured text; and alignment with the provenance requirements that the EU AI Act, the European Accessibility Act, and emerging digital-records legislation are beginning to write into law.

None of this requires a vendor relationship. A publisher who writes content as cog files and signs them with REGINALD owns what the agent runs against on day one. If the CMS used to author the file shuts down, the file keeps working. If the agent platform targeted today gets acquired and the API changes, the file keeps working. If a regulator asks who said this and when, the file answers.

What to ask your CMS vendor

Five questions separate the vendors whose architecture will survive the cycle from the vendors who will quietly retire the operating-system label two years from now.

  • What does an agent actually execute against in your AI Content Operating System? Name it.
  • Can I read and validate that without your software in the loop? Show me the file.
  • Where do the license terms, scope, and provenance live: in the file, or in your database?
  • What signs the content once it leaves your premises, so an agent can verify it is genuine and current?
  • If your company is acquired tomorrow and the API changes, what survives in the agent's hands? What needs to be rebuilt?

The answers separate vendors quickly. The ones who can name what an agent runs against, point at the file, and show the signature will still have customers operating their platforms in 2028. The ones who change the subject back to the dashboard, the integrations page, or the partner ecosystem are telling you, without meaning to, that they sold a product and called it infrastructure.

The vocabulary war has started. The architecture war is about to. Choose the vendors whose content keeps working when the vendor stops.

Three ways forward

The first is hiring us. If your CMS is on the list above and you want a clear answer to those five questions for your own estate, that is what CogNovaMX does. We run audits that name the gaps where machine readers stumble and miss what you meant, we advise on what to change first, and we train teams to write and ship content that survives the agent layer.

The second is joining the work. I founded The Gathering, the open community building the standards behind all of this. The Gathering is vendor-neutral, run by editors, contributors, content authors, accessibility specialists, and AI engineers who want the agent web to be readable by everyone. If that describes your role, the door is open.

The third is reading the books. There are three: a free introduction, the Handbook, and The Protocols (publishing in July 2026).