
Many Agents, One Metadata Layer

The week AWS Quick launched

AWS announced Quick this month, an agent platform aimed squarely at the same territory ChatGPT, Claude and Perplexity have been carving up for two years. It joins a list that has grown faster than anyone in the industry can keep track of. Cowork, the multi-agent collaboration framework. OpenClaw, the open-runtime project gathering momentum on GitHub. Cursor, the developer agent that has reshaped how working programmers ship code. Microsoft Copilot in three flavours, embedded in Office, in GitHub, and in Windows. Google's Gemini agents, Anthropic's Claude with computer use, Replit's agentic coding tools, Perplexity Labs. Vertical specialists in legal, in medical, in finance, in DevOps, in customer support. The list continues, and another platform will probably launch in the time it takes me to finish writing this paragraph.

Every one of these is doing the same first task before it can do anything useful. Every one is reading the public web, the published PDFs, the API documentation, the company sites, the help articles, the policy pages, the pricing tables, the regulatory filings. Every one of them, given an unfamiliar URL, has to perform context discovery: what is this site, what is this organisation, what may I do with this content, who is the author, when was it last updated, what version chain does it sit on, which standards does it claim conformance to, which downstream resources does it point at. Every agent does that work. Every agent does it again on the next visit, and on the visit after that, because no agent shares its context-discovery cache with any other.

The waste is enormous. The redundancy is invisible to the publisher. And the cost shows up everywhere except on the publisher's bill.

What each agent is actually doing

Strip away the marketing copy and every agent platform is, at the layer that matters, a context-discovery engine attached to an action engine. The action engines differ. The context-discovery engines do not differ in any meaningful way. They all need the same five things from any document or URL they are asked to act on.

Identity: what is this thing, and is it the current version. Provenance: where did it come from, who made it, when. Lifecycle: is the content still authoritative or has it been superseded. Affordances: what may an agent do with the content, and what should it do next. Semantics: what is this about, in machine-resolvable terms, so the agent can decide whether the document is relevant before reading the body.

An agent that finds those five answers in metadata does its job in a tree walk. An agent that has to derive them from prose does its job in a vision-and-language reconstruction. The compute differential is one to two orders of magnitude per document. The error rate diverges similarly: the metadata path produces a deterministic answer, the reconstruction path produces an estimate that may or may not be correct, with errors that propagate downstream into citations, generated contracts, and customer support conversations the publisher is not in the room for.
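The difference between the two paths can be made concrete. This is a minimal sketch, not an MX specification: the five keys mirror the primitives above, and the field names inside them (canonical_url, author, and so on) are hypothetical illustrations.

```python
# Hypothetical sketch of the metadata path: a deterministic lookup over
# declared fields. Field names are illustrative, not a published schema.
REQUIRED = ("identity", "provenance", "lifecycle", "affordances", "semantics")

def discover(metadata: dict) -> dict:
    """Metadata path: a cheap, deterministic tree walk over declared fields."""
    found = {k: metadata[k] for k in REQUIRED if k in metadata}
    missing = [k for k in REQUIRED if k not in found]
    if missing:
        # Reconstruction path: the agent must now infer these from prose,
        # at one to two orders of magnitude more compute per document.
        raise LookupError(f"reconstruction needed for: {missing}")
    return found

doc = {
    "identity": {"canonical_url": "https://example.com/report", "version": "2.1"},
    "provenance": {"author": "Example Corp", "published": "2026-01-15"},
    "lifecycle": {"status": "current"},
    "affordances": {"train": False, "cite": True},
    "semantics": {"about": ["quarterly-report", "finance"]},
}
print(discover(doc)["lifecycle"]["status"])  # prints "current"
```

The point of the sketch is the shape of the failure mode: when a field is declared, the answer is a dictionary lookup; when it is not, the agent falls through to estimation, and the estimate is where the downstream errors come from.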

Every new agent platform that ships in 2026 will pay these costs again. The publishers who reach the agents will pay them. The end users who read the agent's output will pay them. The grid that runs the inference will pay them in megawatts.

The substrate, not the application

The argument I want to make in this post is structural: the answer to the agent proliferation problem is not another agent. The agent layer is in good shape. There are dozens of platforms, more arriving each week, each one good at some specific subset of the work. They will fight over the action layer for the next several years and the result of that fight will be three or four winning platforms in each category, plus a long tail of vertical specialists. That is fine. That is how layers settle.

The substrate underneath is not in good shape. The substrate is the metadata that every agent has to read, regardless of which platform built it. The substrate is the structure tree of a tagged PDF, the JSON-LD on the canonical URL, the XMP packet that survives copying and syndication, the llms.txt manifest, the agent-card.json service description, the .mx.yaml.md folder declaration that tells an agent what the directory it is reading actually contains. The substrate is everywhere or it is nowhere. If it is in the HTML but missing from the PDF, the agent that downloads the PDF starts again. If it is in the canonical site but missing from the third-party syndication, the agent that found the syndication has no way back to the canonical. If it is in the file but missing from the folder, the agent has to crawl the folder to figure out what the file is even part of.

An operating system gives every application the same access to the same primitives. Files, processes, network, time, identifiers. Every application can assume that open() works, that the file system has folders, that processes have names, that the kernel will tell you what time it is. Applications do not reinvent these primitives. They were standardised long ago. Productivity at the application layer comes from not having to do the substrate work twice.

MX is the same kind of move at the agent layer. Identity, provenance, lifecycle, affordances, semantics: these are the primitives an agent needs from any artefact. They should not be rediscovered for each new platform. They should be declared once, by the publisher, in the carrier the artefact is shipped in, in a vocabulary every agent can read.

MX in every carrier

The carrier discipline I keep coming back to in this blog and in MX: The Protocols is the same point made differently. A modern publisher ships HTML, PDF, DOCX, EPUB, MP4, audio, CSV, ICS, RSS, and increasingly Markdown via content negotiation. Each of those formats has its own native idiom for declaring meaning. HTML has structured data. PDF has a structure tree and an XMP packet. DOCX has OOXML styles. EPUB has a navigation document. WebVTT carries cues for audio and video. CSVW lets a CSV declare its schema in JSON-LD. None of these are MX-specific; they are open W3C and ISO standards that have existed for years.

What MX adds is the consolidation. The same fields that an agent reads from a page's <meta> tags should be readable from a PDF's XMP packet. The same governance signals that the page declares should travel with the document when the document is downloaded. The same canonical URL declared on the HTML should appear in the PDF metadata so that an agent holding a stale copy can find the current one. The same training-data policy. The same conformance claims. The same author. The same licence.
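On the HTML side, the consolidation looks something like the fragment below. The field names and values are illustrative, not the MX vocabulary; the point is that each claim is a declared fact, not prose. The same four claims would be mirrored into the PDF's XMP packet (for instance dc:creator and dc:rights carry author and licence in Dublin Core terms), so a downloaded copy still answers the same questions.

```html
<!-- Hypothetical field names; the point is the consolidation, not the vocabulary. -->
<link rel="canonical" href="https://example.com/reports/q3-2026" />
<meta name="author" content="Example Corp Investor Relations" />
<meta name="license" content="https://creativecommons.org/licenses/by/4.0/" />
<meta name="ai-training" content="disallow" />
```

An agent holding a stale PDF that carries the canonical URL in its XMP packet can walk back to the current version in one request. An agent holding the same PDF without it has to search.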

Without that consolidation, every agent that reaches the document via a different path has to start its context discovery from scratch. With it, the document carries its own context, in every form it ships in, surviving every copy and every syndication. That is the substrate.

.mx.yaml.md at folder boundaries

One layer up from the file is the folder. A directory containing forty PDFs and ten markdown files is, to an agent, a wall of forty-plus context-discovery jobs. The agent that works out, by the third PDF, that the folder is a quarterly-report archive has wasted the cost of two PDFs reaching that conclusion. The agent that follows behind, working for a different platform, will repeat the same wasted reads.

This is the gap .mx.yaml.md closes. A folder-level metadata file declares, in one place, what the directory is, what it inherits from its parent, what it contains, who maintains it, what governance applies. The convention is already in use across the MX project. The generator script lives in scripts/mx/, the validator runs on commit, and every folder of any size in the project carries one. The format is small: a YAML frontmatter declaring the folder's purpose, a markdown narrative explaining the contents in human-readable form, an inheritance chain so child folders adopt their parent's defaults without restating them.
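A sketch of what such a file might look like follows. The schema is hypothetical, invented for illustration; the real generator in scripts/mx/ defines its own fields. What matters is the shape: YAML frontmatter as the load-bearing structure, markdown narrative as the human accompaniment.

```markdown
---
# Hypothetical schema; field names are illustrative, not the generator's output.
kind: folder
purpose: quarterly-report-archive
maintainer: investor-relations@example.com
inherits: ../.mx.yaml.md        # parent defaults apply unless restated here
governance:
  license: CC-BY-4.0
  ai-training: disallow
contains:
  - pattern: "*.pdf"
    role: quarterly-report
---

Historical archive of quarterly reports, 2016 onward. Current figures
live at /reports/latest; these copies are retained for audit.
```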

An agent reading .mx.yaml.md first answers all the discovery questions about the folder before reading any individual file. Two outcomes follow. First, the agent skips folders that are not relevant to its task; it does not read fifteen quarterly reports to discover the directory is the historical archive of an investor relations site. Second, the agent that does read individual files reads them with the folder's context already in hand: the same author chain, the same governance, the same conformance claims, the same canonical site context.

The convention is not exotic. It is one file per folder, the same way README.md has been one file per repository for thirty years. What is new is making it machine-first by default, with the YAML frontmatter as the load-bearing structure and the markdown narrative as the human accompaniment.

Three vectors, compounded across every reader

The case I made for the audit pays back through three vectors: reduced inference cost, fewer hallucinations, lower regulatory exposure. The same three vectors compound differently when the substrate is everywhere.

Inference cost compounds across agents. A site whose metadata is consolidated does less work for each agent that reads it, and there are more agents arriving each week. The savings are not per-page; they are per-page-per-agent, multiplied by the visit frequency. As agent traffic surpasses human traffic on most public corpora, this is the curve that bends the energy bill.

Hallucination compounds across the chain of agents that read each other's output. An agent that has misread a table cites the misread numbers. The next agent that summarises the first agent's output cites them again. By the time the error reaches the human reader, three agents have signed off on it. Each of those agents would have done its job correctly if the substrate had answered the question deterministically. Fix the substrate, and you remove a class of error from a class of conversations the publisher will never witness.

Energy compounds twice: once in the inference savings I have already described, and again in the failure modes the substrate prevents. An agent that has to retry against three sources before it gets a consistent answer pays its inference bill three times.

The MX OS position

The position I want this blog post to land on the table is one I expect to argue many more times before the industry settles. The MX layer is not an application. It is an operating system primitive. The metadata that lives in HTML, PDF, DOCX, EPUB, audio, video, CSV, in llms.txt, in agent-card.json, in .mx.yaml.md at every folder boundary, is the substrate every agent platform builds on, whether the platform is from AWS or Anthropic or a graduate-student project on GitHub.

The publishers who carry the substrate get read correctly by every agent that arrives. The publishers who do not get read correctly by the agent that has the budget to do the reconstruction work, and incorrectly by every other agent. As agent budgets compress (and they will; the long tail of vertical agents cannot afford to do vision-based reconstruction on every read), the gap between the substrate-equipped publishers and the rest will widen.

The work is not glamorous. It is field declarations and folder metadata and structure trees in tagged PDFs. It is exactly the kind of work that wins the long game and looks tedious in the short one. The publishers who took the same posture in the early HTML era, the ones who learned semantic markup and accessibility and structured data while everyone else was chasing the latest framework, are the publishers whose content is most readable today. The MX OS position says: do that again, deliberately, for the agent era.

What to do this week

If you are responsible for a published corpus, three actions return their cost quickly. Add MX governance fields to your HTML <meta> tags so that the basic identity, audience, content-policy and licence claims are machine-readable. Regenerate your public PDFs with a tagged-PDF pipeline so they carry the same fields in their XMP packet, plus the structure tree the European Accessibility Act mandates. Drop a .mx.yaml.md file at every folder boundary in your published source tree so that an agent reading the folder does not have to crawl every file to figure out what is there.

None of these is a multi-quarter programme. Each is a finite, scoped pass that an engineering team can complete and verify against a specific gate. The audit work I described in the previous post is the entry point: it tells you which fields are missing, which PDFs are untagged, which folders are bare. The discipline is what keeps the substrate consistent as the corpus grows.

The agent platforms will keep launching. Another one will land while you are reading this. The substrate they all build on is the same substrate, and it is yours to ship.
