Who answers when the machine decides?

RESONEO, the search-research outfit publishing at think.resoneo.com, tested more than two million domains against classification data built into Chrome and set out what it found in May 2026. The finding worth sitting with is that the browser carries internal lists, shipped and updated quietly alongside it, that decide how Chrome's machine features behave on a given site. The site is the subject of those decisions. It is never a party to them.

What the browser carries

One list keeps Gemini-in-Chrome from offering contextual actions — summarise this page, fill this form, compare these offers — on around thirty thousand domains. The pattern is banks, payment and credit services, cryptocurrency platforms, tax authorities, central banks, hospitals and telemedicine. Google's reasoning is plain enough: on pages where a wrong suggestion has real consequences, the safer choice is to keep the model out.

A second list does the opposite for a different population. Around thirty thousand mostly-commerce domains are marked so that, once visited, their pages are turned into vector embeddings on the user's own device. This is the browser's semantic memory: it lets history be searched by meaning rather than by exact words. Pages from sites that are not on the list do not get this treatment. News and media are almost entirely absent, which suggests Google sees little point vectorising content that goes stale.

A third layer handles commerce. Merchants are segmented by spending category, flagged for instalment-payment eligibility, and flagged where they reject the virtual card numbers Chrome can generate. Alongside the lists sits a live model — reverse-engineered by Dan Petrovic at Dejan — that decides whether a page is a shopping page at all. It chops the page into chunks of around a hundred words and truncates each to sixty-four tokens before judging.
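The chunk-and-truncate step can be pictured with a minimal sketch. It is not the classifier itself: it assumes whitespace-separated words stand in for both "words" and "tokens", since the real tokenizer and chunk boundaries are not described here, and the function name `chunk_and_truncate` is hypothetical.

```python
def chunk_and_truncate(text, words_per_chunk=100, max_tokens=64):
    """Split text into fixed-size word chunks, then keep only the head of each.

    Assumption: tokens are approximated by whitespace-separated words.
    """
    words = text.split()
    chunks = []
    for start in range(0, len(words), words_per_chunk):
        chunk = words[start:start + words_per_chunk]
        # Everything past max_tokens in this chunk is never seen by the scorer.
        chunks.append(chunk[:max_tokens])
    return chunks


# A stand-in page of 250 numbered words.
page = " ".join(f"w{i}" for i in range(250))
pieces = chunk_and_truncate(page)
kept = sum(len(p) for p in pieces)
print(len(pieces), [len(p) for p in pieces], kept)
```

Under this approximation, 36 of every 100 words in a full chunk are dropped before scoring. A subword tokenizer would typically reach the sixty-four-token limit sooner, so the share of the page actually judged could be smaller still.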

What unites these is more telling than any one of them. Each is a judgement about a site, formed outside the site, applied without the site's knowledge, and with nothing the site can publish to change it. Two are membership in registers held inside the browser binary. One is inference run afresh at page load. In every case the site is acted upon.

The cost of guessing

Take the shopping classifier first, because it is the cleanest example. The site already knows it sells things. The model spends compute rediscovering that fact, page by page, across the install base. The work of chunking and embedding and scoring is energy spent inferring something the site could have stated outright. A declared Product and Offer record, with currency in ISO 4217, says the same thing once and removes the guess. This is the reduce-inference argument in a single case, and it is the Convergence Principle as well: the declaration that saves the machine its guesswork is the same record a person can read and check. The interests line up rather than trade against each other.
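As an illustration of how little a declaration needs to say, here is a minimal sketch that builds a schema.org Product with a nested Offer as JSON-LD. The helper name `product_offer` and the example values are hypothetical, and the currency check only confirms the three-uppercase-letter shape of an ISO 4217 code, not that the code is actually assigned.

```python
import json
import re


def product_offer(name, price, currency,
                  availability="https://schema.org/InStock"):
    """Build a schema.org Product with one nested Offer as a JSON-LD dict."""
    # Shape check only: ISO 4217 alphabetic codes are three uppercase letters.
    if not re.fullmatch(r"[A-Z]{3}", currency):
        raise ValueError(f"not an ISO 4217-shaped currency code: {currency!r}")
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }


# Hypothetical example values.
record = product_offer("Example kettle", 34.5, "EUR")
print(json.dumps(record, indent=2))
```

The record states once what the classifier would otherwise reconstruct page by page, and a person can read the same fields and check them.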

There is an irony worth marking, because it is Google's own. Google's guidance to site owners, through Search Central, tells them to mark up products with schema.org Product and Offer so its systems can read price, currency and availability reliably instead of inferring them. The shopping classifier leans on chunks of around a hundred words, each truncated to sixty-four tokens, more than on that declared markup. The advice says declare; the browser guesses. A site that followed the guidance to the letter is still read by inference, which says something about how much weight a declaration carries when the reader is under no obligation to honour it.

That principle scales past shopping. Every classification a browser infers from page content is work it would not need to do if the site told it plainly. Inference is not free — not in latency, not in error rate, not in energy. Declaration is cheaper on all three.

The cost of deciding in private

The Gemini block list raises a different point, and it is worth being fair to Google here. Keeping a model away from a banking or medical page is a reasonable instinct. The problem is not the caution. It is where the caution lives and who exercises it.

This is one vendor judging sensitivity on behalf of every site, silently, inside the browser. A sensitive site that is not on the list gets no caution at all. A listed site that would happily permit some machine actions has no way to say so. There is no mechanism to read the judgement, contest it, or contribute to it. The decision is real and it has consequences, and it sits somewhere the site cannot reach.

A policy COG is where that decision could live instead — as a statement the site publishes about which machine actions are permitted against which content, with a human in the loop and the record readable by both sides. Attestation keeps this honest without overreaching: an Ed25519 signature over the canonicalised bytes confirms the statement is genuinely the site's and has not been altered. It says nothing about whether the content is editorially safe, because attestation is about provenance and integrity, never editorial truth. So MX would not be certifying a bank's pages as fit for an agent to act on. It would let the bank make a signed, checkable declaration of how it wants machines to behave on its pages — and let the browser, or any other reader, resolve that declaration rather than consult a list the bank never saw.
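What "the canonicalised bytes" means in practice can be sketched briefly. The policy fields below are hypothetical, since the COG format is not specified here, and the canonicalisation is a simplified stand-in (sorted keys, no extra whitespace, UTF-8) for whatever scheme the format actually prescribes. Python's standard library has no Ed25519 implementation, so the sketch stops at the bytes a signature would cover and uses a SHA-256 digest to show the integrity property only.

```python
import hashlib
import json


def canonical_bytes(declaration):
    """Serialise deterministically: sorted keys, compact separators, UTF-8.

    Assumption: a simplified stand-in for the format's real canonicalisation.
    """
    return json.dumps(
        declaration, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def digest(declaration):
    """SHA-256 over the canonical bytes, standing in for what would be signed."""
    return hashlib.sha256(canonical_bytes(declaration)).hexdigest()


# Hypothetical policy fields, for illustration only.
policy = {
    "origin": "https://bank.example",
    "permitted_actions": ["summarise"],
    "refused_actions": ["fill_form", "compare_offers"],
    "human_review_required": True,
}
reordered = dict(reversed(list(policy.items())))
tampered = {**policy, "permitted_actions": ["summarise", "fill_form"]}

same_bytes = canonical_bytes(policy) == canonical_bytes(reordered)
changed = digest(policy) != digest(tampered)
print(same_bytes, changed)
```

Key order does not change the bytes, and any change to the content does. That is all the check covers: it says the statement is intact, not that what it states is wise or true.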

What a signature can and cannot do

There is an obvious objection to all of this, and Google has put it on the record. Its May 2026 Search guidance tells site owners not to build machine-readable files for AI — no llms.txt, no special text files — partly because the systems already read the live HTML, and partly because such a file is unverifiable: a field a site can fill with anything, with nothing checking what it says against the page. Google has likened it to the old keywords meta tag, which search engines learned to ignore for that reason.

The objection is right, and it is worth being honest about how far it reaches, because a signature does not answer it. An attested COG can say anything, exactly as a web page can, exactly as an llms.txt can. Attestation signs the bytes; it does not inspect the claims. By design it covers provenance and integrity and stops there, leaving the reader to accept the claim or refuse it. That is not a shortfall in the design — it is the design. No machine should be ruling, on a site's behalf, that a statement is true. The reader decides, with a human in the loop where it matters.

So what does the signature buy, if not truth? An accountable author. The keywords meta tag failed not only because it went unchecked but because it went unowned: anyone could stuff it, and no one carried the cost. A signed declaration has someone on the hook — a key the origin controls, bytes that cannot be altered without breaking the signature, a claim a reader can trace to whoever made it and weigh accordingly. It can still be a lie. But it is a lie with a name against it, and a name can be doubted, revoked, audited, and held to its record over time. An anonymous claim can be none of those things.

Now set that beside the other side of the ledger. When Google's classifier decides your page is not a shopping page, or a list decides an agent should not act on you, or a model summarises you and gets you wrong — who is accountable for that? The judgement is unsigned. You cannot see it, you cannot contest it, and no party has put a name to it that you could question or appeal. A declaration you signed can be wrong and answered for. An inference made about you can be wrong and answered for by no one. Both can say anything; only one can be held to account.

That is the question the inference model leaves unanswered. A signed declaration places responsibility with the site and hands the reader something to judge. An inference hides the judgement, places responsibility nowhere, and gives the site no say in a verdict it cannot even read. Google was right that an unverifiable file is worth little. It is worth asking, in the same breath, what an unaccountable inference is worth — and who answers for it when it is wrong.

Discovery as membership

The semantic-memory list carries the same shape in a quieter form. The richer treatment — being searchable by meaning — is gated by inclusion in a register the site cannot see and cannot apply to join. Discovery collapses into being known to one vendor. The site publishes nothing that earns it a place; it is either on the list or it is not.

MX's discovery layer inverts that order. The site publishes what it is and what it offers, and being found follows from what has been declared rather than from membership in a private register. The difference is not cosmetic. One model puts the site at the mercy of a list it has no sight of. The other puts the signal in the site's own hands, where it can be read, checked and corrected.

Where this stops being flattering to MX

Honesty about the limit matters more than the rhetorical win. MX compliance does not, today, get a site onto Chrome's semantic-memory list or off its Gemini block list. These lists sit at the consumption layer, inside the browser, at the point where machine features are switched on or off. MX sits at the publishing layer, where a site declares its provenance, context and usage. The two run in parallel, and Chrome's lists pay no attention to what a site declares.

So this is not a case of MX answering Chrome's lists. It is a case of Chrome's lists showing, at the scale of two million domains, that the deciding already happens — by inference and by hidden register — and that the site is left out of it. MX is the argument that the site should be in it.

It is not only Chrome

Chrome is the instance that got measured, not the whole of the thing. The same pattern — read the rendered page, infer what it is, decide in private what to do with it — runs through every system that now reads the web on a person's behalf.

Other browser agents keep their own version of it. Perplexity's Comet, built on Chromium, acts inside the page — summarising it, filling its forms, running tasks across it. Microsoft's Copilot in Edge reasons across the open tabs and completes work on the user's behalf, and where it is allowed to act can be fenced by rules an administrator sets: its own private register of where an agent may operate, drawn up by a different party for different reasons. Brave's Leo summarises pages, videos and documents and can drive the browser from within an isolated profile, working from Brave's own independent search index rather than anyone else's. Each forms its own read of a site and decides on its own terms what to act on, what to summarise and what to trust. The criteria differ from one to the next, and none are published.

The assistants that fetch and read pages do the same work at the moment of retrieval. When a chatbot pulls a page to answer a question, it judges for itself what the page is, whether to trust it, what to quote and what to drop. RESONEO's own studies of ChatGPT Search and Perplexity show how much of this is bespoke heuristics rather than anything a site can see or set. Summariser foundation models go furthest, because they discard the most: handed a page to condense, a model builds its own representation of what matters, keeps a fraction and drops the rest — often including any sense of where a claim came from. It cannot tell a first-party statement the site stands behind from boilerplate scraped on the way past, because nothing in the rendered page carries that distinction in a form the model is obliged to honour.

So a site does not face one hidden rulebook. It faces many, each maintained by a different vendor, none compatible with the next, all reconstructing by inference the same facts the site already holds — what it is, whether it trades, whether it is sensitive, whether a given statement is authoritative. The cost of that falls on the site, left trying to stay legible to a crowd of systems it cannot see, on terms it cannot read, that change without notice. The vendors half-acknowledge the problem themselves: agentic browsing is widely held, including by the companies shipping it, to carry real risk, and a recurring complaint across the field is sites that were never built to be acted upon by an agent at all.

This is where declaration earns its keep across the field rather than against Chrome alone. One attested record is written once and readable by any agent that chooses to read it; inference is paid again by every system that passes through. The energy argument does not merely hold as more agents arrive — it compounds. And attestation matters more as their number grows, not less: when many models ingest your content, a signature over the canonicalised bytes is the only way any of them can separate a genuine first-party statement from a scrape or a forgery. The more agents there are guessing, the more a site needs a way to say, once and checkably, this is mine and this is what it means.

None of these systems read MX today either, so the parallel-layer point holds here as well — now multiplied across the field rather than confined to one browser. That is not an argument against a shared standard. It is the argument for one. The alternative to a standard anyone can read is not the absence of rules. It is a rulebook per vendor, with the site outside every one of them.

A classification is a decision

Calling these lists and models "classification" softens what they do. A classification is a decision. The shopping model decides whether you trade. The block list decides whether an agent may act on you. The semantic-memory list decides whether you are remembered by meaning or not remembered at all. Each is a determination made about a party, by an automated system, with consequences for that party — and that is the category law has started to take an interest in.

The EU AI Act has been in force since August 2024, the first horizontal statute of its kind, built on the idea that automated systems carry obligations in proportion to what they decide. Its timeline moved in May 2026: a provisionally agreed Digital Omnibus deal deferred the heavier duties for high-risk systems to 2027 and 2028, while the transparency obligations — among them the duty to tell people when they are dealing with an AI system at all — stayed on course for August 2026. Adjacent regimes lean the same way, from the data-protection limits on decisions taken by machine alone to the risk-management and accountability frameworks emerging on both sides of the Atlantic.

The claim has to be made carefully. A browser's shopping classifier is almost certainly not a high-risk system under the Act, and nothing here says any company is in breach of anything. The point is the direction, not a particular clause. Regulation is moving steadily towards disclosure that a decision was automated, a party responsible for it, and a route to contest it. The classification pattern is built the other way: the decision is undisclosed, the responsible party is unnamed, and there is no route to contest a verdict you cannot see. Whatever the law demands of any single system today, a web run on opaque automated decisions runs against the direction the law is taking.

This marks the narrow and accurate place MX occupies. MX does not regulate the model; that is what law is for. What it offers sits on the other side of the relationship — a way for the party being decided about to keep an accountable record of its own. A declaration with an author, a signature and a date is the kind of evidence an accountability regime expects: structured, queryable, checkable, able to be held against its maker later. MX makes that evidence; it does not make anyone compliant, and it does not make a model behave. But it means that when the question of who said what, and on what basis, finally has to be answered, the declaring side has something to show. The inferring side, unless the law compels it, still has nothing.

A frontier with no shared rules

Chrome is the rulebook that has been measured, but there is no agreed standard beneath any of these systems. Each vendor writes its own. Google has written one and shipped it inside Chrome, but it does not publish that rulebook, does not explain how a site lands on a list, and offers no route to read or contest the entries. The figures here surfaced through reverse engineering, not disclosure. That is the frontier as it stands: the party that ships the browser writes the rules, writes them for itself, and keeps them to itself.

The fracture runs inside Google as much as between vendors. In the same month, its Search guidance told site owners not to bother making machine-readable files for AI, while its browser tooling began auditing pages for exactly those signals — a summary file for agents, semantic markup, a clean accessibility tree. That last signal is the Convergence Principle in Google's own tooling: the accessibility tree an agent reads is the structure a screen reader reads, so building for one builds for the other. One arm of the company says declarations for machines are not worth making; another checks whether you have made them. If a single company's own products cannot agree on whether to read what a site declares, a site has no fixed target to build towards. The rules are not only unshared between vendors; they are unsettled within them.

Chunking shows the same split, and resolves it tellingly. The Search guidance tells site owners not to chunk their content for AI, to leave their pages whole. Google's cloud tooling, in the same period, tells developers building on its models to chunk documents with care — tune the chunk size, set the overlap, run a layout-aware parser so the pieces keep their meaning. Both are Google's advice. The reconciliation is the revealing part: do not chunk your content, because the machine will chunk it for you, exactly as Chrome's classifier already does when it splits a page into hundred-word pieces. Deciding where your meaning begins and ends is reserved to the reader, not granted to the writer — the inference stance in miniature. The site supplies the text; the machine keeps the right to cut it up and rule on what it meant.

This is the case for community-led standards, and it is why The Gathering exists as an independent body rather than a product feature. A standard a site can read, implement and help shape is the alternative to a register it cannot see. The COG format and the registry concept are open and governed in common for that reason; REGINALD is one implementation of them, not the standard itself. The point of holding the two apart is that the rules do not belong to whoever happens to own the most-used implementation.

The "or else" is not dramatic; it is mechanical. If the community does not set these rules, the rules still get set: by whoever ships the binary, in private, to suit their own ends. The absence of a shared standard is not neutral ground. It is ground claimed by default, and we are watching it be claimed.

The question the finding leaves open

So the open question is whether declared, attested signals will ever feed classifications of this kind, or whether vendors will keep deciding alone — and whether the standard those signals follow is one anyone can read, or one held inside a browser.

The MX position is clear. A site states what it is and how it wishes machines to treat it. The statement is attested so a reader can trust its origin and integrity. The human stays in the loop for editorial judgement, which no signature should ever stand in for. None of this requires Google to adopt anything. It requires only that we are clear about the choice the research puts in front of us: a web where machines guess and gatekeep in private, or one where sites declare in the open. Chrome has shown us the first at scale. The second is still ours to build.


The Chrome classification research is RESONEO's, published at think.resoneo.com. The shopping-classifier reverse engineering is Dan Petrovic's at Dejan.