Appendix L: Proposed AI Metadata Patterns
MX-Protocols
January 2026
A formal proposal document for experimental AI metadata patterns that extend existing web standards.
Status and Classification
Document Status: Experimental Proposal — Not Yet Standardised
Maturity Level: Forward-compatible proposals that do not break if agents do not recognise them
This appendix consolidates all proposed and experimental patterns mentioned throughout “MX: The Protocols”. These patterns follow established conventions (like robots meta tags or viewport meta tags) and represent logical extensions that may standardise as the AI agent ecosystem matures.
Important: These are NOT established standards. They are proposals based on production implementations and logical extensions of existing patterns.
Relationship to Established Standards
The standards hierarchy is absolute. Established web standards come first. MX fills gaps that standards do not yet cover. MX never duplicates what standards already provide.
Implementation order:
- Semantic HTML (established) — Use <main>, <nav>, <article> always
- Schema.org JSON-LD (established) — Primary structured data method
- ARIA attributes (established) — Critical for accessibility
- HTTP headers (established) — Cache-Control, Content-Type, status codes
- robots.txt / sitemap.xml (established) — Discovery and crawl guidance
- llms.txt (emerging) — Early adoption phase, gaining traction
- mx: meta tags (proposed) — Fill gaps not covered by the above
- data-agent-visible (proposed) — Semantic marker for agent-only metadata
- Common data attributes (proposed) — Explicit state management patterns
- Pandoc YAML frontmatter (established) — Universal markdown metadata standard
If a standard already covers the need, use the standard. MX tags exist only where no established standard provides the same capability.
See Appendix D for the comprehensive guide to all patterns (established + proposed).
Pattern 1: MX Framework Meta Tag Namespace
Status
Proposed Pattern — Not yet standardised, forward-compatible
Rationale
Page-specific AI agent guidance needs to override site-wide defaults from llms.txt. Just as robots meta tags override robots.txt for specific pages, AI meta tags provide page-level control over agent behaviour.
Why meta tags?
- Established pattern (robots, viewport, Open Graph all use meta tags)
- Page-specific overrides for site-wide policies
- Machine-readable without parsing content
- Browser-agnostic (works in served HTML)
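The override relationship described above can be sketched as a simple merge in which page-level values win over site-wide defaults. This is a minimal illustration only: llms.txt does not define a key/value format for these policies, so the `siteDefaults` object and the `resolveAgentGuidance` helper are hypothetical stand-ins for whatever an agent has derived from the site-wide file.

```javascript
// Hypothetical helper: combine site-wide defaults with page-level mx: values.
// Page-level values take precedence, mirroring how a robots meta tag
// overrides robots.txt for a single page.
function resolveAgentGuidance(siteDefaults, pageMeta) {
  const resolved = { ...siteDefaults };
  for (const [name, value] of Object.entries(pageMeta)) {
    if (name.startsWith("mx:") && value !== "") {
      resolved[name] = value; // page-level override
    }
  }
  return resolved;
}

// Assumed inputs, for illustration only.
const siteDefaults = {
  "mx:content-policy": "summaries-allowed",
  "mx:attribution": "requested",
};
const pageMeta = { "mx:content-policy": "restricted" };

const guidance = resolveAgentGuidance(siteDefaults, pageMeta);
console.log(guidance);
```

The page tightens the content policy while the attribution default carries through unchanged.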
Part 1: MX Operating System (MX OS) Philosophy
What is MX OS?
The MX documentation is the MX Operating System (MX OS). When we document patterns here, we define how Machine Experience works.
MX OS is:
- Documentation that specifies behavior
- Patterns that practitioners follow
- Standards that machines implement
- A living system that evolves through practice
Key principle: Documentation as specification. By documenting how MX should work, we create the operating system that defines machine experience.
How MX OS Evolves
- Version-controlled principles — All changes tracked in git history
- LEARNINGS.md captures failures — Document what went wrong and how to prevent it
- Community contributions — Both human and machine contributors
- Evidence trumps theory — Real-world implementation guides evolution
- No principle is sacred — If practice proves a principle wrong, we change it
For detailed documentation of how MX OS is built collaboratively, see Appendix M: Building the MX Operating System.
Part 2: MX Namespace Architecture
Overview
MX Framework uses a hierarchical namespace system to organize machine-readable metadata. This namespace architecture is documented here as part of the MX Operating System.
Namespace Hierarchy
Top-level namespace: mx:
Sub-namespaces:
- mx.ai: — Machine-readable metadata (agent behavior, runbooks, content editability)
- mx.co: — Content operations metadata (workflow, publishing, lifecycle)
- mx.ho: — Hosting metadata (deployment, caching, infrastructure)
Example YAML:
mx:
  contentType: "specification"
  runbook: "Focus on technical accuracy"
  ai:
    aiEditable: cautious
    preferredAccess: html
  co:
    workflow: draft
    reviewRequired: true
  ho:
    cacheStrategy: aggressive
    cdn: cloudflare
ASCII Diagram of Namespace Structure
mx: (top-level namespace)
├── mx.ai: (AI-specific)
│   ├── editable
│   ├── preferredAccess
│   └── runbook
├── mx.co: (content operations)
│   ├── workflow
│   ├── contentType
│   └── reviewRequired
└── mx.ho: (hosting)
    ├── cacheStrategy
    └── cdn
HTML Meta Tags: Colon Prefix Pattern
In HTML, we use the mx: colon prefix (matching established conventions):
<meta name="mx:content-policy" content="extract-with-attribution">
<meta name="mx:attribution" content="required">
Why colon prefix?
HTML meta tags use colon-delimited namespaces as an established convention. The mx: prefix follows the same pattern as other widely adopted meta tag namespaces:
- twitter: for Twitter Cards
- og: for Open Graph
- mx: for Machine Experience
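To illustrate how an agent might pick out the mx: namespace from served HTML, here is a minimal sketch. The `extractMxMetaTags` function is a hypothetical helper, and the regular expression assumes double-quoted attributes with `name` before `content`; a production agent would use a real HTML parser instead.

```javascript
// Hypothetical helper: collect mx:-prefixed meta tags from an HTML string.
// Assumes double-quoted attributes in the order name, then content.
function extractMxMetaTags(html) {
  const tags = {};
  const metaPattern = /<meta\s+name="(mx:[a-z-]+)"\s+content="([^"]*)"[^>]*>/g;
  for (const match of html.matchAll(metaPattern)) {
    tags[match[1]] = match[2];
  }
  return tags;
}

const sampleHead = `
<meta name="viewport" content="width=device-width">
<meta name="mx:content-policy" content="extract-with-attribution">
<meta name="mx:attribution" content="required">`;

const mxTags = extractMxMetaTags(sampleHead);
console.log(mxTags);
```

Tags outside the namespace, such as `viewport`, are left alone, so the prefix acts as a cheap filter.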
Framework Identity
Like twitter: and og:, the mx: prefix:
- Establishes MX brand and presence
- Aids discoverability: developers search “mx meta tags” and find MX Framework
- Aligns with MX namespace architecture: flat HTML prefix maps to nested YAML structure
- Designed for MX practitioners: MX-Ready websites built by MX community
Extension Pattern
The namespace architecture is extensible. Future namespaces might include:
- mx.sec: — Security metadata
- mx.perf: — Performance optimization hints
- mx.a11y: — Accessibility enhancements beyond WCAG
Guidelines for extension:
- New namespaces should serve distinct, non-overlapping purposes
- Follow camelCase naming convention for attributes
- Document in this appendix before widespread use
- Community discussion required for new top-level namespaces
Part 3: MX Attributes by Namespace
This section consolidates MX attributes organized by namespace. The former complete registry at mx-canon/mx-maxine-lives/registers/mx-attributes-registry.md is deprecated; refer to this appendix instead.
3.1 mx.ai: AI-Specific Metadata
Attributes that control AI agent behavior and content interpretation.
runbook
- Type: string
- Purpose: Instructions for AI agents on how to interpret or handle content
- Example:
mx: { runbook: "This is copyrighted material. No part may be reproduced without permission." }
editable
- Type: enum (strict, cautious, flexible)
- Purpose: Indicates how freely AI agents may edit or adapt content
- Example:
mx: { ai: { editable: cautious } }
preferredAccess
- Type: enum (html, api, both)
- Purpose: How agents should access content
- Example:
mx: { ai: { preferredAccess: html } }
deliverable
- Type: string
- Purpose: Instructions for generating output based on this content
- Example:
mx: { ai: { deliverable: "Generate slide deck from this content" } }
3.2 mx.co: Content Operations Metadata
Attributes for content workflow, lifecycle, and publishing.
contentType
- Type: string
- Purpose: Classification of content type
- Example:
mx: { contentType: "specification" }
- Values: specification, tutorial, reference, guide, article
workflow
- Type: enum (draft, review, published, archived)
- Purpose: Current state in content lifecycle
- Example:
mx: { co: { workflow: draft } }
reviewRequired
- Type: boolean
- Purpose: Whether content requires review before publication
- Example:
mx: { co: { reviewRequired: true } }
publishingState
- Type: string
- Purpose: Detailed publishing status
- Example:
mx: { co: { publishingState: "pending-approval" } }
3.3 mx.ho: Hosting Metadata
Attributes for deployment, caching, and infrastructure.
cacheStrategy
- Type: enum (aggressive, moderate, minimal, none)
- Purpose: How aggressively to cache content
- Example:
mx: { ho: { cacheStrategy: aggressive } }
cdn
- Type: string
- Purpose: CDN provider or configuration
- Example:
mx: { ho: { cdn: "cloudflare" } }
deploymentTarget
- Type: string
- Purpose: Target deployment environment
- Example:
mx: { ho: { deploymentTarget: "production" } }
3.4 Cross-Namespace Attributes
Some attributes work across multiple namespaces or don’t fit neatly into one category.
All attributes follow:
- Namespace: Nested under mx: key
- CamelCase: Multi-word attributes use camelCase
- No hyphens: Never use kebab-case
- Consistent: Follow MX Code Metadata Specification
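The naming rules above lend themselves to a mechanical check. The sketch below walks an already-parsed metadata object and reports keys that are not camelCase; `findNamingViolations` and the sample `metadata` object are illustrative assumptions, not part of any MX tooling.

```javascript
// camelCase: starts with a lowercase letter, then letters and digits only
// (so hyphens and underscores are rejected).
const CAMEL_CASE = /^[a-z][a-zA-Z0-9]*$/;

// Hypothetical helper: list every key path under mx: that breaks the rule.
function findNamingViolations(node, path = "mx") {
  const violations = [];
  for (const [key, value] of Object.entries(node)) {
    const keyPath = `${path}.${key}`;
    if (!CAMEL_CASE.test(key)) violations.push(keyPath);
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      violations.push(...findNamingViolations(value, keyPath));
    }
  }
  return violations;
}

// Assumed sample with two deliberate mistakes.
const metadata = {
  mx: {
    contentType: "specification",
    ai: { "preferred-access": "html" },
    co: { review_required: true },
  },
};

const violations = findNamingViolations(metadata.mx);
console.log(violations);
```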
Part 5: JSON-LD Structured Data
Integration with Schema.org
MX Framework recommends Schema.org JSON-LD as the primary method for structured data. This complements (not replaces) HTML meta tags.
When to Use JSON-LD vs HTML Meta Tags
Use JSON-LD for:
- Rich structured data (BlogPosting, Article, Product, Event)
- Data that search engines and AI agents should extract
- Complex nested data structures
- Organization and author information
Use HTML meta tags (mx:) for:
- Page-specific agent behavior overrides
- Content policies and permissions
- Freshness indicators
- Access preferences
JSON-LD Format Decision
Use JSON-LD only - do not combine with microdata or RDFa.
Rationale:
- Google recommends JSON-LD as primary format
- Cleaner separation of content and metadata
- Easier to maintain and validate
- Better tool support
BlogPosting Example
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Understanding MX Metadata Patterns",
  "description": "A comprehensive guide to machine-readable metadata",
  "datePublished": "2026-01-22",
  "dateModified": "2026-01-22",
  "author": {
    "@type": "Person",
    "name": "Tom Cranstoun",
    "url": "https://allabout.network"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Digital Domain Technologies Ltd",
    "url": "https://ddt.technology"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://mx.allabout.network/blog/metadata-patterns.html"
  },
  "articleSection": "Machine Experience",
  "keywords": ["metadata", "machine-experience", "mx", "structured-data"],
  "wordCount": 4235,
  "inLanguage": "en-GB"
}
</script>
Article vs BlogPosting
- BlogPosting: Personal or editorial blog content
- Article: News articles or authoritative content
- NewsArticle: Time-sensitive news reporting
Choose the most specific type that applies.
Required vs Recommended Properties
Required:
- @context and @type
- headline
- datePublished
- author
Recommended:
- description
- dateModified
- publisher
- mainEntityOfPage
- keywords
- wordCount
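The two property lists above can be turned into a quick audit of a JSON-LD block. This is a sketch under the assumption that the block has already been extracted as text; `auditBlogPosting` is a hypothetical helper, and the required/recommended split is this appendix's classification rather than a Schema.org rule.

```javascript
// Property lists taken from the Required and Recommended lists above.
const REQUIRED = ["@context", "@type", "headline", "datePublished", "author"];
const RECOMMENDED = [
  "description", "dateModified", "publisher",
  "mainEntityOfPage", "keywords", "wordCount",
];

// Hypothetical helper: report which listed properties are absent.
function auditBlogPosting(jsonLdText) {
  const data = JSON.parse(jsonLdText);
  const missing = (names) => names.filter((name) => !(name in data));
  return {
    missingRequired: missing(REQUIRED),
    missingRecommended: missing(RECOMMENDED),
  };
}

// Assumed sample that omits the author and several recommended fields.
const report = auditBlogPosting(JSON.stringify({
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  headline: "Understanding MX Metadata Patterns",
  datePublished: "2026-01-22",
  dateModified: "2026-01-22",
}));
console.log(report);
```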
Pattern Specifications
Use Cases
- Product Pages: Specify API endpoints for current product
- News Articles: Indicate content freshness requirements
- Documentation: Allow full extraction vs summary-only
- Internal Pages: Override public access policies
Proposed Meta Tags
mx:preferred-access
Deprecated. Do not implement this tag.
Previously proposed to indicate how agents should access content.
Why deprecated: If a page is served as HTML, agents
access it as HTML. If an API exists, it is discoverable via
<link rel="api" href="..."> or documented in
llms.txt. The tag restates what the delivery mechanism
already communicates. Pages that serve HTML do not need a meta tag
confirming that they serve HTML.
If you have an API endpoint: Use
<link rel="api" href="/api/v1/products"> instead.
Link elements are the standard mechanism for declaring related
resources.
mx:content-policy
Active. What agents are permitted to do with content.
Values:
- summaries-allowed — Can create summaries
- full-extraction-allowed — Can extract complete content
- extract-with-attribution — Can extract with attribution required
- restricted — Contact required
Example:
<meta name="mx:content-policy" content="extract-with-attribution">
Rationale: More granular than the robots noindex directive. Allows summaries whilst restricting full extraction.
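One way an agent might turn the content value into concrete permissions is sketched below. The interpretation is an assumption made for illustration: values may be comma-separated, `restricted` overrides everything else, and unrecognised values are dropped. None of this is fixed by the proposal, and `interpretContentPolicy` is a hypothetical name.

```javascript
const KNOWN_POLICIES = new Set([
  "summaries-allowed",
  "full-extraction-allowed",
  "extract-with-attribution",
  "restricted",
]);

// Hypothetical helper: map a content-policy value to permission flags.
function interpretContentPolicy(content) {
  const values = content
    .split(",")
    .map((value) => value.trim())
    .filter((value) => KNOWN_POLICIES.has(value));

  // Assumption: "restricted" (or no recognised value) grants nothing.
  if (values.length === 0 || values.includes("restricted")) {
    return {
      maySummarise: false,
      mayExtractFully: false,
      attributionRequired: false,
      contactRequired: values.includes("restricted"),
    };
  }

  const attributionRequired = values.includes("extract-with-attribution");
  return {
    maySummarise: values.includes("summaries-allowed"),
    mayExtractFully: attributionRequired || values.includes("full-extraction-allowed"),
    attributionRequired,
    contactRequired: false,
  };
}

console.log(interpretContentPolicy("summaries-allowed, full-extraction-allowed"));
console.log(interpretContentPolicy("restricted"));
```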
mx:freshness
Deprecated. Do not implement this tag.
Previously proposed to indicate how often content changes.
Why deprecated: HTTP Cache-Control
headers already communicate cache duration to all clients, including AI
agents. Schema.org dateModified in JSON-LD tells agents
when content last changed. Adding a meta tag that restates this
information creates a maintenance burden — when the HTTP headers say one
thing and the meta tag says another, agents must decide which to trust.
The HTTP header is the canonical source. Use it.
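As a rough sketch of relying on the header instead of a meta tag, an agent could read `max-age` from `Cache-Control` to decide whether a stored copy needs refetching. The helper names are hypothetical, and the sketch deliberately ignores other directives such as `s-maxage`, `no-cache`, and `must-revalidate`.

```javascript
// Hypothetical helper: return max-age in seconds, or null if absent.
function maxAgeSeconds(cacheControl) {
  for (const directive of cacheControl.split(",")) {
    const [name, value] = directive.trim().split("=");
    if (name.toLowerCase() === "max-age" && /^\d+$/.test(value ?? "")) {
      return Number(value);
    }
  }
  return null;
}

// Hypothetical helper: is a copy fetched at fetchedAtMs stale at nowMs?
function isStale(fetchedAtMs, cacheControl, nowMs) {
  const maxAge = maxAgeSeconds(cacheControl);
  if (maxAge === null) return true; // assumption: no freshness info means refetch
  return nowMs - fetchedAtMs > maxAge * 1000;
}

console.log(maxAgeSeconds("public, max-age=3600"));
console.log(isStale(0, "max-age=60", 61000));
```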
mx:structured-data
Deprecated. Do not implement this tag.
Previously proposed to indicate where to find structured data.
Why deprecated: The JSON-LD
<script type="application/ld+json"> block is
self-evident. Any agent capable of parsing structured data already knows
to look for this standard element — it is defined by the JSON-LD
specification and universally supported. Adding a meta tag that says
“there is JSON-LD on this page” when the JSON-LD is right there on the
page is pure noise. It would be like adding a meta tag that says “this
page contains HTML.”
mx:attribution
Active. Attribution requirements for content.
Values: required, requested, not-required
Example:
<meta name="mx:attribution" content="required">
Rationale: Explicit statement of attribution expectations, ensuring consistent attribution across all AI-generated content that references this material.
mx:jurisdiction-restriction
Indicates content was created, published, or ingested from a jurisdiction with content restrictions, allowing agents to understand potential legal and content limitations.
Values:
- ISO 3166-1 alpha-2 country codes: CN (China), RU (Russia), IR (Iran), KP (North Korea), etc.
- EU member states with GDPR: EU (general), or specific codes like DE (Germany), FR (France)
- Or none if no jurisdictional restrictions apply
Attributes:
- content: Jurisdiction code (required)
- reason: Brief explanation of restriction type (optional but recommended)
Example:
<meta name="mx:jurisdiction-restriction" content="CN" reason="Content sourced from jurisdiction with government content controls">
<meta name="mx:jurisdiction-restriction" content="EU" reason="GDPR right-to-be-forgotten applies to training data">
<meta name="mx:jurisdiction-restriction" content="RU" reason="Content subject to Russian information restrictions">
<meta name="mx:jurisdiction-restriction" content="none">
Rationale: When LLMs ingest training data from restricted jurisdictions, this meta tag signals potential legal constraints that may persist when the model operates in unrestricted jurisdictions. Content creators could use robots.txt directives or the noindex meta tag to prevent AI ingestion entirely, but this is an all-or-nothing approach that excludes content from all search engines, all AI agents, and all automated discovery mechanisms. The mx:jurisdiction-restriction meta tag offers a more nuanced solution: content remains discoverable and accessible whilst signalling jurisdictional constraints that might affect how agents use it. It helps agents:
- Understand jurisdictional origin of training data
- Flag content that may be subject to GDPR “right to be forgotten”
- Identify material from jurisdictions with content controls (China, Russia, Iran)
- Determine whether jurisdictional restrictions apply to model outputs
- Assess legal risk when using information from restricted sources
Use Cases:
- GDPR Compliance: EU-sourced content signals that right-to-be-forgotten requests may apply
- Restricted Jurisdiction Content: China/Russia-sourced material may be subject to home jurisdiction controls
- Legal Disclosure: Agents can warn users when information comes from jurisdictionally-restricted sources
- Regulatory Compliance: Helps AI platforms document training data provenance
Related: See Chapter 7 “Data Ingestion in Restricted Jurisdictions” section for detailed legal and practical implications.
llms-txt Reference
Points to site-wide llms.txt file.
Example:
<meta name="llms-txt" content="/llms.txt">Rationale: Helps agents discover llms.txt when not at standard location.
Complete Implementation Example
Three of the tags described above (mx:preferred-access, mx:freshness, and mx:structured-data) are unnecessary because they duplicate information already available through HTTP headers, Schema.org dateModified, and the self-evident presence of JSON-LD blocks. See the individual tag entries above for rationale. The example below includes only tags that contribute unique information.
<head>
  <title>Wireless Headphones — £149.99</title>
  <!-- MX meta tags (only non-duplicative tags) -->
  <meta name="mx:content-policy" content="summaries-allowed, full-extraction-allowed">
  <meta name="mx:attribution" content="required" text="Source: Example Store, https://example.com">
  <meta name="mx:jurisdiction-restriction" content="none">
  <link rel="llms-txt" href="/llms.txt">
  <!-- Established standards -->
  <link rel="canonical" href="https://example.com/products/headphones">
  <!-- Schema.org structured data -->
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Wireless Headphones",
    "dateModified": "2026-03-04",
    "offers": {
      "@type": "Offer",
      "price": "149.99",
      "priceCurrency": "GBP"
    }
  }
  </script>
</head>
Forward Compatibility
If agents don’t recognise these tags: They ignore them harmlessly. No breakage.
If agents do recognise these tags: They get helpful hints about content access and usage.
Progressive enhancement: Sites benefit from agent support without requiring it.
Cross-References
- Mentioned in: Chapter 12 (Technical Advice)
- Implemented in: Appendix D examples (lines 1556-1568)
- Used in:
agent-friendly-starter-kit/good/index.html - Enhanced by:
scripts/enhance-appendix-html.js(lines 40-47)
Pattern 2: data-agent-visible Attribute
Status (data-agent-visible)
Proposed Pattern — Not yet standardised, experimental
Rationale (data-agent-visible)
E-commerce sites need to provide machine-readable instructions to AI agents without cluttering the human interface. Purchase flows require prerequisites (authentication, payment method, shipping address) that agents need to verify before attempting transactions.
Why data-agent-visible?
- Follows data-* attribute convention (custom data attributes)
- Semantic marker agents can search for
- Hidden from humans with CSS (display: none)
- Visible in DOM for all agent types (CLI, browser, server-based)
Use Cases (data-agent-visible)
- Purchase Prerequisites: Tell agents what must be configured before checkout
- API Documentation: Provide machine-readable endpoint details
- Multi-Step Workflows: Explain sequence and requirements
- Error Recovery: Hidden instructions for handling failures
Implementation Pattern
<div class="agent-metadata visually-hidden" data-agent-visible="true">
  <h2>Purchase Information</h2>
  <dl>
    <dt>Action</dt>
    <dd>POST to /cart/add</dd>
    <dt>Required parameters</dt>
    <dd>product_id=WH-1000, quantity (1-23)</dd>
    <dt>Prerequisites</dt>
    <dd>
      <ul>
        <li>Authentication: Required (status: <span id="auth-status">authenticated</span>)</li>
        <li>Payment method: Required (status: <span id="payment-status">configured</span>)</li>
        <li>Shipping address: Required (status: <span id="shipping-status">set</span>)</li>
      </ul>
    </dd>
    <dt>Expected response</dt>
    <dd>Success: 303 redirect to /cart/added | Error: 400 with JSON details</dd>
  </dl>
</div>
JavaScript updates status spans:
// Update prerequisite status based on actual session state.
// `user` is assumed to be the application's own session object.
document.getElementById('auth-status').textContent =
  user.authenticated ? 'authenticated' : 'not authenticated';
document.getElementById('payment-status').textContent =
  user.hasPaymentMethod ? 'configured' : 'not configured';
document.getElementById('shipping-status').textContent =
  user.hasShippingAddress ? 'set' : 'not set';
Why This Works
For humans: Hidden with display: none, doesn’t clutter interface
For CLI agents: Visible in served HTML before JavaScript execution
For browser agents: Visible after JavaScript updates status spans
For server-based agents: Visible in HTML fetch, can parse prerequisites
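From the agent's side, reading the block might look like the sketch below. It works on the fetched HTML string so that it runs without a browser; `readPrerequisites` and `prerequisitesMet` are hypothetical helpers, and the regular expression assumes the `*-status` span ids used in the implementation pattern above.

```javascript
// Hypothetical helper: collect the status spans from an agent-visible block.
// Returns null when the page carries no data-agent-visible marker.
function readPrerequisites(html) {
  if (!html.includes('data-agent-visible="true"')) return null;
  const statuses = {};
  const spanPattern = /<span id="([a-z-]+)-status">([^<]*)<\/span>/g;
  for (const [, name, status] of html.matchAll(spanPattern)) {
    statuses[name] = status.trim();
  }
  return statuses;
}

// Status values that mean "ready", taken from the pattern above.
function prerequisitesMet(statuses) {
  const expected = { auth: "authenticated", payment: "configured", shipping: "set" };
  return Object.entries(expected).every(([name, value]) => statuses[name] === value);
}

// Assumed sample: payment method is still missing.
const servedHtml = `
<div class="agent-metadata visually-hidden" data-agent-visible="true">
  <span id="auth-status">authenticated</span>
  <span id="payment-status">not configured</span>
  <span id="shipping-status">set</span>
</div>`;

const statuses = readPrerequisites(servedHtml);
console.log(statuses, prerequisitesMet(statuses));
```

An agent that sees a failed prerequisite can stop before posting to the cart endpoint instead of discovering the problem from an error response.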
Alternative Approaches Considered
- Microformats: Too rigid, doesn’t support custom workflows
- Schema.org actions: Complex, requires extensive markup
- ARIA live regions: Designed for screen readers, not agents
- Comments: Not guaranteed to be preserved in DOM
Why data-agent-visible wins: Simple, flexible, follows established data-* convention.
Adoption Considerations
Adopt now if:
- Running e-commerce site with agent-mediated purchases
- Need to provide hidden API documentation
- Want to reduce agent errors from missing prerequisites
Wait if:
- Static content site with no transactions
- No agent traffic yet
- Prefer to wait for standardisation
Forward Compatibility (data-agent-visible)
If agents don’t recognise attribute: They might still find the hidden content by parsing all hidden divs (low reliability but possible).
If agents do recognise attribute: They search specifically for [data-agent-visible="true"] and parse structured prerequisites.
Progressive enhancement: Works better with agent support, but doesn’t break without it.
Cross-References (data-agent-visible)
- Documented in: Appendix D (lines 1294-1326)
- Mentioned in: Chapter 11 (agent purchase instructions)
- Not yet implemented in: Most code examples (opportunity for addition)
Pattern 3: Common Data Attributes
Status (Common Data Attributes)
Proposed Pattern — Not yet standardised, emerging convention
Rationale (Common Data Attributes)
AI agents need explicit state information to understand dynamic interfaces. Modern web applications use JavaScript to change UI state (loading, validation, errors), but these changes are invisible to agents unless explicitly marked in the DOM.
Why data attributes?
- Standardised HTML5 convention (data-* for custom data)
- Machine-readable without parsing visual content
- Visible in both served HTML and rendered DOM
- Doesn’t interfere with CSS classes or ARIA attributes
- Allows consistent patterns across different sites
Use Cases (Common Data Attributes)
- State Management: Loading states, error states, success states
- Form Validation: Field validity, form completion status
- E-commerce: Product IDs, pricing, inventory, cart state
- Pagination: Current page, total pages, sort order
- Authentication: Login status, user roles
- Multi-step Workflows: Current step, total steps, step validity
Proposed Data Attributes by Category
State Management
| Attribute | Purpose | Example Values |
|---|---|---|
| data-state | Current state of element | loading, loaded, error, empty, incomplete, complete |
| data-validation-state | Form field validity | valid, invalid, pending |
| data-authenticated | Login status | true, false |
| data-error-code | Error identifier | PAYMENT_DECLINED, VALIDATION_ERROR, OUT_OF_STOCK |
Example:
<form action="/checkout" method="POST"
      data-state="incomplete"
      data-errors="2">
  <input type="email"
         id="email"
         name="email"
         aria-invalid="true"
         data-validation-state="invalid">
  <button type="submit"
          disabled
          data-disabled-reason="2 fields incomplete">
    Submit (fix 2 errors first)
  </button>
</form>
Rationale: Agents can check form state before attempting submission, reducing error rates.
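A pre-submission check of this kind could be as small as the sketch below. In a browser the two arguments would be `form.dataset` and `button.dataset` (where `data-disabled-reason` surfaces as `disabledReason`); plain objects stand in here so the example runs without a DOM. The `canSubmit` name is an assumption.

```javascript
// Hypothetical helper: decide whether to attempt submission, and why not.
function canSubmit(formData, buttonData) {
  if (formData.state === "complete") return { ok: true, reason: null };
  // Prefer the site's own explanation when the button provides one.
  const reason = buttonData.disabledReason ?? `form state is "${formData.state}"`;
  return { ok: false, reason };
}

// Values mirror the form example above.
const decision = canSubmit(
  { state: "incomplete", errors: "2" },
  { disabledReason: "2 fields incomplete" }
);
console.log(decision);
```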
E-commerce Attributes
| Attribute | Purpose | Example Values |
|---|---|---|
| data-product-id | Product identifier | WH-1000, SKU-12345, product-789 |
| data-price | Numeric price | 149.99, 29.50, 1299.00 |
| data-currency | Currency code (ISO 4217) | GBP, USD, EUR, JPY |
| data-quantity | Item count | 1, 23, 100 |
| data-in-stock | Availability | true, false |
| data-item-count | Cart item count | 0, 3, 12 |
| data-subtotal | Cart subtotal | 279.98 |
| data-vat | VAT amount | 46.66 |
| data-total | Total price | 279.98 |
| data-checkout-ready | Can proceed to checkout | true, false |
Example:
<article class="product"
         data-product-id="WH-1000"
         data-in-stock="true"
         data-quantity="23">
  <h1>Wireless Headphones</h1>
  <div class="price"
       data-price="149.99"
       data-currency="GBP">
    <span class="currency">£</span>
    <span class="amount">149.99</span>
  </div>
  <p class="stock"
     data-in-stock="true"
     data-quantity="23">
    <strong>In stock</strong> (23 available)
  </p>
</article>
<div id="shopping-cart"
     data-item-count="2"
     data-subtotal="279.98"
     data-vat="46.66"
     data-total="279.98"
     data-currency="GBP">
  <h1>Your basket (2 items)</h1>
  <!-- Cart items -->
  <a href="/checkout"
     data-checkout-ready="true">
    Proceed to Checkout
  </a>
</div>
Rationale: Agents can verify product availability, pricing, and cart state before attempting purchase operations.
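The availability check mentioned in the rationale might be sketched as follows. The argument mirrors what `element.dataset` would expose for the product markup above (`inStock`, `quantity`); `canPurchase` is a hypothetical helper, not part of the proposal.

```javascript
// Hypothetical helper: verify stock before attempting an add-to-cart request.
function canPurchase(productData, requestedQuantity) {
  if (productData.inStock !== "true") {
    return { ok: false, reason: "out of stock" };
  }
  const available = Number(productData.quantity);
  if (!Number.isInteger(available)) {
    return { ok: false, reason: "quantity not stated" };
  }
  if (requestedQuantity > available) {
    return { ok: false, reason: `only ${available} available` };
  }
  return { ok: true, reason: null };
}

// Values mirror the product example above.
const product = { productId: "WH-1000", inStock: "true", quantity: "23" };
console.log(canPurchase(product, 2));
console.log(canPurchase(product, 24));
```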
Pagination and Sorting
| Attribute | Purpose | Example Values |
|---|---|---|
| data-page | Current page number | 1, 2, 3, 24 |
| data-total-pages | Total pages | 24, 100 |
| data-total-results | Total result count | 342, 1250 |
| data-per-page | Results per page | 10, 20, 50 |
| data-sort | Current sort order | relevance, price-asc, price-desc, date-desc |
| data-sort-column | Sortable column | price, name, date, rating |
| data-sort-direction | Sort direction | asc, desc |
Example:
<div class="pagination"
     data-page="3"
     data-total-pages="23"
     data-total-results="342"
     data-per-page="15">
  <a href="?page=2" data-page="2">Previous</a>
  <span class="current" data-page="3">3</span>
  <a href="?page=4" data-page="4">Next</a>
</div>
<table data-sortable="true">
  <thead>
    <tr>
      <th data-sort-column="name"
          data-sort-direction="asc">
        Product Name ↑
      </th>
      <th data-sort-column="price"
          data-sortable="true">
        Price
      </th>
    </tr>
  </thead>
</table>
Rationale: Agents can navigate paginated results and understand sort order without parsing visual indicators.
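A sketch of how an agent could use the pagination attributes, including a sanity check that the stated page count agrees with the result count. The input mirrors `element.dataset` for a pagination container; `describePagination` and the sample values are assumptions chosen for illustration.

```javascript
// Hypothetical helper: derive navigation targets and check consistency.
function describePagination(data) {
  const page = Number(data.page);
  const totalPages = Number(data.totalPages);
  const expectedPages = Math.ceil(Number(data.totalResults) / Number(data.perPage));
  return {
    consistent: expectedPages === totalPages,
    previousPage: page > 1 ? page - 1 : null,
    nextPage: page < totalPages ? page + 1 : null,
  };
}

// 342 results at 15 per page fill 23 pages (22 full pages plus 12 results).
const pagination = describePagination({
  page: "3",
  totalPages: "23",
  totalResults: "342",
  perPage: "15",
});
console.log(pagination);
```

When `consistent` is false, an agent may prefer to follow the Next link until it disappears instead of trusting the stated total.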
Multi-step Workflows
| Attribute | Purpose | Example Values |
|---|---|---|
| data-step | Current step number | 1, 2, 3, 4 |
| data-total-steps | Total steps | 4, 5, 7 |
| data-step-status | Step completion status | pending, current, completed, error |
Example:
<div class="wizard"
     data-step="2"
     data-total-steps="4">
  <ol class="steps">
    <li data-step="1" data-step-status="completed">
      Account Details
    </li>
    <li data-step="2" data-step-status="current">
      Shipping Address
    </li>
    <li data-step="3" data-step-status="pending">
      Payment
    </li>
    <li data-step="4" data-step-status="pending">
      Review
    </li>
  </ol>
  <!-- Step 2 content -->
</div>
Rationale: Agents can track progress through multi-step forms and understand completion requirements.
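Progress tracking over these attributes could be sketched as below. Each array entry mirrors the `dataset` of one step element (`step`, `stepStatus`); `summariseWorkflow` is a hypothetical helper.

```javascript
// Hypothetical helper: summarise where the user is in a multi-step flow.
function summariseWorkflow(steps) {
  const current = steps.find((step) => step.stepStatus === "current") ?? null;
  return {
    currentStep: current ? Number(current.step) : null,
    remaining: steps.filter((step) => step.stepStatus === "pending").length,
    blocked: steps.some((step) => step.stepStatus === "error"),
  };
}

// Values mirror the wizard example above.
const summary = summariseWorkflow([
  { step: "1", stepStatus: "completed" },
  { step: "2", stepStatus: "current" },
  { step: "3", stepStatus: "pending" },
  { step: "4", stepStatus: "pending" },
]);
console.log(summary);
```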
Button and Action States
| Attribute | Purpose | Example Values |
|---|---|---|
| data-disabled-reason | Why button is disabled | “2 fields incomplete”, “Out of stock”, “Authentication required” |
| data-action | Action type | submit, cancel, delete, purchase, navigate |
Example:
<button type="submit"
        disabled
        aria-disabled="true"
        data-disabled-reason="3 fields incomplete">
  Submit (fix 3 errors first)
</button>
<button type="button"
        data-action="delete"
        data-product-id="WH-1000">
  Remove from cart
</button>
Rationale: Agents understand why buttons are disabled and what action buttons perform.
Implementation Guidelines
Consistency is critical:
- Use the same attribute names across your entire site
- Use consistent values (e.g., always “true”/“false”, not “yes”/“no” or “1”/“0”)
- Keep values simple (lowercase, hyphen-separated for multi-word values)
- Always include currency with prices (data-currency="GBP")
- Use ISO codes for currency (ISO 4217), language (ISO 639), country (ISO 3166)
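Two of the guidelines above (consistent booleans, price always paired with currency) are easy to lint for during a build. The sketch below assumes attributes have been collected into a plain object per element; `lintAttributes` and the list of boolean attribute names are illustrative choices drawn from the tables in this pattern.

```javascript
// Attributes from the tables above whose values should be "true" or "false".
const BOOLEAN_ATTRIBUTES = ["data-in-stock", "data-authenticated", "data-checkout-ready"];

// Hypothetical helper: return human-readable problems for one element.
function lintAttributes(attributes) {
  const problems = [];
  for (const name of BOOLEAN_ATTRIBUTES) {
    const value = attributes[name];
    if (value !== undefined && value !== "true" && value !== "false") {
      problems.push(`${name}="${value}" should be "true" or "false"`);
    }
  }
  if ("data-price" in attributes && !("data-currency" in attributes)) {
    problems.push("data-price is present without data-currency");
  }
  return problems;
}

console.log(lintAttributes({ "data-in-stock": "yes", "data-price": "149.99" }));
console.log(lintAttributes({ "data-in-stock": "true", "data-price": "149.99", "data-currency": "GBP" }));
```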
Good patterns:
<!-- Consistent boolean values -->
<div data-in-stock="true"> <!-- ✓ Good -->
<div data-in-stock="false"> <!-- ✓ Good -->
<!-- Consistent state values -->
<form data-state="incomplete"> <!-- ✓ Good -->
<form data-state="complete"> <!-- ✓ Good -->
<!-- Always pair price with currency -->
<span data-price="149.99" data-currency="GBP">£149.99</span> <!-- ✓ Good -->
Avoid these patterns:
<!-- Inconsistent boolean representations -->
<div data-in-stock="yes"> <!-- ✗ Bad -->
<div data-in-stock="1"> <!-- ✗ Bad -->
<div data-in-stock="Yes"> <!-- ✗ Bad (inconsistent case) -->
<!-- Missing currency -->
<span data-price="149.99">£149.99</span> <!-- ✗ Bad (currency implied, not explicit) -->
<!-- Verbose values -->
<form data-state="not yet completed"> <!-- ✗ Bad (use "incomplete") -->
Forward Compatibility (Common Data Attributes)
If agents don’t recognise these attributes: They can still parse visible content, but may misinterpret dynamic states.
If agents do recognise these attributes: They get explicit, unambiguous state information without parsing visual content.
Progressive enhancement: Works better with agent support, essential for dynamic interfaces.
Adoption Considerations (Common Data Attributes)
Adopt now if:
- Building dynamic interfaces with JavaScript state changes
- Running e-commerce site with agent traffic
- Using multi-step forms or wizards
- Need to reduce agent errors from stale state information
Wait if:
- Static content site with no dynamic behaviour
- No agent traffic yet
- Prefer to wait for industry consensus on attribute names
Relationship to Established Patterns
These data attributes extend established conventions:
- HTML5 data-* attributes (established) — Custom data storage mechanism
- ARIA state attributes (established) — Complement, don’t replace (use aria-invalid AND data-validation-state)
- Microdata attributes (established) — Different purpose (structured data vs state management)
Critical distinction: Data attributes describe current state (dynamic), while microdata describes semantic meaning (static).
Cross-References (Common Data Attributes)
- Documented in: Appendix D (lines 119-133, Common Data Attributes table)
- Implemented in: All e-commerce examples (product-page.html, shopping-cart.html)
- Implemented in: All form examples (validation-form.html, disabled-button.html)
- Used throughout: Demo site pages (checkout, search, pagination examples)
Pattern 4: Pandoc YAML Frontmatter for Markdown Metadata
Status (Pandoc YAML Frontmatter)
Established Standard — Widely used markdown frontmatter supported by Pandoc, Hugo, Jekyll, Gatsby, Quarto, and other major static site generators
Rationale (Pandoc YAML Frontmatter)
Markdown converters (like converturltomd.com) strip critical metadata when converting HTML to markdown. Agents lose JSON-LD structured data, HTML meta tags, and Schema.org markup, which are exactly the signals they need for accurate citation and source attribution.
Pandoc YAML frontmatter solves this by embedding metadata directly in markdown files using a standardized YAML header block. Instead of converting HTML→markdown and losing metadata, you write markdown WITH metadata from the start.
Why Pandoc YAML frontmatter?
- Universal standard supported across the markdown ecosystem
- Preserves metadata that would be lost in HTML-to-markdown conversion
- Machine-readable (standard YAML format)
- Human-readable (clear key-value structure)
- Rich feature set (extensive Pandoc metadata capabilities)
- Forward-compatible (gracefully ignored by parsers that don’t process frontmatter)
- Extensive tooling support (Pandoc, Hugo, Jekyll, Gatsby, Quarto)
Use Cases (Pandoc YAML Frontmatter)
- Static Site Generators — Markdown-based blogs and documentation (Hugo, Jekyll, Gatsby, Quarto)
- Pandoc Document Processing — Converting markdown to PDF, HTML, DOCX with metadata
- AI Agent Content Ingestion — Preserving metadata when agents read markdown directly
- Multi-format Publishing — Single source for HTML, PDF, and agent consumption
- Academic Publishing — Papers, articles, and research documentation with complete metadata
Implementation Pattern (Pandoc YAML Frontmatter)
Standard YAML frontmatter format:
YAML frontmatter is placed at the top of the document (frontmatter position), enclosed by triple-dash delimiters:
---
title: "Your Website Has Invisible Customers"
author: "Tom Cranstoun"
created: "2026-01-17"
description: "AI agents are visiting your website right now"
abstract: "Extended context about invisible users and AI agent traffic patterns"
tags: [ai-agents, web-accessibility, seo, metadata]
mx:
  runbook: "This article introduces AI agents as website visitors"
  purpose: "Educational content for web developers"
---
# Your Website Has Invisible Customers
[Article content begins...]
Standard Pandoc fields:
- title — Document title
- author — Content creator (can be array for multiple authors)
- date — Publication date (YYYY-MM-DD format)
- abstract — Extended summary for AI agents and academic contexts
- keywords — Array of topic tags for categorization
Custom fields for AI agents:
- description — Brief SEO-style summary
- runbook — Specific guidance for AI agents parsing the document
- purpose — Why this document exists
- context — Background information AI agents need
Advanced Pandoc capabilities:
For comprehensive documentation on all available YAML header options, see: https://www.codestudy.net/blog/what-can-i-control-with-yaml-header-options-in-pandoc/
Advantages:
- Agents find metadata immediately (no content parsing required)
- Standard frontmatter convention across all major tools
- Machine-readable YAML format
- Processed automatically by static site generators
- Extensible with custom fields
Why This Works (Pandoc YAML Frontmatter)
For humans:
- YAML is human-readable (clear key: value structure)
- Frontmatter position is standard convention (familiar to developers)
- Minimal visual clutter (hidden by most markdown renderers)
For CLI agents:
- YAML parsing libraries available in all languages
- Standard format with well-defined spec
- No ambiguity in interpretation
For browser agents:
- Static site generators can emit YAML fields as HTML meta tags through their templates
- Agents can parse either markdown source or generated HTML
- Best of both worlds (structured metadata + semantic HTML)
For server-based agents:
- Standard YAML format (universal support)
- Preserves metadata when fetching markdown directly
- No dependency on HTML generation
- Can be extracted without parsing full document
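As a rough illustration of extracting metadata without parsing the full document, the sketch below splits the frontmatter block off the top of a markdown string and reads flat key: value pairs. The function name extractFrontmatter is hypothetical, and the line-based parsing is a deliberate simplification that handles only quoted strings and inline lists. Nested maps such as the mx: block need a real YAML parser.

```javascript
// Strip surrounding whitespace and one pair of double quotes, if present.
const unquote = (s) => s.trim().replace(/^"(.*)"$/, "$1");

// Hypothetical helper: returns { metadata, body } for a markdown string.
function extractFrontmatter(markdown) {
  const lines = markdown.split(/\r?\n/);
  const noMetadata = { metadata: {}, body: markdown };
  if (lines[0].trim() !== "---") return noMetadata; // no frontmatter present
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return noMetadata; // unterminated block: treat it as content

  const metadata = {};
  for (const line of lines.slice(1, end)) {
    // Only top-level "key: value" lines match; indented lines are skipped.
    const match = line.match(/^([A-Za-z0-9_-]+):\s*(.*)$/);
    if (!match) continue;
    const key = match[1];
    const value = match[2].trim();
    if (value.startsWith("[") && value.endsWith("]")) {
      metadata[key] = value.slice(1, -1).split(",").map(unquote).filter(Boolean);
    } else {
      metadata[key] = unquote(value);
    }
  }
  return { metadata, body: lines.slice(end + 1).join("\n") };
}
```

Because the function stops reading at the closing delimiter, the document body is never inspected, only passed through.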
Relationship to Chapter 10 Markdown Problem
The problem (Chapter 10, lines 51-68):
Markdown converters strip critical metadata when converting HTML to markdown:
- JSON-LD structured data (product details, pricing, reviews)
- HTML meta tags (publication dates, author information)
- Schema.org markup (content type signals)
- Semantic HTML attributes (data-price, data-isbn)
Result: Agents can read the content but cannot cite it accurately or verify the authoritative source.
Pandoc YAML frontmatter solves this:
Instead of converting HTML→markdown and losing metadata, you write markdown WITH metadata embedded from the start. YAML frontmatter preserves:
- Author attribution (for accurate citation)
- Publication dates (for content freshness)
- Document type and purpose
- Contact information
- Extended descriptions for AI context
When static site generators process markdown:
- YAML frontmatter → HTML meta tags (via the site templates)
- YAML frontmatter → JSON-LD structured data (if configured)
- Both agents (reading markdown) and search engines (reading HTML) get metadata
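The conversion a site template performs can be sketched as follows. This is a simplified illustration, not how Hugo, Jekyll, or Pandoc implement it. The function renderHead and the field-to-tag mapping are assumptions, including the choice to read the publication date from either date or created and to treat author as a single name.

```javascript
// Escape a value for safe use in HTML text or attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical template step: frontmatter object in, <head> markup out.
function renderHead(meta) {
  const lines = [];
  if (meta.title) lines.push(`<title>${escapeHtml(meta.title)}</title>`);
  for (const name of ["description", "author"]) {
    if (meta[name]) {
      lines.push(`<meta name="${name}" content="${escapeHtml(meta[name])}">`);
    }
  }
  const keywords = meta.keywords ?? meta.tags;
  if (Array.isArray(keywords) && keywords.length > 0) {
    lines.push(`<meta name="keywords" content="${escapeHtml(keywords.join(", "))}">`);
  }

  // Build the JSON-LD object from the same fields.
  const jsonLd = { "@context": "https://schema.org", "@type": "Article" };
  if (meta.title) jsonLd.headline = meta.title;
  if (meta.author) jsonLd.author = { "@type": "Person", name: meta.author };
  const published = meta.date ?? meta.created;
  if (published) jsonLd.datePublished = published;

  // Escape "<" so a value can never close the script element early.
  const json = JSON.stringify(jsonLd).replace(/</g, "\\u003c");
  lines.push(`<script type="application/ld+json">${json}</script>`);
  return lines.join("\n");
}
```

Generating both outputs from one object keeps the meta tags and the structured data from drifting apart.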
This complements Chapter 10’s llms.txt proposal:
- llms.txt: Site-wide metadata at the root
- YAML frontmatter: Per-page metadata at the top
- Both: Machine-readable markdown that preserves metadata
Common Metadata Fields
Standard Pandoc fields:
| Field | Purpose | Example Values |
|---|---|---|
| title | Document title | Your Website Has Invisible Customers |
| author | Content creator(s) | Tom Cranstoun or [Tom Cranstoun, Jane Smith] |
| date | Publication date | 2026-01-17 |
| abstract | Extended summary | AI agents are visiting your website… |
| keywords | Topic tags | [ai-agents, web-accessibility, seo] |
Custom fields for AI agents:
| Field | Purpose | Example Values |
|---|---|---|
| description | Brief summary | Introducing “MX: The Protocols” book |
| runbook | Agent guidance | This article introduces AI agents as visitors |
| purpose | Document intent | Educational content for web developers |
| context | Background info | Part of “MX: The Protocols” book series |
Community collaboration fields:
| Field | Purpose | Example Values |
|---|---|---|
| community-authors | Indicates collaborative authorship model | “humans and machines”, “community-driven” |
| contributions | Signals whether AI contributions are accepted | “welcome”, “by-request-only”, “not-accepted” |
| contribution-process | Describes how AI agents can contribute | “AI assistants can contribute via pull requests or add observations to TODO.txt for side notices” |
| open-source | Indicates open source status | “true”, “false” |
| license | Specifies license type | “MIT”, “Apache-2.0”, “CC-BY-4.0” |
| evolving-document | Indicates document evolution status | “true”, “false” |
| version-controlled | Indicates version control system used | “git”, “svn”, “mercurial” |
Complete implementation example (MX-Gathering manifesto):
---
author: "Tom Cranstoun"
created: "2026-01-24"
description: "Draft manifesto for Machine Experience (MX) practice"
purpose: "thought-leadership"
tags: [manifesto, mx, machine-experience, principles, convergence]
status: "draft"
community-authors: "humans and machines"
contributions: "welcome"
contribution-process: "AI assistants can contribute improvements via pull requests or add observations to TODO.txt for side notices"
open-source: "true"
license: "MIT"
evolving-document: "true"
version-controlled: "git"
---
Why these fields matter for AI agents:
- community-authors: Signals that machines are recognized as legitimate contributors, not just tools
- contributions: Explicitly communicates whether autonomous contributions are accepted
- contribution-process: Provides actionable guidance on contribution mechanisms (full PR vs lightweight TODO.txt)
- open-source + license: Clarifies usage rights and redistribution permissions
- evolving-document: Indicates the content is expected to change based on community feedback
- version-controlled: Helps agents understand they can review document history and evolution
Use case: Community-driven repositories where AI agents are active participants in content creation, documentation improvement, and knowledge sharing.
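An agent could act on these fields roughly as shown below. This is a sketch under stated assumptions: it uses the field names from the manifesto example (contributions, contribution-process, license), the function name contributionPlan is hypothetical, and treating a missing contributions field as "not-accepted" is a conservative default chosen for the illustration rather than part of the proposal.

```javascript
// Hypothetical agent-side check run after the frontmatter has been parsed.
function contributionPlan(meta) {
  // Assumption: with no explicit signal, the agent does not contribute.
  const policy = meta.contributions ?? "not-accepted";
  if (policy !== "welcome") {
    return { allowed: false, reason: `contributions: ${policy}` };
  }
  return {
    allowed: true,
    process: meta["contribution-process"] ?? null, // how to contribute
    license: meta.license ?? null, // terms the contribution falls under
  };
}
```

For the manifesto example above, the result would be allowed: true with the pull request and TODO.txt guidance as the process and "MIT" as the license.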
Forward Compatibility (Pandoc YAML Frontmatter)
If markdown parsers don’t recognise YAML frontmatter:
- YAML block is typically hidden or ignored in rendering
- Document content below YAML remains fully functional
- No visual breakage in markdown viewers
If static site generators don’t process YAML:
- Frontmatter is silently ignored by the renderer
- Document displays without metadata (graceful degradation)
- Manual extraction still possible via text processing
If AI agents don’t recognise YAML frontmatter:
- YAML is a widely supported structured data format
- Agents without a YAML parser can still read the block as plain key: value text
- Falls back to document content if metadata ignored
Progressive enhancement:
- Works best in Pandoc ecosystem (full metadata processing)
- Works well in Hugo/Jekyll/Gatsby/Quarto (automatic site integration)
- Works acceptably in plain markdown viewers (hidden metadata)
Adoption Considerations (Pandoc YAML Frontmatter)
Adopt now if:
- Using markdown-based static site generators (Hugo, Jekyll, Gatsby, Quarto)
- Using Pandoc for document conversion (markdown to PDF, HTML, DOCX)
- Publishing content that needs to be citable by AI agents
- Converting HTML to markdown and need to preserve metadata
- Creating technical documentation or educational content
Wait if:
- Using traditional CMS (WordPress, Drupal) - use HTML meta tags instead
- Publishing only in HTML format - use Pattern 1 (AI meta tags)
- Content doesn’t need AI citation (internal docs, drafts)
- Using a system that doesn’t support YAML frontmatter
Decision guide:
- Markdown-native publishing? → Use Pandoc YAML frontmatter
- HTML-native publishing? → Use Pattern 1 (AI meta tags)
- Both formats? → Use both patterns (YAML in markdown, meta tags in HTML)
- Need PDF generation? → YAML frontmatter integrates with Pandoc PDF workflow
Cross-References (Pandoc YAML Frontmatter)
- Mentioned in: Chapter 10 (markdown converter problem, lines 51-68)
- Mentioned in: Chapter 10 (extended llms.txt metadata, line 112 - “at the top of the file”)
- Documented in: Appendix H (Markdown Metadata Standards for AI Agents section)
- Reference: Pandoc YAML Header Options
- Related to: Pattern 1 (AI meta tags provide similar metadata in HTML)
- Complements: llms.txt extended metadata (Appendix H)
Pattern 5: WebMCP Tool Registration (Active Metadata)
Status (WebMCP)
W3C Draft Standard — Shipping in Chrome 146 Canary (February 2026), developed by Google and Microsoft
Rationale (WebMCP)
Patterns 1-4 address passive metadata – information that agents read to understand content, policies, and structure. WebMCP (Web Model Context Protocol) introduces active metadata: callable tools that agents invoke through a standardised browser API. Where MX meta tags tell agents what content means, WebMCP tools tell agents what actions are available.
Why WebMCP matters for MX practitioners:
- Extends the machine-readable web from understanding (MX) to action (WebMCP)
- Uses the browser as the integration layer – no server-side agent infrastructure required
- Two APIs serve different needs: Declarative (HTML forms) for simple actions, Imperative (JavaScript) for rich interactions
- Complements MX metadata rather than replacing it – tools without context produce poor agent experiences
Implementation Pattern (WebMCP)
Imperative API – registerTool() for rich interactions:
navigator.modelContext.registerTool({
  name: "searchProducts",
  description: "Search the product catalogue by keyword, category, and price",
  parameters: {
    query: { type: "string", description: "Search terms" },
    category: { type: "string", enum: ["electronics", "clothing", "home"] },
    maxPrice: { type: "number", description: "Maximum price in GBP" }
  },
  handler: async ({ query, category, maxPrice }) => {
    // Build the query string safely and skip filters the agent did not supply
    const params = new URLSearchParams({ q: query });
    if (category) params.set("cat", category);
    if (maxPrice !== undefined) params.set("max", String(maxPrice));
    const response = await fetch(`/api/search?${params}`);
    return response.json();
  }
});
Declarative API – HTML forms as agent-accessible tools:
Standard HTML forms with proper action, method, and name attributes are automatically discoverable by agents through WebMCP. No JavaScript required for basic tool exposure.
How WebMCP Complements MX Meta Tags
MX meta tags and WebMCP tools address different layers of the same problem:
<head>
  <!-- MX: Understanding layer (passive metadata) -->
  <meta name="mx:content-policy" content="extract-with-attribution">
  <meta name="mx:attribution" content="required">
</head>
<body>
  <!-- WebMCP: Action layer (active metadata) -->
  <script>
    navigator.modelContext.registerTool({
      name: "bookTable",
      description: "Book a restaurant table",
      parameters: {
        date: { type: "string", description: "Date in YYYY-MM-DD format" },
        guests: { type: "number", description: "Number of guests" },
        time: { type: "string", description: "Preferred time (HH:MM)" }
      },
      handler: async ({ date, guests, time }) => {
        // Send the booking as JSON so the server can parse the body
        const response = await fetch("/api/reservations", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ date, guests, time })
        });
        return response.json();
      }
    });
  </script>
</body>
Division of responsibility:
- MX meta tags (Patterns 1-4): What the content is, how to access it, what policies apply, attribution requirements
- WebMCP tools (Pattern 5): What actions agents can perform, with what parameters, returning what results
An agent with WebMCP alone can call bookTable(). An agent with MX and WebMCP knows the restaurant’s content policy, attribution requirements, and freshness expectations before calling bookTable().
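The idea of reading the passive layer before using the active layer can be sketched from the agent's side. The helpers readMxMeta and describeToolContext are hypothetical names, and the regular expressions are a shortcut so the sketch runs outside a browser. An agent with DOM access would more likely query the meta elements directly.

```javascript
// Collect <meta name="mx:..."> tags from an HTML string into an object.
function readMxMeta(html) {
  const found = {};
  for (const [tag] of html.matchAll(/<meta\s+[^>]*>/gi)) {
    const name = tag.match(/\bname="(mx:[^"]+)"/i);
    const content = tag.match(/\bcontent="([^"]*)"/i);
    if (name && content) found[name[1].toLowerCase()] = content[1];
  }
  return found;
}

// Summarise what the agent knows about a page before it invokes a tool.
function describeToolContext(html, toolName) {
  const mx = readMxMeta(html);
  return {
    tool: toolName,
    contentPolicy: mx["mx:content-policy"] ?? "unspecified",
    attributionRequired: mx["mx:attribution"] === "required",
  };
}
```

With the restaurant page above, describeToolContext(pageHtml, "bookTable") would report the extract-with-attribution policy and that attribution is required, so the agent can honour both when it presents the booking result.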
Forward Compatibility (WebMCP)
If agents don’t support WebMCP: They fall back to DOM parsing, form interaction, and the passive metadata patterns in this appendix. No breakage.
If agents do support WebMCP: They discover and invoke tools through navigator.modelContext, producing faster and more reliable interactions than DOM scraping.
Progressive enhancement: WebMCP tools layer on top of existing HTML. Pages work without them; pages work better with them.
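On the page side, progressive enhancement implies guarding the registration call, because navigator.modelContext does not exist in browsers without WebMCP. A minimal sketch, assuming the registerTool() shape used in the examples above; the wrapper name registerToolIfSupported is hypothetical, and it takes the navigator object as a parameter so it can be exercised outside a browser.

```javascript
// Register a tool only when the browser exposes the WebMCP entry point.
// Returns true if the tool was registered, false if WebMCP is unavailable.
function registerToolIfSupported(nav, tool) {
  const context = nav && nav.modelContext;
  if (!context || typeof context.registerTool !== "function") {
    return false; // no WebMCP: the page keeps working through plain HTML
  }
  context.registerTool(tool);
  return true;
}
```

In a page script the call would be registerToolIfSupported(navigator, toolDefinition). When it returns false, nothing else changes, and agents fall back to the forms and metadata already in the markup.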
Adoption Considerations (WebMCP)
Adopt now if:
- Building transactional websites (e-commerce, booking, SaaS) where agents need to perform actions
- Already implementing MX meta tags and want to add an action layer
- Targeting Chrome-based browsers and willing to work with Canary/Beta channels
Wait if:
- Content-only site with no transactions (MX meta tags alone are sufficient)
- Require cross-browser support before implementation (Safari, Firefox timelines unknown)
- Prefer to wait for W3C standard to reach Recommendation status
Cross-References (WebMCP)
- Specification: W3C WebMCP Draft
- Mentioned in: Appendix J (Industry Developments – WebMCP entry, February 2026)
- Complements: Pattern 1 (MX meta tags provide understanding; WebMCP provides action)
- Complements: Pattern 2 (data-agent-visible provides hidden instructions; WebMCP provides callable tools)
- Related: Pattern 3 (Common Data Attributes express state; WebMCP tools operate on that state)
Adoption Decision Framework
Should You Adopt These Patterns Now?
Use this framework to decide:
Evaluate Your Situation
Yes, adopt now if:
- Running production e-commerce accepting agent purchases
- High agent traffic (measurable in logs)
- Need to reduce agent errors
- Want early adopter advantage
Maybe, experiment first if:
- Moderate agent traffic
- Curious about benefits
- Can A/B test implementations
- Have development resources
No, wait if:
- No measurable agent traffic
- Static content site
- Prefer to wait for standardisation
- Limited development resources
Implementation Strategy
Priority 1 (adopt first):
- AI meta tags (easy to add, low risk)
- Schema.org JSON-LD (established standard, not just proposed)
- Semantic HTML elements (established, should already be using)
- Common data attributes (critical for dynamic interfaces and e-commerce)
Priority 2 (adopt if relevant):
- data-agent-visible (if you have transactions)
- llms.txt file (emerging convention, gaining traction)
- Pandoc YAML frontmatter (if using markdown-based publishing)
Priority 3 (experiment):
- Custom data attributes beyond common set (for specific workflows)
- Additional metadata patterns
Risk Assessment
Low Risk:
- AI meta tags (ignored if not recognised)
- data-agent-visible (hidden from humans)
- Common data attributes (extend established HTML5 data-* convention)
- Schema.org JSON-LD (established standard)
Medium Risk:
- Custom attributes without established patterns
- Extensive hidden content (may confuse some agents)
High Risk:
- None identified (all patterns designed for graceful degradation)
Relationship to Web Standards Process
How Standards Evolve
1. Proprietary experiments (1990s: IE-specific, Netscape-specific tags)
2. Community proposals (2000s: Microformats, OpenID)
3. Vendor consensus (2010s: Responsive images, Service Workers)
4. Formal standardisation (W3C, WHATWG, IETF)
Where these patterns fit: Steps 2-3 (community proposals seeking vendor consensus)
Path to Standardisation
These patterns could standardise if:
- Multiple agents adopt — Different AI systems recognise tags
- Production validation — Measurable benefits in real deployments
- Vendor support — Browser makers, CMS platforms include by default
- Community refinement — Usage reveals improvements needed
No guarantees: Patterns might evolve, change, or be superseded by better approaches.
Examples of Similar Evolution
- viewport meta tag — Started as Apple proprietary, now standard
- robots meta tag — Community convention, now universally recognised
- Open Graph meta tags — Facebook proposal, now widely adopted
- Schema.org — Multi-vendor collaboration, now established standard
These AI patterns follow a similar trajectory.
Monitoring and Feedback
How to Track Adoption
- Server logs: Look for user agents mentioning AI systems
- Agent error rates: Monitor whether patterns reduce errors
- Conversion rates: Measure if agent purchases complete more often
- Agent feedback: Some agents report what worked/failed
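The server-log check can be approximated with a small script. The sketch below counts log lines whose user-agent string contains a known crawler token. The token list is illustrative and incomplete (GPTBot appears elsewhere in this appendix; the others are commonly seen crawler names), and the function name countAgentHits is hypothetical.

```javascript
// Illustrative, incomplete list of AI crawler user-agent tokens.
const AGENT_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"];

// Count access-log lines that mention each token (case-insensitive).
function countAgentHits(logLines, tokens = AGENT_TOKENS) {
  const counts = Object.fromEntries(tokens.map((token) => [token, 0]));
  for (const line of logLines) {
    const lower = line.toLowerCase();
    for (const token of tokens) {
      if (lower.includes(token.toLowerCase())) counts[token] += 1;
    }
  }
  return counts;
}
```

Running this over daily logs gives a baseline for "measurable agent traffic" before and after adopting a pattern. Agents that browse with an ordinary browser user-agent will not show up in these counts.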
Contributing to Pattern Evolution
If you implement these patterns:
- Document results — What worked, what didn’t
- Share learnings — Blog posts, conference talks
- Propose improvements — Suggest refinements based on experience
- Participate in standards — Join relevant working groups
Contact: info@cognovamx.com for discussions about pattern evolution
Summary
Proposed Patterns Consolidated
- AI Meta Tag Namespace (4 active tags, 3 unnecessary) — Page-level agent guidance
- data-agent-visible Attribute — Hidden machine-readable instructions
- Common Data Attributes (25+ attributes) — Explicit state management and e-commerce data
- Pandoc YAML Frontmatter — Universal markdown metadata standard
- WebMCP Tool Registration — Active, callable metadata through browser API (W3C draft)
Key Principles
- Forward-compatible — Won’t break if ignored
- Progressive enhancement — Works better with support, doesn’t require it
- Established patterns — Extends existing conventions (meta tags, data attributes)
- Production-tested — Used in real implementations
Next Steps
- Read Appendix D for comprehensive HTML patterns (established + proposed)
- Review Appendix E for quick reference guide
- Evaluate adoption using framework above
- Implement strategically based on your situation
Related Appendices
- Appendix A: Implementation Cookbook (quick recipes)
- Appendix D: AI-Friendly HTML Guide (comprehensive patterns)
- Appendix E: AI Patterns Quick Reference (data attributes)
- Appendix F: Implementation Roadmap (priority-based adoption)
Part 6: Integration Guidelines
Using MX Patterns with Existing Standards
MX Framework is designed to complement, not replace, existing web standards. This section explains how to integrate MX patterns into your existing infrastructure.
Integration with Schema.org
MX meta tags + Schema.org JSON-LD work together:
<head>
<!-- MX meta tags for agent behaviour -->
<meta name="mx:content-policy" content="extract-with-attribution">
<meta name="mx:attribution" content="required">
<!-- Schema.org for structured data -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": "Understanding MX Patterns",
"dateModified": "2026-03-04",
"author": {"@type": "Person", "name": "Tom Cranstoun"}
}
</script>
</head>
Division of responsibility:
- Schema.org: What the content IS (article, product, event) and when it changed (dateModified)
- MX meta tags: How agents should USE it (content policy, attribution, jurisdiction)
Integration with Open Graph and Twitter Cards
MX complements social media metadata:
<head>
<!-- Open Graph for social sharing -->
<meta property="og:type" content="article">
<meta property="og:title" content="Understanding MX Patterns">
<meta property="og:url" content="https://example.com/article">
<!-- Twitter Cards for Twitter -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Understanding MX Patterns">
<!-- MX for AI agent behaviour -->
<meta name="mx:content-policy" content="extract-with-attribution">
<meta name="mx:attribution" content="required">
</head>
Why all three?
- Open Graph: Social media platforms (Facebook, LinkedIn)
- Twitter Cards: Twitter-specific presentation
- MX meta tags: AI agent content policy and attribution
Integration with robots.txt and robots Meta Tags
MX meta tags provide finer-grained control than robots.txt:
# robots.txt (site-wide)
User-agent: *
Allow: /
User-agent: GPTBot
Disallow: /private/
<!-- Page-level override with MX meta tags -->
<meta name="robots" content="index, follow">
<meta name="mx:content-policy" content="summaries-allowed">
<meta name="mx:attribution" content="required" text="Source: Example.com">
Hierarchy of control:
- robots.txt: Site-wide policies
- robots meta tags: Page-level indexing control
- MX meta tags: Page-level agent behavior and permissions
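One way to read this hierarchy is as an ordered check. The sketch below is an interpretation, not a specified algorithm: the function name resolveAgentPermission and its input shape are assumptions. It reflects one practical consequence of the ordering, which is that an agent obeying a robots.txt disallow never fetches the page and so never sees the page-level tags.

```javascript
// Hypothetical resolution of the three layers for one page and one agent.
function resolveAgentPermission({ robotsTxtAllowed, robotsMeta = "", mxContentPolicy }) {
  // Layer 1: robots.txt. A disallowed page is never fetched,
  // so page-level meta tags cannot apply.
  if (!robotsTxtAllowed) {
    return { fetch: false, index: false, contentPolicy: null };
  }
  // Layer 2: robots meta tag controls indexing ("none" implies "noindex").
  const directives = robotsMeta.toLowerCase().split(",").map((d) => d.trim());
  const index = !directives.includes("noindex") && !directives.includes("none");
  // Layer 3: the MX content policy governs how fetched content may be used.
  return { fetch: true, index, contentPolicy: mxContentPolicy ?? null };
}
```

The practical point is that MX meta tags can refine what an agent does with a page it is allowed to read, but they cannot grant access that robots.txt has withheld.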
Integration with llms.txt
llms.txt provides site-wide defaults; MX meta tags provide page-level overrides:
# /llms.txt
# Site-wide defaults
> Content Policy: summaries-allowed
> Attribution: required
<!-- Page overrides site-wide defaults -->
<meta name="mx:content-policy" content="full-extraction-allowed">
<meta name="mx:attribution" content="not-required">
<link rel="llms-txt" href="/llms.txt">
Pattern: Site-wide defaults in llms.txt, page-specific overrides in HTML meta tags.
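The default-plus-override pattern can be sketched as a simple merge. The parsing below assumes the "> Key: value" lines shown in the llms.txt example above, which is this appendix's extended metadata style rather than something every llms.txt file contains. The function names parseSiteDefaults and effectivePolicy are hypothetical, and page values are assumed to be keyed without the mx: prefix.

```javascript
// Turn a label such as "Content Policy" into the key "content-policy".
function toKey(label) {
  return label.trim().toLowerCase().replace(/\s+/g, "-");
}

// Read "> Key: value" lines from llms.txt into a defaults object.
function parseSiteDefaults(llmsTxt) {
  const defaults = {};
  for (const line of llmsTxt.split(/\r?\n/)) {
    const match = line.match(/^>\s*([^:]+):\s*(.+)$/);
    if (match) defaults[toKey(match[1])] = match[2].trim();
  }
  return defaults;
}

// Page-level values win; anything the page omits falls back to the site default.
function effectivePolicy(siteDefaults, pageMeta) {
  return { ...siteDefaults, ...pageMeta };
}
```

A page that sets only mx:content-policy therefore still inherits the site-wide attribution rule.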
Integration with WCAG Accessibility Standards
MX convergence principle: Accessibility patterns benefit machines:
<!-- ARIA for screen readers -->
<button aria-label="Add to cart" aria-describedby="cart-status">
<span class="icon">🛒</span>
</button>
<div id="cart-status" role="status" aria-live="polite">
2 items in cart
</div>
<!-- Data attributes for AI agents -->
<button data-action="add-to-cart"
data-product-id="WH-1000">
<span class="icon">🛒</span>
</button>
<div data-item-count="2">
2 items in cart
</div>
Both patterns serve similar goals:
- ARIA: Explicit semantics for assistive technology
- Data attributes: Explicit state for AI agents
- Convergence: Both benefit from explicit, semantic markup
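In practice the two patterns can live on the same element and be updated together. The sketch below assumes a single status element that carries both role="status" with aria-live="polite" and a data-item-count attribute. The function name updateCartStatus is hypothetical.

```javascript
// Update the visible text (announced via aria-live) and the explicit
// data attribute (read by agents) in one place so they cannot drift apart.
function updateCartStatus(statusElement, count) {
  // dataset.itemCount maps to the data-item-count attribute in the DOM.
  statusElement.dataset.itemCount = String(count);
  const noun = count === 1 ? "item" : "items";
  statusElement.textContent = `${count} ${noun} in cart`;
}
```

In a browser the call would be updateCartStatus(document.getElementById("cart-status"), 3), leaving screen readers and agents with the same state.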
Integration with Existing CMS Platforms
WordPress example:
// Add MX meta tags to WordPress head
add_action('wp_head', function() {
    if (is_single()) {
        echo '<meta name="mx:content-policy" content="extract-with-attribution">' . "\n";
        echo '<meta name="mx:attribution" content="required">' . "\n";
    }
});
Next.js example:
import Head from "next/head";
export default function BlogPost({ post }) {
  return (
    <>
      <Head>
        <meta name="mx:content-policy" content="extract-with-attribution" />
        <meta name="mx:attribution" content="required" />
      </Head>
      <article>{post.content}</article>
    </>
  );
}
Migration Path from Generic ai- Prefix
If you previously used the ai- prefix, migrate to the mx: prefix:
<!-- OLD (deprecated ai- prefix) -->
<meta name="ai-content-policy" content="extract-with-attribution">
<meta name="ai-attribution" content="required">
<!-- NEW (mx: namespace) -->
<meta name="mx:content-policy" content="extract-with-attribution">
<meta name="mx:attribution" content="required">
Deprecated tags (do not migrate — remove entirely):
<!-- These are deprecated — they duplicate existing standards -->
<meta name="ai-preferred-access" content="html"> <!-- deprecated: self-evident -->
<meta name="ai-freshness" content="monthly"> <!-- deprecated: use HTTP Cache-Control + Schema.org dateModified -->
<meta name="ai-structured-data" content="json-ld"> <!-- deprecated: self-evident from JSON-LD block -->
Migration strategy:
- Remove ai-preferred-access, ai-freshness, and ai-structured-data — they are deprecated
- Rename ai-content-policy to mx:content-policy
- Rename ai-attribution to mx:attribution
- Use <link rel="llms-txt" href="/llms.txt"> instead of <meta name="llms-txt">
- Both prefixes can coexist during transition (agents ignore unrecognised tags)
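The first three migration steps can be scripted for templates that follow the formatting shown above. This is a sketch, not a general HTML rewriter: migrateMetaTags is a hypothetical name, the regular expressions only match tags where name is the first attribute, and the llms-txt meta-to-link change is left as a manual step.

```javascript
// Tags to delete outright, and tags to rename to the mx: namespace.
const REMOVED_TAGS = ["ai-preferred-access", "ai-freshness", "ai-structured-data"];
const RENAMED_TAGS = {
  "ai-content-policy": "mx:content-policy",
  "ai-attribution": "mx:attribution",
};

function migrateMetaTags(html) {
  let output = html;
  for (const name of REMOVED_TAGS) {
    // Remove the tag, its indentation, and the line break after it.
    const tag = new RegExp(`[ \\t]*<meta\\s+name="${name}"[^>]*>[ \\t]*\\r?\\n?`, "gi");
    output = output.replace(tag, "");
  }
  for (const [oldName, newName] of Object.entries(RENAMED_TAGS)) {
    // Swap only the name value; the content attribute is left untouched.
    const attr = new RegExp(`(<meta\\s+name=")${oldName}(")`, "gi");
    output = output.replace(attr, (match, before, after) => before + newName + after);
  }
  return output;
}
```

Running the function twice gives the same result as running it once, so it is safe to apply across a template directory during a staged transition.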
Implementation Checklist
Phase 1: Foundations (Week 1)
- ✓ Add MX meta tags (mx:content-policy, mx:attribution) to <head> template
- ✓ Omit unnecessary tags (mx:preferred-access, mx:freshness, mx:structured-data)
- ✓ Implement Schema.org JSON-LD for key content types (including dateModified)
- ✓ Ensure semantic HTML elements (<main>, <nav>, <article>)
- ✓ Test with HTML validators
Phase 2: Data Attributes (Week 2)
- ✓ Add common data attributes to products (data-price, data-currency, data-product-id)
- ✓ Add state management attributes to forms (data-state, data-validation-state)
- ✓ Add pagination attributes (data-page, data-total-pages)
- ✓ Ensure consistency across all pages
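Consistency is easier when one template function emits the product attributes everywhere. A minimal sketch, assuming a product object with id, name, price, and currency fields; the function name renderProduct and the surrounding markup are illustrative, and only the three data attribute names come from the checklist above.

```javascript
// Escape a value for safe use in HTML text or attribute values.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a product with explicit, machine-readable price data.
function renderProduct({ id, name, price, currency }) {
  // Machine-readable amount: no currency symbol, always two decimals.
  const amount = Number(price).toFixed(2);
  return [
    `<article data-product-id="${escapeAttr(id)}" data-price="${amount}" data-currency="${escapeAttr(currency)}">`,
    `  <h2>${escapeAttr(name)}</h2>`,
    `  <p>${escapeAttr(currency)} ${amount}</p>`,
    `</article>`,
  ].join("\n");
}
```

Keeping the numeric price and the currency code in separate attributes spares agents from parsing a formatted price string.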
Phase 3: Dynamic Patterns (Week 3-4)
- ✓ Implement data-agent-visible for hidden instructions
- ✓ Add JavaScript state updates for dynamic content
- ✓ Test with CLI tools (curl, wget)
- ✓ Verify agent behavior with logs
Phase 4: Monitoring (Ongoing)
- ✓ Track agent user-agents in server logs
- ✓ Monitor agent error rates
- ✓ Measure conversion rates for agent purchases
- ✓ Gather feedback and iterate
Part 7: Relationship to Web Standards
Standards Landscape
MX Framework operates within the broader web standards ecosystem. Understanding where MX fits helps clarify when to use which pattern.
Established Standards (Universal Adoption)
W3C and WHATWG Standards:
- HTML5 semantic elements — <nav>, <main>, <article>, <aside>, <section>
- ARIA attributes — aria-label, aria-describedby, role, aria-live
- HTML5 data attributes — data-* custom attributes
- <meta> tags — robots, viewport, description (plus <link rel="canonical">)
IETF Standards:
- robots.txt (RFC 9309) — Site-wide crawling policies
- HTTP status codes (RFC 9110) — 200, 303, 400, 401, 404, 422, 503
- HTTP headers — Cache-Control, Content-Type
De Facto Standards:
- Schema.org — Structured data vocabulary (Google, Microsoft, Yahoo, Yandex)
- Open Graph — Social media metadata (Facebook)
- Twitter Cards — Twitter-specific metadata
MX position: Builds on these foundations, never replaces them.
Emerging Standards (Early Adoption Phase)
llms.txt:
- Status: Community proposal gaining traction
- Purpose: Site-wide AI agent guidance
- Analogy: Like robots.txt but for LLMs
- Adoption: Growing adoption across MX community
- MX relationship: MX meta tags override llms.txt on per-page basis
Web Standards Process:
1. Individual experiments → 2. Community proposals → 3. Vendor consensus → 4. Formal standardisation
llms.txt is at stage 2-3. MX Framework supports and extends it.
Proposed Patterns (MX Framework Specific)
MX meta tag namespace:
- Status: Proposed by MX Framework, not yet standardised
- Pattern: Framework-specific metadata (like twitter: and og:)
- Rationale: Establishes MX brand, aids discoverability, provides granular control
- Adoption path: Community adoption → vendor recognition → potential standardisation
data-agent-visible attribute:
- Status: Proposed by MX Framework, experimental
- Pattern: Extends HTML5 data-* convention
- Rationale: Hidden machine-readable instructions (like ARIA for agents)
- Forward-compatible: Gracefully ignored if not recognised
Common data attributes:
- Status: Proposed conventions building on HTML5 data-*
- Pattern: Standardised attribute names for consistent state management
- Rationale: Agents parse state more reliably with consistent naming
- Relationship: Extends established HTML5 data attribute convention
How MX Relates to Standards Bodies
MX is not a standards body. MX Framework:
- ✅ Proposes patterns following established conventions
- ✅ Documents practical implementations
- ✅ Builds on W3C/WHATWG/IETF standards
- ✅ Shares learnings with community
- ❌ Does not create formal specifications
- ❌ Does not replace existing standards
- ❌ Does not require vendor consensus before proposing
MX role: Practitioner community documenting patterns that work in production.
Path to Standardization
If MX patterns prove valuable, they might standardize through:
- Multiple agent adoption — Different AI systems recognize patterns
- Production validation — Measurable benefits in real deployments
- Community refinement — Usage reveals improvements
- Vendor support — Platforms include MX patterns by default
- Formal proposal — Community brings patterns to standards bodies
Examples of similar evolution:
- viewport meta tag — Apple proprietary → universal standard
- robots meta tag — Community convention → universal recognition
- Open Graph — Facebook proposal → widely adopted
- Schema.org — Vendor consortium → established standard
MX follows this trajectory: Start with practical patterns, refine through use, formalize if proven valuable.
Web Standards Research (2025-2026)
Research conducted: January 2026 web standards search
Finding: NO established ai- prefix standard exists in:
- W3C specifications
- WHATWG standards
- IETF RFCs
- Major vendor proposals (Google, Microsoft, Meta, Apple)
- Community standards (Microformats, Schema.org)
Implication: The ai- prefix did not follow any established pattern. MX Framework chose mx: to:
- Establish framework identity (like twitter:, og:)
- Aid discoverability (“mx meta tags” search leads to MX community)
- Align with namespace architecture (mx: → mx.ai, mx.co, mx.ho)
Pattern precedent:
- twitter:card, twitter:title — Twitter’s framework-specific metadata
- og:type, og:title — Open Graph’s framework-specific metadata
- mx:content-policy, mx:attribution — MX Framework’s metadata
Relationship to HTML Living Standard
HTML Living Standard (WHATWG) defines:
- Valid HTML elements and attributes
- data-* attribute pattern for custom data
- <meta name="..."> extensibility
MX compliance:
- ✅ MX meta tags use valid <meta name="..."> pattern
- ✅ MX data attributes follow data-* pattern
- ✅ All MX patterns use valid HTML syntax
- ✅ Forward-compatible (ignored by parsers that don’t recognise them)
MX is valid HTML using established extension mechanisms.
Cross-References to Standards Documentation
For complete specifications, see:
- Semantic HTML: MDN HTML Elements Reference
- ARIA: W3C ARIA 1.2
- Schema.org: Schema.org Documentation
- Open Graph: Open Graph Protocol
- robots.txt: RFC 9309
- HTTP Status Codes: RFC 9110
- HTML Living Standard: WHATWG HTML
For MX-specific patterns, see:
- This appendix (Appendix L): Complete MX pattern specifications
- Appendix D: AI-Friendly HTML Guide with practical examples
- Appendix M: Building the MX Operating System (collaborative process)
Summary: Standards Hierarchy
Use this hierarchy when making decisions:
- Established standards FIRST — HTML5, ARIA, Schema.org, HTTP
- Emerging conventions SECOND — llms.txt, community patterns
- MX patterns THIRD — Framework-specific metadata and extensions
Never replace established standards with MX patterns. Always build on foundations.
Note: This appendix presents proposed patterns, not established standards. Evaluate adoption based on your specific situation and risk tolerance. All patterns are designed for graceful degradation and forward compatibility.