DITA and MX: A Comparison

20 April 2026 · 6 min read

A side-by-side comparison of the Darwin Information Typing Architecture and Machine Experience. The two approaches agree on more than they disagree: both treat content as modular, both carry structural metadata through the lifecycle, both operate as architectures rather than as tools. Where they part company is the reader. DITA's primary reader is human. MX treats machines as first-class readers alongside humans, not as a downstream concern, but as a design constraint from the outset.

What they are

DITA (Darwin Information Typing Architecture) is an open standard that defines a set of content types for authoring and structuring topic-oriented information, together with mechanisms for combining and constraining those types. Developed at IBM in the late 1990s and donated to OASIS, it's maintained by the OASIS DITA Technical Committee.

MX (Machine Experience) is a discipline and methodology concerned with how digital environments are experienced by machines (agents, crawlers, AI systems) as well as by humans. Where DITA focuses on human-readable documentation production, MX focuses on the structural and semantic conditions under which content is correctly interpreted by both humans and machines across any channel.

Core principles

Core principles of DITA and MX compared
Dimension	DITA	MX
Primary concern	Structured authoring and publication	Machine-readable content at the point of creation
Unit of content	Topic (XML file)	Any file type carrying embedded metadata
Metadata approach	Inline XML attributes and elements	YAML frontmatter; `.mx.yaml` sidecar files
Reuse mechanism	Conref, conkeyref, transclusion	Metadata-enriched content served to machines, clean content served to humans
Extensibility	Specialisation and inheritance	Open standards via The Gathering; RFC-based
Governing body	OASIS DITA Technical Committee	The Gathering (tg.community)
Inheritance	Yes, DTD / schema-based	Yes, content-type hierarchy declared in YAML frontmatter
Content types	Concept, task, reference, troubleshooting, glossary	Declared via `mx: content-type`
Audience declaration	Profiling attributes	`mx: audience`
Graph and relationships	Relationship tables (reltables)	JSON-LD `@graph` blocks emitted per rendered page
Locale-formatted values	Authors write prose; transform decides rendering	Machine-readable value pinned to every locale-formatted display (`<data>`, `<time>`, `PriceSpecification`)
Output format scope	HTML, PDF, and other formats via DITA-OT plugins	Metadata carry-forward required for every output doctype: HTML, PDF (XMP), EPUB (OPF package metadata), feeds (schema properties)
Target users	Technical writers, publishers	Content strategists, developers, AI system designers

Where they overlap

Both treat content as modular and separable from its presentation. Both use metadata to enable filtering and routing. Both are oriented toward multi-channel output. Both operate at the architectural and methodological level rather than existing as tools. Both have formalized content types and inheritance models.

Where they diverge

Scope of audience differs. DITA's primary reader is human. MX treats machines as first-class readers alongside humans, not as a downstream concern, but as a design constraint from the outset.

File format is another dividing line. DITA requires XML. MX is format-agnostic: a content unit can be Markdown, HTML, JSON, or any other file type. The MX metadata layer travels with the content regardless of format, across every output doctype the transform produces. HTML embeds metadata in JSON-LD blocks; PDF holds it in XMP and structure tags (Tagged PDF, ISO 32000); EPUB holds it in the metadata section of the OPF package document; feeds hold it in typed schema properties. MX is a delivery-layer requirement for every output doctype, not an HTML-only concern.

The two systems also differ on publication pipeline versus content posture. DITA is built around the production pipeline: authors write topics, maps assemble them, processors publish outputs. MX is concerned with the ongoing posture of content in a live environment: how it behaves when encountered by AI agents, search systems or API consumers at any point, not only at publish time.

AI integration reflects a deeper difference in design intent. DITA's modular content is compatible with AI delivery, but AI integration is incidental to DITA's purpose. MX makes machine interpretation core to the content model. The hostile-web framing and the five machine-reading contexts reflect a different starting assumption.

Relationship management follows different patterns in each system. DITA uses relationship tables (reltables), a map-level construct that defines links between topics without embedding those links in the topics themselves. MX implements the same concern at the delivery layer: each rendered page includes a JSON-LD @graph fragment in its <head>, declaring the topic as a node with its outgoing edges. A crawler unions those fragments across the site to reconstruct the graph. The reltable's edges survive the build as JSON-LD predicates; no bespoke endpoint is required.

The reltable is already an edge list. DITA-OT, with an MX transform step added, emits each topic's outgoing edges as JSON-LD predicates inside a <script type="application/ld+json"> block in the page's <head>. An agent walks the sitemap, fetches each page, and unions the @graph fragments; no bespoke endpoint is required.
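A minimal sketch of that flow in Python, assuming the reltable has already been flattened to (source, predicate, target) triples. The example.com URLs, the choice of the schema.org predicates relatedLink and isPartOf, and every function name here are illustrative assumptions; none of this is part of DITA-OT or of any published MX transform.

```python
import json
from collections import defaultdict

# Hypothetical reltable, flattened to an edge list of
# (source topic, predicate, target topic). URLs and predicates are assumptions.
RELTABLE_EDGES = [
    ("https://example.com/install", "relatedLink", "https://example.com/configure"),
    ("https://example.com/install", "relatedLink", "https://example.com/troubleshoot"),
    ("https://example.com/configure", "isPartOf", "https://example.com/admin-guide"),
]


def page_fragment(topic_url, edges):
    """Build the JSON-LD @graph fragment for one rendered page."""
    outgoing = defaultdict(list)
    for source, predicate, target in edges:
        if source == topic_url:
            outgoing[predicate].append({"@id": target})
    node = {"@id": topic_url, "@type": "WebPage", **outgoing}
    return {"@context": "https://schema.org", "@graph": [node]}


def render_script_tag(fragment):
    """Serialise a fragment as the script block placed in the page's <head>."""
    return '<script type="application/ld+json">' + json.dumps(fragment) + "</script>"


def union_fragments(fragments):
    """Merge per-page fragments into one adjacency map, as a crawler might."""
    graph = defaultdict(set)
    for fragment in fragments:
        for node in fragment["@graph"]:
            for key, targets in node.items():
                if key.startswith("@"):
                    continue  # skip @id and @type; only predicates carry edges
                for target in targets:
                    graph[node["@id"]].add((key, target["@id"]))
    return dict(graph)


# One fragment per topic that has outgoing edges, then the crawler-side union.
topics = sorted({source for source, _, _ in RELTABLE_EDGES})
fragments = [page_fragment(url, RELTABLE_EDGES) for url in topics]
site_graph = union_fragments(fragments)
```

The union step only needs the fragments themselves, which is the point of the design: the graph is recoverable from ordinary page fetches.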

Locale-unambiguous values are handled differently. DITA authors write prose; the transform decides how numbers and dates render. MX requires the machine-readable form to travel alongside every locale-formatted output. The HTML5 <data> element pins a locale-free numeric value to a localised string; <time datetime> carries the ISO 8601 date beside the prose form; Schema.org PriceSpecification carries the currency-safe figure in JSON-LD. A European decimal-comma figure such as €2.030,00 (two thousand and thirty euros) can be misread as 2.030 (just over two euros) by a parser expecting a decimal point. Publishing the unambiguous form alongside the localised output eliminates that class of error.
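The misreading is easy to reproduce. The sketch below, in Python, mimics a lenient decimal-point parser and then pins the unambiguous value to the localised string. The function names are hypothetical, and the lenient parser is a stand-in I've written for illustration, not the behaviour of any specific library.

```python
import re
from datetime import date
from decimal import Decimal
from html import escape


def lenient_decimal_point_parse(text):
    """Mimic a parser that reads the longest leading decimal-point number."""
    match = re.match(r"\d+(?:\.\d+)?", text)
    return Decimal(match.group(0)) if match else None


def parse_decimal_comma(text):
    """Parse a decimal-comma number such as '2.030,00' into a Decimal."""
    return Decimal(text.replace(".", "").replace(",", "."))


def data_element(value, display):
    """Pin a locale-free value to a localised display string."""
    return f'<data value="{value}">{escape(display)}</data>'


def time_element(day, display):
    """Pair an ISO 8601 date with its prose form."""
    return f'<time datetime="{day.isoformat()}">{escape(display)}</time>'


misread = lenient_decimal_point_parse("2.030,00")  # stops at the comma
intended = parse_decimal_comma("2.030,00")         # two thousand and thirty

price_html = data_element(intended, "€2.030,00")
date_html = time_element(date(2026, 4, 20), "20 April 2026")
```

A consumer that reads the value attribute never has to guess the locale of the visible string.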

Governance models also differ. OASIS maintains DITA through a formal Technical Committee process. The Gathering operates as an open standards body for MX with a lighter-weight RFC process, explicitly focused on emerging machine-interaction patterns.

What DITA confirms MX already has

When DITA's features are examined against MX, most are already present or implemented in a more capable form:

Information typing: MX declares content types in YAML frontmatter.
Type hierarchy and inheritance: declared via content-type in frontmatter.
Audience profiling: mx: audience.
Relationship management: JSON-LD @graph blocks emitted per rendered page, unioned by the crawler.
Canonical identity: declarations carried in frontmatter.
Metadata inheritance: map-level YAML propagation.
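To make two of those items concrete, here is a hedged sketch in Python of map-level propagation and a content-type hierarchy. Plain dictionaries stand in for parsed YAML frontmatter. The override policy (topic-level keys win over map-level defaults), the TYPE_PARENTS table, and the function names are my assumptions for illustration, not rules published by The Gathering.

```python
# Plain dicts stand in for parsed YAML frontmatter; no YAML library is needed.
MAP_DEFAULTS = {"mx": {"audience": ["administrator"], "content-type": "reference"}}
TOPIC_FRONTMATTER = {"mx": {"content-type": "task"}}

# Hypothetical content-type hierarchy: child type -> parent type.
TYPE_PARENTS = {"task": "topic", "concept": "topic", "reference": "topic", "topic": None}


def resolve_metadata(map_defaults, topic_frontmatter):
    """Inherit map-level keys, letting topic-level keys override them (assumed policy)."""
    resolved = dict(map_defaults.get("mx", {}))
    resolved.update(topic_frontmatter.get("mx", {}))
    return {"mx": resolved}


def is_a(content_type, ancestor):
    """Walk up the hierarchy to test whether content_type descends from ancestor."""
    current = content_type
    while current is not None:
        if current == ancestor:
            return True
        current = TYPE_PARENTS.get(current)
    return False


resolved = resolve_metadata(MAP_DEFAULTS, TOPIC_FRONTMATTER)
```

Here the topic keeps its own content-type and inherits the audience from the map, which mirrors how DITA cascades map-level metadata to topics.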

The one net addition

The single DITA concept not yet formalized in MX at the time of this analysis:

The single-source governance rule is a principle I'm proposing for The Gathering: duplicate content is prohibited. Any content appearing in more than one context must reference a canonical source, never copy it. Without this as a stated principle, redundant nodes accumulate, canonical identity becomes ambiguous, and agent traversal results become unreliable. The rule isn't yet a formal RFC; it's an intent-to-propose, credited to DITA's long-standing discipline.
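One way such a rule could be checked at build time is sketched below in Python. It flags content units whose bodies are identical after whitespace and case normalisation. The unit identifiers, the normalisation choice, and the function names are assumptions of mine; the proposal itself doesn't prescribe a detection method.

```python
import hashlib
from collections import defaultdict

# Hypothetical content units: id -> body text.
UNITS = {
    "install/prerequisites": "You need administrator rights.",
    "upgrade/prerequisites": "You need  administrator rights.",
    "configure/overview": "Configuration is stored in one file.",
}


def fingerprint(body):
    """Hash a body after collapsing whitespace and case, so trivial edits still match."""
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_duplicates(units):
    """Return groups of unit ids whose bodies match after normalisation."""
    by_hash = defaultdict(list)
    for unit_id, body in units.items():
        by_hash[fingerprint(body)].append(unit_id)
    return [sorted(ids) for ids in by_hash.values() if len(ids) > 1]


duplicates = find_duplicates(UNITS)
```

Each reported group is a candidate for replacement by a single canonical unit that the other contexts reference. Exact-match hashing won't catch paraphrased copies, so this is a floor, not a complete check.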

When to use which

DITA is well-suited to large-scale technical documentation environments, software manuals, regulated industries, and teams that need structured content reuse across print and digital channels with established toolchains. MX applies wherever material is authored for environments in which agents are active readers: AI-powered search, agentic workflows, LLM retrieval contexts, and any site or platform where structured machine interpretation matters at the point of creation, not only at publication.

They aren't mutually exclusive. A DITA-based content operation can adopt MX principles by enriching topics with machine-readable metadata, treating MX as a layer above the DITA architecture rather than a replacement for it.
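As a rough illustration of that layering, the Python sketch below derives an MX-style metadata block from an existing DITA task topic. The sample topic is invented. Only the content-type and audience keys come from the comparison above; the id and title keys, and the function name, are hypothetical additions for the example.

```python
import xml.etree.ElementTree as ET

# Invented DITA task topic; audience is a space-delimited profiling attribute.
DITA_TOPIC = """<task id="install-agent" audience="administrator developer">
  <title>Install the agent</title>
  <taskbody><steps><step><cmd>Run the installer.</cmd></step></steps></taskbody>
</task>"""


def mx_metadata_from_topic(xml_text):
    """Derive an MX-style metadata block from a DITA topic's root element."""
    root = ET.fromstring(xml_text)
    return {
        "mx": {
            "content-type": root.tag,                       # concept, task, reference, ...
            "audience": root.get("audience", "").split(),   # profiling values as a list
            "id": root.get("id"),                           # hypothetical key
            "title": root.findtext("title"),                # hypothetical key
        }
    }


metadata = mx_metadata_from_topic(DITA_TOPIC)
```

The topic itself is untouched; the derived block could be written to a sidecar file or emitted by the transform, leaving the DITA source as the single point of authoring.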