DITA and MX: A Comparison
A side-by-side comparison of the Darwin Information Typing Architecture and Machine Experience. The two approaches agree on more than they disagree: both treat content as modular, both carry structural metadata through the lifecycle, both operate as architectures rather than as tools. Where they part company is the reader. DITA's primary reader is human. MX treats machines as first-class readers alongside humans, not as a downstream concern, but as a design constraint from the outset.
What they are
DITA (Darwin Information Typing Architecture) is an open standard that defines a set of document types for authoring and organizing topic-oriented information, together with mechanisms for combining, extending and constraining those document types. Developed at IBM in the late 1990s and donated to OASIS, it is maintained by the OASIS DITA Technical Committee.
MX (Machine Experience) is a discipline and methodology concerned with how digital environments are experienced by machines (agents, crawlers, AI systems) as well as by humans. Where DITA focuses on producing human-readable documentation, MX focuses on the structural and semantic conditions under which content is correctly interpreted by both humans and machines across any channel.
Core principles
| Dimension | DITA | MX |
|---|---|---|
| Primary concern | Structured authoring and publication | Machine-readable content at the point of creation |
| Unit of content | Topic (XML file) | Any file type carrying embedded metadata |
| Metadata approach | Inline XML attributes and elements | YAML frontmatter; .mx.yaml sidecar files |
| Reuse mechanism | Conref, conkeyref, transclusion | Metadata-enriched content served to machines, clean content served to humans |
| Extensibility | Specialisation and inheritance | Open standards via The Gathering; RFC-based |
| Governing body | OASIS DITA Technical Committee | The Gathering (tg.community) |
| Inheritance | Yes, DTD / schema-based | Yes, content-type hierarchy declared in YAML frontmatter |
| Content types | Concept, task, reference, troubleshooting, glossary | Declared via mx: content-type |
| Audience declaration | Profiling attributes | mx: audience |
| Graph and relationships | Relationship tables (reltables) | JSON-LD @graph blocks emitted per rendered page |
| Locale-formatted values | Authors write prose; transform decides rendering | Machine-readable value pinned to every locale-formatted display (<data>, <time>, PriceSpecification) |
| Output format scope | HTML, PDF, and other formats via DITA-OT plugins | Metadata carry-forward required for every output doctype: HTML (JSON-LD), PDF (XMP), EPUB (OPF spine), feeds (schema properties) |
| Target users | Technical writers, publishers | Content strategists, developers, AI system designers |
Where they overlap
Both treat content as modular and separable from its presentation. Both use metadata to enable filtering, routing and contextual delivery. Both are oriented toward multi-channel output. Both operate at the architectural and methodological level rather than existing as tools. Both have formalized content types and inheritance models.
Where they diverge
Scope of audience differs. DITA's primary reader is human. MX treats machines as first-class readers alongside humans, not as a downstream concern, but as a design constraint from the outset.
File format is another dividing line. DITA requires XML. MX is format-agnostic: a content unit can be Markdown, HTML, JSON, or any other file type. The MX metadata layer travels with the content regardless of format, and across every output format the transform produces. HTML carries metadata in JSON-LD blocks; PDF carries it in XMP and structure tags (Tagged PDF, ISO 32000); EPUB carries it in the OPF metadata spine; feeds carry it in typed schema properties. MX is a delivery-layer requirement for every output doctype, not an HTML-only concern.
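The carry-forward from a Markdown source to an HTML output can be sketched as follows. This is a minimal illustration, not MX tooling: the flat frontmatter keys (`title`, `content-type`, `audience`), the sample document, and the mapping onto Schema.org properties are all assumptions made for the example, and the parser handles only flat `key: value` lines where a real pipeline would use a full YAML parser.

```python
import json


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body).

    Handles only flat `key: value` lines between two `---` fences.
    """
    lines = text.splitlines()
    if not lines or lines[0] != "---":
        return {}, text
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # no closing fence: treat everything as body
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def to_jsonld_script(meta: dict[str, str]) -> str:
    """Render the metadata as a JSON-LD block for an HTML <head>."""
    # Hypothetical mapping from frontmatter keys to Schema.org properties.
    doc = {
        "@context": "https://schema.org",
        "@type": "TechArticle",
        "name": meta.get("title", ""),
        "audience": {"@type": "Audience", "audienceType": meta.get("audience", "")},
    }
    return '<script type="application/ld+json">' + json.dumps(doc) + "</script>"


# Hypothetical Markdown content unit with frontmatter.
SOURCE = """---
title: Install the agent
content-type: task
audience: administrator
---
# Install the agent
Run the installer.
"""

meta, body = split_frontmatter(SOURCE)
print(to_jsonld_script(meta))
```

The same `meta` mapping could feed an XMP or OPF writer for the other output formats; only the serialisation step changes.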
The two systems also differ on publication pipeline versus content posture. DITA is centered on the production pipeline: authors write topics, maps assemble them, processors publish outputs. MX is concerned with the ongoing posture of content in a live environment: how it behaves when encountered by AI agents, search systems or API consumers at any point, not only at publish time.
AI integration reflects a deeper difference in design intent. DITA's modular content is compatible with AI delivery, but AI integration is incidental to DITA's purpose. MX places machine interpretation at the center of the content model. The hostile-web framing and the five machine-reading contexts reflect a fundamentally different starting assumption.
Relationship management follows different patterns in each system. DITA uses relationship tables (reltables), a map-level construct that defines links between topics without embedding those links in the topics themselves. MX implements the same concern at the delivery layer: each rendered page carries a JSON-LD @graph fragment in a <script type="application/ld+json"> block in its <head>, declaring the topic as a node with its outgoing typed edges. An agent or crawler walks the sitemap, fetches each page, and unions those fragments to reconstruct the graph. The reltable's typed edges survive the transform as JSON-LD predicates; no bespoke endpoint is required.
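The union step a crawler performs can be sketched as a merge of nodes keyed by `@id`. This is a simplified illustration under stated assumptions: the page fragments, the `example.com` URLs and the choice of predicates (`mentions`, `isBasedOn`) are hypothetical, and the code merges plain JSON rather than running full JSON-LD processing (no context expansion or compaction).

```python
import json


def union_graph(fragments: list[str]) -> dict[str, dict]:
    """Merge JSON-LD @graph fragments into one node table keyed by @id."""
    nodes: dict[str, dict] = {}
    for raw in fragments:
        for node in json.loads(raw).get("@graph", []):
            merged = nodes.setdefault(node["@id"], {"@id": node["@id"]})
            for key, value in node.items():
                if key == "@id":
                    continue
                # Normalise every property to a list so edges from
                # different pages accumulate instead of overwriting.
                values = value if isinstance(value, list) else [value]
                existing = merged.setdefault(key, [])
                for v in values:
                    if v not in existing:
                        existing.append(v)
    return nodes


# Hypothetical fragments, as they might be extracted from two pages.
PAGES = [
    json.dumps({
        "@context": "https://schema.org",
        "@graph": [{
            "@id": "https://example.com/install",
            "@type": "TechArticle",
            "mentions": {"@id": "https://example.com/configure"},
        }],
    }),
    json.dumps({
        "@context": "https://schema.org",
        "@graph": [
            {
                "@id": "https://example.com/configure",
                "@type": "TechArticle",
                "isBasedOn": {"@id": "https://example.com/install"},
            },
            {"@id": "https://example.com/install", "@type": "TechArticle"},
        ],
    }),
]

graph = union_graph(PAGES)
for node_id, node in sorted(graph.items()):
    print(node_id, sorted(k for k in node if k != "@id"))
```

Keying on `@id` is what lets the same topic, declared on several pages, collapse into one node with the combined set of typed edges.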
Locale-unambiguous values are handled differently. DITA authors write prose; the transform decides how numbers, currencies and dates render. MX requires the machine-readable form to travel alongside every locale-formatted display. The HTML5 <data> element pins a locale-free numeric value to a localised display string; <time datetime> carries the ISO 8601 date beside the prose form; Schema.org PriceSpecification carries the currency-safe numeric value in JSON-LD. A European decimal-comma number such as €2.030,00 (two thousand and thirty euros) is misread as 2.030 by any machine expecting a decimal point. Publishing the unambiguous value alongside the localised display eliminates that class of error.
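The misreading, and the `<data>` pattern that avoids it, can be shown in a few lines. This is a sketch only: `naive_parse`, `format_decimal_comma` and `price_markup` are hypothetical helpers written for the example, and the formatter hard-codes the decimal-comma convention instead of using a locale library.

```python
import re
from decimal import Decimal


def naive_parse(text: str) -> float:
    """Mimic a consumer that assumes '.' is the decimal mark."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


def format_decimal_comma(amount: Decimal) -> str:
    """Format with '.' as thousands separator and ',' as decimal mark."""
    grouped = f"{amount:,.2f}"  # e.g. '2,030.00'
    return grouped.translate(str.maketrans(",.", ".,"))


def price_markup(amount: Decimal, symbol: str = "€") -> str:
    """Pin the locale-free value to the localised display string."""
    display = symbol + format_decimal_comma(amount)
    return f'<data value="{amount:.2f}">{display}</data>'


price = Decimal("2030")
markup = price_markup(price)
print(markup)
print(naive_parse("€2.030,00"))  # the display string alone is misread
```

A consumer that reads the `value` attribute never has to guess the locale of the display text.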
Governance models also differ. OASIS maintains DITA through a formal Technical Committee process. The Gathering operates as an open standards body for MX with a lighter-weight RFC process, explicitly focused on emerging machine-interaction patterns.
What DITA confirms MX already has
When DITA's features are examined against MX, most are already present or implemented in a more capable form:
- Information typing: MX declares content types in YAML frontmatter.
- Specialisation and inheritance: MX content-type hierarchy declared in frontmatter.
- Audience profiling: mx: audience.
- Relationship management: JSON-LD @graph blocks emitted per rendered page, unioned by the crawler.
- Canonical-identity declarations carried in frontmatter.
- Metadata inheritance: map-level YAML propagation.
The one net addition
One DITA concept was not yet formalized in MX at the time of this analysis: single-source governance.
The single-source governance rule is a principle I am proposing for The Gathering: duplicate content is prohibited. Any content appearing in more than one context must reference a canonical source, never copy it. Without this as a stated principle, redundant nodes accumulate, canonical identity becomes ambiguous, and agent traversal results become unreliable. The rule is not yet a formal RFC; it is an intent-to-propose, credited to DITA's long-standing discipline.
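One way such a rule could be checked in practice is a lint pass that flags content units whose bodies are copies of one another. The sketch below is an assumption about how enforcement might look, not part of any proposed RFC: the file paths are hypothetical, and hashing normalised text catches only copies that differ in whitespace or letter case, not paraphrases.

```python
import hashlib
from collections import defaultdict


def normalise(body: str) -> str:
    """Collapse whitespace and case so trivial edits do not hide a copy."""
    return " ".join(body.split()).lower()


def find_duplicates(units: dict[str, str]) -> list[list[str]]:
    """Return groups of paths whose normalised bodies are identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for path, body in units.items():
        digest = hashlib.sha256(normalise(body).encode("utf-8")).hexdigest()
        by_digest[digest].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Hypothetical content units: path -> body text.
UNITS = {
    "guide/install.md": "Restart the service after every upgrade.",
    "faq/upgrade.md": "Restart the  service after every upgrade.\n",
    "guide/intro.md": "Welcome.",
}

duplicates = find_duplicates(UNITS)
for group in duplicates:
    print("duplicate content, keep one canonical source:", ", ".join(group))
```

Each reported group would then be resolved by keeping one canonical unit and replacing the others with references to it.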
When to use which
DITA is well-suited to large-scale technical documentation environments: software manuals, regulated industries such as medical, aerospace and financial, and organizations that need structured content reuse across print and digital channels with established toolchains. MX applies wherever content is authored for environments in which machine agents are active readers: AI-powered search, agentic workflows, LLM retrieval contexts, and any site or platform where structured machine interpretation matters at the point of content creation, not only at publication.
They are not mutually exclusive. A DITA-based content operation can adopt MX principles by enriching topics with machine-readable metadata, treating MX as a layer above the DITA architecture rather than a replacement for it.
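As a rough sketch of that layering, the metadata a DITA topic already carries can be lifted into a flat mapping that an MX-style frontmatter or sidecar file could hold. The output key names (`id`, `content-type`, `title`, `audience`) and the sample topic are assumptions for illustration; only the DITA side (topic root elements and the `audience` profiling attribute) comes from the DITA model.

```python
import xml.etree.ElementTree as ET

# Root element names of the standard DITA topic types.
DITA_TYPES = {"concept", "task", "reference", "troubleshooting", "glossentry"}


def describe_topic(xml_text: str) -> dict[str, str]:
    """Derive a flat metadata mapping from a DITA topic."""
    root = ET.fromstring(xml_text)
    meta = {
        "id": root.get("id", ""),
        # Fall back to the generic type for unrecognised root elements.
        "content-type": root.tag if root.tag in DITA_TYPES else "topic",
        "title": (root.findtext("title") or "").strip(),
    }
    audience = root.get("audience")
    if audience:
        meta["audience"] = audience
    return meta


def to_frontmatter(meta: dict[str, str]) -> str:
    """Serialise the mapping as flat frontmatter lines."""
    lines = [f"{key}: {value}" for key, value in meta.items()]
    return "\n".join(["---", *lines, "---"])


# Hypothetical DITA task topic.
TOPIC = """<task id="install-agent" audience="administrator">
  <title>Install the agent</title>
  <taskbody><steps><step><cmd>Run the installer.</cmd></step></steps></taskbody>
</task>"""

topic_meta = describe_topic(TOPIC)
print(to_frontmatter(topic_meta))
```

The DITA source stays authoritative here; the derived mapping is regenerated on each build rather than edited by hand.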