DITA and MX: A Comparison

A side-by-side comparison of the Darwin Information Typing Architecture and Machine Experience. The two approaches agree on more than they disagree: both treat content as modular, both carry structural metadata through the lifecycle, both operate as architectures rather than as tools. Where they part company is the reader. DITA's primary reader is human. MX treats machines as first-class readers alongside humans: not as a downstream concern, but as a design constraint from the outset.

What they are

DITA (Darwin Information Typing Architecture) is an open standard that defines a set of document types for authoring and organizing topic-oriented information, together with mechanisms for combining, extending and constraining those document types. Developed at IBM in the late 1990s and donated to OASIS, it is maintained by the OASIS DITA Technical Committee.

MX (Machine Experience) is a discipline and methodology concerned with how digital environments are experienced by machines (agents, crawlers, AI systems) as well as by humans. Where DITA focuses on human-readable documentation production, MX focuses on the structural and semantic conditions under which content is correctly interpreted by both humans and machines across any channel.

Core principles

Core principles of DITA and MX compared
| Dimension | DITA | MX |
| --- | --- | --- |
| Primary concern | Structured authoring and publication | Machine-readable content at the point of creation |
| Unit of content | Topic (XML file) | Any file type carrying embedded metadata |
| Metadata approach | Inline XML attributes and elements | YAML frontmatter; .mx.yaml sidecar files |
| Reuse mechanism | Conref, conkeyref, transclusion | Metadata-enriched content served to machines, clean content served to humans |
| Extensibility | Specialisation and inheritance | Open standards via The Gathering; RFC-based |
| Governing body | OASIS DITA Technical Committee | The Gathering (tg.community) |
| Inheritance | Yes, DTD / schema-based | Yes, content-type hierarchy declared in YAML frontmatter |
| Content types | Concept, task, reference, troubleshooting, glossary | Declared via mx: content-type |
| Audience declaration | Profiling attributes | mx: audience |
| Graph and relationships | Relationship tables (reltables) | JSON-LD @graph blocks emitted per rendered page |
| Locale-formatted values | Authors write prose; transform decides rendering | Machine-readable value pinned to every locale-formatted display (<data>, <time>, PriceSpecification) |
| Output format scope | HTML, PDF, and other formats via DITA-OT plugins | Metadata carry-forward required for every output doctype: HTML, PDF (XMP), EPUB (OPF spine), feeds (schema properties) |
| Target users | Technical writers, publishers | Content strategists, developers, AI system designers |

Where they overlap

Both treat content as modular and separable from its presentation. Both use metadata to enable filtering, routing and contextual delivery. Both are oriented toward multi-channel output. Both operate at the architectural and methodological level rather than existing as tools. Both have formalized content types and inheritance models.

Where they diverge

Scope of audience differs. DITA's primary reader is human. MX treats machines as first-class readers alongside humans: not as a downstream concern, but as a design constraint from the outset.

File format is another dividing line. DITA requires XML. MX is format-agnostic: a content unit can be Markdown, HTML, JSON, or any other file type. The MX metadata layer travels with the content regardless of format, and across every output format the transform produces. HTML carries metadata in JSON-LD blocks; PDF carries it in XMP and structure tags (Tagged PDF, ISO 32000); EPUB carries it in the OPF metadata spine; feeds carry it in typed schema properties. MX is a delivery-layer requirement for every output doctype, not an HTML-only concern.
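As a rough illustration of metadata travelling with a Markdown content unit into its HTML output, the sketch below splits YAML frontmatter from the body and re-emits the mx values as a JSON-LD block. It is a minimal sketch under stated assumptions, not MX's actual tooling: the nested `mx:` frontmatter layout, the function names, and the `MX_NS` namespace URL are all hypothetical, and the hand-rolled parser handles only a flat `mx:` block (a real pipeline would use a full YAML parser).

```python
import json

# Placeholder namespace; the real MX @context is not given in this article.
MX_NS = "https://example.com/mx#"

SAMPLE = """---
title: Install X
mx:
  content-type: concept
  audience: tech
---
# Install X

Body text.
"""

def split_frontmatter(text):
    """Split a document into (frontmatter lines, clean body).

    Assumes the frontmatter is fenced by '---' lines at the top of the file.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], text
    end = lines.index("---", 1)  # raises ValueError if the fence is never closed
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return lines[1:end], body

def parse_mx_block(front_lines):
    """Collect the two-space-indented 'key: value' pairs under 'mx:'."""
    mx, inside = {}, False
    for line in front_lines:
        if line.strip() == "mx:":
            inside = True
        elif inside and line.startswith("  ") and ":" in line:
            key, _, value = line.strip().partition(":")
            mx[key.strip()] = value.strip()
        else:
            inside = False
    return mx

def jsonld_script(page_id, mx):
    """Render the mx values as a JSON-LD block for the page's <head>."""
    node = {"@id": page_id}
    node.update({f"mx:{key}": value for key, value in mx.items()})
    doc = {"@context": {"mx": MX_NS}, "@graph": [node]}
    # Escape '</' so the JSON cannot close the script element early.
    payload = json.dumps(doc).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'

front, body = split_frontmatter(SAMPLE)
mx_values = parse_mx_block(front)
script_tag = jsonld_script("/concept/install-x", mx_values)
```

Here `body` is the clean content a human-facing renderer would receive, while `script_tag` carries the same metadata forward for machine readers. Other output doctypes would need their own serialisers (XMP for PDF, OPF for EPUB), which this sketch does not attempt.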

The two systems also differ on publication pipeline versus content posture. DITA is centered on the production pipeline: authors write topics, maps assemble them, processors publish outputs. MX is concerned with the ongoing posture of content in a live environment: how it behaves when encountered by AI agents, search systems or API consumers at any point, not only at publish time.

AI integration reflects a deeper difference in design intent. DITA's modular content is compatible with AI delivery, but AI integration is incidental to DITA's purpose. MX places machine interpretation at the center of the content model. The hostile-web framing and the five machine-reading contexts reflect a fundamentally different starting assumption.

Relationship management follows different patterns in each system. DITA uses relationship tables (reltables), a map-level construct that defines links between topics without embedding those links in the topics themselves. MX implements the same concern at the delivery layer: each rendered page carries a JSON-LD @graph fragment in its <head>, declaring the topic as a node with its outgoing typed edges. A crawler unions those fragments across the site to reconstruct the graph. The reltable's typed edges survive the transform as JSON-LD predicates; no bespoke endpoint is required.

Figure: DITA reltable emitted as JSON-LD @graph in each rendered HTML page (DITA source → DITA-OT + MX transform → rendered HTML with JSON-LD).

Left: a DITA reltable inside map.ditamap declares typed relationships: the task "Configure X" requires the concept "Install X", which describes the reference "X: Field Ref". Middle: a DITA-OT build with an added MX transform step. Right: one of the rendered HTML pages (for "Install X", at https://docs.example.com/concept/install-x) showing a <script type="application/ld+json"> block inside its <head>. The JSON-LD contains an @graph array with this topic as a node (id, type, MX audience and state) plus typed predicate edges (mx:requiredBy pointing to the task, mx:describes pointing to the reference). Each rendered page carries its own @graph fragment; agents walk the site's sitemap, fetch each page, extract the JSON-LD, and union @id links across fetches.

    <head>
      <script type="application/ld+json">
      {
        "@context": { ...schema.org, mx... },
        "@graph": [
          {
            "@id": "/concept/install-x",
            "@type": "DefinedTerm",
            "mx:audience": "tech",
            "mx:state": "published",
            "mx:requiredBy": {"@id": "/task/configure-x"},
            "mx:describes": {"@id": "/reference/x-field-ref"}
          }
        ]
      }
      </script>
    </head>

This is one of N rendered topic pages, each carrying its own @graph fragment.
The reltable is already an edge list. DITA-OT, with an MX transform step added, emits each topic's outgoing typed edges as JSON-LD predicates inside a <script type="application/ld+json"> block in the page's <head>. An agent walks the sitemap, fetches each page, and unions the @graph fragments; no bespoke endpoint is required.
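The round trip described here (edge list out to per-page fragments, fragments back to one graph) can be sketched in a few lines. This is an illustrative sketch only, assuming the reltable has already been read into (source, predicate, target) tuples. The function names and the `mx:requires` predicate name are hypothetical, the @context is omitted because the article elides it, and the sketch emits outgoing edges only rather than inverse predicates such as mx:requiredBy.

```python
# Reltable rows as (source @id, predicate, target @id).
# The @id values follow the article's example; "mx:requires" is an assumed name.
RELTABLE = [
    ("/task/configure-x", "mx:requires", "/concept/install-x"),
    ("/concept/install-x", "mx:describes", "/reference/x-field-ref"),
]

def fragments_from_reltable(rows):
    """Return {page @id: JSON-LD fragment} holding each topic's outgoing edges."""
    nodes = {}
    for source, predicate, target in rows:
        nodes.setdefault(source, {}).setdefault(predicate, []).append({"@id": target})
        # A topic that is only ever a target still gets a page and a node.
        nodes.setdefault(target, {})
    return {
        page_id: {"@graph": [{"@id": page_id, **edges}]}
        for page_id, edges in nodes.items()
    }

def union_edges(fragments):
    """Merge per-page fragments back into one edge set, as a crawler would."""
    edges = set()
    for fragment in fragments:
        for node in fragment["@graph"]:
            for key, targets in node.items():
                if key.startswith("@"):
                    continue  # skip JSON-LD keywords such as @id and @type
                for target in targets:
                    edges.add((node["@id"], key, target["@id"]))
    return edges

pages = fragments_from_reltable(RELTABLE)
reconstructed = union_edges(pages.values())
```

The point of the sketch is that `reconstructed` equals the original edge set: nothing about the reltable is lost when it is spread across pages, provided every page is fetched.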

Locale-unambiguous values are handled differently. DITA authors write prose; the transform decides how numbers, currencies and dates render. MX requires the machine-readable form to travel alongside every locale-formatted display. The HTML5 <data> element pins a locale-free numeric value to a localised display string; <time datetime> carries the ISO 8601 date beside the prose form; Schema.org PriceSpecification carries the currency-safe numeric value in JSON-LD. A European decimal-comma number such as €2.030,00 (two thousand and thirty euros) is misread as 2.030 by any machine expecting a decimal point. Publishing the unambiguous value alongside the localised display eliminates that class of error.
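The misreading is easy to reproduce. The sketch below shows a decimal-point parser turning the display string into roughly two euros, then pins the correct value to the display with a <data> element. It is a minimal sketch that assumes the author's locale is known to be decimal-comma; the function names are illustrative and not part of any MX tooling.

```python
from decimal import Decimal
from html import escape

DISPLAY = "€2.030,00"  # two thousand and thirty euros, decimal-comma locale

# What a decimal-point reader sees if it stops at the comma: about two euros.
naive_reading = float(DISPLAY.strip("€").split(",")[0])

def parse_decimal_comma(display):
    """Parse a decimal-comma amount such as '€2.030,00' into a Decimal.

    Assumes '.' is the thousands separator and ',' the decimal separator.
    """
    digits = "".join(ch for ch in display if ch.isdigit() or ch in ".,")
    return Decimal(digits.replace(".", "").replace(",", "."))

def data_element(display):
    """Pin the locale-free value to the localised display string."""
    value = parse_decimal_comma(display)
    return f'<data value="{value}">{escape(display)}</data>'

pinned = data_element(DISPLAY)
```

With the value attribute present, a machine reader never has to guess the separator convention; the human reader still sees the localised string.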

Governance models also differ. OASIS maintains DITA through a formal Technical Committee process. The Gathering operates as an open standards body for MX with a lighter-weight RFC process, explicitly focused on emerging machine-interaction patterns.

What DITA confirms MX already has

When DITA's features are examined against MX, most are already present or implemented in a more capable form:

The one net addition

The single DITA concept not yet formalized in MX at the time of this analysis:

The single-source governance rule is a principle I am proposing for The Gathering: duplicate content is prohibited. Any content appearing in more than one context must reference a canonical source, never copy it. Without this as a stated principle, redundant nodes accumulate, canonical identity becomes ambiguous, and agent traversal results become unreliable. The rule is not yet a formal RFC; it is an intent-to-propose, credited to DITA's long-standing discipline.
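One way such a rule might be checked in practice is a lint pass that flags content blocks copied between locations instead of referenced. The sketch below is purely illustrative and not part of any proposal: the function names, the location identifiers, and the choice of exact matching after whitespace and case normalisation are all assumptions, and near-duplicates would need a fuzzier comparison.

```python
import hashlib
from collections import defaultdict

def normalise(text):
    """Collapse whitespace and case so trivial edits do not hide a copy."""
    return " ".join(text.lower().split())

def find_duplicates(blocks):
    """Return groups of location ids whose normalised content is identical.

    `blocks` maps a location id (hypothetical, e.g. page path plus fragment)
    to the text found there.
    """
    by_hash = defaultdict(list)
    for location, text in blocks.items():
        digest = hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()
        by_hash[digest].append(location)
    return sorted(sorted(group) for group in by_hash.values() if len(group) > 1)

BLOCKS = {
    "/task/configure-x#prereq": "Install X before you begin.",
    "/concept/install-x#summary": "install x  before you begin.",
    "/reference/x-field-ref#intro": "Field reference for X.",
}

violations = find_duplicates(BLOCKS)
```

Each group in `violations` marks a place where one location should become the canonical source and the others should reference it.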

When to use which

DITA is well-suited to large-scale technical documentation environments, software manuals, regulated industries such as medical, aerospace and financial, and organizations that need structured content reuse across print and digital channels with established toolchains. MX applies wherever content is authored for environments in which machine agents are active readers: AI-powered search, agentic workflows, LLM retrieval contexts, and any site or platform where structured machine interpretation matters at the point of content creation, not only at publication.

They are not mutually exclusive. A DITA-based content operation can adopt MX principles by enriching topics with machine-readable metadata, treating MX as a layer above the DITA architecture rather than a replacement for it.