
research · structured authoring · DITA · XML · content strategy

Do you need XML for structured authoring? What 59 technical writers actually say

Vladimir Kuzin

The Short Version

Nobody wants XML. They want content reuse, conditions, variables, and multi-channel output. XML is one way to get those capabilities. It is not the only way, and for most documentation teams, it is increasingly not the preferred way.

I analyzed 59 practitioner voices across Reddit, LinkedIn, industry blogs, and community forums. The finding was clear: the structured authoring community has split into three camps. Only one camp (22 percent) considers XML essential, and that camp consists almost entirely of aerospace, defense, and large-enterprise teams with specialized compliance requirements. The other 78 percent either actively resent XML or simply do not care about the underlying format, as long as they get the capabilities they need.

This article presents the evidence.

The Three Camps

Every discussion about XML and structured authoring eventually sorts participants into one of three positions. After cataloging 59 distinct practitioner voices, here is how the numbers break down: 22 percent say XML is essential, 31 percent say they hate XML but feel stuck with it, and 47 percent say XML is unnecessary for their work. Understanding which camp your team falls into determines whether you need XML at all.

"XML is essential" — 22 percent

The pro-XML camp exists, and their reasons are legitimate. But they represent a specific profile that most documentation teams do not share.

These practitioners work in aerospace and defense (S1000D, ATA iSpec 2200), lead teams of 75 or more writers with existing DITA infrastructure, operate in regulated industries where XML schema validation is a compliance requirement, or have built careers as XSLT and XSL-FO specialists.

They are articulate about the value they get:

"Content reuse. Variables. Conditions. Metadata. All things that make the effort worth it." — r/technicalwriting, 15 upvotes

"DITA is a true skill. It is something you get out of what you put in, and you can spend your life, not just your career, getting more out of it all the time." — r/technicalwriting, DITA specialist with 10+ years of experience

Notice what even the strongest advocates emphasize: content reuse, variables, conditions, metadata. These are structured authoring capabilities. They are not unique to XML.

The pro-XML camp is buying tools at a different price tier entirely. They spend $50,000 or more per seat per year on enterprise systems like PTC Windchill or IXIA CCMS. They employ dedicated XML Developer roles separate from their writers. If your team has four writers evaluating cloud-based documentation tools, the pro-XML camp's use case does not apply to you.

"I hate XML but I am stuck with it" — 31 percent

This camp is not the largest by headcount, but it is the most intense in sentiment, and the frustration is palpable. These writers have firsthand experience with DITA or other XML-based authoring systems. They understand the capabilities. They resent the implementation.

"I fucking hate DITA myself, but it's in use all over the place." — r/technicalwriting

"The CCMS is cumbersome and extremely difficult to hire for — we've had multiple contractors quit because they can't make sense of it." — hardware company team lead, actively evaluating alternatives

"I despise DITA with a flaming red passion — and I've been open with my management team about this. I've been a technical writer for more than 30 years." — senior technical writer, r/technicalwriting

"Writing is no longer about crafting a coherent narrative; it's meticulously assembling context-free LEGO bricks." — Fabrizio Ferri Benedetti, passo.uno

"A major learning barrier with DITA is that content structure depends on nested XML markup, where even simple sentences can be embedded with coding, and few writers want to deal with such intricacies. For DITA to work, authors sometimes need to become XML developers." — Kontent.ai

The hiring problem surfaces repeatedly. One team lead described losing multiple contractors who could not get productive in their DITA environment. Another practitioner put it bluntly: "DITA is the only software where the learning curve gets steeper the more you use it."

A consultant who has worked on large-scale DITA and S1000D implementations offered this sobering assessment:

"I've seen literal man-centuries flushed down the CCS hole, sometimes without a single deliverable to show for it." — S1000D/DITA consultant

What this camp wants is specific: content reuse without learning XML syntax, conditions and variables without element nesting, multi-channel output without XSL transforms, topic-based authoring without the DITA information model, and the ability to hire writers who can start producing on day one instead of after months of DITA training.

"XML is unnecessary" — 47 percent

This is the largest group, and it combines two profiles: practitioners who have tried XML and moved on, often with conditional views (17 percent), and those who are format-indifferent and simply want the right capabilities (30 percent). Together, they represent nearly half of all voices in the research.

"DITA's neat, but you don't need it for re-use. You don't even need XML. You can re-use full components, or parts of components, with Asciidoc." — r/technicalwriting

"There is nothing that DITA does that cannot be done with lightweight markup, and lightweight markups even do the job better in certain areas." — r/technicalwriting

"You're better learning the concepts of structured authoring. That will help you more and make you more valuable. I've looked at DITA at multiple companies over the years and have never found a compelling reason to use it." — 30-year technical writing veteran

"When you treat docs as data, you'd be amazed what you can do — and without expensive tech, no less." — practitioner who built reuse with a database and templating system

"You don't need DITA for reuse — you need transclusion, partial transclusion, and conditionals." — r/technicalwriting

This camp draws an important distinction: structured authoring is a methodology, and DITA is one implementation of it. You can have the methodology without that specific implementation. The question is whether your tooling supports the methodology, not whether it uses a particular markup language under the hood.

The Data: XML Is Declining in Technical Writing

The practitioner sentiment is not an outlier. Every measurable indicator points in the same direction: the technical writing profession is moving away from XML as a default requirement, even as structured authoring itself grows in importance.

Job market shift

DITA job postings have dropped to 2.5 percent of all technical writing roles, down from a historical range of 3.5 to 4 percent. Keith Schengili-Roberts, who tracks these numbers on the DITA Writer blog, described the trend as "a precipitous drop" to "low levels never seen before."

Meanwhile, Markdown has surpassed DITA in job posting frequency. Computer science programs increasingly teach Markdown over XML. The pipeline of new writers entering the profession with XML skills is narrowing.

Satisfaction data

The 2017 DITA Satisfaction Survey conducted by Firehead remains the largest published study of DITA user satisfaction. The results were striking:

  • 62 percent of DITA users reported dissatisfaction with their implementation
  • 67 percent said convincing other departments of DITA's value was a "harder sell"
  • Among the 38 percent who were satisfied, the top benefits cited were content reuse consistency (91 percent) and usability/predictability (87 percent)

That last point is critical. Even satisfied DITA users attribute their satisfaction to structured authoring capabilities, not to XML itself. They like reuse and consistency. The markup language is incidental.

The OASIS signal

Perhaps the most telling indicator comes from the organization that governs the DITA standard itself. OASIS created Lightweight DITA (LwDITA) specifically to, in their own words, "free the specification from a dependency on XML." When the standards body behind DITA acknowledges that XML dependency is a problem worth solving, the broader trend is difficult to dispute.

Industry voices

Tom Johnson, who runs idratherbewriting.com (one of the most widely read technical writing blogs), moved away from DITA years ago. He explained his reasoning: "I felt that I should be able to use h3 tags on a page without resorting to complex element nesting."

Mike Howes, describing a migration from DITA to Docusaurus, explained the practical driver: "The doc team is tiny with a huge product portfolio to support. Their main priority was support for Markdown to enable collaborative authoring with people outside the doc team, as DITA is far too cumbersome for occasional contributors." Many of these teams land on docs-as-code workflows, which solve the contributor access problem but introduce their own structural gaps — see the analysis of docs-as-code limitations.

The Write the Docs community positions traditional CCMS tools for "non-technical teams" only, and rarely discusses XML as a requirement for modern documentation.

What You Actually Need for Structured Authoring

The research reveals a consistent pattern: when practitioners describe what they want from a documentation system, they name capabilities, not formats. Here is what those capabilities are and whether they require XML.

| Capability | Requires XML? | How it works without XML |
| --- | --- | --- |
| Content reuse (write once, reference everywhere) | No | Component references with where-used tracking. Any block editor can store reusable blocks as JSON or database records and render them by reference. |
| Conditional content (show/hide by audience, product, platform) | No | Dimension-based filtering at the block level. Define conditions as metadata, apply them in the editor, filter at publish time. |
| Variables (product name, version number, URL) | No | Key-value stores resolved at render time. No markup language required. |
| Multi-channel output (web, PDF, Markdown) | No | A rendering pipeline that transforms structured content into target formats. The source format is irrelevant as long as the content model is structured. |
| Topic-based authoring (modular content model) | No | A content model that treats topics as independent units organized by maps or collections. This is an architectural choice, not a format choice. |
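
The first three rows of the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any particular tool's implementation: the `COMPONENTS` and `TOPICS` stores, the `{{name}}` placeholder syntax, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical store of reusable components, keyed by ID.
COMPONENTS = {
    "warn-power": {"type": "paragraph", "text": "Disconnect power before servicing {{product}}."},
}

# Hypothetical topics. A block is either inline content or a reference to a component.
TOPICS = {
    "install": [
        {"type": "paragraph", "text": "Install {{product}} {{version}}."},
        {"type": "ref", "component": "warn-power"},
        {"type": "paragraph", "text": "Run the admin setup.", "conditions": {"audience": ["admin"]}},
    ],
    "maintain": [
        {"type": "ref", "component": "warn-power"},
    ],
}


def where_used(topics):
    """Build a reverse index: component ID -> sorted list of topics that reference it."""
    index = defaultdict(set)
    for topic_id, blocks in topics.items():
        for block in blocks:
            if block["type"] == "ref":
                index[block["component"]].add(topic_id)
    return {component: sorted(ids) for component, ids in index.items()}


def matches(block, profile):
    """Keep a block when every dimension it declares allows the profile's value."""
    conditions = block.get("conditions", {})
    return all(profile.get(dim) in allowed for dim, allowed in conditions.items())


def publish(topic_id, profile, variables):
    """Filter by conditions, resolve references, then substitute variables."""
    lines = []
    for block in TOPICS[topic_id]:
        if not matches(block, profile):
            continue
        if block["type"] == "ref":
            block = COMPONENTS[block["component"]]
        text = block["text"]
        for name, value in variables.items():
            text = text.replace("{{" + name + "}}", value)
        lines.append(text)
    return lines
```

Calling `publish("install", {"audience": "user"}, {"product": "Acme Router", "version": "2.1"})` returns the first two blocks with the variables filled in and drops the admin-only step, while `where_used(TOPICS)` reports that `warn-power` is referenced from both `install` and `maintain`. None of this depends on the storage format being XML.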

The capabilities most often cited as requiring XML are schema validation (enforcing structural rules at the markup level), vendor-neutral interchange in XML-native ecosystems, semantic markup for machine processing pipelines that expect XML input, and regulatory compliance in industries that mandate specific XML standards.

Of these, regulatory compliance is the only one that cannot be addressed through alternative approaches. Schema validation can be implemented as editor-level rules. Interchange can happen through import and export. Semantic markup can be applied at the content model layer regardless of storage format.
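
To make "schema validation as editor-level rules" concrete, here is a rough sketch of structural rules checked against a JSON-like topic. The `RULES` table, the block type names, and the `validate` function are assumptions made for illustration. A real editor would likely express the same idea through its own schema mechanism, and this sketch does not claim parity with DTD or XSD validation.

```python
# Hypothetical rules: for each topic type, the block types allowed and the ones required.
RULES = {
    "task": {"allowed": {"title", "paragraph", "steps"}, "required": ["title", "steps"]},
    "concept": {"allowed": {"title", "paragraph", "list"}, "required": ["title"]},
}


def validate(topic):
    """Return a list of human-readable rule violations (empty when the topic is valid)."""
    rule = RULES.get(topic["type"])
    if rule is None:
        return [f"unknown topic type: {topic['type']}"]
    errors = []
    seen = [block["type"] for block in topic["content"]]
    # Rule 1: only allowed block types may appear.
    for position, block_type in enumerate(seen):
        if block_type not in rule["allowed"]:
            errors.append(f"block {position}: '{block_type}' is not allowed in a {topic['type']}")
    # Rule 2: every required block type must appear at least once.
    for block_type in rule["required"]:
        if block_type not in seen:
            errors.append(f"missing required block: '{block_type}'")
    # Rule 3: a simple ordering constraint.
    if seen and seen[0] != "title":
        errors.append("first block must be 'title'")
    return errors
```

An editor could run a check like this on every save and surface the returned messages inline, which gives writers the guardrails of a schema without exposing any markup.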

Topicary implements all five core capabilities using a TipTap/ProseMirror JSON content model. Content reuse works through component references with where-used tracking. Conditional content uses dimension-based filtering with in-editor preview. Variables are key-value sets resolved at publish time. The result is structured authoring without the XML overhead.
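
As an illustration of multi-channel output from a JSON content model, the sketch below renders one small document to both HTML and Markdown. The node shape (`type`, `attrs`, `content`, `text`) follows the general ProseMirror JSON convention, but the document, the two node types handled, and the renderer functions are assumptions for this example. It is not Topicary's actual schema or code.

```python
from html import escape

# A small ProseMirror-style document, invented for this example.
DOC = {
    "type": "doc",
    "content": [
        {"type": "heading", "attrs": {"level": 2},
         "content": [{"type": "text", "text": "Reset the device"}]},
        {"type": "paragraph",
         "content": [{"type": "text", "text": "Hold the button for 5 seconds & release."}]},
    ],
}


def inline_text(node):
    """Concatenate the text of a node's inline children (marks are ignored in this sketch)."""
    return "".join(child.get("text", "") for child in node.get("content", []))


def to_html(doc):
    """Render headings and paragraphs as HTML, escaping special characters."""
    parts = []
    for node in doc["content"]:
        text = escape(inline_text(node))
        if node["type"] == "heading":
            level = node["attrs"]["level"]
            parts.append(f"<h{level}>{text}</h{level}>")
        elif node["type"] == "paragraph":
            parts.append(f"<p>{text}</p>")
    return "\n".join(parts)


def to_markdown(doc):
    """Render the same nodes as Markdown, separated by blank lines."""
    parts = []
    for node in doc["content"]:
        text = inline_text(node)
        if node["type"] == "heading":
            parts.append("#" * node["attrs"]["level"] + " " + text)
        elif node["type"] == "paragraph":
            parts.append(text)
    return "\n\n".join(parts)
```

Both renderers walk the same tree, so adding a channel means adding a function rather than changing the source. A PDF channel would typically sit behind the HTML output or a dedicated layout engine, which is outside the scope of this sketch.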

When You Genuinely Need XML

Intellectual honesty requires acknowledging the cases where XML is the right choice. They exist, and dismissing them would undermine the credibility of everything above.

You need XML when your industry mandates it. Aerospace and defense documentation governed by S1000D or ATA iSpec 2200 requires XML schema validation as a compliance deliverable. Medical device documentation under certain regulatory frameworks has similar requirements. In these industries, the XML is not overhead — it is the output.

You need XML when you have 75 or more writers with existing DITA infrastructure. At that scale, the migration cost of moving away from DITA exceeds the ongoing cost of maintaining it. Your writers are trained, your toolchain is configured, your publishing pipeline works. Switching introduces risk with limited upside.

You need XML when your translation workflow requires XLIFF interchange with XML-native TMS systems. Some translation management systems expect DITA or DocBook as input and produce XLIFF for translator workbenches. If your localization pipeline is built around this interchange format, ripping it out may not be worth it.

You need XML when your content is the product. Some organizations sell structured data — parts catalogs, technical standards, regulatory databases — where the XML schema is part of the deliverable that customers consume programmatically.

If you fall into one of these categories, tools like Heretto (DITA-native, enterprise-only) or PTC Windchill are designed for your use case. A cloud-based CCMS built for teams of 2 to 15 writers is not the right fit, and there is no shame in that.

For everyone else — and that is most documentation teams — XML is a cost with no corresponding benefit.

The Migration Question

The most common objection from teams considering a move away from XML is: "We have years of DITA content. We cannot just throw it away."

You do not have to. The migration path is: bring your DITA content, convert it, and then never write XML again.

What DITA import preserves

A proper DITA import pipeline handles concepts, tasks, and references (mapped to topics with appropriate structure), ditamaps (mapped to content maps with hierarchy preserved), conrefs (mapped to reusable components with references intact), conditional attributes (mapped to condition dimensions and values), and relationship tables (mapped to cross-references).
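
A heavily simplified sketch of such a mapping, using only Python's standard library, might look like the following. The target dictionary shape, the `import_task` function, and the sample file name `shared.dita` are hypothetical. A production importer would also need to handle ditamaps, keys, nested topics, and DOCTYPE declarations, none of which are covered here.

```python
import xml.etree.ElementTree as ET

# A tiny DITA task, inlined for illustration. It has no DOCTYPE, so no DTD lookup is needed.
DITA_SOURCE = """
<task id="install">
  <title>Install the router</title>
  <taskbody>
    <context><p conref="shared.dita#shared/warn-power"/></context>
    <steps>
      <step><cmd>Mount the unit.</cmd></step>
      <step audience="admin"><cmd>Run the admin setup.</cmd></step>
    </steps>
  </taskbody>
</task>
"""

# Some of DITA's conditional-processing (profiling) attributes.
PROFILING_ATTRS = ("audience", "platform", "product", "otherprops")


def conditions_of(element):
    """Map space-separated profiling attributes to {dimension: [values]}."""
    return {
        name: element.get(name).split()
        for name in PROFILING_ATTRS
        if element.get(name)
    }


def import_task(xml_text):
    """Convert a DITA task into a hypothetical format-neutral topic dictionary."""
    root = ET.fromstring(xml_text.strip())
    topic = {"id": root.get("id"), "type": root.tag,
             "title": root.findtext("title"), "blocks": []}
    for element in root.iter():
        if element.get("conref"):
            # Keep the conref target as a component reference instead of inlining it.
            topic["blocks"].append({"type": "ref", "target": element.get("conref")})
        elif element.tag == "step":
            block = {"type": "step", "text": element.findtext("cmd")}
            conditions = conditions_of(element)
            if conditions:
                block["conditions"] = conditions
            topic["blocks"].append(block)
    return topic
```

Running `import_task(DITA_SOURCE)` yields a topic with one component reference and two steps, the second carrying `{"audience": ["admin"]}` as its condition. The point is that conrefs and profiling attributes are data that survives the conversion; the angle brackets are only the container.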

What Flare import preserves

Teams migrating from MadCap Flare bring over topics (XHTML converted to structured blocks), snippets (mapped to reusable components), TOC files (mapped to content maps), variables (mapped to variable sets), and condition tags (mapped to condition dimensions).

What does not transfer, and why that is acceptable

DITA specializations (custom topic types with extended schemas), XSL-FO formatting instructions, and DITA-specific linking mechanisms like relationship tables with complex scope rules do not have direct equivalents in non-XML systems. In practice, most teams use a fraction of DITA's specification. The capabilities they actually rely on — reuse, conditions, variables, topic-based modularity — transfer cleanly.

What Every Competitor Does With XML

One of the most revealing findings from this research: even XML-based CCMS tools hide XML from their writers. The industry has already acknowledged that exposing XML syntax to content authors is a problem. They just solve it by adding a visual layer on top of the XML, rather than removing the XML entirely.

| Tool | Underlying format | Writer sees XML? | Structured authoring? |
| --- | --- | --- | --- |
| Paligo | DocBook XML | No (visual editor) | Yes |
| MadCap Flare | XHTML (proprietary XML variant) | Sometimes (code view) | Yes |
| Heretto | DITA XML | Yes (enterprise users) | Yes |
| Author-it | Proprietary database | No | Yes |
| Document360 | Proprietary | No | Limited |
| GitBook | Markdown | No | No (no reuse, no conditions) |
| Topicary | TipTap/ProseMirror JSON | No | Yes |

The pattern is clear. Paligo stores your content as DocBook XML but wraps it in a visual editor so writers never touch angle brackets. Flare uses XHTML under the hood but provides a Word-like interface. Even Heretto, which leans into DITA visibility for enterprise buyers, offers a visual editor mode.

The question is not whether writers should see XML — the industry already answered that with "no." The question is whether the XML needs to be there at all. If the visual editor abstracts it away, if the writer never touches it, if the structured authoring capabilities can be delivered without it, then the XML layer is pure technical debt: adding complexity to storage, processing, and import/export without adding value to the authoring experience.

Fabrizio Ferri Benedetti captured the market gap precisely in a passo.uno blog post: DITA is a "300 kg solution" when developers want "300 gr notepads," but Markdown "painfully shows its limitations when it comes to content reuse, internal references, or even tables." The market needs something between these two extremes — structured authoring capabilities with a lightweight authoring experience.

Making the Decision for Your Team

After reviewing 59 practitioner voices, the decision framework is straightforward.

Choose XML-based tooling if:

  • your industry mandates XML schema validation for compliance
  • you have 75 or more writers with existing DITA infrastructure
  • your content is a deliverable that customers consume as structured XML
  • your translation pipeline is built around XLIFF interchange with XML-native systems

Choose structured authoring without XML if:

  • your team has 2 to 15 writers who need content reuse, conditions, and variables
  • you want new writers productive in days rather than months
  • you need subject matter experts and cross-functional contributors to work in your documentation system
  • you are evaluating a CCMS for the first time and do not have existing XML content (or you do, and you are willing to import and convert it)
  • you have tried DITA and found the overhead outweighs the benefits for your team size and use case
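
The XML side of this framework can be written down as a small checklist function. This is only a sketch of the criteria listed above: the `Team` fields and the `recommend` function are names invented for the example, and the 75-writer threshold is taken directly from the article rather than from any external benchmark.

```python
from dataclasses import dataclass


@dataclass
class Team:
    """Hypothetical description of a documentation team."""
    writers: int
    xml_mandated_for_compliance: bool = False
    has_dita_infrastructure: bool = False
    customers_consume_xml: bool = False
    xliff_pipeline_on_xml_tms: bool = False


def recommend(team):
    """Return a recommendation plus the criteria that triggered it."""
    reasons = []
    if team.xml_mandated_for_compliance:
        reasons.append("industry mandates XML schema validation")
    if team.writers >= 75 and team.has_dita_infrastructure:
        reasons.append("75+ writers with existing DITA infrastructure")
    if team.customers_consume_xml:
        reasons.append("customers consume the content as structured XML")
    if team.xliff_pipeline_on_xml_tms:
        reasons.append("translation pipeline depends on XML-native systems")
    if reasons:
        return "XML-based tooling", reasons
    return "structured authoring without XML", reasons
```

For example, `recommend(Team(writers=4))` returns `("structured authoring without XML", [])`, while a 120-writer team with existing DITA infrastructure lands on XML-based tooling with the matching reason attached. Any single XML criterion is treated as sufficient here, which mirrors how the list is phrased.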

The phrase "structured authoring without XML" appeared in an STC academic paper as early as 2016. A decade later, the tools have caught up to the aspiration. The capabilities that once required XML — content reuse, conditional content, variables, multi-channel publishing, topic-based modularity — are now achievable without it.

The 59 practitioners in this research are not saying XML is bad. They are saying it is unnecessary for most teams. And the data — declining job postings, majority dissatisfaction, OASIS itself decoupling from XML — agrees with them. For how these findings shaped the product, read why I built Topicary. For the content model that replaces XML, see what a CCMS is.
