Question 1

What is the difference between a traditional CCMS and an AI-ready CCMS?

Accepted Answer

A traditional CCMS manages structured authoring and typically publishes to PDF, HTML, and DOCX, formats designed for human readers. An AI-ready CCMS extends that foundation to publish structured data formats that AI systems can consume directly, with content type information, metadata, and hierarchy preserved in the output. Author-it's AION output is one example: it publishes structured JSON from the same content library that produces PDF and HTML, providing AI systems with typed, metadata-annotated, component-level data rather than formatted documents.

Question 2

Why can't I feed a PDF export from my CCMS into an AI or RAG pipeline?

Accepted Answer

You can — but the quality of AI output will be limited. When an AI or RAG system ingests a PDF, it receives stripped text. Content type information is lost (the AI can't distinguish a safety warning from a feature description). Hierarchy is lost (the AI doesn't know a section belongs to a sub-procedure). Metadata is lost — no authorship, no modification date, no version. Chunks are created at arbitrary paragraph or page boundaries rather than meaningful topic boundaries. All of this reduces retrieval precision and increases the likelihood of inaccurate, decontextualised answers.

Question 3

What is the difference between document-centric and component-centric content for AI?

Accepted Answer

In a document-centric approach, content is created and managed as whole documents: Word files, PDFs, web pages. When those documents are fed to an AI system, the AI receives the final rendered output: text without structure, context, or type information. In a component-centric approach, as used by Author-it, content is authored as discrete topics, individual units with defined types and metadata. These components are assembled into documents for human readers and published as structured data for machines. The AI receives components, not documents: smaller, typed, metadata-annotated units that produce more precise retrieval and more reliable answers.

The practical consequence is retrieval precision. A document-centric system asks the AI to find the right sentence inside a 40-page manual it only sees as flat text. A component-centric system hands the AI the individual topic, already labelled with what it is and where it belongs. The AI spends its effort answering the question, not reconstructing structure the source threw away.
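As a rough illustration of why typed components help retrieval, the sketch below filters by content type before ranking. Everything in it is an assumption made for the example: the `Component` class, its field names, the sample topics, and the simple word-overlap scoring are hypothetical and are not Author-it's or AION's actual data model or retrieval method.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical typed component; field names are illustrative only."""
    topic_id: str
    content_type: str  # e.g. "warning", "task", "concept"
    title: str
    text: str


def _words(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(components, query, content_type=None):
    """Return components sharing words with the query, best match first.

    If content_type is given, components of other types are skipped
    before any scoring happens.
    """
    query_words = _words(query)
    scored = []
    for component in components:
        if content_type is not None and component.content_type != content_type:
            continue
        overlap = len(query_words & _words(component.text))
        if overlap:
            scored.append((overlap, component))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [component for _, component in scored]


# Made-up sample topics for the sketch.
COMPONENTS = [
    Component("T-100", "concept", "Pump overview",
              "The pump moves coolant through the housing."),
    Component("T-101", "warning", "Electrical hazard",
              "Disconnect power before opening the pump housing."),
    Component("T-102", "task", "Replace the filter",
              "Open the housing and replace the filter."),
]

unfiltered = retrieve(COMPONENTS, "pump housing hazard")
warnings_only = retrieve(COMPONENTS, "pump housing hazard", content_type="warning")
```

Without the type filter, all three topics match the query. With it, only the warning is returned. Flat text extracted from a PDF carries no type field to filter on.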

Question 4

What metadata does AION include in its output?

Accepted Answer

AION includes the following on every published topic: the template name and type (which defines the content type — task, warning, concept, and so on), the topic ID and book ID, the library folder path, the last-modified timestamp and the name of the author who made the last change, the topic description, and all variable values resolved at publish time (such as product name, region, or version). This metadata travels with every content chunk, giving AI systems the context they need to deliver accurate, citable answers.
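To make the list above concrete, here is a minimal sketch of how a downstream pipeline might load one topic record and build a citation from its metadata. The JSON key names (`topicId`, `templateType`, and so on), the sample values, and the `TopicChunk` class are assumptions invented for this example. They mirror the metadata fields named above but are not the documented AION schema, so check the real output before relying on any of them.

```python
import json
from dataclasses import dataclass
from datetime import datetime

# Hypothetical record: the key names and values are illustrative, not AION's schema.
SAMPLE = """
{
  "topicId": 4821,
  "bookId": 310,
  "templateName": "Safety Warning",
  "templateType": "warning",
  "folderPath": "Products/Pumps/Safety",
  "lastModified": "2024-05-14T09:30:00+00:00",
  "lastModifiedBy": "J. Doe",
  "description": "Electrical hazard when opening the housing",
  "variables": {"ProductName": "ExamplePump", "Region": "EU"},
  "text": "Disconnect power before opening the housing."
}
"""


@dataclass(frozen=True)
class TopicChunk:
    topic_id: int
    book_id: int
    template_name: str
    template_type: str
    folder_path: str
    last_modified: datetime
    last_modified_by: str
    description: str
    variables: dict
    text: str


def parse_topic(raw):
    """Parse one JSON topic record into a TopicChunk."""
    data = json.loads(raw)
    return TopicChunk(
        topic_id=data["topicId"],
        book_id=data["bookId"],
        template_name=data["templateName"],
        template_type=data["templateType"],
        folder_path=data["folderPath"],
        last_modified=datetime.fromisoformat(data["lastModified"]),
        last_modified_by=data["lastModifiedBy"],
        description=data["description"],
        variables=data["variables"],
        text=data["text"],
    )


def citation(chunk):
    """Build a short, human-readable source line from the chunk's metadata."""
    return (
        f"{chunk.template_type} topic {chunk.topic_id} in book {chunk.book_id}, "
        f"last modified {chunk.last_modified.date().isoformat()} "
        f"by {chunk.last_modified_by}"
    )


chunk = parse_topic(SAMPLE)
source_line = citation(chunk)
```

Keeping the metadata next to the text is what lets an answer cite its source. The same fields can also be stored as filterable attributes in a vector database.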

Question 5

How does component-level chunking improve RAG performance?

Accepted Answer

Retrieval-augmented generation (RAG) works by retrieving the most relevant text chunks from a knowledge base and passing them to an LLM to generate an answer. When content is chunked from a PDF at arbitrary paragraph or page boundaries, the retrieved chunks may contain unrelated content, split mid-procedure, or lose the context that gives them meaning. AION chunks content at the topic level — each chunk is a discrete, typed, metadata-annotated unit authored as a single coherent piece. These chunks produce more precise vector embeddings, which produce more accurate retrieval, which produces better LLM answers.
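The difference between the two chunking strategies can be sketched in a few lines. The example below assumes two made-up topics and a fixed chunk size of 60 characters. The function names and dictionary keys are hypothetical, and the fixed-size splitter is a simple stand-in for whatever a PDF ingestion tool actually does.

```python
def fixed_size_chunks(text, size):
    """Split flat text every `size` characters, ignoring topic boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def topic_chunks(topics):
    """Emit one chunk per topic, keeping its type and title as metadata."""
    return [
        {"text": t["text"], "metadata": {"type": t["type"], "title": t["title"]}}
        for t in topics
    ]


# Made-up topics for the sketch.
TOPICS = [
    {
        "type": "warning",
        "title": "Electrical hazard",
        "text": "Disconnect power before opening the housing.",
    },
    {
        "type": "task",
        "title": "Replace the filter",
        "text": "1. Open the housing. 2. Remove the old filter. 3. Insert the new filter.",
    },
]

# What a document-style export looks like to the splitter: one flat string.
flat_text = " ".join(t["text"] for t in TOPICS)

arbitrary = fixed_size_chunks(flat_text, 60)
by_topic = topic_chunks(TOPICS)

# True if any fixed-size chunk still contains the whole three-step procedure.
procedure_intact = any(TOPICS[1]["text"] in chunk for chunk in arbitrary)
```

Here the first fixed-size chunk mixes the warning with the opening of the procedure, and no fixed-size chunk holds the procedure intact. The topic-level chunks each contain exactly one authored unit along with its type.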

Question 6

How does Author-it prevent unapproved content from reaching AI systems?

Accepted Answer

AION is a publishing output, not a live scrape or continuous sync. Content must go through Author-it's standard publishing workflow to reach AION output. In organisations using Author-it's Review and Approvals module, only approved content can be published. Draft, in-review, or expired content stays within the authoring environment and does not appear in AION output until it has been approved and published. This means the same governance that controls what goes into a PDF controls what goes into your AI.

Question 7

Does becoming AI-ready with Author-it require restructuring existing content?

Accepted Answer

No. Any book already in the Author-it Library can be published to AION without restructuring. A Library Administrator creates an AION publishing profile once — selecting Resolved XML format and adding the JSON conversion step — and from then on, any author can publish to AION by selecting that profile. Organisations that want to improve AI output quality can do so by improving their library design (better topic typing, richer metadata, cleaner variable usage), but this is optimisation, not a prerequisite. Existing structured content works as-is.

Question 8

Can other CCMSs produce AI-ready structured JSON output like AION?

Accepted Answer

Most CCMS platforms currently export to PDF, HTML, DITA XML, or proprietary formats. These outputs are designed for human-facing delivery and require additional processing — scrapers, parsers, or manual conversion pipelines — before they can be used in a RAG or AI workflow. AION is Author-it's dedicated AI output format: structured JSON produced directly from the publishing workflow, with metadata and hierarchy built in. No additional pipeline is required.

Question 9

What does governed AI output mean in the context of a CCMS?

Accepted Answer

Governed AI output means that the content your AI systems receive has passed through the same review, approval, and publishing controls as your other outputs. In Author-it, this means content that has not been approved does not reach AION. Content published to AION carries modification history, authorship, and content type — so answers produced from it can be audited back to a specific approved component. For regulated industries where hallucinations are a compliance event, governed AI output is the difference between an AI tool you can trust and one you cannot.
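One way an audit step might look downstream is sketched below: given the topic IDs an answer cites, trace each one back to a published chunk and flag any that cannot be traced. The `PublishedChunk` class, its fields, the sample data, and the `audit_sources` function are all hypothetical names for this illustration. The approval gate itself sits in Author-it's publishing workflow, not in code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedChunk:
    """Hypothetical published chunk; field names are illustrative only."""
    topic_id: str
    content_type: str
    last_modified: str
    author: str
    text: str


def audit_sources(cited_topic_ids, published):
    """Trace cited topic IDs back to published chunks.

    Returns (traced, unknown): the chunks found for cited IDs, and the
    cited IDs that match nothing in the published set.
    """
    index = {chunk.topic_id: chunk for chunk in published}
    traced, unknown = [], []
    for topic_id in cited_topic_ids:
        if topic_id in index:
            traced.append(index[topic_id])
        else:
            unknown.append(topic_id)
    return traced, unknown


# Made-up published set for the sketch.
PUBLISHED = [
    PublishedChunk("T-101", "warning", "2024-05-14", "J. Doe",
                   "Disconnect power before opening the housing."),
    PublishedChunk("T-102", "task", "2024-04-02", "A. Smith",
                   "Open the housing and replace the filter."),
]

traced, unknown = audit_sources(["T-101", "T-999"], PUBLISHED)
```

An ID that lands in `unknown` signals that the answer cites something outside the published, approved set, which is the condition an auditor would want surfaced.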

Dimension	Author-it + AION	Document-centric approach
Category	Enterprise CCMS + AI Content Foundation	HATs, wiki tools, Word-based or legacy CCMS workflows
AI output format	AION - structured JSON for LLMs and RAG pipelines, included	PDF, HTML, DOCX - formats built for human readers
Content type tagging	Every topic typed: Warning, Task, Concept, Procedure, and any custom type	None - type information is lost on export
Metadata per chunk	Type, author, last-modified, folder path, topic ID, variable values - on every topic	Filename at best - all provenance stripped on export
Content hierarchy	Preserved - books, sub-books, and topics in published order	Lost - the AI cannot tell which section belongs to which procedure
Chunk boundaries	Component-level - one topic, one chunk, one unit of meaning	Arbitrary - page breaks, paragraph limits, character counts
Publishing governance	Unapproved content cannot reach AION output	No gate - drafts and expired content can reach the AI
Source freshness	Single source - republish and the AI has the current, governed version	Snapshot at export - stale until manually refreshed
Implementation effort	One publishing profile, configured once by a Library Administrator	Scraper, parser, or ingestion pipeline - requires ongoing maintenance
AI output cost	Included with Author-it Cloud - no additional products required	Separate tooling or custom pipeline required
Structured authoring	Yes - component and topic-based, no DITA or XML required	Varies - often document-level or unstructured
Regulated industry fit	Manufacturing, utilities, medical devices, aerospace	Varies - governed AI output not standard

Most CCMSs publish documents. Author-it publishes data your AI can use.

A PDF export is not an AI content strategy.

What your AI actually receives

Five differences that change what your AI can do

WHAT AI NEEDS

Documents end where AI needs to begin.

DATA CHUNKING

Arbitrary chunks produce unpredictable answers.

METADATA

The AI is working blind without metadata.

AI CONTENT GOVERNANCE

Your AI should only know what's been approved.

SINGLE SOURCE

Stale content in. Stale answers out.

This is what AI-ready actually looks like.

AI - Traditional CCMS vs Author-it FAQ
