| id | skill-docx-workbench |
| name | DOCX Workbench — Redlining & Automation |
| description | Create, review, and redline .docx documents with structured workflows, OOXML tooling, and tracked-change best practices for Cortex-OS projects. |
| version | 1.0.0 |
| author | brAInwav Documentation Guild |
| owner | @jamiescottcraik |
| category | documentation |
| difficulty | advanced |
| tags | docx, redlining, ooxml, document, automation |
| estimatedTokens | 5400 |
| license | Complete terms in LICENSE.txt |
| requiredTools | python, node, pandoc, unzip |
| prerequisites | Read `resources/ooxml.md` and `resources/docx-js.md` in full, Install Pandoc and Python 3.11+ with lxml, Access to input/output `.docx` files |
| relatedSkills | skill-tdd-red-green-refactor, skill-testing-evidence-triplet, skill-creator |
| resources | ./resources/docx-js.md, ./resources/ooxml.md, ./resources/LICENSE.txt, ./resources/scripts/ooxml/pack.py, ./resources/scripts/ooxml/unpack.py, ./resources/scripts/ooxml/validate.py, ./resources/scripts/ooxml/validation/__init__.py, ./resources/scripts/ooxml/validation/base.py, ./resources/scripts/ooxml/validation/docx.py, ./resources/scripts/ooxml/validation/redlining.py, ./resources/scripts/ooxml/validation/pptx.py, ./resources/scripts/docx/__init__.py, ./resources/scripts/docx/document.py, ./resources/scripts/docx/utilities.py, ./resources/scripts/docx/templates/comments.xml, ./resources/scripts/docx/templates/commentsExtensible.xml, ./resources/scripts/docx/templates/commentsExtended.xml, ./resources/scripts/docx/templates/commentsIds.xml, ./resources/scripts/docx/templates/people.xml |
| deprecated | false |
| replacedBy | null |
| impl | packages/doc-tools/src/docx_workbench.ts#runDocxWorkflow |
| preconditions | Governance pack reviewed (skills, RULES_OF_AI, testing standards)., All source documents scanned for confidentiality requirements and approvals logged., North-star acceptance test defined (e.g., specific clauses redlined). |
| sideEffects | Creates temporary unpacked directories for OOXML manipulation., Produces markdown diffs and validation reports stored with task artefacts. |
| estimatedCost | $0.005 / document workflow (~1000 tokens across planning + validation). |
| calls | skill-creator, skill-testing-evidence-triplet |
| requiresContext | memory://skills/skill-docx-workbench/historical-runs |
| providesContext | memory://skills/skill-docx-workbench/latest-artifact |
| monitoring | true |
| estimatedDuration | PT60M |
DOCX Workbench — Redlining & Automation
When to Use
- Drafting or updating complex Word documents (SOWs, contracts, policies) that demand tracked changes or comments.
- Performing legal/enterprise reviews where redline hygiene, RSIDs, and audit trails matter.
- Converting documents to markdown for analysis or summarisation while preserving structure.
- Diagnosing formatting issues, comment metadata, or embedded media within `.docx` packages.
How to Apply
- Confirm approvals, cleanse the document of secrets, and duplicate it into the task workspace.
- Choose the workflow (`create`, `analyze`, `redline`, `comment`) and follow the decision tree referenced in `resources/ooxml.md`.
- For new documents, script with `docx-js` (see `resources/docx-js.md`) and export via `Packer.toBuffer()`.
- For edits/redlines, unpack with `python resources/scripts/ooxml/unpack.py` and implement batches using `scripts/docx/document.py` helpers.
- Pack the document, validate (`resources/scripts/ooxml/validate.py`), convert to markdown for verification, and archive evidence/Local Memory entries.
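The unpack and pack steps above rely on a `.docx` file being a ZIP archive of XML parts. The sketch below shows that round trip with the standard library only. It is a stand-in for illustration, not the bundled `unpack.py`/`pack.py` (their options are not described here), and the function names `unpack_docx` and `pack_docx` are hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_docx(docx_path: Path, out_dir: Path) -> list[str]:
    """Extract every part of the package and return the part names."""
    with zipfile.ZipFile(docx_path) as zf:
        zf.extractall(out_dir)
        return zf.namelist()


def pack_docx(src_dir: Path, docx_path: Path) -> None:
    """Zip an unpacked directory back into a .docx package."""
    with zipfile.ZipFile(docx_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                # Archive names are relative to the package root, with "/" separators.
                zf.write(path, path.relative_to(src_dir).as_posix())
```

A round trip like this only proves the container is intact; schema and redline checks remain the job of `validate.py`.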
Success Criteria
- RSIDs respected, tracked changes precise (only changed text wrapped) with `w:ins`/`w:del` patterns.
- Validation script passes with no outstanding redline or OOXML errors.
- Markdown diff confirms all requested changes and no unintended modifications.
- Output document includes comments/metadata as specified, with artefacts stored for review.
- Local Memory entry logged (`skillUsed: "skill-docx-workbench"`, effectiveness ≥0.8) plus evidence bundle path.
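To make "only changed text wrapped" concrete, here is a minimal sketch that keeps the common prefix and suffix of a phrase as plain runs and wraps just the differing middle in `w:del`/`w:ins`. The helper `tracked_replace` is hypothetical and is not the `Document` helper from `scripts/docx/document.py`. It assumes a single run, omits run formatting (`w:rPr`) and RSID attributes, and leaves unique `w:id` allocation to the caller.

```python
from xml.sax.saxutils import escape


def _run(tag: str, text: str) -> str:
    """Wrap text in a run; xml:space keeps leading/trailing blanks."""
    return f'<w:r><{tag} xml:space="preserve">{escape(text)}</{tag}></w:r>'


def tracked_replace(old: str, new: str, author: str, date: str, first_id: int = 1) -> str:
    """Return run-level XML that marks only the changed middle of old -> new."""
    # Longest common prefix.
    p = 0
    while p < min(len(old), len(new)) and old[p] == new[p]:
        p += 1
    # Longest common suffix that does not overlap the prefix.
    s = 0
    while s < min(len(old), len(new)) - p and old[-1 - s] == new[-1 - s]:
        s += 1
    deleted = old[p:len(old) - s]
    inserted = new[p:len(new) - s]
    author_attr = escape(author, {'"': "&quot;"})
    meta = f'w:author="{author_attr}" w:date="{date}"'
    parts = []
    if p:
        parts.append(_run("w:t", old[:p]))
    if deleted:
        parts.append(f'<w:del w:id="{first_id}" {meta}>{_run("w:delText", deleted)}</w:del>')
    if inserted:
        parts.append(f'<w:ins w:id="{first_id + 1}" {meta}>{_run("w:t", inserted)}</w:ins>')
    if s:
        parts.append(_run("w:t", old[len(old) - s:]))
    return "".join(parts)
```

For example, `tracked_replace("30 days", "60 days", "Reviewer", "2025-01-01T00:00:00Z")` marks only `3` as deleted and `6` as inserted, leaving `0 days` as an unchanged run.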
0) Mission Snapshot — What / Why / Where / How / Result
- What: Provide a reliable, repeatable workflow for creating, redlining, and validating `.docx` documents.
- Why: Legal and enterprise deliverables demand meticulous tracked changes and reproducible audits.
- Where: Used across Cortex-OS engagements involving Word documents (contracts, board packs, policy updates).
- How: Combine scripted tooling (docx-js, Document library) with pandoc conversions and validation scripts.
- Result: Signed-off `.docx` artefact with traceable evidence and zero governance blockers.
1) Contract — Inputs → Outputs
Inputs: original .docx (or API-supplied source), workflow instructions, reviewer notes. Outputs: updated .docx, markdown diff, validation logs, Local Memory references. Maintain directories under the task folder (documents/, logs/, diffs/).
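A small sketch of the folder contract above: it creates `documents/`, `logs/`, and `diffs/` under the task folder and copies the original into `documents/` so edits never touch the source file. The function name `prepare_workspace` is hypothetical.

```python
import shutil
from pathlib import Path

SUBDIRS = ("documents", "logs", "diffs")  # folder names taken from the contract above


def prepare_workspace(task_dir: Path, original: Path) -> Path:
    """Create the task subfolders and return the path of a working copy."""
    for name in SUBDIRS:
        (task_dir / name).mkdir(parents=True, exist_ok=True)
    working_copy = task_dir / "documents" / original.name
    shutil.copy2(original, working_copy)  # copy2 also preserves timestamps
    return working_copy
```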
2) Preconditions & Safeguards
- Validate document classification; ensure secure handling and approvals before editing.
- Install dependencies (Pandoc, Python packages), running `python -m pip install -r requirements` if needed.
- Back up originals; perform work in copies within the task directory.
- Confirm timezone/locale context for date updates to avoid ambiguous replacements.
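The locale check matters because the same date string can name two different days. A short illustration, using an example date chosen here:

```python
from datetime import datetime

raw = "03/04/2025"

# The same text parses to two different dates depending on the convention.
day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # 3 April 2025
month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # 4 March 2025

# Writing the agreed date in ISO 8601 while planning edits removes the ambiguity.
print(day_first.isoformat(), month_first.isoformat())
```

Confirming the convention with the reviewer before any date replacement avoids silently shifting a deadline by a month.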
3) Implementation Playbook (RED→GREEN→REFACTOR)
- Reconnaissance (RED): Convert to markdown (`pandoc --track-changes=all`), list required edits, group into batches of 3–10.
- Execution (GREEN): Unpack with `unpack.py`, use `document.py` utilities to apply changes, regenerating document after each batch. Record RSIDs suggested by the unpack script.
- Refinement (REFACTOR): Re-pack, run `validate.py`, compare markdown diff, and rerun until validation clean. Capture logs and update documentation.
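The batching step in the playbook can be sketched as a simple chunking helper. The name `make_batches` and the default size of 5 are assumptions; only the 3–10 range comes from the text above.

```python
def make_batches(edits: list[str], size: int = 5) -> list[list[str]]:
    """Split an ordered edit list into consecutive batches of `size` items."""
    if not 3 <= size <= 10:
        raise ValueError("batch size should stay within 3-10")
    return [edits[i:i + size] for i in range(0, len(edits), size)]
```

The final batch may hold fewer than three edits when the list does not divide evenly. Applying and validating one batch at a time keeps a failed validation attributable to a handful of edits.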
4) Observability & Telemetry Hooks
- Log each batch application with `[brAInwav]` prefix including RSID and batch identifiers.
- Store validation output and markdown diffs under `logs/docx-workbench/` and `diffs/` for PR attachments.
- Track metrics (number of insertions/deletions) to monitor scope creep.
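One possible shape for these hooks: count `w:ins` and `w:del` elements in `word/document.xml` and emit a log line with the `[brAInwav]` prefix. The logger name, the function names, and the log field layout are assumptions for illustration.

```python
import logging
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
log = logging.getLogger("docx-workbench")  # hypothetical logger name


def count_tracked_changes(document_xml: str) -> dict[str, int]:
    """Count w:ins and w:del elements anywhere in the document XML."""
    root = ET.fromstring(document_xml)
    return {
        "insertions": len(list(root.iter(f"{{{W}}}ins"))),
        "deletions": len(list(root.iter(f"{{{W}}}del"))),
    }


def log_batch(batch_id: str, rsid: str, document_xml: str) -> dict[str, int]:
    """Log one batch application with its RSID and change counts."""
    counts = count_tracked_changes(document_xml)
    log.info(
        "[brAInwav] batch=%s rsid=%s insertions=%d deletions=%d",
        batch_id, rsid, counts["insertions"], counts["deletions"],
    )
    return counts
```

Comparing the counts against the planned edit list after each batch is a cheap way to notice scope creep early.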
5) Safety, Compliance & Governance
- Never fabricate reviewer metadata or RSIDs; rely on scripts for accurate insertion order.
- Remove or obfuscate confidential content when sharing diffs or logs.
- Follow RULES_OF_AI for branding/logging and ensure accessibility of generated markdown.
- Document waivers if any validation warnings persist (rare; escalate immediately).
6) Success Criteria & Acceptance Tests
- `validate.py` returns success; no unresolved warnings.
- Markdown diff shows all requested changes and no extra modifications.
- Document opens in Word/LibreOffice with tracked changes visible and comments intact.
- Evidence Triplet stored: failing validation log (pre-fix), passing validation log, proof of reviewer acceptance.
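The markdown-diff criterion can be checked mechanically. This sketch uses the standard library's `difflib` on two markdown strings (for instance, the pandoc output before and after editing); the function names and the `before.md`/`after.md` labels are illustrative.

```python
import difflib


def markdown_diff(before: str, after: str) -> list[str]:
    """Unified diff of two markdown texts, one list item per diff line."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before.md", tofile="after.md", lineterm="",
    ))


def changed_lines(diff: list[str]) -> list[str]:
    """Added/removed lines only, skipping the two file header lines."""
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

If `changed_lines` contains anything that is not on the requested edit list, the run has introduced an unintended modification.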
7) Failure Modes & Recovery
- Validation failures: Inspect `validation.log`, adjust XML, re-run pack/validate.
- Broken RSIDs: Re-run unpack to retrieve suggested RSID and reapply changes preserving original runs.
- Lost comments/media: Ensure `word/comments*.xml` and media directories retained; copy from original if missing.
- Pandoc mismatch: Use `--track-changes=all` and verify encoding; fallback to raw XML comparison.
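For the lost comments/media case, a quick check is to compare part names between the original and the output package. The sketch below assumes media lives under `word/media/`, the conventional location; `missing_parts` is a hypothetical helper name.

```python
import fnmatch
import zipfile

# Parts that must survive an edit; word/media/ is the conventional media folder.
PATTERNS = ("word/comments*.xml", "word/media/*")


def missing_parts(original_docx, output_docx) -> list[str]:
    """Return watched parts present in the original but absent from the output."""
    with zipfile.ZipFile(original_docx) as a, zipfile.ZipFile(output_docx) as b:
        before, after = set(a.namelist()), set(b.namelist())
    watched = {
        name for name in before
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in PATTERNS)
    }
    return sorted(watched - after)
```

An empty result means every watched part is still present. It does not confirm the parts are still referenced correctly, which remains a validation concern.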
8) Worked Examples & Snippets
- Sample docx-js usage (see `resources/docx-js.md`) for creating clauses.
- Python snippet using `Document` helper to insert tracked changes preserving RSIDs.
- Bash pipeline converting docx → markdown → diff for QC.
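As a rough stand-in for the conversion half of the QC pipeline, written in Python rather than Bash, the sketch below builds and runs the pandoc call with the `--track-changes=all` option named in this skill. The function names are hypothetical, and it assumes `pandoc` is on `PATH` and that the output path ends in `.md` so pandoc infers markdown output.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(docx: Path, markdown: Path) -> list[str]:
    """Build the pandoc invocation that keeps tracked changes in the output."""
    return ["pandoc", "--track-changes=all", str(docx), "-o", str(markdown)]


def convert(docx: Path, markdown: Path) -> None:
    """Run pandoc, failing loudly if it is missing or exits non-zero."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(docx, markdown), check=True)
```

Running `convert` on both the original and the edited document yields two markdown files that can then be diffed for QC.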
9) Memory & Knowledge Integration
- Store Local Memory entries capturing change summary, reviewer notes, and validation status.
- Link to related contract or policy memories via `relationship_type_enum: "references"`.
- Reference memory IDs in PRs and governance manifests.
10) Lifecycle & Versioning Notes
- Update scripts when OOXML schema guidance changes; record version in SKILL changelog.
- Archive previous versions under `resources/changelog/` if required by compliance.
- Sunset or split the skill if docx workflows diverge significantly (e.g., academic vs legal).
11) References & Evidence
- `resources/docx-js.md`: scripting guide for generating documents.
- `resources/ooxml.md`: deep OOXML and redlining reference.
- `resources/scripts/`: pack/unpack/validate utilities and Document helper library.
- Task artefacts: markdown diffs, validation logs, reviewer sign-off.
12) Schema Gap Checklist
- Publish MCP wrappers around pack/unpack/validate to enable remote automation.
- Extend validator to ensure RSID suggestions consumed per batch.
- Add coverage report verifying pandoc conversions across templates.