| name | ppt-master |
| description | AI-driven multi-format SVG content generation system. Converts source documents (PDF/DOCX/URL/Markdown) into high-quality SVG pages and exports to PPTX through multi-role collaboration. Use when user asks to "create PPT", "make presentation", "生成PPT", "做PPT", "制作演示文稿", or mentions "ppt-master". |
PPT Master Skill
AI-driven multi-format SVG content generation system. Converts source documents into high-quality SVG pages through multi-role collaboration and exports to PPTX.
Core Pipeline: Source Document → Create Project → [Template] → Strategist → [Image_Generator] → Executor Live Preview → Quality Check → Post-processing → Export
[!CAUTION]
🚨 Global Execution Discipline (MANDATORY)
This workflow is a strict serial pipeline. The following rules have the highest priority — violating any one of them constitutes execution failure:
- SERIAL EXECUTION — Steps MUST be executed in order; the output of each step is the input for the next. Non-BLOCKING adjacent steps may proceed continuously once prerequisites are met, without waiting for the user to say "continue"
- BLOCKING = HARD STOP — Steps marked ⛔ BLOCKING require a full stop; the AI MUST wait for an explicit user response before proceeding and MUST NOT make any decisions on behalf of the user
- NO CROSS-PHASE BUNDLING — Cross-phase bundling is FORBIDDEN. (Note: the Eight Confirmations in Step 4 are ⛔ BLOCKING — the AI MUST present recommendations and wait for explicit user confirmation before proceeding. Once the user confirms, all subsequent non-BLOCKING steps — design spec output, SVG generation, speaker notes, and post-processing — may proceed automatically without further user confirmation)
- GATE BEFORE ENTRY — Each Step has prerequisites (🚧 GATE) listed at the top; these MUST be verified before starting that Step
- NO SPECULATIVE EXECUTION — "Pre-preparing" content for subsequent Steps is FORBIDDEN (e.g., writing SVG code during the Strategist phase)
- NO SUB-AGENT SVG GENERATION — Executor Step 6 SVG generation is context-dependent and MUST be completed by the current main agent end-to-end. Delegating page SVG generation to sub-agents is FORBIDDEN
- SEQUENTIAL PAGE GENERATION ONLY — In Executor Step 6, after the global design context is confirmed, SVG pages MUST be generated sequentially page by page in one continuous pass. Grouped page batches (for example, 5 pages at a time) are FORBIDDEN
- SPEC_LOCK RE-READ PER PAGE — Before generating each SVG page, Executor MUST `read_file <project_path>/spec_lock.md`. All colors / fonts / icons / images MUST come from this file — no values from memory or invented on the fly. Executor MUST also look up the current page's `page_rhythm` (`anchor` / `dense` / `breathing`), `page_layouts` (which template SVG to inherit, if any), and `page_charts` (which chart template to adapt, if any). Empty / absent entries are intentional Strategist signals — see executor-base.md §2.1. This rule exists to resist context-compression drift on long decks and to break the uniform "every page is a card grid" default
- SVG MUST BE HAND-WRITTEN, NOT SCRIPT-GENERATED — Every SVG page is written by the main agent directly, one page at a time (see the NO SUB-AGENT SVG GENERATION and SEQUENTIAL PAGE GENERATION ONLY rules above). Writing or running a Python / Node / shell script that produces the SVG files in batch — looping over pages, templating from data, or emitting them via a generator — is FORBIDDEN, including under "save tokens", "quick draft", or "user is in a hurry" pretexts. The script-generation path was tried on a feature branch and abandoned: cross-page visual consistency depends on per-page authoring with full upstream context, which a generator script cannot reproduce
[!IMPORTANT]
🌐 Language & Communication Rule
- Response language: match the user's input and source materials. Explicit user override (e.g., "请用英文回答") takes precedence.
- Template format: `design_spec.md` MUST follow its original English template structure (section headings, field names) regardless of conversation language. Content values may be in the user's language.
[!IMPORTANT]
🔌 Compatibility With Generic Coding Skills
- `ppt-master` is a repository-specific workflow, not a general application scaffold
- Do NOT create `.worktrees/`, `tests/`, branch workflows, or generic engineering structure by default
- On conflict with a generic coding skill, follow this skill unless the user explicitly says otherwise
Main Pipeline Scripts
| Script | Purpose |
|---|---|
| ${SKILL_DIR}/scripts/source_to_md/pdf_to_md.py | PDF to Markdown |
| ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py | Documents to Markdown — native Python for DOCX/HTML/EPUB/IPYNB, pandoc fallback for legacy formats (.doc/.odt/.rtf/.tex/.rst/.org/.typ) |
| ${SKILL_DIR}/scripts/source_to_md/excel_to_md.py | Excel workbooks to Markdown — supports .xlsx/.xlsm; legacy .xls should be resaved as .xlsx |
| ${SKILL_DIR}/scripts/source_to_md/ppt_to_md.py | PowerPoint to Markdown |
| ${SKILL_DIR}/scripts/source_to_md/web_to_md.py | Web page to Markdown (supports WeChat via curl_cffi) |
| ${SKILL_DIR}/scripts/project_manager.py | Project init / validate / manage |
| ${SKILL_DIR}/scripts/analyze_images.py | Image analysis |
| ${SKILL_DIR}/scripts/latex_render.py | LaTeX formula rendering (manifest-driven PNG assets) |
| ${SKILL_DIR}/scripts/image_gen.py | AI image generation (multi-provider) |
| ${SKILL_DIR}/scripts/svg_quality_checker.py | SVG quality check |
| ${SKILL_DIR}/scripts/total_md_split.py | Speaker notes splitting |
| ${SKILL_DIR}/scripts/finalize_svg.py | SVG post-processing (unified entry) |
| ${SKILL_DIR}/scripts/svg_to_pptx.py | Export to PPTX |
| ${SKILL_DIR}/scripts/update_spec.py | Propagate a spec_lock.md color / font_family change across all generated SVGs |
For complete tool documentation, see ${SKILL_DIR}/scripts/README.md.
Template Index
| Index | Path | Purpose |
|---|---|---|
| Layout templates | ${SKILL_DIR}/templates/layouts/layouts_index.json | Query available page layout templates |
| Brand presets | ${SKILL_DIR}/templates/brands/brands_index.json | Query available brand identity presets (color / typography / logo / voice) |
| Visualization templates | ${SKILL_DIR}/templates/charts/charts_index.json | Query available visualization SVG templates (charts, infographics, diagrams, frameworks) |
| Icon library | ${SKILL_DIR}/templates/icons/ | See ${SKILL_DIR}/templates/icons/README.md; search icons on demand with ls templates/icons/<library>/ \| grep <keyword> |
Standalone Workflows
| Workflow | Path | Purpose |
|---|---|---|
| topic-research | workflows/topic-research.md | Pre-pipeline — gather web sources when the user supplies only a topic with no source files |
| create-template | workflows/create-template.md | Standalone layout template creation workflow |
| create-brand | workflows/create-brand.md | Standalone brand-only template creation (identity preset; no SVG page roster) |
| resume-execute | workflows/resume-execute.md | Phase B entry — resume execution in a fresh chat after Phase A (Step 1–5) completed in another session (split mode) |
| verify-charts | workflows/verify-charts.md | Chart coordinate calibration — run after SVG generation if the deck contains data charts |
| customize-animations | workflows/customize-animations.md | Object-level PPTX animation customization — run only when the user explicitly asks to tune animation order/effects/timing |
| live-preview | workflows/live-preview.md | Browser-based live preview — auto-started during generation and re-enterable any time the user mentions "live preview", "preview", "看效果", or wants to click/select a slide element |
| visual-review | workflows/visual-review.md | Per-page rubric-based visual self-check — run only when the user explicitly asks for a visual re-pass on the generated SVGs (between Executor and post-processing). Opt-in only; never invoked by the main pipeline. |
Workflow
Step 1: Source Content Processing
🚧 GATE: User has provided source material (PDF / DOCX / EPUB / URL / Markdown file / text description / conversation content — any form is acceptable).
No source content? When the user supplies only a topic name or requirements without any file or substantive description, run the `topic-research` workflow first, then return here with its products as input.
When the user provides non-Markdown content, convert immediately:
| User Provides | Command |
|---|---|
| PDF file | python3 ${SKILL_DIR}/scripts/source_to_md/pdf_to_md.py <file> |
| DOCX / Word / Office document | python3 ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py <file> |
| XLSX / XLSM / Excel workbook | python3 ${SKILL_DIR}/scripts/source_to_md/excel_to_md.py <file> |
| CSV / TSV | Read directly as plain-text table source |
| PPTX / PowerPoint deck | python3 ${SKILL_DIR}/scripts/source_to_md/ppt_to_md.py <file> |
| EPUB / HTML / LaTeX / RST / other | python3 ${SKILL_DIR}/scripts/source_to_md/doc_to_md.py <file> |
| Web link | python3 ${SKILL_DIR}/scripts/source_to_md/web_to_md.py <URL> |
| WeChat / high-security site | python3 ${SKILL_DIR}/scripts/source_to_md/web_to_md.py <URL> (requires curl_cffi, included in requirements.txt) |
| Markdown | Read directly |
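The conversion table above can be read as a simple dispatch on the source's extension. The following is a minimal sketch under that reading, not part of the skill's scripts: `pick_converter`, `SCRIPT_BY_SUFFIX`, and `READ_DIRECTLY` are hypothetical names, and the extension list covers only the formats the table mentions.

```python
from pathlib import Path
from typing import Optional

# Extension -> converter script, transcribed from the table above.
SCRIPT_BY_SUFFIX = {
    ".pdf": "pdf_to_md.py",
    ".docx": "doc_to_md.py",
    ".xlsx": "excel_to_md.py",
    ".xlsm": "excel_to_md.py",
    ".pptx": "ppt_to_md.py",
}

# Sources the table says to read as-is.
READ_DIRECTLY = {".md", ".markdown", ".csv", ".tsv"}


def pick_converter(source: str) -> Optional[str]:
    """Return the converter script name, or None when the source is read directly."""
    if source.lower().startswith(("http://", "https://")):
        return "web_to_md.py"
    suffix = Path(source).suffix.lower()
    if suffix in READ_DIRECTLY:
        return None
    if suffix == ".xls":
        # The script table asks for legacy .xls to be resaved first.
        raise ValueError("legacy .xls: resave as .xlsx before converting")
    # EPUB / HTML / LaTeX / RST / other all go through doc_to_md.py.
    return SCRIPT_BY_SUFFIX.get(suffix, "doc_to_md.py")
```

For example, `pick_converter("slides.pptx")` selects `ppt_to_md.py`, while a `.csv` source yields `None` because the table says to read it directly.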
Office vector assets (EMF/WMF) from DOCX/PPTX sources:
`doc_to_md.py` / `ppt_to_md.py` extract embedded Office vector images (`.emf` / `.wmf`) alongside bitmap images. After `import-sources`, these land in `images/` together with `image_manifest.json` and are first-class assets in §VIII Image Resource List.
Do NOT convert EMF/WMF to PNG. The PPT Master pipeline preserves them as external references (`finalize_svg.py` skips them) and `svg_to_pptx.py` embeds them as PPTX-native media via `image/x-emf` / `image/x-wmf` MIME — PowerPoint renders them at full vector fidelity. Converting via LibreOffice/Inkscape introduces CJK font substitution drift and rasterization loss; the original EMF/WMF is always higher fidelity than the converted PNG.
Browser-based live preview cannot render EMF (will show blank) — this is expected; the PPTX output is the source of truth.
✅ Checkpoint — Confirm source content is ready, proceed to Step 2.
Step 2: Project Initialization
🚧 GATE: Step 1 complete; source content is ready (Markdown file, user-provided text, or requirements described in conversation are all valid).
python3 ${SKILL_DIR}/scripts/project_manager.py init <project_name> --format <format>
Format options: ppt169 (default), ppt43, xhs, story, etc. For the full format list, see references/canvas-formats.md.
Import source content (choose based on the situation):
| Situation | Action |
|---|---|
| Has source files (PDF/MD/etc.) | python3 ${SKILL_DIR}/scripts/project_manager.py import-sources <project_path> <source_files...> --move |
| User provided text directly in conversation | No import needed — content is already in conversation context; subsequent steps can reference it directly |
⚠️ MUST use `--move` (not copy): all source files — Step 1's generated Markdown, original PDFs / MDs / images — go into `sources/` via `import-sources --move`. After execution they no longer exist at the original location. Intermediate artifacts (e.g., `_files/`) are handled automatically.
✅ Checkpoint — Confirm project structure created successfully, sources/ contains all source files, converted materials are ready. Proceed to Step 3.
Step 3: Template Option
🚧 GATE: Step 2 complete; project directory structure is ready.
Default — free design. Proceed directly to Step 4. Do NOT query any *_index.json unless triggered. Do NOT ask the user. Do NOT proactively suggest, hint at, or fuzzy-match any template based on content, slug-like words, or vague style descriptions.
Template flow triggers ONLY on explicit directory paths supplied by the user in their initial message. The trigger rule is mechanical, not interpretive:
| User input contains | Step 3 action |
|---|---|
| One or more explicit template directory paths (each resolves to a directory containing design_spec.md with kind: brand / kind: layout / kind: deck in its YAML frontmatter) | Read each spec's kind, dispatch per the kind matrix below, fuse if multiple |
| Anything else — bare template names ("用 academic_defense"), style descriptions ("麦肯锡风格"), brand mentions ("招商银行风格"), vague intent ("想用个模板"), or silence | Skip Step 3, free design |
There is no slug matching, no name lookup, no fuzzy resolution. A name without a path does not trigger — the user must give a path the AI can cd into.
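Because the trigger is mechanical, it can be pictured as a single filesystem check. The sketch below is an illustration only: `template_kind` is a hypothetical helper, and it assumes the frontmatter is a leading `---` block with a plain `kind: <value>` line, which is the only detail the rule above relies on.

```python
from pathlib import Path
from typing import Optional

VALID_KINDS = {"brand", "layout", "deck"}


def template_kind(path: str) -> Optional[str]:
    """Return the spec's kind if `path` is an explicit template directory, else None."""
    spec = Path(path) / "design_spec.md"
    if not spec.is_file():
        return None  # bare names and style descriptions never resolve
    lines = spec.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no YAML frontmatter at the top of the spec
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter reached without a kind field
        key, _, value = line.partition(":")
        if key.strip() == "kind":
            kind = value.strip().strip("'\"")
            return kind if kind in VALID_KINDS else None
    return None
```

A `None` result corresponds to the "anything else" row: Step 3 is skipped and the pipeline continues with free design.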
Style descriptions ("麦肯锡风格" / "Keynote 风" / "极简风" / etc.) never trigger Step 3. They flow into Strategist's Eight Confirmations as a style brief (color / typography / tone in confirmations e–g).
Bare names ("academic_defense", "招商银行", "anthropic") do NOT trigger Step 3 even if a matching directory exists in the library. The user must give a path. AI must not "helpfully" resolve a name to a path.
"What templates exist?" is out-of-band Q&A — answer by listing entries from `brands_index.json` / `layouts_index.json` / `decks_index.json` together with their paths. Listing alone does not advance the pipeline; the user must send a path back to trigger Step 3.
To create a new layout or deck, read `workflows/create-template.md`. To create a new brand, read `workflows/create-brand.md`.
Three template kinds
The architecture has three independent reference bundles. Full schema in `docs/zh/templates-architecture.md`. Summary:
| Kind | Physical dir | Contains | Frontmatter |
|---|---|---|---|
| brand | templates/brands/<id>/ | identity-only segment: color / typography / logo / voice / icon style | kind: brand |
| layout | templates/layouts/<id>/ | structure-only segment: canvas / page structure / page types / SVG roster | kind: layout |
| deck | templates/decks/<id>/ | full replica: identity + structure + middle (template overview) segments | kind: deck |
Segment ownership (governs fusion override priority):
| Segment | Sections | Owner kind on fusion |
|---|---|---|
| Identity | Color Scheme / Typography / Logo / Voice & Tone / Icon Style | brand |
| Structure | Canvas / Page Structure / Page Types / SVG Roster | layout |
| Middle | Template Overview (use cases / design intent) | deck (no other kind writes this) |
Single-path dispatch
| User path's kind | Step 3 action |
|---|---|
| kind: brand | Copy design_spec.md + logo files + asset subdirs (images/ / illustrations/ / icons/) into <project>/templates/. Strategist locks identity segment as truth; structure stays free. |
| kind: layout | Copy design_spec.md + SVG roster + asset files into <project>/templates/. Strategist locks structure; identity decided in Eight Confirmations e–g. |
| kind: deck | Copy everything (design_spec.md + SVGs + logos + assets) into <project>/templates/. Strategist locks all segments; Eight Confirmations narrows to deck-content fields (audience / page count / outline / tone tweaks). |
TEMPLATE_DIR=<user-supplied path>
cp -r ${TEMPLATE_DIR}/* <project_path>/templates/
The single-line copy suffices for all three kinds — the spec's kind field tells Strategist how to read it; downstream code doesn't distinguish.
Multi-path fusion
When the user gives two or more paths of different kinds, Step 3 fuses them into a single <project>/templates/design_spec.md. Default granularity is whole-segment replacement: each identity / structure / middle segment is taken in its entirety from the highest-priority source for that segment, with no implicit field-level mixing.
Override priority by segment:
| Combination | Identity from | Structure from | Middle from |
|---|---|---|---|
| brand only | brand | (free design) | (none) |
| layout only | (free design) | layout | (none) |
| deck only | deck | deck | deck |
| brand + layout | brand | layout | (none) |
| brand + deck | brand (overrides deck) | deck | deck |
| layout + deck | deck | layout (overrides deck) | deck |
| brand + layout + deck | brand | layout | deck |
Field-level micro-adjustment (e.g. "use anthropic brand but primary changed to #FF0000") is not part of Step 3 fusion — it flows into Strategist Eight Confirmations e–g as a normal user request.
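The override table reduces to one rule: each segment comes from its owner kind when that kind is present, otherwise from the deck if there is one, otherwise it stays free. A minimal sketch of that rule follows; `fuse_sources` and `SEGMENT_OWNER` are hypothetical names used only for illustration, and `None` stands for the table's "(free design)" / "(none)" cells.

```python
from typing import Dict, Iterable, Optional

# Owner kind per segment, from the segment-ownership table.
SEGMENT_OWNER = {"identity": "brand", "structure": "layout", "middle": "deck"}


def fuse_sources(kinds: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each segment to the kind that supplies it (None = free design / none)."""
    present = set(kinds)
    unknown = present - set(SEGMENT_OWNER.values())
    if unknown:
        raise ValueError(f"unknown template kinds: {sorted(unknown)}")
    result: Dict[str, Optional[str]] = {}
    for segment, owner in SEGMENT_OWNER.items():
        if owner in present:
            result[segment] = owner   # the owner kind always wins its segment
        elif "deck" in present:
            result[segment] = "deck"  # a deck fills any segment nobody else owns
        else:
            result[segment] = None    # left to free design, or absent
    return result
```

For instance, `fuse_sources(["layout", "deck"])` gives identity and middle from the deck and structure from the layout, matching the "layout + deck" row.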
Same-kind multiple paths — conflict resolution
When the user gives two paths of the same kind (e.g. brands/anthropic + brands/google), Step 3 surfaces a conflict prompt before fusing — like resolving a git merge conflict:
AI: You supplied two brands; segment-level conflicts detected:
- Color Scheme (Anthropic orange-red vs. Google multicolor)
- Typography (Styrene/AnthropicSans vs. GoogleSans/Roboto)
- Logo (Anthropic mark vs. Google mark)
- Voice & Tone (restrained vs. friendly)
- Icon Style (stroke vs. filled)
Choose: (a) all Anthropic / (b) all Google / (c) pick segment by segment?
Rules:
- Default: no implicit ordering — every cross-source segment difference is reported as a conflict
- Only when the user picks (c) does AI walk through each segment one by one
- Field-level conflicts are out of scope — segment-level only
- Three or more same-kind paths are not supported — ask the user to converge to at most two
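The path-count side of these rules can be sketched as a small classifier. This is an illustrative assumption about how one might check the input before fusing; `same_kind_status` and its status labels are hypothetical and do not appear in the skill.

```python
from collections import Counter
from typing import Dict, List


def same_kind_status(kinds: List[str]) -> Dict[str, str]:
    """Classify each kind by how many paths of that kind the user supplied."""
    status: Dict[str, str] = {}
    for kind, count in Counter(kinds).items():
        if count == 1:
            status[kind] = "ok"               # no same-kind conflict
        elif count == 2:
            status[kind] = "conflict-prompt"  # surface the (a)/(b)/(c) prompt
        else:
            status[kind] = "unsupported"      # ask the user to converge to two
    return status
```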
Fused spec provenance
When fusion happens (any multi-path case), the resulting <project>/templates/design_spec.md carries a provenance block immediately under its H1:
> **Fused from:**
> - deck: `templates/decks/招商银行/` (base)
> - brand: `templates/brands/anthropic/` (identity override)
> - layout: `templates/layouts/academic_defense/` (structure override)
> - conflicts resolved: Color Scheme from anthropic (user picked a)
Single-path Step 3 does not add provenance (the source is self-evident from the copied files).
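As an illustration of the provenance format, the block could be assembled from the fused sources like this. It is a sketch only: `provenance_block` is a hypothetical helper, and the tuple layout `(kind, path, role)` is an assumption made for the example.

```python
from typing import Iterable, Sequence, Tuple


def provenance_block(
    sources: Iterable[Tuple[str, str, str]],
    conflicts: Sequence[str] = (),
) -> str:
    """Render the blockquote that sits directly under the fused spec's H1."""
    lines = ["> **Fused from:**"]
    for kind, path, role in sources:
        lines.append(f"> - {kind}: `{path}` ({role})")
    for note in conflicts:
        lines.append(f"> - conflicts resolved: {note}")
    return "\n".join(lines)
```

Called with a deck as base and a brand as identity override, it produces lines in the same shape as the example above.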
✅ Checkpoint — Default path proceeds to Step 4 without user interaction. If the user supplied one or more explicit template paths, those have been dispatched (or fused) into <project_path>/templates/ before advancing.
Step 4: Strategist Phase (MANDATORY — cannot be skipped)
🚧 GATE: Step 3 complete; default free-design path taken, or (if triggered) template files copied into the project.
First, read the role definition:
Read references/strategist.md
⚠️ Mandatory gate: before writing `design_spec.md`, Strategist MUST `read_file templates/design_spec_reference.md` and follow its full I–XI section structure. See `strategist.md` Section 1.
Eight Confirmations (full template: templates/design_spec_reference.md):
⛔ BLOCKING: present the Eight Confirmations as a single bundled recommendation set and wait for explicit user confirmation or modification before outputting Design Specification & Content Outline. This is the single core confirmation point — once confirmed, all subsequent steps proceed automatically.
- Canvas format
- Page count range
- Target audience
- Style objective
- Color scheme
- Icon usage approach
- Typography plan, including formula rendering policy
- Image usage approach
Mandatory — split-mode note (not a ninth confirmation): after listing the eight confirmation details, you MUST append exactly one short line (rendered in the user's language, prefixed with 💡) about generation mode. Pick the variant by qualitative read of Phase A signals — recommended page count, source-material bulk, whether topic-research ran with substantial web-fetch accumulation:
| Signal read | Line content |
|---|---|
| Heavy (long page count / bulky sources / heavy web-fetch accumulation) | State estimated page count and large source size; recommend switching to split mode after Step 5 — stop this chat, open a fresh window and input 继续生成 projects/<project_name> to enter Phase B (SVG generation + export); no response or "continue" = default continuous mode. |
| Normal (default) | State scale is moderate, default continuous mode generates in one go; if mid-way window switch is desired, input 继续生成 projects/<project_name> after Step 5 to switch to split mode. |
This line is required output every run — the user must always see the mode choice exists. Whether to act on it is the user's call.
Formula rendering policy lives inside item 7 (Typography plan):
| Policy | Behavior |
|---|---|
| mixed (default) | Strategist renders complex formula-worthy expressions as PNG assets; simple inline expressions remain editable text / Unicode |
| render-all | Strategist renders every formula-worthy expression as PNG assets |
| text-only | No formula rendering; formulas remain editable text / Unicode |
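The three policies amount to a small decision function. The sketch below assumes the Strategist has already judged whether an expression is "complex"; `should_render_as_png` and the `is_complex` flag are hypothetical names for illustration, not part of `latex_render.py`.

```python
def should_render_as_png(policy: str, is_complex: bool) -> bool:
    """Decide whether one formula-worthy expression goes into the render manifest."""
    if policy == "render-all":
        return True        # every formula-worthy expression is rendered
    if policy == "mixed":
        return is_complex  # simple inline expressions stay as editable text
    if policy == "text-only":
        return False       # nothing is rendered
    raise ValueError(f"unknown formula policy: {policy!r}")
```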
After the Eight Confirmations are approved and before outputting design_spec.md / spec_lock.md, if the confirmed formula policy is mixed or render-all and the content contains formula-worthy expressions, Strategist MUST:
- Identify explicit LaTeX and any source expressions that should be faithfully structured as formulas.
- Write `<project_path>/images/formula_manifest.json` with only the formulas selected for rendering.
- Run: `python3 ${SKILL_DIR}/scripts/latex_render.py <project_path>`
- Include the rendered formula PNGs as `Acquire Via: formula`, `Status: Rendered`, `Type: Latex Formula` rows in `design_spec.md` §VIII Image Resource List; also list them in `spec_lock.md` `images` with `| no-crop`.
The formula renderer uses a provider fallback chain by default: codecogs,quicklatex,mathpad,wikimedia. The first three are color-aware; Wikimedia is an availability fallback. Formula PNGs are transparent by default: manifest background is the temporary render matte and transparency-removal reference, not a retained final background unless transparent: false is set for that item. Do not scan spec_lock.md for $...$ or $$...$$. Dollar-delimited math in source material is only a signal for Strategist; the renderer consumes the explicit manifest.
If the user provided images or formula PNGs were rendered, run analysis before outputting the design spec:
python3 ${SKILL_DIR}/scripts/analyze_images.py <project_path>/images
⚠️ Image handling: NEVER directly read / open / view image files (`.jpg`, `.png`, etc.). All image info comes from `analyze_images.py` output or the Design Spec's Image Resource List.
Output:
- `<project_path>/design_spec.md` — human-readable design narrative
- `<project_path>/spec_lock.md` — machine-readable execution contract (skeleton: `templates/spec_lock_reference.md`); Executor re-reads before every page
✅ Checkpoint — Phase deliverables complete, auto-proceed to next step:
## ✅ Strategist Phase Complete
- [x] Eight Confirmations completed (user confirmed)
- [x] Split-mode note appended below the eight items (heavy or normal variant)
- [x] Design Specification & Content Outline generated
- [x] Execution lock (spec_lock.md) generated
- [ ] **Next**: Auto-proceed to [Image_Generator / Executor] phase
Step 5: Image Acquisition Phase (Conditional)
🚧 GATE: Step 4 complete; Design Specification & Content Outline generated and user confirmed. Any formula rows already have Acquire Via: formula and Status: Rendered.
Trigger: At least one row in the resource list has `Acquire Via: ai` and/or `Acquire Via: web`. If every row is `user`, `formula`, or `placeholder`, skip to Step 6.
Always load the common framework:
Read references/image-base.md
Then lazy-load the path-specific reference for each row that actually needs it:
| Acquire Via | Load reference (only if any such row exists) | Run |
|---|---|---|
| ai | references/image-generator.md | python3 ${SKILL_DIR}/scripts/image_gen.py --manifest <project_path>/images/image_prompts.json |
| web | references/image-searcher.md | python3 ${SKILL_DIR}/scripts/image_search.py ... |
| user / placeholder | (skip) | (skip) |
A deck with only ai rows never loads image-searcher.md; a deck with only web rows never loads image-generator.md. A mixed deck loads both, processes each row through its own path, and writes both image_prompts.json and image_sources.json.
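The lazy-loading rule can be sketched as a function from the `Acquire Via` values in the resource list to the references that get read. `references_to_load` and `REFERENCE_BY_PATH` are hypothetical names; the file paths are the ones given in the table above.

```python
from typing import Iterable, List

# Path-specific references, keyed by Acquire Via value.
REFERENCE_BY_PATH = {
    "ai": "references/image-generator.md",
    "web": "references/image-searcher.md",
}


def references_to_load(acquire_via: Iterable[str]) -> List[str]:
    """Return the references Step 5 reads, or [] when the step is skipped."""
    present = set(acquire_via)
    extra = [ref for key, ref in REFERENCE_BY_PATH.items() if key in present]
    if not extra:
        return []  # only user / formula / placeholder rows: go to Step 6
    # The common framework is always read first once Step 5 triggers.
    return ["references/image-base.md"] + extra
```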
⚠️ In-pipeline ai path MUST use manifest mode — even when only 1 ai row exists. Write `images/image_prompts.json` first, then run `image_gen.py --manifest`, then `image_gen.py --render-md` to produce the `image_prompts.md` sidecar. The positional form (`image_gen.py "prompt" ...`) is reserved for out-of-pipeline one-off testing / single-image fixups — it skips manifest + sidecar, leaving no audit trail.
Workflow:
- Extract all rows with `Status: Pending` and `Acquire Via ∈ {ai, web}` from the design spec
- Generate prompts (ai rows) and/or run search (web rows) per image-base.md §2 dispatch table
- Verify every row reaches a terminal status: `Generated` (ai success), `Sourced` (web success), or `Needs-Manual`
✅ Checkpoint — Confirm acquisition attempted for every row:
## ✅ Image Acquisition Phase Complete
- [x] image_prompts.json created (when any ai rows processed)
- [x] image_prompts.md sidecar rendered (when any ai rows processed)
- [x] image_sources.json created (when any web rows processed)
- [x] Each row: status is `Generated` / `Sourced` / `Needs-Manual` (no `Pending` remaining)
Default — auto-proceed to Step 6. Only when the user's Step 4 response explicitly opted into split mode (in reply to the optional hint), output the Phase A hand-off below and stop this conversation:
## ✅ Phase A Complete
- [x] Spec: `design_spec.md`, `spec_lock.md`
- [x] Resources: `sources/`, `images/`, `templates/`
- [ ] **Next**: open a fresh chat window and input `继续生成 projects/<project_name>` to enter Phase B via the [`resume-execute`](workflows/resume-execute.md) workflow.
On acquisition failure, do NOT halt — follow the Failure Handling rule in image-base.md §5: retry once, then mark the row `Needs-Manual`, report to user, and continue to the checkpoint above.
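The "retry once, then `Needs-Manual`" behaviour can be sketched as follows. This illustrates only the control flow summarized above, not the content of image-base.md §5; `acquire_row`, the `fetch` callable, and the `acquire_via` dictionary key are hypothetical.

```python
from typing import Callable, Dict

SUCCESS_STATUS = {"ai": "Generated", "web": "Sourced"}


def acquire_row(row: Dict[str, str], fetch: Callable[[Dict[str, str]], None]) -> str:
    """Try `fetch` at most twice (one retry) and return the row's terminal status."""
    success = SUCCESS_STATUS[row["acquire_via"]]
    for _attempt in range(2):  # first attempt plus one retry
        try:
            fetch(row)
            return success
        except Exception:
            continue  # a failure never halts the pipeline
    return "Needs-Manual"  # reported to the user; the pipeline continues
```

Either way the row leaves `Pending`, which is what the checkpoint verifies.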
Step 6: Executor Phase
🚧 GATE: Step 4 (and Step 5 if triggered) complete; all prerequisite deliverables are ready.
Read the role definition based on the selected style:
Read references/executor-base.md # REQUIRED: common guidelines
Read references/shared-standards.md # REQUIRED: SVG/PPT technical constraints
Read references/executor-general.md # General flexible style
Read references/executor-consultant.md # Consulting style
Read references/executor-consultant-top.md # Top consulting style (MBB level)
Only read executor-base + shared-standards + one style file.
Design Parameter Confirmation (Mandatory): before the first SVG, output key design parameters from the spec (canvas dimensions, color scheme, font plan, body font size). See executor-base.md §2.
Live Preview Auto-Startup (Mandatory): before the first SVG, automatically start the browser editor in live mode and keep it running continuously through Executor + Step 7 export:
python3 ${SKILL_DIR}/scripts/svg_editor/server.py <project_path> --live
- Start it immediately when Executor begins; `svg_output/` may be empty. Editor opens at `http://localhost:5050`; port conflict → `--port <other>` and report the actual URL.
- Run it as a long-running side process/session; do not wait for it to exit before generating SVG pages. Do not wait for user confirmation after startup.
- Service must keep running until one of: (a) the user clicks Exit preview in the browser, or (b) the user explicitly asks in chat to stop it. Generation continues even if the user closes the editor.
- Do NOT read or apply submitted annotations during generation. Users may annotate at any time, but Executor proceeds without touching them. The window to apply annotations opens only after Step 7 completes — see `workflows/live-preview.md`.
- UI button semantics and editor details: see `workflows/live-preview.md` Notes.
Pre-generation Batch Read (Mandatory): before the first SVG, batch-read every distinct layout SVG referenced in spec_lock.page_layouts and every distinct chart SVG referenced in spec_lock.page_charts (plus any §VII backup charts). One read per file, up front — do not re-read these during page generation. See executor-base.md §1.0.
Per-page spec_lock re-read (Mandatory): before each SVG page, read_file <project_path>/spec_lock.md and use only its colors / fonts / icons / images, plus the per-page page_rhythm / page_layouts / page_charts lookups (resolves to template SVGs already loaded in the batch read above). Resists context-compression drift on long decks. See executor-base.md §2.1.
⚠️ Main-agent only: SVG generation MUST stay in the current main agent — page design depends on full upstream context. Do NOT delegate to sub-agents.
⚠️ Generation rhythm: generate pages sequentially, one at a time, in the same continuous context. Do NOT batch (e.g., 5 per group).
Visual Construction Phase: generate SVG pages sequentially, one at a time, in one continuous pass → <project_path>/svg_output/
Quality Check Gate (Mandatory) — after all SVGs, BEFORE annotation handling and speaker notes:
python3 ${SKILL_DIR}/scripts/svg_quality_checker.py <project_path>
- Any `error` (banned SVG features, viewBox mismatch, spec_lock drift, etc.) MUST be fixed before proceeding — return to Visual Construction, regenerate that page, re-run check.
- `warning` entries (low-res image, non-PPT-safe font tail, etc.): fix when straightforward, otherwise acknowledge and release.
- Run against `svg_output/` (not after `finalize_svg.py` — finalize rewrites SVG and masks violations).
Logic Construction Phase: generate speaker notes → <project_path>/notes/total.md
✅ Checkpoint — Confirm all SVGs and notes are fully generated and quality-checked. Proceed directly to Step 7 post-processing:
## ✅ Executor Phase Complete
- [x] Live preview started and kept available at the reported URL
- [x] All SVGs generated to svg_output/
- [x] svg_quality_checker.py passed (0 errors)
- [x] Speaker notes generated at notes/total.md
Chart pages? If this deck contains data charts (bar / line / pie / radar / etc.), run the standalone `verify-charts` workflow before Step 7 to calibrate coordinates. AI models routinely introduce 10–50 px errors when mapping data to pixel positions; verify-charts eliminates that class of error. Skip if no chart pages.
Visual self-check (opt-in)? If the user explicitly asked for a per-page visual re-pass on the SVGs ("跑一下视觉自检 / 视觉回看", "visual review", "check pages visually", etc.), run the standalone `visual-review` workflow before Step 7. Do NOT run it by default and do NOT recommend it based on inferred model capability or deck size — trigger is user request only.
Step 7: Post-processing & Export
🚧 GATE: Step 6 complete; all SVGs generated to svg_output/; speaker notes notes/total.md generated.
🚧 Image readiness GATE (when Step 5 left ai rows in Needs-Manual): every expected file must exist at project/images/<filename> before running 7.1.
If files are missing: PAUSE, list the missing filenames, point the user to `images/image_prompts.md` (each `### Image N:` block is paste-ready for ChatGPT / Gemini / Midjourney; auto-generated from `image_prompts.json`) and the required placement `project/images/<filename>`. Resume Step 7.1 only after all expected files are in place.
`finalize_svg.py` and `svg_to_pptx.py` do not detect missing files at this layer — proceeding with gaps produces a deck with broken image references.
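Since the post-processing scripts do not detect gaps, the readiness gate comes down to an explicit existence check before Step 7.1. A minimal sketch, assuming the expected filenames have already been collected from the resource list; `missing_images` is a hypothetical helper, not one of the skill's scripts.

```python
from pathlib import Path
from typing import Iterable, List


def missing_images(project_path: str, expected: Iterable[str]) -> List[str]:
    """Return the expected filenames that are not yet present under images/."""
    images_dir = Path(project_path) / "images"
    return [name for name in expected if not (images_dir / name).is_file()]
```

An empty result means the gate is satisfied; a non-empty one is the list to show the user while pausing.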
⚠️ Run the three sub-steps one at a time — each must complete successfully before the next. ❌ NEVER combine them into a single code block or shell invocation.
Canonical three-command pipeline (mirrors references/shared-standards.md §5):
Step 7.1 — Split speaker notes:
python3 ${SKILL_DIR}/scripts/total_md_split.py <project_path>
Step 7.2 — SVG post-processing (icon embedding / image crop & embed / text flattening / rounded rect to path):
python3 ${SKILL_DIR}/scripts/finalize_svg.py <project_path>
Step 7.3 — Export PPTX (embeds speaker notes by default):
python3 ${SKILL_DIR}/scripts/svg_to_pptx.py <project_path>
# Output (default-flow mode):
# exports/<project_name>_<timestamp>.pptx ← native pptx (canonical output, reads svg_output/)
# backup/<timestamp>/svg_output/ ← Executor SVG source backup (always written)
#
# Add --svg-snapshot to additionally emit the SVG-image preview pptx alongside the native pptx:
# exports/<project_name>_<timestamp>_svg.pptx ← SVG preview pptx (reads svg_final/)
The native pptx consumes `svg_output/` directly so the converter can preserve high-fidelity primitives (icon `<use>` placeholders, image `preserveAspectRatio` → `srcRect`, rounded rect `rx`/`ry` → `prstGeom roundRect`). The `svg_output/` snapshot in `backup/<timestamp>/` is always written so the project can be re-exported from frozen SVG sources without re-running the LLM. The SVG-rendered preview pptx is opt-in via `--svg-snapshot` — live preview already provides the SVG visual reference, so it's only needed when you want a self-contained file to share. Pass `-s output` or `-s final` to force a single source if you need it.
Paragraph editability vs line fidelity — by default every dy-stacked line is its own PowerPoint text frame, preserving exact SVG layout. Add `--merge-paragraphs` only when the user explicitly asks for an editable / wrap-friendly export (e.g. "I want to edit the abstract as one block", "make text boxes resizable / reflow"): mergeable paragraph blocks collapse into one editable text frame with multiple `<a:p>`, at the cost of PowerPoint re-wrapping inside each box. Default off keeps pixel-fidelity; turn it on per the user's request, not on your own judgement.
Optional animation flags (the defaults already enable rich entrance animations — adjust only when the user asks for something different):
- `-t <effect>`: page transition. Default `fade`. Options: `fade` / `push` / `wipe` / `split` / `strips` / `cover` / `random` / `none`.
- `-a <effect>`: per-element entrance animation. Default `auto`, which maps the effect from the group id: chart → wipe, card- / step- / pillar- → fly, title / takeaway → fade; image-like ids `hero` / `figure-` / `image` / `img-` / `kpi` cycle a richer pool (zoom / dissolve / circle / box / diamond / wheel) so multiple images vary across the deck. Pass `none` to disable, a specific effect like `fade`, or `mixed` for the legacy 16-effect cycle. Requires top-level `<g id="...">` groups (already required by Executor).
- `--animation-trigger {on-click,with-previous,after-previous}`: Start mode (matches PowerPoint's animation-pane Start dropdown). Default `after-previous` (click-free cascade; pace via `--animation-stagger`). Use `on-click` for presenter-paced reveals, or `with-previous` for all-at-once.
- `--animation-config <path>`: optional object-level sidecar. Default: `<project_path>/animations.json` when present.
- `--auto-advance <seconds>`: kiosk-style auto-play.
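To make the `auto` mapping concrete, here is a minimal sketch of how a group id could be turned into an entrance effect. It is an illustration only, not the code in `svg_to_pptx.py`: `auto_effect` is a hypothetical function, and matching ids by prefix is an assumption, since the description above does not say how ids are compared.

```python
from itertools import count
from typing import Iterator, Optional

# Effect names and id patterns are taken from the -a flag description.
IMAGE_POOL = ("zoom", "dissolve", "circle", "box", "diamond", "wheel")
IMAGE_IDS = ("hero", "figure-", "image", "img-", "kpi")
FLY_IDS = ("card-", "step-", "pillar-")
FADE_IDS = ("title", "takeaway")


def auto_effect(group_id: str, image_counter: Iterator[int]) -> Optional[str]:
    """Pick an entrance effect for a top-level <g id="..."> group.

    Assumption: ids are matched by prefix; the real script may differ.
    Returns None for ids that the description does not cover.
    """
    if group_id.startswith("chart"):
        return "wipe"
    if group_id.startswith(FLY_IDS):
        return "fly"
    if group_id.startswith(FADE_IDS):
        return "fade"
    if group_id.startswith(IMAGE_IDS):
        # Each image-like group takes the next effect in the pool, wrapping around.
        return IMAGE_POOL[next(image_counter) % len(IMAGE_POOL)]
    return None
```

Sharing one `count()` across the whole deck is what makes successive image-like groups receive different effects instead of all starting from the first pool entry.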
Optional custom animations (only when the user asks to tune animation order/effects/timing for specific objects):
Run the standalone `customize-animations` workflow. Default export already has global entrance animation; do not create animations.json unless object-level customization was requested.
Optional recorded narration (only when the user asks for narrated/video export):
Run the standalone `generate-audio` workflow. The AI picks a narration backend (edge by default, or a configured cloud provider such as ElevenLabs / MiniMax / Qwen / CosyVoice for high-quality or cloned voices), asks the user once (backend + voice + rate/settings + embed-or-not, all with recommended values), then executes notes_to_audio.py and (if chosen) re-exports the PPTX with --recorded-narration audio.
Do NOT call notes_to_audio.py directly without going through the workflow — --voice / --voice-id is required and the workflow produces the locale/provider-aware recommendation that makes the choice meaningful.
Full effect list, anchor logic, and limits: `references/animations.md`.
❌ NEVER substitute `cp` for `finalize_svg.py`: finalize performs multiple critical processing steps
❌ NEVER force `-s output` for the legacy/preview pptx (PowerPoint's internal SVG parser drops icons and rounded corners). The default auto-split already gives native the high-fidelity source it needs without touching legacy.
❌ NEVER use `--only` (it suppresses one of the two output files)
Post-export annotation window: the preview service from Step 6 typically remains running after export. If the user submitted annotations in the browser (during Executor or after export) and now asks to apply them (they may quote the browser prompt `Annotations saved. ... apply my annotations`, or say "apply my annotations" / "应用注解" / equivalent), run `live-preview` Step 2 to apply and re-export. Annotations submitted during generation are also handled here, not earlier.
Preview not running? Any time the user mentions "live preview", "preview", "看效果", or wants to select/click a slide element and the service is not running, run `live-preview` Step 1 to start it. If the service is already running, just point them at the URL — do not restart.
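Deciding between "start it" and "just point at the URL" requires knowing whether the service is up. How the `live-preview` workflow detects this is defined in that workflow; the sketch below only illustrates one generic approach, assuming the service listens on a local TCP port. `is_listening` is a hypothetical helper and the port number is left to the caller because it is not stated here.

```python
import socket


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: the preview service's actual port is not assumed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

A `True` result corresponds to the "already running, do not restart" branch; `False` corresponds to running `live-preview` Step 1.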
Role Switching Protocol
Before switching roles, MUST first read the corresponding reference file. Output marker:
## [Role Switch: <Role Name>]
📖 Reading role definition: references/<filename>.md
📋 Current task: <brief description>
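For consistency, the marker can be produced from a single template. This is a minimal sketch; `role_switch_marker` is a hypothetical helper, and the caller passes the reference file's base name exactly as it exists under `references/`.

```python
def role_switch_marker(role_name: str, reference_file: str, task: str) -> str:
    """Format the three-line role-switch marker.

    reference_file is the base name of the role's file under references/,
    without the .md extension.
    """
    return (
        f"## [Role Switch: {role_name}]\n"
        f"📖 Reading role definition: references/{reference_file}.md\n"
        f"📋 Current task: {task}"
    )
```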
Reference Resources
| Resource | Path |
|---|---|
| Shared technical constraints | references/shared-standards.md |
| Canvas format specification | references/canvas-formats.md |
| Image-text layout patterns (Primary structures + Modifier layers — combine freely) | references/image-layout-patterns.md |
| Image layout sizing (math for side-by-side container dimensions) | references/image-layout-spec.md |
| SVG image embedding | references/svg-image-embedding.md |
| Icon library | templates/icons/README.md |
Notes
- Local preview: `python3 -m http.server -d <project_path>/svg_final 8000`
- Troubleshooting: on generation issues (layout overflow, export errors, blank images, etc.), check `docs/faq.md` for known solutions
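The local preview command can also be reproduced from Python when a script needs to control the server's lifetime. The sketch below uses only the standard library; `make_preview_server` is a hypothetical helper, and unlike the CLI default it binds to `127.0.0.1` only, which is a deliberate assumption for local use.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_preview_server(directory: str, port: int = 8000) -> ThreadingHTTPServer:
    """Build a static file server rooted at `directory`.

    Hypothetical helper. Call serve_forever() on the result to start
    serving and shutdown() from another thread to stop.
    """
    # Bind the handler to the directory instead of the current working directory.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Passing `<project_path>/svg_final` as `directory` mirrors the command in the note above.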