| name | arxiv-paper-writer |
| description | Write LaTeX ML/AI review articles for arXiv using the IEEEtran template and verified BibTeX citations. |
| compatibility | Python 3.8+ for scripts. Web browsing/search for citation verification. LaTeX is required (pdflatex + bibtex or latexmk). |
ML/AI Review Paper Workflow (IEEEtran template)
When to Use
- ML/AI review papers for arXiv (main text ~6-10 pages; references excluded)
- LaTeX + BibTeX workflow with verified citations
- Citation validation/repair on existing LaTeX projects
When NOT to Use
- Novel experimental research papers (this is a review workflow)
- Non-academic documents
Inputs
- Topic description (required)
- Constraints: venue, page limit, author/affiliations (optional)
- Existing project path for citation validation (optional)
Outputs
- main.tex (LaTeX source)
- ref.bib (verified BibTeX entries)
- IEEEtran.cls
- plan/<timestamp>-<slug>.md, issues/<timestamp>-<slug>.csv
- Figures/tables; main.pdf
- notes/literature-notes.md (optional per-citation notes)
- notes/arxiv-registry.sqlite3 (arXiv metadata/BibTeX cache)
Conventions: run python3 scripts/... from this skill folder (where scripts/ lives); <paper_dir> is the paper/project root (contains main.tex, ref.bib, plan/, issues/, notes/). Paths like plan/... are under <paper_dir>. For arXiv discovery/metadata/BibTeX, use scripts/arxiv_registry.py (no ad-hoc curl/wget).
Gated Workflow
Tip: Run python3 scripts/<script>.py --help before use. Open reference files only when a step calls them out.
Non-Negotiable Rules
- No prose in main.tex until plan approved AND issues CSV exists.
- First deliverable: research snapshot + outline + clarification questions + draft plan.
- Use plan + issues tracking for all new papers; do not opt out.
- Issues CSV is the execution contract; update Status and Verified_Citations per issue, and add/split/insert issue rows when scope grows (do not do untracked work).
- Template is fixed: use IEEEtran two-column layout (assets/template/IEEEtran.cls). Treat two-column width as a layout constraint (use two-column floats when needed).
Gate 0: Research Snapshot + Draft Plan
- Confirm constraints (venue, page limit, author block, date range).
- Translate the topic into search keywords and run a light discovery pass: 10-20 key papers (see references/research-workflow.md). After step 4 (once <paper_dir> exists), cache arXiv discovery with arxiv_registry.py search.
- Propose 2-4 candidate titles aligned to the topic.
- Scaffold the project folder and draft plan:
  python3 scripts/bootstrap_ieee_review_paper.py --stage kickoff --topic "<topic>"
  This copies LaTeX templates from assets/template/; plan/issues are generated from templates in assets/. Initialize the arXiv registry (once):
  python3 scripts/arxiv_registry.py --project-dir <paper_dir> init
- Create a framework skeleton in main.tex (section headings + 2-4 bullets per section + seed citations; no prose).
- Update the plan file to reflect the framework, proposed titles, and section/subsection plan.
- Compile early:
  python3 scripts/compile_paper.py --project-dir <paper_dir>
  Fix any Overfull \hbox warnings (see Layout Hygiene below).
- Return to user:
  - Proposed outline (5-8 sections, 2-4 bullets each)
  - Planned visualizations (5+) mapped to sections (see references/visual-templates.md)
  - Clarification questions
- STOP until user approves.
Gate 1: Create Issues CSV (after approval)
- Check kickoff gate in plan:
  - [x] User confirmed scope + outline in chat.
- Create issues CSV (script refuses if gate unchecked):
  python3 scripts/bootstrap_ieee_review_paper.py --stage issues --topic "<topic>" --with-literature-notes
- Validate:
  python3 scripts/validate_paper_issues.py <paper_dir>/issues/<timestamp>-<slug>.csv
- If literature notes are enabled, keep short summaries and (optional) abstract snippets to avoid re-search.
- The plan may evolve; add/split/insert issues as needed, re-validate after edits, and keep going until all issues (including inserted ones) are DONE or SKIP (when feasible, in the same run).
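The gate check in this step can be pictured with a small sketch. It is not the code of bootstrap_ieee_review_paper.py; the function name `kickoff_gate_checked` and the exact matching rule are assumptions made for illustration, and only the checkbox label comes from the step above.

```python
import re

# Label text taken from the gate line quoted in Gate 1. How the real script
# matches it is not documented here, so this pattern is an assumption.
GATE_LABEL = "User confirmed scope + outline in chat."

GATE_LINE = re.compile(
    r"^\s*-\s*\[(?P<mark>[ xX])\]\s*" + re.escape(GATE_LABEL) + r"\s*$",
    re.MULTILINE,
)


def kickoff_gate_checked(plan_text: str) -> bool:
    """Return True only if the plan has the gate line and its box is ticked."""
    match = GATE_LINE.search(plan_text)
    return match is not None and match.group("mark").lower() == "x"
```

A caller would read plan/<timestamp>-<slug>.md and refuse to create the issues CSV when this returns False.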
Phase 2: Per-Issue Writing Loop
For each writing issue in the CSV:
- If an issue balloons (new figure, new subsection, new benchmark set, or a large QA fix), split/insert new issue row(s) (e.g., W6a, Q5) before proceeding; re-run python3 scripts/validate_paper_issues.py <issues.csv>; keep going until all issues are DONE/SKIP.
- Research: 8-12 section-specific papers.
- Write: Never write 3 consecutive sentences without a citation; vary paragraph rhythm (see references/writing-style.md). For section intent and structure, use references/template-usage.md.
- Visualize: Match content triggers (see references/visual-templates.md). Prioritize single-column sizing; use double-column spans only when necessary (see Layout Hygiene). Cite externally sourced figure content.
- Verify: Web search + open source page (and PDF if available) before adding to ref.bib. For arXiv entries, append BibTeX via python3 scripts/arxiv_registry.py --project-dir <paper_dir> export-bibtex <arxiv_id> --out-bib <paper_dir>/ref.bib.
- Update: Mark issue DONE with Verified_Citations count.
- Compile after meaningful changes; fix Overfull \hbox before marking DONE.
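The three-sentence citation rule in the Write step can be spot-checked mechanically. The sketch below is a rough heuristic, not part of this skill's scripts: the function name `longest_uncited_run` is hypothetical, and sentence splitting is deliberately naive (it breaks on abbreviations such as "et al."), so treat hits as prompts for a manual look.

```python
import re

# Matches \cite, \citep, \citet*, and optional [..] arguments before the key.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?\s*(?:\[[^\]]*\])*\s*\{")
# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def longest_uncited_run(paragraph: str) -> int:
    """Length of the longest run of consecutive sentences with no \\cite."""
    sentences = [s for s in SENTENCE_END.split(paragraph.strip()) if s]
    longest = current = 0
    for sentence in sentences:
        if CITE.search(sentence):
            current = 0  # a citation resets the run
        else:
            current += 1
            longest = max(longest, current)
    return longest
```

Under this reading of the rule, a paragraph needs attention when the function returns 3 or more.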
Phase 2.5: Rhythm Refinement
After all writing issues are DONE, refine prose section-by-section using the latex-rhythm-refiner skill. This step varies sentence/paragraph lengths and removes filler phrases while preserving all citations.
Phase 3: QA Gate
- Run internal QA checklist (see references/quality-report.md).
- Compile; ensure no Overfull \hbox warnings in main.log.
- Deliver main.tex, ref.bib, figures, and main.pdf.
Existing Paper Workflow (No Re-Scaffold)
If a paper folder already exists, do NOT rerun scaffold:
# Create plan
python3 scripts/create_paper_plan.py --topic "<topic>" --stage plan --output-dir <paper_dir>
# STOP for approval, then check kickoff gate box
# Create issues (use timestamp/slug from plan filename/frontmatter)
python3 scripts/create_paper_plan.py --topic "<topic>" --stage issues --timestamp "<TS>" --slug "<slug>" --output-dir <paper_dir> --with-literature-notes
Citation-Validation Variant
- Treat provided path as LaTeX project root.
- Follow references/citation-workflow.md.
- Use references/bibtex-guide.md for BibTeX rules if entries need repair.
- Deliver validation report and corrected ref.bib if requested.
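One early check in a citation-validation pass is whether every key cited in main.tex has an entry in ref.bib. The sketch below illustrates that cross-check only; it does not reproduce references/citation-workflow.md, the helper names are hypothetical, and it ignores LaTeX comments and \input files for brevity.

```python
import re

# \cite variants with optional [..] arguments; captures the key list.
CITE_CMD = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# BibTeX entry header such as "@article{key," -> (entry type, key).
BIB_ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
NON_ENTRIES = {"comment", "string", "preamble"}


def cited_keys(tex: str) -> set:
    """All citation keys used in the LaTeX source."""
    keys = set()
    for group in CITE_CMD.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib: str) -> set:
    """All entry keys defined in the BibTeX source."""
    return {key for kind, key in BIB_ENTRY.findall(bib)
            if kind.lower() not in NON_ENTRIES}


def missing_keys(tex: str, bib: str) -> list:
    """Keys cited in the text but absent from the bibliography, sorted."""
    return sorted(cited_keys(tex) - bib_keys(bib))
```

A non-empty result lists the keys that would later surface as undefined citations at compile time; the reverse difference gives entries that are never cited.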
Success Criteria
Compilation: python3 scripts/compile_paper.py --project-dir <paper_dir> (exit 0, no "Citation undefined" warnings). Use --report-page-counts for main-text page count.
Quality Metrics:
- 6-10 pages of main text (references excluded)
- 60-80 total citations (8+ per section)
- 100% citation verification rate
- 70%+ citations from last 3 years
- 5+ visualization types
- All issues DONE or SKIP
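The numeric targets above are easy to tally once publication years and the page count are known. This is a minimal sketch with a hypothetical function name, `check_metrics`; it assumes "last 3 years" means a publication year no older than the current year minus 3, which the list does not define precisely.

```python
def check_metrics(years, pages, current_year):
    """Compare a draft against the page, citation-count, and recency targets.

    years: publication year of each cited work (one entry per citation).
    pages: main-text page count, references excluded.
    """
    total = len(years)
    # Assumption: "recent" means year >= current_year - 3.
    recent = sum(1 for year in years if year >= current_year - 3)
    return {
        "pages_ok": 6 <= pages <= 10,
        "citations_ok": 60 <= total <= 80,
        "recency_ok": total > 0 and recent / total >= 0.70,
    }
```

The page count could come from the --report-page-counts option mentioned under Compilation; per-section citation counts and visualization types still need a separate tally.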
Safety & Guardrails
- Never fabricate citations or results; add TODO and ask user if evidence missing.
- Verify every citation via web search + source page (and PDF if available) before adding to ref.bib.
- Confirm before large literature searches.
- Do not overwrite user files without confirmation.
- Issues CSV is the contract; mark DONE only when criteria met.
- No submission bundles unless user requests.
Layout Hygiene
Fix Overfull \hbox warnings before marking issues DONE:
- Figures: start with figure + \columnwidth; switch to figure* + \textwidth if needed
- Tables: prefer p{...} column widths / \tabcolsep over \resizebox
- Equations: use split, multline, aligned, or IEEEeqnarray for line-breaking
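To find what still needs fixing, the Overfull \hbox warnings can be pulled out of main.log with a short script. The sketch below is illustrative and separate from compile_paper.py; the function name `overfull_hboxes` is hypothetical, and it relies on the standard TeX log wording "Overfull \hbox (Npt too wide) ...".

```python
import re

# Standard TeX log line, e.g.
# "Overfull \hbox (15.2pt too wide) in paragraph at lines 120--124"
OVERFULL = re.compile(
    r"^Overfull \\hbox \((?P<pt>[0-9.]+)pt too wide\)(?P<where>.*)$",
    re.MULTILINE,
)


def overfull_hboxes(log_text: str) -> list:
    """Return (overflow in pt, location text) for each Overfull \\hbox warning."""
    return [
        (float(m.group("pt")), m.group("where").strip())
        for m in OVERFULL.finditer(log_text)
    ]
```

Sorting the result by overflow size, largest first, gives a reasonable order for working through figures, tables, and equations.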
Issues CSV Schema
| Phase | Issues |
|---|---|
| Research | Rx: discovery, scaffolding, framework, viz planning |
| Writing | Wx: each section with target citations and visualization |
| Refinement | RFx: apply latex-rhythm-refiner skill (after all Wx DONE) |
| QA | Qx: citation verification, QA checklist, compilation, final review |
Status: TODO → DOING → DONE. Schema validated by validate_paper_issues.py.
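The ID prefixes in the table and the status values used throughout this workflow can be expressed compactly. This is a sketch, not the logic of validate_paper_issues.py: the ID pattern is inferred from examples such as W6a and Q5, the function names are hypothetical, and only the Status column name is taken from this document.

```python
import re

# Prefix -> phase, following the schema table above.
PHASES = {"R": "Research", "W": "Writing", "RF": "Refinement", "Q": "QA"}
# Assumed ID shape: prefix, number, optional lowercase suffix (e.g. W6a).
ISSUE_ID = re.compile(r"^(RF|R|W|Q)(\d+)([a-z]?)$")
# An issue counts as finished once it is DONE or SKIP.
FINISHED = {"DONE", "SKIP"}


def phase_of(issue_id: str) -> str:
    """Map an issue ID such as 'W6a' to its phase name."""
    match = ISSUE_ID.match(issue_id)
    if match is None:
        raise ValueError(f"unrecognized issue ID: {issue_id!r}")
    return PHASES[match.group(1)]


def all_finished(rows) -> bool:
    """True when every row's Status is DONE or SKIP."""
    return all(row["Status"] in FINISHED for row in rows)
```

Rows produced by csv.DictReader over the issues CSV fit the `rows` argument directly, since each row is a mapping keyed by column name.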