| name | paper-notes |
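| description | Write structured notes for each paper in the core set into `papers/paper_notes.jsonl` (summary/method/results/limitations). **Trigger**: paper notes, structured notes, reading notes, 论文笔记, paper_notes.jsonl. **Use when**: you are in the survey's evidence stage (C3), `papers/core_set.csv` already exists (plus optional fulltext), and you need citable evidence for the later claims/citations/writing steps. **Skip if**: there is no core set yet (run `dedupe-rank` first), or you are only producing a very lightweight snapshot that does not need fine-grained evidence. **Network**: none. **Guardrail**: keep notes specific and verifiable (method/metrics/limitations), avoid heavily repeated templates, and keep structured fields rather than long prose. |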
Paper Notes
Produce consistent, searchable paper notes that later steps (claims, visuals, writing) can reliably synthesize.
This is still NO PROSE: keep notes as bullets / short fields, not narrative paragraphs.
When to use
- After you have a core set (and ideally a mapping) and need evidence-ready notes.
- Before writing a survey draft.
Inputs
- `papers/core_set.csv`
- Optional: `outline/mapping.tsv` (to prioritize)
- Optional: `papers/fulltext_index.jsonl` + `papers/fulltext/*.txt` (if running in fulltext mode)
Output
- `papers/paper_notes.jsonl` (JSONL; one record per paper)
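For illustration, one record might look like the sketch below. The field names `paper_id`, `bibkey`, `priority`, `evidence_level`, `summary_bullets`, `method`, `key_results`, and `limitations` are taken from this document; the exact schema, the `title` field, and all values are assumptions (placeholders), not the helper's actual output.

```python
import json

# Hypothetical record: field names follow this document, values are placeholders.
example_note = {
    "paper_id": "P0001",
    "bibkey": "Author2023Example",        # placeholder citation key
    "title": "Example paper title",       # assumed field, not confirmed by the helper
    "priority": "high",
    "evidence_level": "abstract",         # "abstract" or "fulltext"
    "summary_bullets": [
        "<what is new>",
        "<problem setting>",
        "<what the loop is>",
    ],
    "method": "<mechanism / architecture; what differs from baselines>",
    "key_results": ["<benchmark / metric, with numbers if available>"],
    "limitations": ["<specific assumption or failure mode>"],
}


def to_jsonl_line(note: dict) -> str:
    """Serialize one note as a single JSONL line (json.dumps escapes newlines)."""
    return json.dumps(note, ensure_ascii=False)


line = to_jsonl_line(example_note)
```

Appending `line + "\n"` to the output file once per paper keeps the one-record-per-line property.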
Decision: evidence depth
- If you have extracted text (`papers/fulltext/*.txt`) → enrich key papers using fulltext snippets and set `evidence_level: "fulltext"`.
- If you only have abstracts (default) → keep long-tail notes abstract-level, but still fully enrich high-priority papers (see below).
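The decision above can be sketched as a small function. This is a minimal sketch, assuming extracted text is stored as `<paper_id>.txt` inside the fulltext directory; that naming convention and the function name `choose_evidence_level` are hypothetical and not confirmed by this document.

```python
from pathlib import Path


def choose_evidence_level(paper_id: str, fulltext_dir: Path) -> str:
    """Return "fulltext" if a non-empty extracted text file exists, else "abstract"."""
    txt_path = fulltext_dir / f"{paper_id}.txt"  # assumed naming convention
    if txt_path.is_file() and txt_path.stat().st_size > 0:
        return "fulltext"
    return "abstract"
```

An empty file is treated as "no fulltext", since a failed extraction should not upgrade the evidence level.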
Workflow (heuristic)
Uses: `outline/mapping.tsv`, `papers/fulltext_index.jsonl`.
- Ensure coverage: every `paper_id` in `papers/core_set.csv` must have one JSONL record.
- Use mapping to choose high-priority papers:
  - heavily reused across subsections
  - pinned classics (ReAct/Toolformer/Reflexion… if in scope)
- For high-priority papers, capture:
  - 3–6 summary bullets (what’s new, what problem setting, what’s the loop)
  - `method` (mechanism / architecture; what differs from baselines)
  - `key_results` (benchmarks/metrics; include numbers if available)
  - `limitations` (specific assumptions/failure modes; avoid generic boilerplate)
- For long-tail papers:
  - keep summary bullets short (abstract-derived is OK)
  - still include at least one limitation, but make it specific when possible
- Assign a stable `bibkey` for each paper for citation generation.
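The prioritization step can be sketched as follows. This is a rough illustration, assuming each mapping row carries a `section_id` and a `paper_id` column; those column names, the `min_sections` threshold, and the function `pick_high_priority` are hypothetical and not taken from the helper.

```python
from collections import defaultdict


def pick_high_priority(mapping_rows, pinned=(), min_sections=2):
    """Return paper_ids that are reused across subsections or explicitly pinned.

    mapping_rows: iterable of dicts with assumed keys "section_id" and "paper_id".
    pinned: paper_ids always treated as high priority (e.g. in-scope classics).
    min_sections: assumed threshold for "heavily reused".
    """
    sections_by_paper = defaultdict(set)
    for row in mapping_rows:
        sections_by_paper[row["paper_id"]].add(row["section_id"])
    reused = {
        paper_id
        for paper_id, sections in sections_by_paper.items()
        if len(sections) >= min_sections
    }
    return reused | set(pinned)
```

Counting distinct subsections rather than raw rows avoids inflating a paper that is listed twice under the same subsection.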
Quality checklist
- Coverage: every `paper_id` in `papers/core_set.csv` appears in `papers/paper_notes.jsonl`.
- High-priority papers have non-`TODO` method/results/limitations.
- Limitations are not copy-pasted across many papers.
- `evidence_level` is set correctly (`abstract` vs `fulltext`).
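Most of this checklist can be checked mechanically. The sketch below is one possible approach, not the pipeline's actual quality gate; the function names are hypothetical, and notes are assumed to be already parsed into dicts with the field names used in this document.

```python
import json

REQUIRED_FIELDS = ("method", "key_results", "limitations")
VALID_EVIDENCE_LEVELS = ("abstract", "fulltext")


def missing_paper_ids(core_ids, notes):
    """Return core-set paper_ids that have no note record."""
    noted = {note.get("paper_id") for note in notes}
    return sorted(set(core_ids) - noted)


def incomplete_high_priority(notes):
    """Return paper_ids of high-priority notes with empty or TODO required fields."""
    flagged = []
    for note in notes:
        if note.get("priority") != "high":
            continue
        for field in REQUIRED_FIELDS:
            value = note.get(field)
            # Serialize so the TODO check works for both strings and lists.
            if not value or "TODO" in json.dumps(value, ensure_ascii=False):
                flagged.append(note.get("paper_id"))
                break
    return flagged


def bad_evidence_levels(notes):
    """Return paper_ids whose evidence_level is missing or not a known value."""
    return [
        note.get("paper_id")
        for note in notes
        if note.get("evidence_level") not in VALID_EVIDENCE_LEVELS
    ]
```

Note that `bad_evidence_levels` only validates the value; whether `fulltext` is actually justified still needs a check against the extracted text files.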
Helper script (optional)
Quick Start
- `python .codex/skills/paper-notes/scripts/run.py --help`
- `python .codex/skills/paper-notes/scripts/run.py --workspace <workspace_dir>`
All Options
- See `--help` (this helper is intentionally minimal)
Examples
- Generate notes, then optionally enrich `priority=high` papers:
  - Run the helper once, then refine `papers/paper_notes.jsonl` (e.g., add full-text details for key papers and diversify limitations).
Notes
- The helper writes deterministic metadata/abstract-level notes and marks key papers with `priority=high`.
- Under `pipeline.py --strict`, the run is blocked if high-priority notes are incomplete (missing method/key_results/limitations) or contain placeholders.
Troubleshooting
Common Issues
Issue: High-priority notes still look like scaffolds
Symptom:
- Quality gate reports missing `method`/`key_results` or `TODO` placeholders.
Causes:
- Notes were generated from abstracts only; key papers weren’t enriched.
Solutions:
- Fully enrich `priority=high` papers: `method`, ≥1 `key_results`, ≥3 `summary_bullets`, ≥1 concrete `limitations`.
- If you need full text evidence, run `pdf-text-extractor` in `fulltext` mode for key papers.
Issue: Repeated limitations across many papers
Symptom:
- Quality gate reports repeated limitation boilerplate.
Causes:
- Copy-pasted limitations instead of paper-specific failure modes/assumptions.
Solutions:
- Replace boilerplate with paper-specific limitations (setup, data, evaluation gaps, failure cases).
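To find boilerplate before the quality gate does, repeated limitation strings can be counted across papers. This is a minimal sketch; the `max_repeats` threshold and the normalization (lowercasing, collapsing whitespace) are assumptions, not the gate's actual rule.

```python
from collections import Counter


def repeated_limitations(notes, max_repeats=2):
    """Return {normalized limitation: paper count} for text shared by too many papers."""
    counts = Counter()
    for note in notes:
        # Normalize and deduplicate so each paper contributes at most once per text.
        normalized = {
            " ".join(text.lower().split())
            for text in note.get("limitations") or []
        }
        counts.update(normalized)
    return {text: n for text, n in counts.items() if n > max_repeats}
```

Exact matching after normalization only catches verbatim copies; near-duplicates with small wording changes would need a fuzzier comparison.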
Recovery Checklist
- `papers/paper_notes.jsonl` covers all `papers/core_set.csv` `paper_id`s.
- ≥80% of `priority=high` notes satisfy method/results/limitations completeness.
- No `TODO` remains in high-priority notes.
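The 80% threshold can be estimated with a short script. This sketch assumes "completeness" means the criteria listed under Solutions above (a non-empty `method`, ≥1 `key_results`, ≥3 `summary_bullets`, ≥1 `limitations`); the function names are hypothetical, and the real gate may apply additional rules.

```python
def is_complete(note):
    """Check one note against the assumed completeness criteria."""
    return (
        bool(note.get("method"))
        and len(note.get("key_results") or []) >= 1
        and len(note.get("summary_bullets") or []) >= 3
        and len(note.get("limitations") or []) >= 1
    )


def high_priority_completeness(notes):
    """Return the fraction of priority=high notes that are complete (1.0 if none)."""
    high = [note for note in notes if note.get("priority") == "high"]
    if not high:
        return 1.0
    return sum(is_complete(note) for note in high) / len(high)
```

A result of at least `0.8` corresponds to the second item of the checklist.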