Claude Code Plugins

Community-maintained marketplace

Process PDF files. Use when reading, creating, or merging PDFs.

Install Skill

1. Download skill
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Review the skill's instructions before using it.

SKILL.md

name: pdf
description: Process PDF files. Use when reading, creating, or merging PDFs.

PDF Processing Skill

This skill covers reading, creating, and merging PDF files with command-line tools, Python, and Go.

Reading PDFs

Using pdftotext (Fast, Text-Only)

For quick text extraction from PDFs:

pdftotext input.pdf output.txt

Or output to stdout:

pdftotext input.pdf -
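
When the extraction is driven from a script, the same command can be wrapped in a small helper. The following is a minimal Python sketch, assuming pdftotext (from poppler-utils) is on PATH. The names build_pdftotext_cmd and pdf_to_text are illustrative and not part of any library; the -f, -l, and -layout flags are standard pdftotext options.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, first_page=None, last_page=None, layout=False):
    """Build the argv list for pdftotext, sending the text to stdout ("-")."""
    cmd = ["pdftotext"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to convert (1-based)
    if last_page is not None:
        cmd += ["-l", str(last_page)]   # last page to convert
    if layout:
        cmd.append("-layout")           # keep the physical layout of the page
    cmd += [pdf_path, "-"]
    return cmd


def pdf_to_text(pdf_path, **options):
    """Run pdftotext and return the extracted text as a string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils")
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path, **options),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    # Decode as UTF-8 and replace any bytes that fail to decode
    return result.stdout.decode("utf-8", errors="replace")
```

Keeping the command construction separate from the subprocess call makes the argument handling easy to check without a PDF or the tool itself.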

Using PyMuPDF (Python, More Features)

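For more control and access to metadata:

import fitz  # PyMuPDF; recent releases can also be imported as pymupdf

# Open the PDF; the context manager closes it on exit
with fitz.open("input.pdf") as doc:
    # Document metadata (title, author, ...) as a dict
    metadata = doc.metadata

    # Extract text from all pages
    text = "".join(page.get_text() for page in doc)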
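
If the text is needed per page rather than as one string, a helper along these lines may be useful. This is a sketch, not PyMuPDF's own API: extract_pages and extract_file are hypothetical names, and extract_pages relies only on iteration and get_text(), which is what a PyMuPDF document provides.

```python
def extract_pages(doc):
    """Return a list of (page_number, text) tuples, numbering pages from 1.

    Works with any iterable of objects exposing get_text(), such as the
    pages yielded when iterating over a PyMuPDF document.
    """
    return [(number, page.get_text()) for number, page in enumerate(doc, start=1)]


def extract_file(path):
    """Open a PDF with PyMuPDF and return (metadata, pages)."""
    import fitz  # PyMuPDF; imported here so extract_pages has no dependency

    with fitz.open(path) as doc:
        # Copy the metadata so it stays usable after the document is closed
        return dict(doc.metadata or {}), extract_pages(doc)
```

Keeping page numbers alongside the text makes it possible to report which page a match or an extraction failure came from.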

Using Go (pdfcpu library)

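For Go projects. Note that pdfcpu does not extract plain text; it can dump each page's raw content streams, which still need decoding. Use pdftotext when readable text is required.

import "github.com/pdfcpu/pdfcpu/pkg/api"

// Write the raw content streams of all pages (nil selects every page)
// into the existing directory "out"
err := api.ExtractContentFile("input.pdf", "out", nil, nil)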

Creating PDFs

From HTML (wkhtmltopdf)

wkhtmltopdf input.html output.pdf

Using Go (gofpdf)

import "github.com/jung-kurt/gofpdf"

pdf := gofpdf.New("P", "mm", "A4", "")
pdf.AddPage()
pdf.SetFont("Arial", "B", 16)
pdf.Cell(40, 10, "Hello World!")
err := pdf.OutputFileAndClose("output.pdf")

Merging PDFs

Using pdfunite

pdfunite file1.pdf file2.pdf file3.pdf output.pdf
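
The last argument to pdfunite is always the output file, which is easy to get wrong when the list of inputs is built dynamically. A minimal Python sketch of a guarded wrapper follows, assuming pdfunite (from poppler-utils) is on PATH. The names build_pdfunite_cmd and merge_pdfs are illustrative, and requiring at least two inputs is a choice of this sketch rather than a pdfunite rule.

```python
import os
import shutil
import subprocess


def build_pdfunite_cmd(inputs, output):
    """Build the argv list for pdfunite; the output file goes last."""
    inputs = list(inputs)
    if len(inputs) < 2:
        raise ValueError("need at least two input PDFs to merge")
    if output in inputs:
        raise ValueError("output path must not be one of the inputs")
    return ["pdfunite", *inputs, output]


def merge_pdfs(inputs, output):
    """Merge the input PDFs, in the given order, into output."""
    cmd = build_pdfunite_cmd(inputs, output)
    missing = [path for path in cmd[1:-1] if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError("missing input files: " + ", ".join(missing))
    if shutil.which("pdfunite") is None:
        raise RuntimeError("pdfunite not found; install poppler-utils")
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```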

Using Go (pdfcpu)

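import "github.com/pdfcpu/pdfcpu/pkg/api"

// Recent pdfcpu releases use MergeCreateFile; the third argument (dividerPage)
// inserts a blank page between inputs. Older releases have a different
// signature, so check the version in go.mod.
err := api.MergeCreateFile([]string{"file1.pdf", "file2.pdf"}, "output.pdf", false, nil)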

Best Practices

  1. Check file existence before processing
  2. Handle errors gracefully - PDFs can be corrupted
  3. Consider file size - process large PDFs page by page instead of loading all text into memory
  4. Respect encoding - extracted text may not be valid UTF-8, so decode defensively
  5. Test with real files - PDF standards vary widely
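
The first three practices can be combined into a pre-flight check that runs before any tool is invoked. The following Python sketch uses only the standard library; the function name preflight and the 100 MiB default limit are assumptions chosen for illustration.

```python
import os


def preflight(path, max_bytes=100 * 1024 * 1024):
    """Return a list of problems found with the file; empty if none."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    problems = []
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    elif size > max_bytes:
        problems.append("file exceeds size limit; process it page by page")
    with open(path, "rb") as handle:
        header = handle.read(1024)
    # A PDF starts with "%PDF-"; some files carry a few junk bytes first,
    # so the marker is searched for within the first 1 KiB
    if b"%PDF-" not in header:
        problems.append("missing %PDF- header; file may be corrupted or not a PDF")
    return problems
```

A header check does not prove the file is valid, so errors raised by the processing tool itself still need to be handled.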

Common Issues

  • Password-protected PDFs: Supply the password to the tool, e.g. pdftotext -upw <password> or doc.authenticate(password) in PyMuPDF
  • Scanned PDFs: May need OCR (tesseract) for text extraction
  • Complex layouts: Table extraction may need specialized tools such as pdfplumber or Camelot
  • Fonts: Embedding fonts may be needed for consistent rendering
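
Scanned PDFs usually show up as pages whose extracted text is empty or nearly so. A rough heuristic for spotting them is sketched below in Python; the function name pages_needing_ocr and the 20-character threshold are assumptions, and the threshold should be tuned against real documents.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 1-based numbers of pages whose extracted text is nearly empty.

    page_texts is a sequence of strings, one per page, as produced by any
    text extractor. Whitespace is ignored when counting characters.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len("".join(text.split())) < min_chars
    ]
```

Pages flagged this way can then be rendered to images and passed to an OCR tool such as tesseract.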