Claude Code Plugins

Community-maintained marketplace

Extract text, tables, and metadata from PDF files. Use when working with PDFs, document extraction, or parsing PDF content.

Install Skill

1. Download skill
2. Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3. Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Review the skill's instructions before using it to verify what it does.

SKILL.md

name: pdf-extractor
description: Extract text, tables, and metadata from PDF files. Use when working with PDFs, document extraction, or parsing PDF content.
version: 1.0.0
author: agentflow
triggers: pdf, extract text, parse document, read pdf
requirements: pypdf, pdfplumber
tags: document, extraction, pdf

PDF Extraction Instructions

Overview

This skill extracts text, tables, and metadata from PDF files using the Python libraries pdfplumber and pypdf.

Usage

Basic Text Extraction

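import pdfplumber

# Open the PDF and print the text of each page in order.
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        # Pages without a text layer may yield None or an empty string.
        text = page.extract_text() or ""
        print(text)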

Extract Tables

import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        # Each table is a list of rows; each row is a list of cell values.
        tables = page.extract_tables()
        for table in tables:
            print(table)
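
Each table returned by `extract_tables()` is a list of rows, and each row is a list of cell values (strings, or `None` for empty cells). The following is a minimal sketch of turning such a table into a list of dictionaries, assuming the first row holds the column headers. The helper name `table_to_records` is hypothetical and not part of pdfplumber, and the sample data is hand-written for illustration.

```python
from typing import Optional


def table_to_records(table: list[list[Optional[str]]]) -> list[dict[str, str]]:
    """Convert a pdfplumber-style table (list of rows) into dict records.

    Assumes the first row holds the column headers; this is an
    illustrative convention, not something pdfplumber guarantees.
    """
    if not table:
        return []
    # Empty cells arrive as None; normalise them to empty strings.
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        values = [(cell or "").strip() for cell in row]
        records.append(dict(zip(header, values)))
    return records


# Hand-written table in the same shape that extract_tables() returns.
sample = [["Name", "Qty"], ["Widget", "3"], ["Gadget", None]]
print(table_to_records(sample))
```

Records keyed by column name are usually easier to filter or write out as CSV or JSON than raw row lists. Tables without a header row would need a different convention.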

Get Metadata

from pypdf import PdfReader

reader = PdfReader("document.pdf")
# metadata is None when the PDF has no document information dictionary.
metadata = reader.metadata
if metadata is not None:
    print(f"Title: {metadata.title}")
    print(f"Author: {metadata.author}")
print(f"Pages: {len(reader.pages)}")

Requirements

Install the required packages:

pip install pypdf pdfplumber

Notes

  • Scanned PDFs contain page images rather than a text layer, so text extraction returns little or nothing; consider an OCR library such as pytesseract for these.
  • Process large PDFs page by page to manage memory.
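
The two notes above can be combined in a rough sketch: iterate lazily so the caller holds only one page's text at a time, and flag pages that yield almost no text as likely scans that may need OCR. The names `looks_scanned`, `iter_page_text`, and `report`, and the 20-character threshold, are hypothetical choices for illustration and not part of pdfplumber. The OCR step itself is left out.

```python
from typing import Iterator


def looks_scanned(text: str, min_chars: int = 20) -> bool:
    """Heuristic: a page with almost no extractable text is probably an image.

    The threshold is an arbitrary assumption; tune it for your documents.
    """
    return len(text.strip()) < min_chars


def iter_page_text(path: str) -> Iterator[tuple[int, str]]:
    """Yield (page_number, text) one page at a time."""
    import pdfplumber  # imported here so the heuristic above has no dependencies

    with pdfplumber.open(path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            yield number, page.extract_text() or ""


def report(path: str) -> None:
    """Print each page's character count, or a note that OCR may be needed."""
    for number, text in iter_page_text(path):
        if looks_scanned(text):
            print(f"Page {number}: little or no text, consider OCR")
        else:
            print(f"Page {number}: {len(text)} characters")
```

Calling `report("document.pdf")` prints one line per page. A character-count check is only a heuristic: a nearly blank page with a text layer would also be flagged.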