Claude Code Plugins

Community-maintained marketplace

article-extraction

@ljchg12-hue/dotfiles

Extract clean article content from web pages, removing ads and clutter for reading and archiving

Install Skill

1. Download skill
2. Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section.

3. Upload to Claude

Click "Upload skill" and select the downloaded ZIP file.

Note: Verify the skill by reading through its instructions before using it.

SKILL.md

name: article-extraction
description: Extract clean article content from web pages, removing ads and clutter for reading and archiving

Article Extraction Skill

Extract clean article text from web pages, removing ads, navigation, and clutter.

When to Use

  • Content archiving
  • Research collection
  • Reading list management
  • Content analysis

Core Capabilities

  • Main content extraction
  • Metadata extraction (title, author, date)
  • Image extraction
  • Clean HTML/Markdown output
  • Multi-page article handling
  • Paywall bypass (where legal)
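
To make the first two capabilities concrete, here is a minimal sketch of main content and metadata extraction using only Python's standard library. It is a simplified illustration, not the algorithm used by Readability, newspaper3k, or Trafilatura: the `SKIP_TAGS` set, the `ArticleParser` class, and the `extract_article` helper are all hypothetical names, and the rule "keep `<p>` text that sits outside clutter tags" is an assumption chosen for brevity.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as clutter in this sketch (an assumption,
# not the rule set of any of the libraries listed under Tools).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


class ArticleParser(HTMLParser):
    """Collect <title>, <meta name="author">, and <p> text outside clutter tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.author = None
        self.paragraphs = []
        self._skip_depth = 0    # greater than 0 while inside a clutter tag
        self._in_title = False
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "author":
            self.author = attrs.get("content")
        elif tag == "p" and self._skip_depth == 0:
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace inside the paragraph.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p and self._skip_depth == 0:
            self._buf.append(data)


def extract_article(html):
    """Return a dict with the page title, author (or None), and body text."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "author": parser.author,
        "text": "\n\n".join(parser.paragraphs),
    }
```

Calling `extract_article(html)` on a page string returns the title, the author if a `<meta name="author">` tag is present, and the paragraph text joined by blank lines. Real pages need far more heuristics than this, which is why the dedicated libraries exist.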

Tools

# Readability (Node.js; needs a DOM implementation such as jsdom)
npm install @mozilla/readability jsdom

# newspaper3k (Python)
pip install newspaper3k
python -c "from newspaper import Article; a = Article('URL'); a.download(); a.parse(); print(a.text)"

# Trafilatura (Python)
pip install trafilatura
trafilatura -u "URL"

Best Practices

  • Respect robots.txt
  • Cache extracted content
  • Preserve attribution
  • Handle different CMS formats

Resources