Claude Code Plugins

Community-maintained marketplace

Finding Open Access Papers

@kthorn/research-superpower

Use Unpaywall API to find free full-text versions of paywalled papers

SKILL.md

name Finding Open Access Papers
description Use Unpaywall API to find free full-text versions of paywalled papers
when_to_use When a paper is behind a paywall. When PMC full text is not available. When DOI resolution hits a paywall. When free access to a paper is needed. Before giving up on full text access.
version 1.0.0

Finding Open Access Papers

Overview

Use Unpaywall to find legally available open access versions of papers that appear to be behind paywalls.

Core principle: Many paywalled papers have free versions (preprints, author manuscripts, institutional repositories). Unpaywall finds them.

When to Use

Use this skill when:

  • DOI resolution hits a paywall
  • Paper not available in PubMed Central
  • Publisher site requires subscription
  • Need full text for highly relevant paper (score ≥7)

Use BEFORE giving up on full text access

Unpaywall API

Simple REST API. No API key or login is needed for reasonable usage, but every request must include an email parameter.

Basic Request

curl "https://api.unpaywall.org/v2/DOI?email=YOUR_EMAIL"

Parameters:

  • DOI - The paper's DOI (URL-encoded if needed)
  • email - User's email (required, for courtesy/contact)

IMPORTANT: Ask user for their email at the start of research session. Do NOT use placeholder emails like claude@anthropic.com or researcher@example.com.

Example (user@example.com only shows the format; substitute the email the user provided):

curl "https://api.unpaywall.org/v2/10.1038/nature12373?email=user@example.com"
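Some DOIs contain characters that are not safe in a URL path (for example `#`, `<`, or parentheses), which is why the DOI parameter may need URL encoding. The following is a minimal Python sketch of building the request URL; the helper name `build_unpaywall_url` and its email sanity check are assumptions for illustration, not part of the original skill.

```python
from urllib.parse import quote, urlencode

UNPAYWALL_BASE = "https://api.unpaywall.org/v2"


def build_unpaywall_url(doi: str, email: str) -> str:
    """Return the Unpaywall lookup URL for a DOI (hypothetical helper)."""
    if not email or "@" not in email:
        raise ValueError("a real contact email is required")
    # Keep "/" literal so the DOI prefix/suffix separator survives;
    # other characters that are unsafe in a path are percent-encoded.
    encoded_doi = quote(doi.strip(), safe="/")
    return f"{UNPAYWALL_BASE}/{encoded_doi}?{urlencode({'email': email})}"
```

The email value is percent-encoded as well (the `@` becomes `%40`), which is equivalent to the literal form once the server decodes the query string.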

Response Format

{
  "doi": "10.1038/nature12373",
  "title": "Paper Title",
  "is_oa": true,
  "best_oa_location": {
    "url": "https://europepmc.org/articles/pmc3858213",
    "url_for_pdf": "https://europepmc.org/articles/pmc3858213?pdf=render",
    "version": "publishedVersion",
    "license": "cc-by",
    "host_type": "repository"
  },
  "oa_locations": [
    {
      "url": "https://europepmc.org/articles/pmc3858213",
      "version": "publishedVersion"
    },
    {
      "url": "https://arxiv.org/abs/1234.5678",
      "version": "submittedVersion"
    }
  ]
}

Key Response Fields

is_oa (boolean)

  • true - Open access version available
  • false - No free version found

best_oa_location (object or null)

  • Unpaywall's recommended best open access source
  • Prioritizes published versions over preprints
  • Includes PDF URL when available

oa_locations (array)

  • All known open access locations
  • Includes repositories, preprint servers, institutional sites
  • Ordered by quality/version

version types:

  • publishedVersion - Final published version (best)
  • acceptedVersion - Author's accepted manuscript (good)
  • submittedVersion - Preprint before peer review (useful)

Implementation Pattern

1. Check Unpaywall After Paywall Hit

# Try DOI first
curl -L "https://doi.org/10.1234/example.2023"

# If paywall detected (403, subscription required, etc):
curl "https://api.unpaywall.org/v2/10.1234/example.2023?email=your@email.com"

2. Extract Best URL

# Parse JSON response
response=$(curl -s "https://api.unpaywall.org/v2/DOI?email=EMAIL")

# Check if OA available (quote "$response" so the JSON reaches jq intact)
is_oa=$(printf '%s' "$response" | jq -r '.is_oa')

if [ "$is_oa" = "true" ]; then
  # Get best PDF URL, falling back to the landing page URL
  pdf_url=$(printf '%s' "$response" | jq -r '.best_oa_location.url_for_pdf // .best_oa_location.url')

  # Download (create the target folder first)
  mkdir -p papers
  curl -L -o "papers/paper.pdf" "$pdf_url"
fi

3. Report to User

When OA found:

⚠️ Paper behind paywall at publisher
✓ Found open access version via Unpaywall!
   Source: Europe PMC (published version)
   PDF: https://europepmc.org/articles/pmc3858213?pdf=render
   → Downloading...

When no OA found:

⚠️ Paper behind paywall at publisher
✗ No open access version found via Unpaywall
   Options:
   - Request via institutional access
   - Contact authors for preprint
   - Continue with abstract only
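The two report shapes above could be produced by one small function. This is only a sketch: `format_oa_report` is a hypothetical helper name, and the wording simply mirrors the sample messages.

```python
def format_oa_report(pdf_url=None, version=None, source=None):
    """Build the user-facing status message (hypothetical helper)."""
    lines = ["⚠️ Paper behind paywall at publisher"]
    if pdf_url:
        lines += [
            "✓ Found open access version via Unpaywall!",
            f"   Source: {source or 'unknown'} ({version or 'unknown version'})",
            f"   PDF: {pdf_url}",
            "   → Downloading...",
        ]
    else:
        lines += [
            "✗ No open access version found via Unpaywall",
            "   Options:",
            "   - Request via institutional access",
            "   - Contact authors for preprint",
            "   - Continue with abstract only",
        ]
    return "\n".join(lines)
```

Calling `format_oa_report()` with no arguments yields the "no OA found" message, so the caller can pass the result of a lookup straight through.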

4. Prioritize by Version

If multiple locations available:

Priority order:

  1. publishedVersion from publisher or PMC
  2. acceptedVersion from institutional repository
  3. submittedVersion from preprint server (arXiv, bioRxiv)
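The priority order above can be expressed as a sort key over the `oa_locations` array. A minimal sketch, assuming each location is a dict with a `version` field as in the sample response; `rank_locations` and `VERSION_RANK` are hypothetical names, and the host (publisher, PMC, repository) is not considered here.

```python
VERSION_RANK = {"publishedVersion": 0, "acceptedVersion": 1, "submittedVersion": 2}


def rank_locations(oa_locations):
    """Sort OA locations best-first by version; unknown versions go last."""
    return sorted(
        oa_locations,
        # Missing or unrecognized versions get a rank after all known ones.
        key=lambda loc: VERSION_RANK.get(loc.get("version"), len(VERSION_RANK)),
    )
```

Because Python's sort is stable, locations with the same version keep the order Unpaywall returned them in.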

Integration with evaluating-paper-relevance

Add to full text fetching workflow:

Stage 2: Fetch Full Text

Try in order:
A. PubMed Central (free full text)
B. DOI resolution → If paywall, try Unpaywall
C. Unpaywall direct lookup
D. Preprints (bioRxiv, arXiv)

Updated workflow:

# Pseudocode: has_pmc_fulltext, fetch_pmc, is_paywall, has_oa,
# fetch_unpaywall_pdf, and report_no_fulltext are placeholder helpers.

# 1. Try PMC
pmc_result=$(curl -s "https://eutils.ncbi.nlm.nih.gov/...")
if has_pmc_fulltext; then
  fetch_pmc
  exit 0
fi

# 2. Try DOI
doi_result=$(curl -sL "https://doi.org/$doi")
if is_paywall; then
  # 3. Try Unpaywall
  unpaywall_result=$(curl -s "https://api.unpaywall.org/v2/$doi?email=$EMAIL")
  if has_oa; then
    fetch_unpaywall_pdf
    exit 0
  fi
fi

# 4. No full text available
report_no_fulltext
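The same "try each source in order" idea can be sketched in Python as a chain of fetchers. Everything here is illustrative: `fetch_full_text` is a hypothetical name, and each fetcher is assumed to be a callable that takes a DOI and returns a location string or `None`.

```python
from typing import Callable, Iterable, Optional, Tuple

Fetcher = Callable[[str], Optional[str]]


def fetch_full_text(doi: str, sources: Iterable[Tuple[str, Fetcher]]):
    """Try each source in order; return (source_name, location) or (None, None)."""
    for name, fetch in sources:
        try:
            location = fetch(doi)
        except Exception as exc:  # a failing source should not stop the chain
            print(f"{name} failed for {doi}: {exc}")
            continue
        if location:
            return name, location
    return None, None
```

Passing the sources as a list keeps the order (PMC, DOI resolution, Unpaywall, preprints) in one visible place and makes it easy to stub individual sources in tests.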

Rate Limiting

Free tier (with email):

  • 100,000 requests per day
  • No hard rate limit, but be respectful
  • Include email in requests (required)

Best practices:

  • Add 100ms delay between requests
  • Cache responses (don't re-check same DOI)
  • Only check for papers you actually need
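The first two best practices (a delay between requests and a per-DOI cache) can be combined in one small wrapper. This is a sketch under assumptions: `ThrottledCache` is a hypothetical class, `lookup` is whatever function actually calls Unpaywall, and the clock and sleep functions are injectable only to make the behavior easy to test.

```python
import time


class ThrottledCache:
    """Cache lookups per DOI and space out uncached calls (hypothetical helper)."""

    def __init__(self, lookup, min_interval=0.1, clock=time.monotonic, sleep=time.sleep):
        self._lookup = lookup
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_call = None

    def get(self, doi):
        key = doi.strip().lower()  # DOIs are case-insensitive
        if key in self._cache:
            return self._cache[key]  # no network call, no delay
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        self._cache[key] = self._lookup(doi)
        return self._cache[key]
```

Only uncached lookups are delayed, so repeated checks of the same DOI within a session cost nothing.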

Python Helper Example

import os
import time

import requests

def find_open_access(doi, email):
    """
    Find open access version via Unpaywall
    Returns: (pdf_url, version, source) or (None, None, None)
    """
    url = f"https://api.unpaywall.org/v2/{doi}"
    params = {"email": email}

    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()
    except (requests.RequestException, ValueError) as e:
        # Network failure, non-2xx status, or a body that is not valid JSON
        print(f"Error checking Unpaywall for {doi}: {e}")
        return None, None, None

    if not data.get('is_oa'):
        return None, None, None

    best_loc = data.get('best_oa_location')
    if not best_loc:
        return None, None, None

    # Prefer the direct PDF link; fall back to the landing page URL
    pdf_url = best_loc.get('url_for_pdf') or best_loc.get('url')
    version = best_loc.get('version') or 'unknown'
    source = best_loc.get('host_type') or 'unknown'

    return pdf_url, version, source

# Usage
doi = "10.1038/nature12373"
# Ask the user for their real email; never hard-code a placeholder address
email = input("Contact email for Unpaywall requests: ").strip()
pdf_url, version, source = find_open_access(doi, email)

if pdf_url:
    print(f"Found {version} at {source}")
    print(f"PDF: {pdf_url}")
    # Download PDF (the URL may be a landing page if no direct PDF link exists)
    os.makedirs("papers", exist_ok=True)
    response = requests.get(pdf_url, timeout=30)
    response.raise_for_status()
    with open(f'papers/{doi.replace("/", "_")}.pdf', 'wb') as f:
        f.write(response.content)
else:
    print("No open access version found")

time.sleep(0.1)  # Rate limiting: pause before the next lookup

Common Sources Found

Repositories:

  • Europe PMC / PubMed Central
  • Institutional repositories (university sites)
  • PubMed Central international mirrors

Preprint servers:

  • bioRxiv (biology)
  • medRxiv (medicine)
  • arXiv (physics, CS, math)
  • ChemRxiv (chemistry)

Publisher sites:

  • Open access journals
  • Hybrid journals (OA articles in subscription journals)
  • Delayed open access (embargo expired)

Error Handling

DOI not found:

{
  "error": "true",
  "message": "DOI not found"
}

→ Check DOI format, try alternative identifiers
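A quick format check before retrying can catch DOIs that were pasted with a resolver prefix or are clearly malformed. The sketch below is a heuristic only: `normalize_doi` is a hypothetical helper, and the pattern (a `10.` prefix, a 4 to 9 digit registrant code, a slash, then a non-empty suffix) matches most modern DOIs but is an assumption, not a rule taken from the Unpaywall documentation.

```python
import re

# Heuristic shape of a modern DOI: 10.<registrant>/<suffix>
_DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
_PREFIX = re.compile(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw: str):
    """Strip common prefixes and return a bare DOI, or None if it looks malformed."""
    doi = _PREFIX.sub("", raw.strip())
    return doi if _DOI_PATTERN.match(doi) else None
```

A `None` result suggests asking the user to re-check the identifier rather than sending another request.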

Network errors:

  • Retry with exponential backoff
  • Maximum 3 attempts
  • Report to user if all fail
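The retry policy above (exponential backoff, at most 3 attempts) might look like the following sketch. `with_retries` is a hypothetical helper and the 1-second base delay is an assumption. It catches `OSError`, which covers Python's built-in connection and timeout errors as well as the exceptions raised by the `requests` library.

```python
import time


def with_retries(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on OSError wait base_delay * 2**attempt, then retry."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except OSError as exc:
            last_error = exc
            # No point sleeping after the final failed attempt.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

With the defaults, the waits are 1 second and then 2 seconds; if the third attempt also fails, the last error propagates so the caller can report it to the user. Note that this retries every `OSError`, so a caller that raises on HTTP status codes may want to handle a 404 before wrapping the call.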

Malformed response:

  • Check for is_oa field
  • Fallback to oa_locations array if best_oa_location missing
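The two checks above can be folded into one defensive accessor. A minimal sketch, assuming the response has already been parsed into a Python dict shaped like the sample in Response Format; `pick_pdf_url` is a hypothetical name.

```python
def pick_pdf_url(data):
    """Return the best available URL from an Unpaywall response dict, or None."""
    if not isinstance(data, dict) or "is_oa" not in data:
        return None  # not a recognizable Unpaywall record
    candidates = []
    if data.get("best_oa_location"):
        candidates.append(data["best_oa_location"])
    # Fall back to the full list when best_oa_location is missing or null.
    candidates.extend(data.get("oa_locations") or [])
    for loc in candidates:
        url = loc.get("url_for_pdf") or loc.get("url")
        if url:
            return url
    return None
```

The function never raises on missing keys, so a malformed or partial response simply results in `None`.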

Quick Reference

Task                     Command
Check if OA available    curl "https://api.unpaywall.org/v2/DOI?email=EMAIL"
Get best PDF URL         Parse .best_oa_location.url_for_pdf
List all OA sources      Parse .oa_locations[]
Check version type       Look at .version field
Download PDF             curl -L -o paper.pdf "$pdf_url"

Integration Points

Called by:

  • evaluating-paper-relevance - When full text not in PMC
  • answering-research-questions - For highly relevant papers

Updates:

  • papers-reviewed.json - Note if OA found
  • SUMMARY.md - Include OA source info

Common Mistakes

  • Using placeholder email: Using claude@anthropic.com or researcher@example.com → Ask user for their real email
  • Not including email: Required parameter, requests will fail
  • Checking every paper: Only check when needed (score ≥7, no PMC)
  • Ignoring version type: Published version better than preprint
  • Single source only: Check oa_locations array for alternatives
  • No rate limiting: Add delays even though no hard limit

Success Criteria

Successful when:

  • Paywalled paper's OA version found and downloaded
  • Version type recorded (published/accepted/submitted)
  • User informed about source and version
  • Fallback options provided if no OA available

Next Steps

After finding OA version:

  • Download PDF to papers/ folder
  • Note source and version in SUMMARY.md
  • Continue with deep dive analysis
  • If no OA: note in summary, continue with abstract only