| name | html-to-pdf |
| description | Converts HTML files to PDF or PNG format. Use this skill when the user asks to convert, export, or generate PDF/PNG from HTML files, or when they want to create printable documents, presentations, or long-form images from web pages or HTML reports. |
HTML to PDF/PNG Converter Skill
This skill helps you convert HTML files to PDF or PNG format with various options for output quality and page layout.
When to Use This Skill
Use this skill when the user wants to:
- Convert HTML files to PDF
- Generate PNG screenshots from HTML
- Create printable documents from web pages
- Export HTML reports as PDFs
- Generate long-form images without page breaks
- Create presentation-ready PDFs from HTML
Available Conversion Methods
Method 1: Multi-Page PDF (Standard A4)
Best for: Printing, archiving, traditional documents
Command:
python html_to_pdf_final.py input.html output.pdf
Features:
- Standard A4 page format
- Zero margins for seamless appearance
- Background colors and gradients preserved
- Multiple pages with page breaks
- Optimized file size (~5-6 MB for typical reports)
Method 2: Single-Page Long Image PDF
Best for: Online viewing, presentations, no page breaks
Command:
python html_to_long_image.py input.html
Features:
- Generates a full-page PNG screenshot first
- Converts PNG to single-page PDF
- NO page breaks - entire content on one page
- Perfect for presentations and online viewing
- Smaller file size (~1-2 MB)
- Two output files:
.pngand.pdf
Method 3: Advanced Multi-Method Converter
Best for: Fallback options, compatibility
Command:
python html_to_pdf_converter.py input.html output.pdf
Features:
- Tries WeasyPrint first (best CSS support)
- Falls back to Playwright if needed
- Automatic dependency installation
- Handles complex CSS and gradients
Required Dependencies
The skill requires Playwright and Pillow. The scripts auto-install dependencies if missing:
pip install playwright pillow pypdf
playwright install chromium
Step-by-Step Instructions
For Standard Multi-Page PDF:
Verify the HTML file exists:
ls -la *.htmlRun the converter:
python html_to_pdf_final.py your_file.htmlOpen the result:
open your_file_final.pdf
For Single-Page Long Image PDF:
Verify the HTML file exists:
ls -la *.htmlRun the long image converter:
python html_to_long_image.py your_file.htmlCheck outputs:
# View the PNG screenshot open your_file_fullpage.png # View the PDF version open your_file_fullpage.pdf
Troubleshooting
Issue: Playwright browser not found
Solution:
playwright install chromium
Issue: Page breaks visible in PDF
Solution: Use the long image method instead:
python html_to_long_image.py your_file.html
Issue: Content appears cut off
Causes & Solutions:
- CSS animations not complete: Script waits 2 seconds for animations
- Lazy loading: Script scrolls through entire page to trigger loading
- Large file size: Scripts handle files up to 20MB+
Issue: Blank PDF output
Solution: Use the long image method which uses screenshot instead of PDF rendering:
python html_to_long_image.py your_file.html
Output Files
After conversion, you'll get:
Multi-Page PDF:
filename_final.pdf- Standard A4 multi-page PDF
Long Image Method:
filename_fullpage.png- Complete screenshot as PNG (6-10 MB)filename_fullpage.pdf- Single-page PDF from image (1-2 MB)
Best Practices
For online viewing/presentations: Use
html_to_long_image.py- No page breaks
- Smooth scrolling experience
- Smaller file size
For printing/archiving: Use
html_to_pdf_final.py- Standard A4 pages
- Better for physical printing
- Professional document format
For complex CSS: Use
html_to_pdf_converter.py- Multiple fallback methods
- Better compatibility
Implementation Notes
Script Locations
All scripts should be in the project directory:
html_to_pdf_final.py- Main multi-page converterhtml_to_long_image.py- Long image generatorhtml_to_pdf_converter.py- Advanced multi-method converter
Key Features Implemented
- Animation Handling: All scripts disable CSS animations/transitions
- Lazy Loading: Scripts scroll through content to trigger loading
- Background Preservation: All gradients and colors render correctly
- Zero Margins: Seamless page appearance without visible borders
- Chinese Font Support: Handles CJK characters properly
Examples
Example 1: Convert BP Report to Multi-Page PDF
python html_to_pdf_final.py 202510_Alpha_Intelligence_BP.html
# Output: 202510_Alpha_Intelligence_BP_final.pdf (22 pages, 5.7 MB)
Example 2: Create Single-Page Presentation PDF
python html_to_long_image.py 202510_Alpha_Intelligence_BP.html
# Output:
# - 202510_Alpha_Intelligence_BP_fullpage.png (6.1 MB)
# - 202510_Alpha_Intelligence_BP_fullpage.pdf (1.6 MB, 1 page)
Example 3: Batch Convert Multiple Files
for file in *.html; do
python html_to_long_image.py "$file"
done
Advanced Usage
Custom Output Paths
python html_to_pdf_final.py input.html custom_output.pdf
python html_to_long_image.py input.html
Check PDF Page Count
python -c "from pypdf import PdfReader; r = PdfReader('output.pdf'); print(f'Pages: {len(r.pages)}')"
Verify File Sizes
ls -lh *.pdf *.png
Performance Expectations
- Processing Speed: ~5-10 seconds for typical HTML files
- Memory Usage: ~100-200 MB during conversion
- PDF File Size: 1-6 MB depending on method
- PNG File Size: 6-10 MB for full-page screenshots
Success Criteria
A successful conversion should:
- ✅ Generate PDF/PNG without errors
- ✅ Include all HTML content (no truncation)
- ✅ Preserve colors, gradients, and styling
- ✅ Handle Chinese/CJK characters correctly
- ✅ Create readable file sizes (< 10 MB)
Quick Reference
| Need | Use | Output |
|---|---|---|
| Printing | html_to_pdf_final.py |
Multi-page A4 PDF |
| Online viewing | html_to_long_image.py |
Single-page PDF + PNG |
| Maximum compatibility | html_to_pdf_converter.py |
Multi-page PDF with fallbacks |