| name | marker |
| description | Convert documents (PDF, EPUB, PPTX, DOCX, XLSX, HTML, images) to Markdown/JSON/HTML using marker-pdf with Claude Haiku LLM enhancement for accurate table, math, and form extraction. Use when user needs to extract content from documents, convert PDFs to markdown, or process document files. |
Marker Document Converter
Convert PDF, EPUB, PPTX, DOCX, XLSX, HTML, and image files to clean Markdown/JSON/HTML format using the marker-pdf tool with multimodal LLM enhancement.
Prerequisites
# Install marker-pdf with full document support
uv tool install "marker-pdf[full]"
Requires Python 3.10+ and PyTorch.
Basic Usage
marker_single "<file_path>" \
--output_format markdown \
--output_dir "<output_directory>" \
--use_llm \
--llm_service marker.services.claude.ClaudeService \
--claude_model_name claude-haiku-4-5 \
--claude_api_key $ANTHROPIC_API_KEY \
--disable_image_extraction
Note: --disable_image_extraction skips image extraction, so the output is text only. Remove this flag if images need to be preserved.
Output Formats
| Format | Description | Use Case |
|---|---|---|
| markdown | Formatted text with tables, LaTeX equations ($$-fenced), code blocks, image links | General document conversion |
| html | Semantic HTML with <img>, <math>, <pre> tags | Web display |
| json | Hierarchical structure with block types, bounding boxes, section hierarchy | Programmatic processing |
| chunks | Flattened JSON optimized for RAG | Vector database ingestion |
CLI Options
Core Options
- --output_format: markdown (default), html, json, chunks
- --output_dir: Directory for output files
- --page_range: Specific pages, e.g., "0,5-10,20"
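The --page_range value is a comma-separated list of page numbers and ranges. Below is a minimal sketch of a syntax check to run before invoking marker_single, assuming a POSIX shell with grep -E. The function is_valid_page_range is a hypothetical helper, not part of marker, and marker's own parser may accept or reject forms that this check does not.

```shell
# Hypothetical helper (not part of marker): check that a string looks like
# "0,5-10,20". Only the syntax is checked, not whether ranges are ascending
# or within the document's page count.
is_valid_page_range() {
  # Reject anything containing characters other than digits, commas, hyphens.
  case "$1" in
    *[!0-9,-]*) return 1 ;;
  esac
  # Require number or number-number items separated by single commas.
  printf '%s\n' "$1" | grep -Eq '^[0-9]+(-[0-9]+)?(,[0-9]+(-[0-9]+)?)*$'
}

if is_valid_page_range "0,5-10,20"; then
  echo "page range looks well-formed"
fi
```

A caller could run this check first and only then pass the value to --page_range.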
LLM Enhancement
- --use_llm: Enable LLM for improved accuracy (tables, forms, math, handwriting)
- --llm_service: LLM service class (see LLM Services below)
- --block_correction_prompt: Custom prompt for output refinement
OCR & Processing
- --force_ocr: Force OCR on the entire document; also converts inline math to LaTeX
- --strip_existing_ocr: Remove existing OCR text and re-process
- --redo_inline_math: Highest quality inline math conversion (use with --use_llm)
Image & Output Control
- --disable_image_extraction: Skip image extraction (plain text only)
- --paginate_output: Add page separators to output
- --extract_images: Enable image extraction (default: true)
Advanced
- --config_json: Load configuration from JSON file
- --debug: Enable diagnostic logging
- --force_layout_block: Force layout type, e.g., Table
- --converter_cls: Custom converter class
LLM Services
Claude (Default for this skill)
marker_single document.pdf \
--use_llm \
--llm_service marker.services.claude.ClaudeService \
--claude_api_key $ANTHROPIC_API_KEY \
--claude_model_name claude-haiku-4-5
OpenAI
marker_single document.pdf \
--use_llm \
--llm_service marker.services.openai.OpenAIService \
--openai_api_key $OPENAI_API_KEY \
--openai_model gpt-4o
Ollama (Local)
marker_single document.pdf \
--use_llm \
--llm_service marker.services.ollama.OllamaService \
--ollama_base_url "http://localhost:11434" \
--ollama_model llama3.2-vision
Google Gemini (marker's default if no service is specified)
export GOOGLE_API_KEY="your-api-key"
marker_single document.pdf --use_llm
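Each service above reads its credentials from a different environment variable, and Ollama needs none. As a rough sketch, a pre-flight check could confirm that the relevant variable is set before a conversion starts. The short service names and both function names are assumptions made for illustration; they are not marker options.

```shell
# Hypothetical helper: map a short service name to the environment variable
# used for its API key in the examples above. Ollama runs locally, so it
# maps to an empty name.
required_key_var() {
  case "$1" in
    claude) echo "ANTHROPIC_API_KEY" ;;
    openai) echo "OPENAI_API_KEY" ;;
    gemini) echo "GOOGLE_API_KEY" ;;
    ollama) echo "" ;;
    *) echo "unknown service: $1" >&2; return 2 ;;
  esac
}

# Return 0 when the key for the service is set (or no key is needed),
# 1 when it is missing, 2 when the service name is not recognized.
llm_key_available() {
  var=$(required_key_var "$1") || return 2
  if [ -z "$var" ]; then
    return 0
  fi
  # $var comes from the fixed list above, so eval is safe here.
  eval "value=\${$var:-}"
  [ -n "$value" ]
}
```

For example, `llm_key_available claude || echo "ANTHROPIC_API_KEY is not set" >&2` would warn before a run that is bound to fail.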
Examples
Convert PDF to Markdown (Plain Text)
marker_single "./docs/report.pdf" \
--output_format markdown \
--output_dir "./docs/" \
--use_llm \
--llm_service marker.services.claude.ClaudeService \
--claude_model_name claude-haiku-4-5 \
--claude_api_key $ANTHROPIC_API_KEY \
--disable_image_extraction
Convert with Images Preserved
marker_single "./docs/report.pdf" \
--output_format markdown \
--output_dir "./docs/" \
--use_llm \
--llm_service marker.services.claude.ClaudeService \
--claude_model_name claude-haiku-4-5 \
--claude_api_key $ANTHROPIC_API_KEY
Extract Tables Only
marker_single "./docs/spreadsheet.pdf" \
--use_llm \
--force_layout_block Table \
--converter_cls marker.converters.table.TableConverter \
--output_format json
Batch Convert Multiple Files
marker /path/to/input/folder --workers 4
Using JSON Config File
cat > config.json << EOF
{
"force_ocr": true,
"use_llm": true,
"output_format": "markdown",
"disable_image_extraction": true,
"strip_existing_ocr": true,
"redo_inline_math": true
}
EOF
marker_single document.pdf --config_json config.json
Output Structure
Markdown Output
- Image links: Markdown links to the extracted image files
- Tables: Formatted as markdown tables
- Equations: Fenced with $$...$$
- Code: Fenced with ```language
- Headings: # for sections
JSON Output
{
"pages": [
{
"id": "page_0",
"polygon": [[x1,y1], [x2,y2], ...],
"children": [
{
"id": "block_0",
"block_type": "Text|Table|Image|...",
"html": "<p>content</p>",
"polygon": [...],
"section_hierarchy": {...}
}
]
}
],
"metadata": {
"table_of_contents": [...],
"page_stats": [...]
}
}
Instructions
Confirm the input file path exists
Determine output directory (default: same as input file)
Use the AskUserQuestion tool to ask for the user's preferences (ask both questions together):
Question 1 - Image Extraction:
- Header: "Images"
- Question: "Do you need to extract the images from the document?"
- Options:
  - "No (Recommended)": Extract text only and produce a plain Markdown file
  - "Yes": Extract and save the images; the Markdown includes image links
Question 2 - LLM Service:
- Header: "LLM"
- Question: "Which LLM should be used to recognize image and table content?"
- Options:
  - "Claude Haiku (Recommended)": Fast and economical; requires ANTHROPIC_API_KEY
  - "Claude Sonnet": Higher quality; requires ANTHROPIC_API_KEY
  - "GPT-4o": OpenAI model; requires OPENAI_API_KEY
  - "Ollama (Local)": Runs locally; no API key needed
Based on the user's answers, construct the command:
- If "No" for images: add --disable_image_extraction
- Set LLM service parameters according to the selection:
  - Claude Haiku: --llm_service marker.services.claude.ClaudeService --claude_model_name claude-haiku-4-5 --claude_api_key $ANTHROPIC_API_KEY
  - Claude Sonnet: --llm_service marker.services.claude.ClaudeService --claude_model_name claude-sonnet-4-20250514 --claude_api_key $ANTHROPIC_API_KEY
  - GPT-4o: --llm_service marker.services.openai.OpenAIService --openai_api_key $OPENAI_API_KEY --openai_model gpt-4o
  - Ollama: --llm_service marker.services.ollama.OllamaService --ollama_base_url "http://localhost:11434" --ollama_model llama3.2-vision
Run the marker_single command with the chosen options
Report the output file location and any extraction notes
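The steps above can be sketched as a single POSIX shell function. This is an illustration under stated assumptions rather than a definitive implementation: run_marker, its three positional arguments, the short LLM names, and the DRY_RUN switch are all hypothetical, while the flags and model names are the ones listed in this document.

```shell
# Hypothetical helper: run_marker <file> <images: yes|no> <llm: haiku|sonnet|gpt-4o|ollama>
# With DRY_RUN set, it prints one argument per line instead of running anything
# and shows API keys as literal $VAR names so secrets stay out of logs.
run_marker() {
  file=$1
  images=$2
  llm=$3

  # Confirm the input file exists.
  if [ ! -f "$file" ]; then
    echo "input file not found: $file" >&2
    return 1
  fi

  # Default the output directory to the input file's directory.
  out_dir=$(dirname "$file")

  anthropic_key=${ANTHROPIC_API_KEY:-}
  openai_key=${OPENAI_API_KEY:-}
  if [ -n "${DRY_RUN:-}" ]; then
    anthropic_key='$ANTHROPIC_API_KEY'
    openai_key='$OPENAI_API_KEY'
  fi

  set -- marker_single "$file" --output_format markdown \
    --output_dir "$out_dir" --use_llm

  # Translate the LLM answer into service flags.
  case "$llm" in
    haiku)
      set -- "$@" --llm_service marker.services.claude.ClaudeService \
        --claude_model_name claude-haiku-4-5 --claude_api_key "$anthropic_key" ;;
    sonnet)
      set -- "$@" --llm_service marker.services.claude.ClaudeService \
        --claude_model_name claude-sonnet-4-20250514 --claude_api_key "$anthropic_key" ;;
    gpt-4o)
      set -- "$@" --llm_service marker.services.openai.OpenAIService \
        --openai_api_key "$openai_key" --openai_model gpt-4o ;;
    ollama)
      set -- "$@" --llm_service marker.services.ollama.OllamaService \
        --ollama_base_url "http://localhost:11434" --ollama_model llama3.2-vision ;;
    *)
      echo "unknown LLM choice: $llm" >&2
      return 2 ;;
  esac

  # Translate the image answer into a flag.
  if [ "$images" = "no" ]; then
    set -- "$@" --disable_image_extraction
  fi

  # Run the command, or print it when DRY_RUN is set.
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
}
```

Setting DRY_RUN=1 and calling `run_marker "./docs/report.pdf" no haiku` prints the arguments for review; unsetting DRY_RUN runs the same command for real.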