name	polydoc
description	Build advanced documentation systems using Pandoc filters compiled with GraalVM for the JVM/Clojure ecosystem. Use when creating documentation filters, processing code blocks, building searchable documentation books, running code in documents, or when the user mentions Pandoc filters, documentation processing, document transformation, or interactive documentation systems.

Polydoc - JVM-Native Pandoc Documentation System

Polydoc brings Pandoc's powerful filtering capabilities to the JVM/Clojure ecosystem, providing advanced document processing without Python/Node.js dependencies. It compiles to native code with GraalVM for fast execution.

Quick Start

# Install dependencies
brew bundle

# Start development REPL
bb nrepl

# In REPL
(dev)           ; Load dev namespace
(refresh)       ; Reload all namespaces
(lint)          ; Lint with clj-kondo
(run-all)       ; Run all tests

# Build and use
bb build-cli
polydoc --help

Key capabilities:

Code execution filters: Run Clojure, SQLite, JavaScript, Python code blocks
Rendering filters: Process PlantUML diagrams
Linting filters: Check Clojure code with clj-kondo
Include filters: Compose documents from multiple sources
Book building: Generate books with TOC and full-text search
Interactive viewer: HTTP-based document browser with SQLite-powered search

Core Concepts

Pandoc Filter Architecture

Polydoc processes documents through Pandoc's filter system:

Document (Markdown/etc) 
  → Pandoc Parser 
  → JSON AST 
  → Polydoc Filter (JVM/Clojure) 
  → Modified AST 
  → Pandoc Writer 
  → Output (HTML/PDF/etc)

Pandoc AST Structure:

;; Complete Pandoc document
{:pandoc-api-version [1 23 1]
 :meta {:title {:t "MetaString" :c "My Document"}}
 :blocks [{:t "Header" :c [1 ["id" [] []] [{:t "Str" :c "Title"}]]}
          {:t "Para" :c [{:t "Str" :c "Content"}]}
          {:t "CodeBlock" 
           :c [["" ["clojure"] []] 
               "(+ 1 2)"]}]}

;; Common block types
{:t "Para" :c [...]}           ; Paragraph
{:t "Header" :c [level attrs content]}  ; Header
{:t "CodeBlock" :c [attrs code]}        ; Code block
{:t "BulletList" :c [...]}              ; Bullet list
{:t "OrderedList" :c [...]}             ; Numbered list
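Because the parsed AST is ordinary Clojure data, it can be inspected with core sequence functions before any filter is written. The following is a minimal sketch, assuming the keywordized shape shown above; `sample-doc`, `block-type-counts`, and `code-languages` are hypothetical names used only for illustration.

```clojure
(ns example.ast-summary)

;; Hypothetical sample document in the keywordized shape shown above.
(def sample-doc
  {:pandoc-api-version [1 23 1]
   :meta {}
   :blocks [{:t "Header" :c [1 ["intro" [] []] [{:t "Str" :c "Intro"}]]}
            {:t "Para" :c [{:t "Str" :c "Content"}]}
            {:t "CodeBlock" :c [["" ["clojure"] []] "(+ 1 2)"]}
            {:t "Para" :c [{:t "Str" :c "More"}]}]})

(defn block-type-counts
  "Count top-level blocks by their :t tag."
  [doc]
  (frequencies (map :t (:blocks doc))))

(defn code-languages
  "Collect the first class of every top-level CodeBlock."
  [doc]
  (->> (:blocks doc)
       (filter #(= "CodeBlock" (:t %)))
       (map #(-> % :c first second first))
       (remove nil?)
       vec))

(block-type-counts sample-doc)
;; => {"Header" 1, "Para" 2, "CodeBlock" 1}
(code-languages sample-doc)
;; => ["clojure"]
```

Only top-level blocks are counted here; blocks nested inside lists or divs would need a walk over the whole tree.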

Filter Processing Pattern

All Polydoc filters follow this pattern:

(ns polydoc.filters.example
  (:require [clojure.walk :as walk]
            [clojure.data.json :as json]))

(defn process-element
  "Transforms a single AST element."
  [element]
  (if (matches? element)
    (transform element)
    element))

(defn filter-document
  "Walks the entire AST and applies transformations."
  [ast]
  (walk/postwalk process-element ast))

(defn run-filter
  "Main entry point - reads JSON from stdin, processes, writes to stdout."
  []
  (let [ast (json/read *in* :key-fn keyword)
        modified (filter-document ast)]
    (json/write modified *out*)
    ;; Flush so Pandoc receives the complete JSON before the process exits
    (flush)))
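The pattern leaves `matches?` and `transform` as placeholders. One possible concrete pair is sketched below, assuming a filter that upper-cases plain text; the behavior is chosen purely for illustration and is not one of Polydoc's filters.

```clojure
(ns example.upcase-filter
  (:require [clojure.string :as str]
            [clojure.walk :as walk]))

(defn matches?
  "True for Str inline elements, the leaves that carry plain text."
  [element]
  (and (map? element) (= "Str" (:t element))))

(defn transform
  "Upper-case the text of a Str element."
  [element]
  (update element :c str/upper-case))

(defn process-element
  "Transform matching elements; pass everything else through unchanged."
  [element]
  (if (matches? element)
    (transform element)
    element))

(defn filter-document
  "Walk the whole AST bottom-up, applying process-element to each node."
  [ast]
  (walk/postwalk process-element ast))

(filter-document
  {:blocks [{:t "Para" :c [{:t "Str" :c "hello"}
                           {:t "Space"}
                           {:t "Str" :c "world"}]}]})
;; => {:blocks [{:t "Para" :c [{:t "Str" :c "HELLO"}
;;                             {:t "Space"}
;;                             {:t "Str" :c "WORLD"}]}]}
```

`postwalk` also visits strings, keywords, and map entries, which is why `matches?` checks `map?` first and `process-element` always returns the element it was given when nothing matches.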

Code Block Processing

Code blocks are the primary target for most filters:

;; Code block structure
{:t "CodeBlock"
 :c [["id" ["language" "class"] [["key" "value"]]]  ; attributes
     "code content"]}                                 ; code string

;; Extract language and code
(defn code-block? [element]
  (= "CodeBlock" (:t element)))

(defn get-language [code-block]
  (-> code-block :c first second first))

(defn get-code [code-block]
  (-> code-block :c second))

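;; Example: Process Clojure code blocks
;; (eval-clojure is a placeholder for your evaluation function)
(defn process-clojure-block [element]
  (if (and (code-block? element)
           (= "clojure" (get-language element)))
    (let [code (get-code element)
          result (eval-clojure code)]
      ;; Return modified block with result
      (assoc-in element [:c 1]
                (str code "\n;; => " result)))
    ;; Pass every other element through unchanged; returning nil here
    ;; would erase it when used with walk/postwalk
    element))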
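Filters often need more than the language: the block id, extra classes, and key-value options all live in the same attribute triple. A minimal sketch of unpacking it into a map follows; `code-block-attrs` and the result keys are hypothetical names, not part of Polydoc.

```clojure
(ns example.code-attrs)

(defn code-block-attrs
  "Turn a CodeBlock's [id classes key-values] attribute triple into a map."
  [code-block]
  (let [[[id classes kvs] code] (:c code-block)]
    {:id id
     :language (first classes)          ; by convention the first class
     :classes (vec (rest classes))      ; any remaining classes
     :options (into {} kvs)             ; [["k" "v"] ...] -> {"k" "v"}
     :code code}))

(code-block-attrs
  {:t "CodeBlock"
   :c [["ex1" ["clojure" "numberLines"] [["startFrom" "10"]]] "(+ 1 2)"]})
;; => {:id "ex1", :language "clojure", :classes ["numberLines"],
;;     :options {"startFrom" "10"}, :code "(+ 1 2)"}
```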

Common Workflows

Workflow 1: Creating a New Filter

Goal: Add a filter that processes code blocks in a specific language.

;; 1. Create filter namespace
(ns polydoc.filters.my-filter
  (:require [clojure.walk :as walk]
            [clojure.data.json :as json]))

;; 2. Define element matching
(defn my-code-block? [element]
  (and (= "CodeBlock" (:t element))
       (= "mylang" (-> element :c first second first))))

;; 3. Define transformation
(defn process-my-block [element]
  (if (my-code-block? element)
    (let [code (-> element :c second)
          result (execute-my-language code)]
      ;; Replace code block with result
      {:t "CodeBlock"
       :c [["" ["output"] []]
           (str code "\n\nOutput:\n" result)]})
    element))

;; 4. Walk the AST
(defn filter-ast [ast]
  (walk/postwalk process-my-block ast))

;; 5. Create CLI command handler
(defn run-my-filter
  "Process mylang code blocks in Pandoc AST."
  [opts]
  (let [ast (json/read *in* :key-fn keyword)
        modified (filter-ast ast)]
    (json/write modified *out*)
    (flush) ; make sure Pandoc sees the full JSON document
    {:exit 0}))

;; 6. Add to main.clj CLI configuration
;; In polydoc.main:
{:command "filter-mylang"
 :description "Process mylang code blocks"
 :opts []
 :runs polydoc.filters.my-filter/run-my-filter}

Test the filter:

# Create test document
echo '```mylang
my code here
```' > test.md

# Convert to Pandoc JSON
pandoc -t json test.md > test.json

# Run filter
clojure -M:main filter-mylang < test.json > output.json

# Convert back to markdown
pandoc -f json -t markdown output.json

Workflow 2: Processing Code Execution

Goal: Execute code blocks and include output in the document.

(ns polydoc.filters.run-code
  (:require [clojure.java.shell :as shell]))

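(defn execute-clojure
  "Evaluate the first form in `code`, capturing anything it prints."
  [code]
  (try
    (let [result (atom nil)
          ;; Capture stdout so printed text cannot corrupt the JSON stream
          output (with-out-str
                   (reset! result (eval (read-string code))))]
      {:success true
       :result (pr-str @result)
       :output output})
    (catch Exception e
      {:success false
       :error (.getMessage e)})))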

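;; Defined before execute-shell so the reference resolves at compile time
(defn interpreter-for
  "Return the [command eval-flag] pair for a language; defaults to sh."
  [language]
  (case language
    "python" ["python3" "-c"]
    "javascript" ["node" "-e"]   ; node's -c only syntax-checks a file
    "bash" ["bash" "-c"]
    ["sh" "-c"]))

(defn execute-shell
  "Execute code via an external interpreter."
  [code language]
  (try
    (let [[cmd flag] (interpreter-for language)
          result (shell/sh cmd flag code)]
      {:success (zero? (:exit result))
       :output (:out result)
       :error (:err result)})
    (catch Exception e
      {:success false
       :error (.getMessage e)})))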

(defn process-executable-block [element]
  (if (and (code-block? element)
           (executable-language? (get-language element)))
    (let [lang (get-language element)
          code (get-code element)
          result (if (= lang "clojure")
                  (execute-clojure code)
                  (execute-shell code lang))]
      (if (:success result)
        ;; Create new block with code + output
        {:t "Div"
         :c [["" [] []]
             [{:t "CodeBlock" :c [["" [lang] []] code]}
              ;; Shell results carry :output; Clojure results also carry :result
              {:t "CodeBlock" :c [["" ["output"] []]
                                  (str (:output result) (:result result))]}]]}
        ;; Show error
        {:t "Div"
         :c [["" ["error"] []]
             [{:t "CodeBlock" :c [["" [lang] []] code]}
              {:t "Para" :c [{:t "Strong" :c [{:t "Str" :c "Error:"}]}
                             {:t "Space"}
                             {:t "Str" :c (:error result)}]}]]}))
    element))
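A code block often holds several top-level forms and may print as well as return a value. The sketch below reads every form from the block and captures stdout while evaluating. It runs in the current process with no sandboxing, so it is only suitable for trusted documents; `read-forms` and `eval-with-output` are hypothetical names.

```clojure
(ns example.eval-forms
  (:import [java.io PushbackReader StringReader]))

(defn read-forms
  "Read every top-level form from a string of Clojure source."
  [code]
  (let [rdr (PushbackReader. (StringReader. code))
        eof (Object.)]                     ; unique end-of-input sentinel
    (loop [forms []]
      (let [form (read {:eof eof} rdr)]
        (if (identical? form eof)
          forms
          (recur (conj forms form)))))))

(defn eval-with-output
  "Evaluate all forms in order; return the last value and captured stdout."
  [code]
  (let [result (atom nil)
        output (with-out-str
                 (doseq [form (read-forms code)]
                   (reset! result (eval form))))]
    {:result (pr-str @result)
     :output output}))

(eval-with-output "(println \"hi\") (+ 1 2)")
;; => {:result "3", :output "hi\n"}
```

Capturing stdout matters in a filter because anything the evaluated code prints would otherwise be interleaved with the JSON that Pandoc reads.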

Workflow 3: Building a Documentation Book

Goal: Combine multiple documents into a searchable book with TOC.

(ns polydoc.book
  (:require [next.jdbc :as jdbc]
            [honey.sql :as sql]
            [clojure.java.io :as io]
            [clojure.data.json :as json]
            [clojure.walk :as walk])) ; needed by extract-sections

;; Database schema for book index
(def schema
  {:books {:book_id :integer-primary-key
           :name :text
           :metadata :text}
   :sections {:section_id :integer-primary-key
              :book_id :integer
              :content :text
              :hash :text
              :title :text
              :level :integer}})

(defn create-index-db
  "Create SQLite database for book index."
  [db-path]
  (let [db {:dbtype "sqlite" :dbname db-path}
        ds (jdbc/get-datasource db)]
    (jdbc/execute! ds
      ["CREATE TABLE IF NOT EXISTS books (
          book_id INTEGER PRIMARY KEY,
          name TEXT,
          metadata TEXT)"])
    (jdbc/execute! ds
      ["CREATE TABLE IF NOT EXISTS sections (
          section_id INTEGER PRIMARY KEY,
          book_id INTEGER,
          title TEXT,
          content TEXT,
          hash TEXT,
          level INTEGER,
          FOREIGN KEY(book_id) REFERENCES books(book_id))"])
    (jdbc/execute! ds
      ["CREATE VIRTUAL TABLE IF NOT EXISTS sections_fts 
        USING fts5(title, content, content=sections)"])
    ds))

(defn index-section
  "Add a section to the index."
  [ds book-id section]
  (jdbc/execute! ds
    (sql/format {:insert-into :sections
                 :values [{:book_id book-id
                          :title (:title section)
                          :content (:content section)
                          :hash (:hash section)
                          :level (:level section)}]})))

(defn extract-sections
  "Extract sections from Pandoc AST."
  [ast]
  (let [sections (atom [])]
    (walk/postwalk
      (fn [element]
        (when (= "Header" (:t element))
          (let [[level [id classes attrs] content] (:c element)]
            (swap! sections conj
                   {:level level
                    :id id
                    :title (text-from-inlines content)
                    :content (serialize-blocks element)})))
        element)
      ast)
    @sections))

(defn build-book
  "Build book from TOC file and source documents."
  [{:keys [toc output-db output-html]}]
  (let [ds (create-index-db output-db)
        toc-data (parse-toc toc)
        book-id (create-book ds toc-data)]
    ;; Process each document in TOC
    (doseq [doc (:documents toc-data)]
      (let [ast (parse-document doc)
            sections (extract-sections ast)]
        (doseq [section sections]
          (index-section ds book-id section))))
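    ;; sections_fts is an external-content table, so it is not updated by
    ;; inserts into sections; rebuild it once all sections are indexed
    (jdbc/execute! ds
      ["INSERT INTO sections_fts(sections_fts) VALUES('rebuild')"])
    ;; Generate HTML with search
    (generate-html ds book-id output-html)))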
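The workflow above calls `text-from-inlines` and stores a `hash` per section without showing either. Below is a minimal sketch of both, assuming SHA-256 as the hash (the source does not say which algorithm Polydoc uses) and handling only a handful of inline types; `content-hash` is a hypothetical name.

```clojure
(ns example.sections
  (:import [java.security MessageDigest]))

(defn text-from-inlines
  "Flatten a vector of Pandoc inline elements to plain text."
  [inlines]
  (apply str
         (map (fn [{:keys [t c]}]
                (case t
                  "Str" c
                  ("Space" "SoftBreak") " "
                  "Code" (second c)                  ; c is [attrs text]
                  ("Emph" "Strong" "Strikeout") (text-from-inlines c)
                  ""))                               ; ignore other inline types
              inlines)))

(defn content-hash
  "Hex-encoded SHA-256 of a string (assumed algorithm for sections.hash)."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

(text-from-inlines [{:t "Str" :c "Hello"}
                    {:t "Space"}
                    {:t "Strong" :c [{:t "Str" :c "world"}]}])
;; => "Hello world"
```

A stable content hash lets a rebuild skip sections whose text has not changed, which is one plausible use for the `hash` column.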

Workflow 4: Search Implementation

Goal: Provide full-text search across documentation.

(ns polydoc.search
  (:require [next.jdbc :as jdbc]))

(defn search-sections
  "Search for sections matching query."
  [ds query]
  ;; Written as plain SQL: honey.sql 2.x has no `call` function, and FTS5's
  ;; MATCH operator and hidden `rank` column are simplest to express directly.
  (jdbc/execute! ds
    ["SELECT s.section_id, s.title, s.content, s.level,
             highlight(sections_fts, 1, '<mark>', '</mark>') AS highlighted
      FROM sections_fts
      JOIN sections AS s ON s.section_id = sections_fts.rowid
      WHERE sections_fts MATCH ?
      ORDER BY rank
      LIMIT 50"
     query]))

(defn format-search-results
  "Format search results for display."
  [results]
  (for [result results]
    {:title (:sections/title result)
     :snippet (:highlighted result)
     :level (:sections/level result)
     :section-id (:sections/section_id result)}))

;; CLI command
(defn run-search
  "Search documentation index."
  [{:keys [query db]}]
  (let [ds (jdbc/get-datasource {:dbtype "sqlite" :dbname db})
        results (search-sections ds query)]
    (doseq [result (format-search-results results)]
      (println (str "## " (:title result)))
      (println (:snippet result))
      (println))
    {:exit 0}))
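Raw user input passed to `MATCH` is parsed as FTS5 query syntax, so characters such as `-`, `:` or an unbalanced quote can raise a syntax error. One way to avoid that, sketched here under the assumption that plain all-terms matching is wanted, is to wrap each term in double quotes and double any embedded quote. `fts-phrase-query` is a hypothetical helper.

```clojure
(ns example.fts-query
  (:require [clojure.string :as str]))

(defn fts-phrase-query
  "Quote each whitespace-separated term so FTS5 treats it as a string."
  [user-input]
  (->> (str/split (str/trim user-input) #"\s+")
       (remove str/blank?)
       ;; FTS5 strings are double-quoted; embedded quotes are doubled
       (map #(str "\"" (str/replace % "\"" "\"\"") "\""))
       (str/join " ")))

(fts-phrase-query "hello  world")
;; => "\"hello\" \"world\""
```

The result can be passed as the `query` argument of `search-sections`. Quoting this way gives up FTS5 operators such as `OR` and prefix `*` in exchange for queries that always parse.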

Workflow 5: Interactive Documentation Viewer

Goal: Serve documentation with live search via HTTP.

(ns polydoc.viewer
  (:require [org.httpkit.server :as http]
            [hiccup.core :as h]
            [next.jdbc :as jdbc] ; used by start-viewer
            [polydoc.search :as search]))

(defn render-page
  "Render HTML page with search."
  [content query results]
  (h/html
    [:html
     [:head
      [:title "Polydoc Viewer"]
      [:style "
        body { font-family: sans-serif; max-width: 800px; margin: 0 auto; }
        .search { padding: 20px 0; }
        .result { padding: 10px; margin: 10px 0; border-left: 3px solid #007bff; }
        mark { background: yellow; }
      "]]
     [:body
      [:h1 "Documentation"]
      [:div.search
       [:form {:method "GET"}
        [:input {:type "text" :name "q" :value query :placeholder "Search..."}]
        [:button "Search"]]]
      [:div.results
       (for [result results]
         [:div.result
          [:h3 (:title result)]
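          ;; hiccup.core/html does not escape string content, so the
          ;; <mark> tags from highlight() render as HTML. The snippet is
          ;; therefore only safe if the indexed content is trusted.
          [:div (:snippet result)]])]]]))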

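(defn handler [ds]
  (fn [req]
    ;; Assumes params middleware (for example Ring's wrap-params and
    ;; wrap-keyword-params) has populated :params; http-kit does not
    ;; parse the query string by itself
    (let [query (get-in req [:params :q])
          ;; render-page reads :title and :snippet, so format the raw rows
          results (when query
                    (search/format-search-results
                      (search/search-sections ds query)))]
      {:status 200
       :headers {"Content-Type" "text/html"}
       :body (render-page nil query results)})))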

(defn start-viewer
  "Start HTTP server for documentation viewer."
  [{:keys [db port]
    :or {port 8080}}]
  (let [ds (jdbc/get-datasource {:dbtype "sqlite" :dbname db})
        server (http/run-server (handler ds) {:port port})]
    (println (str "Viewer running at http://localhost:" port))
    (println "Press Ctrl+C to stop")
    @(promise)))
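The handler reads `[:params :q]`, but an http-kit request map carries only the raw `:query-string` unless params middleware is added. If pulling in Ring middleware is not desired, a small parser along the following lines could fill the gap; `parse-query-string` is a hypothetical helper, and repeated keys simply keep the last value.

```clojure
(ns example.query-params
  (:require [clojure.string :as str])
  (:import [java.net URLDecoder]))

(defn parse-query-string
  "Parse a raw query string such as \"q=hello+world\" into a keyword map."
  [query-string]
  (if (str/blank? query-string)
    {}
    (into {}
          (for [pair (str/split query-string #"&")
                :let [[k v] (str/split pair #"=" 2)]
                :when (not (str/blank? k))]
            ;; URLDecoder turns + into a space and decodes %XX escapes
            [(keyword (URLDecoder/decode k "UTF-8"))
             (URLDecoder/decode (or v "") "UTF-8")]))))

(parse-query-string "q=hello+world&page=2")
;; => {:q "hello world", :page "2"}
```

Inside the handler this would be used as `(:q (parse-query-string (:query-string req)))`.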

CLI Command Structure with cli-matic

Polydoc uses cli-matic for command-line interface:

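(ns polydoc.main
  (:require [cli-matic.core :as cli]
            [polydoc.filters.run-clojure :as run-clojure]
            ;; The next four namespace names are assumed from the aliases
            ;; used in CONFIGURATION below
            [polydoc.filters.run-sqlite :as run-sqlite]
            [polydoc.filters.run-javascript :as run-js]
            [polydoc.filters.lint-clojure :as lint]
            [polydoc.filters.include :as include]
            [polydoc.filters.plantuml :as plantuml]
            [polydoc.book :as book]
            [polydoc.search :as search]
            [polydoc.viewer :as viewer])
  (:gen-class))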

(def CONFIGURATION
  {:app {:command "polydoc"
         :description "Advanced Pandoc documentation system for JVM"
         :version "0.1.0"}
   
   :global-opts []
   
   :commands
   [{:command "filter"
     :description "Pandoc filter commands"
     :subcommands
     [{:command "run-clojure"
       :description "Execute Clojure code blocks"
       :opts []
       :runs run-clojure/run-filter}
      
      {:command "run-sqlite"
       :description "Execute SQLite queries"
       :opts []
       :runs run-sqlite/run-filter}
      
      {:command "run-javascript"
       :description "Execute JavaScript code blocks"
       :opts []
       :runs run-js/run-filter}
      
      {:command "render-plantuml"
       :description "Render PlantUML diagrams"
       :opts [{:option "output-dir"
               :short "o"
               :as "Output directory for images"
               :type :string
               :default "./images"}]
       :runs plantuml/run-filter}
      
      {:command "lint-clojure"
       :description "Lint Clojure code with clj-kondo"
       :opts []
       :runs lint/run-filter}
      
      {:command "include"
       :description "Include external files in document"
       :opts [{:option "base-dir"
               :short "b"
               :as "Base directory for includes"
               :type :string
               :default "."}]
       :runs include/run-filter}]}
    
    {:command "book"
     :description "Book building commands"
     :subcommands
     [{:command "toc"
       :description "Print table of contents"
       :opts [{:option "file"
               :short "f"
               :as "TOC file path"
               :type :string
               :default :present}]   ; cli-matic marks mandatory options this way
       :runs book/print-toc}
      
      {:command "build"
       :description "Build entire book with index"
       :opts [{:option "toc"
               :short "t"
               :as "TOC file"
               :type :string
               :default :present}    ; cli-matic marks mandatory options this way
              {:option "output-db"
               :short "d"
               :as "Output database path"
               :type :string
               :default "book.db"}
              {:option "output-html"
               :short "o"
               :as "Output HTML file"
               :type :string
               :default "book.html"}]
       :runs book/build-book}]}
    
    {:command "search"
     :description "Search documentation"
     :opts [{:option "query"
             :short "q"
             :as "Search query"
             :type :string
             :default :present}      ; cli-matic marks mandatory options this way
            {:option "db"
             :short "d"
             :as "Database path"
             :type :string
             :default "book.db"}]
     :runs search/run-search}
    
    {:command "view"
     :description "Start interactive documentation viewer"
     :opts [{:option "db"
             :short "d"
             :as "Database path"
             :type :string
             :default "book.db"}
            {:option "port"
             :short "p"
             :as "HTTP port"
             :type :int
             :default 8080}]
     :runs viewer/start-viewer}]})

(defn -main [& args]
  (cli/run-cmd args CONFIGURATION))

When to Use Each Feature

Use polydoc filter run-clojure when:

Documenting Clojure libraries with live examples
Including test results in documentation
Generating tables/charts from data

Use polydoc filter render-plantuml when:

Creating architecture diagrams
Documenting system design
Visualizing workflows

Use polydoc filter lint-clojure when:

Ensuring code examples are valid
Catching errors before publication
Maintaining code quality in docs

Use polydoc book build when:

Creating comprehensive documentation sites
Combining multiple documents
Needing searchable documentation

Use polydoc search when:

Finding information across documentation
Testing search functionality
Building search-based tools

Use polydoc view when:

Developing documentation locally
Previewing changes before publish
Creating interactive documentation sites

Best Practices

DO:

Test filters with simple AST examples first
Use walk/postwalk for AST transformations
Handle errors gracefully (return original element if processing fails)
Add comprehensive CLI help text
Use SQLite FTS5 for full-text search
Cache expensive operations (diagram rendering, code execution)
Validate Pandoc AST structure before processing
Use keyword keys for JSON parsing (:key-fn keyword)

DON'T:

Mutate AST elements (return new ones)
Execute untrusted code without sandboxing
Skip error handling in filters
Forget to flush output after JSON write
Hardcode paths (use CLI options)
Process the same element multiple times
Modify elements you're not targeting
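Two of the points above, returning the original element when processing fails and leaving untargeted elements alone, can be combined in one small wrapper. The following is a sketch only; `safely` and `fragile` are hypothetical names, and `fragile` fails on purpose to show the fallback.

```clojure
(ns example.safe-filter
  (:require [clojure.walk :as walk]))

(defn safely
  "Wrap an element transform so a failure returns the element unchanged."
  [transform]
  (fn [element]
    (try
      (transform element)
      (catch Exception e
        (binding [*out* *err*]          ; stdout is reserved for the JSON AST
          (println "filter error:" (.getMessage e)))
        element))))

;; Hypothetical transform that fails on one specific block.
(defn fragile [element]
  (if (and (map? element) (= "CodeBlock" (:t element)))
    (if (= "boom" (-> element :c second))
      (throw (ex-info "cannot process block" {}))
      (assoc-in element [:c 1] "processed"))
    element))

(defn filter-document [ast]
  (walk/postwalk (safely fragile) ast))

(filter-document
  {:blocks [{:t "CodeBlock" :c [["" [] []] "ok"]}
            {:t "CodeBlock" :c [["" [] []] "boom"]}]})
;; => {:blocks [{:t "CodeBlock" :c [["" [] []] "processed"]}
;;              {:t "CodeBlock" :c [["" [] []] "boom"]}]}
```

Errors are reported on stderr so that one failing block neither aborts the run nor corrupts the JSON written to stdout.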

Common Issues

Issue: "Filter produces invalid JSON"

Cause: Malformed AST structure returned

Solution: Validate AST structure matches Pandoc spec

;; Wrong: Missing required fields
{:t "CodeBlock" :c ["code"]}

;; Right: Complete structure
{:t "CodeBlock" :c [["" ["lang"] []] "code"]}

;; Use schema validation
(require '[malli.core :as m])

(def CodeBlock
  [:map
   [:t [:= "CodeBlock"]]
   [:c [:tuple
        [:tuple string? [:sequential string?] [:sequential [:tuple string? string?]]]
        string?]]])

(m/validate CodeBlock element)

Issue: "Code execution fails"

Cause: Missing interpreter or invalid code

Solution: Add error handling and validation

(defn safe-execute [code language]
  (try
    (if (interpreter-exists? language)
      (execute-code code language)
      {:error (str "No interpreter for " language)})
    (catch Exception e
      {:error (.getMessage e)})))
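`interpreter-exists?` is not defined above. One way to implement it, sketched here for Unix-like systems, is to look for the command in the directories listed in `PATH`. The `interpreters` table and `on-path?` are hypothetical names that mirror the commands used in Workflow 2.

```clojure
(ns example.interpreters
  (:require [clojure.java.io :as io]
            [clojure.string :as str])
  (:import [java.io File]
           [java.util.regex Pattern]))

(defn on-path?
  "True when an executable file named `cmd` exists in a PATH directory."
  [cmd]
  (let [path (or (System/getenv "PATH") "")
        dirs (str/split path (re-pattern (Pattern/quote File/pathSeparator)))]
    (boolean
      (some (fn [dir]
              (let [f (io/file dir cmd)]
                (and (.isFile f) (.canExecute f))))
            dirs))))

;; Hypothetical language-to-command table mirroring Workflow 2.
(def interpreters
  {"python" "python3"
   "javascript" "node"
   "bash" "bash"})

(defn interpreter-exists?
  "True when the language is known and its command is on the PATH."
  [language]
  (if-let [cmd (interpreters language)]
    (on-path? cmd)
    false))
```

Checking up front lets the filter report "No interpreter for ..." clearly instead of surfacing the lower-level exception that `shell/sh` throws when the command is missing. On Windows the lookup would also need to consider extensions such as `.exe`.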

Issue: "Search returns no results"

Cause: FTS index not synced with content table

Solution: Rebuild FTS index

(defn rebuild-fts-index [ds]
  (jdbc/execute! ds
    ["INSERT INTO sections_fts(sections_fts) VALUES('rebuild')"]))

Issue: "Performance degradation with large documents"

Cause: Processing entire AST multiple times

Solution: Use transducers or single-pass processing

;; Instead of multiple walk/postwalk calls
(defn multi-filter [ast]
  (walk/postwalk
    (fn [element]
      (-> element
          (process-code-blocks)
          (process-diagrams)
          (process-includes)))
    ast))

Development Workflow

REPL-Driven Development

;; 1. Start REPL
;; bb nrepl (in terminal)

;; 2. Load dev namespace
(dev)

;; 3. Load and test filter
(require '[polydoc.filters.my-filter] :reload) ; :reload goes outside the libspec

;; 4. Test with sample AST
(def test-ast
  {:blocks [{:t "CodeBlock" :c [["" ["clojure"] []] "(+ 1 2)"]}]})

(polydoc.filters.my-filter/filter-ast test-ast)

;; 5. Verify output
;; my-filter only targets "mylang" blocks, so this clojure block
;; passes through unchanged:
;; => {:blocks [{:t "CodeBlock" :c [["" ["clojure"] []] "(+ 1 2)"]}]}

;; 6. Lint and test
(lint)
(run-all)

;; 7. Build and test CLI
;; bb build-cli (in terminal)

Testing Filters End-to-End

# 1. Create test markdown
cat > test.md << EOF
# Test Document

\`\`\`clojure
(+ 1 2 3)
\`\`\`
EOF

# 2. Convert to Pandoc JSON
pandoc -t json test.md -o test.json

# 3. Run filter
clojure -M:main filter run-clojure < test.json > output.json

# 4. Convert back
pandoc -f json output.json -o output.md

# 5. Verify
cat output.md

Building Native Image

# Full build pipeline
bb build-cli

# Separate steps
bb compile          # Compile Clojure to classes
bb build-uberjar    # Create standalone JAR
bb build-gvm        # Compile with GraalVM

# Test native binary
./polydoc --help

Integration with Pandoc

Basic Pipeline

# Pandoc's --filter option takes a single executable and cannot pass
# subcommands, so pipe the JSON AST through polydoc explicitly.

# Markdown → Filter → HTML
pandoc input.md -t json \
  | polydoc filter run-clojure \
  | polydoc filter render-plantuml \
  | pandoc -f json -o output.html

# Multiple filters in sequence
pandoc input.md -t json \
  | polydoc filter include \
  | polydoc filter run-clojure \
  | polydoc filter lint-clojure \
  | pandoc -f json -o output.html

Makefile Integration

POLYDOC = polydoc
PANDOC = pandoc

%.html: %.md
	$(PANDOC) $< -t json \
		| $(POLYDOC) filter run-clojure \
		| $(POLYDOC) filter render-plantuml \
		| $(PANDOC) -f json --standalone -o $@

book: book.yaml chapters/*.md
	$(POLYDOC) book build -t book.yaml -o book.html -d book.db

search:
	$(POLYDOC) search -q "$(QUERY)" -d book.db

serve:
	$(POLYDOC) view -d book.db -p 8080

clean:
	rm -f *.html *.json book.db

Dependencies

Core Libraries

cli-matic (0.5.4): CLI framework
clojure.data.json (2.5.1): Pandoc AST parsing
next.jdbc (1.3.1070): Database access
honeysql (2.7.1350): SQL DSL
sqlite-jdbc (3.47.1.0): SQLite driver
malli (0.19.2): Schema validation
bling (0.8.8): Terminal formatting

Development Tools

clj-kondo: Linting
kaocha: Testing
clj-reload: Namespace reloading
hashp: Debug printing

External Requirements

Pandoc (>=2.0): Document processing
GraalVM (optional): Native compilation
PlantUML (optional): Diagram rendering

Resources

Related Skills

Clojure REPL: REPL-driven development
cli-matic: CLI framework
next.jdbc: Database operations
HoneySQL: SQL generation
Malli: Data validation
hashp: Debug printing

External Documentation

Pandoc Filters: Official filter documentation
Pandoc AST: AST specification
SQLite FTS5: Full-text search

Summary

Polydoc brings Pandoc's document processing to the JVM with:

Fast execution - GraalVM native compilation
Rich filters - Code execution, rendering, linting
Searchable docs - SQLite FTS5 full-text search
Interactive viewer - HTTP-based documentation browser
JVM-native - No Python/Node.js dependencies

Core workflow:

Write documents with code blocks
Process with Polydoc filters (execute, render, lint)
Build searchable books with TOC
Serve interactively with HTTP viewer

Use Polydoc when: Building advanced documentation systems on the JVM, processing code in documents, creating searchable documentation, or avoiding Python/Node.js dependencies for Pandoc filters.
