name	xml-element-extractor
description	Extract specific XML elements from source files using Python and optionally format with xmllint. This skill should be used when users need to isolate a single XML element (like InstrumentVector with Id="0") from a larger XML document, preserving the complete element structure including opening and closing tags.

XML Element Extractor

Overview

This skill enables precise extraction of XML elements from source files using Python's standard library (xml.etree.ElementTree), with optional xmllint formatting for clean output. Use this skill when you need to isolate a specific XML element from a larger document while maintaining its complete structure.

⚠️ Agent Guidelines

DO NOT READ XML FILES DIRECTLY

Never use Read tool on XML files - Large files can exceed context limits
Never display XML content directly - Always use the extraction script
Always use the Python extraction script - Optimized for efficient XML processing
Process XML externally - Let the script handle parsing, not the agent

This prevents context overflow and maintains performance.

Key advantages of the Python implementation:

Maximum compatibility: Uses only Python standard library, no third-party dependencies
Robust parsing: Proper XML parsing handles complex structures, special characters, and encoding
Attribute order insensitive: Works regardless of attribute order in XML tags
Better error handling: Provides clear error messages and graceful failure handling
Cross-platform: Works consistently across different operating systems

Quick Start

To extract an XML element:

Identify the source XML file path
Choose destination file path for the extracted element
Specify the exact opening tag (including attributes) of the element to extract
Execute the extraction script
Verify the output file contains the desired element

Extraction Process

Step 1: Prepare Input Parameters

Gather the following parameters:

Source XML file: Path to the input XML file containing the element to extract
Destination XML file: Path where the extracted element will be saved
Element tag: Exact opening tag including all attributes (e.g., <InstrumentVector Id="0">)

Step 2: Execute Extraction Script

Run the extraction script with the three parameters:

python3 scripts/extract_xml_element.py <source_file> <dest_file> <element_tag>

Example:

python3 scripts/extract_xml_element.py source.xml dest.xml '<InstrumentVector Id="0">'

Note: The script is implemented in Python but maintains the .py extension for compatibility with existing workflows. The shebang line ensures it executes with Python.

Step 3: Verify Output

The script will:

Extract the first matching XML element from the source file
Save it to the destination file
Optionally format the output with xmllint if available
Report success or failure with appropriate error messages

Error Handling

The script provides basic error handling for common issues:

Missing source file: Displays error if source file doesn't exist
Unreadable source file: Displays error if source file cannot be read
No matching element: Displays error if the specified element tag is not found
Failed extraction: Displays error if the extraction process fails

XML Formatting

If xmllint is available on the system, the script will automatically format the extracted XML with:

2-space indentation
Proper XML structure validation
Clean, readable output format

If xmllint is not available, the extracted element will be saved in its original formatting.

Usage Examples

Basic Extraction

# Extract InstrumentVector with Id="0"
python3 scripts/extract_xml_element.py live_set.xml instrument_vector.xml '<InstrumentVector Id="0">'

# Extract MidiTrack element
python3 scripts/extract_xml_element.py tracks.xml midi_track.xml '<MidiTrack Id="42">'

# Extract complex elements with quotes and special characters
python3 scripts/extract_xml_element.py live_set.xml device.xml '<MxDeviceAudioEffect Id="5">'

Error Cases

# This will fail - source file doesn't exist
python3 scripts/extract_xml_element.py missing.xml output.xml '<Element>'

# This will fail - no matching element found
python3 scripts/extract_xml_element.py source.xml output.xml '<NonExistentTag>'

# This will fail - malformed XML
python3 scripts/extract_xml_element.py malformed.xml output.xml '<Element>'

Resources

scripts/

extract_xml_element.py: Main Python script that performs XML element extraction using Python's standard library. The script:

Takes three parameters: source file, destination file, and element tag
Validates input parameters and file accessibility
Uses xml.etree.ElementTree for robust XML parsing and element extraction
Handles complex XML structures, special characters, and attribute order variations
Optionally formats output with xmllint if available, with Python minidom fallback
Provides clear error messages for common failure cases
Cross-platform compatible (Windows, macOS, Linux)

references/

python_xml.md: Reference documentation for XML element extraction process, including explanations of the Python parsing logic and troubleshooting guides for common extraction issues.

Technical Details

Python XML Processing

The script uses Python's standard library with the following key components:

XML Parsing:

xml.etree.ElementTree.parse(): Parses XML files into structured element trees
tree.iter(tag_name): Iterates through all elements with matching tag names
ET.tostring(): Converts elements back to XML strings

Tag Matching Algorithm:

Attribute order normalization: Sorts attributes for consistent comparison
Regex-based tag parsing: Extracts tag names and attributes using regular expressions
Exact matching: Ensures precise element identification including all attributes

Error Detection:

XML parsing validation using ET.ParseError
File I/O error handling with proper exception catching
Empty output file detection and cleanup
Timeout handling for external tool calls

Formatting Options:

Primary: xmllint with XMLLINT_INDENT environment variable for 2-space indentation
Fallback: Python's xml.dom.minidom for cross-platform compatibility

Cross-Platform Compatibility

The Python implementation provides consistent behavior across:

Windows: Uses subprocess calls with proper path handling
macOS: Native Unix-style subprocess execution
Linux: Standard Python subprocess behavior

No platform-specific tools are required, making the skill truly portable.

xml-element-extractor

Install Skill

SKILL.md