SKILL.md

name: training-hub
description: Fine-tune LLMs using the Red Hat training-hub library with SFT, LoRA, and OSFT algorithms. Use when preparing JSONL datasets, running training jobs, configuring hardware, scaling to clusters, evaluating models, or deploying with vLLM.

Training Hub

Red Hat's unified library for LLM post-training: SFT, LoRA, and OSFT (continual learning).

Quick Reference

| Task | Command |
|------|---------|
| Recommend config | `python scripts/recommend_config.py --model <model> --hardware <hw>` |
| Estimate memory | `python scripts/estimate_memory.py --model <model> --method sft --hardware h100` |
| Validate dataset | `python scripts/validate_dataset.py data.jsonl` |
| Full fine-tuning | `from training_hub import sft` |
| LoRA training | `from training_hub import lora_sft` |
| OSFT (continual) | `from training_hub import osft` |

Installation

pip install training-hub              # Basic
pip install training-hub[lora]        # LoRA with Unsloth (2x faster)
pip install training-hub[cuda] --no-build-isolation  # CUDA support
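
As a quick check after installing, the three entry points used throughout this guide should import cleanly:

from training_hub import sft, lora_sft, osft  # no error means the core package is installed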

Get Started Fast

# Get optimal config for your hardware
python scripts/recommend_config.py \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --hardware rtx-5090

Data Format

Training data must be JSONL, one JSON object per line, each using the messages structure:

{"messages": [{"role": "user", "content": "Q"}, {"role": "assistant", "content": "A"}]}

Validate before training:

python scripts/validate_dataset.py ./training_data.jsonl

For data preparation details, see DATA-FORMATS.md.

Training Methods

Supervised Fine-Tuning (SFT)

Full-parameter fine-tuning. Requires significant VRAM.

from training_hub import sft

result = sft(
    model_path="Qwen/Qwen2.5-7B-Instruct",
    data_path="./training_data.jsonl",
    ckpt_output_dir="./checkpoints",
    num_epochs=3,
    effective_batch_size=8,
    learning_rate=2e-5,
    max_seq_len=2048,
    max_tokens_per_gpu=45000,
)

LoRA Fine-Tuning

Memory-efficient adaptation (up to 2x faster and 70% less VRAM than full SFT):

from training_hub import lora_sft

result = lora_sft(
    model_path="Qwen/Qwen2.5-7B-Instruct",
    data_path="./training_data.jsonl",
    ckpt_output_dir="./outputs",
    lora_r=16,
    lora_alpha=32,
    num_epochs=3,
    learning_rate=2e-4,
)

QLoRA (4-bit): Add load_in_4bit=True for large models on limited VRAM.
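
A minimal QLoRA sketch, combining the lora_sft call above with load_in_4bit (the 70B model choice is illustrative):

from training_hub import lora_sft

result = lora_sft(
    model_path="meta-llama/Llama-3.1-70B-Instruct",  # illustrative large model
    data_path="./training_data.jsonl",
    ckpt_output_dir="./outputs",
    load_in_4bit=True,   # quantize the frozen base weights to 4-bit (QLoRA)
    lora_r=16,
    lora_alpha=32,
    num_epochs=3,
    learning_rate=2e-4,
)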

OSFT (Continual Learning)

Adapt a model to new data without catastrophic forgetting of its existing capabilities:

from training_hub import osft

result = osft(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    data_path="./domain_data.jsonl",
    ckpt_output_dir="./checkpoints",
    unfreeze_rank_ratio=0.25,
    effective_batch_size=16,
    learning_rate=2e-5,
)

For all parameters, see ALGORITHMS.md.

Hardware Support

| Hardware | VRAM | Best For |
|----------|------|----------|
| RTX 5090 | 32GB | 8B LoRA, 70B QLoRA |
| DGX Spark | 128GB | 70B SFT |
| H100 | 80GB | 14B SFT, 70B LoRA |
| 8×H100 | 640GB | 70B SFT |

# Check if your config fits
python scripts/estimate_memory.py \
  --model meta-llama/Llama-3.1-70B-Instruct \
  --method lora \
  --hardware h100 \
  --num-gpus 8

For hardware-specific configs, see HARDWARE.md.

Scaling

Multi-GPU:

result = sft(..., nproc_per_node=8)

Multi-node:

result = sft(..., nnodes=2, node_rank=0, nproc_per_node=8, rdzv_endpoint="0.0.0.0:29500")
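
The second node would run the same call with only node_rank changed; in this sketch, node0.example.com:29500 is a placeholder for a rendezvous endpoint every node can reach:

result = sft(..., nnodes=2, node_rank=1, nproc_per_node=8, rdzv_endpoint="node0.example.com:29500")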

For Slurm, Kubernetes, and datacenter deployments, see SCALE.md.

Algorithm Selection

| Scenario | Method |
|----------|--------|
| First-time fine-tuning, large dataset | SFT |
| Memory constrained | LoRA |
| Very large model (70B+), limited VRAM | LoRA + QLoRA |
| Preserve existing capabilities | OSFT |
| Domain adaptation, small dataset | OSFT |

Documentation

| Topic | File |
|-------|------|
| Hardware profiles & configs | HARDWARE.md |
| All algorithm parameters | ALGORITHMS.md |
| Data formats & conversion | DATA-FORMATS.md |
| Datacenter & cluster setup | SCALE.md |
| Model evaluation | EVALUATION.md |
| vLLM inference & serving | INFERENCE.md |
| Advanced techniques | ADVANCED.md |
| Model-specific configs | MODELS.md |
| Troubleshooting | TROUBLESHOOTING.md |
| Distributed training | DISTRIBUTED.md |

Utility Scripts

| Script | Purpose |
|--------|---------|
| recommend_config.py | Generate optimal config for model + hardware |
| estimate_memory.py | Estimate GPU memory requirements |
| validate_dataset.py | Validate JSONL dataset format |
| convert_to_jsonl.py | Convert CSV, Alpaca, ShareGPT to JSONL |
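
If you prefer to do a conversion inline rather than call the script, the mapping itself is small; a sketch for Alpaca-style records (instruction/input/output keys per the common Alpaca layout; file names are placeholders):

import json

def alpaca_to_messages(example: dict) -> dict:
    """Map one Alpaca-style record onto the messages format shown above."""
    user_content = example["instruction"]
    if example.get("input"):
        user_content += "\n\n" + example["input"]
    return {"messages": [
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": example["output"]},
    ]}

with open("alpaca.json", encoding="utf-8") as src, \
     open("training_data.jsonl", "w", encoding="utf-8") as dst:
    for example in json.load(src):
        dst.write(json.dumps(alpaca_to_messages(example), ensure_ascii=False) + "\n")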

Troubleshooting

CUDA OOM: Reduce max_tokens_per_gpu, use LoRA + QLoRA, or add GPUs

Dataset errors: Run python scripts/validate_dataset.py first

LoRA multi-GPU: Requires torchrun --nproc-per-node=N script.py

Training diverges: Lower learning_rate (try 1e-5 for SFT, 1e-4 for LoRA)
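
For the OOM case, the quickest lever is usually max_tokens_per_gpu; an illustrative reduction from the SFT example above, written in the same shorthand as the Scaling section:

from training_hub import sft

result = sft(..., max_tokens_per_gpu=22000)  # down from 45000; reduce further if OOM persists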

For more, see TROUBLESHOOTING.md.