name: ffmpeg-hardware-acceleration description: Complete GPU-accelerated encoding/decoding system for FFmpeg 7.1 LTS and 8.0.1 (latest stable, released 2025-11-20). PROACTIVELY activate for: (1) NVIDIA NVENC/NVDEC encoding, (2) Intel Quick Sync Video (QSV), (3) AMD AMF encoding, (4) Apple VideoToolbox, (5) Linux VAAPI setup, (6) Vulkan Video 8.0 (FFv1, AV1, VP9, ProRes RAW), (7) VVC/H.266 hardware decoding (VAAPI/QSV), (8) GPU pipeline optimization with pad_cuda, (9) Docker GPU containers, (10) Performance benchmarking. Provides: Platform-specific commands, preset comparisons, quality tuning, full GPU pipeline examples, Vulkan compute codecs, VVC decoding, troubleshooting guides. Ensures: Maximum encoding speed with optimal quality using GPU acceleration.
CRITICAL GUIDELINES
Windows File Path Requirements
MANDATORY: Always Use Backslashes on Windows for File Paths
When using Edit or Write tools on Windows, you MUST use backslashes (\) in file paths, NOT forward slashes (/).
Quick Reference
| Platform | Encoder | Decoder | Detect Command |
|---|---|---|---|
| NVIDIA | h264_nvenc, hevc_nvenc, av1_nvenc |
h264_cuvid, hevc_cuvid |
ffmpeg -encoders | grep nvenc |
| Intel QSV | h264_qsv, hevc_qsv, av1_qsv |
h264_qsv, hevc_qsv |
ffmpeg -encoders | grep qsv |
| AMD AMF | h264_amf, hevc_amf, av1_amf |
N/A (use software) | ffmpeg -encoders | grep amf |
| Apple | h264_videotoolbox, hevc_videotoolbox |
h264_videotoolbox |
macOS only |
| VAAPI | h264_vaapi, hevc_vaapi, av1_vaapi |
with -hwaccel vaapi |
Linux only |
When to Use This Skill
Use when GPU acceleration is needed:
- Encoding speed is critical (10-30x faster than CPU)
- Processing large batches of videos
- Real-time encoding for streaming
- Server-side transcoding at scale
- Docker containers with GPU passthrough
Key decision: GPU encoding trades some quality for massive speed. Use -cq or -qp for quality control.
FFmpeg Hardware Acceleration (2025)
Comprehensive guide to GPU-accelerated encoding and decoding with NVIDIA, Intel, AMD, Apple, and Vulkan.
Current Latest: FFmpeg 8.0.1 (released 2025-11-20) - Check with ffmpeg -version
Hardware Acceleration Overview
Hardware acceleration uses dedicated GPU/SoC components for video processing:
- NVENC/NVDEC (NVIDIA): Dedicated video encode/decode engines
- QSV (Intel): Quick Sync Video on Intel CPUs with integrated graphics
- AMF (AMD): Advanced Media Framework for AMD GPUs
- VideoToolbox (Apple): macOS/iOS hardware acceleration
- VAAPI (Linux): Video Acceleration API (Intel, AMD on Linux)
- Vulkan Video (Cross-platform): FFmpeg 7.1+/8.0 GPU acceleration
Performance Comparison (2025 Benchmarks)
| Method | Speed | Quality | Power | Use Case |
|---|---|---|---|---|
| libx264 (CPU) | 1x | Best | High | Quality-critical |
| libx265 (CPU) | 0.3x | Best | Very High | Archival |
| h264_nvenc | 10-20x | Good | Low | Real-time, streaming |
| hevc_nvenc | 8-15x | Good | Low | 4K streaming |
| h264_qsv | 8-15x | Good | Very Low | Laptop, efficiency |
| h264_amf | 8-15x | Good | Low | AMD systems |
NVIDIA NVENC/NVDEC
Requirements
- NVIDIA GPU (GTX 600+ / Quadro K series+)
- NVIDIA drivers 450+
- FFmpeg built with
--enable-nvenc --enable-cuda --enable-cuvid
Check NVIDIA Support
# Check available NVIDIA codecs
ffmpeg -encoders | grep nvenc
ffmpeg -decoders | grep cuvid
# Check GPU info
nvidia-smi
# List hardware accelerators
ffmpeg -hwaccels
Basic NVENC Encoding
# H.264 NVENC encoding
ffmpeg -i input.mp4 -c:v h264_nvenc -preset p4 -b:v 5M output.mp4
# H.265/HEVC NVENC encoding
ffmpeg -i input.mp4 -c:v hevc_nvenc -preset p4 -b:v 4M output.mp4
# AV1 NVENC (RTX 40 series+)
ffmpeg -i input.mp4 -c:v av1_nvenc -preset p4 -b:v 3M output.mp4
NVENC Presets (FFmpeg 7+)
| Preset | Speed | Quality | Use Case |
|---|---|---|---|
| p1 | Fastest | Lowest | Real-time capture |
| p2 | Faster | Low | Screen recording |
| p3 | Fast | Medium | General streaming |
| p4 | Medium | Good | Recommended |
| p5 | Slow | Better | High-quality streaming |
| p6 | Slower | Best | Offline encoding |
| p7 | Slowest | Highest | Maximum quality |
Full GPU Pipeline (Decode + Encode)
# Keep frames in GPU memory (fastest)
ffmpeg -y -vsync 0 \
-hwaccel cuda \
-hwaccel_output_format cuda \
-i input.mp4 \
-c:v h264_nvenc \
-preset p4 \
-b:v 5M \
-c:a copy \
output.mp4
GPU Scaling and Filtering
# GPU-based scaling with scale_cuda
ffmpeg -y -vsync 0 \
-hwaccel cuda \
-hwaccel_output_format cuda \
-i input.mp4 \
-vf scale_cuda=1280:720 \
-c:v h264_nvenc \
-preset p4 \
output.mp4
# GPU overlay with overlay_cuda
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i main.mp4 \
-hwaccel cuda -hwaccel_output_format cuda -i overlay.mp4 \
-filter_complex "[0:v][1:v]overlay_cuda=10:10" \
-c:v h264_nvenc output.mp4
# GPU padding with pad_cuda (FFmpeg 8.0+)
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
-vf "pad_cuda=1920:1080:(ow-iw)/2:(oh-ih)/2:black" \
-c:v h264_nvenc output.mp4
# Letterbox with pad_cuda
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
-vf "scale_cuda=1920:-2,pad_cuda=1920:1080:(ow-iw)/2:(oh-ih)/2" \
-c:v h264_nvenc output.mp4
NVENC Quality Optimization
# High quality with lookahead
ffmpeg -i input.mp4 \
-c:v hevc_nvenc \
-preset p5 \
-tune hq \
-rc vbr \
-cq 23 \
-b:v 0 \
-rc-lookahead 32 \
-spatial-aq 1 \
-temporal-aq 1 \
-c:a copy \
output.mp4
# Constant quality mode (CQP)
ffmpeg -i input.mp4 \
-c:v h264_nvenc \
-preset p4 \
-rc constqp \
-qp 23 \
-c:a copy \
output.mp4
NVENC Two-Pass Encoding
# Two-pass for best quality (ABR)
ffmpeg -i input.mp4 \
-c:v h264_nvenc \
-preset p5 \
-2pass 1 \
-b:v 5M \
-c:a copy \
output.mp4
NVENC B-Frames and GOP
# Enable B-frames for better compression
ffmpeg -i input.mp4 \
-c:v hevc_nvenc \
-preset p4 \
-bf 4 \
-b_ref_mode 2 \
-g 250 \
-c:a copy \
output.mp4
Intel Quick Sync Video (QSV)
Requirements
- Intel CPU with integrated graphics (Sandy Bridge+)
- Intel Media SDK or oneVPL
- FFmpeg built with
--enable-libmfxor--enable-libvpl
Check QSV Support
# Check available QSV codecs
ffmpeg -encoders | grep qsv
ffmpeg -decoders | grep qsv
# Verify Intel GPU access (Linux)
ls /dev/dri/
vainfo # Check VAAPI support
Basic QSV Encoding
# Initialize QSV device
ffmpeg -init_hw_device qsv=hw \
-filter_hw_device hw \
-i input.mp4 \
-c:v h264_qsv \
-preset medium \
-b:v 5M \
output.mp4
# H.265/HEVC QSV
ffmpeg -init_hw_device qsv=hw \
-filter_hw_device hw \
-i input.mp4 \
-c:v hevc_qsv \
-preset medium \
-b:v 4M \
output.mp4
# AV1 QSV (Intel Arc, 12th gen+)
ffmpeg -init_hw_device qsv=hw \
-filter_hw_device hw \
-i input.mp4 \
-c:v av1_qsv \
-preset medium \
-b:v 3M \
output.mp4
Full QSV Pipeline
# Decode and encode on GPU
ffmpeg -hwaccel qsv \
-hwaccel_output_format qsv \
-i input.mp4 \
-c:v h264_qsv \
-preset medium \
-b:v 5M \
output.mp4
QSV with Scaling
# GPU scaling with vpp_qsv
ffmpeg -hwaccel qsv \
-hwaccel_output_format qsv \
-i input.mp4 \
-vf "vpp_qsv=w=1280:h=720" \
-c:v h264_qsv \
-preset medium \
output.mp4
VVC/H.266 QSV Decoding (FFmpeg 7.1+)
# Hardware VVC decoding
ffmpeg -hwaccel qsv \
-hwaccel_output_format qsv \
-c:v vvc_qsv \
-i input.vvc \
-c:v h264_qsv \
output.mp4
AMD AMF
Requirements
- AMD GPU (GCN or newer)
- AMD drivers with AMF support
- FFmpeg built with
--enable-amf
Check AMF Support
ffmpeg -encoders | grep amf
Basic AMF Encoding
# H.264 AMF
ffmpeg -i input.mp4 \
-c:v h264_amf \
-quality balanced \
-b:v 5M \
output.mp4
# H.265/HEVC AMF
ffmpeg -i input.mp4 \
-c:v hevc_amf \
-quality balanced \
-b:v 4M \
output.mp4
# AV1 AMF (RDNA3+)
ffmpeg -i input.mp4 \
-c:v av1_amf \
-quality balanced \
-b:v 3M \
output.mp4
AMF Quality Presets
| Preset | Description |
|---|---|
| speed | Fastest encoding |
| balanced | Recommended |
| quality | Best quality |
AMD Hardware Upscaling (FFmpeg 8.0+)
# Super resolution upscaling
ffmpeg -i input.mp4 \
-vf "sr_amf=4096:2160:algorithm=sr1-1" \
-c:v hevc_amf \
output.mp4
VAAPI (Linux)
Requirements
- Intel, AMD, or NVIDIA GPU on Linux
- VAAPI drivers (intel-media-driver, mesa-va-drivers)
- FFmpeg built with
--enable-vaapi
Check VAAPI Support
# Check VAAPI info
vainfo
# List VAAPI codecs
ffmpeg -encoders | grep vaapi
ffmpeg -decoders | grep vaapi
Basic VAAPI Encoding
# H.264 VAAPI
ffmpeg -vaapi_device /dev/dri/renderD128 \
-i input.mp4 \
-vf 'format=nv12,hwupload' \
-c:v h264_vaapi \
-b:v 5M \
output.mp4
# H.265/HEVC VAAPI
ffmpeg -vaapi_device /dev/dri/renderD128 \
-i input.mp4 \
-vf 'format=nv12,hwupload' \
-c:v hevc_vaapi \
-b:v 4M \
output.mp4
Full VAAPI Pipeline
# Decode and encode on GPU
ffmpeg -hwaccel vaapi \
-hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi \
-i input.mp4 \
-c:v h264_vaapi \
-b:v 5M \
output.mp4
VAAPI Scaling
ffmpeg -hwaccel vaapi \
-hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi \
-i input.mp4 \
-vf 'scale_vaapi=w=1280:h=720' \
-c:v h264_vaapi \
output.mp4
VVC VAAPI Decoding (FFmpeg 8.0+)
FFmpeg 8.0 adds VVC/H.266 hardware decoding on Intel and AMD GPUs via VAAPI.
# Hardware VVC/H.266 decoding
ffmpeg -hwaccel vaapi \
-hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi \
-i input.vvc \
-c:v h264_vaapi \
output.mp4
# VVC decode + transcode to H.265
ffmpeg -hwaccel vaapi \
-hwaccel_device /dev/dri/renderD128 \
-hwaccel_output_format vaapi \
-i input.mkv \
-c:v hevc_vaapi \
-b:v 4M \
output.mp4
# VVC with Screen Content Coding (SCC) support
# FFmpeg 8.0 adds full SCC support including:
# - IBC (Inter Block Copy)
# - Palette Mode
# - ACT (Adaptive Color Transform)
ffmpeg -hwaccel vaapi \
-hwaccel_device /dev/dri/renderD128 \
-i screen_recording.vvc \
output.mp4
VVC VAAPI Requirements:
- Intel Xe2 graphics (Lunar Lake) or newer for full VVC support
- FFmpeg 8.0 or later
- Intel media driver with VVC support
Apple VideoToolbox
Requirements
- macOS 10.8+ or iOS 8+
- FFmpeg built with
--enable-videotoolbox
Check VideoToolbox Support
ffmpeg -encoders | grep videotoolbox
ffmpeg -decoders | grep videotoolbox
Basic VideoToolbox Encoding
# H.264 VideoToolbox
ffmpeg -i input.mp4 \
-c:v h264_videotoolbox \
-b:v 5M \
output.mp4
# H.265/HEVC VideoToolbox
ffmpeg -i input.mp4 \
-c:v hevc_videotoolbox \
-b:v 4M \
-tag:v hvc1 \
output.mp4
# ProRes VideoToolbox
ffmpeg -i input.mp4 \
-c:v prores_videotoolbox \
-profile:v 3 \
output.mov
VideoToolbox Quality Settings
# Quality-based encoding
ffmpeg -i input.mp4 \
-c:v h264_videotoolbox \
-q:v 65 \
output.mp4
# Hardware accelerated decode + encode
ffmpeg -hwaccel videotoolbox \
-i input.mp4 \
-c:v h264_videotoolbox \
-b:v 5M \
output.mp4
Vulkan Video (FFmpeg 7.1+/8.0)
Overview
Vulkan Video provides cross-platform GPU acceleration using Vulkan compute shaders. Unlike proprietary hardware accelerators, Vulkan codecs are based on compute shaders and work on any implementation of Vulkan 1.3.
FFmpeg 7.1 Vulkan Features
- H.264 Vulkan encoding
- H.265/HEVC Vulkan encoding
FFmpeg 8.0 Vulkan Features (New)
- AV1 Vulkan encoding - GPU-accelerated AV1 via compute shaders
- VP9 Vulkan decoding - Hardware-accelerated VP9 decode
- FFv1 Vulkan encode/decode - Lossless codec for archival/capture
- ProRes RAW Vulkan decode - Apple ProRes RAW hardware decode
Benefits of Vulkan Compute Codecs:
- Cross-platform: Same code works on AMD, Intel, and NVIDIA
- No vendor lock-in: Works with any Vulkan 1.3 driver
- Ideal for: Lossless screen capture, high-throughput archival, professional workflows
Basic Vulkan Encoding
# H.264 Vulkan
ffmpeg -init_hw_device vulkan \
-i input.mp4 \
-c:v h264_vulkan \
-b:v 5M \
output.mp4
# H.265/HEVC Vulkan
ffmpeg -init_hw_device vulkan \
-i input.mp4 \
-c:v hevc_vulkan \
-b:v 4M \
output.mp4
# AV1 Vulkan (FFmpeg 8.0+)
ffmpeg -init_hw_device vulkan \
-i input.mp4 \
-c:v av1_vulkan \
-b:v 3M \
output.mp4
# FFv1 Vulkan Lossless (FFmpeg 8.0+)
ffmpeg -init_hw_device vulkan \
-i input.mp4 \
-c:v ffv1_vulkan \
output.mkv
Full Vulkan Pipeline
# Complete Vulkan decode-filter-encode
ffmpeg -init_hw_device vulkan=vk \
-filter_hw_device vk \
-hwaccel vulkan \
-hwaccel_output_format vulkan \
-i input.mp4 \
-vf "scale_vulkan=1280:720" \
-c:v h264_vulkan \
output.mp4
VP9 Vulkan Decoding (FFmpeg 8.0+)
# Hardware decode VP9 with Vulkan
ffmpeg -init_hw_device vulkan \
-hwaccel vulkan \
-hwaccel_output_format vulkan \
-i input.webm \
-c:v h264_vulkan \
output.mp4
ProRes RAW Vulkan Decoding (FFmpeg 8.0+)
# Hardware decode ProRes RAW with Vulkan
ffmpeg -init_hw_device vulkan \
-hwaccel vulkan \
-i input.mov \
-c:v libx264 \
output.mp4
Lossless Screen Capture with FFv1 Vulkan
# High-throughput lossless capture (Linux X11)
ffmpeg -init_hw_device vulkan \
-f x11grab -framerate 60 -i :0.0 \
-c:v ffv1_vulkan \
screen_capture.mkv
# Lossless screen recording (Windows)
ffmpeg -init_hw_device vulkan \
-f gdigrab -framerate 60 -i desktop \
-c:v ffv1_vulkan \
screen_capture.mkv
Upcoming Vulkan Codecs
The next minor update will add:
- ProRes (encode and decode)
- VC-2 (encode and decode)
Docker with Hardware Acceleration
NVIDIA GPU in Docker
# Run with NVIDIA GPU support
docker run --gpus all \
--rm \
-v $(pwd):/data \
jrottenberg/ffmpeg:nvidia \
-hwaccel cuda \
-hwaccel_output_format cuda \
-i /data/input.mp4 \
-c:v h264_nvenc \
/data/output.mp4
Intel QSV in Docker
# Run with Intel GPU access
docker run --rm \
--device=/dev/dri:/dev/dri \
-v $(pwd):/data \
jrottenberg/ffmpeg:vaapi \
-hwaccel qsv \
-i /data/input.mp4 \
-c:v h264_qsv \
/data/output.mp4
Troubleshooting
Common Issues
"No NVENC capable devices found"
# Check GPU support
nvidia-smi
# Verify driver version
nvidia-smi --query-gpu=driver_version --format=csv
# Check CUDA version
nvcc --version
"Cannot load libcuda.so"
# Set library path (Linux)
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
QSV "Unsupported format"
# Force pixel format conversion
ffmpeg -i input.mp4 \
-vf "format=nv12" \
-c:v h264_qsv \
output.mp4
VAAPI permission denied
# Add user to video/render group
sudo usermod -aG video $USER
sudo usermod -aG render $USER
# Re-login or use newgrp
Performance Debugging
# Benchmark encode speed
ffmpeg -benchmark -i input.mp4 -c:v h264_nvenc -f null -
# Show hardware decode stats
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -benchmark -i input.mp4 -f null -
# Monitor GPU usage (NVIDIA)
nvidia-smi dmon -s u
# Monitor GPU usage (Intel)
intel_gpu_top
Best Practices
- Use full GPU pipelines when possible to avoid CPU-GPU memory transfers
- Match decode and encode hardware for best performance
- Use appropriate presets - faster isn't always better for quality
- Enable lookahead and AQ for quality-critical encodes
- Test on target hardware - quality varies by GPU generation
- Monitor GPU memory for high-resolution content
- Consider power efficiency for laptops and servers
- Update drivers regularly for performance and feature improvements
Recommended Settings by Use Case
Live Streaming
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input \
-c:v h264_nvenc -preset p3 -tune ll -zerolatency 1 -b:v 6M \
-f flv rtmp://server/live/stream
VOD Encoding (Quality)
ffmpeg -i input.mp4 \
-c:v hevc_nvenc -preset p6 -tune hq \
-rc vbr -cq 22 -b:v 0 \
-rc-lookahead 32 -spatial-aq 1 \
output.mp4
Batch Processing
# Multiple parallel streams on one GPU
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input1.mp4 -c:v h264_nvenc output1.mp4 &
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input2.mp4 -c:v h264_nvenc output2.mp4 &
wait
This guide covers hardware acceleration fundamentals. For specific platform optimizations and advanced configurations, consult the respective vendor documentation.