To enable GPU acceleration in Docling (v2.12.0 or later), you can configure it either in your Python code or via environment variables:
In Python code:
from docling.datamodel.accelerator_options import AcceleratorOptions, AcceleratorDevice
from docling.datamodel.pipeline_options import ThreadedPdfPipelineOptions

pipeline_options = ThreadedPdfPipelineOptions(
    accelerator_options=AcceleratorOptions(
        device=AcceleratorDevice.CUDA,
    ),
    ocr_batch_size=4,
    layout_batch_size=64,
    table_batch_size=4,
)
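The options object only takes effect once it is handed to a converter. The following is a minimal sketch of that wiring, not a definitive recipe: `DocumentConverter`, `PdfFormatOption`, `InputFormat`, `ThreadedStandardPdfPipeline` and their import paths do not appear in the text above and are assumed from Docling's public API, so check them against your installed version. `convert_to_markdown` is a hypothetical helper added for illustration.

```python
def build_gpu_converter():
    """Return a DocumentConverter whose PDF pipeline is configured for CUDA."""
    # Imports live inside the function so this file still loads on machines
    # where docling is not installed. Module paths below are assumptions.
    from docling.datamodel.accelerator_options import (
        AcceleratorDevice,
        AcceleratorOptions,
    )
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import ThreadedPdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling.pipeline.threaded_standard_pdf_pipeline import (
        ThreadedStandardPdfPipeline,
    )

    # Same settings as in the snippet above.
    pipeline_options = ThreadedPdfPipelineOptions(
        accelerator_options=AcceleratorOptions(device=AcceleratorDevice.CUDA),
        ocr_batch_size=4,
        layout_batch_size=64,
        table_batch_size=4,
    )

    # Attach the options to the PDF format and select the threaded pipeline.
    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_cls=ThreadedStandardPdfPipeline,
                pipeline_options=pipeline_options,
            )
        }
    )


def convert_to_markdown(path: str) -> str:
    """Hypothetical helper: convert one PDF and return its Markdown export."""
    result = build_gpu_converter().convert(path)
    return result.document.export_to_markdown()
```

Calling `convert_to_markdown("report.pdf")` (the file name is a placeholder) would then run the conversion with the batch sizes shown above.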
Via environment variable:
Set DOCLING_DEVICE=cuda (or cuda:0, cuda:1 for specific GPUs).
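If you prefer to set the variable from Python rather than from the shell, a small sketch could look like the following. `set_docling_device` is a hypothetical helper, not part of Docling, and it assumes the variable is set before the pipeline options are constructed (setting it before importing docling is the safest choice).

```python
import os
from typing import Optional


def set_docling_device(kind: str, index: Optional[int] = None) -> str:
    """Set DOCLING_DEVICE to e.g. "cuda" or "cuda:1" and return the value.

    Hypothetical convenience helper; the index form is only described
    for CUDA devices in the text above.
    """
    value = kind if index is None else f"{kind}:{index}"
    os.environ["DOCLING_DEVICE"] = value
    return value


# Equivalent to running `export DOCLING_DEVICE=cuda:0` in the shell
# before starting the process.
set_docling_device("cuda", 0)
```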
Supported devices:
- cuda (NVIDIA GPUs)
- mps (Apple Silicon)
- xpu (Intel GPUs)
- rocm (AMD GPUs)
- auto (automatic detection, default)
Docker usage:
NVIDIA GPUs:
Use CUDA-enabled images such as ghcr.io/docling-project/docling-serve-cu128 (CUDA 12.8) or the cu130 variant (CUDA 13.0). Both images support the linux/amd64 and linux/arm64 platforms, so GPU acceleration also works on ARM64 systems with NVIDIA GPUs (such as NVIDIA Jetson devices or ARM-based cloud instances with GPU support). The host must have the NVIDIA Container Toolkit installed.
AMD GPUs:
Use ROCm-enabled images for AMD GPU support. Two ROCm variants are available:
- ghcr.io/docling-project/docling-serve-rocm (ROCm 6.3)
- ghcr.io/docling-project/docling-serve-rocm72 (ROCm 7.2)
Example Docker Compose configuration for AMD GPUs:
services:
  docling-serve:
    image: ghcr.io/docling-project/docling-serve-rocm72:main
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    environment:
      ## Enable experimental Flash/Mem-Efficient attention kernels on AMD GPU (aotriton)
      TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL: "1"
      ## Optional: for older cards
      # HSA_OVERRIDE_GFX_VERSION: "11.0.0"
      # HSA_ENABLE_SDMA: "0"
See docs/deploy-examples/compose-amd.yaml for a complete working example.
Optional: Flash Attention 2
On NVIDIA GPUs of the Ampere generation or newer, you can enable Flash Attention 2 for higher speed and lower memory usage. Either set cuda_use_flash_attention2=True in AcceleratorOptions or set the environment variable DOCLING_CUDA_USE_FLASH_ATTENTION2=true.
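To check whether a card qualifies, one option is to look at its CUDA compute capability: Ampere GPUs report 8.x and newer generations report higher major versions. The sketch below assumes PyTorch is installed and uses its `torch.cuda.get_device_capability` call; `is_ampere_or_newer` and `local_gpu_capability` are hypothetical helpers, not Docling functions, and the check is only meaningful on NVIDIA hardware.

```python
from typing import Optional, Tuple


def is_ampere_or_newer(capability: Tuple[int, int]) -> bool:
    """Return True for compute capability 8.0 and above (Ampere or newer)."""
    return capability[0] >= 8


def local_gpu_capability() -> Optional[Tuple[int, int]]:
    """Return (major, minor) for CUDA device 0, or None if unavailable."""
    try:
        import torch  # assumed to be installed alongside Docling
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


capability = local_gpu_capability()
if capability is None:
    print("No CUDA device visible; leave Flash Attention 2 disabled.")
elif is_ampere_or_newer(capability):
    print(f"Compute capability {capability}: Flash Attention 2 can be enabled.")
else:
    print(f"Compute capability {capability}: older than Ampere.")
```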
Requirements:
- A compatible GPU (NVIDIA, AMD, Intel, or Apple Silicon)
- Correct drivers installed
- (For Docker with NVIDIA) NVIDIA Container Toolkit
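A quick way to confirm that the hardware and drivers are visible is to ask PyTorch which backend it can reach. This is a rough sketch that assumes PyTorch is the runtime behind the models; `detect_device` is a hypothetical helper, and the "cpu" return value is simply this sketch's label for "no accelerator found". Note that ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` interface, so an AMD card shows up as "cuda" here.

```python
def detect_device() -> str:
    """Return the first accelerator backend PyTorch reports as available."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch missing: nothing to accelerate with

    # NVIDIA GPUs, and AMD GPUs on ROCm builds of PyTorch.
    if torch.cuda.is_available():
        return "cuda"

    # Apple Silicon (Metal Performance Shaders backend).
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    # Intel GPUs; the torch.xpu module only exists in newer PyTorch releases.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"

    return "cpu"


print(detect_device())
```

If this prints "cpu" on a machine that has a GPU, the driver installation or, in Docker, the container's device access is the first thing to check.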