You can improve the resolution of images extracted from PDFs in docling by configuring the images_scale parameter in PdfPipelineOptions. By default, images_scale is set to 1.0, but increasing it (e.g., to 2.0) will produce higher-resolution images. Here is an example of how to set this option programmatically:
from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions(
    generate_page_images=True,     # keep full-page images
    generate_picture_images=True,  # keep images of figures/pictures
    images_scale=2.0,              # increase for higher resolution
)
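The options only take effect once they are handed to a converter. The following is a minimal sketch of one way to do that and save the rendered page images, modeled on the pattern in docling's figure-export example. The function names, the output file naming, and the paths are hypothetical, and the 72 DPI baseline for images_scale=1.0 is an assumption taken from that example's comments rather than something stated above.

```python
from pathlib import Path


def effective_dpi(images_scale: float, base_dpi: float = 72.0) -> float:
    """Approximate DPI of rendered images, ASSUMING scale 1.0 maps to 72 DPI."""
    return images_scale * base_dpi


def export_page_images(pdf_path: str, out_dir: str, images_scale: float = 2.0) -> list[Path]:
    """Convert a PDF and write one PNG per page (illustrative helper, not part of docling)."""
    # docling is imported here so effective_dpi() stays usable without it installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    pipeline_options = PdfPipelineOptions(
        generate_page_images=True,
        generate_picture_images=True,
        images_scale=images_scale,
    )
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
    result = converter.convert(pdf_path)

    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    saved: list[Path] = []
    for page_no, page in result.document.pages.items():
        if page.image is None:
            # Page images are only present when generate_page_images=True.
            continue
        target = target_dir / f"page-{page_no}.png"
        page.image.pil_image.save(target, format="PNG")
        saved.append(target)
    return saved
```

Calling export_page_images("report.pdf", "out") would, under these assumptions, write out/page-1.png and so on; "report.pdf" is a placeholder path.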
If you use the CLI, docling sets images_scale=2 by default when exporting images, and there is no direct CLI flag to set it higher. For more control, you need to customize the pipeline in code (example).
Note: There are known issues in some recent versions where setting images_scale above 1.0 can cause bugs with image cropping or bounding box scaling, resulting in incorrectly framed images (details). If you encounter this, a temporary workaround is to extract at images_scale=1.0 and upscale the images manually, although upscaling only interpolates pixels and does not add real detail or change the framing.
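The manual upscaling workaround could look roughly like the sketch below, which uses Pillow (an assumption; any image library with a resampling resize would do). The helper names and file paths are hypothetical, and Lanczos resampling is just one reasonable filter choice.

```python
from pathlib import Path


def upscaled_size(size: tuple[int, int], factor: float) -> tuple[int, int]:
    """Return (width, height) multiplied by factor, never smaller than 1 pixel."""
    width, height = size
    return max(1, round(width * factor)), max(1, round(height * factor))


def upscale_image(src: str, dst: str, factor: float = 2.0) -> Path:
    """Resize an already-extracted image; this interpolates pixels, it adds no detail."""
    # Pillow is imported here so upscaled_size() stays usable without it installed.
    from PIL import Image

    with Image.open(src) as img:
        resized = img.resize(upscaled_size(img.size, factor), resample=Image.LANCZOS)
        resized.save(dst)
    return Path(dst)
```

For example, upscale_image("figure-1.png", "figure-1@2x.png") would double the pixel dimensions of a figure extracted at images_scale=1.0; both file names are placeholders.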
For non-PDF formats (like DOCX or PPTX), image extraction quality cannot be tuned via pipeline options; it is determined by the backend converter (reference).