Can docling-serve fetch documents from object storage (e.g., S3)?
Type
Answer
Status
Published
Created
May 2, 2026
Updated
May 2, 2026
Created by
Dosu Bot
Updated by
Dosu Bot

Yes, docling-serve has experimental S3 support, but with the following constraints:

KFP Engine (Experimental)

S3 support is only available when using the KFP (Kubeflow Pipelines) engine; it is not available with the LOCAL, RQ, or RAY engines. You must also set the environment variable DOCLING_SERVE_ENG_KFP_EXPERIMENTAL=true to enable it.

When using the KFP engine:

  • S3 sources are accepted on the /v1/convert/source, /v1/convert/source/async, and /v1/chunk/*/source endpoints.
  • The S3SourceRequest requires: endpoint, access_key, secret_key, and bucket.
  • S3 sources and S3 targets must be used together — you cannot mix S3 sources with non-S3 targets or vice versa.
  • S3-compatible systems like MinIO or IBM COS may work by pointing to a custom endpoint, though this is not explicitly documented.
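To make the constraints above concrete, here is a sketch of a request body for POST /v1/convert/source with an S3 source and an S3 target. The four required S3SourceRequest fields (endpoint, access_key, secret_key, bucket) come from the list above; the surrounding "kind"/"sources"/"target" envelope and the endpoint value are assumptions for illustration, not a verified schema.

```python
import json

# Hypothetical payload sketch; field layout beyond the four required
# S3SourceRequest fields is an assumption, so check it against your
# docling-serve version before use.
payload = {
    "sources": [
        {
            "kind": "s3",
            "endpoint": "s3.us-east-1.amazonaws.com",
            "access_key": "MY_ACCESS_KEY",
            "secret_key": "MY_SECRET_KEY",
            "bucket": "my-input-bucket",
        }
    ],
    # S3 sources must be paired with an S3 target (see the constraint above).
    "target": {
        "kind": "s3",
        "endpoint": "s3.us-east-1.amazonaws.com",
        "access_key": "MY_ACCESS_KEY",
        "secret_key": "MY_SECRET_KEY",
        "bucket": "my-output-bucket",
    },
}

body = json.dumps(payload)
```

The same body shape would be sent to /v1/convert/source/async or the /v1/chunk/*/source endpoints; for MinIO or IBM COS you would point `endpoint` at the compatible service instead.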

Standard (Non-KFP) Deployments

With the LOCAL, RQ, or RAY engines, there is no native object-storage support. The recommended workaround is to fetch the document yourself and pass it to docling as a byte stream, or send it to docling-serve as base64-encoded content:

import boto3
from io import BytesIO
from docling.datamodel.base_models import DocumentStream
from docling.document_converter import DocumentConverter

# Fetch the object yourself, then hand docling an in-memory stream.
s3_client = boto3.client("s3")
file_bytes = s3_client.get_object(Bucket="my-bucket", Key="report.pdf")["Body"].read()
doc_stream = DocumentStream(name="report.pdf", stream=BytesIO(file_bytes))
result = DocumentConverter().convert(doc_stream)
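For the base64 route against docling-serve itself, the sketch below builds a request body for /v1/convert/source. The "file_sources"/"base64_string" field names are assumptions about the docling-serve request schema, so verify them against your deployment's OpenAPI docs.

```python
import base64

# Placeholder bytes; in practice these come from S3 as in the snippet above.
file_bytes = b"%PDF-1.4 example content"
encoded = base64.b64encode(file_bytes).decode("ascii")

# Hypothetical request body; field names are assumed, not verified.
payload = {
    "file_sources": [
        {"filename": "report.pdf", "base64_string": encoded}
    ]
}
```

You would then POST `payload` as JSON to your docling-serve instance's /v1/convert/source endpoint, e.g. with `requests.post(url, json=payload)`.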