Azure Container Apps (ACA) with a custom container is the recommended deployment option for your Docling-based Python SDK application. This is because:
- Resource Requirements: Your workload involves CPU-intensive tasks (OCR, table extraction) and uses PyTorch with pre-downloaded models, which require persistent storage, flexible memory allocation, and no execution timeout constraints.
- Limitations of Azure Functions: Azure Functions have limited local storage, memory, and execution timeouts, making them unsuitable for heavy PDF processing workflows.
- Simplicity and Flexibility: Deploying directly to ACA with your custom Dockerfile avoids the complexity of Azure Functions on ACA and gives you full control over dependencies and environment.
- Scalability: ACA allows you to allocate sufficient resources (recommended: 4 vCPUs, 8GB+ RAM) and supports GPU acceleration if needed in the future.
Deployment Tips:
- Your approach of pre-downloading models into the container and setting
DOCLING_ARTIFACTS_PATHis correct. - For production, consider mounting a persistent volume for models to avoid rebuilding the image when models update.
- Ensure you set a thread budget (e.g.,
OMP_NUM_THREADS=4) and allocate enough memory for stable operation.
Reference: Azure Container Apps documentation
Your current Dockerfile and setup are well-suited for ACA deployment.