How can you mount a PersistentVolumeClaim (PVC) in docling-serve on OpenShift to store EasyOCR models, and what steps are required to ensure docling-serve can access these models?

To mount a PVC in docling-serve on OpenShift for EasyOCR models, edit the Deployment manifest directly; there is no Operator-specific CRD for this. Add the PVC as a volume, mount it in the container, and set the DOCLING_SERVE_ARTIFACTS_PATH environment variable to the mount path. The example snippet below also includes the health probes described later:

spec:
  template:
    spec:
      containers:
        - name: api
          env:
            - name: DOCLING_SERVE_ARTIFACTS_PATH
              value: '/modelcache'
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
          startupProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 30
            timeoutSeconds: 2
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            periodSeconds: 5
            timeoutSeconds: 2
            successThreshold: 1
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /health
              port: http
            periodSeconds: 30
            timeoutSeconds: 2
            failureThreshold: 3
      volumes:
        - name: docling-model-cache
          persistentVolumeClaim:
            claimName: docling-model-cache-pvc

Your PVC should be defined like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docling-model-cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi

Ensure the EasyOCR models are present under /modelcache (or whichever mount path you chose) and that the directory layout matches what docling expects. If a required model is missing, docling-serve raises a runtime error. Preloading the models into the PVC with a Kubernetes Job before starting docling-serve is recommended. For more details, see the official docling-serve documentation.
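As a rough illustration of the preload step, the Job below mounts the same PVC and downloads the EasyOCR models into it. It is a sketch under assumptions, not a definitive manifest: the Job name and image placeholder are hypothetical, and it assumes the image ships the docling-tools CLI and that `docling-tools models download` accepts an `--output-dir` option and an `easyocr` model name (check `docling-tools models download --help` in your image).

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: docling-model-cache-load        # hypothetical name
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: loader
          # Placeholder: use the same image as your docling-serve Deployment
          image: "REPLACE_WITH_YOUR_DOCLING_SERVE_IMAGE"
          command:
            - docling-tools
            - models
            - download
            - --output-dir=/modelcache   # assumed option; must match the mount path
            - easyocr                    # assumed model name
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
      volumes:
        - name: docling-model-cache
          persistentVolumeClaim:
            claimName: docling-model-cache-pvc
```

Because the PVC uses ReadWriteOnce, let the Job finish before the Deployment's pod starts, unless both pods are guaranteed to land on the same node.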

Health probe configuration

The deployment manifest includes health probes to ensure docling-serve is fully ready before accepting traffic:

startupProbe (/ready): Allows sufficient time for model loading during pod startup. With failureThreshold: 30 and periodSeconds: 10, the pod has roughly 5 minutes (30 × 10 s, after the 5-second initial delay) to complete initialization, which is critical when loading large EasyOCR models from the PVC. This prevents Kubernetes from killing the pod during the extended startup time that model loading requires.
readinessProbe (/ready): Gates traffic on actual readiness. The /ready endpoint returns 200 only after model loading completes (when load_models_at_boot is enabled). This eliminates timeout errors during rollouts by ensuring the pod doesn't receive requests until models are fully loaded.
livenessProbe (/health): Lightweight liveness check that verifies the API is responsive without checking model or dependency status. Used to detect and restart crashed pods.
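If model loading takes longer than the default window, the startup budget can be widened by raising failureThreshold. The fragment below is a hedged sketch: the DOCLING_SERVE_LOAD_MODELS_AT_BOOT variable name is an assumption derived from the load_models_at_boot setting mentioned above, and the 10-minute budget is only an illustrative value.

```yaml
spec:
  template:
    spec:
      containers:
        - name: api
          env:
            # Assumed variable name for the load_models_at_boot setting
            - name: DOCLING_SERVE_LOAD_MODELS_AT_BOOT
              value: "true"
          startupProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 60   # 60 x 10 s = 600 s, about 10 minutes
            timeoutSeconds: 2
```

The readiness and liveness probes only begin once the startup probe succeeds, so their thresholds can stay tight.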

Metrics endpoint configuration

docling-serve supports serving Prometheus metrics on a separate port from the main API via the DOCLING_SERVE_METRICS_PORT environment variable. When set, this starts a dedicated HTTP server on the specified port that serves the /metrics endpoint. This is useful for production deployments where you want to expose metrics on a different port with separate network policies.

Example deployment configuration:

spec:
  template:
    spec:
      containers:
        - name: api
          env:
            - name: DOCLING_SERVE_METRICS_PORT
              value: "9090"
          ports:
            - name: http
              containerPort: 5000
              protocol: TCP
            - name: metrics
              containerPort: 9090
              protocol: TCP

With this configuration, the main API remains accessible on port 5000 while Prometheus can scrape metrics from port 9090. This allows you to apply different network policies or service configurations for API traffic versus monitoring traffic.
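One way to keep monitoring traffic separate is a dedicated Service that exposes only the metrics port. This is a minimal sketch: the Service name and the `app: docling-serve` selector label are assumptions, so substitute the labels your Deployment's pods actually carry.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: docling-serve-metrics     # hypothetical name
spec:
  selector:
    app: docling-serve            # assumed pod label
  ports:
    - name: metrics
      port: 9090
      targetPort: metrics         # refers to the named containerPort above
      protocol: TCP
```

A ServiceMonitor or NetworkPolicy can then target this Service without touching the API Service on port 5000.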

Troubleshooting tips:

Ensure your PVC is Bound and healthy (oc get pvc).
Inspect pod events and logs for mount errors (oc describe pod ...).
Confirm the mount path in volumeMounts matches DOCLING_SERVE_ARTIFACTS_PATH.
Check permissions on the PVC (use fsGroup or an initContainer to set permissions if needed).
Make sure the PVC and pod are in the same namespace.
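For the permissions tip, a pod-level fsGroup is usually the simplest fix. The fragment below is a sketch with a hypothetical group ID; on OpenShift the restricted SCC normally assigns an fsGroup from the namespace's allowed range, so an explicit value must fall inside that range or the pod will be rejected.

```yaml
spec:
  template:
    spec:
      securityContext:
        # Hypothetical GID; on OpenShift, omit this or pick a value from
        # the namespace's supplemental group range
        fsGroup: 1001
      containers:
        - name: api
          volumeMounts:
            - name: docling-model-cache
              mountPath: /modelcache
```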