DBOS Exception System#
The DBOS Exception System is the framework-specific exception hierarchy in DBOS Python, a durable execution framework for building reliable applications. The exception system distinguishes between catchable exceptions (intended for user code) and non-catchable exceptions (reserved for critical framework operations) through two base classes: DBOSException and DBOSBaseException.
This dual-tier architecture enables the framework to manage workflow lifecycle operations (such as cancellation and ID uniqueness enforcement) without interference from user exception handlers. The DBOSErrorCode enum defines the system's error codes (15 are listed in the table below), covering workflow conflicts, recovery failures, queue operations, authorization, and step retry exhaustion.
The exception hierarchy is designed to support DBOS's durable execution model, where workflows can be interrupted and resumed. Understanding the distinction between catchable and non-catchable exceptions is critical for writing correct DBOS applications, as catching framework-critical exceptions can interfere with workflow recovery and state management.
Exception Base Classes#
DBOSException#
DBOSException is the base class for all catchable DBOS exceptions. It inherits from Python's standard Exception class, making it catchable by normal try/except blocks in user code.
Attributes:
- message (str): The error message string
- dbos_error_code (Optional[int]): The error code from the DBOSErrorCode enum
- status_code (Optional[int]): HTTP status code for web-facing operations (e.g., 403 for authorization errors)
User code should catch DBOSException or its subclasses when handling expected error conditions like authorization failures, step retry exhaustion, or configuration validation errors.
DBOSBaseException#
DBOSBaseException is the base class for non-catchable DBOS exceptions. It inherits from Python's BaseException instead of Exception, preventing it from being caught by generic except Exception: handlers.
Attributes:
- message (str): The error message string
- dbos_error_code (Optional[int]): The error code from the DBOSErrorCode enum
- status_code (Optional[int]): HTTP status code where applicable
This design ensures that critical framework operations—such as workflow cancellation and ID conflict detection—cannot be accidentally intercepted by user code, which could compromise the integrity of workflow state management.
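The distinction between the two base classes rests on ordinary Python exception semantics. The following minimal sketch illustrates it with two stand-in classes (CatchableError and NonCatchableError are hypothetical names used only for illustration, not the real DBOS classes), showing which of them a generic except Exception: handler intercepts.

```python
class CatchableError(Exception):
    """Stand-in for an exception derived from DBOSException."""


class NonCatchableError(BaseException):
    """Stand-in for an exception derived from DBOSBaseException."""


def run_with_generic_handler(exc: BaseException) -> str:
    """Raise exc under a generic handler and report whether it was caught."""
    try:
        try:
            raise exc
        except Exception:
            return "caught by except Exception"
    except BaseException:
        # Only reached when the inner handler did not match.
        return "escaped except Exception"


print(run_with_generic_handler(CatchableError("step failed")))
print(run_with_generic_handler(NonCatchableError("workflow cancelled")))
```

The outer except BaseException: exists only to observe the result in this sketch; application code would normally let such an exception keep propagating.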
Error Codes#
The DBOSErrorCode enum defines standardized error codes for programmatic exception handling:
| Code | Name | Description |
|---|---|---|
| 1 | ConflictingIDError | Workflow database record already exists with the same ID |
| 2 | RecoveryError | Workflow recovery operation failed |
| 3 | InitializationError | DBOS framework initialization incomplete |
| 4 | WorkflowFunctionNotFound | Database references unregistered workflow function |
| 5 | NonExistentWorkflowError | Workflow database record does not exist for given ID |
| 6 | MaxRecoveryAttemptsExceeded | Workflow exceeded maximum recovery attempt limit |
| 7 | MaxStepRetriesExceeded | Step exceeded maximum retry attempts |
| 8 | NotAuthorized | Role-based security authorization failure |
| 9 | ConflictingWorkflowError | Different workflows started with same workflow ID |
| 10 | WorkflowCancelled | Workflow has been cancelled |
| 11 | UnexpectedStep | Step has unexpected recorded name (non-deterministic workflow) |
| 12 | QueueDeduplicated | Workflow deduplicated in queue |
| 13 | AwaitedWorkflowCancelled | Child workflow was cancelled |
| 14 | AwaitedWorkflowMaxRecoveryAttemptsExceeded | Child workflow exceeded recovery attempts |
| 25 | ConflictingRegistrationError | Conflicting decorators applied to same function |
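As a rough illustration of programmatic handling, the sketch below maps a numeric code back to its name. ErrorCode and SketchException are hypothetical stand-ins that mirror only a few rows of the table and the documented dbos_error_code attribute; they are not the framework's own classes.

```python
from enum import Enum
from typing import Optional


class ErrorCode(Enum):
    """Stand-in mirroring a few rows of the error code table."""

    MaxStepRetriesExceeded = 7
    NotAuthorized = 8
    QueueDeduplicated = 12


class SketchException(Exception):
    """Stand-in carrying a dbos_error_code attribute."""

    def __init__(self, message: str, dbos_error_code: Optional[int] = None):
        super().__init__(message)
        self.dbos_error_code = dbos_error_code


def classify(exc: SketchException) -> str:
    """Map an exception's numeric code to the enum member name."""
    if exc.dbos_error_code is None:
        return "uncoded"
    try:
        return ErrorCode(exc.dbos_error_code).name
    except ValueError:
        # The code is not one of the members defined in this sketch.
        return f"unknown code {exc.dbos_error_code}"


print(classify(SketchException("access denied", 8)))
```

Branching on the code rather than on message text keeps handlers stable if the wording of an error message changes.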
Catchable Exceptions#
All exceptions in this section inherit from DBOSException and can be caught by user code.
Workflow Management Exceptions#
DBOSConflictingWorkflowError#
DBOSConflictingWorkflowError is raised when different workflow functions are invoked with the same workflow ID. This indicates an idempotency violation where the same unique identifier is reused for different operations.
Constructor:
DBOSConflictingWorkflowError(workflow_id: str, message: Optional[str] = None)
DBOSRecoveryError#
DBOSRecoveryError is raised when a workflow recovery operation fails. This can occur during automatic recovery of interrupted workflows.
Constructor:
DBOSRecoveryError(workflow_id: str, message: Optional[str] = None)
DBOSWorkflowFunctionNotFoundError#
DBOSWorkflowFunctionNotFoundError is raised when the database references a workflow function that is not registered in the current codebase. This typically occurs during recovery when code has been modified or removed.
Constructor:
DBOSWorkflowFunctionNotFoundError(workflow_id: str, message: Optional[str] = None)
DBOSNonExistentWorkflowError#
DBOSNonExistentWorkflowError is raised when attempting to access a workflow that does not have a database record for the given ID.
Constructor:
DBOSNonExistentWorkflowError(destination: str, destination_id: str)
Retry and Recovery Exceptions#
MaxRecoveryAttemptsExceededError#
MaxRecoveryAttemptsExceededError is raised when a workflow exceeds its maximum number of execution or recovery attempts. The default limit is 100 attempts, configurable via the max_recovery_attempts parameter on the @DBOS.workflow() decorator.
Constructor:
MaxRecoveryAttemptsExceededError(wf_id: str, max_retries: int)
The exception message includes a link to the decorator reference documentation for additional details.
DBOSMaxStepRetriesExceeded#
DBOSMaxStepRetriesExceeded is raised when a step with retry configuration exhausts all retry attempts without succeeding. Steps support configurable automatic retries through the @DBOS.step() decorator.
Constructor:
DBOSMaxStepRetriesExceeded(step_name: str, max_retries: int, errors: list[Exception])
Attributes:
- step_name (str): Name of the failing step
- max_retries (int): Maximum number of attempts configured
- errors (list[Exception]): List of all exceptions from each attempt
This exception implements __reduce__ for pickle serialization support, enabling it to be transmitted across process boundaries.
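To see why a custom __reduce__ matters, note that Python pickles an exception from its args tuple by default, which here would hold only the formatted message and not the three constructor arguments. The sketch below uses a stand-in class (StepRetriesExceeded is a hypothetical name, and its message text is invented for illustration) with the same constructor shape; it is not the framework's actual implementation.

```python
import pickle


class StepRetriesExceeded(Exception):
    """Stand-in with the same constructor shape as DBOSMaxStepRetriesExceeded."""

    def __init__(self, step_name: str, max_retries: int, errors: list[Exception]):
        self.step_name = step_name
        self.max_retries = max_retries
        self.errors = errors
        super().__init__(f"Step {step_name} exceeded {max_retries} attempts")

    def __reduce__(self):
        # Rebuild from the original constructor arguments rather than from
        # self.args, which contains only the formatted message.
        return (self.__class__, (self.step_name, self.max_retries, self.errors))


original = StepRetriesExceeded("fetch", 3, [ValueError("boom")])
restored = pickle.loads(pickle.dumps(original))
print(restored.step_name, restored.max_retries, restored.errors)
```

Without the __reduce__ override, unpickling would call the constructor with the message alone and fail with a TypeError for the missing arguments.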
Queue Exceptions#
DBOSQueueDeduplicatedError#
DBOSQueueDeduplicatedError is raised when a workflow is deduplicated in a queue due to an existing workflow with the same deduplication ID.
Constructor:
DBOSQueueDeduplicatedError(workflow_id: str, queue_name: str, deduplication_id: str)
Attributes:
- workflow_id (str): ID of the deduplicated workflow
- queue_name (str): Name of the queue
- deduplication_id (str): The deduplication identifier
Child Workflow Exceptions#
DBOSAwaitedWorkflowCancelledError#
DBOSAwaitedWorkflowCancelledError is raised in a parent workflow when an awaited child workflow has been cancelled.
Constructor:
DBOSAwaitedWorkflowCancelledError(workflow_id: str)
DBOSAwaitedWorkflowMaxRecoveryAttemptsExceeded#
DBOSAwaitedWorkflowMaxRecoveryAttemptsExceeded is raised in a parent workflow when an awaited child workflow has exceeded its maximum recovery attempts.
Constructor:
DBOSAwaitedWorkflowMaxRecoveryAttemptsExceeded(workflow_id: str)
Framework Exceptions#
DBOSInitializationError#
DBOSInitializationError is raised when DBOS framework initialization does not complete successfully.
Constructor:
DBOSInitializationError(message: str)
DBOSNotAuthorizedError#
DBOSNotAuthorizedError is raised by DBOS role-based security when a user is not authorized to access a function. This exception sets status_code to 403 for HTTP responses.
Constructor:
DBOSNotAuthorizedError(msg: str)
DBOSUnexpectedStepError#
DBOSUnexpectedStepError is raised when a step has an unexpected recorded name during workflow execution. This typically indicates a non-deterministic workflow where the order or identity of steps changes between executions.
Constructor:
DBOSUnexpectedStepError(workflow_id: str, step_id: int, expected_name: str, recorded_name: str)
The error message includes guidance to "check that your workflow is deterministic."
DBOSConflictingRegistrationError#
DBOSConflictingRegistrationError is raised when conflicting decorators (e.g., both @DBOS.workflow() and @DBOS.transaction()) are applied to the same function.
Constructor:
DBOSConflictingRegistrationError(name: str)
Non-Catchable Exceptions#
All exceptions in this section inherit from DBOSBaseException and should not be caught by user code.
DBOSWorkflowCancelledError#
DBOSWorkflowCancelledError is raised when a workflow has been cancelled. By inheriting from BaseException, this exception bypasses generic exception handlers, ensuring that cancellation propagates correctly through the workflow execution stack.
Constructor:
DBOSWorkflowCancelledError(msg: str)
DBOSWorkflowConflictIDError#
DBOSWorkflowConflictIDError is raised when a workflow database record already exists with the same workflow ID. This exception enforces workflow ID uniqueness at the database level. Because it inherits from BaseException, generic except Exception: handlers in user code do not intercept it, which helps maintain state consistency.
Constructor:
DBOSWorkflowConflictIDError(workflow_id: str)
Usage Patterns#
Framework Exception Raising#
Decorator Validation#
The framework raises DBOSException when validating decorator usage, such as preventing async functions from being decorated as transactions:
```python
if inspect.iscoroutinefunction(func):
    raise DBOSException(
        f"Function {transaction_name} is a coroutine function, "
        f"but DBOS.transaction does not support coroutine functions"
    )
```
Configuration Validation#
The framework validates configuration parameters and raises exceptions for invalid inputs:
```python
if not croniter.is_valid(schedule, second_at_beginning=True):
    raise DBOSException(f"Invalid cron schedule: '{schedule}'")
```
Internal Workflow State Management#
The framework raises DBOSWorkflowConflictIDError (non-catchable) for internal state conflicts:
```python
if len(rows) > 0 and int(rows[0][0]) != completed_at_epoch_ms:
    raise DBOSWorkflowConflictIDError(result["workflow_uuid"])
```
User Exception Handling#
Handling Step Retry Exhaustion#
User code should catch DBOSMaxStepRetriesExceeded when steps exhaust their retry attempts:
```python
import logging

from dbos import DBOS
from dbos._error import DBOSMaxStepRetriesExceeded  # defined in dbos/_error.py

logger = logging.getLogger(__name__)


@DBOS.step(retries_allowed=True, interval_seconds=0, max_attempts=5)
def failing_step() -> None:
    raise Exception("Transient failure")


@DBOS.workflow()
def process_workflow() -> None:
    try:
        failing_step()
    except DBOSMaxStepRetriesExceeded as e:
        # Access exception attributes
        logger.error(f"Step {e.step_name} failed after {e.max_retries} attempts")
        logger.error(f"Errors: {e.errors}")
        # Handle the failure appropriately
        raise
```
Handling Child Workflow Cancellation#
Catch DBOSAwaitedWorkflowCancelledError when awaiting child workflows that may be cancelled:
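```python
import logging

from dbos import DBOS
from dbos._error import DBOSAwaitedWorkflowCancelledError  # defined in dbos/_error.py

logger = logging.getLogger(__name__)


@DBOS.workflow()
async def long_running_child() -> str:
    # Loops until the workflow is cancelled, so the return below is not reached
    while True:
        await DBOS.sleep_async(0.1)
    return "completed"


@DBOS.workflow()
async def parent_workflow() -> str:
    try:
        result = await long_running_child()
        return result
    except DBOSAwaitedWorkflowCancelledError as e:
        logger.warning(f"Child workflow {e.workflow_id} was cancelled")
        return "cancelled"
```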
HTTP Error Handling#
The admin server catches DBOSException for workflow operations and converts them to HTTP responses:
```python
try:
    forked_id = self.dbos.fork_workflow(
        workflow_id,
        start_step,
        # ... parameters
    )
    response_body = json.dumps({"workflow_id": forked_id}).encode("utf-8")
    self.send_response(200)
except DBOSException as e:
    print(f"Error forking workflow: {e}")
    self.send_response(500)
    response_body = json.dumps({"error": str(e)}).encode("utf-8")
```
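The excerpt above answers every DBOSException with a 500. Since the exception classes also carry an optional status_code attribute, a handler could honor it when present. The following is a minimal sketch under that assumption; SketchException and to_http_response are hypothetical names for illustration and are not part of the admin server.

```python
import json
from typing import Optional, Tuple


class SketchException(Exception):
    """Stand-in exposing an optional status_code attribute."""

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        self.status_code = status_code


def to_http_response(exc: Exception) -> Tuple[int, bytes]:
    """Return (status, JSON body), falling back to 500 when no status_code is set."""
    status = getattr(exc, "status_code", None) or 500
    body = json.dumps({"error": str(exc)}).encode("utf-8")
    return status, body


print(to_http_response(SketchException("not authorized", 403)))
print(to_http_response(ValueError("unexpected failure")))
```

Using getattr with a default lets the same function handle exceptions that have no status_code attribute at all.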
Exception Handling Best Practices#
When to Catch DBOSException:
- Authorization errors (DBOSNotAuthorizedError)
- Step retry exhaustion (DBOSMaxStepRetriesExceeded)
- Awaited workflow cancellation (DBOSAwaitedWorkflowCancelledError)
- Configuration validation errors
- Workflow recovery failures (DBOSRecoveryError)
- Queue deduplication (DBOSQueueDeduplicatedError)
When NOT to Catch Exceptions:
- Do not catch DBOSBaseException or its subclasses in application code
- Let workflow cancellation propagate (DBOSWorkflowCancelledError)
- Let ID conflict detection be handled by the framework (DBOSWorkflowConflictIDError)
- These inherit from BaseException specifically to prevent accidental catching
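When code needs to release a resource even if cancellation passes through it, one common Python pattern is try/finally, which runs cleanup without catching the exception. The sketch below uses a stand-in class (CancelledSketch is a hypothetical name, not a DBOS class), and the released list is only a device for observing the cleanup.

```python
class CancelledSketch(BaseException):
    """Stand-in for a DBOSBaseException subclass such as DBOSWorkflowCancelledError."""


released: list[str] = []


def do_work(cancel: bool) -> str:
    """Do some work and release the resource whether or not cancellation occurs."""
    try:
        if cancel:
            raise CancelledSketch("workflow cancelled")
        return "done"
    finally:
        # Runs on success and on cancellation, without swallowing the exception.
        released.append("lock")


def cancellation_propagates() -> bool:
    """Report whether the cancellation escaped do_work (caught here only to observe it)."""
    try:
        do_work(cancel=True)
    except CancelledSketch:
        return True
    return False
```

Because finally does not match or suppress the exception, the cancellation still reaches the caller after the cleanup has run.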
Workflow Error Behavior#
Workflow Termination#
When an exception is raised from a workflow and not caught, the workflow terminates with the following behavior:
- DBOS records the exception in the database
- Workflow status is set to ERROR
- The workflow is not automatically recovered
- Uncaught exceptions are assumed to be nonrecoverable
This design requires explicit recovery using DBOS.resume_workflow() or DBOS.fork_workflow() after fixing the underlying issue.
Step Retry Configuration#
Steps support configurable automatic retries with exponential backoff:
```python
# The values shown are the defaults; the expected type is noted in each comment.
@DBOS.step(
    retries_allowed=False,  # bool: enable automatic retries
    interval_seconds=1.0,   # float: initial retry delay
    max_attempts=3,         # int: maximum retry attempts
    backoff_rate=2.0,       # float: exponential backoff multiplier
    should_retry=None,      # Callable[[BaseException], bool]: optional retry predicate
)
def potentially_failing_step():
    # Step implementation
    pass
```
Parameters:
- retries_allowed (bool): Enable automatic retries
- interval_seconds (float): Initial retry delay
- max_attempts (int): Maximum retry attempts
- backoff_rate (float): Exponential backoff multiplier
- should_retry (Callable[[BaseException], bool]): Optional predicate called with a raised exception to decide whether the step should be retried. If it returns False, the exception is re-raised immediately without further retries.
Retry Exhaustion Behavior:
If should_retry returns False for a raised exception, the exception is re-raised immediately without exhausting retries and without wrapping in DBOSMaxStepRetriesExceeded.
If a step exhausts all max_attempts (when should_retry returns True or is not provided), it raises DBOSMaxStepRetriesExceeded to the calling workflow. If uncaught, this terminates the workflow.
Selective Retry Example:
```python
def should_retry_network_errors(exc: BaseException) -> bool:
    return isinstance(exc, (TimeoutError, ConnectionError))


# retries_allowed=True is required because retries are disabled by default
@DBOS.step(retries_allowed=True, should_retry=should_retry_network_errors)
def fetch_data():
    # Only TimeoutError and ConnectionError will be retried
    ...
```
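The retry semantics described in this section can be approximated in plain Python. The sketch below is one plausible reading of the documented parameters, not the framework's implementation: run_with_retries and RetriesExceeded are hypothetical names, and the injectable sleep argument exists only so the delays can be observed without waiting.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetriesExceeded(Exception):
    """Stand-in for DBOSMaxStepRetriesExceeded."""

    def __init__(self, step_name: str, max_retries: int, errors: list[Exception]):
        super().__init__(f"{step_name} failed after {max_retries} attempts")
        self.step_name, self.max_retries, self.errors = step_name, max_retries, errors


def run_with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    interval_seconds: float = 1.0,
    backoff_rate: float = 2.0,
    should_retry: Optional[Callable[[BaseException], bool]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    errors: list[Exception] = []
    delay = interval_seconds
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if should_retry is not None and not should_retry(exc):
                raise  # Not retryable: propagate the original exception unwrapped
            errors.append(exc)
            if attempt < max_attempts:
                sleep(delay)
                delay *= backoff_rate  # Exponential backoff between attempts
    raise RetriesExceeded(func.__name__, max_attempts, errors)


delays: list[float] = []


def always_times_out() -> None:
    raise TimeoutError("simulated timeout")


try:
    run_with_retries(always_times_out, sleep=delays.append)
except RetriesExceeded as exc:
    outcome = (exc.max_retries, len(exc.errors), delays)
    print(outcome)
```

With the default values, the sketch waits 1.0 and then 2.0 seconds between its three attempts and collects one error per attempt before raising the wrapper exception.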
Recovery Limits#
Workflows have a max_recovery_attempts parameter (default: 100) that limits automatic recovery:
```python
@DBOS.workflow(max_recovery_attempts=100)
def my_workflow():
    # Workflow implementation
    pass
```
When this limit is exceeded:
- Workflow status becomes MAX_RECOVERY_ATTEMPTS_EXCEEDED
- Further recovery attempts fail immediately
- The limit can be disabled with max_recovery_attempts=None
Transaction Behavior#
Transactions automatically commit on success and rollback on exception. DBOS wraps transaction functions in a SQLAlchemy "begin once" block.
Important: Do not call DBOS.sql_session.commit() or DBOS.sql_session.rollback() manually within transaction functions, or you will encounter:
sqlalchemy.exc.InvalidRequestError: Can't operate on closed transaction inside context manager
Workflow Status Inspection#
The WorkflowStatus object provides error information for debugging:
```python
class WorkflowStatus:
    status: str  # ENQUEUED, PENDING, SUCCESS, ERROR, CANCELLED,
                 # or MAX_RECOVERY_ATTEMPTS_EXCEEDED
    error: Optional[Exception]  # The exception the workflow threw, if any
    # ... other attributes
```
Inspection Methods:
- DBOS.get_workflow_status(workflow_id): Get status and error of a specific workflow
- DBOS.list_workflows(status="ERROR"): List all workflows in ERROR state
- DBOS.list_workflow_steps(workflow_id): Get detailed step information including errors
The StepInfo object provides per-step error details:
```python
class StepInfo(TypedDict):
    function_id: int
    function_name: str
    output: Optional[Any]
    error: Optional[Exception]  # The exception the step threw, if any
```
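As an illustration of how this error information might be used, the sketch below counts failed workflows by exception type. summarize_errors is a hypothetical helper, and the sample data uses SimpleNamespace stand-ins exposing only the status and error fields rather than real WorkflowStatus objects.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable


def summarize_errors(statuses: Iterable[Any]) -> Counter:
    """Count workflows by the type name of their recorded error."""
    counts: Counter = Counter()
    for status in statuses:
        if status.error is not None:
            counts[type(status.error).__name__] += 1
    return counts


# Stand-in objects exposing only the status and error fields.
sample = [
    SimpleNamespace(status="ERROR", error=ValueError("bad input")),
    SimpleNamespace(status="ERROR", error=TimeoutError("slow upstream")),
    SimpleNamespace(status="ERROR", error=ValueError("bad input again")),
    SimpleNamespace(status="SUCCESS", error=None),
]
print(summarize_errors(sample).most_common())
```

In an application, the input would presumably be the result of DBOS.list_workflows(status="ERROR") instead of the sample list.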
Design Principles#
The DBOS exception system is built on several key design principles:
- Catchability Distinction: Separating catchable (DBOSException) from non-catchable (DBOSBaseException) exceptions ensures critical framework operations cannot be intercepted by user code.
- Structured Error Codes: The DBOSErrorCode enum enables programmatic exception handling and consistent error reporting across the framework.
- Serialization Support: Several exceptions implement __reduce__ for pickle serialization, enabling exception transmission across process boundaries during distributed workflow execution.
- HTTP Integration: Exceptions like DBOSNotAuthorizedError include HTTP status codes for seamless integration with web frameworks.
- Rich Context: Exceptions carry detailed context (workflow IDs, step names, error lists) to aid debugging and observability.
- Fail-Safe by Default: Uncaught exceptions terminate workflows rather than silently continuing, preventing undefined behavior from propagating through the system.
Related Topics#
- DBOS Workflows: Durable execution functions that can be interrupted and resumed
- DBOS Steps: Retriable operations within workflows that support automatic retry logic
- DBOS Transactions: Database operations with automatic commit/rollback behavior
- Workflow Recovery: Mechanisms for resuming interrupted or failed workflows
- DBOS Queues: Asynchronous workflow execution with deduplication support
Code Files#
| File Path | Description |
|---|---|
| dbos/_error.py | Complete exception hierarchy definition with all exception classes and error codes |
| tests/test_failures.py | Comprehensive tests for step retry exhaustion, workflow recovery failures, and exception serialization |
| tests/test_async.py | Tests for async workflow cancellation, timeout handling, and workflow handle exceptions |
| tests/test_auth.py | Tests for authorization exception handling and role-based access control |
| dbos/_core.py | Core framework logic that raises exceptions for decorator validation and workflow execution |
| dbos/_admin_server.py | HTTP server implementation demonstrating exception-to-HTTP-response conversion patterns |