Database Connection Pooling#

Database Connection Pooling in DBOS Python is a critical infrastructure component that manages database connections through SQLAlchemy's engine API, providing efficient connection reuse, resource optimization, and scalable database access. Connection pooling maintains a pool of persistent database connections that can be reused across multiple operations, eliminating the overhead of creating new connections for each database interaction.

DBOS employs a dual database architecture that separates concerns between framework operations and application data. The system database stores DBOS framework metadata, workflow state, and execution history, while the application database contains user-defined application data. Each database maintains its own connection pool with independent configuration, allowing fine-grained control over resource allocation and isolation between framework and application operations.

The connection pooling system provides several key capabilities: configurable pool parameters (size, overflow, timeouts), integration with external connection poolers like PgBouncer and Supavisor, support for custom SQLAlchemy engine injection, and automatic connection health monitoring. These features enable DBOS applications to scale efficiently while maintaining predictable resource usage and robust error handling.

Architecture#

Dual Database System#

DBOS maintains two distinct databases, each serving a specific purpose within the framework:

System Database: The system database stores DBOS framework metadata and workflow state, including workflow execution history, transaction logs, and recovery information. This separation ensures that framework operations remain isolated from application data, improving reliability and enabling features like exactly-once execution guarantees.

Application Database: The application database stores user-defined application data and is accessed through standard database operations within DBOS workflows and transactions.

Both databases utilize separate connection pools to isolate resource management. This architectural decision prevents framework operations from competing with application queries for database connections, ensuring consistent performance and predictable resource allocation. The system database connection pool is managed separately and can be sized independently based on workflow concurrency requirements.

Engine creation for both databases follows a consistent pattern, with system database engines created in _sys_db_postgres.py and application database engines in _app_db.py.

SQLAlchemy Integration#

DBOS leverages SQLAlchemy's robust connection pooling infrastructure through its create_engine() API. This integration provides access to SQLAlchemy's mature pooling implementations, including QueuePool (default), NullPool (for external poolers), and other specialized pool types.

Configuration is achieved through the db_engine_kwargs parameter, which accepts a dictionary of keyword arguments that are passed directly to create_engine(). This design allows users to access the full range of SQLAlchemy's engine configuration options while maintaining sensible defaults for common use cases.

For advanced scenarios, DBOS supports custom system_database_engine injection, which bypasses default engine creation. This capability is particularly useful when integrating with external connection poolers or when custom engine configuration is required that cannot be expressed through standard configuration parameters.

Configuration#

Default Pool Parameters#

DBOS provides a carefully tuned set of default engine configuration parameters optimized for typical production workloads:

{
    "pool_size": 20,
    "max_overflow": 0,
    "pool_timeout": 30,
    "pool_pre_ping": True,
    "connect_args": {
        "application_name": "dbos_transact",
        "connect_timeout": 10
    }
}

These defaults balance connection availability, resource usage, and connection health monitoring.

Pool Parameters#

pool_size: The number of persistent connections maintained in the pool. Default is 20 connections, with a minimum recommended value of 5 to ensure adequate connections for framework operations. This parameter determines the steady-state number of database connections that remain open and available for reuse.

max_overflow: Additional connections that can be created beyond pool_size when all pooled connections are in use. The default value of 0 means no overflow connections are allowed, providing strict resource limits and predictable connection counts. Setting this to a non-zero value allows temporary bursts of additional connections at the cost of less predictable resource usage.

pool_timeout: The maximum time in seconds to wait for a connection to become available before raising an error. The default of 30 seconds provides a reasonable balance between availability and responsiveness, preventing indefinite waits while allowing time for connections to be released during high load.

pool_pre_ping: When enabled (default: True), SQLAlchemy tests each connection before use by executing a simple query. This prevents "lost connection" errors from stale connections that may have been closed by the database server due to timeouts or network issues. The small performance overhead is generally worthwhile for production reliability.

connect_timeout: The PostgreSQL connection timeout in seconds, defaulting to 10 seconds. This parameter controls how long the client waits when establishing a new connection to the database server. It's distinct from pool_timeout, which controls waiting for an available connection from the pool.

application_name: Set to "dbos_transact" by default, this PostgreSQL connection parameter identifies DBOS applications in the database's connection list. This is useful for monitoring, debugging, and resource management at the database level.

System Database Pool Configuration#

The system database pool is configurable via the sys_db_pool_size parameter in the configuration. The system pool is created by copying the application pool configuration and overriding the pool_size, ensuring consistent behavior across both pools while allowing independent sizing. The default is 20 connections.

System database pool size should be adjusted based on workflow concurrency patterns. Each active workflow may require system database connections for state management and recovery operations, so applications with high workflow concurrency may benefit from larger system database pools.

Application Database Pool Configuration#

The application database pool is configured through db_engine_kwargs in the configuration file or DBOSConfig object. User-provided kwargs are merged with defaults using .update(), allowing selective overrides of default parameters. The complete configuration dictionary is then passed directly to SQLAlchemy's create_engine() via **engine_kwargs unpacking.

Configuration Examples#

YAML configuration:

name: "my-application"
system_database_url: "postgresql://user:pass@localhost/myapp_dbos_sys"

database:
  sys_db_pool_size: 30
  db_engine_kwargs:
    pool_timeout: 60
    pool_size: 25

Python configuration:

config = DBOSConfig(
    name="my-app",
    system_database_url="postgresql://localhost/myapp_sys",
    sys_db_pool_size=30,
    db_engine_kwargs={
        "pool_timeout": 60,
        "pool_size": 25,
        "pool_pre_ping": True
    }
)

External Connection Poolers#

NullPool Configuration#

For deployments using external connection poolers, DBOS supports disabling application-level pooling through SQLAlchemy's NullPool class. Unlike QueuePool (the default), NullPool creates a new database connection for each request and closes it immediately after use, effectively delegating all connection pooling to an external system.

This approach avoids "double-pooling" scenarios where both the application and an external pooler maintain connection pools, which can lead to inefficient resource utilization and connection count multiplication. When using NullPool, the external pooler becomes solely responsible for connection management.

When NullPool is specified in db_engine_kwargs, DBOS automatically removes incompatible pool parameters (pool_timeout, max_overflow, pool_size, and pool_pre_ping) from the engine configuration. This ensures that NullPool functions correctly without requiring manual parameter cleanup. Users can simply specify poolclass: NullPool and DBOS will handle the rest.

Example configuration:

from sqlalchemy.pool import NullPool

config["db_engine_kwargs"] = {"poolclass": NullPool}

The standard pool parameters are not applicable when using NullPool and will be automatically removed by DBOS during configuration.

PgBouncer Integration#

PgBouncer is a lightweight connection pooler for PostgreSQL that sits between applications and database servers. DBOS is compatible with PgBouncer in session mode only, NOT transaction mode.

Mode Requirements: Transaction mode is incompatible due to DBOS's use of LISTEN/NOTIFY, which requires session-level state that transaction mode pooling does not preserve. Using transaction mode can cause notification listener errors and break workflow recovery mechanisms. The recommended setup combines NullPool with PgBouncer in session mode.

Known Issue: There is a known issue with hardcoded 'postgres' database name when using PgBouncer. DBOS attempts to connect to the postgres database to check if the system database exists and create it if necessary. When using PgBouncer with databases that don't expose the postgres database, this can cause connection failures. The workaround is to provide a custom system_database_engine that bypasses this check.

Recommended configuration:

from sqlalchemy.pool import NullPool

config = {
    "system_database_url": "postgresql://user:pass@pgbouncer:6432/myapp_sys",
    "db_engine_kwargs": {"poolclass": NullPool}
}

Supavisor Integration#

Supavisor is Supabase's connection pooler, designed for serverless and edge computing environments. Like PgBouncer, Supavisor has specific mode requirements for DBOS compatibility.

DBOS applications connecting to Supabase must use "Direct connection" or "Session pooler" mode. The transaction pooler mode should be avoided for the same reasons as with PgBouncer: DBOS relies on PostgreSQL features like LISTEN/NOTIFY that require session-level state preservation.

When configuring Supabase connections, select either the direct connection string or the session pooler connection string from the Supabase dashboard. As with PgBouncer, using NullPool is recommended to avoid double-pooling.

Advanced Configuration#

Custom System Database Engine#

For scenarios requiring fine-grained control over engine creation or when working around specific deployment constraints, DBOS allows providing a custom SQLAlchemy engine via the system_database_engine parameter. When supplied, this custom engine bypasses the default engine creation logic entirely.

This capability is particularly useful when using PgBouncer or custom database setups that require non-standard configuration. As mentioned in the PgBouncer section, it serves as a workaround for the hardcoded 'postgres' database name issue, allowing applications to provide a pre-configured engine that connects directly to the system database without attempting to verify or create it first.

Example:

import sqlalchemy as sa

engine = sa.create_engine(
    system_database_url,
    pool_size=30,
    pool_pre_ping=True
)

config = DBOSConfig(
    name="dbos-app",
    system_database_url=system_database_url,
    system_database_engine=engine
)

Environment Variables#

For cloud deployments and containerized environments, DBOS supports configuration via environment variables:

DBOS_SYSTEM_DATABASE_URL: System database connection string
DBOS_DATABASE_URL: Application database connection string

These environment variables take precedence over configuration file values, enabling environment-specific database configuration without code changes.

Database URL Formats#

DBOS supports multiple database URL formats to accommodate various deployment scenarios:

Standard PostgreSQL URL: The standard format follows the pattern postgresql://user:pass@host:port/database, which is compatible with most PostgreSQL deployments.

Unix Domain Sockets: For local database connections or containerized deployments using shared Unix sockets, DBOS supports Unix domain socket connections with URLs like postgresql+asyncpg://user:pass@/db?host=/path/to/socket. This format is particularly useful for high-performance local connections that bypass network stack overhead.

Special Characters in Database Names: Database names containing special characters like hyphens are supported. Early versions had issues with such names, but these have been resolved to support modern database naming conventions.

SQLite Support#

While DBOS is designed primarily for PostgreSQL, SQLite support is provided for development and testing purposes. SQLite engines include special handling to filter out PostgreSQL-specific connection parameters like application_name and connect_timeout that would cause errors with SQLite.

Both the system database SQLite implementation and the application database SQLite implementation automatically filter these parameters, allowing the same configuration to work with both PostgreSQL and SQLite.

Important: SQLite is recommended for development only, not production. SQLite databases are local files and cannot be used in distributed settings where applications run on multiple servers, limiting their use to single-instance development and testing scenarios.

Best Practices#

The following best practices help ensure reliable, performant, and maintainable database connection pooling in production DBOS deployments:

Enable connection health checks: Use pool_pre_ping=True (enabled by default) to detect stale connections before they cause errors. This small overhead prevents "server closed the connection unexpectedly" errors that can occur when connections are terminated by the database server, network timeouts, or firewall rules.

Set appropriate timeouts: The default pool_timeout of 30 seconds balances availability and responsiveness for most workloads. Increase this for applications with highly variable load patterns or longer-running transactions; decrease it for applications requiring fast failure detection.

Maintain predictable resource usage: Use max_overflow=0 (the default) for strict connection limits that make capacity planning straightforward. This prevents unbounded connection growth during traffic spikes and ensures your application's connection usage remains within expected bounds.

Disable application pooling with external poolers: When using PgBouncer or Supavisor, use NullPool to disable application-level pooling. This avoids double-pooling and simplifies the overall architecture by consolidating connection management at the infrastructure layer.

Respect minimum pool size: Do not set pool size below 5 connections. DBOS framework operations require a minimum number of connections for workflow state management, recovery operations, and notification listening. Smaller pools may cause deadlocks or performance degradation.

Monitor connection usage: Check your PostgreSQL server's max_connections limit (typically 100 connections per GB of memory). Ensure the sum of all application pools across all server instances does not exceed this limit. Use SHOW max_connections; in PostgreSQL to verify your database's capacity.

Use session mode with external poolers: When deploying with PgBouncer or Supavisor, always use session mode, never transaction mode. Transaction mode is incompatible with DBOS's use of LISTEN/NOTIFY for workflow coordination and will cause runtime errors.

Scale pool size with workflow concurrency: Adjust sys_db_pool_size based on your application's workflow concurrency patterns. Applications executing many concurrent workflows may require larger system database pools to avoid connection starvation. Monitor pool exhaustion metrics and adjust accordingly.

Isolate production and development configurations: Use environment variables or separate configuration files for different environments. Production deployments typically benefit from larger pools and stricter timeouts, while development environments can use smaller pools and more lenient settings.

Relevant Code Files#

File	Purpose	Key Components
dbos/_dbos_config.py	Configuration management	sys_db_pool_size, db_engine_kwargs, default pool parameters
dbos/_sys_db_postgres.py	System database connection	PostgreSQL engine creation
dbos/_sys_db.py	System database base class	Engine initialization logic
dbos/_app_db.py	Application database connection	PostgreSQL engine, SQLite engine
dbos/_sys_db_sqlite.py	SQLite system database	Parameter filtering for SQLite