Type: External · Status: Published · Created: Mar 4, 2026 · Updated: Mar 5, 2026 by Dosu Bot

Architecture#

DBSurveyor follows a security-first, modular architecture designed for flexibility, maintainability, and offline operation. This document details the system architecture and design decisions.

System Overview#

Crate Architecture#

Workspace Structure#

DBSurveyor uses a Cargo workspace with three main crates:

dbsurveyor/
├── dbsurveyor-core/ # Shared library
├── dbsurveyor-collect/ # Collection binary
├── dbsurveyor/ # Documentation binary
└── Cargo.toml # Workspace configuration

Dependency Graph#

Core Library (dbsurveyor-core)#

Module Structure#

// dbsurveyor-core/src/lib.rs
pub mod adapters; // Database adapter traits and factory
pub mod error; // Comprehensive error handling
pub mod models; // Unified data models
pub mod security; // Encryption and credential protection

// Re-exports for public API
pub use adapters::{create_adapter, DatabaseAdapter};
pub use error::{DbSurveyorError, Result};
pub use models::{DatabaseSchema, DatabaseType};

Data Models#

The core defines unified data structures that work across all database types:

// Unified schema representation
pub struct DatabaseSchema {
    pub format_version: String,
    pub database_info: DatabaseInfo,
    pub tables: Vec<Table>,
    pub views: Vec<View>,
    pub indexes: Vec<Index>,
    pub constraints: Vec<Constraint>,
    pub procedures: Vec<Procedure>,
    pub functions: Vec<Procedure>,
    pub triggers: Vec<Trigger>,
    pub custom_types: Vec<CustomType>,
    pub samples: Option<Vec<TableSample>>,
    pub collection_metadata: CollectionMetadata,
}

// Cross-database type mapping
pub enum UnifiedDataType {
    String { max_length: Option<u32> },
    Integer { bits: u8, signed: bool },
    Float { precision: Option<u8> },
    Boolean,
    DateTime { with_timezone: bool },
    Json,
    Array { element_type: Box<UnifiedDataType> },
    Custom { type_name: String },
}
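
As an illustration of how a native column type could be normalized into UnifiedDataType, here is a minimal sketch. The function map_postgres_type, the set of type names it covers, and the choice to express Float precision in binary digits are assumptions made for this example, not DBSurveyor's actual mapping.

```rust
/// Cross-database type representation (same shape as the enum above).
#[derive(Debug, Clone, PartialEq)]
pub enum UnifiedDataType {
    String { max_length: Option<u32> },
    Integer { bits: u8, signed: bool },
    Float { precision: Option<u8> },
    Boolean,
    DateTime { with_timezone: bool },
    Json,
    Array { element_type: Box<UnifiedDataType> },
    Custom { type_name: String },
}

/// Hypothetical helper: map a PostgreSQL type name to the unified model.
/// Unknown names fall back to `Custom` so no information is lost.
pub fn map_postgres_type(native: &str) -> UnifiedDataType {
    let name = native.trim().to_ascii_lowercase();

    // PostgreSQL spells array types with a trailing `[]`, e.g. `integer[]`.
    if let Some(element) = name.strip_suffix("[]") {
        return UnifiedDataType::Array {
            element_type: Box::new(map_postgres_type(element)),
        };
    }

    match name.as_str() {
        "smallint" => UnifiedDataType::Integer { bits: 16, signed: true },
        "integer" => UnifiedDataType::Integer { bits: 32, signed: true },
        "bigint" => UnifiedDataType::Integer { bits: 64, signed: true },
        // Assumption: precision counts binary digits, as in PostgreSQL's float(p).
        "real" => UnifiedDataType::Float { precision: Some(24) },
        "double precision" => UnifiedDataType::Float { precision: Some(53) },
        "boolean" => UnifiedDataType::Boolean,
        "text" => UnifiedDataType::String { max_length: None },
        "timestamp without time zone" => UnifiedDataType::DateTime { with_timezone: false },
        "timestamp with time zone" => UnifiedDataType::DateTime { with_timezone: true },
        "json" | "jsonb" => UnifiedDataType::Json,
        other => UnifiedDataType::Custom { type_name: other.to_string() },
    }
}
```

Falling back to Custom rather than returning an error means an unrecognized type name still appears in the output.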

Adapter Pattern#

Database adapters implement a common trait for unified access:

#[async_trait]
pub trait DatabaseAdapter: Send + Sync {
    async fn test_connection(&self) -> Result<()>;
    async fn collect_schema(&self) -> Result<DatabaseSchema>;
    fn database_type(&self) -> DatabaseType;
    fn supports_feature(&self, feature: AdapterFeature) -> bool;
    fn connection_config(&self) -> ConnectionConfig;
}

Factory Pattern#

The adapter factory provides database-agnostic instantiation:

pub async fn create_adapter(connection_string: &str) -> Result<Box<dyn DatabaseAdapter>> {
    let database_type = detect_database_type(connection_string)?;

    match database_type {
        DatabaseType::PostgreSQL => {
            #[cfg(feature = "postgresql")]
            {
                let adapter = PostgresAdapter::new(connection_string).await?;
                Ok(Box::new(adapter))
            }
            #[cfg(not(feature = "postgresql"))]
            Err(DbSurveyorError::unsupported_feature("PostgreSQL"))
        }
        // ... other database types
    }
}
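
The factory relies on detect_database_type, which is not shown in this document. One plausible approach is to dispatch on the URL scheme, sketched below. Apart from DatabaseType::PostgreSQL, the variant names, the accepted schemes, and the String error type are assumptions for illustration.

```rust
/// Database kinds matching the feature flags in this document.
/// Only `PostgreSQL` is confirmed by the text; the other names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DatabaseType {
    PostgreSQL,
    MySQL,
    SQLite,
    MongoDB,
    SqlServer,
}

/// Hypothetical scheme-based detection; the real function may differ.
pub fn detect_database_type(connection_string: &str) -> Result<DatabaseType, String> {
    // Take everything before the first ':' so both "postgres://..." and
    // "sqlite:file.db" are handled.
    let scheme = connection_string
        .split_once(':')
        .map(|(scheme, _rest)| scheme.to_ascii_lowercase())
        .ok_or_else(|| "connection string has no scheme".to_string())?;

    match scheme.as_str() {
        "postgres" | "postgresql" => Ok(DatabaseType::PostgreSQL),
        "mysql" => Ok(DatabaseType::MySQL),
        "sqlite" => Ok(DatabaseType::SQLite),
        "mongodb" | "mongodb+srv" => Ok(DatabaseType::MongoDB),
        "mssql" | "sqlserver" => Ok(DatabaseType::SqlServer),
        // Report only the scheme, never the full string, which may hold a password.
        other => Err(format!("unsupported database scheme: {other}")),
    }
}
```

The error path echoes only the scheme, so a rejected connection string cannot leak credentials through the message.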

Security Architecture#

Credential Protection#

Implementation:

use zeroize::{Zeroize, Zeroizing};

#[derive(Zeroize)]
#[zeroize(drop)]
pub struct Credentials {
    pub username: Zeroizing<String>,
    pub password: Zeroizing<Option<String>>,
}

// Connection config never contains credentials
pub struct ConnectionConfig {
    pub host: String,
    pub port: Option<u16>,
    pub database: Option<String>,
    // No username/password fields
}

Encryption Architecture#

Security Properties:

  • Confidentiality: AES-256-GCM encryption
  • Integrity: 128-bit authentication tags
  • Authenticity: Authenticated encryption detects tampering
  • Nonce Uniqueness: A fresh random nonce per encryption prevents nonce reuse, which would otherwise break AES-GCM's confidentiality and integrity guarantees
  • Key Security: Argon2id key derivation with memory-hard parameters

Database Adapter Architecture#

Adapter Hierarchy#

Connection Pooling#

Each adapter manages its own connection pool with security-focused defaults:

// Pool-related fields only; the connection fields shown earlier are omitted here
pub struct ConnectionConfig {
    pub connect_timeout: Duration, // Default: 30s
    pub query_timeout: Duration, // Default: 30s
    pub max_connections: u32, // Default: 10
    pub read_only: bool, // Default: true
}
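
The defaults listed in the comments could be encoded in a Default implementation, so that callers only spell out the values they change. This is a sketch under that assumption; the helper single_connection_config is hypothetical.

```rust
use std::time::Duration;

/// Pool-related settings, mirroring the struct above.
#[derive(Debug, Clone, PartialEq)]
pub struct ConnectionConfig {
    pub connect_timeout: Duration,
    pub query_timeout: Duration,
    pub max_connections: u32,
    pub read_only: bool,
}

impl Default for ConnectionConfig {
    fn default() -> Self {
        Self {
            connect_timeout: Duration::from_secs(30),
            query_timeout: Duration::from_secs(30),
            max_connections: 10,
            // Read-only by default: collection should never modify the target.
            read_only: true,
        }
    }
}

/// Hypothetical helper: override one field and keep the documented defaults.
pub fn single_connection_config() -> ConnectionConfig {
    ConnectionConfig {
        max_connections: 1,
        ..ConnectionConfig::default()
    }
}
```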

Feature Flags#

Database support is controlled by feature flags for minimal binary size:

[features]
default = ["postgresql", "sqlite"]
postgresql = ["sqlx", "sqlx/postgres"]
mysql = ["sqlx", "sqlx/mysql"]
sqlite = ["sqlx", "sqlx/sqlite"]
mongodb = ["dep:mongodb"]
mssql = ["dep:tiberius"]

Error Handling Architecture#

Error Hierarchy#

#[derive(Debug, thiserror::Error)]
pub enum DbSurveyorError {
    #[error("Database connection failed")]
    Connection(#[from] ConnectionError),

    #[error("Schema collection failed: {context}")]
    Collection {
        context: String,
        source: Box<dyn std::error::Error>,
    },

    #[error("Configuration error: {message}")]
    Configuration { message: String },

    #[error("Encryption operation failed")]
    Encryption(#[from] EncryptionError),

    #[error("I/O operation failed: {context}")]
    Io {
        context: String,
        source: std::io::Error,
    },
}
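
Variants that carry a source field expose their underlying cause through std::error::Error::source, which lets callers walk the full chain. The sketch below shows the idea using only the standard library. IoContextError is a simplified, hand-written stand-in for the Io variant, and error_chain is a hypothetical helper.

```rust
use std::error::Error;
use std::fmt;

/// Simplified stand-in for the `Io` variant above, written without `thiserror`.
#[derive(Debug)]
pub struct IoContextError {
    pub context: String,
    pub source: std::io::Error,
}

impl fmt::Display for IoContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "I/O operation failed: {}", self.context)
    }
}

impl Error for IoContextError {
    // Exposing the cause here is what lets callers walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Collect the message of an error and of every cause beneath it.
pub fn error_chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = Vec::new();
    let mut current = Some(err);
    while let Some(e) = current {
        messages.push(e.to_string());
        current = e.source();
    }
    messages
}
```

A CLI could print each entry of the returned vector on its own line, so users see the high-level context first and the low-level cause last.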

Error Context Chain#

Security Guarantee: All error messages are sanitized to prevent credential leakage.
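
One way such sanitization can work is to mask the password portion of any connection string before it reaches a log line or error message. The function below is a minimal sketch of that idea; redact_connection_string is a hypothetical name and is not taken from DBSurveyor's code.

```rust
/// Hypothetical helper: mask the password in a URL-style connection string.
/// Strings without a `user:password@` section are returned unchanged.
pub fn redact_connection_string(connection_string: &str) -> String {
    // Separate "scheme" from everything after "://".
    let Some((scheme, rest)) = connection_string.split_once("://") else {
        return connection_string.to_string();
    };
    // Split at the LAST '@' so a password containing '@' or '/' is still covered.
    // A stray '@' later in the string causes over-redaction, the safe failure mode.
    let Some((userinfo, host_and_path)) = rest.rsplit_once('@') else {
        return connection_string.to_string();
    };
    match userinfo.split_once(':') {
        // Keep the username, replace the password with a fixed mask.
        Some((user, _password)) => format!("{scheme}://{user}:****@{host_and_path}"),
        // Username only: there is nothing to mask.
        None => connection_string.to_string(),
    }
}
```

Using a fixed-width mask avoids revealing the length of the password.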

CLI Architecture#

Command Structure#

Configuration Hierarchy#

Configuration is loaded from multiple sources with clear precedence:

  1. Command Line Arguments (highest priority)
  2. Environment Variables
  3. Project Configuration (.dbsurveyor.toml)
  4. User Configuration (~/.config/dbsurveyor/config.toml)
  5. Default Values (lowest priority)
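
The precedence order above can be expressed as a chain of Option fallbacks. The Layered type and its field names below are assumptions for illustration only, and loading values from the command line, the environment, or the TOML files is left out.

```rust
/// One optional value per configuration source, highest priority first.
#[derive(Debug, Default, Clone)]
pub struct Layered<T> {
    pub cli: Option<T>,     // 1. command line arguments
    pub env: Option<T>,     // 2. environment variables
    pub project: Option<T>, // 3. .dbsurveyor.toml
    pub user: Option<T>,    // 4. ~/.config/dbsurveyor/config.toml
}

impl<T> Layered<T> {
    /// Return the first value that is set, falling back to `default`.
    pub fn resolve(self, default: T) -> T {
        self.cli
            .or(self.env)
            .or(self.project)
            .or(self.user)
            .unwrap_or(default)
    }
}
```

For example, if max_connections is set to 20 in the environment and to 5 in the project file, resolve returns 20 because the environment outranks the project file.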

Documentation Generation Architecture#

Template Engine#

Output Format Pipeline#

pub trait OutputGenerator {
    fn generate(&self, schema: &DatabaseSchema) -> Result<String>;
    fn file_extension(&self) -> &'static str;
    fn mime_type(&self) -> &'static str;
}

// Implementations for each format
impl OutputGenerator for MarkdownGenerator { ... }
impl OutputGenerator for HtmlGenerator { ... }
impl OutputGenerator for JsonGenerator { ... }
impl OutputGenerator for MermaidGenerator { ... }
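
To make the trait concrete, here is a minimal sketch of what a Markdown implementation might look like. The Table and DatabaseSchema types are trimmed-down stand-ins with assumed fields, Result uses a plain String error, and the render helper is hypothetical.

```rust
/// Trimmed-down stand-ins for the real models; only what the example needs.
pub struct Table {
    pub name: String,
    pub columns: Vec<String>,
}

pub struct DatabaseSchema {
    pub database_name: String,
    pub tables: Vec<Table>,
}

pub trait OutputGenerator {
    fn generate(&self, schema: &DatabaseSchema) -> Result<String, String>;
    fn file_extension(&self) -> &'static str;
    fn mime_type(&self) -> &'static str;
}

pub struct MarkdownGenerator;

impl OutputGenerator for MarkdownGenerator {
    fn generate(&self, schema: &DatabaseSchema) -> Result<String, String> {
        // One top-level heading for the database, one section per table.
        let mut out = format!("# {}\n", schema.database_name);
        for table in &schema.tables {
            out.push_str(&format!("\n## {}\n\n", table.name));
            for column in &table.columns {
                out.push_str(&format!("- {column}\n"));
            }
        }
        Ok(out)
    }

    fn file_extension(&self) -> &'static str {
        "md"
    }

    fn mime_type(&self) -> &'static str {
        "text/markdown"
    }
}

/// Callers can work with any generator through the trait object.
/// Returns the suggested file name together with the rendered body.
pub fn render(
    generator: &dyn OutputGenerator,
    schema: &DatabaseSchema,
) -> Result<(String, String), String> {
    let body = generator.generate(schema)?;
    Ok((format!("schema.{}", generator.file_extension()), body))
}
```

Because render takes a trait object, adding a new output format requires only a new implementation of the trait, with no change to the calling code.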

Performance Architecture#

Memory Management#

Concurrency Model#

// Async/await with Tokio runtime
use sqlx::postgres::PgPoolOptions;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<()> {
    // Connection pooling for concurrent queries
    let pool = PgPoolOptions::new()
        .max_connections(10)
        // Named connect_timeout before sqlx 0.6
        .acquire_timeout(Duration::from_secs(30))
        .connect(&database_url).await?;

    // Run the three collectors concurrently; try_join! stops at the first error
    let (tables, views, indexes) = tokio::try_join!(
        collect_tables(&pool),
        collect_views(&pool),
        collect_indexes(&pool)
    )?;

    // ... build the DatabaseSchema from tables, views, and indexes
    Ok(())
}

Testing Architecture#

Test Organization#

tests/
├── integration/ # End-to-end tests
│ ├── postgres_tests.rs
│ ├── mysql_tests.rs
│ └── sqlite_tests.rs
├── security/ # Security-focused tests
│ ├── credential_tests.rs
│ ├── encryption_tests.rs
│ └── offline_tests.rs
└── fixtures/ # Test data
    ├── sample_schemas/
    └── test_databases/

Test Categories#

Build and Distribution Architecture#

Feature Matrix#

Binary Variants#

DBSurveyor produces multiple binary variants to support different database environments. GoReleaser v2 builds 7 distinct binaries:

  • dbsurveyor - Documentation postprocessor (all features)
  • dbsurveyor-collect - Data collection tool with variants:
    • all - PostgreSQL, MySQL, SQLite, MongoDB, MSSQL
    • postgresql - PostgreSQL only
    • mysql - MySQL only
    • sqlite - SQLite only
    • mongodb - MongoDB only
    • mssql - MSSQL only

Each variant is built with specific feature flags:

# All features (default release artifacts)
cargo zigbuild --release --all-features -p=dbsurveyor-collect

# PostgreSQL-only variant
cargo zigbuild --release --no-default-features \
  --features=postgresql,compression,encryption -p=dbsurveyor-collect

# SQLite-only variant
cargo zigbuild --release --no-default-features \
  --features=sqlite,compression,encryption -p=dbsurveyor-collect

Artifact Naming: Release artifacts follow the pattern:

dbsurveyor_{variant}_{OS}_{arch}.{tar.gz|zip}

Examples:

  • dbsurveyor_all_Linux_x86_64.tar.gz
  • dbsurveyor_postgresql_Darwin_x86_64.tar.gz
  • dbsurveyor_sqlite_Windows_x86_64.zip
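
The naming pattern is simple enough to express as a small function, which can be handy in install scripts or tests. The sketch below assumes that Windows archives use .zip and all other platforms use .tar.gz, as the three examples suggest. The function artifact_name is hypothetical and not part of the release tooling.

```rust
/// Build a release archive name following
/// `dbsurveyor_{variant}_{OS}_{arch}.{tar.gz|zip}`.
/// Assumption: Windows archives are `.zip`, everything else is `.tar.gz`.
pub fn artifact_name(variant: &str, os: &str, arch: &str) -> String {
    let extension = if os.eq_ignore_ascii_case("windows") {
        "zip"
    } else {
        "tar.gz"
    };
    format!("dbsurveyor_{variant}_{os}_{arch}.{extension}")
}
```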

Deployment Architecture#

Airgap Deployment#

CI/CD Integration#

DBSurveyor uses GoReleaser v2 with cargo-zigbuild for cross-compilation:

# GitHub Actions release workflow
  - name: Install Rust toolchain
    uses: dtolnay/rust-toolchain@stable
    with:
      toolchain: 1.93.1

  - name: Install Zig
    uses: mlugg/setup-zig@v2
    with:
      version: 0.13.0

  - name: Install cargo-zigbuild
    run: cargo install --locked cargo-zigbuild --version 0.19.8

  - name: Install Cosign
    uses: sigstore/cosign-installer@v3

  - name: Install Syft
    uses: anchore/sbom-action/download-syft@v0

  - name: Run GoReleaser
    uses: goreleaser/goreleaser-action@v6
    with:
      distribution: goreleaser
      version: ~> v2
      args: release --clean
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      HOMEBREW_TAP_TOKEN: ${{ secrets.HOMEBREW_TAP_TOKEN }}

Cross-Compilation Targets (6 platforms):

  • x86_64-unknown-linux-gnu - Linux x86_64 (glibc)
  • aarch64-unknown-linux-gnu - Linux ARM64 (glibc)
  • x86_64-unknown-linux-musl - Linux x86_64 (musl/Alpine)
  • x86_64-apple-darwin - macOS Intel
  • aarch64-apple-darwin - macOS Apple Silicon
  • x86_64-pc-windows-gnu - Windows x86_64

Security Features:

  • Cosign Keyless Signing: Checksums are signed using GitHub OIDC identity
  • Syft SBOM Generation: Software Bill of Materials for all archives
  • Reproducible Builds: Consistent timestamps via {{ .CommitTimestamp }}

Together, these design choices keep credentials out of configuration structs and error output, make collection read-only by default, and let one codebase ship as feature-specific binaries for six target platforms.