Architecture#
DBSurveyor follows a security-first, modular architecture designed for flexibility, maintainability, and offline operation. This document details the system architecture and design decisions.
System Overview#
Crate Architecture#
Workspace Structure#
DBSurveyor uses a Cargo workspace with three main crates:
dbsurveyor/
├── dbsurveyor-core/ # Shared library
├── dbsurveyor-collect/ # Collection binary
├── dbsurveyor/ # Documentation binary
└── Cargo.toml # Workspace configuration
Dependency Graph#
Core Library (dbsurveyor-core)#
Module Structure#
// dbsurveyor-core/src/lib.rs
pub mod adapters; // Database adapter traits and factory
pub mod error; // Comprehensive error handling
pub mod models; // Unified data models
pub mod security; // Encryption and credential protection
// Re-exports for public API
pub use adapters::{create_adapter, DatabaseAdapter};
pub use error::{DbSurveyorError, Result};
pub use models::{DatabaseSchema, DatabaseType};
Data Models#
The core defines unified data structures that work across all database types:
// Unified schema representation
pub struct DatabaseSchema {
pub format_version: String,
pub database_info: DatabaseInfo,
pub tables: Vec<Table>,
pub views: Vec<View>,
pub indexes: Vec<Index>,
pub constraints: Vec<Constraint>,
pub procedures: Vec<Procedure>,
pub functions: Vec<Procedure>,
pub triggers: Vec<Trigger>,
pub custom_types: Vec<CustomType>,
pub samples: Option<Vec<TableSample>>,
pub collection_metadata: CollectionMetadata,
}
// Cross-database type mapping
pub enum UnifiedDataType {
String { max_length: Option<u32> },
Integer { bits: u8, signed: bool },
Float { precision: Option<u8> },
Boolean,
DateTime { with_timezone: bool },
Json,
Array { element_type: Box<UnifiedDataType> },
Custom { type_name: String },
}
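To make the cross-database mapping concrete, here is a minimal sketch of how one adapter might translate native type names into `UnifiedDataType`. The helper `map_postgres_type` is hypothetical, the list of type names is deliberately short, and reading `precision` as binary mantissa digits is an assumption rather than something this document specifies.

```rust
/// Copy of the unified type enum shown above, with derives added for testing.
#[derive(Debug, Clone, PartialEq)]
pub enum UnifiedDataType {
    String { max_length: Option<u32> },
    Integer { bits: u8, signed: bool },
    Float { precision: Option<u8> },
    Boolean,
    DateTime { with_timezone: bool },
    Json,
    Array { element_type: Box<UnifiedDataType> },
    Custom { type_name: String },
}

/// Hypothetical helper: map a PostgreSQL type name to the unified type.
/// A real adapter would cover more types and read lengths from the catalog.
pub fn map_postgres_type(native: &str) -> UnifiedDataType {
    let name = native.trim().to_ascii_lowercase();

    // PostgreSQL spells array types with a trailing "[]", e.g. "integer[]".
    if let Some(element) = name.strip_suffix("[]") {
        return UnifiedDataType::Array {
            element_type: Box::new(map_postgres_type(element)),
        };
    }

    match name.as_str() {
        "smallint" | "int2" => UnifiedDataType::Integer { bits: 16, signed: true },
        "integer" | "int4" => UnifiedDataType::Integer { bits: 32, signed: true },
        "bigint" | "int8" => UnifiedDataType::Integer { bits: 64, signed: true },
        // Assumption: precision is expressed in binary mantissa digits.
        "real" => UnifiedDataType::Float { precision: Some(24) },
        "double precision" => UnifiedDataType::Float { precision: Some(53) },
        "boolean" | "bool" => UnifiedDataType::Boolean,
        "text" => UnifiedDataType::String { max_length: None },
        "timestamp" => UnifiedDataType::DateTime { with_timezone: false },
        "timestamptz" => UnifiedDataType::DateTime { with_timezone: true },
        "json" | "jsonb" => UnifiedDataType::Json,
        // Anything unrecognized is preserved by name instead of being dropped.
        other => UnifiedDataType::Custom { type_name: other.to_string() },
    }
}
```

Falling back to `Custom` keeps unknown or extension types visible in the output, which seems preferable to failing the whole collection on one unfamiliar column.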
Adapter Pattern#
Database adapters implement a common trait for unified access:
#[async_trait]
pub trait DatabaseAdapter: Send + Sync {
async fn test_connection(&self) -> Result<()>;
async fn collect_schema(&self) -> Result<DatabaseSchema>;
fn database_type(&self) -> DatabaseType;
fn supports_feature(&self, feature: AdapterFeature) -> bool;
fn connection_config(&self) -> ConnectionConfig;
}
Factory Pattern#
The adapter factory provides database-agnostic instantiation:
pub async fn create_adapter(connection_string: &str) -> Result<Box<dyn DatabaseAdapter>> {
let database_type = detect_database_type(connection_string)?;
match database_type {
DatabaseType::PostgreSQL => {
#[cfg(feature = "postgresql")]
{
let adapter = PostgresAdapter::new(connection_string).await?;
Ok(Box::new(adapter))
}
#[cfg(not(feature = "postgresql"))]
Err(DbSurveyorError::unsupported_feature("PostgreSQL"))
}
// ... other database types
}
}
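The factory relies on `detect_database_type`, which is not shown in this document. A rough sketch of how such detection could work from the URL scheme follows. The variant names other than `PostgreSQL`, the accepted schemes, the file-extension fallback, and the `String` error type are all assumptions made for illustration.

```rust
/// Simplified stand-in for the crate's DatabaseType (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DatabaseType {
    PostgreSQL,
    MySQL,
    SQLite,
    MongoDB,
    SqlServer,
}

/// Hypothetical detection based on the scheme before the first ':'.
/// The error text never echoes the input, since it may contain credentials.
pub fn detect_database_type(connection_string: &str) -> Result<DatabaseType, String> {
    let lower = connection_string.trim().to_ascii_lowercase();
    let scheme = lower.split_once(':').map(|(s, _)| s).unwrap_or("");

    match scheme {
        "postgres" | "postgresql" => Ok(DatabaseType::PostgreSQL),
        "mysql" => Ok(DatabaseType::MySQL),
        "sqlite" => Ok(DatabaseType::SQLite),
        "mongodb" | "mongodb+srv" => Ok(DatabaseType::MongoDB),
        "mssql" | "sqlserver" => Ok(DatabaseType::SqlServer),
        // Assumption: bare file paths with these extensions are SQLite files.
        _ if lower.ends_with(".db") || lower.ends_with(".sqlite") => Ok(DatabaseType::SQLite),
        _ => Err("unrecognized connection string scheme".to_string()),
    }
}
```

Returning a fixed error message, instead of including the offending string, is consistent with the credential-protection goals described later in this document.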
Security Architecture#
Credential Protection#
Implementation:
use zeroize::{Zeroize, Zeroizing};
#[derive(Zeroize)]
#[zeroize(drop)]
pub struct Credentials {
pub username: Zeroizing<String>,
pub password: Zeroizing<Option<String>>,
}
// Connection config never contains credentials
pub struct ConnectionConfig {
pub host: String,
pub port: Option<u16>,
pub database: Option<String>,
// No username/password fields
}
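Zeroizing memory on drop covers one leak path; accidental logging is another. The sketch below shows one way to keep credentials out of debug output with a hand-written `Debug` implementation. It swaps `Zeroizing<String>` for plain `String` so that it compiles with the standard library alone, and the `[REDACTED]` placeholder is an assumption, not necessarily what DBSurveyor prints.

```rust
use std::fmt;

/// Stand-in for the Credentials struct above, using plain String
/// instead of Zeroizing<String> to stay std-only.
pub struct Credentials {
    pub username: String,
    pub password: Option<String>,
}

/// Manual Debug impl: deriving Debug would print both fields verbatim.
impl fmt::Debug for Credentials {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Credentials")
            .field("username", &"[REDACTED]")
            .field("password", &"[REDACTED]")
            .finish()
    }
}
```

With this in place, `format!("{:?}", credentials)` or a stray `dbg!` call shows the struct shape without revealing either field.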
Encryption Architecture#
Security Properties:
- Confidentiality: AES-256-GCM encryption
- Integrity: 128-bit authentication tags
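- Authenticity: Authenticated encryption detects tampering; modified ciphertext fails to decrypt
- Nonce Uniqueness: A fresh random nonce per encryption avoids nonce reuse under the same key, so identical plaintexts produce different ciphertexts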
- Key Security: Argon2id with memory-hard parameters
Database Adapter Architecture#
Adapter Hierarchy#
Connection Pooling#
Each adapter manages its own connection pool with security-focused defaults:
// Pooling-related fields of ConnectionConfig (excerpt)
pub struct ConnectionConfig {
pub connect_timeout: Duration, // Default: 30s
pub query_timeout: Duration, // Default: 30s
pub max_connections: u32, // Default: 10
pub read_only: bool, // Default: true
}
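The defaults listed in the comments above map naturally onto a `Default` implementation. This is a sketch of that idea using only the four fields shown; the helper `long_running_config` and its 120-second timeout are hypothetical.

```rust
use std::time::Duration;

/// The pooling-related fields from the excerpt above.
#[derive(Debug, Clone, PartialEq)]
pub struct ConnectionConfig {
    pub connect_timeout: Duration,
    pub query_timeout: Duration,
    pub max_connections: u32,
    pub read_only: bool,
}

impl Default for ConnectionConfig {
    /// Values taken from the "Default:" comments in the struct excerpt.
    fn default() -> Self {
        Self {
            connect_timeout: Duration::from_secs(30),
            query_timeout: Duration::from_secs(30),
            max_connections: 10,
            read_only: true,
        }
    }
}

/// Hypothetical override: raise one limit, inherit the rest.
pub fn long_running_config() -> ConnectionConfig {
    ConnectionConfig {
        query_timeout: Duration::from_secs(120),
        ..ConnectionConfig::default()
    }
}
```

Struct update syntax keeps `read_only: true` unless a caller changes it explicitly, so the safe setting is the one you get by doing nothing.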
Feature Flags#
Database support is controlled by feature flags for minimal binary size:
[features]
default = ["postgresql", "sqlite"]
postgresql = ["sqlx", "sqlx/postgres"]
mysql = ["sqlx", "sqlx/mysql"]
sqlite = ["sqlx", "sqlx/sqlite"]
mongodb = ["dep:mongodb"]
mssql = ["dep:tiberius"]
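On the Rust side, these flags are visible through `cfg`. As an illustration, a binary could report which backends were compiled in. The function name `compiled_backends` is hypothetical; the feature names are the ones from the manifest above.

```rust
/// Hypothetical helper: list the database backends enabled at compile time.
/// `cfg!` evaluates to a constant `true` or `false` for each feature.
pub fn compiled_backends() -> Vec<&'static str> {
    let mut backends = Vec::new();
    if cfg!(feature = "postgresql") {
        backends.push("postgresql");
    }
    if cfg!(feature = "mysql") {
        backends.push("mysql");
    }
    if cfg!(feature = "sqlite") {
        backends.push("sqlite");
    }
    if cfg!(feature = "mongodb") {
        backends.push("mongodb");
    }
    if cfg!(feature = "mssql") {
        backends.push("mssql");
    }
    backends
}
```

Unlike `#[cfg(...)]` attributes, `cfg!` keeps every branch type-checked, which suits a small reporting function; code that references feature-gated dependencies still needs the attribute form, as in the factory shown earlier.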
Error Handling Architecture#
Error Hierarchy#
#[derive(Debug, thiserror::Error)]
pub enum DbSurveyorError {
#[error("Database connection failed")]
Connection(#[from] ConnectionError),
#[error("Schema collection failed: {context}")]
Collection {
context: String,
source: Box<dyn std::error::Error + Send + Sync>,
},
#[error("Configuration error: {message}")]
Configuration { message: String },
#[error("Encryption operation failed")]
Encryption(#[from] EncryptionError),
#[error("I/O operation failed: {context}")]
Io {
context: String,
source: std::io::Error,
},
}
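The `source` fields above let callers walk from a high-level error down to its root cause. The sketch below shows that traversal using only `std::error::Error`. `CollectionError` is a hand-written stand-in for the `Collection` variant, since `thiserror` is not used here, and `error_chain` is a hypothetical helper.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the `Collection` variant: a context string plus a boxed cause.
#[derive(Debug)]
pub struct CollectionError {
    pub context: String,
    pub source: Box<dyn Error + Send + Sync>,
}

impl fmt::Display for CollectionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Schema collection failed: {}", self.context)
    }
}

impl Error for CollectionError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Drop the Send + Sync bounds to match the trait's return type.
        let inner: &(dyn Error + 'static) = &*self.source;
        Some(inner)
    }
}

/// Hypothetical helper: collect each message from the outermost error inward.
pub fn error_chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        messages.push(cause.to_string());
        current = cause.source();
    }
    messages
}
```

A CLI could print the first entry as the headline and the remaining entries as "caused by" lines, after passing each through whatever sanitization the error layer applies.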
Error Context Chain#
Security Guarantee: All error messages are sanitized to prevent credential leakage.
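How that sanitization is implemented is not shown here. One plausible approach, sketched under assumptions, is to redact the userinfo part of a connection URL before it can reach any message. Both function names are hypothetical, and this version only handles URL-style strings: credentials passed as query parameters or in key-value DSNs would need separate handling.

```rust
/// Hypothetical helper: replace everything between "://" and the last '@'
/// with a placeholder. Cutting at the last '@' may over-redact (for example
/// when a query parameter contains '@'), which is the safe direction.
pub fn redact_connection_string(input: &str) -> String {
    match input.split_once("://") {
        Some((scheme, rest)) => match rest.rsplit_once('@') {
            Some((_userinfo, after)) => format!("{scheme}://[REDACTED]@{after}"),
            None => input.to_string(),
        },
        // No URL scheme: returned unchanged (e.g. a SQLite file path).
        None => input.to_string(),
    }
}

/// Hypothetical helper: scrub a raw connection string out of an error message.
pub fn sanitize_message(message: &str, connection_string: &str) -> String {
    if connection_string.is_empty() {
        return message.to_string();
    }
    message.replace(connection_string, &redact_connection_string(connection_string))
}
```

Redacting the username along with the password is a conservative choice; a username alone can still help an attacker.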
CLI Architecture#
Command Structure#
Configuration Hierarchy#
Configuration is loaded from multiple sources with clear precedence:
- Command Line Arguments (highest priority)
- Environment Variables
- Project Configuration (.dbsurveyor.toml)
- User Configuration (~/.config/dbsurveyor/config.toml)
- Default Values (lowest priority)
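The precedence rule above can be expressed as a fold over partially filled layers, where a value set in a higher-priority layer is never overwritten by a lower one. This is a minimal sketch: `PartialConfig`, its two fields, and `resolve` are hypothetical names, not DBSurveyor's actual configuration types.

```rust
/// Hypothetical configuration layer: every field is optional so that a
/// layer can leave settings unspecified.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PartialConfig {
    pub output_format: Option<String>,
    pub max_connections: Option<u32>,
}

impl PartialConfig {
    /// Keep values already set; fill the gaps from a lower-priority layer.
    pub fn or(self, lower: PartialConfig) -> PartialConfig {
        PartialConfig {
            output_format: self.output_format.or(lower.output_format),
            max_connections: self.max_connections.or(lower.max_connections),
        }
    }
}

/// Merge layers ordered from highest to lowest priority:
/// CLI, environment, project file, user file, defaults.
pub fn resolve(layers: Vec<PartialConfig>) -> PartialConfig {
    layers
        .into_iter()
        .fold(PartialConfig::default(), |resolved, lower| resolved.or(lower))
}
```

Because each layer is merged field by field, a command-line flag can override one setting while the rest still come from the project or user file.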
Documentation Generation Architecture#
Template Engine#
Output Format Pipeline#
pub trait OutputGenerator {
fn generate(&self, schema: &DatabaseSchema) -> Result<String>;
fn file_extension(&self) -> &'static str;
fn mime_type(&self) -> &'static str;
}
// Implementations for each format
impl OutputGenerator for MarkdownGenerator { ... }
impl OutputGenerator for HtmlGenerator { ... }
impl OutputGenerator for JsonGenerator { ... }
impl OutputGenerator for MermaidGenerator { ... }
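Since the implementations above are elided, here is a small illustrative generator to show what the trait asks of an implementor. It uses a drastically simplified `DatabaseSchema` (table and column names only) and `Result<String, String>` in place of the crate's `Result`. The Markdown layout and the `output_file` helper are assumptions, not the real `MarkdownGenerator` output.

```rust
/// Simplified stand-ins for the real models (assumed shape).
pub struct Table {
    pub name: String,
    pub columns: Vec<String>,
}

pub struct DatabaseSchema {
    pub tables: Vec<Table>,
}

pub trait OutputGenerator {
    fn generate(&self, schema: &DatabaseSchema) -> Result<String, String>;
    fn file_extension(&self) -> &'static str;
    fn mime_type(&self) -> &'static str;
}

pub struct MarkdownGenerator;

impl OutputGenerator for MarkdownGenerator {
    /// Emit one heading per table followed by a bullet per column.
    fn generate(&self, schema: &DatabaseSchema) -> Result<String, String> {
        let mut out = String::from("# Database Schema\n");
        for table in &schema.tables {
            out.push_str(&format!("\n## {}\n\n", table.name));
            for column in &table.columns {
                out.push_str(&format!("- {}\n", column));
            }
        }
        Ok(out)
    }

    fn file_extension(&self) -> &'static str {
        "md"
    }

    fn mime_type(&self) -> &'static str {
        "text/markdown"
    }
}

/// Hypothetical helper: the caller only needs the trait object to name a file.
pub fn output_file(generator: &dyn OutputGenerator, stem: &str) -> String {
    format!("{stem}.{}", generator.file_extension())
}
```

Because callers work through `&dyn OutputGenerator`, adding a new format means adding one implementor without touching the pipeline that writes files.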
Performance Architecture#
Memory Management#
Concurrency Model#
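use sqlx::postgres::PgPoolOptions;
use std::time::Duration;
// Async/await with Tokio runtime
#[tokio::main]
async fn main() -> Result<()> {
// Assumed source of the URL for this example
let database_url = std::env::var("DATABASE_URL").expect("DATABASE_URL must be set");
// Connection pooling for concurrent queries
let pool = PgPoolOptions::new()
.max_connections(10)
// acquire_timeout in sqlx 0.6+; earlier releases call it connect_timeout
.acquire_timeout(Duration::from_secs(30))
.connect(&database_url).await?;
// Run the three collectors concurrently; try_join! stops at the first error.
// Awaiting each call in sequence would run them one after another instead.
let (tables, views, indexes) = tokio::try_join!(
collect_tables(&pool),
collect_views(&pool),
collect_indexes(&pool)
)?;
// ... assemble the DatabaseSchema from tables, views, and indexes
Ok(())
}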
Testing Architecture#
Test Organization#
tests/
├── integration/ # End-to-end tests
│ ├── postgres_tests.rs
│ ├── mysql_tests.rs
│ └── sqlite_tests.rs
├── security/ # Security-focused tests
│ ├── credential_tests.rs
│ ├── encryption_tests.rs
│ └── offline_tests.rs
└── fixtures/ # Test data
├── sample_schemas/
└── test_databases/
Test Categories#
Build and Distribution Architecture#
Feature Matrix#
Binary Variants#
DBSurveyor produces multiple binary variants to support different database environments. GoReleaser v2 builds 7 distinct binaries:
- dbsurveyor: Documentation postprocessor (all features)
- dbsurveyor-collect: Data collection tool with variants:
  - all: PostgreSQL, MySQL, SQLite, MongoDB, MSSQL
  - postgresql: PostgreSQL only
  - mysql: MySQL only
  - sqlite: SQLite only
  - mongodb: MongoDB only
  - mssql: MSSQL only
Each variant is built with specific feature flags:
# All features (default release artifacts)
cargo zigbuild --release --all-features -p=dbsurveyor-collect
# PostgreSQL-only variant
cargo zigbuild --release --no-default-features \
--features=postgresql,compression,encryption -p=dbsurveyor-collect
# SQLite-only variant
cargo zigbuild --release --no-default-features \
--features=sqlite,compression,encryption -p=dbsurveyor-collect
Artifact Naming: Release artifacts follow the pattern:
dbsurveyor_{variant}_{OS}_{arch}.{tar.gz|zip}
Examples:
- dbsurveyor_all_Linux_x86_64.tar.gz
- dbsurveyor_postgresql_Darwin_x86_64.tar.gz
- dbsurveyor_sqlite_Windows_x86_64.zip
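The naming pattern is simple enough to capture in a helper, which can be handy in install scripts or tests that need to locate a release archive. The function below is a hypothetical sketch. The rule that Windows archives use `.zip` and everything else uses `.tar.gz` is inferred from the three examples above, not from the GoReleaser configuration itself.

```rust
/// Hypothetical helper: build a release artifact name following
/// dbsurveyor_{variant}_{OS}_{arch}.{tar.gz|zip}.
pub fn artifact_name(variant: &str, os: &str, arch: &str) -> String {
    // Assumption inferred from the examples: zip on Windows, tar.gz elsewhere.
    let extension = if os.eq_ignore_ascii_case("windows") {
        "zip"
    } else {
        "tar.gz"
    };
    format!("dbsurveyor_{variant}_{os}_{arch}.{extension}")
}
```

For instance, `artifact_name("postgresql", "Darwin", "x86_64")` reproduces the second example in the list.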
Deployment Architecture#
Airgap Deployment#
CI/CD Integration#
DBSurveyor uses GoReleaser v2 with cargo-zigbuild for cross-compilation:
# GitHub Actions release workflow
- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@stable
with:
toolchain: 1.93.1
- name: Install Zig
uses: mlugg/setup-zig@v2
with:
version: 0.13.0
- name: Install cargo-zigbuild
run: cargo install --locked cargo-zigbuild --version 0.19.8
- name: Install Cosign
uses: sigstore/cosign-installer@v3
- name: Install Syft
uses: anchore/sbom-action/download-syft@v0
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v6
with:
distribution: goreleaser
version: ~> v2
args: release --clean
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
HOMEBREW_TAP_TOKEN: ${{ secrets.HOMEBREW_TAP_TOKEN }}
Cross-Compilation Targets (6 platforms):
- x86_64-unknown-linux-gnu: Linux x86_64 (glibc)
- aarch64-unknown-linux-gnu: Linux ARM64 (glibc)
- x86_64-unknown-linux-musl: Linux x86_64 (musl/Alpine)
- x86_64-apple-darwin: macOS Intel
- aarch64-apple-darwin: macOS Apple Silicon
- x86_64-pc-windows-gnu: Windows x86_64
Security Features:
- Cosign Keyless Signing: Checksums are signed using GitHub OIDC identity
- Syft SBOM Generation: Software Bill of Materials for all archives
- Reproducible Builds: Consistent timestamps via {{ .CommitTimestamp }}
This architecture ensures DBSurveyor maintains its security-first principles while providing flexibility, performance, and maintainability across all supported platforms and use cases.