Project Roadmap And Milestone Strategy#

The Project Roadmap and Milestone Strategy for DBSurveyor defines a version-based milestone naming strategy that guides the development of this security-focused, offline-first database analysis toolchain. The strategy employs semantic versioning with contextual descriptions to organize feature development into distinct capability phases, enabling systematic progression from minimum viable products to a production-ready system. This approach supports the project's dual-binary architecture consisting of separate collector (dbsurveyor-collect) and postprocessor (dbsurveyor) components designed for offline workflows and security-conscious operations.

The roadmap follows a progressive milestone structure: v0.1 (Collector MVP) establishes core collection infrastructure, v0.2 (Postprocessor MVP) adds documentation generation capabilities, v0.3 (Pro Features) introduces advanced visualization and analysis tools, and v1.0 (Production Release) delivers cross-platform distribution with comprehensive hardening. The project currently targets v1.0 with SQL database coverage spanning PostgreSQL, MySQL, and SQLite, deferring NoSQL and enterprise database support to post-v1.0 releases.

This milestone strategy reflects a deliberate prioritization of security, offline operation, and core functionality over feature breadth, aligning with the project's non-negotiable guarantees of zero telemetry, no credential exposure in outputs, and complete air-gap compatibility. The phased approach enables incremental delivery while maintaining strict quality standards including zero-warning compilation, 70%+ test coverage, and comprehensive security scanning at every stage.

Milestone Phases#

DBSurveyor's development roadmap is organized into four primary milestone phases, each building distinct capabilities that progressively advance the project toward production readiness. The version-based milestone naming strategy uses semantic versioning combined with descriptive names to communicate both the technical version and the functional focus of each release.

v0.1 - Collector MVP#

The Collector MVP milestone establishes the foundational infrastructure for database connectivity and metadata extraction. This phase focuses on building the core collection capabilities that enable DBSurveyor to connect to multiple database engines and extract comprehensive schema information.

Key Features:

Database connectivity and metadata extraction across multiple engine types
Multi-engine support for PostgreSQL, MySQL, SQLite, and MongoDB
Basic schema collection functionality including tables, columns, indexes, and constraints
Structured output generation in standardized JSON format with schema validation

v0.2 - Postprocessor MVP#

The Postprocessor MVP milestone introduces analysis and documentation generation capabilities built on the metadata collected in v0.1. This phase completes the dual-binary architecture by implementing the postprocessor component that transforms raw metadata into actionable documentation.

Key Features:

Documentation generation from collected metadata in multiple formats
Markdown and HTML report generation with customizable templates
SQL DDL reconstruction capabilities to recreate schema definitions
Privacy controls and redaction features for sensitive data protection

v0.3 - Pro Features#

The Pro Features milestone extends the platform with advanced analysis and visualization capabilities designed for professional use cases requiring enhanced schema understanding and compliance workflows.

Key Features:

Advanced schema diagramming using Mermaid and D2 diagram formats
Data classification and compliance reporting for regulatory requirements
Interactive HTML exports with search and filtering capabilities
Plugin system architecture enabling extensibility and custom integrations

v1.0 - Production Release#

The Production Release milestone represents the first production-ready version with comprehensive hardening, optimization, and distribution infrastructure. This phase prioritizes stability, performance, and ease of deployment across multiple platforms.

Key Features:

Cross-platform packaging and distribution for Linux, macOS, and Windows
Comprehensive documentation including user guides, API references, and examples
Security hardening and audit completion with formal security review
Performance optimization and tuning to meet established benchmarks

v1.0 Success Criteria:

The v1.0 release has clearly defined must-have features that establish the minimum requirements for production deployment:

PostgreSQL adapter with 100% feature completion including all schema objects and metadata
MySQL and SQLite adapters with core schema collection capabilities
Multi-database selection via flag-driven filters with glob pattern support (--all-databases, --include-system-databases, --exclude-databases)
Partial-failure behavior with machine-actionable failure metadata for automation workflows
Automation-friendly exit codes: a partially successful run reports success by default, with an optional strict mode that treats any failure as an error
Postprocessor generating offline Markdown documentation and SQL DDL reconstruction
AES-GCM encryption with Argon2id key derivation for sensitive output protection
70% test coverage minimum with testcontainers integration for comprehensive testing
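
The flag-driven selection described above can be sketched as a filter over the discovered database names. This is only an illustration: `glob_match`, `Selection`, and `select_databases` are hypothetical names, the set of system database names is passed in by the caller, and the real collector may resolve patterns differently.

```rust
/// Minimal glob matcher: `*` matches any run of characters, `?` matches exactly one.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position of the most recent `*` and the text index it was last tried at.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*` characters may remain unmatched.
    p[pi..].iter().all(|c| *c == '*')
}

/// Hypothetical mirror of the selection flags.
struct Selection<'a> {
    include_system: bool,  // --include-system-databases
    exclude: Vec<&'a str>, // glob patterns from --exclude-databases
}

/// Filters the names discovered under --all-databases.
fn select_databases<'a>(
    discovered: &[&'a str],
    system_names: &[&str],
    selection: &Selection,
) -> Vec<&'a str> {
    discovered
        .iter()
        .copied()
        .filter(|name| selection.include_system || !system_names.iter().any(|s| *s == *name))
        .filter(|name| !selection.exclude.iter().any(|pat| glob_match(pat, name)))
        .collect()
}
```

With `exclude` set to `test_*` and system databases left out, a server exposing `postgres`, `app`, `test_orders`, and `template1` would yield only `app`.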

Post-v1.0 Roadmap#

The phased milestone approach extends beyond v1.0 with additional capability expansions planned for future releases:

v1.5 - NoSQL Expansion (7-9 months total):

MongoDB adapter with full NoSQL schema collection support
Document-oriented database analysis capabilities
NoSQL-specific documentation and diagram generation

v2.0 - Enterprise Databases (10-13 months total):

SQL Server adapter with comprehensive T-SQL support
Oracle adapter for enterprise database environments
Mature plugin system architecture for third-party extensions

Deferred Features:
The following features are intentionally deferred to post-v1.0 releases to maintain focus on core SQL database functionality:

Pro-tier visual diagrams and interactive HTML reports
Advanced PII detection with machine learning
Data quality metrics and anomaly detection

Implementation Status#

The project tracks implementation progress through clear categorization of completed features and work in development, providing transparency into the current state of each milestone phase.

Completed Features#

Core Infrastructure:

The foundational architecture and security infrastructure have been fully implemented, establishing the project's dual-binary design pattern and operational guarantees. The initial project setup followed a security-first approach from the outset. The dual-binary architecture comprising collector and postprocessor components was implemented in PR #42 (merged September 1, 2025), enabling the separation of data collection from analysis for offline workflows. Offline-only operation with no telemetry has been architected as a non-negotiable guarantee.

Database Support:

Multi-database adapter support has been progressively implemented with focus on SQL databases for the v1.0 milestone. The PostgreSQL adapter with comprehensive schema collection capabilities was completed first, establishing the adapter interface pattern. SQLite adapter support followed, validating the adapter abstraction across different SQL dialects. In January 2026, PR #51 introduced unified database adapters with feature flag architecture supporting 6 engines (PostgreSQL, MySQL, SQLite, SQL Server, Oracle, and MongoDB), establishing the foundation for selective compilation and modular engine support.

Security Features:

Comprehensive security features protect sensitive data throughout the collection and storage lifecycle. AES-GCM encryption with Argon2id key derivation provides authenticated encryption for output files. Credential sanitization ensures no credentials appear in output files, implementing comprehensive pattern-based redaction. Memory-safe credential handling using the zeroize library prevents credential leakage through memory inspection.
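
One pattern such sanitization has to cover is a password embedded in a connection URL. The sketch below is an assumption about how that single case could be handled, not the project's actual redaction code; `redact_connection_url` is a hypothetical helper, and it assumes reserved characters in the password are percent-encoded as URL syntax requires.

```rust
/// Replaces the password in a `scheme://user:password@host/...` URL with a placeholder.
/// URLs without a password are returned unchanged.
fn redact_connection_url(url: &str) -> String {
    // The authority section starts right after "://".
    let Some(scheme_end) = url.find("://") else {
        return url.to_string();
    };
    let authority_start = scheme_end + 3;
    let rest = &url[authority_start..];
    // The authority ends at the first '/', '?' or '#'.
    let authority_len = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..authority_len];
    // Userinfo is everything before the last '@' in the authority.
    let Some(at) = authority.rfind('@') else {
        return url.to_string();
    };
    let userinfo = &authority[..at];
    // The password follows the first ':' in the userinfo.
    let Some(colon) = userinfo.find(':') else {
        return url.to_string();
    };
    format!(
        "{}{}:****{}",
        &url[..authority_start],
        &userinfo[..colon],
        &rest[at..]
    )
}
```

The user name and host are kept so that logs and outputs stay useful for diagnosis while the secret itself never leaves the process.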

Output and Infrastructure:

The output format and development infrastructure have been standardized to support automation and quality assurance. JSON Schema validation for all outputs using format version 1.0 ensures consistent, machine-readable metadata. Zstandard compression support reduces output file sizes by 60-80%. The CI/CD pipeline with comprehensive security scanning including CodeQL, Syft, and Grype was established early in development.

Additional capabilities have been delivered in 2026:

Core schema discovery and data sampling engine (merged January 29, 2026) provides comprehensive introspection
Data quality metrics with configurable thresholds (merged January 30, 2026) enable completeness, uniqueness, and consistency analysis
GoReleaser v2 migration with multi-variant builds (merged February 2026) introduced per-database variant distributions and removed 15 redundant configuration files

In Development#

Current development work focuses on completing v1.0 milestone requirements:

MySQL, MongoDB, and SQL Server adapters for additional database engine coverage
Advanced HTML report generation (placeholder implementation)
SQL DDL reconstruction for schema recreation (placeholder implementation)
Mermaid ERD diagram generation for visual schema representation (placeholder implementation)
Multi-database collection with partial-failure handling (planned)
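
Because partial-failure handling is still planned, the following is only a sketch of the intended behavior under stated assumptions: `DatabaseOutcome`, `failed_databases`, and `exit_code` are hypothetical names, and the numeric exit codes 0 and 1 are illustrative rather than documented values.

```rust
/// Result of collecting one database; `error` is set when collection failed.
struct DatabaseOutcome {
    name: String,
    error: Option<String>,
}

/// Failure metadata in a form a wrapper script could act on: (database, reason).
fn failed_databases(outcomes: &[DatabaseOutcome]) -> Vec<(&str, &str)> {
    outcomes
        .iter()
        .filter_map(|o| o.error.as_deref().map(|e| (o.name.as_str(), e)))
        .collect()
}

/// Partial success exits 0 by default; strict mode turns any failure into a non-zero code.
/// A run in which every database failed is always an error.
fn exit_code(outcomes: &[DatabaseOutcome], strict: bool) -> i32 {
    let failed = failed_databases(outcomes).len();
    if failed == 0 {
        0
    } else if failed == outcomes.len() || strict {
        1
    } else {
        0
    }
}
```

Keeping the failure list separate from the exit code lets automation decide per database whether to retry, while simple scripts can rely on the exit status alone.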

Feature Prioritization#

DBSurveyor employs a formal feature priority matrix to guide development resource allocation and milestone planning. Features are categorized into High, Medium, and Low priority tiers based on their criticality to core functionality, security requirements, and user value.

High Priority Features#

High priority features (F000-F007, F014-F015, F021-F023) represent the essential capabilities that define DBSurveyor's core value proposition and enable basic operational functionality. These features are scheduled first, and a milestone cannot be completed until they are done:

Core dual-binary architecture: Separation of collection and postprocessing for offline workflows
Database survey and connectivity: Multi-engine connection capabilities with secure credential handling
Portable output generation: Standardized JSON format with schema validation
Offline operation mode: Zero network dependencies after installation
Pluggable database engines: Modular adapter system with feature flag compilation
Throttling and compression capabilities: Performance optimization and storage efficiency

Medium Priority Features#

Medium priority features (F013, F016-F019) extend the platform with valuable analysis and documentation capabilities that enhance but do not define core functionality:

SQL reconstruction: DDL generation from collected metadata
Report and diagram generation modes: Multiple output format support
Pro features foundation: Advanced visualization and analysis infrastructure
Data sampling with privacy controls: Representative data extraction with redaction

Low Priority Features#

Low priority features (F020) provide enhanced user experience and advanced capabilities that can be deferred to later milestones without impacting core functionality:

HTML output generation: Pro-tier feature for interactive reports
Advanced visualizations: Enhanced diagram types and customization options
Enhanced privacy features: Machine learning-based PII detection and classification
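
The ID ranges quoted in the three tiers can be encoded as a small lookup, which makes gaps in the matrix explicit. `Priority`, `parse_feature_id`, and `priority_of` are hypothetical names used only for this sketch.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Priority {
    High,
    Medium,
    Low,
}

/// Parses a label such as "F014" into its numeric ID.
fn parse_feature_id(label: &str) -> Option<u32> {
    label.strip_prefix('F')?.parse().ok()
}

/// Maps a feature ID to the tier given in the priority matrix above.
fn priority_of(feature_id: u32) -> Option<Priority> {
    match feature_id {
        0..=7 | 14..=15 | 21..=23 => Some(Priority::High),
        13 | 16..=19 => Some(Priority::Medium),
        20 => Some(Priority::Low),
        // F008-F012 are not assigned to any tier in this section.
        _ => None,
    }
}
```

Returning `None` for unlisted IDs, rather than defaulting to a tier, keeps the lookup faithful to what the matrix actually states.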

Development Strategy#

The development strategy for DBSurveyor balances rapid iteration with rigorous quality standards through a single-maintainer model, automated tooling, and phased timeline planning.

Timeline and Phases#

The phased milestone approach establishes a clear timeline with defined effort estimates for each major release:

v0.5 (Phase 1): PostgreSQL Foundation - 2-3 months dedicated to establishing comprehensive PostgreSQL adapter support with testcontainers integration
v1.0 (Phase 2): SQL Database Coverage - 5-6 months total (13-19 weeks of development effort) to complete MySQL and SQLite adapters alongside core postprocessor functionality
v1.5 (Phase 3): NoSQL Expansion - 7-9 months total including MongoDB adapter and document-oriented database analysis capabilities
v2.0 (Phase 4): Enterprise Databases - 10-13 months total encompassing SQL Server and Oracle adapters plus mature plugin architecture

This timeline reflects a conservative estimate accounting for comprehensive testing, security review, and documentation requirements at each phase.

Team Structure#

DBSurveyor operates under a single-maintainer model with UncleSp1d3r serving as the primary maintainer. This organizational structure provides several strategic advantages:

Streamlined decision-making: Technical decisions require no multi-approval processes, enabling rapid response to design challenges
Direct push access: Maintainer has direct commit access for rapid iteration on urgent fixes and feature development
Optimized development cycles: Immediate feedback loops without coordination overhead accelerate development velocity

While this model optimizes for speed and decisiveness, it concentrates project knowledge and decision-making authority in a single person, creating potential continuity risks that are mitigated through comprehensive documentation and automated quality gates.

Quality Standards#

The project enforces strict quality requirements that apply to all code before merge, ensuring consistent quality across the codebase:

Rust Quality Gate:

Zero-warning compilation is enforced via cargo clippy -- -D warnings, treating all clippy warnings as compilation errors. This prevents warning debt accumulation and maintains code quality standards.

Additional Quality Requirements:

Formatting validation: cargo fmt --check ensures consistent code style across all Rust source files
Test suite passage: Complete test suite must pass with no failures, maintaining minimum 70% coverage for v1.0 milestone
Security scans: CodeQL static analysis, Syft SBOM generation, and Grype vulnerability scanning run on every pull request
License compliance: FOSSA validation ensures all dependencies meet license requirements

These automated quality gates run in CI/CD pipelines, providing fast feedback on quality violations without manual review overhead.

Code Review Process#

DBSurveyor uses CodeRabbit.ai as the primary code review tool rather than traditional human review for most changes. CodeRabbit provides:

Automated line-by-line code analysis: Every changed line receives automated review feedback
Conversational review feedback: Natural language comments explain issues and suggest improvements
Security analysis: Identifies potential security vulnerabilities and suggests mitigations
Best practice enforcement: Validates adherence to Rust idioms and project conventions

Notably, GitHub Copilot automatic reviews are explicitly disabled in favor of CodeRabbit's more comprehensive analysis capabilities.

Performance Requirements#

DBSurveyor establishes quantitative performance targets to ensure responsive operation and efficient resource utilization across various deployment scenarios.

General Performance Targets#

CLI startup time: < 100ms from invocation to ready state, ensuring minimal user wait time
Collection speed: < 10 seconds for databases with < 1000 tables, enabling rapid schema capture
Output file sizes: < 10MB when possible, facilitating efficient storage and transfer
Postprocessor speed: < 500ms on small/medium databases, providing near-instant documentation generation
Memory usage: < 1GB for typical workloads, supporting deployment on resource-constrained systems

Benchmark Targets for Milestone Planning#

The project defines specific benchmark targets that inform milestone completion criteria and performance regression testing:

Single database (100 tables): < 5 seconds total collection time
Multi-database (10 databases, 50 tables each): < 30 seconds for complete collection across all databases
Large database (1000 tables): < 60 seconds for comprehensive schema capture

These benchmarks account for network latency, query execution time, and metadata serialization overhead under typical production conditions.
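
For regression testing, the three benchmark targets could be expressed as time budgets that a measured run is compared against. This is a minimal sketch using only the standard library; `Scenario`, `within_budget`, and `timed` are hypothetical names, and a real benchmark harness would add warm-up and repeated runs.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy)]
enum Scenario {
    SingleDatabase, // 100 tables
    MultiDatabase,  // 10 databases, 50 tables each
    LargeDatabase,  // 1000 tables
}

impl Scenario {
    /// Upper bound on total collection time for the scenario.
    fn budget(self) -> Duration {
        match self {
            Scenario::SingleDatabase => Duration::from_secs(5),
            Scenario::MultiDatabase => Duration::from_secs(30),
            Scenario::LargeDatabase => Duration::from_secs(60),
        }
    }
}

/// The targets are strict upper bounds, so hitting the budget exactly counts as a miss.
fn within_budget(scenario: Scenario, measured: Duration) -> bool {
    measured < scenario.budget()
}

/// Runs `work` once and returns its result together with the elapsed wall-clock time.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}
```

A regression test would wrap the collection call in `timed` and assert `within_budget` for the matching scenario.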

Security Architecture#

Security is architected as a set of non-negotiable guarantees rather than configurable options, simplifying the security model and eliminating misconfiguration risks.

Critical Security Guarantees#

DBSurveyor provides four fundamental security guarantees that define its security posture:

Offline-Only Operation: Zero network calls after initial installation, preventing data exfiltration and eliminating dependency on external services
No Telemetry: Absolutely no data collection, usage tracking, or analytics of any kind
No Credentials in Outputs: Database credentials never appear in any output files, regardless of format or encryption status
Airgap Compatibility: Full functionality in air-gapped environments without internet access or external dependencies

Additional Security Requirements#

Beyond the core guarantees, additional security mechanisms protect data throughout its lifecycle:

AES-GCM authenticated encryption: Authenticated encryption with random nonces prevents tampering and ensures confidentiality
Argon2id key derivation: Memory-hard key derivation (256-bit keys, 64 MiB memory, 3 iterations) resists brute-force attacks
Secure memory handling: Zeroing on deallocation using the zeroize library prevents credential recovery from memory dumps
Pattern-based sensitive data redaction: Configurable patterns identify and redact PII, API keys, and other sensitive data
Restrictive file permissions: Output files created with 0600 permissions (owner read/write only) prevent unauthorized access
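
The restrictive-permissions requirement can be illustrated with the standard library alone. The sketch assumes a Unix target, where `OpenOptionsExt::mode` sets the permission bits at creation time; `create_private_file` and `write_private` are hypothetical helper names, and on other platforms the file is simply created with the platform defaults.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Creates a new file that only the owner can read or write (0600 on Unix).
fn create_private_file(path: &Path) -> io::Result<File> {
    let mut opts = OpenOptions::new();
    // `create_new` refuses to reuse an existing file that may have looser permissions.
    opts.write(true).create_new(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        opts.mode(0o600);
    }
    opts.open(path)
}

/// Writes `bytes` to a freshly created private file and flushes it to disk.
fn write_private(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut file = create_private_file(path)?;
    file.write_all(bytes)?;
    file.sync_all()
}
```

Setting the mode when the file is created, instead of changing it afterwards, avoids a window in which the output is readable by other users.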

Technical Stack#

The technical stack emphasizes security, performance, and cross-platform compatibility through careful dependency selection and build configuration.

Language and Runtime#

DBSurveyor is implemented in Rust 1.93.1 with a Minimum Supported Rust Version (MSRV) of 1.77. Rust provides memory safety without garbage collection, zero-cost abstractions, and strong type safety that aligns with the project's security-first approach.

Key Dependencies#

Command-Line Interface:

clap v4+: Type-safe argument parsing with derive macros for maintainable CLI definitions

Async Runtime:

tokio: Industry-standard async runtime providing efficient concurrent I/O for database operations

Database Drivers:

sqlx: Pure Rust SQL database driver for PostgreSQL, MySQL, and SQLite with compile-time query checking
tiberius: Async TDS (Tabular Data Stream) implementation for SQL Server connectivity
mongodb: Official MongoDB driver for Rust with full async support

Cryptography:

aes-gcm: AES-GCM authenticated encryption implementation
ring: Cryptographic primitives for key derivation and random number generation

Compression and Serialization:

zstd: Zstandard compression bindings achieving 60-80% size reduction
serde ecosystem: Serialization framework with JSON, YAML, and custom format support

Testing:

cargo-nextest: Next-generation test runner with improved parallelization and output
testcontainers: Docker-based integration testing with real database instances

Build System and Distribution#

The project uses GoReleaser v2 with cargo-zigbuild for cross-platform build and distribution. The GoReleaser v2 migration completed in February 2026 fully replaced cargo-dist with a multi-variant build strategy:

Multi-variant builds: 7 distinct binaries per platform (1 postprocessor + 6 collector variants: all, postgresql, mysql, sqlite, mongodb, mssql)
Cross-compilation: cargo-zigbuild handles cross-compilation for all targets without manual toolchain installation
Target platforms: 6 targets covering Linux (x86_64 gnu/musl, aarch64), macOS (x86_64, aarch64 Apple Silicon), Windows (x86_64)
Security signing: Cosign keyless signing on all release artifacts using GitHub Actions OIDC
SBOM generation: Syft generates Software Bill of Materials for each archive

This architecture enables users to download only the database drivers they need, reducing binary size and dependency footprint. Each variant includes the shared postprocessor binary alongside the variant-specific collector binary with compression and encryption support.
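
The size of the release matrix follows directly from the numbers above: seven binaries built for six targets. The sketch below enumerates it under stated assumptions. The variant names come from the list above, while the target triples and the `dbsurveyor-collect-<variant>` naming scheme are illustrative guesses rather than values taken from the release configuration.

```rust
/// Collector variants named in the build description.
const COLLECTOR_VARIANTS: [&str; 6] = ["all", "postgresql", "mysql", "sqlite", "mongodb", "mssql"];

/// Assumed target triples for the six platforms (not confirmed by the source).
const TARGETS: [&str; 6] = [
    "x86_64-unknown-linux-gnu",
    "x86_64-unknown-linux-musl",
    "aarch64-unknown-linux-gnu",
    "x86_64-apple-darwin",
    "aarch64-apple-darwin",
    "x86_64-pc-windows-gnu",
];

/// Returns every (binary, target) pair in the release matrix.
fn release_matrix() -> Vec<(String, &'static str)> {
    // One shared postprocessor plus one collector per variant.
    let mut binaries = vec!["dbsurveyor".to_string()];
    binaries.extend(
        COLLECTOR_VARIANTS
            .iter()
            .map(|variant| format!("dbsurveyor-collect-{variant}")),
    );
    let mut matrix = Vec::new();
    for target in TARGETS.iter() {
        for binary in &binaries {
            matrix.push((binary.clone(), *target));
        }
    }
    matrix
}
```

Seven binaries across six targets gives 7 × 6 = 42 entries, which is the per-release binary count the project reports.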

Release Engineering#

DBSurveyor's release engineering process emphasizes automation, reproducibility, and security through standardized CI/CD pipelines and output format specifications.

CI/CD Pipeline#

The project implements the EvilBit Labs Pipeline Standard using GitHub Actions with just task automation for consistent local and CI execution:

```bash
just test           # Run all tests (unit + integration)
just lint           # Run cargo clippy with strict warnings
just format         # Code formatting validation
just build-release  # Cross-platform builds
just package        # Distribution packaging
```

The just command runner provides a standardized interface for common development tasks, ensuring identical behavior between developer workstations and CI environments. This consistency reduces "works on my machine" issues and streamlines contributor onboarding.

CI/CD Pipeline Features:

The pipeline incorporates comprehensive automation and security scanning:

Semantic versioning with Release Please: Automated changelog generation and version bumping based on conventional commit messages
Signed releases with Cosign keyless signing: Cryptographic signatures using GitHub Actions OIDC enable supply chain verification without key management
SBOM generation with Syft: Software Bill of Materials for all archives supporting security auditing and compliance
Automated dependency updates via Renovate: Continuous dependency monitoring with automated pull requests for updates
Security scanning suite: CodeQL static analysis, Syft SBOM generation, and Grype vulnerability scanning on every build
Coverage reporting: Codecov integration tracks test coverage trends and enforces minimum coverage requirements
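
Version bumping from conventional commits rests on a simple mapping: `fix` implies a patch release, `feat` a minor release, and a breaking change a major release. The sketch below shows that mapping as defined by the Conventional Commits convention. It is not Release Please's implementation, which applies its own configuration on top, and `Bump`, `bump_for`, and `release_bump` are hypothetical names.

```rust
/// Ordered from least to most significant so that `max` picks the release level.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum Bump {
    NoRelease,
    Patch,
    Minor,
    Major,
}

/// Classifies one commit message by its conventional-commit header and footers.
fn bump_for(commit_message: &str) -> Bump {
    let header = commit_message.lines().next().unwrap_or("");
    // A conventional header looks like `type(scope)!: description`.
    let Some((prefix, _description)) = header.split_once(": ") else {
        return Bump::NoRelease;
    };
    if prefix.ends_with('!') || commit_message.contains("BREAKING CHANGE:") {
        return Bump::Major;
    }
    // Drop the optional `(scope)` part to get the bare type.
    let kind = prefix.split('(').next().unwrap_or(prefix);
    match kind {
        "feat" => Bump::Minor,
        "fix" => Bump::Patch,
        _ => Bump::NoRelease,
    }
}

/// The release level is the most significant bump among all commits since the last tag.
fn release_bump(commits: &[&str]) -> Bump {
    commits
        .iter()
        .map(|c| bump_for(c))
        .max()
        .unwrap_or(Bump::NoRelease)
}
```

A batch containing `docs: typo`, `fix: handle empty schema`, and `feat(post): add ddl` would therefore produce a minor release.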

The GoReleaser v2 implementation builds 42 binaries per release (7 binaries × 6 targets) and distributes them through multiple Linux package formats (deb, rpm, apk) and the EvilBit-Labs Homebrew tap. Because collectors are packaged per variant, users download only the database drivers they need.

Output Format Strategy#

DBSurveyor defines standardized output formats with explicit versioning ("format_version": "1.0") to support format evolution while maintaining backward compatibility:

.dbsurveyor.json: Uncompressed JSON metadata in human-readable format for easy inspection and scripting
.dbsurveyor.json.zst: Zstandard compressed format achieving 60-80% size reduction for efficient storage and transfer
.dbsurveyor.enc: AES-GCM encrypted format with embedded KDF parameters, enabling secure storage of sensitive metadata

The format versioning strategy allows tools to detect and handle format changes gracefully, preventing silent data corruption or misinterpretation.
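
A consumer of these files might first classify an input by its suffix and then check the declared format version before parsing. The sketch below assumes that any `1.x` version is readable by a tool written for `1.0`. That compatibility policy is an assumption made for illustration, and `OutputFormat`, `detect_format`, and `is_supported_version` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum OutputFormat {
    Json,      // .dbsurveyor.json
    JsonZstd,  // .dbsurveyor.json.zst
    Encrypted, // .dbsurveyor.enc
}

/// Classifies an output file by its documented suffix.
fn detect_format(file_name: &str) -> Option<OutputFormat> {
    if file_name.ends_with(".dbsurveyor.json.zst") {
        Some(OutputFormat::JsonZstd)
    } else if file_name.ends_with(".dbsurveyor.json") {
        Some(OutputFormat::Json)
    } else if file_name.ends_with(".dbsurveyor.enc") {
        Some(OutputFormat::Encrypted)
    } else {
        None
    }
}

/// Accepts `format_version` values with major version 1 (assumed policy).
fn is_supported_version(format_version: &str) -> bool {
    match format_version.split_once('.') {
        Some(("1", minor)) => minor.parse::<u32>().is_ok(),
        _ => false,
    }
}
```

Rejecting unknown major versions up front is what turns a potential silent misinterpretation into an explicit error.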

Standards Compliance#

DBSurveyor achieves full compliance with all EvilBit Labs organizational standards, demonstrating alignment with enterprise-grade development practices:

Pipeline Standard: GitHub Actions CI/CD with just task automation and reproducible builds
Security Standard: No telemetry, credential protection, secure defaults, and comprehensive threat modeling
Documentation Standard: User guides, API documentation, architecture decision records, and inline code documentation
Testing Standard: >80% code coverage target (70% minimum for v1.0), integration tests with testcontainers, property-based testing
Release Standard: Semantic versioning, signed releases with Cosign, SBOM generation, automated changelogs
Offline Standard: Complete air-gap operation with zero network dependencies post-installation
Cross-Platform Standard: First-class support for Linux, macOS, and Windows with consistent behavior across platforms

The project documentation reports no deviations from these standards, reflecting comprehensive adherence to organizational requirements.

Future Development Guidance#

The project maintains explicit guidance for future feature development to ensure consistency with organizational standards:

HTTP Client Requirements:

If future features require HTTP client functionality, developers must use OpenAPI Generator for Rust client code generation rather than hand-written HTTP clients. This approach aligns with EvilBit Labs standards for type-safe, well-documented API clients and ensures API contract enforcement at compile time.

This guidance reflects lessons learned across the organization and prevents common pitfalls of hand-written HTTP clients, such as incomplete error handling, undocumented edge cases, and undetected API drift.

Relevant Code Files#

The following files contain the authoritative documentation and implementation of the roadmap and milestone strategy:

| File Path | Description |
| --- | --- |
| `project_specs/requirements.md` | Complete milestone structure (v0.1-v1.0), feature requirements, priority matrix, performance targets, and compliance standards |
| `CHANGELOG.md` | Implementation history tracking completed features, in-development work, and version progression |
| `CONTRIBUTORS.md` | Development approach, team structure, quality standards, code review process, and contributor guidelines |
| `.github/workflows/` | CI/CD pipeline implementation with security scanning, testing, and release automation |
| `Justfile` | Task automation recipes for build, test, format, lint, and release operations |
| `.goreleaser.yml` | Cross-platform release configuration for binary distribution |

Related Topics#

Security-First Database Tooling: Architectural patterns for building database analysis tools with non-negotiable security guarantees (offline-only operation, zero telemetry, credential protection) for use in security-sensitive and air-gapped environments
Progressive Milestone Planning: Version-based roadmap strategies that structure feature development through distinct capability phases (MVP → Documentation → Pro Features → Production) rather than incremental feature iteration
Dual-Binary Architecture Pattern: Design pattern of splitting functionality into separate collector and postprocessor binaries to enable offline workflows, process isolation, and air-gapped operation for database analysis tools
Feature Flag Architecture: Modular compilation strategies using Rust feature flags to enable selective database engine inclusion and reduce binary size for specific deployment scenarios
Multi-Database Collection: Patterns for surveying multiple databases with partial-failure handling, machine-actionable failure metadata, and automation-friendly exit codes
Performance Benchmarking for Database Tools: Establishing quantitative performance targets and regression testing strategies for database metadata collection and analysis tools
