output-formats

Type: External
Status: Published
Created: Mar 4, 2026
Updated: Mar 5, 2026
Updated by: Dosu Bot

Output Formats#

DBSurveyor generates structured output in multiple formats to support different use cases and workflows.

Schema File Formats#

Standard JSON Format (.dbsurveyor.json)#

The primary output format is a JSON file following the DBSurveyor Schema v1.0 specification:

{
  "format_version": "1.0",
  "database_info": {
    "name": "production_db",
    "version": "13.7",
    "size_bytes": 1073741824,
    "encoding": "UTF8",
    "collation": "en_US.UTF-8",
    "access_level": "Full",
    "collection_status": "Success"
  },
  "tables": [
    {
      "name": "users",
      "schema": "public",
      "columns": [
        {
          "name": "id",
          "data_type": {
            "Integer": {
              "bits": 32,
              "signed": true
            }
          },
          "is_nullable": false,
          "is_primary_key": true,
          "is_auto_increment": true,
          "ordinal_position": 1
        }
      ],
      "primary_key": {
        "name": "users_pkey",
        "columns": [
          "id"
        ]
      },
      "foreign_keys": [],
      "indexes": [],
      "constraints": []
    }
  ],
  "views": [],
  "indexes": [],
  "constraints": [],
  "procedures": [],
  "functions": [],
  "triggers": [],
  "custom_types": [],
  "collection_metadata": {
    "collected_at": "2024-01-15T10:30:00Z",
    "collection_duration_ms": 1500,
    "collector_version": "1.0.0",
    "warnings": []
  }
}

Table Sampling Data#

Tables may optionally include a sampling field containing sampled row data with metadata:

{
  "table_name": "users",
  "schema_name": "public",
  "rows": [
    {
      "id": 1,
      "username": "alice",
      "email": "alice@example.com"
    },
    {
      "id": 2,
      "username": "bob",
      "email": "bob@example.com"
    }
  ],
  "sample_size": 2,
  "total_rows": 1234,
  "sampling_strategy": {
    "MostRecent": {
      "limit": 10
    }
  },
  "collected_at": "2024-01-15T10:30:00Z",
  "warnings": [],
  "sample_status": "Complete"
}

Sample Status Field

The optional sample_status field indicates the outcome of the sampling operation:

  • "Complete": Sampling completed successfully with no issues
  • {"PartialRetry": {"original_limit": 10}}: Sampling partially completed with a reduced row limit (may be retried)
  • {"Skipped": {"reason": "table too large"}}: Sampling was skipped, with the reason recorded

Backward Compatibility: The sample_status field is optional and omitted when not set, ensuring older JSON files without this field remain valid.
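Because sample_status may be absent, a plain string, or a single-key tagged object, consumers should handle all three shapes. A minimal sketch (the helper name `describe_sample_status` is hypothetical, not part of DBSurveyor):

```python
def describe_sample_status(status) -> str:
    """Summarize a sample_status value, which may be omitted (None),
    a plain string ("Complete"), or a tagged object such as
    {"PartialRetry": {"original_limit": 10}}."""
    if status is None:
        return "not recorded"  # field omitted: pre-status or older file
    if isinstance(status, str):
        return status
    # Tagged-object form: the single key names the variant.
    variant, details = next(iter(status.items()))
    extras = ", ".join(f"{k}={v}" for k, v in details.items())
    return f"{variant} ({extras})"
```

Calling it on each table's sampling metadata gives a uniform summary regardless of which file version produced the data.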

Compressed Format (.dbsurveyor.json.zst)#

Large schema files can be compressed using Zstandard compression:

# Generate compressed output
dbsurveyor-collect --compress postgres://localhost/large_db

# Postprocessor automatically detects and decompresses
dbsurveyor generate schema.dbsurveyor.json.zst

Benefits:

  • 60-80% size reduction for large schemas
  • Fast compression/decompression
  • Maintains full data integrity

Encrypted Format (.enc)#

Sensitive schema data can be encrypted using AES-GCM-256:

# Generate encrypted output (prompts for password)
dbsurveyor-collect --encrypt postgres://localhost/sensitive_db

# Postprocessor prompts for decryption password
dbsurveyor generate schema.enc

Security Features:

  • AES-GCM-256 authenticated encryption
  • Argon2id key derivation (64 MiB memory, 3 iterations)
  • Random 96-bit nonces (never reused)
  • Embedded KDF parameters for decryption

Encrypted File Structure:

{
  "algorithm": "AES-GCM-256",
  "nonce": [/* 12 bytes */],
  "ciphertext": [/* encrypted data */],
  "auth_tag": [/* 16 bytes */],
  "kdf_params": {
    "salt": [/* 16 bytes */],
    "memory_cost": 65536,
    "time_cost": 3,
    "parallelism": 4,
    "version": "1.3"
  }
}
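Before attempting decryption (which requires AES-GCM and Argon2id implementations not shown here), a consumer can sanity-check the envelope's structure. A sketch based only on the field sizes documented above; the helper name is hypothetical:

```python
def check_encrypted_envelope(envelope: dict) -> list[str]:
    """Collect structural problems in an encrypted-file envelope
    before attempting decryption."""
    problems = []
    if envelope.get("algorithm") != "AES-GCM-256":
        problems.append("unsupported algorithm")
    if len(envelope.get("nonce", [])) != 12:
        problems.append("nonce must be 12 bytes (96 bits)")
    if len(envelope.get("auth_tag", [])) != 16:
        problems.append("auth tag must be 16 bytes")
    kdf = envelope.get("kdf_params", {})
    if len(kdf.get("salt", [])) != 16:
        problems.append("salt must be 16 bytes")
    return problems
```

Failing fast on a malformed envelope avoids running the expensive Argon2id derivation on a file that cannot decrypt anyway.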

Documentation Formats#

Markdown Documentation (.md)#

Status: ✅ Implemented

Basic Markdown documentation with schema overview:

dbsurveyor generate schema.json --format markdown

Output Example:

# Database Schema: production_db

Generated by DBSurveyor v1.0.0
Collection Date: 2024-01-15 10:30:00 UTC

## Summary

- **Tables**: 25
- **Views**: 3
- **Indexes**: 47

## Tables

### users
- **Schema**: public
- **Columns**: 5
- **Primary Key**: id

JSON Analysis Report (.json)#

Status: ✅ Implemented

Structured analysis data for programmatic consumption:

dbsurveyor analyze schema.json --detailed

Output Example:

{
  "database_name": "production_db",
  "table_count": 25,
  "view_count": 3,
  "index_count": 47,
  "constraint_count": 32,
  "collection_date": "2024-01-15T10:30:00Z",
  "analysis": {
    "largest_table": "audit_log",
    "most_indexes": "users",
    "foreign_key_relationships": 18
  }
}

HTML Report (.html)#

Status: 🚧 Placeholder Implementation

Interactive HTML reports with search and navigation:

dbsurveyor generate schema.json --format html

Planned Features:

  • Interactive table browser
  • Search functionality
  • Relationship visualization
  • Responsive design for mobile/desktop
  • Offline-compatible (no external dependencies)

Mermaid ERD (.mmd)#

Status: 🚧 Placeholder Implementation

Entity Relationship Diagrams using Mermaid syntax:

dbsurveyor generate schema.json --format mermaid

Planned Output: an entity-relationship diagram in Mermaid erDiagram syntax. The exact output is not yet finalized.
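As an illustration only, a generated diagram might resemble the following sketch (table and column names are hypothetical; the actual output format is not yet implemented):

```
erDiagram
    USERS {
        int id PK
        string username
        string email
    }
    ORDERS {
        int id PK
        int user_id FK
    }
    USERS ||--o{ ORDERS : "has"
```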

SQL DDL (.sql)#

Status: 🚧 Placeholder Implementation

Reconstructed SQL DDL for schema recreation:

dbsurveyor sql schema.json --dialect postgresql --output recreate.sql

Planned Features:

  • Multiple SQL dialect support (PostgreSQL, MySQL, SQLite, SQL Server)
  • Complete schema recreation scripts
  • Data type mapping between databases
  • Constraint and index recreation
  • Migration script generation

Unified Data Type System#

DBSurveyor uses a unified type system to represent data types across different databases:

Basic Types#

{
  "String": {
    "max_length": 255
  },
  "Integer": {
    "bits": 32,
    "signed": true
  },
  "Float": {
    "precision": 53
  },
  "Boolean": null,
  "Date": null,
  "DateTime": {
    "with_timezone": true
  },
  "Time": {
    "with_timezone": false
  }
}

Advanced Types#

{
  "Binary": {
    "max_length": 1024
  },
  "Json": null,
  "Uuid": null,
  "Array": {
    "element_type": {
      "String": {
        "max_length": 50
      }
    }
  },
  "Custom": {
    "type_name": "user_status_enum"
  }
}
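Because every unified type serializes as either a bare name with a null payload or a single-key object with parameters, a reader can render them generically, recursing for Array element types. A sketch (the helper `render_type` is illustrative, not part of DBSurveyor):

```python
def render_type(dtype) -> str:
    """Render a unified data-type value as a short human-readable string.
    Handles parameterized types ({"Integer": {...}}) and bare types
    serialized with a null payload ({"Uuid": null})."""
    name, params = next(iter(dtype.items()))
    if params is None:
        return name
    if name == "Integer":
        sign = "" if params.get("signed", True) else "unsigned "
        return f"{sign}{params['bits']}-bit integer"
    if name == "Array":
        # Element types are themselves unified types, so recurse.
        return f"array of {render_type(params['element_type'])}"
    details = ", ".join(f"{k}={v}" for k, v in params.items())
    return f"{name}({details})"
```

This is useful when generating documentation from a schema file without hard-coding every database's native type names.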

Database-Specific Mapping#

| DBSurveyor Type | PostgreSQL | MySQL | SQLite | MongoDB |
| --- | --- | --- | --- | --- |
| String | VARCHAR, TEXT | VARCHAR, TEXT | TEXT | string |
| Integer | INTEGER, BIGINT | INT, BIGINT | INTEGER | int, long |
| Float | REAL, DOUBLE | FLOAT, DOUBLE | REAL | double |
| Boolean | BOOLEAN | BOOLEAN | INTEGER | bool |
| Json | JSON, JSONB | JSON | TEXT | object |
| Array | ARRAY[] | JSON | TEXT | array |
| Custom | ENUM, DOMAIN | ENUM | N/A | N/A |

Schema Validation#

All output files are validated against the DBSurveyor JSON Schema v1.0:

Validation Features#

  • Format Version Checking: Ensures compatibility
  • Required Field Validation: All mandatory fields present
  • Data Type Validation: Correct type structure
  • Security Validation: No credential fields allowed
  • Relationship Validation: Foreign key consistency

Validation Errors#

# Validate a schema file
dbsurveyor validate schema.json

# Example validation error
Error: Schema validation failed
- Missing required field: collection_metadata.collected_at
- Invalid data type: expected Integer, found String
- Security violation: credential field detected in column default_value
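The full validator checks the complete DBSurveyor JSON Schema v1.0; the spirit of it can be sketched with a few stdlib-only structural checks (the function `quick_validate` is a simplified illustration, not the real validator):

```python
def quick_validate(schema: dict) -> list[str]:
    """Minimal structural checks in the style of the validator output
    above; a sketch, not full JSON Schema v1.0 validation."""
    errors = []
    if schema.get("format_version") != "1.0":
        errors.append(f"unsupported format_version: {schema.get('format_version')!r}")
    meta = schema.get("collection_metadata", {})
    if "collected_at" not in meta:
        errors.append("Missing required field: collection_metadata.collected_at")
    for table in schema.get("tables", []):
        for col in table.get("columns", []):
            if not isinstance(col.get("ordinal_position"), int):
                errors.append(f"Invalid ordinal_position in {table['name']}.{col.get('name')}")
    return errors
```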

File Size Considerations#

Compression Recommendations#

| Schema Size | Recommendation | Expected Compression |
| --- | --- | --- |
| < 1 MB | Standard JSON | N/A |
| 1-10 MB | Zstandard compression | 60-70% reduction |
| > 10 MB | Compression + chunking | 70-80% reduction |

Large Schema Handling#

For very large databases (1000+ tables):

# Use compression for large schemas
dbsurveyor-collect --compress postgres://localhost/large_db

# Consider sampling for development
dbsurveyor-collect --sample 10 postgres://localhost/large_db

Security Considerations#

Credential Protection#

All output formats guarantee:

  • No database credentials in any output file
  • Connection strings sanitized in metadata
  • Error messages sanitized
  • Validation prevents credential leakage

Encryption Best Practices#

# Use strong passwords for encryption
dbsurveyor-collect --encrypt postgres://localhost/sensitive_db

# Verify encrypted file integrity
dbsurveyor validate schema.enc

Airgap Compatibility#

All formats work completely offline:

  • No external dependencies in generated files
  • Self-contained documentation
  • Local-only processing
  • No network calls during generation

Format Evolution#

Version Compatibility#

  • v1.0: Current format with full backward compatibility
  • Future versions: Additive changes only
  • Migration tools: Automatic format upgrades
  • Deprecation policy: 2-version support window
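Since changes within a major version are additive only, a consumer can accept any file whose major version it understands, regardless of the minor component. A sketch of that check (helper and constant names are hypothetical):

```python
SUPPORTED_MAJOR = 1  # the major format version this consumer understands

def is_compatible(format_version) -> bool:
    """Accept any format_version sharing a supported major component,
    relying on the additive-changes-only guarantee within a major version."""
    try:
        major = int(str(format_version).split(".")[0])
    except ValueError:
        return False
    return major == SUPPORTED_MAJOR
```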

Schema Extensions#

The format supports extensions for:

  • Database-specific features
  • Custom metadata
  • Plugin-generated data
  • Performance metrics

Integration Examples#

CI/CD Pipeline#

# Generate schema documentation in CI
  - name: Collect Database Schema
    run: |
      dbsurveyor-collect --compress $DATABASE_URL
      dbsurveyor generate schema.dbsurveyor.json.zst --format markdown

  - name: Upload Documentation
    uses: actions/upload-artifact@v3
    with:
      name: database-docs
      path: schema.md

Programmatic Access#

import json

# Load schema data
with open("schema.dbsurveyor.json") as f:
    schema = json.load(f)

# Analyze tables
for table in schema["tables"]:
    print(f"Table: {table['name']}")
    print(f"Columns: {len(table['columns'])}")

Documentation Automation#

#!/bin/bash
# Automated documentation generation

# Collect from multiple environments
dbsurveyor-collect --output prod.json $PROD_DB_URL
dbsurveyor-collect --output staging.json $STAGING_DB_URL

# Generate documentation
dbsurveyor generate prod.json --format html --output prod-schema.html
dbsurveyor generate staging.json --format html --output staging-schema.html

# Compare schemas (future feature)
# dbsurveyor compare prod.json staging.json --output diff.html

This comprehensive output format system ensures that DBSurveyor can adapt to various workflows while maintaining security and compatibility guarantees.
