Troubleshooting Common Issues#

This guide covers common issues you may encounter when using Zerobyte and their solutions.

Version Upgrade Issues#

Migration Errors on Upgrade to v0.24.0+#

Symptom: Upgrading from v0.22.0 to v0.24.0 fails with migration errors: "Migration 00001-retag-snapshots completed with errors: Restic password not configured for organization."

Solution:

Downgrade to v0.22.0
Run doctor on every repository (the "Run doctor" button in the UI)
Upgrade to v0.24.0 (may require two restarts)

Additional Issue: If data appears missing after upgrade, run:

docker exec -it zerobyte bun run cli assign-organization

Missing Required Environment Variables (v0.24.0+)#

Symptom: Container fails to start after upgrading to v0.24.0

Cause: Two new mandatory environment variables are required starting with v0.24.0

Solution: Add these environment variables to your docker-compose.yml:

environment:
  APP_SECRET: <generate with: openssl rand -hex 32>
  BASE_URL: http://your-server-ip:4096

Important: The BASE_URL must match the actual access URL. Using http://localhost:4096 when accessing from another machine causes "invalid origin" errors during admin user creation.
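
A quick way to catch the localhost mismatch described above is to look at the host part of BASE_URL before starting the container. The following is only a sketch: base_url_host and check_base_url are hypothetical helper names, not part of Zerobyte, and the parsing assumes a plain scheme://host:port URL without IPv6 literals or embedded credentials.

```shell
# Hypothetical helper: print the host part of a URL such as
# http://192.168.1.10:4096/path
base_url_host() {
  url=$1
  rest=${url#*://}              # drop the scheme
  rest=${rest%%/*}              # drop any path
  printf '%s\n' "${rest%%:*}"   # drop the port
}

# Hypothetical helper: print "loopback" when BASE_URL only works on the
# machine that runs the container, "ok" otherwise
check_base_url() {
  case "$(base_url_host "$1")" in
    localhost|127.0.0.1) echo "loopback" ;;
    *) echo "ok" ;;
  esac
}
```

For example, check_base_url "$BASE_URL" printing "loopback" suggests the value will only work from a browser on the Docker host itself.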

Client IP Detection Issues Behind Reverse Proxy#

Symptom: When Zerobyte is deployed behind a reverse proxy (nginx, Apache, Traefik, etc.), client IP addresses may not be correctly detected for rate limiting, logging, or security features

Cause: By default, Zerobyte does not trust the X-Forwarded-For header that reverse proxies use to pass the real client IP address

Solution: Set the TRUST_PROXY environment variable when deploying behind a reverse proxy:

environment:
  APP_SECRET: <generate with: openssl rand -hex 32>
  BASE_URL: http://your-server-ip:4096
  TRUST_PROXY: true

Security Warning: Only set TRUST_PROXY=true if you have a reverse proxy that properly sets X-Forwarded-For headers. Setting this to true without a proper reverse proxy can create security vulnerabilities as clients could spoof their IP addresses.

When to Use:

Set TRUST_PROXY=true when deploying behind nginx, Apache, Traefik, or similar reverse proxies
Leave at the default false value for direct deployments without a reverse proxy

Missing Data After Login#

Symptom: After migration to v0.24.0, the dashboard shows no volumes, repositories, or backups

Solution: Data still exists but isn't visible due to organization mapping issues. Run:

docker exec -it zerobyte bun run cli assign-organization

Volume Mounting Issues#

SMB Mount Permission Denied (v0.22.0)#

Symptom: SMB mounts that worked in v0.21.0 fail in v0.22.0 with "Unable to apply new capability set" errors

Workaround: Add DAC_READ_SEARCH to cap_add in docker-compose:

cap_add:
  - SYS_ADMIN
  - DAC_READ_SEARCH

Status: Fixed in v0.25.0

SMB Passwords with Special Characters#

Symptom: SMB passwords containing special characters like $ or commas cause STATUS_LOGON_FAILURE errors

Cause: The mount command doesn't properly escape special characters

Status: Fixed in v0.25.0

SMB Permission Errors During Backup#

Symptom: SMB mounted volumes report "xattr.list: permission denied" errors during backup. Files show a ? character at the end of permissions (e.g., -rwxr-xr-x?)

Solution: Switch from SMB to NFS volumes. The issue was present in earlier versions but became visible in the UI in v0.22.

Viewing Error Details: When a backup completes with warnings, detailed diagnostic information is displayed directly in the Schedule Summary page through a collapsible "Warning details" component. This shows the specific stderr output from restic, including permission denied errors and the exact files affected.

General SMB/NFS Mount Failures#

Symptom: Mount operations fail with "Permission denied"

Solution: Add capabilities to docker-compose.yml:

cap_add:
  - SYS_ADMIN
  - SYS_PTRACE
security_opt:
  - seccomp:unconfined
  - apparmor:unconfined
devices:
  - /dev/fuse

Reference: See TROUBLESHOOTING.md for detailed security configuration.

NFS v3 Connection Refused#

Symptom: NFS v3 mounts fail with "Connection refused" even when the same mount works on the host

Cause: NFS v3 requires the nolock option

Status: Fixed in v0.22.0-beta.4 by automatically adding the nolock option for NFS v3

SMB on macOS#

Symptom: SMB mounting from macOS hosts is unreliable and may time out after 5000ms

Status: Known issue - SMB from macOS hosts may be unstable

Repository Issues#

Wrong Password or No Key Found#

Symptom: "wrong password or no key found" error even with correct password

Cause: This misleading error occurs when the drive containing the repository becomes unmounted. Restic returns this error when repository files are inaccessible, regardless of password correctness.

Solution: Check that the storage volume is properly mounted and accessible.
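
For a local or mounted repository, one way to tell an unmounted drive apart from a real password problem is to check that the top-level repository files are visible. The sketch below relies on the standard restic layout (a config file and a keys directory at the repository root). The function name repo_looks_accessible is hypothetical, and the check does not apply to remote backends such as S3 or SFTP.

```shell
# Hypothetical helper: report whether a local restic repository directory
# is visible from where this script runs
repo_looks_accessible() {
  repo=$1
  [ -d "$repo" ] || { echo "missing: $repo is not a directory"; return 1; }
  [ -r "$repo/config" ] || { echo "missing: $repo/config is not readable"; return 1; }
  [ -d "$repo/keys" ] || { echo "missing: $repo/keys not found"; return 1; }
  echo "ok"
}
```

If the function reports a missing path for a repository that used to work, the underlying mount is the more likely culprit than the password.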

Blob Not Found in Index#

Symptom: "blob not found in index" error during backup or restore operations

Cause: Repository corruption where the index references data blobs that don't exist in storage. Common causes:

Bucket policies with auto-deletion
Interrupted operations
Storage backend issues

Solution:

Check for auto-deletion policies on your storage backend
Follow the Restic troubleshooting guide
If unrecoverable, delete and recreate the repository

Repository Lock Issues#

Symptom: SFTP repositories get stuck with persistent lock files that prevent operations

Automatic Recovery: When Zerobyte encounters a repository lock error (ResticLockError with exit code 11), it now automatically attempts to unlock stale locks and retry the operation once. This applies to all restic operations including backup, restore, check, forget, and other repository operations. In most cases, lock errors are automatically resolved without manual intervention.

Manual Intervention: If automatic recovery fails, you can manually resolve lock issues using the "Run doctor" button from the UI or the Unlock action. These should only be needed when automatic recovery is unsuccessful. The ResticLockError class is a dedicated error type for repository locking failures, exported from the error module alongside ResticError.
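
The unlock-and-retry-once behaviour can be pictured with the sketch below. It is an illustration of the idea, not Zerobyte's implementation: run_with_unlock_retry is a hypothetical name, and the unlock step is passed in as a command so the example stays self-contained (with restic itself, that step would typically be restic unlock).

```shell
# Usage: run_with_unlock_retry UNLOCK_CMD CMD [ARGS...]
# Runs CMD; if it exits with 11 (repository lock error), runs UNLOCK_CMD
# and retries CMD exactly once. Returns the final exit status of CMD.
run_with_unlock_retry() {
  unlock_cmd=$1
  shift
  status=0
  "$@" || status=$?
  if [ "$status" -eq 11 ]; then
    # Give up with the original status if unlocking itself fails
    "$unlock_cmd" || return "$status"
    status=0
    "$@" || status=$?
  fi
  return "$status"
}
```

Any other exit status is returned unchanged, so only lock errors trigger the retry.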

S3 Repository Health Check Failures#

Symptom: S3 repositories enter error state with "bad gateway" errors when running doctor, even though backups succeed

Cause: Related to mounting the source volume as read-only (:ro)

Solution: Remove read-only flag from source mounts in docker-compose.yml

Repository Initialization Timeout#

Symptom: Repository initialization times out when the storage backend is too slow or unreachable. Timeout errors are explicitly detected and reported with the message: "Command timed out before completing"

Cause: Repository initialization uses a timeout based on the SERVER_IDLE_TIMEOUT environment variable (default 60 seconds). Slow storage backends may exceed this limit.

Solution: Increase the timeout for slow storage backends:

environment:
  SERVER_IDLE_TIMEOUT: 120

Note: Timeout errors are now clearly identified in error messages, making it easier to distinguish timeout failures from other types of errors. The repository may be incomplete on disk and require recreation when a timeout occurs.

Backup Issues#

Backups or restores use too much CPU#

Start with the performance tuning guide. The usual fixes are:

Set GOMAXPROCS=1 or GOMAXPROCS=2 on the Zerobyte container and restart
Change repository compression mode from max to auto or off
Add --read-concurrency 1 to the affected backup schedule's custom Restic parameters

See Performance tuning for the full workflow and supported advanced flags.

Backup Warnings and Errors#

Viewing Details: When a backup completes with warnings or errors, detailed diagnostic information is displayed directly in the Schedule Summary page through a collapsible "Warning details" or "Error details" component.

Error Display Structure: Zerobyte separates error information into two tiers:

Error Summary: A high-level description of what went wrong (e.g., "Command failed: An error occurred while executing the command")
Diagnostic Details: Specific diagnostic information from stderr that helps troubleshoot the root cause (e.g., SSH key permission errors, connection failures, file read errors)

What You'll See: The UI prioritizes showing actionable troubleshooting information:

Specific error messages extracted from restic's stderr output
Permission denied errors with exact file paths
SSH connection failures with detailed diagnostic messages
File read errors showing which files couldn't be backed up

When Shown:

Warnings (exit code 3): Displays when a backup completes successfully but couldn't read some files due to permission issues or other read errors
Errors: Displays when a backup fails completely, showing the detailed stderr output that explains the root cause rather than just generic error codes

Additional Context: For more detailed logs beyond what's shown in the UI, check container logs with docker logs zerobyte.

Username Validation Error#

Symptom: In versions after v0.22, usernames containing hyphens were rejected with a "username is invalid" error.

Status: This issue is resolved. Usernames can now include letters, numbers, hyphens (-), underscores (_), and dots (.).

Allowed Characters: Usernames must be 2-50 characters and may contain:

Lowercase and uppercase letters (a-z, A-Z)
Numbers (0-9)
Hyphens (-)
Underscores (_)
Dots (.)

Usernames are automatically normalized to lowercase and trimmed of whitespace. If you still encounter a validation error, ensure your username does not contain spaces or unsupported special characters (such as @, !, etc.).

You no longer need to use the CLI to change usernames with hyphens, underscores, or dots.
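
To check a username against the rules above before entering it, a small sketch like the following can help. The function names are hypothetical, and the order of operations (trim, lowercase, then validate) is an assumption based on the description above rather than on Zerobyte's source.

```shell
# Hypothetical helper: trim surrounding whitespace and lowercase the name
normalize_username() {
  printf '%s\n' "$1" \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: print "valid: <normalized name>" or "invalid"
check_username() {
  name=$(normalize_username "$1")
  # 2-50 characters: letters, digits, dots, underscores, hyphens
  if printf '%s\n' "$name" | grep -Eq '^[a-z0-9._-]{2,50}$'; then
    echo "valid: $name"
  else
    echo "invalid"
  fi
}
```

For instance, check_username ' Backup-Admin ' reports the normalized form, while names containing @ or inner spaces are rejected.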

Backup Selecting Wrong Folder (BTRFS)#

Symptom: On BTRFS filesystems, backups include the parent directory instead of the selected folder

Cause: This bug was caused by incorrect path resolution when a selected subfolder had the exact same name as its parent volume mount. The path.relative check falsely identified the subfolder path as an absolute path already inside the volume, causing the entire volume root to be backed up instead of the intended subfolder.

Status: Fixed in PR #576. The issue has been resolved and no workaround is needed for current versions.

Workaround (for older versions): Uncheck the "one file system" option in backup settings

RESTIC_HOSTNAME Not Used#

Symptom: The RESTIC_HOSTNAME environment variable is ignored during repository creation

Status: Fixed in PR #583. When appConfig.resticHostname is configured, the system automatically adds a key with the configured hostname after repository initialization.

Technical Details: Restic's init command doesn't support the --host flag. The fix works around this limitation by calling a new keyAdd function immediately after initialization to add a key with the correct hostname. If key addition fails, the repository initialization still succeeds but a warning is logged.

For Repositories Created Before This Fix: If you have repositories that were created before PR #583, the hostname may not be set correctly. You can correct it in either of two ways:

Using the dev panel (accessible with Meta+Shift+D) to run restic key add --host <hostname> commands directly
Setting the hostname directly in docker-compose:
```
hostname: zerobyte
```

Note: Repositories created after this fix automatically have the correct hostname configured and require no manual intervention.

Passphrase-Protected SSH Keys#

Symptom: SFTP operations fail with passphrase-protected SSH keys

Cause: Zerobyte explicitly rejects passphrase-protected SSH keys

Solution: Use SSH keys without passphrases or use password-based authentication

Custom Restic Parameters Issues#

Invalid or Unsupported Flags#

Symptom: Backup fails with restic error about unknown flag or option

Cause: User entered invalid restic flags or typos in the custom parameters field

Solution:

Verify flags against the official restic backup command documentation
Check for typos in flag names
Ensure proper syntax with space-separated flags and values (e.g., --exclude-larger-than 100M)

Syntax Errors in Custom Parameters#

Symptom: Backup command fails or parameters not applied correctly

Cause: Incorrect spacing, missing dashes, or improper formatting

Solution: Use proper flag syntax:

Enter one flag per line (e.g., --ignore-inode on one line, --ignore-ctime on another)
Use either --flag-name value or --flag-name=value format
Avoid extra spaces within flag names
Ensure flags start with -- (double dash) or - (single dash) as appropriate
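
The formatting rules above can be checked mechanically before pasting parameters into the UI. The sketch below is a rough linter under stated assumptions: lint_restic_params is a hypothetical name, it only checks the shape of each line (it does not know which flags restic accepts), and it will misreport a flag whose value itself begins with a dash.

```shell
# Hypothetical helper: read custom parameters from stdin, one per line,
# and report lines that break the formatting rules. Returns 1 if any
# problem was found.
lint_restic_params() {
  bad=0
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      "") ;;                                   # ignore blank lines
      -*" -"*) echo "several flags on one line: $line"; bad=1 ;;
      -*) ;;                                   # looks like a single flag
      *) echo "not a flag: $line"; bad=1 ;;    # missing leading dash
    esac
  done
  return "$bad"
}
```

A typical call would be lint_restic_params < params.txt, where params.txt holds the text intended for the custom parameters field.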

Conflicts with Existing Settings#

Symptom: Unexpected behavior, backup failures, or settings not taking effect as expected

Cause: Custom parameters conflict with Zerobyte's built-in settings (e.g., exclude patterns, oneFileSystem option)

Solution:

Review what Zerobyte already configures through the UI (exclude patterns, include patterns, one-file-system option)
Avoid duplicating flags that Zerobyte sets automatically
If unsure, test without custom parameters first, then add them incrementally

Common Use Cases#

Custom restic parameters are useful for advanced scenarios:

Ignore inode/ctime changes: Use --ignore-inode and --ignore-ctime for filesystems where these attributes change without content changes
Skip large files: Use --exclude-larger-than 500M to exclude files larger than a specified size
Improve performance on large directories: Use --no-scan to skip the initial scan that restic runs to estimate the backup size for progress reporting
Adjust concurrent reads: Use --read-concurrency 8 to control how many files are read in parallel

Example configuration:

--ignore-inode
--ignore-ctime
--exclude-larger-than 500M
--read-concurrency 8

Best Practices#

Test first: Try custom parameters on a small backup schedule before applying to production backups
Use with caution: The UI notes that custom parameters are appended as-is to the restic command
Check the UI first: Review warning or error details in the Schedule Summary page's collapsible component before checking container logs
Isolate issues: Remove custom parameters temporarily if backups fail, to determine whether the parameters are the cause
One flag per line: Enter each parameter on a separate line in the text field for clarity

Slow Backups#

Symptom: Backups take too long to complete

Common causes and solutions:

Check network bandwidth (for cloud repositories)
Check disk I/O performance
Enable compression for large compressible files
Try --read-concurrency 4 or --read-concurrency 8 for fast local SSD/NVMe sources
Consider bandwidth limits if other services are affected

See Performance tuning before changing multiple performance settings at once.

Restore Issues#

Original Location Restore Unavailable#

Symptom: When attempting to restore a snapshot, the "Original location" button is disabled with a warning: "Source paths do not match - This snapshot was created from source paths that do not match this Zerobyte server or the current linked volume. Restoring to the original location is unavailable."

Cause: The snapshot was created from a system with non-POSIX file paths (such as Windows paths like C:\Users...). Zerobyte cannot safely restore these snapshots to their original locations on POSIX-based systems (Linux/Unix) because the path structures are incompatible.

Solution:

Use the "Custom location" restore option to specify a target directory on your current system
Enter a valid POSIX path (e.g., /mnt/restore/data) as the restore destination
Alternatively, use the "Download" option to download the snapshot contents directly

This is expected behavior when restoring snapshots across different operating systems (Windows to Linux/Unix or vice versa).

Permission Errors During Restore#

Symptom: Restore operations fail with lchown/chmod permission errors on NFS, SMB, SFTP, or unprivileged LXC filesystems

Cause: Filesystems don't allow permission changes

Solution: In v0.13.0+, use the "Exclude xattrs" option in Advanced settings. For example, add:

system.nfs4_acl_xdr

Special Characters in Filenames#

Symptom: Restoring files with special characters (like Polish letter ó) fails with errors like 'no matching ID found for prefix "\b__bunstr_0"'

Status: Fixed in v0.19.3-beta.1

Restore to Windows via SFTP#

Symptom: Restoring to Windows machines via SFTP throws lchown errors

Cause: Attempts to set Unix ownership on Windows

Note: The restore may actually work despite the errors

Restore Target Path Not Allowed#

Symptom: "Restore target path is not allowed. Restoring to this path could overwrite critical system files or application data."

Cause: You are attempting to restore a snapshot to a path that is protected to prevent accidental overwriting of critical system files or application data.

Blocked paths include:

Repository base directory (/var/lib/zerobyte/repositories)
Restic cache directory (/var/lib/zerobyte/cache)
Rclone config directory (/root/.config/rclone)
Application directory (/app)
Database directory
Restic password file directory
System temp directory
Provisioning directory (if configured)

Solution: Choose a different restore target path that doesn't overlap with these protected system directories. For example, restore to a user data directory or a dedicated restore location outside of Zerobyte's internal directories.
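
A rough pre-check for a restore target can be sketched as follows. It covers only the four blocked paths that are listed above with explicit locations, check_restore_target is a hypothetical name, and treating a parent of a protected directory as blocked is an assumption about what "overlap" means. The sketch compares path strings only and does not resolve symlinks or ".." segments.

```shell
# Hypothetical helper: print "blocked" if the target is inside, equal to,
# or a parent of one of the protected directories, "allowed" otherwise
check_restore_target() {
  target=${1%/}   # ignore one trailing slash
  for p in /var/lib/zerobyte/repositories /var/lib/zerobyte/cache \
           /root/.config/rclone /app; do
    case "$target/" in
      "$p"/*) echo "blocked"; return 0 ;;   # target is p or lies inside it
    esac
    case "$p/" in
      "$target"/*) echo "blocked"; return 0 ;;   # target is a parent of p
    esac
  done
  echo "allowed"
}
```

A dedicated location such as /mnt/restore/data passes this check, while /app/data or /var/lib does not.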

Performance and Timeout Issues#

Slow Repository Response#

Symptom: Repositories on slow backends time out after 60 seconds, causing "failed to fetch" errors when loading snapshots

Solution: Increase timeout with environment variable:

environment:
  SERVER_IDLE_TIMEOUT: 120

Note: Default increased from 10s to 60s in v0.20.0

SSH Connection Timeouts#

Symptom: Large backups over SFTP fail with "connection closed by remote host" after transferring significant data (10-100GB)

Status: Fixed in v0.19.4-alpha.1+ with SSH keepalive settings (ServerAliveInterval 60, ServerAliveCountMax 240)
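
To see what these keepalive values mean in practice: an OpenSSH client sends a probe after ServerAliveInterval seconds of silence and drops the connection once ServerAliveCountMax consecutive probes go unanswered. The tiny sketch below (hypothetical function name) just multiplies the two values to show how long an unresponsive server is tolerated.

```shell
# Hypothetical helper: seconds of server silence tolerated before the SSH
# client gives up, given ServerAliveInterval and ServerAliveCountMax
keepalive_tolerance_seconds() {
  echo $(( $1 * $2 ))
}

# With a plain ssh client the same settings would be passed as:
#   ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=240 user@host
```

With the values above, keepalive_tolerance_seconds 60 240 gives 14400 seconds, or roughly four hours.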

Restore Speed Degradation#

Symptom: SFTP restore speeds degraded from ~1GB/min in v0.22.0 to ~0.1GB/min in v0.24.2

Status: Fixed in v0.27.0 - restore speeds returned to normal (~1.8GB/min)

Doctor Operation Endless Loop#

Symptom: After upgrading to v0.24.0, repositories get stuck in an endless "Doctor is running" loop. Canceling fails and prevents backups from running.

Solution: Update to v0.24.2 with proper APP_SECRET and BASE_URL environment variables

Rclone Issues#

Rclone Volume Mount Failures#

Symptom: Rclone volumes fail to mount with "Daemon timed out" errors even when the repository backend works

Note: This usually stems from confusing rclone volumes (remote storage mounted as a volume) with rclone repositories (rclone used as the backup destination).

Rclone SFTP Timeout#

Symptom: Rclone with SFTP backend times out when mounting volumes

Status: Fixed in PR #575. Rclone mount operations now use the configurable SERVER_IDLE_TIMEOUT setting (default 300 seconds) instead of a fixed timeout.

Solution: For slow SFTP connections or large mount operations, increase the timeout with an environment variable:

environment:
  SERVER_IDLE_TIMEOUT: 600

Note: The rclone backend timeout is aligned with other backend operations that use the centralized timeout configuration.

Certificate Validation Errors#

Symptom: Rclone operations fail with SSL/TLS certificate verification errors when using self-signed certificates or untrusted certificate authorities

Cause: By default, rclone validates SSL/TLS certificates. Self-signed certificates or certificates from untrusted CAs will be rejected.

Solution: Enable the insecureTls option in your rclone backend configuration. This sets the RCLONE_NO_CHECK_CERTIFICATE environment variable to disable certificate verification.

Configuration Example:

backend: rclone
remote: my-remote
path: /backups
insecureTls: true

Security Warning: Disabling certificate verification should only be done in trusted environments or when you specifically need to use self-signed certificates. This reduces security by making connections vulnerable to man-in-the-middle attacks.

When to Use:

Using self-signed certificates in a controlled environment
Testing with certificate authorities not in the system trust store
Internal networks where certificate verification is not critical

When NOT to Use:

Production environments with properly signed certificates
Public networks or untrusted connections
Any scenario where connection security is critical

Storage Backend Issues#

S3 Glacier/Cold Storage#

Symptom: Operations fail with "operation is not valid for the object's storage class" errors

Cause: Archival storage classes like S3 Glacier Deep Archive are not supported. Objects must be accessible immediately.

Solution: Do not use lifecycle rules that move objects to Glacier or other cold storage tiers

Diagnostic Features#

Zerobyte includes several built-in diagnostic features to help troubleshoot issues:

UI-Based Diagnostics#

Backup Status Details: The Schedule Summary page displays detailed warning and error information directly in the UI through collapsible components.

Error Display Structure: Zerobyte separates error information into two tiers to provide actionable troubleshooting information:

Error Summary: A high-level description (e.g., "Command failed")
Diagnostic Details: Specific stderr output from restic showing the root cause

What's Shown:

Warning details: Displays stderr output when backups complete with read errors (exit code 3), showing which files couldn't be read and why
Error details: Shows detailed diagnostic information when backups fail completely, extracted from restic's stderr rather than just generic error codes
Real-time visibility: No need to check container logs for common backup issues

Logging System#

Zerobyte uses a custom Consola-based logger with:

Automatic credential sanitization in production mode
Configurable log levels via LOG_LEVEL environment variable
Structured output with timestamps and color coding

Restic Error Codes#

Zerobyte maps Restic exit codes to human-readable error summaries and separates them from diagnostic details:

Exit code 1: Command failed
Exit code 2: Go runtime error
Exit code 3: Backup could not read all files
Exit code 10: Repository not found
Exit code 11: Failed to lock repository (ResticLockError)
Exit code 12: Wrong repository password
Exit code 130: Backup interrupted

The error display shows both the high-level summary and the specific diagnostic information from stderr to help with troubleshooting. Exit code 11 uses a dedicated ResticLockError class that triggers automatic lock recovery.
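
When working with restic from a script or from container logs, the same mapping can be reproduced with a simple lookup. This is a sketch that mirrors the table above. The function name restic_exit_summary is hypothetical and the wording for unlisted codes is an assumption.

```shell
# Hypothetical helper: translate a restic exit code into the summary
# text listed above
restic_exit_summary() {
  case $1 in
    1)   echo "Command failed" ;;
    2)   echo "Go runtime error" ;;
    3)   echo "Backup could not read all files" ;;
    10)  echo "Repository not found" ;;
    11)  echo "Failed to lock repository" ;;
    12)  echo "Wrong repository password" ;;
    130) echo "Backup interrupted" ;;
    *)   echo "Unknown exit code $1" ;;
  esac
}
```

For example, after a failed command, restic_exit_summary "$?" prints the matching summary line.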

Repository Health Checks#

Use the "Run doctor" button in the UI to:

Unlock stale repository locks
Repair repository index
Verify repository integrity

Auto-Remediation Jobs#

The application includes automatic recovery mechanisms:

Auto-remount: Automatically retries mounting volumes in error state
Dangling mount cleanup: Detects and removes orphaned mounts
Repository health checks: Periodic validation of repository status

CLI Tools#

Administrative command-line utilities:

docker exec -it zerobyte bun run cli

Available operations:

Password reset
Username changes
2FA management (disable, rekey)
Organization assignment

Frequently Asked Questions#

Q: Why can't I create an admin user?
Ensure BASE_URL matches the actual IP/hostname you're using to access Zerobyte. Using http://localhost:4096 fails when accessing from another machine.

Q: What Docker capabilities do I need?
At minimum: cap_add: SYS_ADMIN and devices: /dev/fuse. For SMB/NFS/SFTP, you may also need SYS_PTRACE and security options.

Q: Can I use cold storage like S3 Glacier?
No, archival storage classes are not supported. Objects must be accessible immediately.

Q: Why do snapshots fail to load?
Usually due to slow repository response or timeout issues. Try increasing SERVER_IDLE_TIMEOUT to 120 or more.

Q: How do I handle password requirements?
Password requirements are hardcoded (8 character minimum) and cannot be customized.

Additional Resources#

Comprehensive TROUBLESHOOTING.md guide - 465-line guide covering permission issues, FUSE mounts, rclone configuration, and security contexts
Restic Troubleshooting Guide - Official Restic documentation for repository issues
