Troubleshooting Common Issues#
This guide covers common issues you may encounter when using Zerobyte and their solutions.
Version Upgrade Issues#
Migration Errors on Upgrade to v0.24.0+#
Symptom: Upgrading from v0.22.0 to v0.24.0 fails with migration errors: "Migration 00001-retag-snapshots completed with errors: Restic password not configured for organization."
Solution:
- Downgrade to v0.22.0
- Click the Doctor button for every repository
- Upgrade to v0.24.0 (may require two restarts)
Additional Issue: If data appears missing after upgrade, run:
docker exec -it zerobyte bun run cli assign-organization
Missing Required Environment Variables (v0.24.0+)#
Symptom: Container fails to start after upgrading to v0.24.0
Cause: Two new mandatory environment variables are required starting with v0.24.0
Solution: Add these environment variables to your docker-compose.yml:
environment:
  APP_SECRET: <generate with: openssl rand -hex 32>
  BASE_URL: http://your-server-ip:4096
Important: The BASE_URL must match the actual access URL. Using http://localhost:4096 when accessing from another machine causes "invalid origin" errors during admin user creation.
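The mismatch behind the "invalid origin" error can be pictured with a short shell sketch. The helper names url_origin and same_origin and the example address 192.168.1.50 are hypothetical, and this is not Zerobyte's own check; it only illustrates why the configured BASE_URL and the address typed in the browser must share the same scheme, host, and port.

```shell
# Hypothetical helper: reduce a URL to its origin (scheme://host[:port]).
url_origin() {
  # Drop any path, query, or fragment that follows the host and port.
  printf '%s\n' "$1" | sed -E 's|^([a-zA-Z][a-zA-Z0-9+.-]*://[^/?#]+).*$|\1|'
}

# Hypothetical helper: succeeds when both URLs have the same origin.
same_origin() {
  [ "$(url_origin "$1")" = "$(url_origin "$2")" ]
}

BASE_URL="http://localhost:4096"
if same_origin "$BASE_URL" "http://192.168.1.50:4096/login"; then
  echo "origin matches"
else
  echo "origin mismatch: set BASE_URL to the address you actually browse to"
fi
```

With the values above the second branch runs, which mirrors the situation where BASE_URL is left at localhost but the UI is opened from another machine.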
Client IP Detection Issues Behind Reverse Proxy#
Symptom: When Zerobyte is deployed behind a reverse proxy (nginx, Apache, Traefik, etc.), client IP addresses may not be correctly detected for rate limiting, logging, or security features
Cause: By default, Zerobyte does not trust the X-Forwarded-For header that reverse proxies use to pass the real client IP address
Solution: Set the TRUST_PROXY environment variable when deploying behind a reverse proxy:
environment:
  APP_SECRET: <generate with: openssl rand -hex 32>
  BASE_URL: http://your-server-ip:4096
  TRUST_PROXY: "true"
Security Warning: Only set TRUST_PROXY=true if you have a reverse proxy that properly sets X-Forwarded-For headers. Setting this to true without a proper reverse proxy can create security vulnerabilities as clients could spoof their IP addresses.
When to Use:
- Set TRUST_PROXY=true when deploying behind nginx, Apache, Traefik, or similar reverse proxies
- Leave at the default false value for direct deployments without a reverse proxy
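The effect of the setting can be sketched as follows. The function client_ip and the sample addresses are hypothetical, and choosing the left-most X-Forwarded-For entry is an assumption made for illustration, not a description of Zerobyte's implementation.

```shell
# Illustrative only: not Zerobyte's actual code.
# Usage: client_ip TRUST_PROXY REMOTE_ADDR X_FORWARDED_FOR
client_ip() {
  if [ "$1" = "true" ] && [ -n "$3" ]; then
    # Header trusted: take the left-most address in X-Forwarded-For.
    printf '%s\n' "$3" | cut -d, -f1 | tr -d ' '
  else
    # Header ignored: use the address of the direct TCP peer.
    printf '%s\n' "$2"
  fi
}

client_ip false 172.18.0.2 "203.0.113.7"    # prints 172.18.0.2 (the proxy's address)
client_ip true  172.18.0.2 "203.0.113.7"    # prints 203.0.113.7 (the client behind the proxy)
client_ip true  198.51.100.9 "10.0.0.1"     # prints 10.0.0.1 (no proxy: a client-supplied header is believed)
```

The last call shows the spoofing risk described in the security warning: without a proxy that sets the header itself, whatever the client sends is taken at face value.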
Missing Data After Login#
Symptom: After migration to v0.24.0, dashboard shows no volumes, repos, or backups
Solution: Data still exists but isn't visible due to organization mapping issues. Run:
docker exec -it zerobyte bun run cli assign-organization
Volume Mounting Issues#
SMB Mount Permission Denied (v0.22.0)#
Symptom: SMB mounts that worked in v0.21.0 fail in v0.22.0 with "Unable to apply new capability set" errors
Workaround: Add DAC_READ_SEARCH to cap_add in docker-compose:
cap_add:
- SYS_ADMIN
- DAC_READ_SEARCH
Status: Fixed in v0.25.0
SMB Passwords with Special Characters#
Symptom: SMB passwords containing special characters like $ or commas cause STATUS_LOGON_FAILURE errors
Cause: The mount command doesn't properly escape special characters
Status: Fixed in v0.25.0
SMB Permission Errors During Backup#
Symptom: SMB mounted volumes report "xattr.list: permission denied" errors during backup. Files show a ? character at the end of permissions (e.g., -rwxr-xr-x?)
Solution: Switch from SMB to NFS volumes. The issue was present in earlier versions but became visible in the UI in v0.22.
Viewing Error Details: When a backup completes with warnings, detailed diagnostic information is displayed directly in the Schedule Summary page through a collapsible "Warning details" component. This shows the specific stderr output from restic, including permission denied errors and the exact files affected.
General SMB/NFS Mount Failures#
Symptom: Mount operations fail with "Permission denied"
Solution: Add capabilities to docker-compose.yml:
cap_add:
- SYS_ADMIN
- SYS_PTRACE
security_opt:
- seccomp:unconfined
- apparmor:unconfined
devices:
- /dev/fuse
Reference: See TROUBLESHOOTING.md for detailed security configuration.
NFS v3 Connection Refused#
Symptom: NFS v3 mounts fail with "Connection refused" even when the same mount works on the host
Cause: NFS v3 requires the nolock option
Status: Fixed in v0.22.0-beta.4 by automatically adding the nolock option for NFS v3
SMB on macOS#
Symptom: SMB mounting from macOS hosts is unreliable and may time out after 5000ms
Status: Known issue - SMB from macOS hosts may be unstable
Repository Issues#
Wrong Password or No Key Found#
Symptom: "wrong password or no key found" error even with correct password
Cause: This misleading error occurs when the drive containing the repository becomes unmounted. Restic returns this error when repository files are inaccessible, regardless of password correctness.
Solution: Check that the storage volume is properly mounted and accessible.
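For a local repository, one quick way to tell an unmounted drive from a real password problem is to check that restic's repository files are still visible. The sketch below relies on the standard restic layout (a config file and a keys directory at the repository root); the function name repo_reachable and the example path are hypothetical.

```shell
# repo_reachable DIR: succeeds when DIR looks like a readable local restic repository.
repo_reachable() {
  [ -f "$1/config" ] && [ -d "$1/keys" ]
}

# Example path only; substitute the directory of your own repository.
repo_dir="/var/lib/zerobyte/repositories/example-repo"
if repo_reachable "$repo_dir"; then
  echo "repository files are readable"
else
  echo "repository files are missing: check that the underlying drive is mounted"
fi
```

The check needs to run where the repository path is visible, for example inside the container via docker exec.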
Blob Not Found in Index#
Symptom: "blob not found in index" error during backup or restore operations
Cause: Repository corruption where the index references data blobs that don't exist in storage. Common causes:
- Bucket policies with auto-deletion
- Interrupted operations
- Storage backend issues
Solution:
- Check for auto-deletion policies on your storage backend
- Follow the Restic troubleshooting guide
- If unrecoverable, delete and recreate the repository
Repository Lock Issues#
Symptom: SFTP repositories get stuck with persistent lock files that prevent operations
Solution: Use the Doctor button in the UI to clear stale locks. The ResticError class maps exit code 11 to repository locking failures.
S3 Repository Health Check Failures#
Symptom: S3 repositories enter error state with "bad gateway" errors when running doctor, even though backups succeed
Cause: Related to mounting the source volume as read-only (:ro)
Solution: Remove read-only flag from source mounts in docker-compose.yml
Repository Initialization Timeout#
Symptom: Repository initialization times out when the storage backend is too slow or unreachable. Timeout errors are explicitly detected and reported with the message: "Command timed out before completing"
Cause: Repository initialization uses a timeout based on the SERVER_IDLE_TIMEOUT environment variable (default 60 seconds). Slow storage backends may exceed this limit.
Solution: Increase the timeout for slow storage backends:
environment:
  SERVER_IDLE_TIMEOUT: 120
Note: Timeout errors are now clearly identified in error messages, making it easier to distinguish timeout failures from other types of errors. The repository may be incomplete on disk and require recreation when a timeout occurs.
Backup Issues#
Backup Warnings and Errors#
Viewing Details: When a backup completes with warnings or errors, detailed diagnostic information is displayed directly in the Schedule Summary page through a collapsible "Warning details" or "Error details" component.
Error Display Structure: Zerobyte separates error information into two tiers:
- Error Summary: A high-level description of what went wrong (e.g., "Command failed: An error occurred while executing the command")
- Diagnostic Details: Specific diagnostic information from stderr that helps troubleshoot the root cause (e.g., SSH key permission errors, connection failures, file read errors)
What You'll See: The UI prioritizes showing actionable troubleshooting information:
- Specific error messages extracted from restic's stderr output
- Permission denied errors with exact file paths
- SSH connection failures with detailed diagnostic messages
- File read errors showing which files couldn't be backed up
When Shown:
- Warnings (exit code 3): Displays when a backup completes successfully but couldn't read some files due to permission issues or other read errors
- Errors: Displays when a backup fails completely, showing the detailed stderr output that explains the root cause rather than just generic error codes
Additional Context: For more detailed logs beyond what's shown in the UI, check container logs with docker logs zerobyte.
Username Validation Error#
Symptom: Previously, after v0.22, usernames with hyphens showed a "username is invalid" error.
Status: This issue is resolved. Usernames can now include lowercase letters, numbers, hyphens (-), underscores (_), and dots (.).
Allowed Characters: Usernames must be 2-50 characters and may contain:
- Lowercase and uppercase letters (a-z, A-Z)
- Numbers (0-9)
- Hyphens (-)
- Underscores (_)
- Dots (.)
Usernames are automatically normalized to lowercase and trimmed of whitespace. If you still encounter a validation error, ensure your username does not contain spaces or unsupported special characters (such as @, !, etc.).
You no longer need to use the CLI to change usernames with hyphens, underscores, or dots.
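The rules above can be approximated in a few lines of shell. The function names are hypothetical, and applying the 2-50 length limit after trimming and lowercasing is an assumption; Zerobyte's own validator may differ in detail.

```shell
# normalize_username NAME: trim surrounding whitespace and convert to lowercase.
normalize_username() {
  printf '%s\n' "$1" \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | tr '[:upper:]' '[:lower:]'
}

# valid_username NAME: succeeds when the normalized name is 2-50 characters
# drawn from letters, digits, dots, underscores, and hyphens.
valid_username() {
  normalize_username "$1" | grep -Eq '^[a-z0-9._-]{2,50}$'
}

valid_username "  Backup-Admin " && echo "accepted as $(normalize_username '  Backup-Admin ')"
valid_username "user@example" || echo "rejected: unsupported character"
```

The first call prints "accepted as backup-admin"; the second is rejected because of the @ character.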
Backup Selecting Wrong Folder (BTRFS)#
Symptom: On BTRFS filesystems, backups include the parent directory instead of the selected folder
Cause: This bug was caused by incorrect path resolution when a selected subfolder had the exact same name as its parent volume mount. The path.relative check falsely identified the subfolder path as an absolute path already inside the volume, causing the entire volume root to be backed up instead of the intended subfolder.
Status: Fixed in PR #576. The issue has been resolved and no workaround is needed for current versions.
Workaround (for older versions): Uncheck the "one file system" option in backup settings
RESTIC_HOSTNAME Not Used#
Symptom: The RESTIC_HOSTNAME environment variable is ignored during repository creation
Status: Fixed in PR #583. When appConfig.resticHostname is configured, the system automatically adds a key with the configured hostname after repository initialization.
Technical Details: Restic's init command doesn't support the --host flag. The fix works around this limitation by calling a new keyAdd function immediately after initialization to add a key with the correct hostname. If key addition fails, the repository initialization still succeeds but a warning is logged.
For Repositories Created Before This Fix: If you have repositories that were created before PR #583, the hostname may not be set correctly. You can manually add a key with the desired hostname by:
- Using the dev panel (accessible with Meta+Shift+D) to run restic key add --host <hostname> commands directly
- Setting the hostname directly in docker-compose:
hostname: zerobyte
Note: Repositories created after this fix automatically have the correct hostname configured and require no manual intervention.
Passphrase-Protected SSH Keys#
Symptom: SFTP operations fail with passphrase-protected SSH keys
Cause: Zerobyte explicitly rejects passphrase-protected SSH keys
Solution: Use SSH keys without passphrases or use password-based authentication
Custom Restic Parameters Issues#
Invalid or Unsupported Flags#
Symptom: Backup fails with restic error about unknown flag or option
Cause: User entered invalid restic flags or typos in the custom parameters field
Solution:
- Verify flags against the official restic backup command documentation
- Check for typos in flag names
- Ensure proper syntax with space-separated flags and values (e.g., --exclude-larger-than 100M)
Syntax Errors in Custom Parameters#
Symptom: Backup command fails or parameters not applied correctly
Cause: Incorrect spacing, missing dashes, or improper formatting
Solution: Use proper flag syntax:
- Enter one flag per line (e.g., --ignore-inode on one line, --ignore-ctime on another)
- Use either --flag-name value or --flag-name=value format
- Avoid extra spaces within flag names
- Ensure flags start with -- (double dash) or - (single dash) as appropriate
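Before pasting parameters into the field, the syntax rules above can be checked with a small filter. The function lint_params is a hypothetical helper, not part of Zerobyte or restic; it only checks the shape of each line and cannot tell whether restic actually knows the flag.

```shell
# lint_params: read custom parameters on stdin, one per line, and print every
# non-blank line that does not look like "-f", "--flag", "--flag value", or "--flag=value".
lint_params() {
  grep -Ev '^[[:space:]]*$' \
    | grep -Ev '^--?[A-Za-z0-9][A-Za-z0-9-]*([ =].+)?$'
}

printf '%s\n' \
  '--ignore-inode' \
  'ignore-ctime' \
  '-- read-concurrency 8' \
  '--exclude-larger-than 500M' \
  | lint_params
```

The example prints the two malformed lines (the one missing its dashes and the one with a space after the dashes) and stays silent about the valid ones.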
Conflicts with Existing Settings#
Symptom: Unexpected behavior, backup failures, or settings not taking effect as expected
Cause: Custom parameters conflict with Zerobyte's built-in settings (e.g., exclude patterns, oneFileSystem option)
Solution:
- Review what Zerobyte already configures through the UI (exclude patterns, include patterns, one-file-system option)
- Avoid duplicating flags that Zerobyte sets automatically
- If unsure, test without custom parameters first, then add them incrementally
Common Use Cases#
Custom restic parameters are useful for advanced scenarios:
- Ignore inode/ctime changes: Use --ignore-inode and --ignore-ctime for filesystems where these attributes change without content changes
- Skip large files: Use --exclude-larger-than 500M to exclude files larger than a specified size
- Improve performance on large directories: Use --no-scan to skip the initial scan that estimates the backup size
- Adjust concurrent reads: Use --read-concurrency 8 to control how many files are read in parallel
Example configuration:
--ignore-inode
--ignore-ctime
--exclude-larger-than 500M
--read-concurrency 8
Best Practices#
- Test first: Try custom parameters on a small backup schedule before applying to production backups
- Use with caution: The UI notes that custom parameters are appended as-is to the restic command
- Check the UI first: Review warning or error details in the Schedule Summary page's collapsible component before checking container logs
- Isolate issues: Remove custom parameters temporarily if backups fail, to determine whether the parameters are the cause
- One flag per line: Enter each parameter on a separate line in the text field for clarity
Restore Issues#
Permission Errors During Restore#
Symptom: Restore operations fail with lchown/chmod permission errors on NFS, SMB, SFTP, or unprivileged LXC filesystems
Cause: Filesystems don't allow permission changes
Solution: In v0.13.0+, use the "Exclude xattrs" option in Advanced settings. For example, add:
system.nfs4_acl_xdr
Special Characters in Filenames#
Symptom: Restoring files with special characters (like the Polish letter ó) fails with errors like 'no matching ID found for prefix "\b__bunstr_0"'
Status: Fixed in v0.19.3-beta.1
Restore to Windows via SFTP#
Symptom: Restoring to Windows machines via SFTP throws lchown errors
Cause: Attempts to set Unix ownership on Windows
Note: The restore may actually work despite the errors
Restore Target Path Not Allowed#
Symptom: "Restore target path is not allowed. Restoring to this path could overwrite critical system files or application data."
Cause: You are attempting to restore a snapshot to a path that is protected to prevent accidental overwriting of critical system files or application data.
Blocked paths include:
- Repository base directory (/var/lib/zerobyte/repositories)
- Restic cache directory (/var/lib/zerobyte/cache)
- SSH keys directory (/var/lib/zerobyte/ssh)
- Rclone config directory (/root/.config/rclone)
- Application directory (/app)
- Database directory
- Restic password file directory
- System temp directory
- Provisioning directory (if configured)
Solution: Choose a different restore target path that doesn't overlap with these protected system directories. For example, restore to a user data directory or a dedicated restore location outside of Zerobyte's internal directories.
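A candidate target can be checked against the protected directories before starting a restore. The sketch below covers only the entries the list gives explicit paths for, and it treats "overlap" as either direction (the target inside a blocked directory, or a blocked directory inside the target); that interpretation and the function name restore_target_allowed are assumptions.

```shell
# Blocked directories with explicit paths in the list above.
BLOCKED_PATHS="/var/lib/zerobyte/repositories /var/lib/zerobyte/cache /var/lib/zerobyte/ssh /root/.config/rclone /app"

# restore_target_allowed PATH: fails when PATH is, contains, or lies inside a blocked directory.
restore_target_allowed() {
  rta_target="${1%/}"   # drop one trailing slash; "/" becomes the empty string
  for rta_blocked in $BLOCKED_PATHS; do
    # Target equals the blocked directory or lies inside it.
    case "$rta_target/" in "$rta_blocked"/*) return 1 ;; esac
    # Blocked directory lies inside the target.
    case "$rta_blocked/" in "$rta_target"/*) return 1 ;; esac
  done
  return 0
}

restore_target_allowed /mnt/restore && echo "/mnt/restore: allowed"
restore_target_allowed /var/lib/zerobyte/cache/tmp || echo "/var/lib/zerobyte/cache/tmp: blocked"
```

Comparing with a trailing slash keeps a sibling such as /application from being mistaken for /app.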
Performance and Timeout Issues#
Slow Repository Response#
Symptom: Repositories on slow backends time out after 60 seconds, causing "failed to fetch" errors when loading snapshots
Solution: Increase timeout with environment variable:
environment:
  SERVER_IDLE_TIMEOUT: 120
Note: Default increased from 10s to 60s in v0.20.0
SSH Connection Timeouts#
Symptom: Large backups over SFTP fail with "connection closed by remote host" after transferring significant data (10-100GB)
Status: Fixed in v0.19.4-alpha.1+ with SSH keepalive settings (ServerAliveInterval 60, ServerAliveCountMax 240)
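As a rough worked example of what those two values mean together: OpenSSH sends a keepalive probe after ServerAliveInterval seconds of silence and gives up after ServerAliveCountMax unanswered probes, so the product is approximately how long an unresponsive server is tolerated. The variable names below are just for illustration.

```shell
# Values quoted in the fix above.
interval=60      # ServerAliveInterval, in seconds
count_max=240    # ServerAliveCountMax, unanswered probes tolerated

tolerance=$((interval * count_max))
echo "tolerated silence: ${tolerance}s (~$((tolerance / 3600))h)"
# prints: tolerated silence: 14400s (~4h)
```

The same options can be tried on a manual connection with ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=240 user@host when testing an SFTP target outside Zerobyte.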
Restore Speed Degradation#
Symptom: SFTP restore speeds degraded from ~1GB/min in v0.22.0 to ~0.1GB/min in v0.24.2
Status: Fixed in v0.27.0 - restore speeds returned to normal (~1.8GB/min)
Doctor Operation Endless Loop#
Symptom: After upgrading to v0.24.0, repositories get stuck in endless "Doctor is running" loop. Canceling fails and prevents backups from running.
Solution: Update to v0.24.2 with proper APP_SECRET and BASE_URL environment variables
Rclone Issues#
Rclone Volume Mount Failures#
Symptom: Rclone volumes fail to mount with "Daemon timed out" errors even when the repository backend works
Note: This appears to be confusion between rclone volumes (mounting remote storage as a volume) vs rclone repositories (using rclone as backup destination)
Rclone SFTP Timeout#
Symptom: Rclone with SFTP backend times out when mounting volumes
Status: Fixed in PR #575. Rclone mount operations now use the configurable SERVER_IDLE_TIMEOUT setting (default 300 seconds) instead of a fixed timeout.
Solution: For slow SFTP connections or large mount operations, increase the timeout with an environment variable:
environment:
  SERVER_IDLE_TIMEOUT: 600
Note: The rclone backend timeout is aligned with other backend operations that use the centralized timeout configuration.
Certificate Validation Errors#
Symptom: Rclone operations fail with SSL/TLS certificate verification errors when using self-signed certificates or untrusted certificate authorities
Cause: By default, rclone validates SSL/TLS certificates. Self-signed certificates or certificates from untrusted CAs will be rejected.
Solution: Enable the insecureTls option in your rclone backend configuration. This sets the RCLONE_NO_CHECK_CERTIFICATE environment variable to disable certificate verification.
Configuration Example:
backend: rclone
remote: my-remote
path: /backups
insecureTls: true
Security Warning: Disabling certificate verification should only be done in trusted environments or when you specifically need to use self-signed certificates. This reduces security by making connections vulnerable to man-in-the-middle attacks.
When to Use:
- Using self-signed certificates in a controlled environment
- Testing with certificate authorities not in the system trust store
- Internal networks where certificate verification is not critical
When NOT to Use:
- Production environments with properly signed certificates
- Public networks or untrusted connections
- Any scenario where connection security is critical
Storage Backend Issues#
S3 Glacier/Cold Storage#
Symptom: Operations fail with "operation is not valid for the object's storage class" errors
Cause: Archival storage classes like S3 Glacier Deep Archive are not supported. Objects must be accessible immediately.
Solution: Do not use lifecycle rules that move objects to Glacier or other cold storage tiers
Diagnostic Features#
Zerobyte includes several built-in diagnostic features to help troubleshoot issues:
UI-Based Diagnostics#
Backup Status Details: The Schedule Summary page displays detailed warning and error information directly in the UI through collapsible components.
Error Display Structure: Zerobyte separates error information into two tiers to provide actionable troubleshooting information:
- Error Summary: A high-level description (e.g., "Command failed")
- Diagnostic Details: Specific stderr output from restic showing the root cause
What's Shown:
- Warning details: Displays stderr output when backups complete with read errors (exit code 3), showing which files couldn't be read and why
- Error details: Shows detailed diagnostic information when backups fail completely, extracted from restic's stderr rather than just generic error codes
- Real-time visibility: No need to check container logs for common backup issues
Logging System#
Zerobyte uses a custom Consola-based logger with:
- Automatic credential sanitization in production mode
- Configurable log levels via the LOG_LEVEL environment variable
- Structured output with timestamps and color coding
Restic Error Codes#
The ResticError class maps Restic exit codes to human-readable error summaries and separates them from diagnostic details:
- Exit code 1: Command failed
- Exit code 2: Go runtime error
- Exit code 3: Backup could not read all files
- Exit code 10: Repository not found
- Exit code 11: Failed to lock repository
- Exit code 12: Wrong repository password
- Exit code 130: Backup interrupted
The error display shows both the high-level summary and the specific diagnostic information from stderr to help with troubleshooting.
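When running restic by hand or reading raw exit codes in logs, the same mapping can be reproduced with a small lookup. This is a sketch of the table above, not the ResticError class itself; the function name and the fallback message for unlisted codes are assumptions.

```shell
# restic_exit_summary CODE: print the summary listed above for a restic exit code.
restic_exit_summary() {
  case "$1" in
    1)   echo "Command failed" ;;
    2)   echo "Go runtime error" ;;
    3)   echo "Backup could not read all files" ;;
    10)  echo "Repository not found" ;;
    11)  echo "Failed to lock repository" ;;
    12)  echo "Wrong repository password" ;;
    130) echo "Backup interrupted" ;;
    *)   echo "Unrecognized exit code $1" ;;   # fallback wording is an assumption
  esac
}

restic_exit_summary 11   # prints: Failed to lock repository
```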
Repository Health Checks#
Use the "Doctor" button in the UI to:
- Unlock stale repository locks
- Repair repository index
- Verify repository integrity
Auto-Remediation Jobs#
The application includes automatic recovery mechanisms:
- Auto-remount: Automatically retries mounting volumes in error state
- Dangling mount cleanup: Detects and removes orphaned mounts
- Repository health checks: Periodic validation of repository status
CLI Tools#
Administrative command-line utilities:
docker exec -it zerobyte bun run cli
Available operations:
- Password reset
- Username changes
- 2FA management (disable, rekey)
- Organization assignment
Frequently Asked Questions#
Q: Why can't I create an admin user?
Ensure BASE_URL matches the actual IP/hostname you're using to access Zerobyte. Using http://localhost:4096 fails when accessing from another machine.
Q: What Docker capabilities do I need?
At minimum: cap_add: SYS_ADMIN and devices: /dev/fuse. For SMB/NFS/SFTP, you may also need SYS_PTRACE and security options.
Q: Can I use cold storage like S3 Glacier?
No, archival storage classes are not supported. Objects must be accessible immediately.
Q: Why do snapshots fail to load?
Usually due to slow repository response or timeout issues. Try increasing SERVER_IDLE_TIMEOUT to 120 or more.
Q: How do I handle password requirements?
Password requirements are hardcoded (8 character minimum) and cannot be customized.
Additional Resources#
- Comprehensive TROUBLESHOOTING.md guide - 465-line guide covering permission issues, FUSE mounts, rclone configuration, and security contexts
- Restic Troubleshooting Guide - Official Restic documentation for repository issues