Nginx Large File Upload Configuration#
Overview#
Nginx Large File Upload Configuration refers to specific nginx reverse proxy settings optimized for handling multi-gigabyte file transfers in the CipherSwarm distributed password cracking platform. CipherSwarm uses nginx as a load balancer to distribute traffic across horizontally scaled web application replicas, with a critical requirement to support large file uploads including hash lists and wordlists that commonly exceed several gigabytes. These files are uploaded through Rails Active Storage and must traverse nginx's reverse proxy layer without timing out or exhausting filesystem resources.
The configuration challenge is particularly acute in CipherSwarm's primary deployment environment: airgapped laboratory and secure facility networks where password cracking operations occur in isolation from the Internet. In these environments, large wordlist databases and hash files must be uploaded locally rather than downloaded from external sources, placing the entire burden of file transfer on the internal infrastructure. Standard nginx configurations with default timeout and buffering settings fail under these conditions, producing timeout errors and filesystem exhaustion.
Three core nginx directives address these challenges: client_max_body_size to control maximum upload size, proxy_request_buffering to manage how nginx handles request bodies during proxying, and extended timeout values (proxy_send_timeout and proxy_read_timeout) to accommodate slow transfers. The optimal configuration of these directives evolved through operational experience, with significant improvements proposed in March 2026 after production deployments encountered silent upload failures caused by insufficient timeout values and filesystem exhaustion from request buffering.
Configuration Evolution and Current State#
Production Configuration (February-March 2026)#
The nginx reverse proxy configuration was introduced in February 2026 via PR #594, establishing load balancing infrastructure for horizontal scaling. The current production configuration in docker/nginx/nginx.conf includes:
- `client_max_body_size 100M` — limits HTTP request body size to 100 megabytes
- `proxy_send_timeout 60s` — 60-second timeout for sending requests to the backend
- `proxy_read_timeout 300s` — 5-minute timeout for reading responses from the backend
- No dedicated `/rails/active_storage/` location block
- No explicit `proxy_request_buffering` directive (defaults to `on`)
The configuration defines two location blocks: /cable for WebSocket connections with extended timeouts for persistent Action Cable connections, and / as a catch-all for general HTTP traffic including Active Storage uploads.
Proposed Hardening for Large File Uploads#
PR #748, opened March 16, 2026, proposes three critical configuration changes to address production issues with large file uploads:
- Unlimited body size: `client_max_body_size 0` — removes the 100MB upload limit to support multi-gigabyte wordlists and hash files
- Disable request buffering: `proxy_request_buffering off` — prevents nginx from buffering entire uploads to temporary files in `/var/cache/nginx/client_temp/`, which exhausts Docker overlay filesystem storage
- Extended timeouts: both `proxy_send_timeout` and `proxy_read_timeout` increased to 3600 seconds (1 hour) — accommodates slow transfers over internal networks
The changes were motivated by operational failures where Thruster's 30-second HTTP_READ_TIMEOUT caused silent upload failures with HTTP 502 errors. The solution removed Thruster as an intermediary layer, configured Puma to serve directly on port 80, and delegated HTTP/2 compression and caching to nginx. As of March 2026, these improvements remain under review and have not yet been merged into production.
Client Body Size Configuration#
The client_max_body_size directive controls the maximum size of the HTTP request body that nginx will accept from clients. This setting directly determines the largest file that can be uploaded through the reverse proxy. In CipherSwarm's context, this limit affects uploads of hash lists, wordlists, and rule files through Rails Active Storage.
Current Implementation#
The production configuration sets client_max_body_size 100M, limiting uploads to 100 megabytes. This constraint is problematic for password cracking operations, where common wordlist files exceed this threshold:
- rockyou.txt: 133 MB (basic wordlist, already exceeds limit)
- crackstation-human-only.txt: ~15 GB (comprehensive wordlist database)
- Custom hash lists: 10-50 GB typical for enterprise password audits
When a client attempts to upload a file exceeding the configured limit, nginx immediately rejects the request with an HTTP 413 "Payload Too Large" error before any data transfer occurs.
Recommended Configuration#
For production deployments supporting large file uploads, two approaches are viable:
```nginx
# Option 1: Unlimited upload size
client_max_body_size 0;

# Option 2: Explicit large limit matching storage capacity
client_max_body_size 50G;
```
Setting the value to 0 removes the limit entirely, allowing uploads of arbitrary size constrained only by backend storage capacity and timeout settings. Alternatively, operators can set an explicit large value matching their storage provisioning. The directive applies globally to all HTTP requests processed by the nginx server block.
Request Buffering Configuration#
The proxy_request_buffering directive controls whether nginx buffers the entire client request body to disk before forwarding it to the upstream backend server. This behavior has critical implications for large file uploads in containerized environments.
Default Behavior and Problems#
When proxy_request_buffering is enabled (nginx's default), the proxy follows this workflow:
1. Client initiates a file upload to nginx
2. Nginx writes the entire request body to temporary files in `/var/cache/nginx/client_temp/`
3. After the complete file is buffered to disk, nginx forwards it to the backend
4. Backend processes the upload and writes to final storage
In CipherSwarm deployments, this buffering behavior causes critical problems. The nginx container runs on Docker's overlay filesystem with limited storage. Multi-gigabyte uploads rapidly fill available disk space, causing container crashes and upload failures. Production logs showed "a client request body is buffered to a temporary file" warnings even for small uploads, indicating that all uploads were triggering disk buffering.
Streaming Mode#
Setting proxy_request_buffering off enables streaming mode, where nginx forwards request data to the backend in chunks as it arrives from the client, without intermediate disk storage:
```nginx
location / {
    proxy_pass http://web_backend;
    proxy_request_buffering off;  # Stream directly to backend
}
```
This configuration eliminates temporary file usage, preventing filesystem exhaustion. The trade-off is that nginx cannot retry failed requests to alternate backend servers when using non-idempotent HTTP methods (POST, PUT, PATCH), as partial data may have already been forwarded. For CipherSwarm's Active Storage uploads via HTTP PUT to /rails/active_storage/disk/* endpoints, this trade-off is acceptable — upload failures will be detected immediately rather than silently corrupting data.
Application Scope#
The directive can be applied globally in the server context or selectively in specific location blocks. For CipherSwarm, the proposed configuration applies it to all proxy locations, ensuring consistent streaming behavior for Active Storage uploads and other large request bodies.
Proxy Timeout Configuration#
Nginx proxy timeout directives control how long the reverse proxy waits during various phases of backend communication. Two directives are critical for large file upload support: proxy_send_timeout and proxy_read_timeout.
Send and Read Timeouts#
proxy_send_timeout specifies the maximum time nginx waits for the backend server to accept request data. This timeout is critical during file uploads, where large request bodies take extended periods to transmit from nginx to the backend. The current production configuration sets this to 60 seconds, which is insufficient for multi-gigabyte transfers.
proxy_read_timeout specifies the maximum time nginx waits for the backend to send response data. Production currently configures 300 seconds (5 minutes) with a comment noting "longer for potentially slow API responses". This timeout affects downloads of large hash lists and wordlists from the backend to clients.
When either timeout expires, nginx terminates the connection. Clients typically observe this as a connection reset or HTTP 502 Bad Gateway error, depending on when during the request lifecycle the timeout occurs.
Calculating Required Timeouts#
For large file uploads over internal networks, timeout requirements can be estimated from file size and network bandwidth:
Upload Time = File Size ÷ Network Bandwidth
For example:
- 10 GB file over 100 Mbps network: ~800 seconds
- 50 GB file over 1 Gbps network: ~400 seconds
The proposed configuration increases both timeouts to 3600 seconds (1 hour), providing substantial headroom for large transfers over slower network links common in laboratory environments. This value accommodates files up to approximately:
- 45 GB over 100 Mbps links
- 450 GB over 1 Gbps links
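These estimates are easy to script. A minimal POSIX shell sketch (the function name and the decimal-gigabyte convention are illustrative, not from the CipherSwarm codebase):

```shell
# Estimate transfer time in seconds for a file of a given size
# (decimal GB, 10^9 bytes) over a link of a given speed (Mbps):
#   seconds = size_gb * 8000 / mbps
# Integer arithmetic, so results are approximate.
upload_seconds() {
  size_gb=$1   # file size in gigabytes
  mbps=$2      # link bandwidth in megabits per second
  echo $(( size_gb * 8000 / mbps ))
}

upload_seconds 10 100    # 10 GB over 100 Mbps -> 800
upload_seconds 50 1000   # 50 GB over 1 Gbps  -> 400
```

A sensible proxy timeout is this estimate plus generous headroom; the proposed 3600-second value covers roughly 45 GB at 100 Mbps.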
WebSocket Exception#
The configuration includes a dedicated location block for WebSocket connections that requires different timeout behavior. The /cable location sets proxy_read_timeout 86400s (24 hours) to maintain persistent Action Cable connections for real-time dashboard updates. These connections remain open indefinitely with periodic heartbeat messages, requiring timeouts far longer than typical HTTP requests.
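A sketch of such a block, assuming the upstream name `web_backend` used elsewhere in this article (the exact production block may differ):

```nginx
location /cable {
    proxy_pass http://web_backend;
    proxy_http_version 1.1;

    # WebSocket upgrade handshake required by Action Cable
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;

    # Persistent connections stay open for up to 24 hours;
    # Action Cable heartbeats keep them from idling out
    proxy_read_timeout 86400s;
}
```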
Load Balancing and Active Storage Integration#
Upstream Configuration#
CipherSwarm's nginx configuration uses the least_conn load balancing algorithm, which routes incoming requests to the backend replica with the fewest active connections. This algorithm is particularly beneficial for large file uploads, where connections remain open for extended periods. Without connection-aware distribution, a single backend replica could become saturated with multiple simultaneous large uploads while other replicas remain idle.
The upstream block maintains a pool of 32 keepalive connections to backend servers, avoiding repeated TCP handshake overhead for sequential operations. This optimization is significant in airgapped environments where latency predictability allows aggressive connection reuse.
Health monitoring uses a dedicated /health endpoint on port 8080, decoupling nginx's health status from backend availability. This design prevents cascading failures where a single unhealthy backend replica causes the entire load balancer to report as unhealthy.
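Put together, the upstream pool described above might look like the following sketch (replica hostnames and the failure-detection parameters are illustrative assumptions):

```nginx
upstream web_backend {
    least_conn;     # route to the replica with the fewest active connections

    server web-1:80 max_fails=3 fail_timeout=30s;
    server web-2:80 max_fails=3 fail_timeout=30s;
    server web-3:80 max_fails=3 fail_timeout=30s;

    keepalive 32;   # pool of idle connections reused across requests
}
```

Note that upstream keepalive only takes effect when proxied requests use HTTP/1.1 with an empty `Connection` header, which the complete location example later in this article sets.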
Active Storage Upload Workflow#
Rails Active Storage implements direct uploads through a multi-step process that nginx must accommodate:
1. Client initiates an upload request through the Rails application
2. Rails generates a signed URL for direct upload and returns it to the client
3. Client performs an HTTP PUT to `/rails/active_storage/disk/[token]` with the file content
4. Nginx proxies the PUT request to a backend Puma server
5. Rails writes the uploaded file to the shared Docker volume at `/rails/storage`
This workflow requires nginx's large file upload configuration at step 4, where the PUT request body contains the entire file. The client_max_body_size, proxy_request_buffering, and timeout directives all apply during this proxying phase.
CipherSwarm uses local disk storage by default with Docker volumes shared between web replicas and Sidekiq background workers. This enables immediate file access for processing without network transfer delays. S3-compatible storage backends (MinIO, SeaweedFS, Garage) are supported as alternatives, but nginx's configuration requirements remain identical — the storage backend is transparent to the reverse proxy layer.
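If operators prefer to scope the relaxed limits to uploads rather than apply them globally, a dedicated location block is one option. The following is a sketch under that assumption, not necessarily the approach taken in PR #748:

```nginx
# Relax upload limits only for Active Storage endpoints, keeping
# stricter defaults on the catch-all / location.
location /rails/active_storage/ {
    proxy_pass http://web_backend;
    proxy_http_version 1.1;

    client_max_body_size 0;          # no cap on direct-upload PUT bodies
    proxy_request_buffering off;     # stream chunks straight to Puma
    proxy_send_timeout 3600s;
    proxy_read_timeout 3600s;

    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
```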
Upload Flow Diagram#
[Sequence diagram: complete Active Storage direct upload process with nginx configuration checkpoints]
Related Storage Considerations#
Filesystem Exhaustion Issues#
Several production issues in CipherSwarm relate to filesystem management for large file operations, complementing the nginx upload configuration:
Out-of-Memory Crashes: Issue #695 documented crashes when viewing large wordlists through the web interface. The Downloadable concern loaded entire files into memory before applying line limits, causing OOM kills within the 512MB web container memory limit. The solution in PR #697 implemented streaming file reads with configurable limits (1000 lines default, 5000 maximum, 5MB byte cap).
Sidekiq Temporary File Exhaustion: Issue #675 identified filesystem exhaustion when Sidekiq workers processed large Active Storage attachments. When ProcessHashListJob calls list.file.open, Rails streams blobs to temporary files under /tmp on the container's overlay filesystem. With multiple Sidekiq replicas running at default concurrency (10 threads), this rapidly fills available disk space. PR #748's solution implements tmpfs mounts at /tmp (blob downloads) and /rails/tmp (Bootsnap cache), with recommended sizing of 512MB-1GB depending on deployment scale.
Active Storage Retention: Issue #639 noted that uploaded hash list files remain indefinitely in Active Storage after being processed into individual hash items, causing unnecessary storage accumulation. This requires implementing automatic cleanup using ActiveStorage's purge or purge_later methods with appropriate retention policies.
Deployment in Airgapped Environments#
Network Characteristics#
CipherSwarm is designed for deployment in airgapped laboratory and secure facility networks where password cracking operations occur in isolation from the Internet. These environments present distinct considerations for nginx file upload configuration:
No Internet Connectivity: All hash lists, wordlists, and rule files must be uploaded locally rather than downloaded from external repositories. Standard password cracking operations rely on large wordlist databases (crackstation: 15+ GB, rockyou: 133 MB) that would typically be downloaded but must instead be uploaded through the web interface in airgapped deployments.
Predictable LAN Latency: Internal network connections exhibit stable, predictable latency characteristics compared to Internet transfers. This allows operators to tune timeout values more aggressively — a 10 GB upload over a 1 Gbps internal link consistently completes in approximately 80 seconds, whereas the same upload over Internet connections may vary by orders of magnitude due to route changes and congestion.
Fixed Infrastructure: Airgapped environments typically maintain static network topology and bandwidth characteristics. Operators can precisely calculate required timeout values based on measured network performance and expected file sizes, rather than accommodating worst-case Internet scenarios.
Storage Provisioning#
All storage capacity must be pre-provisioned on isolated systems before deployment. Planning considerations include:
- Hash lists: 10-50 GB typical for comprehensive enterprise password audits
- Wordlist databases: 15+ GB for standard collections (crackstation, weakpass)
- Application data: PostgreSQL database for campaign tracking, results storage, and system state
- Temporary processing space: Sidekiq worker tmpfs mounts (512MB-1GB per replica)
The nginx service itself requires minimal resources: 0.5 CPU and 128MB memory, with no significant disk storage for its core proxy functionality. However, without proper configuration (specifically proxy_request_buffering off), nginx can exhaust container filesystem space by buffering large uploads to temporary files.
Container Resource Limits#
Production deployments use constrained container resources that affect large file handling:
- Nginx: 0.5 CPU, 128MB memory — sufficient for proxy operations with streaming mode
- Web service: 512MB memory limit — constrains in-memory file operations like viewing
- Sidekiq workers: Minimum 1GB memory recommended to handle concurrent job processing
These limits necessitate the streaming approaches documented in both nginx configuration (proxy_request_buffering off) and application code (streaming file reads for viewing, tmpfs for temporary files).
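In compose terms, these constraints and the tmpfs mounts from PR #748 might be declared as follows. Service names and sizes are illustrative, not copied from docker-compose-production.yml:

```yaml
services:
  nginx:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 128M
  sidekiq:
    deploy:
      resources:
        limits:
          memory: 1G
    volumes:
      - type: tmpfs
        target: /tmp
        tmpfs:
          size: 1073741824   # 1 GiB for Active Storage blob downloads
      - type: tmpfs
        target: /rails/tmp
        tmpfs:
          size: 536870912    # 512 MiB for the Bootsnap cache
```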
Complete Configuration Example#
A comprehensive nginx location block incorporating all recommended settings for large file upload support:
```nginx
location / {
    proxy_pass http://web_backend;
    proxy_http_version 1.1;

    # Large file support
    client_max_body_size 0;        # Unlimited upload size
    proxy_request_buffering off;   # Stream directly to backend without disk buffering

    # Extended timeouts for large transfers
    proxy_send_timeout 3600s;      # 1 hour for uploading to backend
    proxy_read_timeout 3600s;      # 1 hour for reading response from backend

    # HTTP connection headers
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # Retry configuration for transient failures
    # Only retries GET/HEAD (idempotent methods); POST/PUT are never retried
    proxy_next_upstream error timeout http_502 http_503 http_504;
    proxy_next_upstream_timeout 30s;
    proxy_next_upstream_tries 3;
}
```
This configuration removes upload size limits, eliminates filesystem buffering overhead, and provides one-hour timeouts suitable for multi-gigabyte transfers over internal networks.
Monitoring and Troubleshooting#
Log Analysis#
Nginx access logs include detailed upstream timing metrics that reveal upload performance characteristics:
- `upstream_addr`: identifies which backend replica handled the request, useful for detecting uneven load distribution
- `upstream_response_time`: measures backend processing time, indicating slow replicas or resource contention
- `request_time`: total request duration including proxy overhead and network transfer time
Example log analysis commands for production deployments:
```shell
# View all requests with upstream timing information
docker compose -f docker-compose-production.yml logs nginx | grep upstream_response_time

# Identify slow uploads (backend response time exceeding 10 seconds)
docker compose -f docker-compose-production.yml logs nginx | awk '/upstream_response_time=[0-9]{2,}\./ {print}'
```
Common Upload Issues#
| Symptom | Root Cause | Solution |
|---|---|---|
| 413 Payload Too Large | client_max_body_size limit exceeded | Increase limit to accommodate largest expected file, or set to 0 for unlimited |
| Timeout after 60 seconds | proxy_send_timeout insufficient for upload duration | Increase to 3600s or calculate based on file size and bandwidth |
| Upload completes but file truncated | Request buffering re-enabled after config change | Verify proxy_request_buffering off is set; restart nginx service |
| Slow upload to specific replica | Uneven load distribution or replica resource constraint | Check least_conn algorithm configuration; monitor replica resource usage |
| 502 Bad Gateway during upload | Backend replica crashed or became unresponsive | Check backend logs for OOM kills or application errors; verify health check configuration |
| Nginx container disk full | Request buffering writing to /var/cache/nginx/client_temp/ | Disable request buffering with proxy_request_buffering off |
Performance Validation#
After implementing large file upload configuration changes, validate the setup with test uploads:
1. Upload a file approaching the expected maximum size (e.g., 10 GB)
2. Monitor nginx logs for buffering warnings: `grep "buffered to a temporary file" nginx_logs`
3. Verify the upload completes within the configured timeout period
4. Check nginx container disk usage: `docker exec nginx df -h`
5. Confirm the backend successfully receives and processes the file
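For the first step (producing a file near the expected maximum size), a sparse fixture avoids consuming real disk space on the operator workstation. The filename and size below are illustrative, and `stat -c` is the GNU coreutils form:

```shell
# Create a 10 GB sparse test file: truncate sets the apparent size
# without allocating data blocks, so creation is instant.
truncate -s 10G upload-test.bin

# Apparent size in bytes (10 * 1024^3 = 10737418240)
stat -c %s upload-test.bin
```

The fixture still transmits its full apparent size when uploaded (the sparse regions read as zeros), so it exercises the timeout and streaming paths realistically.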
Relevant Code Files#
| File | Purpose | Key Configurations |
|---|---|---|
| docker/nginx/nginx.conf | Main nginx reverse proxy configuration | client_max_body_size, proxy_send_timeout, proxy_read_timeout, upstream load balancing |
| docker-compose-production.yml | Production deployment orchestration | Docker volume mounts for shared storage, container resource limits, tmpfs configuration |
| config/storage.yml | Rails ActiveStorage backend selection | Local disk (local) vs S3-compatible storage (s3) configuration |
| config/environments/production.rb | Rails production environment settings | ActiveStorage service selection via ACTIVE_STORAGE_SERVICE environment variable |
Load Balancing Visualization#
[Diagram: nginx `least_conn` distribution of long-running upload connections across backend replicas]
Request Buffering Comparison#
[Diagram: comparison of buffered and streaming upload modes]
Related Topics#
- Rails Active Storage Direct Upload: Multi-step upload workflow with signed URLs that nginx must proxy transparently
- Docker Overlay Filesystem Performance: Storage layer constraints affecting nginx temporary file buffering and container disk usage
- Load Balancing Algorithms: comparison of `least_conn`, `round_robin`, and `ip_hash` for long-running connections
- Timeout Tuning Strategies: calculating appropriate timeout values based on file size, network bandwidth, and reliability requirements
- Hashcat Distributed Cracking: Password recovery operations that generate the large hash list and wordlist files requiring upload support