You have done it thousands of times. Click a button, pick a file, watch a progress bar crawl across the screen. Maybe you have even cursed at that bar when it stalls at 99% on a slow connection. But have you ever wondered what is actually happening during those seconds (or minutes) while your file makes its journey to the cloud?
The truth is, modern file uploads are far more sophisticated than most developers realize. Behind that simple progress bar lies a complex orchestration of chunking algorithms, parallel connections, integrity verification, and recovery mechanisms. These are the same patterns that power Google Drive, Dropbox, AWS S3, and every major cloud storage platform handling billions of uploads daily.
Let us pull back the curtain and explore what really happens when you upload a file.
The Problem with Simple Uploads
In the early days of the web, uploading a file was straightforward: pack the entire file into an HTTP POST request and send it to the server. This approach works fine for small files, but it falls apart quickly when files get larger or networks get flaky.
Consider what happens when you try to upload a 500 MB video using a traditional single-request approach. If your connection drops at 95%, you lose everything and have to start over. On mobile networks, where connections are inherently unstable, this becomes a nightmare. Even on stable connections, the browser has to hold the entire file in memory, potentially causing performance issues.
The major cloud platforms solved these problems years ago. Today, every serious file upload system uses a combination of three core techniques: chunked uploads, resumability, and parallel transfers. These are not optional optimizations. They are table stakes for any production upload system.
Breaking Files into Chunks
The first insight is simple but powerful: instead of sending a file as one giant blob, break it into smaller pieces. Each piece (called a chunk or part) can be uploaded independently, verified independently, and retried independently if something goes wrong.
But what size should each chunk be? This is where things get interesting. Different platforms have landed on different answers based on their specific use cases:
- Google Drive requires chunks to be multiples of 256 KiB, recommending 8 MiB or larger for performance
- Dropbox uses fixed 4 MB blocks, which aligns with their deduplication system
- AWS S3 allows 5 MB to 5 GB per part, with 100 MB being optimal for most scenarios
- Azure Blob Storage supports blocks up to 4000 MiB (with modern API versions), with up to 50,000 blocks per blob
- Backblaze B2 recommends 100 MB parts for large files
The choice of chunk size involves real tradeoffs. Smaller chunks mean faster recovery when something fails (you only lose a small piece), but they add overhead from more HTTP requests. Larger chunks are more efficient for transfers but riskier on unstable connections. Most systems adapt their chunk size based on network conditions, using smaller chunks on mobile connections and larger ones on stable high-bandwidth networks.
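To make that concrete, here is a minimal browser-side sketch. The thresholds in pickChunkSize are illustrative assumptions rather than any platform's official values; the important detail is that Blob.slice is lazy, so splitting a file into chunks does not copy it into memory.
// Illustrative chunk sizes only; real systems tune these per platform and network
function pickChunkSize(connectionType) {
  if (connectionType === "cellular") return 1 * 1024 * 1024; // 1 MiB on unstable links
  if (connectionType === "wifi") return 4 * 1024 * 1024;     // 4 MiB
  return 8 * 1024 * 1024;                                    // 8 MiB on stable broadband
}
function splitIntoChunks(file, chunkSize) {
  const chunks = [];
  for (let offset = 0; offset < file.size; offset += chunkSize) {
    // Blob.slice is lazy: bytes are only read when the chunk is actually sent
    chunks.push({
      index: chunks.length,
      offset,
      blob: file.slice(offset, Math.min(offset + chunkSize, file.size)),
    });
  }
  return chunks;
}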
The Magic of Resumable Uploads
Chunking alone does not solve the resumability problem. You also need a way to track which chunks have been successfully uploaded and pick up where you left off after an interruption. This is where upload sessions come in.
Here is how Google Drive's resumable upload protocol works, which has become something of an industry reference:
1. INITIATE SESSION
POST /upload/drive/v3/files?uploadType=resumable
Response: Location header with session URI
2. UPLOAD CHUNKS
PUT {sessionUri}
Content-Range: bytes 0-524287/2097152
Response: 308 Resume Incomplete
3. ON INTERRUPTION - QUERY PROGRESS
PUT {sessionUri}
Content-Range: bytes */*
Response: 308, Range: bytes=0-524287
4. RESUME FROM LAST BYTE
PUT {sessionUri}
Content-Range: bytes 524288-2097151/2097152
The beauty of this approach is that the session URI acts as a stateful handle to an in-progress upload. You can close your browser, restart your computer, switch networks, and still resume the upload days later (Google keeps sessions alive for a week). The server tracks which bytes it has received, and the client can query that progress at any time.
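As a rough client-side sketch (status codes and headers as in the protocol above, error handling stripped down, and the final PUT could itself be chunked further), resuming might look like this:
// Probe the session for progress, then resume from the first missing byte
async function resumeUpload(sessionUri, file) {
  const probe = await fetch(sessionUri, {
    method: "PUT",
    headers: { "Content-Range": `bytes */${file.size}` },
  });
  if (probe.ok) return; // 200/201: the upload already finished
  let nextByte = 0;
  if (probe.status === 308) {
    // Range header looks like "bytes=0-524287"; resume right after the last byte received
    const range = probe.headers.get("Range");
    if (range) nextByte = Number(range.split("-")[1]) + 1;
  }
  await fetch(sessionUri, {
    method: "PUT",
    headers: { "Content-Range": `bytes ${nextByte}-${file.size - 1}/${file.size}` },
    body: file.slice(nextByte),
  });
}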
Microsoft OneDrive and Box use similar session-based approaches, though with their own protocol variations. The open-source tus protocol standardizes these patterns and is supported by client libraries in over 20 programming languages.
Parallel Uploads: The Speed Multiplier
Here is a question that seems obvious in hindsight: if you are uploading a file in chunks, why upload them one at a time? Network connections are designed to handle multiple simultaneous streams. By uploading several chunks in parallel, you can dramatically increase throughput.
AWS Transfer Acceleration combined with multipart uploads can provide significant performance improvements. In typical scenarios with files over 100 MB, the combination of parallelization and edge optimization can reduce upload times by 50% or more compared to traditional single-part uploads.
The optimal number of parallel connections depends on your environment: 4 to 6 for desktop browsers, 2 to 4 for mobile on WiFi, and 1 to 2 for cellular connections. Going higher than this typically does not help because you hit network bandwidth limits rather than connection limits.
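A bounded worker pool is enough to get this behavior in the browser. In this sketch, uploadChunk is a placeholder for whatever per-chunk request your protocol uses:
// Upload chunks with at most `concurrency` requests in flight at a time
async function uploadInParallel(chunks, uploadChunk, concurrency = 4) {
  let next = 0;
  async function worker() {
    while (next < chunks.length) {
      const index = next++; // JavaScript is single-threaded, so claiming an index is safe
      await uploadChunk(chunks[index], index);
    }
  }
  // Start the workers; each keeps pulling chunks until the queue is drained
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
}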
Note: With parallel uploads, chunks can arrive at the server out of order. The server stores them temporarily and assembles them into the final file only after all chunks are received and verified. This is why the "commit" or "finalize" step is essential in multipart upload protocols.
How Dropbox Revolutionized Deduplication
While most platforms focus on reliable transfer, Dropbox took chunking a step further with block-level deduplication. Instead of just breaking files into arbitrary pieces, they use content-defined chunking with SHA-256 hashes to identify unique blocks.
Here is the clever part: before uploading any chunk, the client sends its hash to the server and asks, "Do you already have this?" If Dropbox has seen that exact sequence of bytes before (from any user, in any file), the chunk does not need to be uploaded at all. The server just references the existing copy.
This has massive implications. If you upload a 100 MB file and only 12 MB of it is truly new content, you only upload 12 MB. When millions of users share similar files (think common software installers, stock photos, or popular documents), the storage and bandwidth savings are enormous.
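The flow looks roughly like this (the /blocks endpoints here are hypothetical, not Dropbox's actual API; the hash-first pattern is the point):
// Hash the chunk locally, ask the server if it already has it, upload only if not
async function uploadIfUnknown(chunkBlob) {
  const digest = await crypto.subtle.digest("SHA-256", await chunkBlob.arrayBuffer());
  const hashHex = [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
  const check = await fetch(`/blocks/check?hash=${hashHex}`); // hypothetical endpoint
  const { exists } = await check.json();
  if (!exists) {
    await fetch(`/blocks/${hashHex}`, { method: "PUT", body: chunkBlob });
  }
  return hashHex; // the file manifest references blocks by hash, not by content
}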
Dropbox built their own storage system called Magic Pocket to support this architecture at exabyte scale. The system includes specialized hardware with SMR (Shingled Magnetic Recording) drives, intelligent tiering between hot and cold storage, and redundancy through erasure coding rather than simple replication.
Delta Sync: Uploading Only What Changed
Microsoft OneDrive takes efficiency in a different direction with Remote Differential Compression (RDC). Rather than deduplicating across all users, they focus on syncing changes to files you already have in the cloud.
Imagine you have a 50 MB PowerPoint presentation synced to OneDrive. You add one new slide and save. With a naive approach, you would re-upload all 50 MB. With RDC, OneDrive compares the local and remote versions using rolling checksums, identifies the specific bytes that changed, and uploads only the delta. That one new slide might only be a few hundred kilobytes.
Originally optimized for Office files, OneDrive has progressively extended differential sync to support all file types, including images, videos, PDFs, and ZIP archives. The impact on bandwidth consumption, especially for large files that get frequent small updates, is substantial.
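The trick that makes this cheap is the rolling checksum. Here is a small rsync-style weak hash as a sketch of the idea (RDC's actual signature format differs): sliding the window forward by one byte updates the checksum in constant time, so the client can scan a large file for regions the server already has without rehashing every window from scratch.
const MOD = 65521; // largest prime below 2^16, as in Adler-32
// Weak checksum of bytes[start .. start + length - 1]
function weakChecksum(bytes, start, length) {
  let a = 0, b = 0;
  for (let i = 0; i < length; i++) {
    a = (a + bytes[start + i]) % MOD;
    b = (b + a) % MOD;
  }
  return { a, b };
}
// Slide the window one byte to the right in O(1): drop `outgoing`, add `incoming`
function roll({ a, b }, outgoing, incoming, windowLength) {
  const a2 = (a - outgoing + incoming + MOD) % MOD;
  const b2 = (b - ((windowLength * outgoing) % MOD) + a2 + 2 * MOD) % MOD;
  return { a: a2, b: b2 };
}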
Presigned URLs: The Security Pattern Everyone Uses
There is a subtle but important security challenge with file uploads: how do you let a browser upload directly to cloud storage without exposing your credentials?
The answer, used by AWS S3, Azure, Google Cloud Storage, and most modern platforms, is presigned URLs. Your backend server (which has credentials) generates a special URL that grants temporary, limited permission to upload a specific file. The browser uses this URL to upload directly to cloud storage without ever seeing your secret keys.
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
const s3Client = new S3Client({ region: "us-east-1" });
// Backend generates a secure, time-limited upload URL
const command = new PutObjectCommand({
Bucket: "my-bucket",
Key: "uploads/user-123/document.pdf",
ContentType: "application/pdf",
});
const presignedUrl = await getSignedUrl(s3Client, command, {
expiresIn: 900, // 15 minutes
});
// Browser uploads directly to S3 using this URL
// No credentials exposed to the client
The security best practices here are critical. Keep expiration times short (5 to 15 minutes for uploads). Generate URLs only on the server, never in client code. Include content type and size restrictions in the signature. Monitor usage with audit logs. The presigned URL pattern is elegant, but misconfigured implementations have led to real security breaches.
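On the browser side, the upload itself is then a single PUT to the presigned URL. A sketch, assuming a hypothetical /api/presign endpoint on your backend that runs the code above:
async function uploadWithPresignedUrl(file) {
  // Ask the backend for a presigned URL (endpoint name is an assumption)
  const res = await fetch("/api/presign", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name: file.name, type: file.type }),
  });
  const { presignedUrl } = await res.json();
  // PUT directly to cloud storage; the Content-Type must match what was signed
  const upload = await fetch(presignedUrl, {
    method: "PUT",
    headers: { "Content-Type": file.type },
    body: file,
  });
  if (!upload.ok) throw new Error(`Upload failed with status ${upload.status}`);
}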
Zero-Knowledge Encryption: MEGA's Approach
Most cloud storage providers encrypt your files in transit and at rest, but they hold the encryption keys. This means they could, technically, access your data. MEGA took a fundamentally different approach with client-side, zero-knowledge encryption.
In MEGA's architecture, encryption happens entirely in your browser before any data leaves your device. Each file gets a random AES-128 key. That key is then encrypted with your master key (derived from your password via PBKDF2). The encrypted file and the encrypted file key are sent to MEGA's servers. But crucially, your master key never leaves your device.
The result is that MEGA genuinely cannot read your files. Even if compelled by law enforcement or compromised by attackers, they do not have the keys to decrypt anything. pCloud offers a similar opt-in "Crypto Folder" feature using RSA 4096-bit for key exchange and AES 256-bit for file encryption.
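Here is what the general pattern looks like with the Web Crypto API: a random per-file key encrypts the file, and only a wrapped (encrypted) copy of that key leaves the device. This is a sketch of the flow, not MEGA's exact ciphers, parameters, or key format.
async function encryptForUpload(file, password) {
  // Derive a master key from the password (salt and iteration count are illustrative)
  const salt = crypto.getRandomValues(new Uint8Array(16));
  const baseKey = await crypto.subtle.importKey(
    "raw", new TextEncoder().encode(password), "PBKDF2", false, ["deriveKey"]);
  const masterKey = await crypto.subtle.deriveKey(
    { name: "PBKDF2", salt, iterations: 600000, hash: "SHA-256" },
    baseKey, { name: "AES-GCM", length: 256 }, false, ["wrapKey"]);
  // Random per-file key encrypts the file contents
  const fileKey = await crypto.subtle.generateKey(
    { name: "AES-GCM", length: 128 }, true, ["encrypt"]);
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv }, fileKey, await file.arrayBuffer());
  // The file key is itself encrypted ("wrapped") with the master key before upload
  const keyIv = crypto.getRandomValues(new Uint8Array(12));
  const wrappedFileKey = await crypto.subtle.wrapKey(
    "raw", fileKey, masterKey, { name: "AES-GCM", iv: keyIv });
  // Only ciphertext and wrapped material leave the device; the password never does
  return { ciphertext, wrappedFileKey, salt, iv, keyIv };
}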
Warning: Zero-knowledge encryption comes with tradeoffs. Forget your password, and your data is gone forever. There is no "forgot password" recovery. Server-side features like search and preview cannot work because the server cannot read your files. It is a powerful tool, but not always the right one.
Integrity Verification: Trust, But Verify
With all these chunks flying around in parallel, potentially across unreliable networks, how do you ensure nothing got corrupted along the way? Every major platform implements integrity verification using cryptographic hashes.
The approach varies by provider. AWS S3 supports CRC32, CRC32C, SHA-1, and SHA-256 checksums. Box requires SHA-1 digests with every chunk and the final commit. Backblaze B2 mandates SHA-1 per part. The client calculates a hash of each chunk before sending, includes it in the request, and the server verifies the received data matches. If there is a mismatch, the chunk is rejected and must be re-sent.
Some platforms go further. pCloud uses Merkle trees (the same data structure that underlies Git and Bitcoin) to create a hierarchical verification structure where any change to any chunk invalidates the root hash. This makes tampering detection cryptographically robust.
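To see why that works, here is the Merkle idea in miniature (the concept only, not pCloud's actual format): hash every chunk, then hash pairs of hashes upward until a single root remains, so a change to any chunk changes the root.
async function sha256(bytes) {
  return new Uint8Array(await crypto.subtle.digest("SHA-256", bytes));
}
// Reduce a list of chunk hashes to a single root hash, level by level
async function merkleRoot(chunkHashes) {
  let level = chunkHashes;
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = level[i + 1] ?? left; // duplicate the last hash on odd-sized levels
      next.push(await sha256(new Uint8Array([...left, ...right])));
    }
    level = next;
  }
  return level[0];
}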
When Things Go Wrong: Retry Strategies
Network failures are inevitable. Servers have hiccups. Rate limits get hit. A robust upload system needs a sophisticated retry strategy. The industry has converged on exponential backoff with jitter:
// Helper function for async delay
const sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms));
async function uploadWithRetry(chunk, attempt = 0) {
const MAX_RETRIES = 5;
const BASE_DELAY = 1000; // 1 second
const MAX_DELAY = 60000; // 60 seconds
try {
return await uploadChunk(chunk);
} catch (error) {
if (isRetryable(error) && attempt < MAX_RETRIES) {
// Exponential delay: 1s, 2s, 4s, 8s, 16s...
let delay = Math.min(
BASE_DELAY * Math.pow(2, attempt),
MAX_DELAY
);
// Add jitter to prevent thundering herd
delay += Math.random() * delay * 0.1;
await sleep(delay);
return uploadWithRetry(chunk, attempt + 1);
}
throw error;
}
}
function isRetryable(error) {
const retryableCodes = [408, 429, 500, 502, 503, 504];
return retryableCodes.includes(error.status);
}
The jitter (adding randomness to the delay) is important. Without it, if a server comes back online after an outage, all waiting clients would retry at exactly the same moment, potentially overwhelming the server again. Jitter spreads out the retries to prevent this "thundering herd" problem.
The Complete Picture
Pulling this all together, a modern file upload involves a remarkable amount of engineering:
- Session initiation: Client requests an upload session from the server, receives a unique session identifier
- Chunking: The file is split into appropriately-sized pieces based on network conditions
- Hash calculation: SHA-256 (or similar) hash computed for each chunk for integrity verification
- Presigned URL generation: Server generates secure, temporary URLs for direct-to-storage uploads
- Parallel transfer: Multiple chunks upload simultaneously across several connections
- Progress tracking: Client persists upload state locally for recovery across sessions
- Integrity verification: Server validates each chunk's hash before accepting it
- Retry handling: Failed chunks automatically retry with exponential backoff
- Assembly: After all chunks arrive, server assembles them into the final file
- Finalization: Client commits the upload, server moves file to permanent storage
All of this happens behind that simple progress bar. The user sees "75% complete." The system sees a distributed state machine coordinating temporary storage, integrity checks, network recovery, and eventually consistent replication across data centers.
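One step from that list worth making concrete is progress tracking. If the client persists which chunks have been accepted, a browser refresh or crash does not throw away hours of work. A minimal sketch, where the storage key scheme and state shape are assumptions:
// Tie stored state to a specific file selection
function fingerprint(file) {
  return `${file.name}-${file.size}-${file.lastModified}`;
}
function saveUploadState(file, state) {
  // state: { sessionUri, chunkSize, completedChunks: [...] }
  localStorage.setItem(`upload:${fingerprint(file)}`, JSON.stringify(state));
}
function loadUploadState(file) {
  const raw = localStorage.getItem(`upload:${fingerprint(file)}`);
  return raw ? JSON.parse(raw) : null; // null means start a fresh session
}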
What This Means for Developers
If you are building an application that handles file uploads, you do not need to implement all of this from scratch. The tus protocol provides an open standard with implementations in every major language. Cloud providers like AWS, Azure, and Google Cloud offer SDKs that handle multipart uploads automatically. Services like Cloudinary, Uploadcare, and Transloadit provide complete managed solutions.
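For example, with the tus-js-client library, a resumable, retrying upload takes only a few lines. The endpoint below is a placeholder and option names may shift between versions, so check the library's documentation:
import * as tus from "tus-js-client";
function startTusUpload(file) {
  const upload = new tus.Upload(file, {
    endpoint: "https://tus.example.com/files/", // placeholder tus server
    retryDelays: [0, 3000, 5000, 10000, 30000], // built-in retry with backoff
    chunkSize: 8 * 1024 * 1024,
    metadata: { filename: file.name, filetype: file.type },
    onProgress: (sent, total) => console.log(`${((sent / total) * 100).toFixed(1)}%`),
    onError: (error) => console.error("Upload failed:", error),
    onSuccess: () => console.log("Upload finished at", upload.url),
  });
  upload.start();
  return upload;
}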
But understanding these patterns matters. It helps you choose the right tool for your use case. It helps you debug problems when uploads fail. And it gives you appreciation for the engineering that makes something as seemingly simple as "upload a file" actually work reliably at scale.
The best upload experience is one the user never has to think about. Fast on good connections. Resilient on bad ones. Recoverable when things go wrong.
Next time you click that upload button, you will know exactly what is happening in those seconds while the progress bar fills. And if it stalls at 99%? Well, at least now you know there is a session waiting for you to pick up where you left off.
Tip: Want to dive deeper? The external resources below include official documentation from Google Drive and AWS S3, plus Dropbox's engineering deep-dive on rebuilding their sync engine for billions of files.
