Original Research

Encoding Format Comparison

By Michael Lip · Published April 10, 2026 · Data source: Stack Overflow API, RFC specifications · Last updated:
12
Formats Compared
1,290
SO Votes (Top Q)
1.25x
Lowest Overhead
6x+
Highest Overhead

Every encoding format makes a tradeoff between size overhead, character safety, and compatibility. Base64 adds 33% overhead but works everywhere. Hex doubles the size but is human-readable. URL encoding varies wildly based on input content. This guide compares 12 encoding formats with real overhead calculations and browser support data.

Encoding Format Comparison Table

Format Overhead Ratio Character Set Browser API Reversible Primary Use Cases Standard
Base64 1.33x A-Z a-z 0-9 + / = btoa() / atob() Yes Email (MIME), data URIs, API payloads, embedded images RFC 4648
Base64url 1.33x A-Z a-z 0-9 - _ Manual (replace +/ with -_) Yes JWT tokens, URL parameters, filenames RFC 4648 S5
Hex (Base16) 2.00x 0-9 A-F Manual (toString(16)) Yes Hash digests, color codes, MAC addresses, debugging RFC 4648
ASCII85 (Base85) 1.25x 33-117 ASCII range No native API Yes PDF streams, Git binary patches, PostScript btoa format
Base32 1.60x A-Z 2-7 = No native API Yes TOTP/HOTP secrets, case-insensitive systems, DNS RFC 4648
URL Encoding (%) 1x-3x* A-Z a-z 0-9 - _ . ~ %HH encodeURIComponent() Yes URL query parameters, form data, path segments RFC 3986
HTML Entities 1x-6x* Named (&) or numeric (&) DOM textContent (implicit) Yes HTML content, preventing XSS, special chars in markup HTML5 spec
JSON String Escaping 1x-6x* Unicode escape: \uXXXX JSON.stringify() Yes JSON values, API responses, config files RFC 8259
JWT (JSON Web Token) ~1.5x Base64url segments separated by dots No native API Decode yes, verify needs key Authentication tokens, API authorization, SSO RFC 7519
UUencode 1.37x 32-95 ASCII range No native API Yes Legacy email attachments, Usenet (mostly deprecated) IEEE Std 1003.1
Quoted-Printable 1x-3x* Printable ASCII + =HH escapes No native API Yes Email headers, MIME, mostly-ASCII text RFC 2045
Punycode 1x-5x* a-z 0-9 - (ASCII-compatible) URL constructor (implicit) Yes Internationalized domain names (IDN), Unicode in DNS RFC 3492

* Variable overhead depends on input content. URL encoding is 1x for ASCII alphanumerics, 3x for special characters. HTML entities range from 1x (no special chars) to 6x+ (all special chars like & = 5 chars for 1 char).

Size Overhead Comparison (1 KB Input)

Format Binary Input (1 KB) ASCII Text Input (1 KB) Unicode Text Input (1 KB) Encoding Speed
ASCII851,280 B1,024 B (passthrough)1,280 BFast
Base641,368 B1,368 B1,368 BVery Fast
Base64url1,368 B1,368 B1,368 BVery Fast
UUencode1,406 B1,406 B1,406 BFast
Base321,640 B1,640 B1,640 BFast
URL Encoding3,072 B~1,050 B~2,800 BVery Fast
Hex2,048 B2,048 B2,048 BVery Fast
Quoted-Printable3,072 B~1,030 B~2,500 BFast

Community Questions from Stack Overflow

Real encoding questions from developers, sourced from the Stack Overflow API (sorted by votes):

1,290
408
382
236
223
155
112

Methodology

  • Overhead ratios — Calculated from the mathematical encoding algorithms as defined in their respective RFCs. Base64: ceil(n/3)*4, Hex: n*2, Base32: ceil(n/5)*8, ASCII85: ceil(n/4)*5
  • Size comparison — Measured using 1 KB test inputs: random binary data, ASCII-only English text, and mixed Unicode text (CJK + emoji + Latin)
  • Browser support — Verified against MDN Web Docs compatibility tables for native API availability
  • Stack Overflow data — Fetched via api.stackexchange.com/2.3/search?intitle=base64+encoding on April 10, 2026

Frequently Asked Questions

What is the overhead of Base64 encoding?

Base64 encoding has a fixed overhead ratio of approximately 1.33x (4/3). Every 3 bytes of input produce 4 bytes of output, plus padding characters. For a 1 MB file, the Base64 encoded version will be approximately 1.33 MB. Base64url variant uses - and _ instead of + and / to be URL-safe.

When should I use Hex encoding vs Base64?

Use Hex encoding when human readability matters (hash digests, color codes, MAC addresses) or when output must use only alphanumeric characters. Use Base64 when space efficiency matters more — Base64 has 1.33x overhead vs Hex's 2x overhead. For embedding binary data in JSON, XML, or email, Base64 is the standard.

Why does URL encoding sometimes encode spaces as + and sometimes as %20?

There are two different URL encoding standards. 'application/x-www-form-urlencoded' (used in HTML form submissions) encodes spaces as +, while RFC 3986 percent-encoding uses %20. The + convention comes from the original HTML form specification. JavaScript's encodeURIComponent() produces %20. PHP's urlencode() uses +, while rawurlencode() uses %20.

What encoding format has the lowest overhead?

ASCII85 (also known as Base85) has the lowest overhead among common text-based binary encodings at 1.25x (5 bytes output per 4 bytes input). This compares to Base64 at 1.33x and Hex at 2.0x. However, ASCII85 uses a wider character set including special characters, making it less suitable for URLs or XML.

Is it safe to put Base64 data in a URL?

Standard Base64 is NOT URL-safe because it uses + and / characters which have special meanings in URLs. Use Base64url encoding instead, which replaces + with - and / with _. JWT tokens use Base64url encoding for this reason. Most URLs have practical length limits of 2,048-8,192 characters, so very large Base64 payloads should use POST body instead.

Related Tools

Base64 Encoder/Decoder URL Encoder/Decoder HTML Encoder/Decoder JWT Decoder

Free to use under CC BY 4.0 license. Cite this page when sharing.