Text Encoding Explorer

See how characters are represented as bytes. Understand ASCII, UTF-8, and Unicode without diving into implementation details.

How Text Encoding Works

The Core Concept

Computers store everything as numbers. A character set assigns a number to each character, and a text encoding defines how that number is stored as bytes.
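As a rough illustration of the idea, here is a minimal Python sketch using only built-ins. `ord` returns the number assigned to a character and `str.encode` returns the bytes that store it. The helper name `describe` is made up for this example, and UTF-8 is assumed as the encoding.

```python
# Each character maps to a number; that number is stored as one or more bytes.
def describe(ch: str) -> tuple[int, list[int]]:
    """Return the character's number and the UTF-8 byte values that store it."""
    number = ord(ch)                    # the number assigned to the character
    stored = list(ch.encode("utf-8"))   # the byte values written to memory or disk
    return number, stored


print(describe("A"))  # (65, [65])
```

For 'A' the number and the stored byte happen to be the same value. That is not true for characters outside the ASCII range.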

ASCII (7-bit)

One of the earliest widely adopted standards. Uses the numbers 0-127 to represent basic English letters, digits, punctuation, and control codes. Each character fits in 7 bits and is stored in 1 byte.

Example
'A' = 65 = 01000001 in binary
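The same example can be reproduced in Python. This is only a sketch: `ascii_bits` is a hypothetical helper name, and padding to 8 binary digits is a display choice that matches the example above.

```python
def ascii_bits(ch: str) -> str:
    """Return the 8-digit binary form of an ASCII character (hypothetical helper)."""
    code = ord(ch)
    if code > 127:
        # Anything above 127 is outside the 7-bit ASCII range.
        raise ValueError(f"{ch!r} is outside the ASCII range 0-127")
    return format(code, "08b")  # zero-padded to 8 binary digits


print(ord("A"))         # 65
print(ascii_bits("A"))  # 01000001
```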

Unicode

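A universal catalog that aims to assign a unique number (code point) to every character in every writing system. Code points run from U+0000 to U+10FFFF and are written as U+ followed by four to six hexadecimal digits (e.g., U+0041 for 'A').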
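A short Python sketch of the U+XXXX notation, assuming the standard-library `unicodedata` module for looking up character names. `code_point_label` is a name invented for this example.

```python
import unicodedata


def code_point_label(ch: str) -> str:
    """Format a character's code point in U+XXXX notation (at least 4 hex digits)."""
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))  # U+0041
print(unicodedata.name("A"))  # LATIN CAPITAL LETTER A
```

Code points above U+FFFF simply print with more digits, so an emoji such as 😀 is labeled U+1F600.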

UTF-8

The most common encoding for storing Unicode text. It is variable-length: ASCII characters use 1 byte, and all other characters use 2 to 4 bytes.

Code Point Range        Bytes      Example
U+0000 to U+007F        1 byte     A, B, 1, 2
U+0080 to U+07FF        2 bytes    é, ñ, α
U+0800 to U+FFFF        3 bytes    中, 日, €
U+10000 to U+10FFFF     4 bytes    🌍, 😀
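The ranges in the table can be checked against Python's built-in UTF-8 encoder. The sketch below is illustrative only: `utf8_length` and `analyze` are hypothetical helper names, and the sample string is arbitrary.

```python
def utf8_length(code_point: int) -> int:
    """Number of UTF-8 bytes for a code point, following the ranges in the table."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        # Note: surrogates (U+D800 to U+DFFF) fall in this range but are not
        # valid characters, so str.encode("utf-8") rejects them.
        return 3
    return 4


def analyze(text: str) -> list[tuple[str, str, int, str]]:
    """Return (char, code point, byte count, hex bytes) for each character."""
    rows = []
    for ch in text:
        encoded = ch.encode("utf-8")
        rows.append((ch, f"U+{ord(ch):04X}", len(encoded), encoded.hex(" ").upper()))
    return rows


for row in analyze("Aé中😀"):
    print(row)
# ('A', 'U+0041', 1, '41')
# ('é', 'U+00E9', 2, 'C3 A9')
# ('中', 'U+4E2D', 3, 'E4 B8 AD')
# ('😀', 'U+1F600', 4, 'F0 9F 98 80')
```

The byte counts produced by the real encoder match what `utf8_length` predicts from the code point alone, which is the property the table describes.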