Unicode Converter
Convert text and emoji to Unicode code points, decode U+, 0x, simple \u, or decimal tokens, and inspect UTF-8 bytes, UTF-16 units, and surrogate pairs locally.
Quick Start
Common Scenarios
Programming
Handle encodings and debug string issues.
Data analysis
Inspect special characters and their encodings in text.
Protocol text
Inspect pasted payload text as code points and UTF-8/UTF-16 representations, without decoding packet bytes.
Education
Learn the fundamentals of character encodings.
Security review
Spot-check zero-width, variation selector, ZWJ, or control code points, without a confusable or bidi report.
Format conversion
Switch between readable text and code point tokens.
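As a rough illustration of the security-review scenario, the Python sketch below flags a few invisible or control code points in a string. The function name `flag_invisibles`, the labels, and the specific code points and ranges it checks are assumptions made for this example. They are not the tool's actual rules, and the check is far from exhaustive.

```python
import unicodedata

# Hypothetical helper: the code points and ranges below are a small,
# hand-picked sample, not a complete or official list.
ZERO_WIDTH = {0x200B, 0x200C, 0x2060, 0xFEFF}

def flag_invisibles(text):
    """Return (index, code point, label) for each flagged character."""
    findings = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        if cp == 0x200D:
            label = "ZWJ"
        elif 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
            label = "variation selector"
        elif cp in ZERO_WIDTH:
            label = "zero-width"
        elif unicodedata.category(ch) == "Cc":
            label = "control"  # includes tab and newline
        else:
            continue
        findings.append((index, f"U+{cp:04X}", label))
    return findings

for finding in flag_invisibles("pay\u200bpal\ufe0f"):
    print(finding)
```

A spot check like this only lists what is present. It does not judge whether the text is deceptive.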
Conversion Parameters & Formats
Code Point Boundaries
Usage Advice
Limitations & Compatibility
Privacy & Security
FAQ
Because its code point lies beyond the BMP (above U+FFFF), UTF-16 encodes it as a surrogate pair: a high surrogate followed by a low surrogate.
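The surrogate-pair arithmetic can be sketched in Python as follows. The function name `to_surrogate_pair` is made up for this example. The constants come from the UTF-16 encoding form, not from this tool's source.

```python
from typing import Tuple

def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point in U+10000..U+10FFFF into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use a surrogate pair")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> low surrogate
    return high, low

high, low = to_surrogate_pair(0x1F600)  # U+1F600, a common emoji
print(f"{high:04X} {low:04X}")          # D83D DE00
```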
Some emoji are composite sequences (base + modifier + variation selector + ZWJ). This tool shows their code points, but it does not validate grapheme clusters or emoji standards.
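To see why one visible emoji can list several code points, here is a minimal Python sketch that splits a ZWJ sequence into its parts. The helper name `code_points` is an assumption for this example. Like the tool, it only lists code points and does not check grapheme clusters.

```python
def code_points(text: str) -> list:
    """List each code point in text as a U+XXXX token."""
    return [f"U+{ord(ch):04X}" for ch in text]

# WOMAN (U+1F469) + ZWJ (U+200D) + PERSONAL COMPUTER (U+1F4BB),
# written with escapes so the source stays ASCII.
sequence = "\U0001F469\u200D\U0001F4BB"
print(code_points(sequence))  # ['U+1F469', 'U+200D', 'U+1F4BB']
```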
Yes. U+, 0x, simple \u, and decimal tokens can be mixed. Use prefixes for hexadecimal values; unprefixed tokens are treated as decimal.
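A decoder for mixed tokens might look like the Python sketch below. It is not the tool's implementation. The name `decode_tokens`, the whitespace separator, and the error behavior are all assumptions. It only follows the rule stated above: prefixed tokens are hexadecimal and unprefixed tokens are decimal.

```python
def decode_tokens(tokens: str) -> str:
    """Decode whitespace-separated U+, 0x, \\uXXXX, or decimal tokens."""
    chars = []
    for token in tokens.split():
        if token[:2].lower() in ("u+", "0x") or token.startswith("\\u"):
            value = int(token[2:], 16)   # prefixed: hexadecimal
        else:
            value = int(token, 10)       # unprefixed: decimal
        chars.append(chr(value))         # ValueError above U+10FFFF
    return "".join(chars)

# Raw string so the backslash in \u0021 reaches the parser unchanged.
print(decode_tokens(r"U+0048 0x69 \u0021 128512"))
```

The last token, 128512, is decimal for U+1F600, so the call returns "Hi!" followed by that emoji.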
No. Decode only parses U+, 0x, simple \u, and decimal tokens. JavaScript brace escapes, CSS escapes, HTML entities, and URL percent encoding belong to separate tools.
The detail cards use lightweight local ranges and heuristics. They do not include official Unicode names, versions, scripts, bidi classes, or normalization data.
No. It shows UTF-8 bytes and UTF-16 code units for inspected characters, but it does not decode pasted byte streams, byte order, or other character sets.
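The per-character view described above can be approximated in Python with the standard codecs. The function name `inspect_char` and the dictionary layout are assumptions for this sketch. Big-endian is chosen here only to read the 16-bit units, since the tool itself does not deal with byte order.

```python
def inspect_char(ch: str) -> dict:
    """Return the code point, UTF-8 bytes, and UTF-16 units of one character."""
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # no BOM; two bytes per code unit
    units = [int.from_bytes(utf16[i:i + 2], "big")
             for i in range(0, len(utf16), 2)]
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8": [f"{b:02X}" for b in utf8],
        "utf16": [f"{u:04X}" for u in units],
    }

print(inspect_char("\u20AC"))      # euro sign: 3 UTF-8 bytes, 1 UTF-16 unit
print(inspect_char("\U0001F600"))  # emoji: 4 UTF-8 bytes, 2 UTF-16 units
```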
ASCII covers code points 0–127 (U+0000 to U+007F); Unicode covers scripts and symbols from around the world, with code points up to U+10FFFF.