Unicode Converter
Convert characters and Unicode code points with emoji support and detailed character info
Usage Instructions
🚀 Quick Start
- Characters → Code points: enter text, then click "Encode" to see code points (e.g., U+0041 U+1F600)
- Code points → Characters: enter code points (U+ / 0x / decimal / \u) and click "Decode" to get readable text (including composite emoji)
- Batch: convert multiple characters or code points at once
- Direction: choose via the Encode/Decode buttons
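As a rough illustration of the two directions above, here is a minimal Python sketch. The function names `encode` and `decode` are hypothetical and only mirror the button labels; this is not the tool's actual implementation, and it handles only the U+ notation.

```python
def encode(text: str) -> str:
    """Return the code points of `text` in U+XXXX notation, space-separated."""
    # Iterating over a Python str yields whole code points, not UTF-16 units.
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def decode(code_points: str) -> str:
    """Turn space-separated U+XXXX tokens back into text (U+ notation only)."""
    return "".join(chr(int(token[2:], 16)) for token in code_points.split())


print(encode("A\U0001F600"))        # U+0041 U+1F600
print(decode("U+0041 U+1F600"))     # the letter A followed by the grinning face emoji
```

Batch conversion falls out of the same loop: every character or token in the input is converted in turn.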
📌 Common Scenarios
- Programming: handle encodings and debug string issues
- Data analysis: inspect special characters/encodings in text
- Networking: parse/verify hex fields in packets
- Education: learn fundamentals of character encodings
- Security: detect invisible characters (e.g., zero‑width space U+200B, zero‑width joiner U+200D)
- Format conversion: convert between different encoding systems
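For the security scenario, one possible way to flag invisible characters is to look for the Unicode general category Cf (format characters), which includes the zero‑width space and the zero‑width joiner. This is a sketch of one heuristic using Python's standard `unicodedata` module; the helper name `find_invisible` is hypothetical, and the tool may use a different rule.

```python
import unicodedata


def find_invisible(text: str) -> list[tuple[int, str, str]]:
    """Return (index, U+XXXX, name) for each format character (category Cf)."""
    hits = []
    for index, ch in enumerate(text):
        # Assumption: "invisible" is approximated by general category Cf.
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((index, f"U+{ord(ch):04X}", name))
    return hits


# "admin" with a zero-width space hidden after the second letter
print(find_invisible("ad\u200bmin"))   # [(2, 'U+200B', 'ZERO WIDTH SPACE')]
```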
🎛️ Conversion Parameters & Formats
- Code point: U+4E2D, U+0041
- Escape sequence: \u4E2D, \u0041
- Decimal: 20013, 65
- Hexadecimal: 0x4E2D, 0x0041
- Separators: spaces, commas, or semicolons
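The formats and separators listed above could be parsed roughly as follows. This is a sketch under stated assumptions, not the tool's actual parser: the names `parse_code_point` and `parse_code_points` are hypothetical, and a token with no prefix is assumed to be decimal.

```python
import re

MAX_CODE_POINT = 0x10FFFF


def parse_code_point(token: str) -> int:
    """Parse one token written as U+4E2D, \\u4E2D, 0x4E2D, or decimal 20013."""
    prefix = token[:2].upper()
    if prefix in ("U+", "0X", "\\U"):
        value = int(token[2:], 16)   # all three prefixes introduce hex digits
    else:
        value = int(token, 10)       # assumption: no prefix means decimal
    if not 0 <= value <= MAX_CODE_POINT:
        raise ValueError(f"code point out of range: {token}")
    return value


def parse_code_points(text: str) -> list[int]:
    """Split on spaces, commas, or semicolons and parse every token."""
    tokens = [t for t in re.split(r"[\s,;]+", text) if t]
    return [parse_code_point(t) for t in tokens]


mixed = r"U+4E2D, \u0041; 0x4E2D 65"
print(parse_code_points(mixed))      # [20013, 65, 20013, 65]
```

Treating bare digits as decimal means a hex value such as 4E2D must carry one of the prefixes, otherwise `int` raises a `ValueError`.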
🧭 Usage Advice
- Text input: type or paste the content
- Emoji specifics: some emoji are sequences of several code points, such as a base character followed by a variation selector (U+FE0F), or several emoji joined by a zero‑width joiner (ZWJ, U+200D)
- Surrogate awareness: characters beyond the BMP (> U+FFFF) appear as UTF‑16 surrogate pairs
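The surrogate pair for a code point above U+FFFF follows from the standard UTF‑16 arithmetic: subtract 0x10000, then split the remaining 20 bits into two 10‑bit halves. A minimal sketch, with `to_surrogate_pair` as a hypothetical helper name:

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Return the (high, low) UTF-16 surrogates for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points beyond the BMP need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1F600)
print(f"{high:04X} {low:04X}")          # D83D DE00
```

The result matches what Python's own `utf-16-be` codec produces for the same character.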
⚠️ Limitations & Compatibility
- Invalid range: values <0 or >0x10FFFF are rejected
- Rendering: display varies by platform/font
- Blocks: a few historical blocks may not render on some terminals/browsers
🔒 Privacy & Security
- All processing happens in your browser; data never leaves your device
❓ FAQ
Why does one character show two or more UTF‑16 code units?
Because its code point is beyond the BMP (> U+FFFF), so UTF‑16 encodes it as a surrogate pair (a high surrogate followed by a low surrogate)
Why are some emoji made of multiple code points?
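Some emoji are composite sequences: a base character may be followed by a variation selector (U+FE0F), and several emoji may be joined by a zero‑width joiner (ZWJ, U+200D) to render as one glyph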
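To see such a sequence taken apart, the following sketch lists each code point with its Unicode name using Python's standard `unicodedata` module. The helper name `describe` is hypothetical, and the family emoji is just one example of a ZWJ sequence.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (U+XXXX, Unicode name) for every code point in `text`."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Family emoji: MAN + ZWJ + WOMAN + ZWJ + GIRL, rendered as one glyph where supported
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
for code_point, name in describe(family):
    print(code_point, name)

# Red heart: a base character followed by VARIATION SELECTOR-16
print(describe("\u2764\uFE0F"))
```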
Can I mix hexadecimal and decimal?
Yes. U+ / 0x / \u / decimal are auto‑detected; spaces/commas/semicolons work as separators
How is it different from ASCII?
ASCII covers 0–127; Unicode spans global scripts and symbols up to U+10FFFF
