
Normalize Unicode Text

Convert fancy Unicode text — bold, italic, cursive, circled, fullwidth, mathematical — back to plain ASCII. Remove diacritics, strip invisible characters, and apply Unicode normalization forms (NFC, NFD, NFKC, NFKD). Free, client-side, no sign-up.

Client-Side ✓ · Free Forever · No Sign-Up
5.4K+ monthly searches
4 Unicode forms


Normalize your text in 4 steps

Choose a normalization form, set options, paste your text, and copy or download the clean result.

1. Choose a form

Select Fancy → Plain for social media / LinkedIn text, or a Unicode normalization form: NFKC for the broadest compatibility cleaning, NFC for web standards.

2. Set options

Enable Remove diacritics to strip accents (café → cafe), Replace smart quotes to swap curly quotes for straight ASCII punctuation, and Expand ligatures to split single-character ligatures like ﬁ into fi.

3. Paste your text

With Live preview on, the output updates instantly. The detection panel shows what Unicode character categories were found in your input.

4. Copy or download

Choose an output case (preserve, upper, lower, sentence, or title), then copy to clipboard or download as a .txt file. Stats show exactly how many characters were changed.
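The four steps above can be sketched in plain JavaScript with the built-in String.prototype.normalize(). This is a minimal illustration of the pipeline, not the tool's actual source; the function name and option names are hypothetical:

```javascript
// Sketch of the normalize pipeline: form -> options -> output case
function normalizeText(text, { form = "NFKC", removeDiacritics = true,
                               smartQuotes = true, outputCase = "preserve" } = {}) {
  let out = text.normalize(form);                       // step 1: normalization form
  if (removeDiacritics)                                 // step 2: options
    out = out.normalize("NFD").replace(/\p{M}/gu, "").normalize(form);
  if (smartQuotes)
    out = out.replace(/[\u2018\u2019]/g, "'").replace(/[\u201C\u201D]/g, '"');
  if (outputCase === "upper") out = out.toUpperCase();  // step 4: output case
  else if (outputCase === "lower") out = out.toLowerCase();
  return out;
}

console.log(normalizeText("\u201CCaf\u00E9\u201D", { outputCase: "lower" }));
// '"cafe"' (curly quotes straightened, accent stripped, lowercased)
```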


Frequently asked questions

Everything you need to know about Unicode normalization.

What is Unicode normalization and why does it matter?

Unicode allows some characters to be represented in multiple ways. For example, the letter "é" can be stored as a single precomposed character (U+00E9) or as two characters: "e" followed by a combining accent (U+0065 + U+0301). These look identical but are different byte sequences, causing string comparisons to fail, search to miss results, and databases to store duplicates. Normalization converts text to a consistent single form so it behaves predictably.
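JavaScript's built-in normalize() shows the two encodings of "é" collapsing to one form (a quick sketch):

```javascript
// "é" stored two different ways: identical on screen, different code points
const precomposed = "\u00E9";   // single code point U+00E9
const decomposed = "e\u0301";   // "e" plus U+0301 combining acute accent

console.log(precomposed === decomposed);  // false: raw comparison fails
console.log(precomposed.normalize("NFC") === decomposed.normalize("NFC"));  // true
```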
What's the difference between NFC, NFD, NFKC, and NFKD?

NFC (Canonical Decomposition followed by Canonical Composition) — the standard for the web and most modern systems. Prefers single composed characters.
NFD (Canonical Decomposition) — decomposes characters into base + combining marks. Used in some file systems (macOS HFS+).
NFKC (Compatibility Decomposition followed by Canonical Composition) — like NFC but also converts compatibility characters: ﬁ → fi, ① → 1, ½ → 1/2, and fancy Unicode fonts → plain ASCII. Recommended for search indexing and databases.
NFKD — the decomposed version of NFKC.
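A quick way to see the forms diverge, using String.prototype.normalize() (note that NFKC expands fractions with U+2044 FRACTION SLASH rather than an ASCII slash):

```javascript
const sample = "\uFB01\u2460\u00E9";  // ligature ﬁ, circled ①, accented é

console.log(sample.normalize("NFC"));         // "ﬁ①é" (compatibility chars untouched)
console.log(sample.normalize("NFKC"));        // "fi1é" (ligature and digit folded, é recomposed)
console.log(sample.normalize("NFD").length);  // 4: é splits into e + combining mark
console.log(sample.normalize("NFKD").length); // 5: the ligature also splits
```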
What is fancy Unicode text and where does it come from?

Fancy Unicode text uses characters from Unicode's Mathematical Alphanumeric Symbols block (U+1D400–U+1D7FF), originally designed for mathematical notation — bold, italic, script, fraktur, double-struck, monospace, sans-serif. Because they're actual Unicode code points (not formatting), they can be pasted anywhere plain text is accepted — social media bios, usernames, LinkedIn headlines, Twitter names. This tool converts them back to regular A-Z letters so they work in databases, search engines, and accessibility tools.
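Because these code points carry compatibility decompositions, NFKC alone folds them back to plain ASCII; a minimal sketch:

```javascript
// Mathematical bold "Hello" from the U+1D400 block
const fancy = "\u{1D407}\u{1D41E}\u{1D425}\u{1D425}\u{1D428}";  // 𝐇𝐞𝐥𝐥𝐨

console.log(fancy.normalize("NFKC"));  // "Hello"
```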
What does "Remove diacritics" do?

Diacritics are accent marks added to letters — é, ü, ñ, ç, ă, ș, and hundreds more. Removing them converts accented characters to their base ASCII equivalents: café → cafe, résumé → resume, naïve → naive, Ångström → Angstrom. This is useful when preparing text for systems that only support ASCII, creating URL slugs, or normalizing names for matching. The tool uses NFD decomposition to separate base letters from combining marks, then strips the marks.
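That NFD-then-strip approach is only a few lines in JavaScript (a sketch; removeDiacritics is an illustrative name, not the tool's API):

```javascript
// Decompose, drop combining marks (\p{M}), recompose the remainder
function removeDiacritics(text) {
  return text.normalize("NFD").replace(/\p{M}/gu, "").normalize("NFC");
}

console.log(removeDiacritics("café résumé naïve Ångström"));
// "cafe resume naive Angstrom"
```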
Which invisible characters does the tool remove?

The tool detects and removes: Zero-Width Space (U+200B), Zero-Width Non-Joiner (U+200C), Zero-Width Joiner (U+200D), Byte Order Mark (U+FEFF), Soft Hyphen (U+00AD), Left-to-Right Mark (U+200E), Right-to-Left Mark (U+200F), and other formatting-only characters. These appear commonly in AI-generated text (used as watermarks), text copied from web pages, and PDFs. They're invisible but can cause search failures, database mismatches, and encoding errors.
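A sketch of that removal, covering exactly the code points listed above (a real cleaner typically covers more, e.g. word joiner and invisible math operators):

```javascript
// Zero-width chars, BOM, soft hyphen, and directional marks
const INVISIBLE = /[\u200B\u200C\u200D\uFEFF\u00AD\u200E\u200F]/g;

function stripInvisible(text) {
  return text.replace(INVISIBLE, "");
}

const tainted = "wa\u200Btch\u00AD out\uFEFF";
console.log(stripInvisible(tainted));                          // "watch out"
console.log(tainted.length - stripInvisible(tainted).length);  // 3 removed
```

One design caveat: stripping U+200D unconditionally also breaks emoji ZWJ sequences (e.g. 👩‍💻 splits into its component emoji), which is usually acceptable for plain-text cleanup but worth knowing.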
What are ligatures and why expand them?

Ligatures are single Unicode characters that represent two or more letters merged together — ﬁ (fi), ﬀ (ff), ﬃ (ffi), ﬄ (ffl), ﬆ (st), Æ (AE), and others. PDFs and some fonts emit these as single characters when you copy text, causing word searches to miss them (searching "office" won't find "oﬃce"). Expanding ligatures splits them back into individual letters, making the text fully searchable and compatible.
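NFKC already expands the Latin ligature block (ﬁ, ﬀ, ﬃ, ﬄ, ﬆ), but Æ has no compatibility decomposition in Unicode, so a tool has to map it explicitly. A sketch under that assumption (the extra map is illustrative, not the tool's actual table):

```javascript
// Æ/æ (and Œ/œ, added here for completeness) need a manual mapping
const EXTRA = { "\u00C6": "AE", "\u00E6": "ae", "\u0152": "OE", "\u0153": "oe" };

function expandLigatures(text) {
  // ﬁ, ﬀ, ﬃ, ﬄ, ﬆ are handled by NFKC's compatibility decomposition
  return text.normalize("NFKC").replace(/[\u00C6\u00E6\u0152\u0153]/g, ch => EXTRA[ch]);
}

console.log(expandLigatures("o\uFB03ce \uFB02at \u00C6sop"));  // "office flat AEsop"
```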
Is my text sent to a server?

No. All processing runs entirely in your browser using JavaScript. Your text never leaves your device and is never transmitted to LazyTools or any third party. This makes it safe to use with confidential documents, customer data, proprietary content, or any sensitive text.

LazyTools vs other Unicode normalizers

How we compare on features that matter for real text cleaning tasks.

Features compared across LazyTools ✦, texttools.org, onlinetools.com, thetoolsfoundry.com, and inputoutput.dev:

Fancy Unicode → plain ASCII
NFC / NFD normalization forms
NFKC / NFKD normalization forms
Remove diacritics / accents
Remove invisible Unicode characters
Replace smart quotes / fancy punctuation
Expand ligatures (ﬁ → fi)
Output case control (upper / lower / sentence / title)
Live character-type detection panel
Live stats (chars converted, invisible removed)
Live preview (auto-normalizes)
Side-by-side input / output
Download output as .txt
100% client-side (private)