Detection Summary

Hidden Chars: 0
Watermark Signals: 0
Tag Chars (Smuggling): 0
Mixed-Script Glyphs: 0
AI Buzzwords: 0
Unique Types: 0
Unfixed issues: 0

Detection Categories

◆ Zero-Width & BiDi: ZWSP, ZWJ, ZWNJ, LRM, RLM, LRE, RLE, PDF, LRO, RLO, FSI, LRI, RLI, PDI
◆ Extended Whitespace: NBSP (U+00A0), NNBSP (U+202F, reported as a watermark signal in OpenAI o3 and o4-mini output), En/Em/Thin/Hair/Figure space, and all other Unicode spacing variants
◆ Control Characters: C0 (U+0000–U+001F), DEL (U+007F), and C1 (U+0080–U+009F)
◆ Invisible Operators: Word Joiner, Function Application, Invisible Times/Separator/Plus (U+2060–U+2064), Soft Hyphen (U+00AD), Combining Grapheme Joiner (U+034F)
◆ Unicode Tags Block (ASCII Smuggling): U+E0000–U+E007F, an invisible mirror of the ASCII alphabet used for prompt injection and steganographic watermarking
◆ Variation Selectors: U+FE00–U+FE0F and U+E0100–U+E01EF; a selector appended to a visible character can encode hidden bits
◆ Mixed-Script Homoglyphs: Only flagged when a single word contains characters from two different scripts. Pure Cyrillic, Greek, or Latin text is never flagged.
◇ Stylistic: Em dash (—) is flagged, with en dash (–) as the suggested replacement. Curly quotes (‘’“”) are not flagged because they are normal in many languages.
◇ AI Buzzwords: Words and phrases statistically over-represented in LLM output (English only)
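
The character-level categories above can be sketched as a simple range lookup. The following Python is a minimal illustration, not the tool's actual implementation: the category keys and the function names (scan, decode_tags, mixed_script_words) are hypothetical, skipping tab, line feed, and carriage return is an assumption, and script detection via Unicode character names is a rough heuristic limited to Latin, Cyrillic, and Greek.

```python
import unicodedata

# Code point ranges per category, following the list above.
# Category keys are illustrative names, not the tool's own.
RANGES = {
    "control": [(0x0000, 0x001F), (0x007F, 0x009F)],
    "extended_whitespace": [(0x00A0, 0x00A0), (0x2000, 0x200A), (0x202F, 0x202F),
                            (0x205F, 0x205F), (0x3000, 0x3000)],
    "zero_width_bidi": [(0x200B, 0x200F), (0x202A, 0x202E), (0x2066, 0x2069)],
    "invisible_operator": [(0x2060, 0x2064), (0x00AD, 0x00AD), (0x034F, 0x034F)],
    "variation_selector": [(0xFE00, 0xFE0F), (0xE0100, 0xE01EF)],
    "tag": [(0xE0000, 0xE007F)],
}


def classify(ch):
    """Return the category name for a character, or None if it is not flagged."""
    cp = ord(ch)
    for name, ranges in RANGES.items():
        if any(lo <= cp <= hi for lo, hi in ranges):
            return name
    return None


def scan(text):
    """List (index, code point, category) for every flagged character."""
    findings = []
    for i, ch in enumerate(text):
        if ch in "\t\n\r":  # assumption: ordinary layout controls are not reported
            continue
        category = classify(ch)
        if category:
            findings.append((i, f"U+{ord(ch):04X}", category))
    return findings


def decode_tags(text):
    """Recover ASCII hidden in the Tags block (U+E0020-U+E007E mirror 0x20-0x7E)."""
    return "".join(chr(ord(c) - 0xE0000) for c in text if 0xE0020 <= ord(c) <= 0xE007E)


def script_of(ch):
    """Rough script guess from the Unicode character name."""
    name = unicodedata.name(ch, "")
    for script in ("LATIN", "CYRILLIC", "GREEK"):
        if name.startswith(script):
            return script
    return None


def mixed_script_words(text):
    """Return words that mix two or more scripts; single-script words pass."""
    flagged = []
    for word in text.split():
        scripts = {s for s in map(script_of, word) if s}
        if len(scripts) > 1:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    hidden = "".join(chr(0xE0000 + ord(c)) for c in "hi")
    sample = "pay\u200bpal" + hidden
    print(scan(sample))
    print(decode_tags(sample))
    print(mixed_script_words("p\u0430ypal is fine"))  # second letter is Cyrillic
```

Checking scripts per word rather than per document is what keeps pure Cyrillic or Greek text from being flagged: a word is reported only when its own letters disagree.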

ℹ About SynthID-Text (Google DeepMind)

SynthID-Text does not insert Unicode characters. It biases token selection probabilities via a pseudorandom g-function keyed to a secret, so the pattern is statistical and cannot be detected by character scanning. By contrast, the NNBSP (U+202F) characters reported in OpenAI o3 and o4-mini output are a character-level signal, so they are detectable and are flagged under Extended Whitespace above.
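
To see why a token-level watermark is invisible to a character scanner, consider the toy sketch below. It is not SynthID-Text's algorithm: the vocabulary, the key, the HMAC-based g function, and the two-candidate selection rule are all assumptions made for illustration. It only shows the general idea that a keyed pseudorandom function can bias which token is chosen, and that detection then needs the key and a statistic over many tokens.

```python
import hashlib
import hmac
import random

KEY = b"demo-secret"                      # hypothetical watermarking key
VOCAB = [f"tok{i}" for i in range(50)]    # toy vocabulary of plain ASCII tokens


def g(key, prev_token, token):
    """Keyed pseudorandom bit for a (context, candidate token) pair."""
    message = f"{prev_token}|{token}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()[0] & 1


def generate(n, key=None, seed=0):
    """Sample n tokens; with a key, prefer the candidate whose g bit is higher."""
    rng = random.Random(seed)
    tokens = [VOCAB[0]]
    for _ in range(n):
        a, b = rng.choice(VOCAB), rng.choice(VOCAB)
        if key is not None and g(key, tokens[-1], b) > g(key, tokens[-1], a):
            a = b
        tokens.append(a)
    return tokens


def score(tokens, key):
    """Mean g bit over consecutive token pairs; near 0.5 without a watermark."""
    bits = [g(key, prev, tok) for prev, tok in zip(tokens, tokens[1:])]
    return sum(bits) / len(bits)


if __name__ == "__main__":
    plain = generate(2000)
    marked = generate(2000, key=KEY)
    print("plain score: ", score(plain, KEY))
    print("marked score:", score(marked, KEY))
    print("marked text is pure ASCII:", " ".join(marked).isascii())
```

In this toy setup the marked sequence should score well above 0.5 under the right key (about 0.75 in expectation if g behaves like a fair coin), while the text itself contains nothing but ordinary characters, so a scanner for hidden code points has nothing to find.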