INVIDEC

Original Text

Detection Summary

Hidden Chars

Watermark Signals

Tag Chars (Smuggling)

Mixed-Script Glyphs

AI Buzzwords

Unique Types

Detection Categories

◆ Zero-Width & BiDi: ZWSP, ZWJ, ZWNJ, LRM, RLM, LRE, RLE, PDF, LRO, RLO, FSI, LRI, RLI, PDI

◆ Extended Whitespace: NBSP, NNBSP (U+202F; OpenAI o3/o4-mini watermark), En/Em/Thin/Hair/Figure space, and all other Unicode spacing variants

◆ Control Characters: C0 (U+0000–U+001F), DEL (U+007F), and C1 (U+0080–U+009F)

◆ Invisible Operators: Word Joiner, Function Application, Invisible Times/Plus/Separator (U+2060–U+2064), Soft Hyphen, Combining Grapheme Joiner

◆ Unicode Tags Block (ASCII Smuggling): U+E0000–U+E007F — invisible ASCII alphabet used for prompt injection and steganographic watermarking

◆ Variation Selectors: U+FE00–U+FE0F & U+E0100–U+E01EF — can encode hidden bits per character

◆ Mixed-Script Homoglyphs: Only flagged when a single word contains characters from two different scripts. Pure Cyrillic, Greek, or Latin text is never flagged.

◇ Stylistic: Em dashes (—) are flagged for replacement with an en dash (–). Curly quotes (‘’“”) are not flagged because they are normal in many languages.

◇ AI Buzzwords: Statistically over-represented in LLM output (English only)
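
The character-level categories above can be illustrated with a short Python sketch. It is not the tool's actual implementation: the names `CATEGORIES`, `scan`, `decode_tags`, and `mixed_script_words` are hypothetical, the ranges are copied from the list above, and the script check is a simple heuristic based on Unicode character names (it only distinguishes Latin, Cyrillic, and Greek).

```python
import unicodedata

# Hypothetical category table; code point ranges are taken from the list above.
CATEGORIES = [
    ("zero-width/bidi", [(0x200B, 0x200F), (0x202A, 0x202E), (0x2066, 0x2069)]),
    ("invisible operator", [(0x2060, 0x2064), (0x00AD, 0x00AD), (0x034F, 0x034F)]),
    ("tag", [(0xE0000, 0xE007F)]),
    ("variation selector", [(0xFE00, 0xFE0F), (0xE0100, 0xE01EF)]),
]


def classify(ch):
    """Return the category label for a character, or None if it is not flagged."""
    cp = ord(ch)
    for label, ranges in CATEGORIES:
        if any(lo <= cp <= hi for lo, hi in ranges):
            return label
    return None


def scan(text):
    """Return (index, 'U+XXXX', label) for every flagged character in text."""
    hits = []
    for i, ch in enumerate(text):
        label = classify(ch)
        if label is not None:
            hits.append((i, f"U+{ord(ch):04X}", label))
    return hits


def decode_tags(text):
    """Recover ASCII hidden in the Tags block (U+E0020..U+E007E mirror U+0020..U+007E)."""
    return "".join(
        chr(ord(ch) - 0xE0000) for ch in text if 0xE0020 <= ord(ch) <= 0xE007E
    )


def scripts_in(word):
    """Heuristic: collect scripts by looking at the start of each letter's Unicode name."""
    found = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            for script in ("LATIN", "CYRILLIC", "GREEK"):
                if name.startswith(script):
                    found.add(script)
    return found


def mixed_script_words(text):
    """Flag only words that mix two or more scripts; single-script words pass."""
    return [w for w in text.split() if len(scripts_in(w)) > 1]


# Example input: a zero-width space plus the word "ok" smuggled as tag characters.
sample = "Hi\u200b there" + "".join(chr(0xE0000 + ord(c)) for c in "ok")
print(scan(sample))
print(decode_tags(sample))
print(mixed_script_words("p\u0430ypal привет hello"))
```

In this sketch a word such as "p\u0430ypal" (Latin letters with a Cyrillic "а") is flagged, while the purely Cyrillic "привет" is not, which matches the mixed-script rule described above.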

ℹ About SynthID-Text (Google DeepMind)

SynthID-Text does not insert Unicode characters. It biases token selection probabilities via a pseudorandom g-function keyed to a secret key, so the pattern is statistical and cannot be detected by character scanning. By contrast, the OpenAI o3/o4-mini NNBSP (U+202F) character-level watermark is detectable and is flagged under Extended Whitespace above.
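
Because the NNBSP signal is an ordinary character, it can be found with a plain scan, unlike the statistical SynthID-Text pattern. The following is a minimal sketch, assuming hypothetical helper names `nnbsp_positions` and `normalize_nnbsp`; it is not the tool's own code. Note that U+202F also occurs in legitimate typography (for example French punctuation spacing), so a hit is a signal to review, not proof of origin.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def nnbsp_positions(text):
    """Return the index of every NNBSP character in text."""
    return [i for i, ch in enumerate(text) if ch == NNBSP]


def normalize_nnbsp(text):
    """Replace each NNBSP with an ordinary space (U+0020)."""
    return text.replace(NNBSP, " ")


# Example input with one NNBSP between "10" and "km".
example = "about 10\u202fkm away"
print(nnbsp_positions(example))
print(normalize_nnbsp(example))
```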