i18n & RTL
Unicode word-boundary behaviour, RTL caret tracking, IME composition guards.
@danielivanovz/mention ships with the Unicode plumbing most mention libraries skip — CJK word boundaries, RTL bidi-aware caret math, and IME composition guards.
Unicode word boundaries
The trigger character (@ by default) only fires when it sits at a word boundary — start-of-input, after whitespace, or after a CJK / Thai / Khmer / Lao / Myanmar character. This is what blocks email-like false positives (foo@bar.com does not open the menu).
For non-whitespace-segmented scripts the library uses Unicode property regex on the previous character to detect the soft boundary:
// triggers: previous char is hiragana / kanji / hangul / etc.
"こんにちは@田" // → opens at @, query = "田"
"한국어 @" // → opens at @
"ภาษาไทย@" // → opens at @
Known limitation: Han + @ + Latin
用户@example — Han characters then @ then Latin — opens the menu. The library can't distinguish this from the intended CJK trigger case (こんにちは@田中) at the dispatcher layer; both produce identical Intl.Segmenter output. The bias is toward triggering, since CJK users genuinely need the feature; the rare email-like patterns prompt the popover, which Escape dismisses.
If your domain has many email-like Han+@+Latin patterns and few real mentions, override the trigger to a different character, for example ＠ (U+FF20, the fullwidth at-sign used in CJK contexts).
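The boundary rule described in this section can be sketched with a single Unicode property regex. This is a minimal illustration, not the library's actual source: the helper name isTriggerBoundary is hypothetical, and the script list (Han, Hiragana, Katakana, Hangul, Thai, Khmer, Lao, Myanmar) is an assumption derived from the scripts named above.

```typescript
// Hypothetical sketch of the word-boundary check; not the library's real code.
// The pattern matches when the text before the trigger is empty, or ends in
// whitespace, or ends in a character from a non-whitespace-segmented script.
const BOUNDARY_BEFORE_TRIGGER = new RegExp(
  "(?:^|\\s|[" +
    "\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}" +
    "\\p{Script=Thai}\\p{Script=Khmer}\\p{Script=Lao}\\p{Script=Myanmar}" +
    "])$",
  "u", // the "u" flag is required for \p{...} property escapes
);

// Returns true when the trigger at `triggerIndex` sits at a word boundary.
function isTriggerBoundary(text: string, triggerIndex: number): boolean {
  return BOUNDARY_BEFORE_TRIGGER.test(text.slice(0, triggerIndex));
}

const email = "foo@bar.com";
const japanese = "こんにちは@田";
const mixed = "用户@example";

console.log(isTriggerBoundary(email, email.indexOf("@"))); // Latin letter before @
console.log(isTriggerBoundary(japanese, japanese.indexOf("@"))); // hiragana before @
console.log(isTriggerBoundary(mixed, mixed.indexOf("@"))); // Han before @ (the known limitation)
```

Under these assumptions the email case is rejected while both the Japanese case and the Han + @ + Latin case are accepted, which is exactly the ambiguity described above: the check only sees the character before the trigger.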
RTL — Hebrew, Arabic
The mirror-div caret math is direction-aware. Set dir="rtl" on the textarea (or any ancestor) and the popover follows the caret correctly for right-to-left scripts:
<Mention.Input dir="rtl" aria-label="הודעה" />
The library copies direction and unicode-bidi to the mirror so its bidi resolution matches the textarea exactly.
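As a rough illustration of what copying those two properties involves, here is a minimal sketch. The helper copyBidiStyles and the BidiStyle type are hypothetical names, not part of the library; a real mirror div would also need the font, padding, and wrapping properties, which are omitted here.

```typescript
// Minimal structural type so the sketch also runs outside a browser.
// In the DOM, CSSStyleDeclaration has both of these properties.
interface BidiStyle {
  direction: string;
  unicodeBidi: string;
}

// Hypothetical helper: copy the two properties that drive bidi resolution
// from the source (the textarea's computed style) to the mirror's style.
function copyBidiStyles(source: BidiStyle, target: BidiStyle): void {
  target.direction = source.direction;
  target.unicodeBidi = source.unicodeBidi;
}

// Browser usage, assuming `textarea` and `mirror` elements already exist:
// copyBidiStyles(getComputedStyle(textarea), mirror.style);
```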
RTL caveats
- Pixel-perfect caret tracking at end-of-content is environment-dependent in chromium's RTL inline layout. The mid-content trigger case (the typical mention scenario) is precise; tracking a caret past the last character of pure RTL content may drift by a few pixels in some font / wrap conditions.
- getInsertText should match the user's locale. Inserting @username into an RTL textarea produces a left-to-right run inside RTL content; the bidi algorithm handles the visual direction correctly, but consider whether you want to wrap with explicit Unicode bidi controls (U+202B / U+202C) for stable cursor behavior near the chip.
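If you do decide to wrap inserted text, the wrapping itself is a one-liner. The sketch below uses the two code points mentioned above; wrapWithBidiControls is a hypothetical helper, and the signature of getInsertText is not shown in this section, so how you call it from there is left as an assumption.

```typescript
const RLE = "\u202B"; // U+202B RIGHT-TO-LEFT EMBEDDING
const PDF = "\u202C"; // U+202C POP DIRECTIONAL FORMATTING

// Hypothetical helper: surround the inserted mention text with an explicit
// embedding so it forms its own directional run.
function wrapWithBidiControls(text: string): string {
  return RLE + text + PDF;
}

// Example: the returned string is two characters longer than the input.
const wrapped = wrapWithBidiControls("@username");
console.log(wrapped.length);
```

Keep in mind that the control characters become part of the textarea value, so anything that stores or parses the text later will see them too.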
IME composition (CJK input methods)
Japanese, Chinese Pinyin, and other input methods commit multiple characters at once via compositionstart / compositionend events. The library suppresses dispatch during composition so intermediate composing characters don't spuriously open or close the popover.
// Internal flow:
// 1. compositionstart → set isComposingRef = true
// 2. change events fire for each composing keystroke → ignored
// 3. compositionend → set isComposingRef = false, run the full scan
You don't need to wire anything; <Mention.Input> handles it. If you're using useMention() with a custom textarea, spread getInputProps() to inherit the same guards.
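For readers curious what such a guard amounts to, the three-step flow above can be sketched without any framework. This is an illustration under assumptions, not the library's implementation: createCompositionGuard and its handler names are hypothetical, and the scan callback stands in for whatever the dispatcher does with the current value.

```typescript
type Scan = (value: string) => void;

// Hypothetical guard: suppress scans while an IME composition is in progress,
// then run one full scan when the composition commits.
function createCompositionGuard(scan: Scan) {
  let isComposing = false;
  return {
    onCompositionStart(): void {
      isComposing = true;
    },
    onChange(value: string): void {
      // Intermediate composing keystrokes are ignored.
      if (!isComposing) scan(value);
    },
    onCompositionEnd(value: string): void {
      isComposing = false;
      scan(value); // run the full scan on the committed text
    },
  };
}

// Usage: only the values outside composition, plus the committed one, reach scan.
const scanned: string[] = [];
const guard = createCompositionGuard((value) => {
  scanned.push(value);
});
guard.onChange("@");
guard.onCompositionStart();
guard.onChange("@t"); // ignored
guard.onChange("@た"); // ignored
guard.onCompositionEnd("@田");
console.log(scanned.length);
```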
The composition contract is pinned by 3 RTL tests + 1 chromium e2e test driving synthetic CompositionEvents, plus a manual smoke rig at packages/react/manual-at/ime/ covering macOS Japanese, Windows Pinyin, and Android Gboard against the live IME stack. See the Accessibility › IME smoke rig section for invariants and how to reproduce.
Locale-aware filtering
The library doesn't filter items for you — that's the consumer's job. For locale-aware sort / match, pass Intl.Collator into your filter:
const collator = new Intl.Collator(navigator.language, {
  sensitivity: "base", // ignore case + diacritics
  usage: "search",
});
function filter<T>(items: readonly T[], q: string, getLabel: (i: T) => string) {
  if (!q) return items;
  return items.filter((i) =>
    collator.compare(getLabel(i).slice(0, q.length), q) === 0,
  );
}
For CJK fuzzy matching, libraries like fuse.js (with getFn reading romanised aliases) round it out, but that's beyond the primitive's scope.