pub trait UnicodeNormalization<I: Iterator<Item = char>> {
    fn nfd(self) -> Decompositions<I>;
    fn nfkd(self) -> Decompositions<I>;
    fn nfc(self) -> Recompositions<I>;
    fn nfkc(self) -> Recompositions<I>;
    fn cjk_compat_variants(self) -> Replacements<I>;
    fn stream_safe(self) -> StreamSafe<I>;
}

Methods for iterating over strings while applying Unicode normalizations as described in Unicode Standard Annex #15.

Required Methods

fn nfd(self) -> Decompositions<I>

Returns an iterator over the string in Unicode Normalization Form D (canonical decomposition).

fn nfkd(self) -> Decompositions<I>

Returns an iterator over the string in Unicode Normalization Form KD (compatibility decomposition).

fn nfc(self) -> Recompositions<I>

Returns an iterator over the string in Unicode Normalization Form C (canonical decomposition followed by canonical composition).

fn nfkc(self) -> Recompositions<I>

Returns an iterator over the string in Unicode Normalization Form KC (compatibility decomposition followed by canonical composition).

fn cjk_compat_variants(self) -> Replacements<I>

Returns an iterator that replaces each CJK Compatibility Ideograph code point with its Standardized Variation Sequence (the corresponding unified ideograph followed by a variation selector). This is not part of the canonical or compatibility decomposition algorithms, but performing it before those algorithms produces normalized output that better preserves the intent of the original text.

Note that many systems today ignore variation selectors, so these may not immediately help text display as intended, but they at least preserve the information in a standardized form, giving implementations the option to recognize them.

fn stream_safe(self) -> StreamSafe<I>

Returns an iterator over the string with Combining Grapheme Joiner characters (CGJ, U+034F) inserted according to the Stream-Safe Text Process (UAX15-D4).

Implementations on Foreign Types

Implementors