utf8proc

utf8proc is a library for processing UTF-8 encoded Unicode strings. Its features include Unicode normalization, stripping of default ignorable characters, case folding, and detection of grapheme cluster boundaries. A special character mapping is also available: it converts, for example, the characters “Hyphen” (U+2010), “Minus Sign” (U+2212) and “Hyphen-Minus” (U+002D, the ASCII minus) all into the ASCII minus sign, so that they compare as equal.
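To make the idea behind the special character mapping concrete, here is a minimal, self-contained sketch in plain C. It is not utf8proc's code and does not use its API: the helper name lump_dashes is hypothetical, and it handles only the two non-ASCII characters named above, assuming well-formed, NUL-terminated UTF-8 input.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only; NOT part of utf8proc.
 * Copies the NUL-terminated UTF-8 string `src` into `dst`, replacing
 * U+2010 HYPHEN (bytes E2 80 90) and U+2212 MINUS SIGN (bytes E2 88 92)
 * with the ASCII minus sign '-' (U+002D). All other bytes are copied
 * unchanged. `dst` must hold at least strlen(src) + 1 bytes; the output
 * is never longer than the input.
 * Returns the number of bytes written, excluding the terminating NUL. */
static size_t lump_dashes(const char *src, char *dst)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t out = 0;

    while (*s != '\0') {
        /* Short-circuit evaluation keeps s[1] and s[2] within the string
         * (at worst they read the terminating NUL). */
        if (s[0] == 0xE2 &&
            ((s[1] == 0x80 && s[2] == 0x90) ||    /* U+2010 HYPHEN */
             (s[1] == 0x88 && s[2] == 0x92))) {   /* U+2212 MINUS SIGN */
            dst[out++] = '-';
            s += 3;
        } else {
            dst[out++] = (char)*s++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

After such a mapping, two strings that differ only in which of these dash characters they use become byte-for-byte identical, so an ordinary strcmp treats them as equal. A real application would use the library's own mapping rather than a hand-written table like this one.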

utf8proc is now maintained by the Julia project.

The Public Software Group still hosts an archive page for the older versions (up to v1.1.6). For newer versions, please visit utf8proc on julialang.org.