Technical Note About Unicode and the Rename Window

    If right now you're wondering "What is Unicode?" you may wish to skip the following discussion. :-)

    Starting with File Buddy 7.5.1, the Rename window correctly handles Unicode strings. (Actually, all windows now handle Unicode strings correctly, but this discussion is limited to how the change affects the Rename window.) One consequence is that the Replace Characters and Delete Characters options now respect grapheme cluster boundaries rather than character boundaries. A grapheme cluster is what a reader perceives as a single character, even when it is stored as several.

    For example, the Replace Characters option actually parses the strings in each field into clusters, and then replaces clusters in the first string with clusters in the second string.*
    This change from character to cluster boundaries was necessary for file renaming to work consistently, because file names on HFS+ volumes in Mac OS X are stored as UTF-16 in an Apple-modified form of Normalization Form D (decomposed).
    In simpler terms, this means that an é occurring in a file name is stored by the file system in Mac OS X as two characters:
    U+0065 LATIN SMALL LETTER E (e), followed by
    U+0301 COMBINING ACUTE ACCENT (´).
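    The decomposition described above can be reproduced with a short Python sketch. It uses the standard NFD form from Python's unicodedata module, which is assumed here to be close enough for illustration; Apple's modified form differs from standard NFD for some character ranges, but not for é.

```python
import unicodedata

# é written as one precomposed code point (U+00E9).
composed = "\u00e9"

# Standard NFD decomposition; an approximation of the Apple-modified form.
decomposed = unicodedata.normalize("NFD", composed)

# List each code point that makes up the decomposed string.
for ch in decomposed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

    The loop prints one line per stored character, so the single visible é is listed as a base letter followed by its combining accent.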
    In Mac OS 9, é is stored as the single character U+00E9 LATIN SMALL LETTER E WITH ACUTE (é), and this is how File Buddy parsed strings in the Rename window prior to v7.5.1. As a result, in Mac OS X the Rename function was unable to rename items whose names contained high-ASCII (accented or other non-ASCII) characters or other, more complex clusters.
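    The difference between character-based and cluster-based matching can be sketched in Python. This is not File Buddy's code: the function names clusters and replace_clusters are hypothetical, the clustering rule (a base character plus any combining marks that follow it) is a simplification of the full Unicode grapheme cluster rules, and the position-by-position mapping between the two fields is an assumption about how Replace Characters pairs them.

```python
import unicodedata


def clusters(text):
    """Split text into simplified clusters: each base character
    together with the combining marks that immediately follow it."""
    result = []
    for ch in text:
        if result and unicodedata.combining(ch) != 0:
            result[-1] += ch  # attach the combining mark to the previous base
        else:
            result.append(ch)  # start a new cluster
    return result


def replace_clusters(name, find, replace_with):
    """Replace each cluster of `find` that occurs in `name` with the
    cluster at the same position in `replace_with` (assumed pairing)."""
    mapping = dict(zip(clusters(find), clusters(replace_with)))
    return "".join(mapping.get(c, c) for c in clusters(name))


# A file name as the file system would store it: decomposed.
name = unicodedata.normalize("NFD", "caf\u00e9.txt")

# Character-based search for the precomposed é finds nothing.
found_precomposed = "\u00e9" in name

# Cluster-based replacement, with the search field decomposed the same way.
find = unicodedata.normalize("NFD", "\u00e9")
new_name = replace_clusters(name, find, "e")
print(found_precomposed, new_name)
```

    Normalizing the search field the same way as the stored name is what lets the é in the field match the two-character é in the file name.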
    *Because many users (especially in the United States) are unfamiliar with the concept of a grapheme cluster, and since in most cases the clusters will consist of single characters, the word "character" was retained in the relevant pop-up menu items.


Table of Contents