(part of InternationalizationPlan)

To allow for interoperability between users with different charset encodings, Mercurial will transcode certain elements of the data it manages to UTF-8. Mercurial intentionally makes no assumptions about the charset of any data it manages except the elements described below.
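As a rough illustration of what transcoding to and from UTF-8 could look like, here is a minimal sketch in Python. The function names `to_utf8` and `from_utf8` and the `local_encoding` parameter are hypothetical and chosen for this example only; they are not taken from Mercurial's code, and which elements actually get transcoded is determined by the list in the next section, not by this sketch.

```python
def to_utf8(data: bytes, local_encoding: str) -> bytes:
    """Transcode bytes from the user's local encoding to UTF-8 for storage.

    Hypothetical helper: raises UnicodeDecodeError if `data` is not valid
    in `local_encoding`, so bad input is rejected rather than stored.
    """
    return data.decode(local_encoding).encode("utf-8")


def from_utf8(data: bytes, local_encoding: str) -> bytes:
    """Transcode stored UTF-8 bytes to the user's local encoding for display.

    Hypothetical helper: characters that `local_encoding` cannot represent
    are replaced with "?" instead of raising an error.
    """
    return data.decode("utf-8").encode(local_encoding, errors="replace")


# A Latin-1 user stores a string; a second Latin-1 user reads it back.
stored = to_utf8(b"caf\xe9", "latin-1")
shown = from_utf8(stored, "latin-1")
```

With this approach the stored form is the same regardless of the writer's locale, and each reader sees the closest representation their own encoding allows.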

Elements that need to be transcoded

Files and encodings

Things that need to be done

Legacy repositories

Legacy repositories may contain non-UTF-8 data because UTF-8 was not enforced. To continue to operate robustly, we do the following:

Windows and OS X charset weirdness

See CharacterEncodingOnWindows for a discussion of dealing with Windows charset braindamage and Character_Encoding_On_OSX for a similar form of braindamage on OS X.


CategoryNewFeatures