Locale Related Issues

This page contains information about locale related problems and issues. In this paragraph you'll find a generic overview of things that can come up when configuring your system for various locales. The previous sentence and the remainder of this paragraph must still be revised/completed.

Package Specific Locale Issues

For package-specific issues, find the package concerned in the list below and follow the link to view the available information. If a package is not listed here, that does not mean it has no known locale-specific issues or problems; it only means that this page has not yet been updated with locale-specific information for that package. Refer to the BLFS Wiki page for a particular package for any additional locale-specific information.

List of Packages with Locale Related Issues

<xref linkend="mc"/>

This package assumes that characters and bytes are the same thing, which is not true in UTF-8 based locales. Because of this assumption, MC positions characters incorrectly on the screen. After the cursor has been moved a little, the screen becomes totally unreadable, as illustrated on this screenshot (taken in a ru_RU.UTF-8 locale). Additionally, input of non-ASCII characters in the editor is impossible, even after selecting Other 8-bit encoding from the menu.

<xref linkend="unzip"/>

Use of UnZip in the JDK, Mozilla, DocBook or any other BLFS package installation is not a problem, as BLFS instructions never use UnZip to extract a file with non-ASCII characters in the file's name.

The UnZip package assumes that filenames stored in ZIP archives created on non-Unix systems are encoded in CP850, and that they should be converted to ISO-8859-1 when writing files onto the filesystem. Such assumptions are not always valid.
In fact, inside the ZIP archive, filenames are encoded in the DOS codepage in use in the relevant country, and the filenames on disk should be in the locale encoding. In MS Windows, the OemToChar() C function (from User32.DLL) does the correct conversion (which is indeed the conversion from CP850 to a superset of ISO-8859-1 if MS Windows is set up to use the US English language), but there is no equivalent in Linux.

When unzip unpacks a ZIP archive containing non-ASCII filenames, the filenames are damaged whenever either of its encoding assumptions is wrong, because it then applies the wrong conversion. For example, in the ru_RU.KOI8-R locale, filenames need to be converted from CP866 to KOI8-R, but unzip converts them from CP850 to ISO-8859-1 instead. The result is filenames consisting of undecipherable characters instead of words (the closest equivalent for English-only users is rot13).

There are several ways around this limitation:

1) For unpacking ZIP archives with filenames containing non-ASCII characters, use WinZip while running the Wine Windows emulator.

2) After running unzip, fix the damage done to the filenames using the convmv tool. The following is an example for the ru_RU.KOI8-R locale:
Step 1. Undo the conversion done by unzip:

convmv -f iso-8859-1 -t cp850 -r --nosmart --notest \
    </path/to/unzipped/files>

Step 2. Do the correct conversion instead:

convmv -f cp866 -t koi8-r -r --nosmart --notest \
    </path/to/unzipped/files>
3) Apply this patch to unzip: The patch adds one option for specifying the assumed filename encoding inside the ZIP archive and another for specifying the on-disk filename encoding. By default, the on-disk filename encoding is the locale encoding, and the encoding inside the ZIP archive is guessed from a built-in table based on the locale encoding. For US English users, this still means that unzip converts from CP850 to ISO-8859-1 by default.

Caveat: this method works only with 8-bit locale encodings, not with UTF-8. Attempting to use a patched unzip in UTF-8 locales may result in a segmentation fault and is probably a security risk.
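The filename damage described for UnZip above, and the two-step repair from workaround 2, can be reproduced in a few lines of Python. This is only an illustrative sketch: Python's standard codecs stand in for unzip's built-in conversion table and for convmv, the function names and the sample filename are made up for this example, and real unzip may treat bytes that have no ISO-8859-1 equivalent differently from Python's strict codecs. The sample name uses only lowercase Cyrillic letters, whose CP866 bytes survive the round trip.

```python
# Illustrative sketch only: Python codecs approximate what unzip and convmv do.
# All function names and the sample filename are hypothetical.

def unzip_default_conversion(raw: bytes) -> bytes:
    """Mimic unzip's assumption: archive bytes are CP850, disk is ISO-8859-1."""
    return raw.decode("cp850").encode("iso-8859-1")

def undo_unzip_conversion(on_disk: bytes) -> bytes:
    """Step 1: reverse the CP850 -> ISO-8859-1 conversion."""
    return on_disk.decode("iso-8859-1").encode("cp850")

def convert_to_locale(raw: bytes, archive_enc: str, locale_enc: str) -> bytes:
    """Step 2: apply the conversion that should have been done."""
    return raw.decode(archive_enc).encode(locale_enc)

original = "привет.txt"
in_archive = original.encode("cp866")           # name as stored in the ZIP archive
on_disk = unzip_default_conversion(in_archive)  # bytes that unzip writes to disk
garbled = on_disk.decode("koi8_r")              # what a ru_RU.KOI8-R user sees

recovered = undo_unzip_conversion(on_disk)
restored = convert_to_locale(recovered, "cp866", "koi8_r")

print(garbled != original)                      # True: the name is unreadable
print(restored.decode("koi8_r") == original)    # True: the name is repaired
```

Step 1 works here only because every byte in the sample name maps to a character that exists in both CP850 and ISO-8859-1; names containing other bytes may not be recoverable this way.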
<xref linkend="nano"/>

The current stable version of Nano (&nano-version;) does not support the UTF-8 character encoding. A development version that addresses this is available. This version can be downloaded at . Instructions for installing this version are the same as those found on the page.