- Timestamp:
- 10/28/2006 07:13:18 AM
- Branches:
- 10.0, 10.1, 11.0, 11.1, 11.2, 11.3, 12.0, 12.1, 6.2, 6.2.0, 6.2.0-rc1, 6.2.0-rc2, 6.3, 6.3-rc1, 6.3-rc2, 6.3-rc3, 7.10, 7.4, 7.5, 7.6, 7.6-blfs, 7.6-systemd, 7.7, 7.8, 7.9, 8.0, 8.1, 8.2, 8.3, 8.4, 9.0, 9.1, basic, bdubbs/svn, elogind, gnome, kde5-13430, kde5-14269, kde5-14686, kea, ken/TL2024, ken/inkscape-core-mods, ken/tuningfonts, krejzi/svn, lazarus, lxqt, nosym, perl-modules, plabs/newcss, plabs/python-mods, python3.11, qt5new, rahul/power-profiles-daemon, renodr/vulkan-addition, systemd-11177, systemd-13485, trunk, upgradedb, xry111/intltool, xry111/llvm18, xry111/soup3, xry111/test-20220226, xry111/xf86-video-removal
- Children:
- 0952b7d8
- Parents:
- 3aeb033
- Location:
- general/sysutils
- Files:
- 2 edited
general/sysutils/mc.xml
--- general/sysutils/mc.xml (r3aeb033)
+++ general/sysutils/mc.xml (r86eaa277)
@@ -37,8 +37,12 @@
 
 <caution>
-<para>The <application>MC</application> package has some issues when
-used in a UTF-8 based locale. For a full explanation of the issues, see
-the <xref linkend="locale-mc"/> section of the
-<xref linkend="locale-issues"/>.</para>
+<para>The <application>MC</application> package has major issues when
+used in a UTF-8 based locale because it assumes the characters are
+always one byte wide. See <ulink url="&files-anduin;/mc-bad.png">this
+screenshot</ulink> (taken in a ru_RU.UTF-8 locale).
+See the <ulink url="&blfs-wiki;/MC">MC Wiki</ulink> page for a way
+to work around these problems.
+For a general discussion of these types of issues, see
+the <xref linkend="locale-issues"/> page.</para>
 </caution>
 
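Reviewer note: the new caution text says MC assumes every character is one byte wide. A minimal Python sketch of why that assumption breaks in a UTF-8 locale follows. The filename is a hypothetical example, not taken from the changeset, and the sketch only compares counts; it does not reproduce MC's actual rendering code.

```python
# Hypothetical filename for illustration; it does not come from the changeset.
name = "файл.txt"

# Number of characters a user sees on screen.
char_count = len(name)

# Number of bytes the same name occupies when encoded as UTF-8.
# Each Cyrillic letter takes two bytes, each ASCII character one.
byte_count = len(name.encode("utf-8"))

# A program that equates bytes with characters would lay out
# byte_count columns for a name that is only char_count wide.
print(char_count, byte_count)  # prints: 8 12
```

In an 8-bit locale such as ru_RU.KOI8-R the two counts are equal, which is why the problem only shows up under UTF-8.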
general/sysutils/unzip.xml
--- general/sysutils/unzip.xml (r3aeb033)
+++ general/sysutils/unzip.xml (r86eaa277)
@@ -39,7 +39,8 @@
 <caution>
 <para>The <application>UnZip</application> package has some locale
-related issues. For a full explanation of the issues and some possible
-solutions, see the <xref linkend="locale-unzip"/> section of the
-<xref linkend="locale-issues"/>.</para>
+related issues. See the discussion below in the
+<xref linkend="unzip-locale-issues"/> section. A more general
+discussion of these problems can be found on the
+<xref linkend="locale-issues"/> page.</para>
 </caution>
 
@@ -68,4 +69,78 @@
 <para condition="html" role="usernotes">User Notes:
 <ulink url="&blfs-wiki;/unzip"/></para>
+
+</sect2>
+
+<sect2 id="unzip-locale-issues">
+<title>UnZip Locale Issues</title>
+
+<note>
+<para>Use of <application>UnZip</application> in the
+<application>JDK</application>, <application>Mozilla</application>,
+<application>DocBook</application> or any other BLFS package
+installation is not a problem, as BLFS instructions never use
+<application>UnZip</application> to extract a file with non-ASCII
+characters in the file's name.</para>
+</note>
+
+<para>The <application>UnZip</application> package assumes that filenames
+stored in the ZIP archives created on non-Unix systems are encoded in
+CP850, and that they should be converted to ISO-8859-1 when writing files
+onto the filesystem. Such assumptions are not always valid. In fact,
+inside the ZIP archive, filenames are encoded in the DOS codepage that is
+in use in the relevant country, and the filenames on disk should be in
+the locale encoding. In MS Windows, the OemToChar() C function (from
+<filename>User32.DLL</filename>) does the correct conversion (which is
+indeed the conversion from CP850 to a superset of ISO-8859-1 if MS
+Windows is set up to use the US English language), but there is no
+equivalent in Linux.</para>
+
+<para>When using <command>unzip</command> to unpack a ZIP archive
+containing non-ASCII filenames, the filenames are damaged because
+<command>unzip</command> uses improper conversion when any of its
+encoding assumptions are incorrect. For example, in the ru_RU.KOI8-R
+locale, conversion of filenames from CP866 to KOI8-R is required, but
+conversion from CP850 to ISO-8859-1 is done, which produces filenames
+consisting of undecipherable characters instead of words (the closest
+equivalent understandable example for English-only users is rot13). There
+are several ways around this limitation:</para>
+
+<para>1) For unpacking ZIP archives with filenames containing non-ASCII
+characters, use <ulink url="http://www.winzip.com/">WinZip</ulink> while running the <ulink url="http://www.winehq.com/">Wine</ulink> Windows
+emulator.</para>
+
+<para>2) After running <command>unzip</command>, fix the damage made to
+the filenames using the <command>convmv</command> tool
+(<ulink url="http://j3e.de/linux/convmv/"/>). The following is an example
+for the ru_RU.KOI8-R locale:</para>
+
+<blockquote>
+<para>Step 1. Undo the conversion done by
+<command>unzip</command>:</para>
+
+<screen><userinput>convmv -f iso-8859-1 -t cp850 -r --nosmart --notest \
+<replaceable>&lt;/path/to/unzipped/files&gt;</replaceable></userinput></screen>
+
+<para>Step 2. Do the correct conversion instead:</para>
+
+<screen><userinput>convmv -f cp866 -t koi8-r -r --nosmart --notest \
+<replaceable>&lt;/path/to/unzipped/files&gt;</replaceable></userinput></screen>
+</blockquote>
+
+<para>3) Apply this patch to unzip:
+<ulink url="https://bugzilla.altlinux.ru/attachment.cgi?id=532"/></para>
+
+<para>It allows to specify the assumed filename encoding in the ZIP
+archive using the <option>-O charset_name</option> option and the
+on-disk filename encoding using the <option>-I charset_name</option>
+option. Defaults: the on-disk filename encoding is the locale encoding,
+the encoding inside the ZIP archive is guessed according to the builtin
+table based on the locale encoding. For US English users, this still
+means that unzip converts from CP850 to ISO-8859-1 by default.</para>
+
+<para>Caveat: this method works only with 8-bit locale encodings, not
+with UTF-8. Attempting to use a patched <command>unzip</command> in UTF-8
+locales may result in a segmentation fault and is probably a security
+risk.</para>
 
 </sect2>
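Reviewer note: the ru_RU.KOI8-R example in the added section can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the filename is hypothetical, the function names are invented for illustration, and the first function models the CP850 to ISO-8859-1 conversion as the text describes it rather than reproducing unzip's actual code. The second function mirrors the two convmv steps from workaround 2.

```python
def unzip_default_conversion(stored: bytes) -> bytes:
    """Model of the conversion described above: bytes from the archive are
    treated as CP850 and written to disk as ISO-8859-1 (hypothetical helper)."""
    return stored.decode("cp850").encode("iso-8859-1")


def undo_and_fix(on_disk: bytes, archive_enc: str = "cp866",
                 locale_enc: str = "koi8-r") -> bytes:
    """Mirror of the two convmv steps (hypothetical helper)."""
    # Step 1: undo the wrong conversion (ISO-8859-1 back to CP850),
    # which recovers the bytes exactly as stored in the archive.
    stored = on_disk.decode("iso-8859-1").encode("cp850")
    # Step 2: do the correct conversion (CP866 to KOI8-R).
    return stored.decode(archive_enc).encode(locale_enc)


# Hypothetical filename as it would be stored by a Russian DOS/Windows system.
original = "Привет.txt"
stored_in_zip = original.encode("cp866")

damaged = unzip_default_conversion(stored_in_zip)
repaired = undo_and_fix(damaged)

# In a KOI8-R locale the on-disk bytes are interpreted as KOI8-R.
print(damaged.decode("koi8-r") == original)   # prints: False
print(repaired.decode("koi8-r") == original)  # prints: True
```

The round trip in step 1 works for this name because every character it produces exists in both CP850 and ISO-8859-1; names containing bytes without an ISO-8859-1 counterpart may not be recoverable this way.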