source: general/sysutils/unzip.xml

Last change on this file was af2b317, checked in by Xi Ruoyao <xry111@…>, 4 days ago

Punctuation/comma vs. quote clean up

We've done this several times but there are still remaining cases.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
   "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../../general.ent">
  %general-entities;

  <!ENTITY unzip-download-http "&sourceforge-dl;/infozip/unzip60.tar.gz">
  <!ENTITY unzip-download-ftp " ">
  <!ENTITY unzip-md5sum "62b490407489521db863b523a7f86375">
  <!ENTITY unzip-size "1.3 MB">
  <!ENTITY unzip-buildsize "9 MB">
  <!ENTITY unzip-time "less than 0.1 SBU">
]>

<sect1 id="unzip" xreflabel="UnZip-&unzip-version;">
  <?dbhtml filename="unzip.html"?>


  <title>UnZip-&unzip-version;</title>

  <indexterm zone="unzip">
    <primary sortas="a-UnZip">UnZip</primary>
  </indexterm>

  <sect2 role="package">
    <title>Introduction to UnZip</title>

    <para>
      The <application>UnZip</application> package contains
      <filename>ZIP</filename> extraction utilities. These are useful for
      extracting files from <filename>ZIP</filename> archives.
      <filename>ZIP</filename> archives are created with
      <application>PKZIP</application> or <application>Info-ZIP</application>
      utilities, primarily in a DOS environment.
    </para>

    &lfs121_checked;

    <caution>
      <para>
        The previous version of the <application>UnZip</application>
        package had some locale-related issues. Currently there are no BLFS
        editors capable of testing these locale issues. Therefore, the
        locale-related information is left on this page, but has not been
        tested. A more general discussion of these problems can be found in
        the <xref linkend="locale-assumed-encoding"/> section of the <xref
        linkend="locale-issues"/> page.
      </para>
    </caution>

    <bridgehead renderas="sect3">Package Information</bridgehead>
    <itemizedlist spacing="compact">
      <listitem>
        <para>
          Download (HTTP): <ulink url="&unzip-download-http;"/>
        </para>
      </listitem>
      <listitem>
        <para>
          Download (FTP): <ulink url="&unzip-download-ftp;"/>
        </para>
      </listitem>
      <listitem>
        <para>
          Download MD5 sum: &unzip-md5sum;
        </para>
      </listitem>
      <listitem>
        <para>
          Download size: &unzip-size;
        </para>
      </listitem>
      <listitem>
        <para>
          Estimated disk space required: &unzip-buildsize;
        </para>
      </listitem>
      <listitem>
        <para>
          Estimated build time: &unzip-time;
        </para>
      </listitem>
    </itemizedlist>

    <bridgehead renderas="sect3">Additional Downloads</bridgehead>
    <itemizedlist spacing='compact'>
      <listitem>
        <para>
          Required patch: <ulink
          url="&patch-root;/unzip-&unzip-version;-consolidated_fixes-1.patch"/>
        </para>
      </listitem>
      <listitem>
        <para>
          Required patch: <ulink
          url="&patch-root;/unzip-&unzip-version;-gcc14-1.patch"/>
        </para>
      </listitem>
    </itemizedlist>

  </sect2>

  <sect2 id="unzip-locale-issues">
    <title>UnZip Locale Issues</title>

    <note>
      <para>
        Use of <application>UnZip</application> in the
        <application>JDK</application>, <application>Mozilla</application>,
        <application>DocBook</application> or any other BLFS package
        installation is not a problem, as BLFS instructions never use
        <application>UnZip</application> to extract a file with non-ASCII
        characters in the file's name.
      </para>
    </note>

    <para>
      These issues are thought to be fixed by the patch, but since none
      of the editors has data to test this, the following workarounds are
      retained in case they are still needed.
    </para>

    <para>
      The <application>UnZip</application> package assumes that filenames
      stored in the ZIP archives created on non-Unix systems are encoded in
      CP850, and that they should be converted to ISO-8859-1 when writing files
      onto the filesystem. Such assumptions are not always valid. In fact,
      inside the ZIP archive, filenames are encoded in the DOS codepage that is
      in use in the relevant country, and the filenames on disk should be in
      the locale encoding. In MS Windows, the OemToChar() C function (from
      <filename>User32.DLL</filename>) does the correct conversion (which is
      indeed the conversion from CP850 to a superset of ISO-8859-1 if MS
      Windows is set up to use the US English language), but there is no
      equivalent in Linux.
    </para>
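
    <para>
      The effect of a wrong codepage assumption can be reproduced outside
      <command>unzip</command>. The following is only a sketch and not part
      of the BLFS instructions. It assumes an <command>iconv</command>
      (such as the one from glibc) that knows CP850 and CP866, and it
      converts to UTF-8 rather than to ISO-8859-1 or KOI8-R purely so that
      the result is readable on a UTF-8 terminal. The sample bytes and the
      <literal>stored_name</literal> helper are made up for illustration.
    </para>

```shell
# Bytes of the Russian word for "test" as a DOS tool using CP866 would
# store them in an archive (hypothetical sample name).
stored_name() { printf '\342\245\341\342'; }

# The assumption described above: the bytes are treated as CP850.
wrong=$(stored_name | iconv -f CP850 -t UTF-8)

# What a Russian user actually needs: the bytes are treated as CP866.
right=$(stored_name | iconv -f CP866 -t UTF-8)

printf 'assumed CP850: %s\n' "$wrong"
printf 'actual CP866:  %s\n' "$right"
```

    <para>
      Under these assumptions the first line shows four accented Latin
      letters, while the second shows the original Cyrillic word. The bytes
      are identical in both cases; only the assumed source encoding differs.
    </para>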

    <para>
      When <command>unzip</command> unpacks a ZIP archive containing
      non-ASCII filenames, the filenames are damaged whenever one of its
      encoding assumptions is incorrect, because it then applies the wrong
      conversion. For example, in the ru_RU.KOI8-R locale, filenames need
      to be converted from CP866 to KOI8-R, but they are converted from
      CP850 to ISO-8859-1 instead, which produces filenames consisting of
      undecipherable characters instead of words (the closest equivalent
      understandable example for English-only users is rot13). There are
      several ways around this limitation:
    </para>

    <para>
      1) For unpacking ZIP archives with filenames containing non-ASCII
      characters, use <ulink url="https://www.winzip.com/">WinZip</ulink>
      running under the <ulink url="https://www.winehq.org/">Wine</ulink>
      Windows compatibility layer.
    </para>

    <para>
      2) Use <command>bsdtar -xf</command> from
      <xref role="nodep" linkend="libarchive"/> to unpack the ZIP archive.
      Then fix the damage done to the filenames using the
      <command>convmv</command> tool
      (<ulink url="https://j3e.de/linux/convmv/"/>). The following is an example
      for the zh_CN.UTF-8 locale:
    </para>

<screen><userinput>convmv -f cp936 -t utf-8 -r --nosmart --notest \
    <replaceable>&lt;/path/to/unzipped/files&gt;</replaceable></userinput></screen>
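
    <para>
      What this renaming amounts to can be sketched with
      <command>iconv</command> and <command>mv</command>. This is an
      illustrative stand-in, not how <command>convmv</command> is
      implemented and not a replacement for it: it handles a single
      directory level only, and it assumes an <command>iconv</command> that
      knows CP936 and a filesystem that accepts arbitrary bytes in
      filenames. The scratch directory and the sample filename are made up
      for the example.
    </para>

```shell
# Scratch directory holding one file whose name is two Chinese characters
# plus ".txt", encoded in CP936 (hypothetical sample, as left by extraction).
demo_dir=$(mktemp -d)
touch "$demo_dir/$(printf '\326\320\316\304').txt"

# Re-encode every name in the directory from CP936 to UTF-8.
for old in "$demo_dir"/*; do
    base=$(basename "$old")
    # Skip names that iconv cannot convert instead of renaming blindly.
    new=$(printf '%s' "$base" | iconv -f CP936 -t UTF-8) || continue
    # Pure-ASCII names come out unchanged and need no rename.
    [ "$base" = "$new" ] || mv -- "$old" "$demo_dir/$new"
done

ls "$demo_dir"
```

    <para>
      With <command>convmv</command> itself, leaving out
      <option>--notest</option> makes it only report the renames it would
      perform, which is a convenient dry run before changing anything.
    </para>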
<!--
    <para>
      3) Apply the optional
      <filename>unzip-5.50-alt-iconv-v1.1.patch</filename> patch to
      <application>UnZip</application>. It will apply with some offsets.
    </para>

    <para>
      It allows specifying the assumed filename encoding in the ZIP
      archive using the <option>-O charset_name</option> option and the
      on-disk filename encoding using the <option>-I charset_name</option>
      option. Defaults: the on-disk filename encoding is the locale encoding,
      the encoding inside the ZIP archive is guessed according to the built-in
      table based on the locale encoding. For US English users, this still
      means that unzip converts from CP850 to ISO-8859-1 by default.
    </para>

    <para>
      Caveat: this method works only with 8-bit locale encodings, not
      with UTF-8. Attempting to use a patched <command>unzip</command> in UTF-8
      locales may result in a segmentation fault and is probably a security
      risk.
    </para>
-->
  </sect2>

  <sect2 role="installation">
    <title>Installation of UnZip</title>

    <para>
      First apply the patches:
    </para>

<screen><userinput remap="pre">patch -Np1 -i ../unzip-&unzip-version;-consolidated_fixes-1.patch
patch -Np1 -i ../unzip-&unzip-version;-gcc14-1.patch</userinput></screen>

    <para>
      Now compile the package:
    </para>

<screen><userinput>make -f unix/Makefile generic</userinput></screen>

    <para>
      The test suite does not work for the <literal>generic</literal> target.
    </para>

    <para>
      Now, as the <systemitem class="username">root</systemitem> user:
    </para>

<screen role="root"><userinput>make prefix=/usr MANDIR=/usr/share/man/man1 \
    -f unix/Makefile install</userinput></screen>

  </sect2>

  <sect2 role="commands">
    <title>Command Explanations</title>

    <para>
      <command>make -f unix/Makefile generic</command>:
      Unlike the older targets such as linux and linux_noasm, this target
      begins by running a configure script, which creates a flags file that
      is then used in the build. This ensures that the 32-bit x86 build
      receives the right flags to unzip files which are larger than 2 GB
      when extracted.
    </para>

  </sect2>

  <sect2 role="content">
    <title>Contents</title>

    <segmentedlist>
      <segtitle>Installed Programs</segtitle>
      <segtitle>Installed Libraries</segtitle>
      <segtitle>Installed Directories</segtitle>

      <seglistitem>
        <seg>funzip, unzip, unzipsfx, zipgrep, and zipinfo</seg>
        <seg>None</seg>
        <seg>None</seg>
      </seglistitem>
    </segmentedlist>

    <variablelist>
      <bridgehead renderas="sect3">Short Descriptions</bridgehead>
      <?dbfo list-presentation="list"?>
      <?dbhtml list-presentation="table"?>

      <varlistentry id="funzip">
        <term><command>funzip</command></term>
        <listitem>
          <para>
            acts as a filter: it reads a <filename>ZIP</filename> archive
            from standard input and writes the first member of the archive
            to standard output, so archives can be extracted in a pipeline
          </para>
          <indexterm zone="unzip funzip">
            <primary sortas="b-funzip">funzip</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="unzip-prog">
        <term><command>unzip</command></term>
        <listitem>
          <para>
            lists, tests or extracts files from a <filename>ZIP</filename>
            archive
          </para>
          <indexterm zone="unzip unzip-prog">
            <primary sortas="b-unzip">unzip</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="unzipfsx">
        <term><command>unzipsfx</command></term>
        <listitem>
          <para>
            is a self-extracting stub that can be prepended to a
            <filename>ZIP</filename> archive. Files in this format allow the
            recipient to decompress the archive without installing
            <application>UnZip</application>
          </para>
          <indexterm zone="unzip unzipfsx">
            <primary sortas="b-unzipsfx">unzipsfx</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="zipgrep">
        <term><command>zipgrep</command></term>
        <listitem>
          <para>
            searches files in a <filename>ZIP</filename> archive for
            lines matching a pattern
          </para>
          <indexterm zone="unzip zipgrep">
            <primary sortas="b-zipgrep">zipgrep</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="zipinfo">
        <term><command>zipinfo</command></term>
        <listitem>
          <para>
            produces technical information about the files in a
            <filename>ZIP</filename> archive, such as file access permissions,
            encryption status, and type of compression
          </para>
          <indexterm zone="unzip zipinfo">
            <primary sortas="b-zipinfo">zipinfo</primary>
          </indexterm>
        </listitem>
      </varlistentry>
<!--
      <varlistentry id="libunzip">
        <term><filename class='libraryfile'>libunzip.so</filename></term>
        <listitem>
          <para>
            contains the API functions required by the
            <application>UnZip</application> programs.
          </para>
          <indexterm zone="unzip libunzip">
            <primary sortas="c-libunzip">libunzip.so</primary>
          </indexterm>
        </listitem>
      </varlistentry>
-->
    </variablelist>

  </sect2>

</sect1>