source: general/sysutils/unzip.xml

trunk
Last change on this file was 133eab2, checked in by Bruce Dubbs <bdubbs@…>, 2 months ago

Initial LFS 12.1 tags

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
   "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../../general.ent">
  %general-entities;

  <!ENTITY unzip-download-http "&sourceforge-dl;/infozip/unzip60.tar.gz">
  <!ENTITY unzip-download-ftp " ">
  <!ENTITY unzip-md5sum "62b490407489521db863b523a7f86375">
  <!ENTITY unzip-size "1.3 MB">
  <!ENTITY unzip-buildsize "9 MB">
  <!ENTITY unzip-time "less than 0.1 SBU">
]>

<sect1 id="unzip" xreflabel="UnZip-&unzip-version;">
  <?dbhtml filename="unzip.html"?>


  <title>UnZip-&unzip-version;</title>

  <indexterm zone="unzip">
    <primary sortas="a-UnZip">UnZip</primary>
  </indexterm>

  <sect2 role="package">
    <title>Introduction to UnZip</title>

    <para>
      The <application>UnZip</application> package contains
      <filename>ZIP</filename> extraction utilities. These are useful for
      extracting files from <filename>ZIP</filename> archives.
      <filename>ZIP</filename> archives are created with
      <application>PKZIP</application> or <application>Info-ZIP</application>
      utilities, primarily in a DOS environment.
    </para>

    &lfs121_checked;

    <caution>
      <para>
        The previous version of the <application>UnZip</application>
        package had some locale-related issues. Currently there are no BLFS
        editors capable of testing these locale issues. Therefore, the
        locale-related information is left on this page, but has not been
        tested. A more general discussion of these problems can be found in
        the <xref linkend="locale-assumed-encoding"/> section of the <xref
        linkend="locale-issues"/> page.
      </para>
    </caution>

    <bridgehead renderas="sect3">Package Information</bridgehead>
    <itemizedlist spacing="compact">
      <listitem>
        <para>
          Download (HTTP): <ulink url="&unzip-download-http;"/>
        </para>
      </listitem>
      <listitem>
        <para>
          Download (FTP): <ulink url="&unzip-download-ftp;"/>
        </para>
      </listitem>
      <listitem>
        <para>
          Download MD5 sum: &unzip-md5sum;
        </para>
      </listitem>
      <listitem>
        <para>
          Download size: &unzip-size;
        </para>
      </listitem>
      <listitem>
        <para>
          Estimated disk space required: &unzip-buildsize;
        </para>
      </listitem>
      <listitem>
        <para>
          Estimated build time: &unzip-time;
        </para>
      </listitem>
    </itemizedlist>
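
    <para>
      The MD5 sum listed above can be checked before the tarball is
      unpacked. The following is a minimal Python sketch, assuming the
      tarball was saved as <filename>unzip60.tar.gz</filename> in the current
      directory; the function names are illustrative and not part of any
      BLFS tooling.
    </para>

```python
import hashlib

# Value of the unzip-md5sum entity defined at the top of this page.
EXPECTED_MD5 = "62b490407489521db863b523a7f86375"


def md5_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so that large files are never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_is_intact(path: str = "unzip60.tar.gz") -> bool:
    """Compare the digest of the downloaded file with the published one."""
    return md5_of(path) == EXPECTED_MD5
```

    <para>
      Calling <literal>tarball_is_intact()</literal> returns
      <literal>True</literal> only when the digest of the file matches the
      value listed on this page.
    </para>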

    <bridgehead renderas="sect3">Additional Downloads</bridgehead>
    <itemizedlist spacing='compact'>
      <listitem>
        <para>
          Required patch: <ulink
          url="&patch-root;/unzip-&unzip-version;-consolidated_fixes-1.patch"/>
        </para>
      </listitem>
    </itemizedlist>

  </sect2>

  <sect2 id="unzip-locale-issues">
    <title>UnZip Locale Issues</title>

    <note>
      <para>
        Use of <application>UnZip</application> in the
        <application>JDK</application>, <application>Mozilla</application>,
        <application>DocBook</application> or any other BLFS package
        installation is not a problem, as BLFS instructions never use
        <application>UnZip</application> to extract a file with non-ASCII
        characters in the file's name.
      </para>
    </note>

    <para>
      These issues are thought to be fixed by the patch, but since none
      of the editors have data to test this, the following workarounds are
      retained in case they are still needed.
    </para>

    <para>
      The <application>UnZip</application> package assumes that filenames
      stored in ZIP archives created on non-Unix systems are encoded in
      CP850, and that they should be converted to ISO-8859-1 when writing files
      onto the filesystem. Such assumptions are not always valid. In fact,
      inside the ZIP archive, filenames are encoded in the DOS codepage that is
      in use in the relevant country, and the filenames on disk should be in
      the locale encoding. In MS Windows, the OemToChar() C function (from
      <filename>User32.DLL</filename>) does the correct conversion (which is
      indeed the conversion from CP850 to a superset of ISO-8859-1 if MS
      Windows is set up to use the US English language), but there is no
      equivalent in Linux.
    </para>

    <para>
      When using <command>unzip</command> to unpack a ZIP archive
      containing non-ASCII filenames, the filenames are damaged because
      <command>unzip</command> uses improper conversion when any of its
      encoding assumptions are incorrect. For example, in the ru_RU.KOI8-R
      locale, conversion of filenames from CP866 to KOI8-R is required, but
      conversion from CP850 to ISO-8859-1 is done, which produces filenames
      consisting of undecipherable characters instead of words (the closest
      equivalent understandable example for English-only users is rot13). There
      are several ways around this limitation:
    </para>

    <para>
      1) For unpacking ZIP archives with filenames containing non-ASCII
      characters, use <ulink url="https://www.winzip.com/">WinZip</ulink> while
      running under <ulink url="https://www.winehq.org/">Wine</ulink>, the
      Windows compatibility layer.
    </para>

    <para>
      2) Use <command>bsdtar -xf</command> from
      <xref role="nodep" linkend="libarchive"/> to unpack the ZIP archive.
      Then repair the damaged filenames using the <command>convmv</command>
      tool (<ulink url="https://j3e.de/linux/convmv/"/>). The following is an
      example for the zh_CN.UTF-8 locale:
    </para>

<screen><userinput>convmv -f cp936 -t utf-8 -r --nosmart --notest \
      <replaceable>&lt;/path/to/unzipped/files&gt;</replaceable></userinput></screen>
<!--
    <para>
      3) Apply the optional
      <filename>unzip-5.50-alt-iconv-v1.1.patch</filename> patch to
      <application>UnZip</application>. It will apply with some offsets.
    </para>

    <para>
      It allows specifying the assumed filename encoding in the ZIP
      archive using the <option>-O charset_name</option> option and the
      on-disk filename encoding using the <option>-I charset_name</option>
      option. Defaults: the on-disk filename encoding is the locale encoding,
      the encoding inside the ZIP archive is guessed according to the builtin
      table based on the locale encoding. For US English users, this still
      means that unzip converts from CP850 to ISO-8859-1 by default.
    </para>

    <para>
      Caveat: this method works only with 8-bit locale encodings, not
      with UTF-8. Attempting to use a patched <command>unzip</command> in UTF-8
      locales may result in a segmentation fault and is probably a security
      risk.
    </para>
-->
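
    <para>
      The mis-conversion described in this section can be reproduced without
      any archive at hand. The following Python sketch is illustrative only:
      the sample filename and the function names are assumptions, not part of
      <application>UnZip</application>. It models the CP850 to ISO-8859-1
      conversion described above next to the CP866 to KOI8-R conversion that
      the ru_RU.KOI8-R locale actually needs.
    </para>

```python
# Hypothetical filename, as typed by a user on a Russian DOS/Windows system.
SAMPLE_NAME = "Привет.txt"


def as_unzip_converts(stored: bytes) -> bytes:
    """Model the assumption described above: CP850 in, ISO-8859-1 out."""
    # CP850 characters with no ISO-8859-1 equivalent are replaced by "?".
    return stored.decode("cp850").encode("iso-8859-1", errors="replace")


def needed_for_koi8r(stored: bytes) -> bytes:
    """The conversion a ru_RU.KOI8-R system needs: CP866 in, KOI8-R out."""
    return stored.decode("cp866").encode("koi8-r")


if __name__ == "__main__":
    # Bytes as a Russian DOS tool would store the name inside the archive.
    stored = SAMPLE_NAME.encode("cp866")
    # How each result looks when listed in a KOI8-R terminal.
    print("wrong:", as_unzip_converts(stored).decode("koi8-r", errors="replace"))
    print("right:", needed_for_koi8r(stored).decode("koi8-r"))
```

    <para>
      Pure ASCII names pass through both conversions unchanged, which is
      consistent with the note above that BLFS package installations are not
      affected.
    </para>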
  </sect2>

  <sect2 role="installation">
    <title>Installation of UnZip</title>

    <para>
      First apply the patch:
    </para>

<screen><userinput remap="pre">patch -Np1 -i ../unzip-&unzip-version;-consolidated_fixes-1.patch</userinput></screen>

    <para>
      Now compile the package:
    </para>

<screen><userinput>make -f unix/Makefile generic</userinput></screen>

    <para>
      The test suite does not work for target <quote>generic</quote>.
    </para>

    <para>
      Now, as the <systemitem class="username">root</systemitem> user:
    </para>

<screen role="root"><userinput>make prefix=/usr MANDIR=/usr/share/man/man1 \
 -f unix/Makefile install</userinput></screen>

  </sect2>

  <sect2 role="commands">
    <title>Command Explanations</title>

    <para>
      <command>make -f unix/Makefile generic</command>:
      Unlike the older targets such as linux and linux_noasm, this target
      begins by running a configure script, which creates a flags file that
      is then used in the build. This ensures that the 32-bit x86 build
      receives the right flags to unzip files which are larger than 2 GB
      when extracted.
    </para>

  </sect2>

  <sect2 role="content">
    <title>Contents</title>

    <segmentedlist>
      <segtitle>Installed Programs</segtitle>
      <segtitle>Installed Libraries</segtitle>
      <segtitle>Installed Directories</segtitle>

      <seglistitem>
        <seg>funzip, unzip, unzipfsx, zipgrep, and zipinfo</seg>
        <seg>None</seg>
        <seg>None</seg>
      </seglistitem>
    </segmentedlist>

    <variablelist>
      <bridgehead renderas="sect3">Short Descriptions</bridgehead>
      <?dbfo list-presentation="list"?>
      <?dbhtml list-presentation="table"?>

      <varlistentry id="funzip">
        <term><command>funzip</command></term>
        <listitem>
          <para>
            acts as a filter: it extracts the first member of a
            <filename>ZIP</filename> archive read from standard input (or
            from a named file) and writes it to standard output, so that
            it can be used in a pipeline
          </para>
          <indexterm zone="unzip funzip">
            <primary sortas="b-funzip">funzip</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="unzip-prog">
        <term><command>unzip</command></term>
        <listitem>
          <para>
            lists, tests or extracts files from a <filename>ZIP</filename>
            archive
          </para>
          <indexterm zone="unzip unzip-prog">
            <primary sortas="b-unzip">unzip</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="unzipfsx">
        <term><command>unzipfsx</command></term>
        <listitem>
          <para>
            is a self-extracting stub that can be prepended to a
            <filename>ZIP</filename> archive. Files in this format allow the
            recipient to decompress the archive without installing
            <application>UnZip</application>
          </para>
          <indexterm zone="unzip unzipfsx">
            <primary sortas="b-unzipfsx">unzipfsx</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="zipgrep">
        <term><command>zipgrep</command></term>
        <listitem>
          <para>
            searches files in a <filename>ZIP</filename> archive for
            lines matching a pattern
          </para>
          <indexterm zone="unzip zipgrep">
            <primary sortas="b-zipgrep">zipgrep</primary>
          </indexterm>
        </listitem>
      </varlistentry>

      <varlistentry id="zipinfo">
        <term><command>zipinfo</command></term>
        <listitem>
          <para>
            produces technical information about the files in a
            <filename>ZIP</filename> archive, including file access permissions,
            encryption status, type of compression, etc.
          </para>
          <indexterm zone="unzip zipinfo">
            <primary sortas="b-zipinfo">zipinfo</primary>
          </indexterm>
        </listitem>
      </varlistentry>
<!--
      <varlistentry id="libunzip">
        <term><filename class='libraryfile'>libunzip.so</filename></term>
        <listitem>
          <para>
            contains the API functions required by the
            <application>UnZip</application> programs.
          </para>
          <indexterm zone="unzip libunzip">
            <primary sortas="c-libunzip">libunzip.so</primary>
          </indexterm>
        </listitem>
      </varlistentry>
-->
    </variablelist>

  </sect2>

</sect1>