source: postlfs/config/compressdoc.xml@ 4fb71d8e

Last change on this file since 4fb71d8e was 4fb71d8e, checked in by Randy McMurchy <randy@…>, 18 years ago

Standardized tar command parameters in the CompressDoc and TeX instructions

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@5312 af4574ff-66df-0310-9fd7-8a98e5e911e0

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
   "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../../general.ent">
  %general-entities;
]>

<sect1 id="compressdoc" xreflabel="Compressing man and info pages">
  <?dbhtml filename="compressdoc.html"?>

  <sect1info>
    <othername>$LastChangedBy$</othername>
    <date>$Date$</date>
  </sect1info>

  <title>Compressing Man and Info Pages</title>

  <indexterm zone="compressdoc">
    <primary sortas="b-compressdoc">compressdoc</primary>
  </indexterm>

  <para>Man and info reader programs can transparently process files compressed
  with <command>gzip</command> or <command>bzip2</command>, a feature you can
  use to free some disk space while keeping your documentation available.
  However, man directories tend to contain links (both hard and symbolic),
  which defeat simple approaches such as recursively calling
  <command>gzip</command> on them. A better way is to use the script
  below.</para>
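
  <para>To see why links are a problem, here is a minimal sketch that is not
  part of the script. It assumes <command>gzip</command> and
  <command>mktemp</command> are available; the function name
  <command>demo_dangling_link</command> and the page names are hypothetical,
  and everything happens in a temporary directory.</para>

```shell
#!/bin/bash
# Hypothetical demo: compress a man page naively and see what happens to a
# symlink that points at it. Nothing outside a temporary directory is touched.
demo_dangling_link() {
  local tmp
  tmp=$(mktemp -d) || return 1
  printf '.TH FOO 1\n' > "$tmp/foo.1"   # the real page
  ln -s foo.1 "$tmp/bar.1"              # an alias pointing at it
  gzip "$tmp/foo.1"                     # foo.1 is replaced by foo.1.gz
  if [ -e "$tmp/bar.1" ]; then          # -e follows the symlink
    echo "ok: bar.1 still resolves"
  else
    echo "broken: bar.1 no longer resolves"
  fi
  rm -rf "$tmp"
}

demo_dangling_link
```

  <para>The alias breaks because its target was renamed by the compressor,
  which is why each link has to be recreated against the compressed
  name.</para>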

<screen role="root"><userinput>cat &gt; /usr/sbin/compressdoc &lt;&lt; "EOF"
<literal>#!/bin/bash
# VERSION: 20050112.0027
#
# Compress (with bzip2 or gzip) all man pages in a hierarchy and
# update symlinks - By Marc Heerdink &lt;marc @ koelkast.net&gt;
#
# Modified to be able to gzip or bzip2 files as an option and to deal
# with all symlinks properly by Mark Hymers &lt;markh @ linuxfromscratch.org&gt;
#
# Modified 20030930 by Yann E. Morin &lt;yann.morin.1998 @ anciens.enib.fr&gt;
# to accept compression/decompression, to correctly handle hard-links,
# to allow for changing hard-links into soft- ones, to specify the
# compression level, to parse the man.conf for all occurrences of MANPATH,
# to allow for a backup, to allow to keep the newest version of a page.
#
# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the
# script.
# (Note: It is assumed that the script is in the user's PATH)
#
# Modified 20050112 by Randy McMurchy to shorten line lengths and
# correct grammar errors.
#
# TODO:
#     - choose a default compress method to be based on the available
#       tool : gzip or bzip2;
#     - offer an option to automagically choose the best compression
#       method on a per page basis (e.g. check which of
#       gzip/bzip2/whatever is the most effective, page per page);
#     - when a MANPATH env var exists, use this instead of /etc/man.conf
#       (useful for users to (de)compress their man pages);
#     - offer an option to restore a previous backup;
#     - add other compression engines (compress, zip, etc?). Needed?

# Funny enough, this function prints some help.
function help ()
{
  if [ -n "$1" ]; then
    echo "Unknown option : $1"
  fi
  ( echo "Usage: $MY_NAME &lt;comp_method&gt; [options] [dirs]" &amp;&amp; \
  cat &lt;&lt; EOT
Where comp_method is one of :
  --gzip, --gz, -g
  --bzip2, --bz2, -b
                Compress using gzip or bzip2.

  --decompress, -d
                Decompress the man pages.

  --backup      Specify a .tar backup shall be done for all directories.
                In case a backup already exists, it is saved as .tar.old
                prior to making the new backup. If a .tar.old backup
                exists, it is removed prior to saving the backup.
                In backup mode, no other action is performed.

And where options are :
  -1 to -9, --fast, --best
                The compression level, as accepted by gzip and bzip2.
                When not specified, uses the default compression level
                for the given method (-6 for gzip, and -9 for bzip2).
                Not used when in backup or decompress modes.

  --force, -F   Force (re-)compression, even if the previous one was
                the same method. Useful when changing the compression
                ratio. By default, a page will not be re-compressed if
                it ends with the same suffix as the method adds
                (.bz2 for bzip2, .gz for gzip).

  --soft, -S    Change hard-links into soft-links. Use with _caution_
                as the first encountered file will be used as a
                reference. Not used when in backup mode.

  --hard, -H    Change soft-links into hard-links. Not used when in
                backup mode.

  --conf=dir, --conf dir
                Specify the location of man.conf. Defaults to /etc.

  --verbose, -v Verbose mode, print the name of the directory being
                processed. Double the flag to turn it even more verbose,
                and to print the name of the file being processed.

  --fake, -f    Fakes it. Print the actual parameters $MY_NAME will use.

  dirs          A list of space-separated _absolute_ pathnames to the
                man directories. When empty, and only then, parse
                ${MAN_CONF}/man.conf for all occurrences of MANPATH.

Note about compression:
  There has been a discussion on blfs-support about compression ratios of
  both gzip and bzip2 on man pages, taking into account the hosting fs,
  the architecture, etc... Overall, the conclusion was that gzip
  was much more efficient on 'small' files, and bzip2 on 'big' files,
  small and big being very dependent on the content of the files.

  See the original post from Mickael A. Peters, titled
  "Bootable Utility CD", dated 20030409.1816(+0200), and subsequent posts:
  http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html

  On my system (x86, ext3), man pages were 35564KB before compression.
  gzip -9 compressed them down to 20372KB (57.28%), bzip2 -9 got down to
  19812KB (55.71%). That is a 1.57% gain in space. YMMV.

  What was not taken into consideration was the decompression speed. But
  does it make sense to? You gain fast access with uncompressed man
  pages, or you gain space at the expense of a slight overhead in time.
  Well, my P4-2.5GHz does not even let me notice this... :-)

EOT
) | less
}

# This function checks that the man page is unique amongst bzip2'd,
# gzip'd and uncompressed versions.
# $1 the directory in which the file resides
# $2 the file name for the man page
# Returns 0 (true) if the file is the latest and must be taken care of,
# and 1 (false) if the file is not the latest (and has therefore been
# deleted).
function check_unique ()
{
  # NB. When there are hard-links to this file, these are
  # _not_ deleted. In fact, if there are hard-links, they
  # all have the same date/time, thus making them ready
  # for deletion later on.

  # Build the list of all man pages with the same name
  DIR=$1
  BASENAME=`basename "${2}" .bz2`
  BASENAME=`basename "${BASENAME}" .gz`
  GZ_FILE="$BASENAME".gz
  BZ_FILE="$BASENAME".bz2

  # Look for, and keep, the most recent one
  LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" \
         2&gt;/dev/null | tail -n 1)`
  for i in "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}"; do
    [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
  done

  # In case the specified file was the latest, return 0
  [ "$LATEST" = "$2" ] &amp;&amp; return 0
  # If the file was not the latest, return 1
  return 1
}

# Name of the script
MY_NAME=`basename "$0"`

# OK, parse the command-line for arguments, and initialize to some
# sensible state, that is: don't change links state, parse
# /etc/man.conf, be most silent, search man.conf in /etc, and don't
# force (re-)compression.
COMP_METHOD=
COMP_SUF=
COMP_LVL=
FORCE_OPT=
LN_OPT=
MAN_DIR=
VERBOSE_LVL=0
BACKUP=no
FAKE=no
MAN_CONF=/etc
while [ -n "$1" ]; do
  case $1 in
    --gzip|--gz|-g)
      COMP_SUF=.gz
      COMP_METHOD=$1
      shift
      ;;
    --bzip2|--bz2|-b)
      COMP_SUF=.bz2
      COMP_METHOD=$1
      shift
      ;;
    --decompress|-d)
      COMP_SUF=
      COMP_LVL=
      COMP_METHOD=$1
      shift
      ;;
    -[1-9]|--fast|--best)
      COMP_LVL=$1
      shift
      ;;
    --force|-F)
      FORCE_OPT=-F
      shift
      ;;
    --soft|-S)
      LN_OPT=-S
      shift
      ;;
    --hard|-H)
      LN_OPT=-H
      shift
      ;;
    --conf=*)
      MAN_CONF=`echo "$1" | cut -d '=' -f2-`
      shift
      ;;
    --conf)
      MAN_CONF="$2"
      shift 2
      ;;
    --verbose|-v)
      let VERBOSE_LVL++
      shift
      ;;
    --backup)
      BACKUP=yes
      shift
      ;;
    --fake|-f)
      FAKE=yes
      shift
      ;;
    --help|-h)
      help
      exit 0
      ;;
    /*)
      MAN_DIR="${MAN_DIR} ${1}"
      shift
      ;;
    -*)
      help "$1"
      exit 1
      ;;
    *)
      echo "\"$1\" is not an absolute path name"
      exit 1
      ;;
  esac
done

# Redirections
case $VERBOSE_LVL in
  0)
    # 0, be silent
    DEST_FD0=/dev/null
    DEST_FD1=/dev/null
    VERBOSE_OPT=
    ;;
  1)
    # 1, be a bit verbose
    DEST_FD0=/dev/stdout
    DEST_FD1=/dev/null
    VERBOSE_OPT=-v
    ;;
  *)
    # 2 and above, be most verbose
    DEST_FD0=/dev/stdout
    DEST_FD1=/dev/stdout
    VERBOSE_OPT="-v -v"
    ;;
esac

# Note: on my machine, 'man --path' gives /usr/share/man twice, once
# with a trailing '/', once without.
if [ -z "$MAN_DIR" ]; then
  MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
            | sed 's/:/\\n/g' \
            | while read foo; do dirname "$foo"/.; done \
            | sort -u \
            | while read bar; do echo -n "$bar "; done`
fi

# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
if [ -z "$MAN_DIR" ]; then
  echo "No directory specified, and no directory found with \`man --path'"
  exit 1
fi

# Fake?
if [ "$FAKE" != "no" ]; then
  echo "Actual parameters used:"
  echo -n "Compression.......: "
  case $COMP_METHOD in
    --bzip2|--bz2|-b) echo -n "bzip2";;
    --gzip|--gz|-g) echo -n "gzip";;
    --decompress|-d) echo -n "decompressing";;
    *) echo -n "unknown";;
  esac
  echo " ($COMP_METHOD)"
  echo "Compression level.: $COMP_LVL"
  echo "Compression suffix: $COMP_SUF"
  echo -n "Force compression.: "
  [ "foo$FORCE_OPT" = "foo-F" ] &amp;&amp; echo "yes" || echo "no"
  echo "man.conf is.......: ${MAN_CONF}/man.conf"
  echo -n "Hard-links........: "
  [ "foo$LN_OPT" = "foo-S" ] &amp;&amp;
  echo "convert to soft-links" || echo "leave as is"
  echo -n "Soft-links........: "
  [ "foo$LN_OPT" = "foo-H" ] &amp;&amp;
  echo "convert to hard-links" || echo "leave as is"
  echo "Backup............: $BACKUP"
  echo "Faking (yes!).....: $FAKE"
  echo "Directories.......: $MAN_DIR"
  echo "Verbosity level...: $VERBOSE_LVL"
  exit 0
fi

# If no method was specified, print help
if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
  help
  exit 1
fi

# In backup mode, do the backup solely
if [ "$BACKUP" = "yes" ]; then
  for DIR in $MAN_DIR; do
    cd "${DIR}/.."
    DIR_NAME=`basename "${DIR}"`
    echo "Backing up $DIR..." &gt; $DEST_FD0
    [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
    [ -f "${DIR_NAME}.tar" ] &amp;&amp;
    mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
    tar -cvf "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
  done
  exit 0
fi

# I know MAN_DIR has only absolute path names
# I need to take into account the localized man, so I'm going recursive
for DIR in $MAN_DIR; do
  MEM_DIR=`pwd`
  cd "$DIR"
  for FILE in *; do
    # Fixes the case where the directory is empty
    if [ "foo$FILE" = "foo*" ]; then continue; fi

    # Fixes the case when hard-links see their compression scheme change
    # (from not compressed to compressed, or from bz2 to gz, or from gz
    # to bz2)
    # Also fixes the case when multiple versions of the page are present,
    # which are either compressed or not.
    if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi

    # Do not compress whatis files
    if [ "$FILE" = "whatis" ]; then continue; fi

    if [ -d "$FILE" ]; then
      cd "${MEM_DIR}" # Go back to where we ran "$0",
                      # in case "$0"=="./compressdoc" ...
      # We are going recursive to that directory
      echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
      # I need not pass --conf, as I specify the directory to work on
      # But I need exit in case of error
      "$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} \
      ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
      echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
      cd "$DIR" # Needed for the next iteration of the loop

    else # !dir
      if ! check_unique "$DIR" "$FILE"; then continue; fi

      # Check if the file is already compressed with the specified method
      BASE_FILE=`basename "$FILE" .gz`
      BASE_FILE=`basename "$BASE_FILE" .bz2`
      if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" \
         -a "foo${FORCE_OPT}" = "foo" ]; then continue; fi

      # If we have a symlink
      if [ -h "$FILE" ]; then
        case "$FILE" in
          *.bz2)
            EXT=bz2 ;;
          *.gz)
            EXT=gz ;;
          *)
            EXT=none ;;
        esac

        if [ ! "$EXT" = "none" ]; then
          # The sed expressions are quoted so that the dot stays escaped
          # and only a literal ".$EXT" suffix is stripped.
          LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 \
               | tr -d " " | sed "s/\.${EXT}$//"`
          NEWNAME=`echo "$FILE" | sed "s/\.${EXT}$//"`
          mv "$FILE" "$NEWNAME"
          FILE="$NEWNAME"
        else
          LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
        fi

        if [ "$LN_OPT" = "-H" ]; then
          # Change this soft-link into a hard- one
          rm -f "$FILE" &amp;&amp; ln "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
          chmod --reference "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
        else
          # Keep this soft-link a soft- one.
          rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
        fi
        echo "Relinked $FILE" &gt; $DEST_FD1

      # else if we have a plain file
      elif [ -f "$FILE" ]; then
        # Take care of hard-links: build the list of files hard-linked
        # to the one we are {de,}compressing.
        # NB. This is not optimum as the file will eventually be
        # compressed as many times as it has hard-links. But for now,
        # that's the safe way.
        inode=`ls -li "$FILE" | awk '{print $1}'`
        HLINKS=`find . \! -name "$FILE" -inum $inode`

        if [ -n "$HLINKS" ]; then
          # We have hard-links! Remove them now.
          for i in $HLINKS; do rm -f "$i"; done
        fi

        # Now take care of the file that has no hard-link
        # We do decompress first to re-compress with the selected
        # compression ratio later on...
        case "$FILE" in
          *.bz2)
            bunzip2 "$FILE"
            FILE=`basename "$FILE" .bz2`
            ;;
          *.gz)
            gunzip "$FILE"
            FILE=`basename "$FILE" .gz`
            ;;
        esac

        # Compress the file with the given compression ratio, if needed
        case $COMP_SUF in
          *bz2)
            bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
            ;;
          *gz)
            gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
            ;;
          *)
            echo "Uncompressed $FILE" &gt; $DEST_FD1
            ;;
        esac

        # If the file had hard-links, recreate those (either hard or soft)
        if [ -n "$HLINKS" ]; then
          for i in $HLINKS; do
            NEWFILE=`echo "$i" | sed 's/\.gz$//' | sed 's/\.bz2$//'`
            if [ "$LN_OPT" = "-S" ]; then
              # Make this hard-link a soft- one
              ln -s "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
            else
              # Keep the hard-link a hard- one
              ln "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
            fi
            # Really works only for hard-links. Harmless for soft-links
            chmod 644 "${NEWFILE}$COMP_SUF"
          done
        fi

      else
        # There is a problem when we get neither a symlink nor a plain
        # file. Obviously, we shall never ever come here... :-(
        echo -n "Whaooo... \"${DIR}/${FILE}\" is neither a symlink "
        echo "nor a plain file. Please check:"
        ls -l "${DIR}/${FILE}"
        exit 1
      fi
    fi
  done # for FILE
done # for DIR</literal>

EOF
chmod 755 /usr/sbin/compressdoc</userinput></screen>
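
  <para>The hard-link handling in the script relies on comparing inode
  numbers. The following is a minimal sketch of that lookup in isolation,
  assuming <command>ls -li</command> prints the inode number in its first
  field and that <command>find</command> supports <option>-inum</option>.
  The function name <command>list_hard_links</command> is hypothetical, and
  the output is sorted only to make it predictable.</para>

```shell
#!/bin/bash
# Hypothetical helper: list the other names that share an inode with a file.
#  $1 the directory to search
#  $2 the file name inside that directory
list_hard_links() {
  local dir file inode
  dir=$1
  file=$2
  # The first field of "ls -li" is the inode number.
  inode=$(ls -li "$dir/$file" | awk '{print $1}')
  # Every other name below $dir with that inode is a hard link to the file.
  (cd "$dir" || exit 1; find . \! -name "$file" -inum "$inode" | sort)
}
```

  <para>For example, <command>list_hard_links /usr/share/man/man1
  foo.1</command> would print every other name below that directory that is
  hard-linked to <filename>foo.1</filename> (a hypothetical page name).</para>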

  <para>Now, as <systemitem class="username">root</systemitem>, you can issue
  the command <command>compressdoc --bz2</command> to compress all your system man
  pages. You can also run <command>compressdoc --help</command> to get
  comprehensive help about what the script is able to do.</para>

  <para>Don't forget that a few programs, like the <application>X Window
  System</application> and <application>XEmacs</application>, also
  install their documentation in non-standard places (such as
  <filename class="directory">/usr/X11R6/man</filename>). Be sure
  to add these locations to the file <filename>/etc/man.conf</filename> as
  <envar>MANPATH</envar> <replaceable>[/path]</replaceable> lines.</para>

  <para>Example:</para>

<screen><literal>    ...
    MANPATH /usr/share/man
    MANPATH /usr/local/man
    MANPATH /usr/X11R6/man
    MANPATH /opt/qt/doc/man
    ...</literal></screen>
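
  <para>To check which directories such a file declares, a small sketch like
  the following may help. It assumes one <envar>MANPATH</envar> entry per
  line, as in the example above; the script itself asks
  <command>man --path</command> instead, so this is only a simplified
  cross-check. The function name <command>list_manpaths</command> is
  hypothetical.</para>

```shell
#!/bin/bash
# Hypothetical helper: print the directories named on MANPATH lines.
#  $1 the path to a man.conf-style file
list_manpaths() {
  # Only lines whose first field is exactly MANPATH are considered, so
  # comments and other directives are skipped. Duplicates are removed.
  awk '$1 == "MANPATH" { print $2 }' "$1" | LC_ALL=C sort -u
}
```

  <para>Running <command>list_manpaths /etc/man.conf</command> prints one
  directory per line, which makes a missing location easy to spot.</para>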

  <para>Generally, package installation systems do not compress man/info pages,
  which means you will need to run the script again after installing or
  upgrading packages if you want to keep the size of your documentation as
  small as possible. Running the script after upgrading a package is safe:
  when you have several versions of a page (for example, one compressed and
  one uncompressed), the most recent one is kept and the others are
  deleted.</para>
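
  <para>The selection of the most recent version can be sketched on its own
  as follows. This mirrors the <command>ls -1rt</command> approach used in the
  script's <command>check_unique</command> function but deletes nothing; the
  function name <command>latest_variant</command> is hypothetical.</para>

```shell
#!/bin/bash
# Hypothetical helper: report which variant of a page is the newest.
#  $1 the directory in which the page resides
#  $2 the uncompressed page name, such as foo.1
latest_variant() {
  local dir base
  dir=$1
  base=$2
  # ls -1rt lists oldest first, so the last line is the newest existing
  # file. Names that do not exist only produce an error message, which is
  # discarded.
  (cd "$dir" || exit 1
   ls -1rt "$base" "$base.gz" "$base.bz2" 2>/dev/null | tail -n 1)
}
```

  <para>If an upgrade installs a fresh uncompressed page next to an older
  compressed one, the uncompressed name is reported, and that is the copy the
  script would keep and compress.</para>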

</sect1>