source: postlfs/config/compressdoc.xml@ f8d632a

Last change on this file since f8d632a was f8d632a, checked in by Bruce Dubbs <bdubbs@…>, 20 years ago

New XML Chapter 3

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@2287 af4574ff-66df-0310-9fd7-8a98e5e911e0

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
  "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../../general.ent">
  %general-entities;
]>

<sect1 id="postlfs-config-compressdoc" xreflabel="compressdoc">
<?dbhtml filename="compressdoc.html"?>
<title>Compressing man and info pages</title>

<para>Man and info reader programs can transparently process pages compressed
with <command>gzip</command> or <command>bzip2</command>, a feature you can
use to free some disk space while keeping your documentation available.
However, man directories tend to contain links (both hard and symbolic),
which defeat simple approaches such as recursively calling
<command>gzip</command> on them. A better way to go is to use the script
below.
</para>
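
<para>To see why a plain recursive <command>gzip</command> is not enough, the
following minimal sketch builds a throw-away directory and runs
<command>gzip -r</command> on it. All file names are made up for the
illustration, and the sketch assumes that <command>mktemp</command> and
<command>gzip</command> are available. Nothing outside the temporary directory
is touched.</para>

```shell
#!/bin/bash
# Throw-away sandbox; every name below is hypothetical.
demo=$(mktemp -d)
mkdir "$demo/man1"

printf '.TH FOO 1\n' > "$demo/man1/foo.1"   # a regular page
ln -s foo.1 "$demo/man1/bar.1"              # a symbolic link to it
printf '.TH BAZ 1\n' > "$demo/man1/baz.1"   # a second page ...
ln "$demo/man1/baz.1" "$demo/man1/qux.1"    # ... with a hard link

# gzip complains about both kinds of link and returns non-zero,
# hence the "|| true".
gzip -q -r "$demo/man1" || true

# foo.1 became foo.1.gz, so bar.1 now points at a file that is gone.
# baz.1 and qux.1 are skipped because they have more than one link.
ls -1 "$demo/man1"
```

<para>After the run, <filename>bar.1</filename> is a dangling symbolic link
and the hard-linked pair is still uncompressed. Remove the sandbox with
<command>rm -rf "$demo"</command> when done.</para>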
19
[4067e17b]20<screen><userinput><command>cat &gt; /usr/sbin/compressdoc &lt;&lt; "EOF"</command>
[13f51bbc]21#!/bin/bash
[4067e17b]22# VERSION: 20040320.0026
[13f51bbc]23#
24# Compress (with bzip2 or gzip) all man pages in a hierarchy and
[2fa79e3b]25# update symlinks - By Marc Heerdink &lt;marc @ koelkast.net&gt;
[13f51bbc]26# Modified to be able to gzip or bzip2 files as an option and to deal
[2fa79e3b]27# with all symlinks properly by Mark Hymers &lt;markh @ linuxfromscratch.org&gt;
[13f51bbc]28#
[2fa79e3b]29# Modified 20030930 by Yann E. Morin &lt;yann.morin.1998 @ anciens.enib.fr&gt;
[13f51bbc]30# to accept compression/decompression, to correctly handle hard-links,
31# to allow for changing hard-links into soft- ones, to specify the
[666f6de]32# compression level, to parse the man.conf for all occurrences of MANPATH,
[13f51bbc]33# to allow for a backup, to allow to keep the newest version of a page.
[4067e17b]34# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the script.
35# (Note: It is assumed that the script is in the user's PATH)
[13f51bbc]36#
37# TODO:
38# - choose a default compress method to be based on the available
39# tool : gzip or bzip2;
#  - offer an option to automagically choose the best compression method
#    on a per page basis (e.g. check which of gzip/bzip2/whatever is the
#    most effective, page per page);
#  - when a MANPATH env var exists, use this instead of /etc/man.conf
#    (useful for users to (de)compress their man pages);
[13f51bbc]45# - offer an option to restore a previous backup;
46# - add other compression engines (compress, zip, etc?). Needed?
[70919be]47
[13f51bbc]48# Funny enough, this function prints some help.
49function help ()
[70919be]50{
[13f51bbc]51 if [ -n "$1" ]; then
52 echo "Unknown option : $1"
53 fi
[4067e17b]54 ( echo "Usage: $MY_NAME &lt;comp_method&gt; [options] [dirs]" &amp;&amp; \
[13f51bbc]55 cat &lt;&lt; EOT
[e6da9e5]56Where comp_method is one of :
[13f51bbc]57 --gzip, --gz, -g
58 --bzip2, --bz2, -b
59 Compress using gzip or bzip2.
60
61 --decompress, -d
62 Decompress the man pages.
63
  --backup      Specify that a .tar backup shall be made of every directory.
                In case a backup already exists, it is saved as .tar.old prior
                to making the new backup. If a .tar.old backup exists, it is
                removed prior to saving the backup.
                In backup mode, no other action is performed.
69
[e6da9e5]70And where options are :
[13f51bbc]71 -1 to -9, --fast, --best
72 The compression level, as accepted by gzip and bzip2. When not
73 specified, uses the default compression level for the given
74 method (-6 for gzip, and -9 for bzip2). Not used when in backup
75 or decompress modes.
76
[e6da9e5]77 --force, -F Force (re-)compression, even if the previous one was the same
[a4be499]78 method. Useful when changing the compression ratio. By default,
[e6da9e5]79 a page will not be re-compressed if it ends with the same suffix
80 as the method adds (.bz2 for bzip2, .gz for gzip).
81
[2fa79e3b]82 --soft, -S Change hard-links into soft-links. Use with _caution_ as the
[13f51bbc]83 first encountered file will be used as a reference. Not used
84 when in backup mode.
85
[2fa79e3b]86 --hard, -H Change soft-links into hard-links. Not used when in backup mode.
87
[13f51bbc]88 --conf=dir, --conf dir
89 Specify the location of man.conf. Defaults to /etc.
90
[e6da9e5]91 --verbose, -v Verbose mode, print the name of the directory being processed.
92 Double the flag to turn it even more verbose, and to print the
93 name of the file being processed.
[13f51bbc]94
  --fake, -f    Fakes it. Print the actual parameters compressdoc will use.

  dirs          A list of space-separated _absolute_ pathnames to the man
                directories.
                When empty, and only then, parse ${MAN_CONF}/man.conf for all
                occurrences of MANPATH.
[13f51bbc]101
102Note about compression
  There has been a discussion on blfs-support about compression ratios of
  both gzip and bzip2 on man pages, taking into account the hosting fs,
  the architecture, etc. Overall, the conclusion was that gzip was much
  more efficient on 'small' files, and bzip2 on 'big' files, small and
  big being very dependent on the content of the files.
108
[e6da9e5]109 See the original post from Mickael A. Peters, titled "Bootable Utility CD",
110 and dated 20030409.1816(+0200), and subsequent posts:
111 http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html
[13f51bbc]112
113 On my system (x86, ext3), man pages were 35564kiB before compression. gzip -9
114 compressed them down to 20372kiB (57.28%), bzip2 -9 got down to 19812kiB
115 (55.71%). That is a 1.57% gain in space. YMMV.
116
117 What was not taken into consideration was the decompression speed. But does
118 it make sense to? You gain fast access with uncompressed man pages, or you
119 gain space at the expense of a slight overhead in time. Well, my P4-2.5GHz
120 does not even let me notice this... :-)
121EOT
[2fa79e3b]122) | less
[13f51bbc]123}
124
125# This function checks that the man page is unique amongst bzip2'd, gzip'd and
[e6da9e5]126# uncompressed versions.
[13f51bbc]127# $1 the directory in which the file resides
128# $2 the file name for the man page
[e6da9e5]129# Returns 0 (true) if the file is the latest and must be taken care of, and 1
130# (false) if the file is not the latest (and has therefore been deleted).
[13f51bbc]131function check_unique ()
132{
[e6da9e5]133 # NB. When there are hard-links to this file, these are
134 # _not_ deleted. In fact, if there are hard-links, they
[13f51bbc]135 # all have the same date/time, thus making them ready
136 # for deletion later on.
137
[e6da9e5]138 # Build the list of all man pages with the same name
139 DIR=$1
[13f51bbc]140 BASENAME=`basename "${2}" .bz2`
141 BASENAME=`basename "${BASENAME}" .gz`
[408e76d]142 GZ_FILE="$BASENAME".gz
[b614eb09]143 BZ_FILE="$BASENAME".bz2
[13f51bbc]144
145 # Look for, and keep, the most recent one
[d85847c]146 LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" 2&gt;/dev/null | tail -n 1)`
[a72f9b7]147 for i in "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}"; do
[e6da9e5]148 [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
[13f51bbc]149 done
150
151 # In case the specified file was the latest, return 0
[e6da9e5]152 [ "$LATEST" = "$2" ] &amp;&amp; return 0
[13f51bbc]153 # If the file was not the latest, return 1
154 return 1
[70919be]155}
156
[4067e17b]157# Name of the script
MY_NAME=`basename "$0"`
159
[b7d0bb4]160# OK, parse the command-line for arguments, and initialize to some sensible
[2fa79e3b]161# state, that is : don't change links state, parse /etc/man.conf, be most
162# silent, search man.conf in /etc, and don't force (re-)compression.
[13f51bbc]163COMP_METHOD=
164COMP_SUF=
165COMP_LVL=
[2fa79e3b]166FORCE_OPT=
[13f51bbc]167LN_OPT=
168MAN_DIR=
[e6da9e5]169VERBOSE_LVL=0
[13f51bbc]170BACKUP=no
171FAKE=no
172MAN_CONF=/etc
173while [ -n "$1" ]; do
174 case $1 in
175 --gzip|--gz|-g)
176 COMP_SUF=.gz
177 COMP_METHOD=$1
178 shift
179 ;;
180 --bzip2|--bz2|-b)
181 COMP_SUF=.bz2
182 COMP_METHOD=$1
183 shift
184 ;;
185 --decompress|-d)
186 COMP_SUF=
187 COMP_LVL=
188 COMP_METHOD=$1
189 shift
190 ;;
191 -[1-9]|--fast|--best)
192 COMP_LVL=$1
193 shift
194 ;;
[e6da9e5]195 --force|-F)
[2fa79e3b]196 FORCE_OPT=-F
197 shift
198 ;;
199 --soft|-S)
200 LN_OPT=-S
[e6da9e5]201 shift
202 ;;
[2fa79e3b]203 --hard|-H)
204 LN_OPT=-H
[13f51bbc]205 shift
206 ;;
207 --conf=*)
208 MAN_CONF=`echo $1 | cut -d '=' -f2-`
209 shift
210 ;;
211 --conf)
212 MAN_CONF="$2"
213 shift 2
214 ;;
[e6da9e5]215 --verbose|-v)
216 let VERBOSE_LVL++
[13f51bbc]217 shift
218 ;;
219 --backup)
220 BACKUP=yes
221 shift
222 ;;
223 --fake|-f)
224 FAKE=yes
225 shift
226 ;;
227 --help|-h)
228 help
229 exit 0
230 ;;
231 /*)
232 MAN_DIR="${MAN_DIR} ${1}"
233 shift
234 ;;
235 -*)
236 help $1
237 exit 1
238 ;;
239 *)
[e6da9e5]240 echo "\"$1\" is not an absolute path name"
[13f51bbc]241 exit 1
242 ;;
243 esac
244done
245
246# Redirections
[e6da9e5]247case $VERBOSE_LVL in
[13f51bbc]248 0)
    # 0, be silent
250 DEST_FD0=/dev/null
251 DEST_FD1=/dev/null
252 VERBOSE_OPT=
[13f51bbc]253 ;;
254 1)
[e6da9e5]255 # 1, be a bit verbose
[13f51bbc]256 DEST_FD0=/dev/stdout
257 DEST_FD1=/dev/null
[e6da9e5]258 VERBOSE_OPT=-v
[13f51bbc]259 ;;
260 *)
[e6da9e5]261 # 2 and above, be most verbose
262 DEST_FD0=/dev/stdout
263 DEST_FD1=/dev/stdout
264 VERBOSE_OPT="-v -v"
[13f51bbc]265 ;;
266esac
[70919be]267
[13f51bbc]268# Note: on my machine, 'man --path' gives /usr/share/man twice, once with a trailing '/', once without.
269if [ -z "$MAN_DIR" ]; then
270 MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
271 | sed 's/:/\\n/g' \
272 | while read foo; do dirname "$foo"/.; done \
273 | sort -u \
274 | while read bar; do echo -n "$bar "; done`
[70919be]275fi
276
[13f51bbc]277# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
278if [ -z "$MAN_DIR" ]; then
[e6da9e5]279 echo "No directory specified, and no directory found with \`man --path'"
[13f51bbc]280 exit 1
281fi
[70919be]282
[13f51bbc]283# Fake?
284if [ "$FAKE" != "no" ]; then
285 echo "Actual parameters used:"
286 echo -n "Compression.......: "
287 case $COMP_METHOD in
288 --bzip2|--bz2|-b) echo -n "bzip2";;
    --gzip|--gz|-g) echo -n "gzip";;
290 --decompress|-d) echo -n "decompressing";;
291 *) echo -n "unknown";;
292 esac
293 echo " ($COMP_METHOD)"
294 echo "Compression level.: $COMP_LVL"
295 echo "Compression suffix: $COMP_SUF"
[2fa79e3b]296 echo -n "Force compression.: "
297 [ "foo$FORCE_OPT" = "foo-F" ] &amp;&amp; echo "yes" || echo "no"
298 echo "man.conf is.......: ${MAN_CONF}/man.conf"
299 echo -n "Hard-links........: "
300 [ "foo$LN_OPT" = "foo-S" ] &amp;&amp; echo "convert to soft-links" || echo "leave as is"
301 echo -n "Soft-links........: "
302 [ "foo$LN_OPT" = "foo-H" ] &amp;&amp; echo "convert to hard-links" || echo "leave as is"
[13f51bbc]303 echo "Backup............: $BACKUP"
304 echo "Faking (yes!).....: $FAKE"
305 echo "Directories.......: $MAN_DIR"
[2fa79e3b]306 echo "Verbosity level...: $VERBOSE_LVL"
[13f51bbc]307 exit 0
308fi
[70919be]309
[13f51bbc]310# If no method was specified, print help
311if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
312 help
313 exit 1
[70919be]314fi
315
[666f6de]316# In backup mode, do the backup solely
[13f51bbc]317if [ "$BACKUP" = "yes" ]; then
318 for DIR in $MAN_DIR; do
319 cd "${DIR}/.."
320 DIR_NAME=`basename "${DIR}"`
321 echo "Backing up $DIR..." &gt; $DEST_FD0
322 [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
323 [ -f "${DIR_NAME}.tar" ] &amp;&amp; mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
324 tar cfv "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
325 done
326 exit 0
327fi
328
329# I know MAN_DIR has only absolute path names
330# I need to take into account the localized man, so I'm going recursive
331for DIR in $MAN_DIR; do
[2fa79e3b]332 MEM_DIR=`pwd`
[13f51bbc]333 cd "$DIR"
334 for FILE in *; do
    # Fixes the case where the directory is empty
    if [ "foo$FILE" = "foo*" ]; then continue; fi

    # Fixes the case when hard-links see their compression scheme change
    # (from not compressed to compressed, or from bz2 to gz, or from gz to bz2)
    # Also fixes the case when multiple versions of the page are present, which
    # are either compressed or not.
342 if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi
343
[a72f9b7]344 # Do not compress whatis files
345 if [ "$FILE" = "whatis" ]; then continue; fi
346
[13f51bbc]347 if [ -d "$FILE" ]; then
[2fa79e3b]348 cd "${MEM_DIR}" # Go back to where we ran "$0", in case "$0"=="./compressdoc" ...
[13f51bbc]349 # We are going recursive to that directory
350 echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
351 # I need not pass --conf, as I specify the directory to work on
352 # But I need exit in case of error
[4067e17b]353 "$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
[13f51bbc]354 echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
[2fa79e3b]355 cd "$DIR" # Needed for the next iteration of the loop
[e6da9e5]356
[13f51bbc]357 else # !dir
[e6da9e5]358 if ! check_unique "$DIR" "$FILE"; then continue; fi
359
360 # Check if the file is already compressed with the specified method
[b614eb09]361 BASE_FILE=`basename "$FILE" .gz`
[408e76d]362 BASE_FILE=`basename "$BASE_FILE" .bz2`
[2fa79e3b]363 if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" -a "foo${FORCE_OPT}" = "foo" ]; then continue; fi
[13f51bbc]364
365 # If we have a symlink
366 if [ -h "$FILE" ]; then
[b614eb09]367 case "$FILE" in
[13f51bbc]368 *.bz2)
369 EXT=bz2 ;;
370 *.gz)
371 EXT=gz ;;
372 *)
373 EXT=none ;;
374 esac
375
[e6da9e5]376 if [ ! "$EXT" = "none" ]; then
[b614eb09]377 LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " " | sed s/\.$EXT$//`
[13f51bbc]378 NEWNAME=`echo "$FILE" | sed s/\.$EXT$//`
379 mv "$FILE" "$NEWNAME"
380 FILE="$NEWNAME"
381 else
[b614eb09]382 LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
[13f51bbc]383 fi
384
[2fa79e3b]385 if [ "$LN_OPT" = "-H" ]; then
386 # Change this soft-link into a hard- one
387 rm -f "$FILE" &amp;&amp; ln "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
388 chmod --reference "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
389 else
390 # Keep this soft-link a soft- one.
391 rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
392 fi
[13f51bbc]393 echo "Relinked $FILE" &gt; $DEST_FD1
394
395 # else if we have a plain file
396 elif [ -f "$FILE" ]; then
397 # Take care of hard-links: build the list of files hard-linked
398 # to the one we are {de,}compressing.
        # NB. This is not optimum, as the file will eventually be compressed
        # as many times as it has hard-links. But for now, that's the safe way.
401 inode=`ls -li "$FILE" | awk '{print $1}'`
402 HLINKS=`find . \! -name "$FILE" -inum $inode`
403
404 if [ -n "$HLINKS" ]; then
405 # We have hard-links! Remove them now.
406 for i in $HLINKS; do rm -f "$i"; done
407 fi
408
409 # Now take care of the file that has no hard-link
[666f6de]410 # We do decompress first to re-compress with the selected
[13f51bbc]411 # compression ratio later on...
        # Quote the file name so that names containing spaces survive
        case "$FILE" in
          *.bz2)
            bunzip2 "$FILE"
            FILE=`basename "$FILE" .bz2`
          ;;
          *.gz)
            gunzip "$FILE"
            FILE=`basename "$FILE" .gz`
          ;;
        esac
422
[2fa79e3b]423 # Compress the file with the given compression ratio, if needed
[13f51bbc]424 case $COMP_SUF in
425 *bz2)
426 bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
427 echo "Compressed $FILE" &gt; $DEST_FD1
428 ;;
429 *gz)
430 gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
431 echo "Compressed $FILE" &gt; $DEST_FD1
432 ;;
433 *)
434 echo "Uncompressed $FILE" &gt; $DEST_FD1
435 ;;
436 esac
437
438 # If the file had hard-links, recreate those (either hard or soft)
439 if [ -n "$HLINKS" ]; then
440 for i in $HLINKS; do
[b614eb09]441 NEWFILE=`echo "$i" | sed s/\.gz$// | sed s/\.bz2$//`
[2fa79e3b]442 if [ "$LN_OPT" = "-S" ]; then
443 # Make this hard-link a soft- one
444 ln -s "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
445 else
446 # Keep the hard-link a hard- one
447 ln "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
448 fi
[13f51bbc]449 chmod 644 "${NEWFILE}$COMP_SUF" # Really work only for hard-links. Harmless for soft-links
450 done
451 fi
452
453 else
454 # There is a problem when we get neither a symlink nor a plain file
455 # Obviously, we shall never ever come here... :-(
456 echo "Whaooo... \"${DIR}/${FILE}\" is neither a symlink nor a plain file. Please check:"
[b614eb09]457 ls -l "${DIR}/${FILE}"
[13f51bbc]458 exit 1
459 fi
460 fi
461 done # for FILE
462done # for DIR
[70919be]463<command>EOF
[4067e17b]464chmod 755 /usr/sbin/compressdoc</command></userinput></screen>
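
<para>The hard-link handling in the script relies on inode numbers: it reads
the inode of the page being processed and asks <command>find</command> for
every other name that shares it. The idea can be tried in isolation with the
minimal sketch below. The function name <command>hardlinks_of</command> and the
file names are hypothetical and are not part of the script.</para>

```shell
#!/bin/bash
# hardlinks_of DIR FILE
# Print every other name below DIR that shares FILE's inode.
# (Hypothetical helper; it mirrors the ls -i / find -inum idiom.)
hardlinks_of() {
    (
        cd "$1" || exit 1
        inode=$(ls -i "$2" | awk '{print $1}')
        find . ! -name "$2" -inum "$inode" | sort
    )
}

# Usage: cat.1 and tac.1 are one file under two names, ls.1 is unrelated.
sandbox=$(mktemp -d)
printf '.TH CAT 1\n' > "$sandbox/cat.1"
ln "$sandbox/cat.1" "$sandbox/tac.1"
printf '.TH LS 1\n' > "$sandbox/ls.1"

hardlinks_of "$sandbox" cat.1    # prints ./tac.1
```

<para>The script removes the names found this way, compresses the remaining
file once, and then recreates each link (hard, or symbolic when
<option>--soft</option> is given) against the compressed file.</para>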
[70919be]465
<para>Now, as root, you can run
<command>compressdoc --bz2</command> to compress all your system man
pages. You can also run <command>compressdoc --help</command> to get
comprehensive help about what the script is able to do.</para>
[13f51bbc]470
<para>Don't forget that a few programs, like the <application>X</application>
Window System and <application>XEmacs</application>, also install their
documentation in non-standard places (such as
<filename class="directory">/usr/X11R6/man</filename>). Be sure to add these
locations to the file <filename>/etc/man.conf</filename>, as
<envar>MANPATH</envar> <replaceable>/path</replaceable> lines (the keyword
and the directory are separated by whitespace, not by an equals sign).</para>

<para>Example:</para>

<screen><userinput>    ...
    MANPATH /usr/share/man
    MANPATH /usr/local/man
    MANPATH /usr/X11R6/man
    MANPATH /opt/qt/doc/man
    ...</userinput></screen>
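
<para>When no directory is given on the command line, the script asks
<command>man --path</command> for the search path, splits the colon-separated
result, and drops duplicates such as the same directory listed with and without
a trailing slash. The following minimal sketch shows only that normalisation
step on a fixed string. The function name <command>normalize_manpath</command>
is hypothetical, and the input is an assumed example rather than real
<command>man --path</command> output.</para>

```shell
#!/bin/bash
# normalize_manpath LIST
# Split a colon-separated LIST and print each directory once,
# without trailing slashes. (Hypothetical helper for illustration.)
normalize_manpath() {
    printf '%s\n' "$1" \
        | tr ':' '\n' \
        | while read -r dir; do dirname "$dir/."; done \
        | sort -u
}

# /usr/share/man appears twice, once with a trailing slash.
normalize_manpath "/usr/share/man:/usr/share/man/:/usr/local/man"
```

<para>Appending <filename>/.</filename> before calling
<command>dirname</command> is what makes both spellings of a directory collapse
to the same string, so that <command>sort -u</command> can remove the
duplicate.</para>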
[70919be]484
<para>Generally, package installation systems do not compress man/info pages,
which means you will need to run the script again if you want to keep the size
of your documentation as small as possible. Also, note that running the script
after upgrading a package is safe; when you have several versions of a page
(for example, one compressed and one uncompressed), the most recent one is kept
and the others deleted.</para>
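
<para>The "most recent one is kept" rule comes down to sorting the possible
variants of a page by modification time. A minimal sketch of that selection,
assuming only the three suffixes the script knows about, could look like the
following. The function name <command>newest_variant</command> and the file
names are hypothetical, and unlike the script this sketch deletes
nothing.</para>

```shell
#!/bin/bash
# newest_variant DIR PAGE
# Print whichever of PAGE, PAGE.gz and PAGE.bz2 in DIR was modified last.
# (Hypothetical helper; it mirrors the ls -1rt / tail idiom of check_unique.)
newest_variant() {
    (
        cd "$1" || exit 1
        # -t sorts by modification time and -r reverses the order,
        # so the newest existing variant is printed last.
        ls -1rt "$2" "$2.gz" "$2.bz2" 2>/dev/null | tail -n 1
    )
}

# Usage: three variants of one page with distinct timestamps.
pages=$(mktemp -d)
touch -t 200401010000 "$pages/foo.1"
touch -t 200402010000 "$pages/foo.1.bz2"
touch -t 200403010000 "$pages/foo.1.gz"

newest_variant "$pages" foo.1    # prints foo.1.gz
```

<para>In the script, every variant other than the one selected this way is
removed before compression starts, which is why a freshly installed
uncompressed page replaces an older compressed copy.</para>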

</sect1>