source: postlfs/config/compressdoc.xml@ 4ee1c44

Last change on this file since 4ee1c44 was 4ee1c44, checked in by Randy McMurchy <randy@…>, 17 years ago

Shortened line lengths in the compressdoc script

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@3271 af4574ff-66df-0310-9fd7-8a98e5e911e0

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
  "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../../general.ent">
  %general-entities;
]>

<sect1 id="compressdoc" xreflabel="compressdoc">
<sect1info>
<othername>$LastChangedBy$</othername>
<date>$Date$</date>
</sect1info>
<?dbhtml filename="compressdoc.html"?>
<title>Compressing man and info pages</title>
<indexterm zone="compressdoc">
<primary sortas="b-compressdoc">compressdoc</primary></indexterm>

<para>Man and info reader programs can transparently process gzip'ed or
bzip2'ed pages, a feature you can use to free some disk space while keeping
your documentation available. However, things are not that simple; man
directories tend to contain links&mdash;hard and symbolic&mdash;which defeat
simple ideas like recursively calling <command>gzip</command> on them. A
better way to go is to use the script below.
</para>
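
<para>To see the problem on a small scale, the following sketch builds a
throwaway directory (a hypothetical layout, not a real man hierarchy) holding
one page and one symbolic link to it, then runs a naive recursive
<command>gzip</command>. It assumes GNU <command>gzip</command>, which skips
symbolic links unless forced; other implementations may behave
differently.</para>

```shell
# Scratch directory with a hypothetical page and a symlink alias to it.
demo=$(mktemp -d)
mkdir "$demo/man1"
printf '.TH DEMO 1\n' > "$demo/man1/demo.1"
ln -s demo.1 "$demo/man1/alias.1"

# Naive approach: compress everything recursively. gzip typically complains
# about the symlink on stderr and may exit non-zero; that is expected here.
gzip -r "$demo" || true

# The page has been renamed to demo.1.gz, but the link still names demo.1.
status=intact
if [ -L "$demo/man1/alias.1" ]; then
  if [ ! -e "$demo/man1/alias.1" ]; then
    status=dangling
  fi
fi
echo "alias.1 is $status"
```

<para>The script avoids this by recreating each link against the new,
suffixed file name after compressing the page it points to.</para>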

<screen><userinput><command>cat &gt; /usr/sbin/compressdoc &lt;&lt; "EOF"</command>
#!/bin/bash
# VERSION: 20050112.0027
#
# Compress (with bzip2 or gzip) all man pages in a hierarchy and
# update symlinks - By Marc Heerdink &lt;marc @ koelkast.net&gt;
#
# Modified to be able to gzip or bzip2 files as an option and to deal
# with all symlinks properly by Mark Hymers &lt;markh @ linuxfromscratch.org&gt;
#
# Modified 20030930 by Yann E. Morin &lt;yann.morin.1998 @ anciens.enib.fr&gt;
# to accept compression/decompression, to correctly handle hard-links,
# to allow for changing hard-links into soft- ones, to specify the
# compression level, to parse the man.conf for all occurrences of MANPATH,
# to allow for a backup, to allow to keep the newest version of a page.
#
# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the
# script.
#   (Note: It is assumed that the script is in the user's PATH)
#
# Modified 20050112 by Randy McMurchy to shorten line lengths and
# correct grammar errors.
#
# TODO:
#     - choose a default compress method to be based on the available
#       tool : gzip or bzip2;
#     - offer an option to automagically choose the best compression
#       method on a per page basis (eg. check which of
#       gzip/bzip2/whatever is the most effective, page per page);
#     - when a MANPATH env var exists, use this instead of /etc/man.conf
#       (useful for users to (de)compress their man pages);
#     - offer an option to restore a previous backup;
#     - add other compression engines (compress, zip, etc?). Needed?

# Funny enough, this function prints some help.
function help ()
{
  if [ -n "$1" ]; then
    echo "Unknown option : $1"
  fi
  ( echo "Usage: $MY_NAME &lt;comp_method&gt; [options] [dirs]" &amp;&amp; \
  cat &lt;&lt; EOT
Where comp_method is one of :
  --gzip, --gz, -g
  --bzip2, --bz2, -b
                Compress using gzip or bzip2.

  --decompress, -d
                Decompress the man pages.

  --backup      Specify a .tar backup shall be done for all directories.
                In case a backup already exists, it is saved as .tar.old
                prior to making the new backup. If a .tar.old backup
                exists, it is removed prior to saving the backup.
                In backup mode, no other action is performed.

And where options are :
  -1 to -9, --fast, --best
                The compression level, as accepted by gzip and bzip2.
                When not specified, uses the default compression level
                for the given method (-6 for gzip, and -9 for bzip2).
                Not used when in backup or decompress modes.

  --force, -F   Force (re-)compression, even if the previous one was
                the same method. Useful when changing the compression
                ratio. By default, a page will not be re-compressed if
                it ends with the same suffix as the method adds
                (.bz2 for bzip2, .gz for gzip).

  --soft, -S    Change hard-links into soft-links. Use with _caution_
                as the first encountered file will be used as a
                reference. Not used when in backup mode.

  --hard, -H    Change soft-links into hard-links. Not used when in
                backup mode.

  --conf=dir, --conf dir
                Specify the location of man.conf. Defaults to /etc.

  --verbose, -v Verbose mode, print the name of the directory being
                processed. Double the flag to turn it even more verbose,
                and to print the name of the file being processed.

  --fake, -f    Fakes it. Print the actual parameters compressdoc will use.

  dirs          A list of space-separated _absolute_ pathnames to the
                man directories. When empty, and only then, parse
                ${MAN_CONF}/man.conf for all occurrences of MANPATH.

Note about compression:
  There has been a discussion on blfs-support about compression ratios of
  both gzip and bzip2 on man pages, taking into account the hosting fs,
  the architecture, etc... On the overall, the conclusion was that gzip
  was much more efficient on 'small' files, and bzip2 on 'big' files,
  small and big being very dependent on the content of the files.

  See the original post from Mickael A. Peters, titled
  "Bootable Utility CD", dated 20030409.1816(+0200), and subsequent posts:
  http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html

  On my system (x86, ext3), man pages were 35564KB before compression.
  gzip -9 compressed them down to 20372KB (57.28%), bzip2 -9 got down to
  19812KB (55.71%). That is a 1.57% gain in space. YMMV.

  What was not taken into consideration was the decompression speed. But
  does it make sense to? You gain fast access with uncompressed man
  pages, or you gain space at the expense of a slight overhead in time.
  Well, my P4-2.5GHz does not even let me notice this... :-)

EOT
) | less
}

# This function checks that the man page is unique amongst bzip2'd,
# gzip'd and uncompressed versions.
#  $1 the directory in which the file resides
#  $2 the file name for the man page
# Returns 0 (true) if the file is the latest and must be taken care of,
# and 1 (false) if the file is not the latest (and has therefore been
# deleted).
function check_unique ()
{
  # NB. When there are hard-links to this file, these are
  # _not_ deleted. In fact, if there are hard-links, they
  # all have the same date/time, thus making them ready
  # for deletion later on.

  # Build the list of all man pages with the same name
  DIR=$1
  BASENAME=`basename "${2}" .bz2`
  BASENAME=`basename "${BASENAME}" .gz`
  GZ_FILE="$BASENAME".gz
  BZ_FILE="$BASENAME".bz2

  # Look for, and keep, the most recent one
  LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" \
         2&gt;/dev/null | tail -n 1)`
  for i in "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}"; do
    [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
  done

  # In case the specified file was the latest, return 0
  [ "$LATEST" = "$2" ] &amp;&amp; return 0
  # If the file was not the latest, return 1
  return 1
}

# Name of the script
MY_NAME=`basename $0`

# OK, parse the command-line for arguments, and initialize to some
# sensible state, that is: don't change links state, parse
# /etc/man.conf, be most silent, search man.conf in /etc, and don't
# force (re-)compression.
COMP_METHOD=
COMP_SUF=
COMP_LVL=
FORCE_OPT=
LN_OPT=
MAN_DIR=
VERBOSE_LVL=0
BACKUP=no
FAKE=no
MAN_CONF=/etc
while [ -n "$1" ]; do
  case $1 in
    --gzip|--gz|-g)
      COMP_SUF=.gz
      COMP_METHOD=$1
      shift
      ;;
    --bzip2|--bz2|-b)
      COMP_SUF=.bz2
      COMP_METHOD=$1
      shift
      ;;
    --decompress|-d)
      COMP_SUF=
      COMP_LVL=
      COMP_METHOD=$1
      shift
      ;;
    -[1-9]|--fast|--best)
      COMP_LVL=$1
      shift
      ;;
    --force|-F)
      FORCE_OPT=-F
      shift
      ;;
    --soft|-S)
      LN_OPT=-S
      shift
      ;;
    --hard|-H)
      LN_OPT=-H
      shift
      ;;
    --conf=*)
      MAN_CONF=`echo $1 | cut -d '=' -f2-`
      shift
      ;;
    --conf)
      MAN_CONF="$2"
      shift 2
      ;;
    --verbose|-v)
      let VERBOSE_LVL++
      shift
      ;;
    --backup)
      BACKUP=yes
      shift
      ;;
    --fake|-f)
      FAKE=yes
      shift
      ;;
    --help|-h)
      help
      exit 0
      ;;
    /*)
      MAN_DIR="${MAN_DIR} ${1}"
      shift
      ;;
    -*)
      help $1
      exit 1
      ;;
    *)
      echo "\"$1\" is not an absolute path name"
      exit 1
      ;;
  esac
done

# Redirections
case $VERBOSE_LVL in
  0)
    # 0, be silent
    DEST_FD0=/dev/null
    DEST_FD1=/dev/null
    VERBOSE_OPT=
    ;;
  1)
    # 1, be a bit verbose
    DEST_FD0=/dev/stdout
    DEST_FD1=/dev/null
    VERBOSE_OPT=-v
    ;;
  *)
    # 2 and above, be most verbose
    DEST_FD0=/dev/stdout
    DEST_FD1=/dev/stdout
    VERBOSE_OPT="-v -v"
    ;;
esac

# Note: on my machine, 'man --path' gives /usr/share/man twice, once
# with a trailing '/', once without.
if [ -z "$MAN_DIR" ]; then
  MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
            | sed 's/:/\\n/g' \
            | while read foo; do dirname "$foo"/.; done \
            | sort -u \
            | while read bar; do echo -n "$bar "; done`
fi

# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
if [ -z "$MAN_DIR" ]; then
  echo "No directory specified, and no directory found with \`man --path'"
  exit 1
fi

# Fake?
if [ "$FAKE" != "no" ]; then
  echo "Actual parameters used:"
  echo -n "Compression.......: "
  case $COMP_METHOD in
    --bzip2|--bz2|-b) echo -n "bzip2";;
    --gzip|--gz|-g) echo -n "gzip";;
    --decompress|-d) echo -n "decompressing";;
    *) echo -n "unknown";;
  esac
  echo " ($COMP_METHOD)"
  echo "Compression level.: $COMP_LVL"
  echo "Compression suffix: $COMP_SUF"
  echo -n "Force compression.: "
  [ "foo$FORCE_OPT" = "foo-F" ] &amp;&amp; echo "yes" || echo "no"
  echo "man.conf is.......: ${MAN_CONF}/man.conf"
  echo -n "Hard-links........: "
  [ "foo$LN_OPT" = "foo-S" ] &amp;&amp;
  echo "convert to soft-links" || echo "leave as is"
  echo -n "Soft-links........: "
  [ "foo$LN_OPT" = "foo-H" ] &amp;&amp;
  echo "convert to hard-links" || echo "leave as is"
  echo "Backup............: $BACKUP"
  echo "Faking (yes!).....: $FAKE"
  echo "Directories.......: $MAN_DIR"
  echo "Verbosity level...: $VERBOSE_LVL"
  exit 0
fi

# If no method was specified, print help
if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
  help
  exit 1
fi

# In backup mode, do the backup solely
if [ "$BACKUP" = "yes" ]; then
  for DIR in $MAN_DIR; do
    cd "${DIR}/.."
    DIR_NAME=`basename "${DIR}"`
    echo "Backing up $DIR..." &gt; $DEST_FD0
    [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
    [ -f "${DIR_NAME}.tar" ] &amp;&amp;
    mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
    tar cfv "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
  done
  exit 0
fi

# I know MAN_DIR has only absolute path names
# I need to take into account the localized man, so I'm going recursive
for DIR in $MAN_DIR; do
  MEM_DIR=`pwd`
  cd "$DIR"
  for FILE in *; do
    # Fixes the case where the directory is empty
    if [ "foo$FILE" = "foo*" ]; then continue; fi

    # Fixes the case when hard-links see their compression scheme change
    # (from not compressed to compressed, or from bz2 to gz, or from gz
    # to bz2)
    # Also fixes the case when multiple versions of the page are present,
    # which are either compressed or not.
    if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi

    # Do not compress whatis files
    if [ "$FILE" = "whatis" ]; then continue; fi

    if [ -d "$FILE" ]; then
      cd "${MEM_DIR}"  # Go back to where we ran "$0",
                       # in case "$0"=="./compressdoc" ...
      # We are going recursive to that directory
      echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
      # I need not pass --conf, as I specify the directory to work on
      # But I need exit in case of error
      # The trailing backslash keeps both lines as a single command
      "$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} \
        ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
      echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
      cd "$DIR"  # Needed for the next iteration of the loop

    else # !dir
      if ! check_unique "$DIR" "$FILE"; then continue; fi

      # Check if the file is already compressed with the specified method
      BASE_FILE=`basename "$FILE" .gz`
      BASE_FILE=`basename "$BASE_FILE" .bz2`
      if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" \
         -a "foo${FORCE_OPT}" = "foo" ]; then continue; fi

      # If we have a symlink
      if [ -h "$FILE" ]; then
        case "$FILE" in
          *.bz2)
            EXT=bz2 ;;
          *.gz)
            EXT=gz ;;
          *)
            EXT=none ;;
        esac

        if [ ! "$EXT" = "none" ]; then
          LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 \
               | tr -d " " | sed s/\.$EXT$//`
          NEWNAME=`echo "$FILE" | sed s/\.$EXT$//`
          mv "$FILE" "$NEWNAME"
          FILE="$NEWNAME"
        else
          LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
        fi

        if [ "$LN_OPT" = "-H" ]; then
          # Change this soft-link into a hard- one
          rm -f "$FILE" &amp;&amp; ln "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
          chmod --reference "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
        else
          # Keep this soft-link a soft- one.
          rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
        fi
        echo "Relinked $FILE" &gt; $DEST_FD1

      # else if we have a plain file
      elif [ -f "$FILE" ]; then
        # Take care of hard-links: build the list of files hard-linked
        # to the one we are {de,}compressing.
        # NB. This is not optimum, as the file will eventually be
        # compressed as many times as it has hard-links. But for now,
        # that's the safe way.
        inode=`ls -li "$FILE" | awk '{print $1}'`
        HLINKS=`find . \! -name "$FILE" -inum $inode`

        if [ -n "$HLINKS" ]; then
          # We have hard-links! Remove them now.
          for i in $HLINKS; do rm -f "$i"; done
        fi

        # Now take care of the file that has no hard-link
        # We do decompress first to re-compress with the selected
        # compression ratio later on...
        case "$FILE" in
          *.bz2)
            bunzip2 "$FILE"
            FILE=`basename "$FILE" .bz2`
          ;;
          *.gz)
            gunzip "$FILE"
            FILE=`basename "$FILE" .gz`
          ;;
        esac

        # Compress the file with the given compression ratio, if needed
        case $COMP_SUF in
          *bz2)
            bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
          ;;
          *gz)
            gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
          ;;
          *)
            echo "Uncompressed $FILE" &gt; $DEST_FD1
          ;;
        esac

        # If the file had hard-links, recreate those (either hard or soft)
        if [ -n "$HLINKS" ]; then
          for i in $HLINKS; do
            NEWFILE=`echo "$i" | sed s/\.gz$// | sed s/\.bz2$//`
            if [ "$LN_OPT" = "-S" ]; then
              # Make this hard-link a soft- one
              ln -s "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
            else
              # Keep the hard-link a hard- one
              ln "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
            fi
            # Really work only for hard-links. Harmless for soft-links
            chmod 644 "${NEWFILE}$COMP_SUF"
          done
        fi

      else
        # There is a problem when we get neither a symlink nor a plain
        # file. Obviously, we shall never ever come here... :-(
        echo -n "Whaooo... \"${DIR}/${FILE}\" is neither a symlink "
        echo "nor a plain file. Please check:"
        ls -l "${DIR}/${FILE}"
        exit 1
      fi
    fi
  done # for FILE
done # for DIR

<command>EOF
chmod 755 /usr/sbin/compressdoc</command></userinput></screen>

<para>Now, as root, you can run
<command>compressdoc --bz2</command> to compress all your system man
pages. You can also run <command>compressdoc --help</command> to get
comprehensive help about what the script is able to do.</para>
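
<para>Before compressing anything, it may be worth previewing what the script
would do. The sketch below uses the script's own <command>--fake</command>
option, which prints the parameters and exits without touching any file. The
target directory is only an assumption for illustration, and the guard around
the call is there so the snippet does nothing harmful if the script has not
been installed yet.</para>

```shell
# Hypothetical target; replace with the man hierarchy you want to process.
target=/usr/share/man

if command -v compressdoc > /dev/null; then
  # --fake reports the method, level, link handling and directories that
  # would be used, then exits before any compression takes place.
  compressdoc --fake --bz2 -9 "$target"
  ran_preview=yes
else
  echo "compressdoc is not in PATH; install the script first"
  ran_preview=no
fi
```

<para>Once the reported parameters look right, drop
<command>--fake</command> and consider running
<command>compressdoc --backup</command> on the same directory
beforehand.</para>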

<para>Don't forget that a few programs, like the <application>X</application>
Window System and <application>XEmacs</application>, also install their
documentation in non-standard places (such as
<filename class="directory">/usr/X11R6/man</filename>). Be sure to add
these locations to the file <filename>/etc/man.conf</filename>, as a
<envar>MANPATH</envar>=<replaceable>[/path]</replaceable> section.</para>

<para>Example:</para>

<screen><userinput>    ...
    MANPATH=/usr/share/man
    MANPATH=/usr/local/man
    MANPATH=/usr/X11R6/man
    MANPATH=/opt/qt/doc/man
    ...</userinput></screen>
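
<para>When no directories are given on the command line, the script takes its
list from <command>man --path</command>, so locations added to
<filename>/etc/man.conf</filename> are picked up from there. The sketch below
mirrors the script's clean-up of that colon-separated list, using a
hypothetical sample string in place of real <command>man --path</command>
output, so that a directory listed both with and without a trailing slash is
processed only once.</para>

```shell
# Hypothetical stand-in for the output of `man --path`; the first and last
# entries differ only by a trailing slash.
sample_path="/usr/share/man:/usr/local/man:/usr/share/man/"

# Split on ':', normalise each entry with dirname, and drop duplicates.
dirs=$(printf '%s\n' "$sample_path" \
  | tr ':' '\n' \
  | while read -r entry; do dirname "$entry"/.; done \
  | sort -u)

printf '%s\n' "$dirs"
```

<para>Appending <filename>/.</filename> before calling
<command>dirname</command> is what strips the trailing slash, which lets
<command>sort -u</command> recognise the two spellings as the same
directory.</para>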

<para>Package installation systems generally do not compress man/info pages,
so you will need to run the script again after installing or upgrading
packages if you want to keep the size of your documentation as small as
possible. Running the script after upgrading a package is safe: when you have
several versions of a page (for example, one compressed and one
uncompressed), the most recent one is kept and the others are deleted.</para>
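
<para>The "most recent one is kept" rule can be tried in isolation. The
following sketch reproduces the idea behind the script's
<command>check_unique</command> function on three placeholder files in a
scratch directory; the file names and timestamps are made up for the
demonstration, and the files are not real compressed data, since only their
modification times matter here.</para>

```shell
work=$(mktemp -d)
cd "$work"

# Three hypothetical versions of one page, with distinct modification times.
printf 'new\n' > page.1
printf 'old\n' > page.1.gz
printf 'older\n' > page.1.bz2
touch -t 200501010000 page.1.bz2
touch -t 200501020000 page.1.gz
touch -t 200501030000 page.1

# ls -rt lists oldest first, so the last line names the newest version.
latest=$(ls -1rt page.1 page.1.gz page.1.bz2 | tail -n 1)

# Remove every version except the newest one.
for f in page.1 page.1.gz page.1.bz2; do
  if [ "$f" != "$latest" ]; then rm -f "$f"; fi
done

echo "kept: $latest"
```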

</sect1>