source: postlfs/config/compressdoc.xml@ 53819cbb

Last change on this file since 53819cbb was 53819cbb, checked in by Manuel Canales Esparcia <manuel@…>, 19 years ago

Tagged compressdoc.xml

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@4170 af4574ff-66df-0310-9fd7-8a98e5e911e0

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
   "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../../general.ent">
  %general-entities;
]>

<sect1 id="compressdoc" xreflabel="Compressing man and info pages">
  <?dbhtml filename="compressdoc.html"?>

  <sect1info>
    <othername>$LastChangedBy$</othername>
    <date>$Date$</date>
  </sect1info>

  <title>Compressing Man and Info Pages</title>

  <indexterm zone="compressdoc">
    <primary sortas="b-compressdoc">compressdoc</primary>
  </indexterm>

  <para>Man and info reader programs can transparently process pages
  compressed with gzip or bzip2, a feature you can use to free some disk
  space while keeping your documentation available. However, things are not
  that simple: man directories tend to contain both hard and symbolic links,
  which defeat simple ideas like recursively calling
  <command>gzip</command> on them. A better way to go is to use the script
  below.</para>
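
  <para>To see why a plain recursive <command>gzip</command> is not enough,
  here is a minimal sketch run in a scratch directory. It is not part of the
  script, and the file names (<filename>foo.1</filename>,
  <filename>bar.1</filename>, <filename>qux.1</filename>,
  <filename>quux.1</filename>) are made up for illustration. It only shows a
  symbolic link left dangling and a hard-linked page being skipped.</para>

```shell
#!/bin/bash
# Illustration only: works in a throwaway directory, never on real man pages.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/man1"

# A page with a symlink alias (hypothetical names).
echo ".TH FOO 1" > "$demo_dir/man1/foo.1"
ln -s foo.1 "$demo_dir/man1/bar.1"

# A page with a hard-link alias.
echo ".TH QUX 1" > "$demo_dir/man1/qux.1"
ln "$demo_dir/man1/qux.1" "$demo_dir/man1/quux.1"

# Naive approach: compress every regular file in the tree.
# gzip warns about the hard-linked files on stderr; that noise is discarded.
find "$demo_dir" -type f -exec gzip -9 {} + 2>/dev/null

# The symlink still points at foo.1, which has been renamed to foo.1.gz.
if [ -L "$demo_dir/man1/bar.1" ] && [ ! -e "$demo_dir/man1/bar.1" ]; then
  symlink_state=dangling
else
  symlink_state=ok
fi

# gzip leaves files with more than one hard link untouched unless forced.
if [ -e "$demo_dir/man1/qux.1" ] && [ ! -e "$demo_dir/man1/qux.1.gz" ]; then
  hardlink_state=skipped
else
  hardlink_state=compressed
fi

echo "symlink: $symlink_state, hard-linked page: $hardlink_state"
rm -rf "$demo_dir"
```

  <para>Forcing <command>gzip</command> on the hard-linked files would not
  help much either, since each link would have to be recreated against the
  compressed name afterwards. That relinking is the bookkeeping the script
  takes care of.</para>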
28
<screen role="root"><userinput>cat &gt; /usr/sbin/compressdoc &lt;&lt; "EOF"
<literal>#!/bin/bash
# VERSION: 20050112.0027
32#
33# Compress (with bzip2 or gzip) all man pages in a hierarchy and
34# update symlinks - By Marc Heerdink &lt;marc @ koelkast.net&gt;
35#
36# Modified to be able to gzip or bzip2 files as an option and to deal
37# with all symlinks properly by Mark Hymers &lt;markh @ linuxfromscratch.org&gt;
38#
39# Modified 20030930 by Yann E. Morin &lt;yann.morin.1998 @ anciens.enib.fr&gt;
40# to accept compression/decompression, to correctly handle hard-links,
41# to allow for changing hard-links into soft- ones, to specify the
42# compression level, to parse the man.conf for all occurrences of MANPATH,
43# to allow for a backup, to allow to keep the newest version of a page.
44#
45# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the
46# script.
47# (Note: It is assumed that the script is in the user's PATH)
48#
49# Modified 20050112 by Randy McMurchy to shorten line lengths and
50# correct grammar errors.
51#
52# TODO:
53# - choose a default compress method to be based on the available
54# tool : gzip or bzip2;
55# - offer an option to automagically choose the best compression
#     method on a per page basis (e.g. check which of
#     gzip/bzip2/whatever is the most effective, page per page);
# - when a MANPATH env var exists, use this instead of /etc/man.conf
#     (useful for users to (de)compress their man pages);
60# - offer an option to restore a previous backup;
61# - add other compression engines (compress, zip, etc?). Needed?
62
63# Funny enough, this function prints some help.
64function help ()
65{
66 if [ -n "$1" ]; then
67 echo "Unknown option : $1"
68 fi
69 ( echo "Usage: $MY_NAME &lt;comp_method&gt; [options] [dirs]" &amp;&amp; \
70 cat &lt;&lt; EOT
71Where comp_method is one of :
72 --gzip, --gz, -g
73 --bzip2, --bz2, -b
74 Compress using gzip or bzip2.
75
76 --decompress, -d
77 Decompress the man pages.
78
79 --backup Specify a .tar backup shall be done for all directories.
80 In case a backup already exists, it is saved as .tar.old
81 prior to making the new backup. If a .tar.old backup
82 exists, it is removed prior to saving the backup.
83 In backup mode, no other action is performed.
84
85And where options are :
86 -1 to -9, --fast, --best
87 The compression level, as accepted by gzip and bzip2.
88 When not specified, uses the default compression level
89 for the given method (-6 for gzip, and -9 for bzip2).
90 Not used when in backup or decompress modes.
91
92 --force, -F Force (re-)compression, even if the previous one was
93 the same method. Useful when changing the compression
94 ratio. By default, a page will not be re-compressed if
95 it ends with the same suffix as the method adds
96 (.bz2 for bzip2, .gz for gzip).
97
98 --soft, -S Change hard-links into soft-links. Use with _caution_
99 as the first encountered file will be used as a
100 reference. Not used when in backup mode.
101
102 --hard, -H Change soft-links into hard-links. Not used when in
103 backup mode.
104
105 --conf=dir, --conf dir
106 Specify the location of man.conf. Defaults to /etc.
107
108 --verbose, -v Verbose mode, print the name of the directory being
109 processed. Double the flag to turn it even more verbose,
110 and to print the name of the file being processed.
111
  --fake, -f    Fakes it. Print the actual parameters compressdoc will use.
113
114 dirs A list of space-separated _absolute_ pathnames to the
115 man directories. When empty, and only then, parse
116 ${MAN_CONF}/man.conf for all occurrences of MANPATH.
117
118Note about compression:
119 There has been a discussion on blfs-support about compression ratios of
120 both gzip and bzip2 on man pages, taking into account the hosting fs,
121 the architecture, etc... On the overall, the conclusion was that gzip
122 was much more efficient on 'small' files, and bzip2 on 'big' files,
123 small and big being very dependent on the content of the files.
124
125 See the original post from Mickael A. Peters, titled
126 "Bootable Utility CD", dated 20030409.1816(+0200), and subsequent posts:
127 http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html
128
129 On my system (x86, ext3), man pages were 35564KB before compression.
130 gzip -9 compressed them down to 20372KB (57.28%), bzip2 -9 got down to
131 19812KB (55.71%). That is a 1.57% gain in space. YMMV.
132
133 What was not taken into consideration was the decompression speed. But
134 does it make sense to? You gain fast access with uncompressed man
135 pages, or you gain space at the expense of a slight overhead in time.
136 Well, my P4-2.5GHz does not even let me notice this... :-)
137
138EOT
139) | less
140}
141
142# This function checks that the man page is unique amongst bzip2'd,
143# gzip'd and uncompressed versions.
144# $1 the directory in which the file resides
145# $2 the file name for the man page
146# Returns 0 (true) if the file is the latest and must be taken care of,
147# and 1 (false) if the file is not the latest (and has therefore been
148# deleted).
149function check_unique ()
150{
151 # NB. When there are hard-links to this file, these are
152 # _not_ deleted. In fact, if there are hard-links, they
153 # all have the same date/time, thus making them ready
154 # for deletion later on.
155
156 # Build the list of all man pages with the same name
157 DIR=$1
158 BASENAME=`basename "${2}" .bz2`
159 BASENAME=`basename "${BASENAME}" .gz`
160 GZ_FILE="$BASENAME".gz
161 BZ_FILE="$BASENAME".bz2
162
163 # Look for, and keep, the most recent one
164 LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" \
165 2&gt;/dev/null | tail -n 1)`
166 for i in "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}"; do
167 [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
168 done
169
170 # In case the specified file was the latest, return 0
171 [ "$LATEST" = "$2" ] &amp;&amp; return 0
172 # If the file was not the latest, return 1
173 return 1
174}
175
176# Name of the script
MY_NAME=`basename "$0"`
178
179# OK, parse the command-line for arguments, and initialize to some
180# sensible state, that is: don't change links state, parse
181# /etc/man.conf, be most silent, search man.conf in /etc, and don't
182# force (re-)compression.
183COMP_METHOD=
184COMP_SUF=
185COMP_LVL=
186FORCE_OPT=
187LN_OPT=
188MAN_DIR=
189VERBOSE_LVL=0
190BACKUP=no
191FAKE=no
192MAN_CONF=/etc
193while [ -n "$1" ]; do
194 case $1 in
195 --gzip|--gz|-g)
196 COMP_SUF=.gz
197 COMP_METHOD=$1
198 shift
199 ;;
200 --bzip2|--bz2|-b)
201 COMP_SUF=.bz2
202 COMP_METHOD=$1
203 shift
204 ;;
205 --decompress|-d)
206 COMP_SUF=
207 COMP_LVL=
208 COMP_METHOD=$1
209 shift
210 ;;
211 -[1-9]|--fast|--best)
212 COMP_LVL=$1
213 shift
214 ;;
215 --force|-F)
216 FORCE_OPT=-F
217 shift
218 ;;
219 --soft|-S)
220 LN_OPT=-S
221 shift
222 ;;
223 --hard|-H)
224 LN_OPT=-H
225 shift
226 ;;
227 --conf=*)
      MAN_CONF=`echo "$1" | cut -d '=' -f2-`
229 shift
230 ;;
231 --conf)
232 MAN_CONF="$2"
233 shift 2
234 ;;
235 --verbose|-v)
236 let VERBOSE_LVL++
237 shift
238 ;;
239 --backup)
240 BACKUP=yes
241 shift
242 ;;
243 --fake|-f)
244 FAKE=yes
245 shift
246 ;;
247 --help|-h)
248 help
249 exit 0
250 ;;
251 /*)
252 MAN_DIR="${MAN_DIR} ${1}"
253 shift
254 ;;
255 -*)
256 help $1
257 exit 1
258 ;;
259 *)
260 echo "\"$1\" is not an absolute path name"
261 exit 1
262 ;;
263 esac
264done
265
266# Redirections
267case $VERBOSE_LVL in
268 0)
    # 0, be silent
270 DEST_FD0=/dev/null
271 DEST_FD1=/dev/null
272 VERBOSE_OPT=
273 ;;
274 1)
275 # 1, be a bit verbose
276 DEST_FD0=/dev/stdout
277 DEST_FD1=/dev/null
278 VERBOSE_OPT=-v
279 ;;
280 *)
281 # 2 and above, be most verbose
282 DEST_FD0=/dev/stdout
283 DEST_FD1=/dev/stdout
284 VERBOSE_OPT="-v -v"
285 ;;
286esac
287
288# Note: on my machine, 'man --path' gives /usr/share/man twice, once
289# with a trailing '/', once without.
290if [ -z "$MAN_DIR" ]; then
291 MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
292 | sed 's/:/\\n/g' \
293 | while read foo; do dirname "$foo"/.; done \
294 | sort -u \
295 | while read bar; do echo -n "$bar "; done`
296fi
297
298# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
299if [ -z "$MAN_DIR" ]; then
300 echo "No directory specified, and no directory found with \`man --path'"
301 exit 1
302fi
303
304# Fake?
305if [ "$FAKE" != "no" ]; then
306 echo "Actual parameters used:"
307 echo -n "Compression.......: "
308 case $COMP_METHOD in
309 --bzip2|--bz2|-b) echo -n "bzip2";;
    --gzip|--gz|-g)   echo -n "gzip";;
311 --decompress|-d) echo -n "decompressing";;
312 *) echo -n "unknown";;
313 esac
314 echo " ($COMP_METHOD)"
315 echo "Compression level.: $COMP_LVL"
316 echo "Compression suffix: $COMP_SUF"
317 echo -n "Force compression.: "
318 [ "foo$FORCE_OPT" = "foo-F" ] &amp;&amp; echo "yes" || echo "no"
319 echo "man.conf is.......: ${MAN_CONF}/man.conf"
320 echo -n "Hard-links........: "
321 [ "foo$LN_OPT" = "foo-S" ] &amp;&amp;
322 echo "convert to soft-links" || echo "leave as is"
323 echo -n "Soft-links........: "
324 [ "foo$LN_OPT" = "foo-H" ] &amp;&amp;
325 echo "convert to hard-links" || echo "leave as is"
326 echo "Backup............: $BACKUP"
327 echo "Faking (yes!).....: $FAKE"
328 echo "Directories.......: $MAN_DIR"
329 echo "Verbosity level...: $VERBOSE_LVL"
330 exit 0
331fi
332
333# If no method was specified, print help
334if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
335 help
336 exit 1
337fi
338
339# In backup mode, do the backup solely
340if [ "$BACKUP" = "yes" ]; then
341 for DIR in $MAN_DIR; do
342 cd "${DIR}/.."
343 DIR_NAME=`basename "${DIR}"`
344 echo "Backing up $DIR..." &gt; $DEST_FD0
345 [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
346 [ -f "${DIR_NAME}.tar" ] &amp;&amp;
347 mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
    tar -cvf "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
349 done
350 exit 0
351fi
352
353# I know MAN_DIR has only absolute path names
354# I need to take into account the localized man, so I'm going recursive
355for DIR in $MAN_DIR; do
356 MEM_DIR=`pwd`
357 cd "$DIR"
358 for FILE in *; do
    # Fixes the case where the directory is empty
360 if [ "foo$FILE" = "foo*" ]; then continue; fi
361
362 # Fixes the case when hard-links see their compression scheme change
363 # (from not compressed to compressed, or from bz2 to gz, or from gz
364 # to bz2)
    # Also fixes the case when multiple versions of the page are present,
    # which are either compressed or not.
367 if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi
368
369 # Do not compress whatis files
370 if [ "$FILE" = "whatis" ]; then continue; fi
371
372 if [ -d "$FILE" ]; then
373 cd "${MEM_DIR}" # Go back to where we ran "$0",
374 # in case "$0"=="./compressdoc" ...
375 # We are going recursive to that directory
376 echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
377 # I need not pass --conf, as I specify the directory to work on
378 # But I need exit in case of error
379 "$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} \
380 ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
381 echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
382 cd "$DIR" # Needed for the next iteration of the loop
383
384 else # !dir
385 if ! check_unique "$DIR" "$FILE"; then continue; fi
386
387 # Check if the file is already compressed with the specified method
388 BASE_FILE=`basename "$FILE" .gz`
389 BASE_FILE=`basename "$BASE_FILE" .bz2`
390 if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" \
391 -a "foo${FORCE_OPT}" = "foo" ]; then continue; fi
392
393 # If we have a symlink
394 if [ -h "$FILE" ]; then
395 case "$FILE" in
396 *.bz2)
397 EXT=bz2 ;;
398 *.gz)
399 EXT=gz ;;
400 *)
401 EXT=none ;;
402 esac
403
404 if [ ! "$EXT" = "none" ]; then
405 LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 \
406 | tr -d " " | sed s/\.$EXT$//`
407 NEWNAME=`echo "$FILE" | sed s/\.$EXT$//`
408 mv "$FILE" "$NEWNAME"
409 FILE="$NEWNAME"
410 else
411 LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
412 fi
413
414 if [ "$LN_OPT" = "-H" ]; then
415 # Change this soft-link into a hard- one
416 rm -f "$FILE" &amp;&amp; ln "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
417 chmod --reference "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
418 else
419 # Keep this soft-link a soft- one.
420 rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
421 fi
422 echo "Relinked $FILE" &gt; $DEST_FD1
423
424 # else if we have a plain file
425 elif [ -f "$FILE" ]; then
426 # Take care of hard-links: build the list of files hard-linked
427 # to the one we are {de,}compressing.
        # NB. This is not optimal, as the file will eventually be
        # compressed as many times as it has hard-links. But for now,
        # that's the safe way.
431 inode=`ls -li "$FILE" | awk '{print $1}'`
432 HLINKS=`find . \! -name "$FILE" -inum $inode`
433
434 if [ -n "$HLINKS" ]; then
435 # We have hard-links! Remove them now.
436 for i in $HLINKS; do rm -f "$i"; done
437 fi
438
439 # Now take care of the file that has no hard-link
440 # We do decompress first to re-compress with the selected
441 # compression ratio later on...
442 case "$FILE" in
443 *.bz2)
            bunzip2 "$FILE"
445 FILE=`basename "$FILE" .bz2`
446 ;;
447 *.gz)
            gunzip "$FILE"
449 FILE=`basename "$FILE" .gz`
450 ;;
451 esac
452
453 # Compress the file with the given compression ratio, if needed
454 case $COMP_SUF in
455 *bz2)
456 bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
457 echo "Compressed $FILE" &gt; $DEST_FD1
458 ;;
459 *gz)
460 gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
461 echo "Compressed $FILE" &gt; $DEST_FD1
462 ;;
463 *)
464 echo "Uncompressed $FILE" &gt; $DEST_FD1
465 ;;
466 esac
467
468 # If the file had hard-links, recreate those (either hard or soft)
469 if [ -n "$HLINKS" ]; then
470 for i in $HLINKS; do
471 NEWFILE=`echo "$i" | sed s/\.gz$// | sed s/\.bz2$//`
472 if [ "$LN_OPT" = "-S" ]; then
473 # Make this hard-link a soft- one
474 ln -s "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
475 else
476 # Keep the hard-link a hard- one
477 ln "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
478 fi
479 # Really work only for hard-links. Harmless for soft-links
480 chmod 644 "${NEWFILE}$COMP_SUF"
481 done
482 fi
483
484 else
485 # There is a problem when we get neither a symlink nor a plain
486 # file. Obviously, we shall never ever come here... :-(
487 echo -n "Whaooo... \"${DIR}/${FILE}\" is neither a symlink "
488 echo "nor a plain file. Please check:"
489 ls -l "${DIR}/${FILE}"
490 exit 1
491 fi
492 fi
493 done # for FILE
done # for DIR</literal>

EOF
chmod 755 /usr/sbin/compressdoc</userinput></screen>
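
  <para>The hard-link handling in the script relies on inode numbers: it reads
  the inode of the page with <command>ls -li</command> and then asks
  <command>find</command> for every other name carrying that inode. The
  following is a minimal sketch of that lookup in a scratch directory. The
  file names are hypothetical, and <option>-maxdepth 1</option> is an addition
  for the sketch that the script itself does not use.</para>

```shell
#!/bin/bash
# Illustration only: a throwaway directory with made-up page names.
demo_dir=$(mktemp -d)
echo ".TH FOO 1" > "$demo_dir/foo.1"
ln "$demo_dir/foo.1" "$demo_dir/bar.1"     # hard link to foo.1
echo ".TH BAZ 1" > "$demo_dir/baz.1"       # unrelated page

# First column of `ls -i` is the inode number.
inode=$(ls -i "$demo_dir/foo.1" | awk '{print $1}')

# Every other directory entry sharing that inode is a hard link to foo.1.
siblings=$(cd "$demo_dir" && find . -maxdepth 1 ! -name foo.1 -inum "$inode")

echo "hard links to foo.1: $siblings"
rm -rf "$demo_dir"
```

  <para>In the script, the names found this way are removed before the page is
  recompressed and are recreated afterwards, as hard links or as symbolic
  links depending on <option>--soft</option>.</para>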
498
  <para>Now, as the <systemitem class="username">root</systemitem> user, you
  can run <command>compressdoc --bz2</command> to compress all your system man
  pages. You can also run <command>compressdoc --help</command> to get
  comprehensive help about what the script is able to do.</para>
503
  <para>Don't forget that a few programs, like the <application>X Window
  System</application> and <application>XEmacs</application>, also
  install their documentation in nonstandard places (such as
  <filename class="directory">/usr/X11R6/man</filename>). Be sure
  to add these locations to the file <filename>/etc/man.conf</filename>, as a
  <envar>MANPATH</envar>=<replaceable>[/path]</replaceable> section.</para>
510
  <para>Example:</para>
512
<screen><literal>    ...
    MANPATH=/usr/share/man
    MANPATH=/usr/local/man
    MANPATH=/usr/X11R6/man
    MANPATH=/opt/qt/doc/man
    ...</literal></screen>
519
  <para>Generally, package installation systems do not compress man/info pages,
  which means you will need to run the script again after installing or
  upgrading a package if you want to keep the size of your documentation as
  small as possible. Also, note that running the script after upgrading a
  package is safe; when you have several versions of a page (for example, one
  compressed and one uncompressed), the most recent one is kept and the others
  are deleted.</para>
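
  <para>The "most recent one is kept" rule comes from the
  <function>check_unique</function> function in the script, which sorts the
  uncompressed, gzip, and bzip2 variants of a page by modification time. A
  minimal sketch of that selection follows, assuming GNU
  <command>touch</command> and using a scratch directory with made-up file
  names and timestamps.</para>

```shell
#!/bin/bash
# Illustration only: a scratch directory standing in for a man section.
demo_dir=$(mktemp -d)
echo "old page" > "$demo_dir/foo.1"
gzip -c "$demo_dir/foo.1" > "$demo_dir/foo.1.gz"

# Pretend a package upgrade installed a fresh, uncompressed foo.1
# after the older copy had been compressed (GNU touch -d assumed).
touch -d '2005-01-01 00:00:00' "$demo_dir/foo.1.gz"
touch -d '2005-06-01 00:00:00' "$demo_dir/foo.1"

# Same idea as check_unique: list oldest first, keep the last (newest) name.
# foo.1.bz2 does not exist here; ls reports that on stderr, which is dropped.
latest=$(cd "$demo_dir" && ls -1rt foo.1 foo.1.gz foo.1.bz2 2>/dev/null | tail -n 1)

# Delete every variant that is not the newest one.
for candidate in foo.1 foo.1.gz foo.1.bz2; do
  if [ "$latest" != "$candidate" ]; then
    rm -f "$demo_dir/$candidate"
  fi
done

remaining=$(ls "$demo_dir")
echo "kept: $remaining"
rm -rf "$demo_dir"
```

  <para>Because the decision is based purely on timestamps, a page restored
  from an old backup with a fresh modification time would win over a newer
  page, so it is worth checking timestamps after manual restores.</para>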
526
</sect1>