source: postlfs/config/compressdoc.xml@ 5cd0959d

Last change on this file since 5cd0959d was 5cd0959d, checked in by Archaic <archaic@…>, 17 years ago

Resetting keywords

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@2592 af4574ff-66df-0310-9fd7-8a98e5e911e0

1<?xml version="1.0" encoding="ISO-8859-1"?>
2<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
3 "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
4 <!ENTITY % general-entities SYSTEM "../../general.ent">
5 %general-entities;
6]>
7
8<sect1 id="postlfs-config-compressdoc" xreflabel="compressdoc">
9<sect1info>
10<othername>$LastChangedBy$</othername>
11<date>$Date$</date>
12</sect1info>
13<?dbhtml filename="compressdoc.html"?>
14<title>Compressing man and info pages</title>
15
16<para>Man and info reader programs can transparently process gzip'ed or
17bzip2'ed pages, a feature you can use to free some disk space while keeping
18your documentation available. However, things are not that simple; man
19directories tend to contain links&mdash;hard and symbolic&mdash;which defeat simple
20ideas like recursively calling <command>gzip</command> on them. A better way
21to go is to use the script below.
22</para>
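<para>The link problem can be reproduced in a scratch directory. The sketch
below is illustrative only; the function name and file names are hypothetical.
It compresses a page that a symbolic link points to and reports the state of the
link afterwards. Hard links are a related case: by default
<command>gzip</command> leaves a file with multiple links unchanged unless
forced.</para>

```shell
# Hypothetical demo: one man page, one symlink alias, then a plain gzip.
# The function body runs in a subshell so its variables stay local.
demo_dangling_link() (
    dir=$(mktemp -d) || exit 1
    printf '.TH DEMO 1\n' > "$dir/demo.1"
    ln -s demo.1 "$dir/alias.1"      # alias.1 -> demo.1
    gzip "$dir/demo.1"               # demo.1 is replaced by demo.1.gz
    # -e follows the symlink, so it fails once the target name is gone.
    if [ -e "$dir/alias.1" ]; then
        echo "intact"
    else
        echo "dangling"
    fi
    rm -rf "$dir"
)

demo_dangling_link
```

<para>The alias still names <filename>demo.1</filename>, which no longer
exists, so the link has to be recreated to point at the compressed
file.</para>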
23
24<screen><userinput><command>cat &gt; /usr/sbin/compressdoc &lt;&lt; "EOF"</command>
25#!/bin/bash
26# VERSION: 20040320.0026
27#
28# Compress (with bzip2 or gzip) all man pages in a hierarchy and
29# update symlinks - By Marc Heerdink &lt;marc @ koelkast.net&gt;
30# Modified to be able to gzip or bzip2 files as an option and to deal
31# with all symlinks properly by Mark Hymers &lt;markh @ linuxfromscratch.org&gt;
32#
33# Modified 20030930 by Yann E. Morin &lt;yann.morin.1998 @ anciens.enib.fr&gt;
34# to accept compression/decompression, to correctly handle hard-links,
35# to allow for changing hard-links into soft- ones, to specify the
36# compression level, to parse the man.conf for all occurrences of MANPATH,
37# to allow for a backup, to allow to keep the newest version of a page.
38# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the script.
39# (Note: It is assumed that the script is in the user's PATH)
40#
41# TODO:
42# - choose a default compress method to be based on the available
43# tool : gzip or bzip2;
44# - offer an option to automagically choose the best compression method
45# on a per page basis (e.g. check which of gzip/bzip2/whatever is the
46# most effective, page per page);
47# - when a MANPATH env var exists, use this instead of /etc/man.conf
48# (useful for users to (de)compress their man pages);
49# - offer an option to restore a previous backup;
50# - add other compression engines (compress, zip, etc?). Needed?
51
52# Funny enough, this function prints some help.
53function help ()
54{
55 if [ -n "$1" ]; then
56 echo "Unknown option : $1"
57 fi
58 ( echo "Usage: $MY_NAME &lt;comp_method&gt; [options] [dirs]" &amp;&amp; \
59 cat &lt;&lt; EOT
60Where comp_method is one of :
61 --gzip, --gz, -g
62 --bzip2, --bz2, -b
63 Compress using gzip or bzip2.
64
65 --decompress, -d
66 Decompress the man pages.
67
68 --backup Specify that a .tar backup shall be made of every directory.
69 In case a backup already exists, it is saved as .tar.old prior
70 to making the new backup. If a .tar.old backup exists, it is
71 removed prior to saving the backup.
72 In backup mode, no other action is performed.
73
74And where options are :
75 -1 to -9, --fast, --best
76 The compression level, as accepted by gzip and bzip2. When not
77 specified, uses the default compression level for the given
78 method (-6 for gzip, and -9 for bzip2). Not used when in backup
79 or decompress modes.
80
81 --force, -F Force (re-)compression, even if the previous one was the same
82 method. Useful when changing the compression ratio. By default,
83 a page will not be re-compressed if it ends with the same suffix
84 as the method adds (.bz2 for bzip2, .gz for gzip).
85
86 --soft, -S Change hard-links into soft-links. Use with _caution_ as the
87 first encountered file will be used as a reference. Not used
88 when in backup mode.
89
90 --hard, -H Change soft-links into hard-links. Not used when in backup mode.
91
92 --conf=dir, --conf dir
93 Specify the location of man.conf. Defaults to /etc.
94
95 --verbose, -v Verbose mode, print the name of the directory being processed.
96 Double the flag to turn it even more verbose, and to print the
97 name of the file being processed.
98
99 --fake, -f Fakes it. Print the actual parameters compressdoc will use.
100
101 dirs A list of space-separated _absolute_ pathnames to the man
102 directories.
103 When empty, and only then, parse ${MAN_CONF}/man.conf for all
104 occurrences of MANPATH.
105
106Note about compression
107 There has been a discussion on blfs-support about compression ratios of
108 both gzip and bzip2 on man pages, taking into account the hosting fs,
109 the architecture, etc... Overall, the conclusion was that gzip
110 was more efficient on 'small' files, and bzip2 on 'big' files, small and
111 big being very dependent on the content of the files.
112
113 See the original post from Mickael A. Peters, titled "Bootable Utility CD",
114 and dated 20030409.1816(+0200), and subsequent posts:
115 http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html
116
117 On my system (x86, ext3), man pages were 35564kiB before compression. gzip -9
118 compressed them down to 20372kiB (57.28%), bzip2 -9 got down to 19812kiB
119 (55.71%). That is a 1.57% gain in space. YMMV.
120
121 What was not taken into consideration was the decompression speed. But does
122 it make sense to? You gain fast access with uncompressed man pages, or you
123 gain space at the expense of a slight overhead in time. Well, my P4-2.5GHz
124 does not even let me notice this... :-)
125EOT
126) | less
127}
128
129# This function checks that the man page is unique amongst bzip2'd, gzip'd and
130# uncompressed versions.
131# $1 the directory in which the file resides
132# $2 the file name for the man page
133# Returns 0 (true) if the file is the latest and must be taken care of, and 1
134# (false) if the file is not the latest (and has therefore been deleted).
135function check_unique ()
136{
137 # NB. When there are hard-links to this file, these are
138 # _not_ deleted. In fact, if there are hard-links, they
139 # all have the same date/time, thus making them ready
140 # for deletion later on.
141
142 # Build the list of all man pages with the same name
143 DIR=$1
144 BASENAME=`basename "${2}" .bz2`
145 BASENAME=`basename "${BASENAME}" .gz`
146 GZ_FILE="$BASENAME".gz
147 BZ_FILE="$BASENAME".bz2
148
149 # Look for, and keep, the most recent one
150 LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" 2&gt;/dev/null | tail -n 1)`
151 for i in "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}"; do
152 [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
153 done
154
155 # In case the specified file was the latest, return 0
156 [ "$LATEST" = "$2" ] &amp;&amp; return 0
157 # If the file was not the latest, return 1
158 return 1
159}
160
161# Name of the script
162MY_NAME=`basename "$0"`
163
164# OK, parse the command-line for arguments, and initialize to some sensible
165# state, that is : don't change links state, parse /etc/man.conf, be most
166# silent, search man.conf in /etc, and don't force (re-)compression.
167COMP_METHOD=
168COMP_SUF=
169COMP_LVL=
170FORCE_OPT=
171LN_OPT=
172MAN_DIR=
173VERBOSE_LVL=0
174BACKUP=no
175FAKE=no
176MAN_CONF=/etc
177while [ -n "$1" ]; do
178 case $1 in
179 --gzip|--gz|-g)
180 COMP_SUF=.gz
181 COMP_METHOD=$1
182 shift
183 ;;
184 --bzip2|--bz2|-b)
185 COMP_SUF=.bz2
186 COMP_METHOD=$1
187 shift
188 ;;
189 --decompress|-d)
190 COMP_SUF=
191 COMP_LVL=
192 COMP_METHOD=$1
193 shift
194 ;;
195 -[1-9]|--fast|--best)
196 COMP_LVL=$1
197 shift
198 ;;
199 --force|-F)
200 FORCE_OPT=-F
201 shift
202 ;;
203 --soft|-S)
204 LN_OPT=-S
205 shift
206 ;;
207 --hard|-H)
208 LN_OPT=-H
209 shift
210 ;;
211 --conf=*)
212 MAN_CONF=`echo $1 | cut -d '=' -f2-`
213 shift
214 ;;
215 --conf)
216 MAN_CONF="$2"
217 shift 2
218 ;;
219 --verbose|-v)
220 let VERBOSE_LVL++
221 shift
222 ;;
223 --backup)
224 BACKUP=yes
225 shift
226 ;;
227 --fake|-f)
228 FAKE=yes
229 shift
230 ;;
231 --help|-h)
232 help
233 exit 0
234 ;;
235 /*)
236 MAN_DIR="${MAN_DIR} ${1}"
237 shift
238 ;;
239 -*)
240 help $1
241 exit 1
242 ;;
243 *)
244 echo "\"$1\" is not an absolute path name"
245 exit 1
246 ;;
247 esac
248done
249
250# Redirections
251case $VERBOSE_LVL in
252 0)
253 # 0, be silent
254 DEST_FD0=/dev/null
255 DEST_FD1=/dev/null
256 VERBOSE_OPT=
257 ;;
258 1)
259 # 1, be a bit verbose
260 DEST_FD0=/dev/stdout
261 DEST_FD1=/dev/null
262 VERBOSE_OPT=-v
263 ;;
264 *)
265 # 2 and above, be most verbose
266 DEST_FD0=/dev/stdout
267 DEST_FD1=/dev/stdout
268 VERBOSE_OPT="-v -v"
269 ;;
270esac
271
272# Note: on my machine, 'man --path' gives /usr/share/man twice, once with a trailing '/', once without.
273if [ -z "$MAN_DIR" ]; then
274 MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
275 | sed 's/:/\\n/g' \
276 | while read foo; do dirname "$foo"/.; done \
277 | sort -u \
278 | while read bar; do echo -n "$bar "; done`
279fi
280
281# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
282if [ -z "$MAN_DIR" ]; then
283 echo "No directory specified, and no directory found with \`man --path'"
284 exit 1
285fi
286
287# Fake?
288if [ "$FAKE" != "no" ]; then
289 echo "Actual parameters used:"
290 echo -n "Compression.......: "
291 case $COMP_METHOD in
292 --bzip2|--bz2|-b) echo -n "bzip2";;
293 --gzip|--gz|-g) echo -n "gzip";;
294 --decompress|-d) echo -n "decompressing";;
295 *) echo -n "unknown";;
296 esac
297 echo " ($COMP_METHOD)"
298 echo "Compression level.: $COMP_LVL"
299 echo "Compression suffix: $COMP_SUF"
300 echo -n "Force compression.: "
301 [ "foo$FORCE_OPT" = "foo-F" ] &amp;&amp; echo "yes" || echo "no"
302 echo "man.conf is.......: ${MAN_CONF}/man.conf"
303 echo -n "Hard-links........: "
304 [ "foo$LN_OPT" = "foo-S" ] &amp;&amp; echo "convert to soft-links" || echo "leave as is"
305 echo -n "Soft-links........: "
306 [ "foo$LN_OPT" = "foo-H" ] &amp;&amp; echo "convert to hard-links" || echo "leave as is"
307 echo "Backup............: $BACKUP"
308 echo "Faking (yes!).....: $FAKE"
309 echo "Directories.......: $MAN_DIR"
310 echo "Verbosity level...: $VERBOSE_LVL"
311 exit 0
312fi
313
314# If no method was specified, print help
315if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
316 help
317 exit 1
318fi
319
320# In backup mode, do the backup solely
321if [ "$BACKUP" = "yes" ]; then
322 for DIR in $MAN_DIR; do
323 cd "${DIR}/.."
324 DIR_NAME=`basename "${DIR}"`
325 echo "Backing up $DIR..." &gt; $DEST_FD0
326 [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
327 [ -f "${DIR_NAME}.tar" ] &amp;&amp; mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
328 tar cfv "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
329 done
330 exit 0
331fi
332
333# I know MAN_DIR has only absolute path names
334# I need to take into account the localized man, so I'm going recursive
335for DIR in $MAN_DIR; do
336 MEM_DIR=`pwd`
337 cd "$DIR"
338 for FILE in *; do
339 # Fixes the case where the directory is empty
340 if [ "foo$FILE" = "foo*" ]; then continue; fi
341
342 # Fixes the case when hard-links see their compression scheme change
343 # (from not compressed to compressed, or from bz2 to gz, or from gz to bz2)
344 # Also fixes the case when multiple versions of the page are present, which
345 # are either compressed or not.
346 if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi
347
348 # Do not compress whatis files
349 if [ "$FILE" = "whatis" ]; then continue; fi
350
351 if [ -d "$FILE" ]; then
352 cd "${MEM_DIR}" # Go back to where we ran "$0", in case "$0"=="./compressdoc" ...
353 # We are going recursive to that directory
354 echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
355 # I need not pass --conf, as I specify the directory to work on
356 # But I need exit in case of error
357 "$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
358 echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
359 cd "$DIR" # Needed for the next iteration of the loop
360
361 else # !dir
362 if ! check_unique "$DIR" "$FILE"; then continue; fi
363
364 # Check if the file is already compressed with the specified method
365 BASE_FILE=`basename "$FILE" .gz`
366 BASE_FILE=`basename "$BASE_FILE" .bz2`
367 if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" -a "foo${FORCE_OPT}" = "foo" ]; then continue; fi
368
369 # If we have a symlink
370 if [ -h "$FILE" ]; then
371 case "$FILE" in
372 *.bz2)
373 EXT=bz2 ;;
374 *.gz)
375 EXT=gz ;;
376 *)
377 EXT=none ;;
378 esac
379
380 if [ ! "$EXT" = "none" ]; then
381 LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " " | sed s/\.$EXT$//`
382 NEWNAME=`echo "$FILE" | sed s/\.$EXT$//`
383 mv "$FILE" "$NEWNAME"
384 FILE="$NEWNAME"
385 else
386 LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
387 fi
388
389 if [ "$LN_OPT" = "-H" ]; then
390 # Change this soft-link into a hard- one
391 rm -f "$FILE" &amp;&amp; ln "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
392 chmod --reference "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
393 else
394 # Keep this soft-link a soft- one.
395 rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
396 fi
397 echo "Relinked $FILE" &gt; $DEST_FD1
398
399 # else if we have a plain file
400 elif [ -f "$FILE" ]; then
401 # Take care of hard-links: build the list of files hard-linked
402 # to the one we are {de,}compressing.
403 # NB. This is not optimal, as the file will eventually be compressed
404 # as many times as it has hard-links. But for now, that's the safe way.
405 inode=`ls -li "$FILE" | awk '{print $1}'`
406 HLINKS=`find . \! -name "$FILE" -inum $inode`
407
408 if [ -n "$HLINKS" ]; then
409 # We have hard-links! Remove them now.
410 for i in $HLINKS; do rm -f "$i"; done
411 fi
412
413 # Now take care of the file that has no hard-link
414 # We do decompress first to re-compress with the selected
415 # compression ratio later on...
416 case "$FILE" in
417 *.bz2)
418 bunzip2 "$FILE"
419 FILE=`basename "$FILE" .bz2`
420 ;;
421 *.gz)
422 gunzip "$FILE"
423 FILE=`basename "$FILE" .gz`
424 ;;
425 esac
426
427 # Compress the file with the given compression ratio, if needed
428 case $COMP_SUF in
429 *bz2)
430 bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
431 echo "Compressed $FILE" &gt; $DEST_FD1
432 ;;
433 *gz)
434 gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
435 echo "Compressed $FILE" &gt; $DEST_FD1
436 ;;
437 *)
438 echo "Uncompressed $FILE" &gt; $DEST_FD1
439 ;;
440 esac
441
442 # If the file had hard-links, recreate those (either hard or soft)
443 if [ -n "$HLINKS" ]; then
444 for i in $HLINKS; do
445 NEWFILE=`echo "$i" | sed s/\.gz$// | sed s/\.bz2$//`
446 if [ "$LN_OPT" = "-S" ]; then
447 # Make this hard-link a soft- one
448 ln -s "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
449 else
450 # Keep the hard-link a hard- one
451 ln "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
452 fi
453 chmod 644 "${NEWFILE}$COMP_SUF" # Really works only for hard-links; harmless for soft-links
454 done
455 fi
456
457 else
458 # There is a problem when we get neither a symlink nor a plain file
459 # Obviously, we shall never ever come here... :-(
460 echo "Whaooo... \"${DIR}/${FILE}\" is neither a symlink nor a plain file. Please check:"
461 ls -l "${DIR}/${FILE}"
462 exit 1
463 fi
464 fi
465 done # for FILE
466done # for DIR
467<command>EOF
468chmod 755 /usr/sbin/compressdoc</command></userinput></screen>
469
470<para>Now, as root, you can run
471<command>compressdoc --bz2</command> to compress all your system man
472pages. You can also run <command>compressdoc --help</command> for
473comprehensive help on what the script can do.</para>
474
475<para>Don't forget that a few programs, such as the <application>X</application>
476Window System and <application>XEmacs</application>, also install their
477documentation in non-standard places (for example,
478<filename class="directory">/usr/X11R6/man</filename>). Be sure to add these locations to the
479file <filename>/etc/man.conf</filename>, each as a
480<envar>MANPATH</envar>=<replaceable>/path</replaceable> line.</para>
481<para> Example:</para><screen><userinput>
482 ...
483 MANPATH=/usr/share/man
484 MANPATH=/usr/local/man
485 MANPATH=/usr/X11R6/man
486 MANPATH=/opt/qt/doc/man
487 ...</userinput></screen>
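<para>Adding such a line by hand is simple, but a small helper avoids
duplicate entries when the step is repeated. The sketch below is an assumption
rather than part of the script: <command>add_manpath</command> is a hypothetical
name, it writes the <envar>MANPATH</envar>=<replaceable>/path</replaceable> form
shown above, and it is exercised on a scratch file instead of the real
<filename>/etc/man.conf</filename>. Check the syntax your man implementation
expects before using anything like it on the live file.</para>

```shell
# add_manpath CONF DIR (hypothetical helper): append "MANPATH=DIR" to CONF
# unless that exact line is already present.
add_manpath() (
    conf=$1
    dir=$2
    line="MANPATH=$dir"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if grep -qxF -- "$line" "$conf" 2>/dev/null; then
        echo "already listed: $dir"
    else
        echo "$line" >> "$conf"
        echo "added: $dir"
    fi
)

# Try it on a scratch file rather than the real configuration.
scratch=$(mktemp)
add_manpath "$scratch" /usr/X11R6/man
add_manpath "$scratch" /usr/X11R6/man
cat "$scratch"
rm -f "$scratch"
```

<para>The second call finds the existing line and leaves the file
unchanged.</para>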
488
489<para>Generally, package installation systems do not compress man/info pages,
490so you will need to run the script again after installing packages if you want
491to keep your documentation as small as possible. Running the script
492after upgrading a package is safe: when several versions of a page exist
493(for example, one compressed and one uncompressed), the most recent one is kept
494and the others are deleted.</para>
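<para>The "most recent one is kept" rule comes from sorting the candidate
files by modification time, as the script's <command>check_unique</command>
function does with <command>ls -1rt</command> and <command>tail</command>. The
sketch below isolates only the selection step; <command>newest_variant</command>
and the file names are hypothetical, and unlike the script it deletes
nothing.</para>

```shell
# newest_variant DIR PAGE (hypothetical helper): print whichever of PAGE,
# PAGE.gz and PAGE.bz2 in DIR was modified most recently.
newest_variant() (
    cd "$1" || exit 1
    # -t sorts by modification time and -r reverses it, so the newest
    # name comes last; names that do not exist are silently skipped.
    ls -1rt "$2" "$2.gz" "$2.bz2" 2>/dev/null | tail -n 1
)

demo=$(mktemp -d)
printf 'old\n' > "$demo/foo.1"
printf 'new\n' | gzip > "$demo/foo.1.gz"
touch -t 200401010000 "$demo/foo.1"      # make the uncompressed copy older
newest_variant "$demo" foo.1
rm -rf "$demo"
```

<para>Here the compressed copy is newer, so its name is the one reported; the
script would then remove the older uncompressed file.</para>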
495
496</sect1>
497