source: postlfs/config/compressdoc.xml@ b7d0bb4

Last change on this file since b7d0bb4 was b7d0bb4, checked in by Larry Lawrence <larry@…>, 21 years ago

standardizing compound words

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@1294 af4574ff-66df-0310-9fd7-8a98e5e911e0

<sect1 id="postlfs-config-compressdoc" xreflabel="compressdoc">
<?dbhtml filename="compressdoc.html" dir="postlfs"?>
<title>Compressing man and info pages</title>

<para>Man and info reader programs can transparently process pages compressed
with gzip or bzip2, a feature you can use to free some disk space while keeping
your documentation available. However, things are not that simple: man
directories tend to contain links, both hard and symbolic, which defeat simple
approaches such as recursively calling <command>gzip</command> on them. A
better way to go is to use the script below.
</para>
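
<para>To make the link problem concrete, here is a minimal sketch, not part of
the script itself. It builds a scratch directory (the names
<filename>demo.1</filename>, <filename>alias.1</filename> and
<filename>link.1</filename> are invented for the illustration), then uses the
same inode lookup the script relies on to find hard links.</para>

```shell
# Hypothetical scratch directory standing in for a real man section.
workdir=$(mktemp -d)
mkdir "$workdir/man1"
printf '.TH DEMO 1\n' > "$workdir/man1/demo.1"

# One hard link and one symbolic link to the same page.
ln "$workdir/man1/demo.1" "$workdir/man1/alias.1"
ln -s demo.1 "$workdir/man1/link.1"

# Hard links share an inode number; the symlink has an inode of its own.
inode=$(ls -i "$workdir/man1/demo.1" | awk '{print $1}')
hlinks=$(cd "$workdir/man1"; find . ! -name demo.1 -inum "$inode")
echo "hard links to demo.1: $hlinks"

# The symlink only stores a name, which goes stale once demo.1 is renamed
# to demo.1.gz by a compressor.
target=$(readlink "$workdir/man1/link.1")
echo "link.1 points to: $target"
```

<para>Compressing <filename>demo.1</filename> on its own would leave the hard
link holding a separate uncompressed copy and the symbolic link pointing at a
name that no longer exists, which is why every link has to be recreated with
the new suffix.</para>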

<screen><userinput><command>cat &gt; /usr/bin/compressdoc &lt;&lt; "EOF"</command>
#!/bin/bash
#
# Compress (with bzip2 or gzip) all man pages in a hierarchy and
# update symlinks - By Marc Heerdink &lt;marc@koelkast.net&gt;.
# Modified to be able to gzip or bzip2 files as an option and to deal
# with all symlinks properly by Mark Hymers &lt;markh@linuxfromscratch.org&gt;
#
# Modified 20030925 by Yann E. Morin &lt;yann.morin.1998 @ anciens.enib.fr&gt;
# to accept compression/decompression, to correctly handle hard-links,
# to allow for changing hard-links into soft ones, to specify the
# compression level, to parse the man.conf for all occurrences of MANPATH,
# to allow for a backup, to allow to keep the newest version of a page.
#
# TODO:
#     - invert the quiet option into a verbose one, so as to be silent
#       by default;
#     - choose a default compress method to be based on the available
#       tool : gzip or bzip2;
#     - when a MANPATH env var exists, use this instead of /etc/man.conf
#       (useful for users to (de)compress their man pages);
#     - offer an option to restore a previous backup;
#     - add other compression engines (compress, zip, etc?). Needed?

# Funny enough, this function prints some help.
function help ()
{
  if [ -n "$1" ]; then
    echo "Unknown option : $1"
  fi
  echo "Usage: $0 &lt;comp_method&gt; [options] [dirs]"
  cat &lt;&lt; EOT
Where comp_method is one of :
  --gzip, --gz, -g
  --bzip2, --bz2, -b
                Compress using gzip or bzip2.

  --decompress, -d
                Decompress the man pages.

  --backup      Specify a .tar backup shall be done for every directory.
                In case a backup already exists, it is saved as .tar.old prior
                to making the new backup. If a .tar.old backup exists, it is
                removed prior to saving the backup.
                In backup mode, no other action is performed.

And where options are :
  -1 to -9, --fast, --best
                The compression level, as accepted by gzip and bzip2. When not
                specified, uses the default compression level for the given
                method (-6 for gzip, and -9 for bzip2). Not used when in backup
                or decompress modes.

  --force, -F   Force (re-)compression, even if the previous one was the same
                method. Useful when changing the compression ratio. By default,
                a page will not be re-compressed if it ends with the same suffix
                as the method adds (.bz2 for bzip2, .gz for gzip).

  --soft, -s    Change hard-links into soft-links. Use with _caution_ as the
                first encountered file will be used as a reference. Not used
                when in backup mode.

  --conf=dir, --conf dir
                Specify the location of man.conf. Defaults to /etc.

  --verbose, -v Verbose mode, print the name of the directory being processed.
                Double the flag to turn it even more verbose, and to print the
                name of the file being processed.

  --fake, -f    Fakes it. Print the actual parameters compressdoc will use.

  dirs          A list of space-separated _absolute_ pathnames to the man
                directories.
                When empty, and only then, parse ${MAN_CONF}/man.conf for all
                occurrences of MANPATH.

Note about compression
  There has been a discussion on blfs-support about compression ratios of
  both gzip and bzip2 on man pages, taking into account the hosting fs,
  the architecture, etc... Overall, the conclusion was that gzip was more
  efficient on 'small' files, and bzip2 on 'big' files, small and big being
  very dependent on the content of the files.

  See the original post from Mickael A. Peters, titled "Bootable Utility CD",
  and dated 20030409.1816(+0200), and subsequent posts:
  http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html

  On my system (x86, ext3), man pages were 35564kiB before compression. gzip -9
  compressed them down to 20372kiB (57.28%), bzip2 -9 got down to 19812kiB
  (55.71%). That is a 1.57% gain in space. YMMV.

  What was not taken into consideration was the decompression speed. But does
  it make sense to? You gain fast access with uncompressed man pages, or you
  gain space at the expense of a slight overhead in time. Well, my P4-2.5GHz
  does not even let me notice this... :-)
EOT
}

# This function checks that the man page is unique amongst bzip2'd, gzip'd and
# uncompressed versions.
#  $1 the directory in which the file resides
#  $2 the file name for the man page
# Returns 0 (true) if the file is the latest and must be taken care of, and 1
# (false) if the file is not the latest (and has therefore been deleted).
function check_unique ()
{
  # NB. When there are hard-links to this file, these are
  # _not_ deleted. In fact, if there are hard-links, they
  # all have the same date/time, thus making them ready
  # for deletion later on.

  # Build the list of all man pages with the same name
  DIR=$1
  BASENAME=`basename "${2}" .bz2`
  BASENAME=`basename "${BASENAME}" .gz`
  LIST=
  [ -f "$DIR"/"${BASENAME}" -o -L "$DIR"/"${BASENAME}" ] &amp;&amp; LIST="${LIST} ${BASENAME}"
  [ -f "$DIR"/"${BASENAME}".gz -o -L "$DIR"/"${BASENAME}".gz ] &amp;&amp; LIST="${LIST} ${BASENAME}.gz"
  [ -f "$DIR"/"${BASENAME}".bz2 -o -L "$DIR"/"${BASENAME}".bz2 ] &amp;&amp; LIST="${LIST} ${BASENAME}.bz2"

  # Look for, and keep, the most recent one
  LATEST=`(cd "$DIR"; ls -1rt $LIST | tail -1)`
  for i in $LIST; do
    [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
  done

  # In case the specified file was the latest, return 0
  [ "$LATEST" = "$2" ] &amp;&amp; return 0
  # If the file was not the latest, return 1
  return 1
}

# OK, parse the command-line for arguments, and initialize to some sensible
# state, that is keep hardlinks, parse /etc/man.conf, be most silent, search
# man.conf in /etc, and don't force (re-)compression.
COMP_METHOD=
COMP_SUF=
COMP_LVL=
FORCE_COMP=no
LN_OPT=
MAN_DIR=
VERBOSE_LVL=0
BACKUP=no
FAKE=no
MAN_CONF=/etc
while [ -n "$1" ]; do
  case $1 in
    --gzip|--gz|-g)
      COMP_SUF=.gz
      COMP_METHOD=$1
      shift
      ;;
    --bzip2|--bz2|-b)
      COMP_SUF=.bz2
      COMP_METHOD=$1
      shift
      ;;
    --decompress|-d)
      COMP_SUF=
      COMP_LVL=
      COMP_METHOD=$1
      shift
      ;;
    -[1-9]|--fast|--best)
      COMP_LVL=$1
      shift
      ;;
    --force|-F)
      FORCE_COMP=yes
      shift
      ;;
    --soft|-s)
      LN_OPT=-s
      shift
      ;;
    --conf=*)
      MAN_CONF=`echo "$1" | cut -d '=' -f2-`
      shift
      ;;
    --conf)
      MAN_CONF="$2"
      shift 2
      ;;
    --verbose|-v)
      let VERBOSE_LVL++
      shift
      ;;
    --backup)
      BACKUP=yes
      shift
      ;;
    --fake|-f)
      FAKE=yes
      shift
      ;;
    --help|-h)
      help
      exit 0
      ;;
    /*)
      MAN_DIR="${MAN_DIR} ${1}"
      shift
      ;;
    -*)
      help $1
      exit 1
      ;;
    *)
      echo "\"$1\" is not an absolute path name"
      exit 1
      ;;
  esac
done

# Redirections
case $VERBOSE_LVL in
  0)
     # 0, be silent
     DEST_FD0=/dev/null
     DEST_FD1=/dev/null
     VERBOSE_OPT=
     ;;
  1)
     # 1, be a bit verbose
     DEST_FD0=/dev/stdout
     DEST_FD1=/dev/null
     VERBOSE_OPT=-v
     ;;
  *)
     # 2 and above, be most verbose
     DEST_FD0=/dev/stdout
     DEST_FD1=/dev/stdout
     VERBOSE_OPT="-v -v"
     ;;
esac

# Note: on my machine, 'man --path' gives /usr/share/man twice, once with a trailing '/', once without.
if [ -z "$MAN_DIR" ]; then
  MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
            | sed 's/:/\\n/g' \
            | while read foo; do dirname "$foo"/.; done \
            | sort -u \
            | while read bar; do echo -n "$bar "; done`
fi

# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
if [ -z "$MAN_DIR" ]; then
  echo "No directory specified, and no directory found with \`man --path'"
  exit 1
fi

# Fake?
if [ "$FAKE" != "no" ]; then
  echo "Actual parameters used:"
  echo -n "Compression.......: "
  case $COMP_METHOD in
    --bzip2|--bz2|-b) echo -n "bzip2";;
    --gzip|--gz|-g) echo -n "gzip";;
    --decompress|-d) echo -n "decompressing";;
    *) echo -n "unknown";;
  esac
  echo " ($COMP_METHOD)"
  echo "Compression level.: $COMP_LVL"
  echo "Compression suffix: $COMP_SUF"
  echo "Force compression.: $FORCE_COMP"
  echo "man.conf is.......: ${MAN_CONF}/man.conf ($MAN_CONF)"
  echo -n "Hard links........: "
  [ "$LN_OPT" = "-s" -o "$LN_OPT" = "--soft" ] &amp;&amp; echo -n "Convert to symlinks" || echo -n "Keep hardlinks"
  echo " ($LN_OPT)"
  echo "Backup............: $BACKUP"
  echo "Faking (yes!).....: $FAKE"
  echo "Directories.......: $MAN_DIR"
  echo "Verbosity level...: $VERBOSE_LVL ($VERBOSE_OPT)"
  exit 0
fi

# If no method was specified, print help
if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
  help
  exit 1
fi

# In backup mode, do the backup solely
if [ "$BACKUP" = "yes" ]; then
  for DIR in $MAN_DIR; do
    cd "${DIR}/.."
    DIR_NAME=`basename "${DIR}"`
    echo "Backing up $DIR..." &gt; $DEST_FD0
    [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
    [ -f "${DIR_NAME}.tar" ] &amp;&amp; mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
    tar cfv "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
  done
  exit 0
fi

# I know MAN_DIR has only absolute path names
# I need to take into account the localized man, so I'm going recursive
for DIR in $MAN_DIR; do
  cd "$DIR"
  for FILE in *; do
    # Fixes the case where the directory is empty
    if [ "foo$FILE" = "foo*" ]; then continue; fi

    # Fixes the case when hard-links see their compression scheme change
    # (from not compressed to compressed, or from bz2 to gz, or from gz to bz2)
    # Also fixes the case when multiple versions of the page are present, which
    # are either compressed or not.
    if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi

    if [ -d "$FILE" ]; then
      # We are going recursive to that directory
      echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
      # I need not pass --conf, as I specify the directory to work on
      # But I need exit in case of error
      # Pass --force down as well, otherwise it never reaches the sub-directories
      FORCE_OPT=
      if [ "${FORCE_COMP}" = "yes" ]; then FORCE_OPT=--force; fi
      "$0" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
      echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1

    else # !dir
      if ! check_unique "$DIR" "$FILE"; then continue; fi

      # Check if the file is already compressed with the specified method
      BASE_FILE=`basename \`basename "$FILE" .bz2\` .gz`
      if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" -a "${FORCE_COMP}" = "no" ]; then continue; fi

      # If we have a symlink
      if [ -h "$FILE" ]; then
        case $FILE in
          *.bz2)
            EXT=bz2 ;;
          *.gz)
            EXT=gz ;;
          *)
            EXT=none ;;
        esac

        if [ ! "$EXT" = "none" ]; then
          LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " " | sed s/\.$EXT$//`
          NEWNAME=`echo "$FILE" | sed s/\.$EXT$//`
          mv "$FILE" "$NEWNAME"
          FILE="$NEWNAME"
        else
          LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
        fi

        rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
        echo "Relinked $FILE" &gt; $DEST_FD1

      # else if we have a plain file
      elif [ -f "$FILE" ]; then
        # Take care of hard-links: build the list of files hard-linked
        # to the one we are {de,}compressing.
        # NB. This is not optimal, as the file will eventually be compressed
        # as many times as it has hard-links. But for now, that's the safe way.
        inode=`ls -li "$FILE" | awk '{print $1}'`
        HLINKS=`find . \! -name "$FILE" -inum $inode`

        if [ -n "$HLINKS" ]; then
          # We have hard-links! Remove them now.
          for i in $HLINKS; do rm -f "$i"; done
        fi

        # Now take care of the file that has no hard-link
        # We do decompress first to re-compress with the selected
        # compression ratio later on...
        case $FILE in
          *.bz2)
            bunzip2 "$FILE"
            FILE=`basename "$FILE" .bz2`
            ;;
          *.gz)
            gunzip "$FILE"
            FILE=`basename "$FILE" .gz`
            ;;
        esac

        # Compress the file with the requested compression level, if needed
        case $COMP_SUF in
          *bz2)
            bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
            ;;
          *gz)
            gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
            ;;
          *)
            echo "Uncompressed $FILE" &gt; $DEST_FD1
            ;;
        esac

        # If the file had hard-links, recreate those (either hard or soft)
        if [ -n "$HLINKS" ]; then
          for i in $HLINKS; do
            NEWFILE=`echo $i | sed s/\.gz$// | sed s/\.bz2$//`
            ln ${LN_OPT} "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
            chmod 644 "${NEWFILE}$COMP_SUF" # Really works only for hard-links. Harmless for soft-links
          done
        fi

      else
        # There is a problem when we get neither a symlink nor a plain file
        # Obviously, we shall never ever come here... :-(
        echo "Whaooo... \"${DIR}/${FILE}\" is neither a symlink nor a plain file. Please check:"
        ls -l "${DIR}/${FILE}"
        exit 1
      fi
    fi
  done # for FILE
done # for DIR
<command>EOF
chmod 755 /usr/bin/compressdoc</command></userinput></screen>

<para>Now, as root, you can issue a
<command>/usr/bin/compressdoc --bz2</command> to compress all your system man
pages. You can also run <command>/usr/bin/compressdoc --help</command> to get
comprehensive help on what the script is able to do.</para>

<para>Don't forget that a few programs, like the <application>X</application>
Window System and <application>XEmacs</application>, also install their
documentation in non-standard places (such as
<filename class="directory">/usr/X11R6/man</filename>). Add those locations to
the file <filename>/etc/man.conf</filename>, each on its own
<envar>MANPATH</envar> <replaceable>/path</replaceable> line.</para>
<para>Example:<screen><userinput>
    ...
    MANPATH /usr/share/man
    MANPATH /usr/local/man
    MANPATH /usr/X11R6/man
    MANPATH /opt/qt/doc/man
    ...</userinput></screen></para>
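
<para>When no directory is given on the command line, the script turns the
colon-separated output of <command>man --path</command> into a list of unique
directories. The following is a minimal sketch of that step on a sample string.
The directories are illustrative only, and <command>tr</command> is used here
in place of the script's <command>sed</command> call to split on colons.</para>

```shell
# Sample value in the colon-separated form printed by 'man --path';
# the directories are illustrative only.
manpath="/usr/share/man:/usr/local/man:/usr/share/man/"

# One directory per line, trailing slashes normalised, duplicates removed.
dirs=$(printf '%s\n' "$manpath" \
         | tr ':' '\n' \
         | while read -r d; do dirname "$d/."; done \
         | sort -u)
echo "$dirs"
```

<para>Appending <filename>/.</filename> before calling
<command>dirname</command> is what makes a path with a trailing slash and one
without collapse to the same entry, so each directory is processed only
once.</para>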

<para>Generally, package installation systems do not compress man/info pages,
which means you will need to run the script again after installing a package
if you want to keep the size of your documentation as small as possible. Also,
note that running the script after upgrading a package is safe: when you have
several versions of a page (for example, one compressed and one uncompressed),
the most recent one is kept and the others are deleted.</para>
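
<para>The <quote>most recent one is kept</quote> rule can be sketched in
isolation as follows. This is an illustration under assumed file names
(<filename>page.1</filename> and <filename>page.1.gz</filename> with dummy
contents and forced timestamps), using the same
<command>ls -1rt | tail</command> idiom as the script's
<function>check_unique</function> function.</para>

```shell
workdir=$(mktemp -d)

# An older uncompressed page and a newer compressed one (contents are dummies).
printf 'old\n' > "$workdir/page.1"
printf 'new\n' > "$workdir/page.1.gz"
touch -t 200301010000 "$workdir/page.1"
touch -t 200401010000 "$workdir/page.1.gz"

# 'ls -1rt' lists oldest first, so the last line is the newest file.
latest=$(cd "$workdir"; ls -1rt page.1 page.1.gz | tail -n 1)

# Delete every candidate that is not the newest.
for f in page.1 page.1.gz; do
  if [ "$f" != "$latest" ]; then rm -f "$workdir/$f"; fi
done

echo "kept: $latest"
remaining=$(ls "$workdir")
```

<para>Because the decision rests on modification times alone, a freshly
installed uncompressed page replaces a stale compressed copy on the next
run.</para>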

</sect1>
