source: postlfs/config/compressdoc.xml@ 3df86b66

Last change on this file since 3df86b66 was 3df86b66, checked in by Larry Lawrence <larry@…>, 21 years ago

incorporate option and parameter tags

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@1249 af4574ff-66df-0310-9fd7-8a98e5e911e0

<sect1 id="postlfs-config-compressdoc" xreflabel="compressdoc">
<?dbhtml filename="compressdoc.html" dir="postlfs"?>
<title>Compressing man and info pages</title>

<para>Man and info reader programs can transparently process gzip'ed or
bzip2'ed pages, a feature you can use to free some disk space while keeping
your documentation available. However, things are not that simple: man
directories tend to contain links - hard and symbolic - which defeat simple
ideas like recursively calling <command>gzip</command> on them. A better way
to go is to use the script below.
</para>
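
<para>To see why a plain recursive <command>gzip</command> is not enough, it
can help to count the links in a man tree first: <command>gzip</command>
leaves files with more than one hard link untouched unless forced, and
compressing the target of a symbolic link leaves the link dangling. The
following is only a minimal sketch and not part of the script; the function
name <function>count_links</function> and the demo page names are made up
for illustration.</para>

<screen>
```shell
# Count the directory entries a naive "gzip -r" would mishandle.
# Prints "HARD SOFT": names of regular files that have more than one
# hard link, then symbolic links. count_links is a hypothetical helper.
count_links() {
    hard=$(find "$1" -type f -links +1 | wc -l)
    soft=$(find "$1" -type l | wc -l)
    # Arithmetic expansion strips the padding some wc versions add
    echo "$((hard)) $((soft))"
}

# Demo on a throwaway tree; foo.1, bar.1 and baz.1 are made-up names
demo=$(mktemp -d)
mkdir "$demo/man1"
touch "$demo/man1/foo.1"
ln "$demo/man1/foo.1" "$demo/man1/bar.1"   # hard link
ln -s foo.1 "$demo/man1/baz.1"             # symbolic link
count_links "$demo"                        # prints "2 1"
```
</screen>

<para>Pointing <function>count_links</function> at a real directory such as
<filename class="directory">/usr/share/man</filename> gives a rough idea of
how many pages need the special handling described above.</para>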

<screen><userinput><command>cat &gt; /usr/bin/compressdoc &lt;&lt; "EOF"</command>
#!/bin/bash
#
# Compress (with bzip2 or gzip) all man pages in a hierarchy and
# update symlinks - By Marc Heerdink &lt;marc@koelkast.net&gt;.
# Modified to be able to gzip or bzip2 files as an option and to deal
# with all symlinks properly by Mark Hymers &lt;markh@linuxfromscratch.org&gt;
#
# Modified 20030925 by Yann E. Morin &lt;yann.morin.1998@anciens.enib.fr&gt;
# to accept compression/decompression, to correctly handle hard-links,
# to allow for changing hard-links into soft-links, to specify the
# compression level, to parse the man.conf for all occurrences of MANPATH,
# to allow for a backup, to allow to keep the newest version of a page.
#
# TODO:
#     - invert the quiet option into a verbose one, so as to be silent
#       by default;
#     - choose a default compress method to be based on the available
#       tool : gzip or bzip2;
#     - when a MANPATH env var exists, use this instead of /etc/man.conf
#       (useful for users to (de)compress their man pages);
#     - offer an option to restore a previous backup;
#     - add other compression engines (compress, zip, etc?). Needed?

# Funnily enough, this function prints some help.
function help ()
{
  if [ -n "$1" ]; then
    echo "Unknown option : $1"
  fi
  echo "Usage: $0 &lt;comp_method&gt; [options] [dirs]"
  cat &lt;&lt; EOT
Where comp_method is one of :

  --gzip, --gz, -g
  --bzip2, --bz2, -b
                Compress using gzip or bzip2.

  --decompress, -d
                Decompress the man pages.

  --backup      Make a .tar backup of every directory.
                If a backup already exists, it is saved as .tar.old before
                the new backup is made. If a .tar.old backup exists, it is
                removed first.
                In backup mode, no other action is performed.

And where options are :

  -1 to -9, --fast, --best
                The compression level, as accepted by gzip and bzip2. When not
                specified, uses the default compression level for the given
                method (-6 for gzip, and -9 for bzip2). Not used when in backup
                or decompress modes.

  --soft, -s    Change hard-links into soft-links. Use with _caution_ as the
                first encountered file will be used as a reference. Not used
                when in backup mode.

  --conf=dir, --conf dir
                Specify the location of man.conf. Defaults to /etc.

  --quiet, -q   Quiet mode, only print the name of the directory being
                processed. Add another -q flag to turn it absolutely silent.

  --fake, -f    Fakes it. Print the actual parameters compressdoc will use.

  dirs          A list of space-separated _absolute_ pathnames to the man
                directories.
                When empty, and only then, parse ${MAN_CONF}/man.conf for all
                occurrences of MANPATH.

Note about compression
  There has been a discussion on blfs-support about compression ratios of
  both gzip and bzip2 on man pages, taking into account the hosting fs,
  the architecture, etc... Overall, the conclusion was that gzip was more
  efficient on 'small' files, and bzip2 on 'big' files, small and big
  being very dependent on the content of the files.

  See the original thread beginning at :
http://archive.linuxfromscratch.org/mail-archives/blfs-support/2003/04/0424.html

  On my system (x86, ext3), man pages were 35564kiB before compression. gzip -9
  compressed them down to 20372kiB (57.28%), bzip2 -9 got down to 19812kiB
  (55.71%). That is a gain in space of 1.57 percentage points. YMMV.

  What was not taken into consideration was the decompression speed. But does
  it make sense to? You gain fast access with uncompressed man pages, or you
  gain space at the expense of a slight overhead in time. Well, my P4-2.5GHz
  does not even let me notice this... :-)
EOT
}

# This function checks that the path is absolute
# $1 : the path to check
# $2 : path to man.conf if $1 was extracted from it
function check_path ()
{
  echo checking path $1
  if [ -n "`echo $1 | cut -d '/' -f1`" ]; then
    echo "Path \"$1\" is not absolute."
    [ -n "$2" ] &amp;&amp; echo "Check your $2"
    exit 1
  fi
}

# This function checks that the man page is unique amongst bzip2'd, gzip'd and
# the uncompressed versions, and keeps only the most recent one.
# $1 the directory in which the file resides
# $2 the file name for the man page
# Returns 0 if $2 was removed because a newer version exists (the caller
# then skips it), 1 if $2 is still there and has to be processed.
function check_unique ()
{
  # NB. When there are hardlinks to this file, these are
  # _not_ deleted. In fact, if there are hardlinks, they
  # all have the same date/time, thus making them ready
  # for deletion later on.

  # Build the list of all man pages with the same name
  BASENAME=`basename "${2}" .bz2`
  BASENAME=`basename "${BASENAME}" .gz`
  LIST=
  [ -f "$1"/"${BASENAME}" ] &amp;&amp; LIST="${LIST} ${BASENAME}"
  [ -f "$1"/"${BASENAME}".gz ] &amp;&amp; LIST="${LIST} ${BASENAME}.gz"
  [ -f "$1"/"${BASENAME}".bz2 ] &amp;&amp; LIST="${LIST} ${BASENAME}.bz2"

  # Look for, and keep, the most recent one (ls -rt lists the newest last)
  LATEST=
  [ -n "$LIST" ] &amp;&amp; LATEST=`(cd "$1"; ls -1rt $LIST | tail -n 1)`
  for i in $LIST; do
    [ "$LATEST" != "$i" ] &amp;&amp; rm -f "$1"/"$i"
  done

  # If the specified file is still there, return 1 so that it gets processed
  [ -e "$1"/"$2" -o -h "$1"/"$2" ] &amp;&amp; return 1
  # The file was removed in favour of a newer version, return 0
  return 0
}

# OK, parse the command line for arguments, and initialize to some sensible
# state, that is keep hardlinks, parse /etc/man.conf, be most verbose, and
# search man.conf in /etc
COMP_METHOD=
COMP_SUF=
COMP_LVL=
LN_OPT=
MAN_DIR=
QUIET_OPT=
QUIET_LVL=0
BACKUP=no
FAKE=no
MAN_CONF=/etc
while [ -n "$1" ]; do
  case $1 in
    --gzip|--gz|-g)
      COMP_SUF=.gz
      COMP_METHOD=$1
      shift
      ;;
    --bzip2|--bz2|-b)
      COMP_SUF=.bz2
      COMP_METHOD=$1
      shift
      ;;
    --decompress|-d)
      COMP_SUF=
      COMP_LVL=
      COMP_METHOD=$1
      shift
      ;;
    -[1-9]|--fast|--best)
      COMP_LVL=$1
      shift
      ;;
    --soft|-s)
      LN_OPT=-s
      shift
      ;;
    --conf=*)
      MAN_CONF=`echo $1 | cut -d '=' -f2-`
      shift
      ;;
    --conf)
      MAN_CONF="$2"
      shift 2
      ;;
    --quiet|-q)
      let QUIET_LVL++
      QUIET_OPT="$QUIET_OPT -q"
      shift
      ;;
    --backup)
      BACKUP=yes
      shift
      ;;
    --fake|-f)
      FAKE=yes
      shift
      ;;
    --help|-h)
      help
      exit 0
      ;;
    /*)
      MAN_DIR="${MAN_DIR} ${1}"
      shift
      ;;
    -*)
      help $1
      exit 1
      ;;
    *)
      check_path $1
      # We shall never return in that case! None the less, do exit
      exit 1
      ;;
  esac
done

# Redirections
case $QUIET_LVL in
  0)
    DEST_FD0=/dev/stdout
    DEST_FD1=/dev/stdout
    ;;
  1)
    DEST_FD0=/dev/stdout
    DEST_FD1=/dev/null
    ;;
  *)
    # 2 and above, be silent
    DEST_FD0=/dev/null
    DEST_FD1=/dev/null
    ;;
esac

# Note: on my machine, 'man --path' gives /usr/share/man twice, once with a trailing '/', once without.
if [ -z "$MAN_DIR" ]; then
  MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
            | sed 's/:/\\n/g' \
            | while read foo; do dirname "$foo"/.; done \
            | sort -u \
            | while read bar; do echo -n "$bar "; done`
fi

# If no MANPATH in ${MAN_CONF}/man.conf, abort as well
if [ -z "$MAN_DIR" ]; then
  echo "No directory specified, and no directory found in \"${MAN_CONF}/man.conf\""
  exit 1
fi

# Fake?
if [ "$FAKE" != "no" ]; then
  echo "Actual parameters used:"
  echo -n "Compression.......: "
  case $COMP_METHOD in
    --bzip2|--bz2|-b) echo -n "bzip2";;
    --gzip|--gz|-g) echo -n "gzip";;
    --decompress|-d) echo -n "decompressing";;
    *) echo -n "unknown";;
  esac
  echo " ($COMP_METHOD)"
  echo "Compression level.: $COMP_LVL"
  echo "Compression suffix: $COMP_SUF"
  echo "man.conf is.......: ${MAN_CONF}/man.conf ($MAN_CONF)"
  echo -n "Hard links........: "
  [ "$LN_OPT" = "-s" -o "$LN_OPT" = "--soft" ] &amp;&amp; echo -n "Convert to symlinks" || echo -n "Keep hardlinks"
  echo " ($LN_OPT)"
  echo "Backup............: $BACKUP"
  echo "Faking (yes!).....: $FAKE"
  echo "Directories.......: $MAN_DIR"
  echo "Silence level.....: $QUIET_LVL ($QUIET_OPT)"
  exit 0
fi

# If no method was specified, print help
if [ -z "${COMP_METHOD}" -a "${BACKUP}" = "no" ]; then
  help
  exit 1
fi

# In backup mode, do the backup only
if [ "$BACKUP" = "yes" ]; then
  for DIR in $MAN_DIR; do
    cd "${DIR}/.."
    DIR_NAME=`basename "${DIR}"`
    echo "Backing up $DIR..." &gt; $DEST_FD0
    [ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
    [ -f "${DIR_NAME}.tar" ] &amp;&amp; mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
    tar cfv "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
  done
  exit 0
fi

# I know MAN_DIR has only absolute path names
# I need to take into account the localized man, so I'm going recursive
for DIR in $MAN_DIR; do
  cd "$DIR"
  for FILE in *; do
    if [ "foo$FILE" = "foo*" ]; then continue; fi
    if [ -d "$FILE" ]; then
      # We are going recursive to that directory
      echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
      # I need not pass --conf, as I specify the directory to work on
      # But I need exit in case of error
      "$0" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${QUIET_OPT} "${DIR}/${FILE}" || exit 1
      echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
    else # !dir
      if check_unique "$DIR" "$FILE"; then continue; fi

      # If we have a symlink
      if [ -h "$FILE" ]; then
        case $FILE in
          *.bz2)
            EXT=bz2 ;;
          *.gz)
            EXT=gz ;;
          *)
            EXT=none ;;
        esac

        # Read the link target, and strip the old suffix from both names
        LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " "`
        if [ "$EXT" != "none" ]; then
          LINK="${LINK%.$EXT}"
          NEWNAME="${FILE%.$EXT}"
          mv "$FILE" "$NEWNAME"
          FILE="$NEWNAME"
        fi

        rm -f "$FILE" &amp;&amp; ln -s "${LINK}$COMP_SUF" "${FILE}$COMP_SUF"
        echo "Relinked $FILE" &gt; $DEST_FD1

      # else if we have a plain file
      elif [ -f "$FILE" ]; then
        # Take care of hard-links: build the list of files hard-linked
        # to the one we are {de,}compressing.
        # NB. This is not optimum, as the file will eventually be compressed
        # as many times as it has hard-links. But for now, that's the safe way.
        inode=`ls -li "$FILE" | awk '{print $1}'`
        HLINKS=`find . \! -name "$FILE" -inum $inode`

        if [ -n "$HLINKS" ]; then
          # We have hard-links! Remove them now.
          for i in $HLINKS; do rm -f "$i"; done
        fi

        # Now take care of the file that has no hard-link
        # We do decompress first to recompress with the selected
        # compression ratio later on...
        case $FILE in
          *.bz2)
            bunzip2 "$FILE"
            FILE="${FILE%.bz2}"
            ;;
          *.gz)
            gunzip "$FILE"
            FILE="${FILE%.gz}"
            ;;
        esac

        # Compress the file with the selected compression level, if needed
        case $COMP_SUF in
          *bz2)
            bzip2 ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
            ;;
          *gz)
            gzip ${COMP_LVL} "$FILE" &amp;&amp; chmod 644 "${FILE}${COMP_SUF}"
            echo "Compressed $FILE" &gt; $DEST_FD1
            ;;
          *)
            echo "Uncompressed $FILE" &gt; $DEST_FD1
            ;;
        esac

        # If the file had hard-links, recreate those (either hard or soft)
        if [ -n "$HLINKS" ]; then
          for i in $HLINKS; do
            NEWFILE="${i%.gz}"
            NEWFILE="${NEWFILE%.bz2}"
            ln ${LN_OPT} "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
            chmod 644 "${NEWFILE}$COMP_SUF" # Really works only for hard-links. Harmless for soft-links
          done
        fi

      else
        # There is a problem when we get neither a symlink nor a plain file
        # Obviously, we shall never ever come here... :-(
        echo "Whaooo... \"${DIR}/${FILE}\" is neither a symlink nor a plain file. Please check:"
        ls -l "${DIR}/${FILE}"
        exit 1
      fi
    fi
  done # for FILE
done # for DIR

<command>EOF
chmod 755 /usr/bin/compressdoc</command></userinput></screen>

<para>Now, as root, you can run
<command>/usr/bin/compressdoc --bz2</command> to compress all your system man
pages. You can also run <command>/usr/bin/compressdoc --help</command> to get
comprehensive help on what the script is able to do.</para>
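
<para>A cautious first run might preview the parameters, take a backup, and
only then compress. The sketch below chains the
<option>--fake</option>, <option>--backup</option> and
<option>--bz2</option> options listed in the script's own help text; the
wrapper name <function>compress_man_pages</function> and its optional
argument are assumptions made for this illustration.</para>

<screen>
```shell
# Hypothetical wrapper around compressdoc: preview, back up, compress.
# $1 is the command to run; it defaults to /usr/bin/compressdoc.
compress_man_pages() {
    tool=${1:-/usr/bin/compressdoc}
    # --fake only prints the parameters that would be used
    "$tool" --bz2 -9 --fake || return 1
    # --backup writes a .tar archive next to each man directory
    "$tool" --backup || return 1
    # The real run: bzip2 at the highest compression level
    "$tool" --bz2 -9
}
```
</screen>

<para>Once the function is defined in a root shell, calling
<command>compress_man_pages</command> without arguments runs the three steps
against the installed script and stops at the first one that fails.</para>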

<para>Don't forget that a few programs, such as the
<application>X</application> Window System and
<application>XEmacs</application>, install their documentation in
nonstandard places (such as
<filename class="directory">/usr/X11R6/man</filename>). Add those locations
to the file <filename>/etc/man.conf</filename> as
<envar>MANPATH</envar> <replaceable>/path</replaceable> lines.</para>
<para>Example:<screen><userinput>
    ...
    MANPATH /usr/share/man
    MANPATH /usr/local/man
    MANPATH /usr/X11R6/man
    MANPATH /opt/qt/doc/man
    ...</userinput></screen></para>

<para>Generally, package installation systems do not compress man/info pages,
which means you will need to run the script again if you want to keep the size
of your documentation as small as possible. Also, note that running the script
after upgrading a package is safe: when you have several versions of a page
(for example, one compressed and one uncompressed), the most recent one is kept
and the others deleted.</para>
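
<para>The "most recent one is kept" rule can be illustrated on its own. The
sketch below only reports which variant of a page is the newest and deletes
nothing; it uses <command>ls -t</command> with <command>head</command>
rather than the script's exact commands, and the function name
<function>newest_variant</function> is hypothetical.</para>

<screen>
```shell
# Print the most recently modified variant of a page, if any exists.
# $1: directory, $2: page name without a compression suffix.
# newest_variant is a hypothetical name used only for this sketch.
newest_variant() (
    cd "$1" || exit 1
    base=$2
    set --
    for f in "$base" "$base.gz" "$base.bz2"; do
        if [ -f "$f" ]; then set -- "$@" "$f"; fi
    done
    # ls -t sorts by modification time, newest first
    if [ "$#" -gt 0 ]; then ls -t "$@" | head -n 1; fi
)
```
</screen>

<para>For example, if an upgrade installed a fresh uncompressed page next to
an older compressed copy of the same page, the function prints the name of
the uncompressed one.</para>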

</sect1>