Changeset 7f89db8
- Timestamp: 10/25/2008 09:31:14 PM
- Branches: 10.0, 10.0-rc1, 10.1, 10.1-rc1, 11.0, 11.0-rc1, 11.0-rc2, 11.0-rc3, 11.1, 11.1-rc1, 11.2, 11.2-rc1, 11.3, 11.3-rc1, 12.0, 12.0-rc1, 12.1, 12.1-rc1, 6.4, 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.5-systemd, 7.6, 7.6-systemd, 7.7, 7.7-systemd, 7.8, 7.8-systemd, 7.9, 7.9-systemd, 8.0, 8.1, 8.2, 8.3, 8.4, 9.0, 9.1, arm, bdubbs/gcc13, ml-11.0, multilib, renodr/libudev-from-systemd, s6-init, trunk, xry111/arm64, xry111/arm64-12.0, xry111/clfs-ng, xry111/lfs-next, xry111/loongarch, xry111/loongarch-12.0, xry111/loongarch-12.1, xry111/mips64el, xry111/pip3, xry111/rust-wip-20221008, xry111/update-glibc
- Children: bc81164
- Parents: 7d6c9a6
- Files: 2 edited
chapter01/changelog.xml
r7d6c9a6 r7f89db8
@@ -38,4 +38,15 @@
 -->
 <listitem>
+<para>2008-10-25</para>
+<itemizedlist>
+<listitem>
+<para>[dj] - Updated the text on the Man-DB page to accout for recent
+changes in Man-DB. Thanks to Alexander Patrakov for providing most
+of the included text, explanations, and examples.</para>
+</listitem>
+</itemizedlist>
+</listitem>
+
+<listitem>
 <para>2008-10-23</para>
 <itemizedlist>
chapter06/man-db.xml
r7d6c9a6 r7f89db8 112 112 <screen><userinput remap="install">make install</userinput></screen> 113 113 114 </sect2> 115 116 <sect2> 117 <title>Non-English Manual Pages in LFS</title> 118 <!-- 114 119 <para>Some packages provide UTF-8 manual pages, which previous versions of 115 <application>Man-DB</application> were unable to display. This limitation 116 has been fixed in recent versions, and <application>Man-DB</application> 117 can now convert manual pages from legacy encodings to UTF-8 118 (and vice-versa) on the fly. This used to be a rather annoying 119 problem across different distributions, as packages written for one 120 distribution would require changes to work on another. The following 121 script will allow you to convert manual pages to and from legacy and UTF-8 122 encodings.</para> 120 <application>Man-DB</application> were unable to display correctly because 121 the expected (8-bit) encoding for each language was hard-coded in the 122 source of <application>Man-DB</application>. 123 <application>Man-DB</application> now uses the extension of the directory 124 name in order to determine the encoding of the manual pages stored within. 125 If no extension exists, <application>Man-DB</application> uses a built-in 126 table (see below) to determine the encoding. E.g., because of "UTF-8" in 127 the directory name, it knows that all manual pages residing in 128 <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8 129 encoded and, according to the built-in table, expects all manual pages 130 residing in <filename class="directory">/usr/share/man/ru</filename> to 131 be encoded using KOI8-R.</para> 132 133 <para>Linux distributions have different policies concerning the character 134 encoding in which manual pages are stored in the filesystem. E.g., RedHat 135 stores all manual pages in UTF-8, while Debian previously used 136 language-specific (mostly 8-bit) encodings. Many other distributions simply 137 ignore the problem all together. 
LFS also used the legacy encodings in 138 previuos versions of the book. This was chosen because of the ease of 139 configuration associated with <application>Man-DB</application>. 140 Additionally, <application>Man-DB</application> provided support for 141 Chinese and Japanese locales, and limited support for Korean, whereas 142 <application>Man</application> did not at that time.</para> 143 144 <para>In contrast, the setup in Fedora Core expects all manual pages 145 to be UTF-8 encoded, and stored in directories without suffixes. 146 Disagreement about the expected encoding of manual pages amongst 147 distribution vendors, has led to confusion for upstream package maintainers. 148 Some packages contain, UTF-8 manual pages, while others ship with manual 149 pages in legacy encodings. Unlike the 150 <application>Man</application>/<application>Groff</application> setup in 151 Fedora Core, <application>Man-DB</application> can make very good decisions 152 about the on disk encoding and present the information to the user in their 153 prefered format, without complex configurations.</para> 154 155 <para><application>Man-DB</application> has, for the most part, made this 156 problem completely transparent to end users, as long as the manual pages 157 are installed into the correct directory. There may be times, however, 158 where one encoding is preferred over the other. For this purpose, the 159 <command>convert-mans</command> script was written. It will convert manual 160 pages to another encoding before (or after) installation. Install the 161 <command>convert-mans</command> script with the following 162 instructions:</para> 163 --> 164 <para>Some packages provide non-English manual pages. They are displayed 165 correctly only if their location and encoding matches the expectation of 166 the "man" program. 
However, different Linux distributions have different 167 policies (expressed in the choice of the <command>man</command> program, 168 its configuration and patches applied to it) concerning the character 169 encoding in which manual pages are stored in the filesystem.</para> 170 171 <para>E.g., Debian previously required Russian manual pages to be encoded 172 in KOI8-R and to be placed in 173 <filename class="directory">/usr/share/man/ru</filename>. Now, in addition, 174 their <command>man</command> program (<application>Man-DB</application>) 175 searches for UTF-8 encoded Russian manual pages in 176 <filename class="directory">/usr/share/man/ru.UTF-8</filename>. On the 177 other hand, Fedora uses UTF-8 encoded manual pages exclusively. Russian 178 manual pages are found in 179 <filename class="directory">/usr/share/man/ru</filename> and their 180 <command>man</command> program doesn't acknowledge 181 <filename class="directory">/usr/share/man/ru.UTF-8</filename>. Many 182 other distributions ignore the on disk encodings completely, leaving the 183 end user with a mix of improperly encoded manual pages for their 184 configuration. When <command>man</command> processes the requtested page, 185 it will display the contents as configured, resulting in completely 186 unreadable text if the on disk encoding is not what is expected for that 187 configuration.</para> 188 189 <para>Disagreement about the expected encoding of manual pages amongst 190 distribution vendors, has led to confusion for upstream package 191 maintainers. One package may contain UTF-8 manual pages, while another 192 ships with manual pages in legacy encodings. <command>man</command> 193 searches for manual pages based on the user's locale settings. 194 <application>Man-DB</application> uses a built-in table (see below) to 195 determine the on disk encoding of manual pages found for a user's 196 locale, only if the directories found do not have an extension that 197 describes the encoding. 
E.g., because of ".UTF-8" in the directory name, 198 <application>Man-DB</application> knows that all manual pages residing in 199 <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8 200 encoded and, according to the built-in table, expects all manual pages 201 residing in <filename class="directory">/usr/share/man/ru</filename> to 202 be encoded using KOI8-R.</para> 203 204 <!-- Origin: man-db-2.5.2/src/encodings.c --> 205 <table> 206 <title>Expected character encoding of legacy 8-bit manual pages</title> 207 <?dbfo table-width="2.5in" ?> 208 209 <tgroup cols="2"> 210 211 <colspec colnum="1" colwidth="1.5in"/> 212 <colspec colnum="2" colwidth="1in"/> 213 214 <thead> 215 <row> 216 <entry>Language (code)</entry> 217 <entry>Encoding</entry> 218 </row> 219 </thead> 220 221 <tbody> 222 <row> 223 <entry>Danish (da)</entry> 224 <entry>ISO-8859-1</entry> 225 </row> 226 <row> 227 <entry>German (de)</entry> 228 <entry>ISO-8859-1</entry> 229 </row> 230 <row> 231 <entry>English (en)</entry> 232 <entry>ISO-8859-1</entry> 233 </row> 234 <row> 235 <entry>Spanish (es)</entry> 236 <entry>ISO-8859-1</entry> 237 </row> 238 <row> 239 <entry>Finnish (fi)</entry> 240 <entry>ISO-8859-1</entry> 241 </row> 242 <row> 243 <entry>French (fr)</entry> 244 <entry>ISO-8859-1</entry> 245 </row> 246 <row> 247 <entry>Irish (ga)</entry> 248 <entry>ISO-8859-1</entry> 249 </row> 250 <row> 251 <entry>Galician (gl)</entry> 252 <entry>ISO-8859-1</entry> 253 </row> 254 <row> 255 <entry>Indonesian (id)</entry> 256 <entry>ISO-8859-1</entry> 257 </row> 258 <row> 259 <entry>Icelandic (is)</entry> 260 <entry>ISO-8859-1</entry> 261 </row> 262 <row> 263 <entry>Italian (it)</entry> 264 <entry>ISO-8859-1</entry> 265 </row> 266 <row> 267 <entry>Dutch (nl)</entry> 268 <entry>ISO-8859-1</entry> 269 </row> 270 <!-- FIXME: BUG: "no" is deprecated, should use "nb" or "nn" and 271 symlinks --> 272 <row> 273 <entry>Norwegian (no)</entry> 274 <entry>ISO-8859-1</entry> 275 </row> 276 <!-- END BUG --> 
277 <row> 278 <entry>Portuguese (pt)</entry> 279 <entry>ISO-8859-1</entry> 280 </row> 281 <row> 282 <entry>Swedish (sv)</entry> 283 <entry>ISO-8859-1</entry> 284 </row> 285 <!-- Languages below require patched groff --> 286 <row> 287 <entry>Bulgarian (bg)</entry> 288 <entry>CP1251</entry> 289 </row> 290 <row> 291 <entry>Czech (cs)</entry> 292 <entry>ISO-8859-2</entry> 293 </row> 294 <row> 295 <entry>Croatian (hr)</entry> 296 <entry>ISO-8859-2</entry> 297 </row> 298 <row> 299 <entry>Hungarian (hu)</entry> 300 <entry>ISO-8859-2</entry> 301 </row> 302 <row> 303 <entry>Japanese (ja)</entry> 304 <entry>EUC-JP</entry> 305 </row> 306 <row> 307 <entry>Korean (ko)</entry> 308 <entry>EUC-KR</entry> 309 </row> 310 <row> 311 <entry>Polish (pl)</entry> 312 <entry>ISO-8859-2</entry> 313 </row> 314 <row> 315 <entry>Russian (ru)</entry> 316 <entry>KOI8-R</entry> 317 </row> 318 <row> 319 <entry>Slovak (sk)</entry> 320 <entry>ISO-8859-2</entry> 321 </row> 322 <row> 323 <entry>Serbian (sr)</entry> 324 <entry>ISO-8859-5</entry> 325 </row> 326 <row> 327 <entry>Turkish (tr)</entry> 328 <entry>ISO-8859-9</entry> 329 </row> 330 <row> 331 <entry>Simplified Chinese (zh_CN)</entry> 332 <entry>GBK</entry> 333 </row> 334 <row> 335 <entry>Simplified Chinese, Singapore (zh_SG)</entry> 336 <entry>GBK</entry> 337 </row> 338 <row> 339 <entry>Traditional Chinese (zh_TW)</entry> 340 <entry>BIG5</entry> 341 </row> 342 <row> 343 <entry>Traditional Chinese, Hong Kong (zh_HK)</entry> 344 <entry>BIG5HKSCS</entry> 345 </row> 346 </tbody> 347 348 </tgroup> 349 350 </table> 351 352 <note> 353 <para>Manual pages in languages not in the list are not supported. 354 Norwegian does not work because of the transition from no_NO to 355 nb_NO locale, and will be fixed in the next release of 356 <application>Man-DB</application>. 
Korean is currently non functional 357 because of incomplete fixes in the Debian 358 <application>Groff</application> patch applied in LFS.</para> 359 </note> 360 361 <para>Packages may install manual pages into an improperly named directory, 362 depending on which distributions the author develops the package for. To 363 assist in the conversion of the manual pages to the proper encoding for the 364 directory in which they are installed, the <command>convert-mans</command> 365 script was written. It will convert manual pages to another encoding before 366 (or after) installation. Install the <command>convert-mans</command> 367 script with the following instructions:</para> 123 368 124 369 <screen><userinput remap="install">cat >> convert-mans << "EOF" … … 137 382 install -m755 convert-mans /usr/bin</userinput></screen> 138 383 139 <para>Additional information regarding the compression of 140 man and info pages can be found in the BLFS book at 141 <ulink url="&blfs-root;view/cvs/postlfs/compressdoc.html"/>.</para> 142 143 </sect2> 144 145 <sect2> 146 <title>Non-English Manual Pages in LFS</title> 147 148 <para>Linux distributions have different policies concerning the character 149 encoding in which manual pages are stored in the filesystem. E.g., RedHat 150 stores all manual pages in UTF-8, while Debian previously used 151 language-specific (mostly 8-bit) encodings. As mentioned above, this leads 152 to incompatibility of packages with manual pages designed for different 153 distributions.</para> 154 155 <para>LFS previously used the same convention as Debian. This was chosen 156 because <application>Man-DB</application> did not understand manual pages 157 stored in UTF-8 at the time of its introduction into LFS. For our purposes 158 at that time, <application>Man-DB</application> was preferable to 159 <application>Man</application> as it worked without any additional 160 configuration in any locale. 
This is still true today as 161 <application>Man-DB</application> with Debian patched 162 <application>Groff</application> will now dynamically convert UTF-8 encoded 163 manual pages to the user's locale. Additionally, this combination provides 164 support for Chinese and Japanese locales, and limited support for Korean, 165 whereas <application>Man</application> does not. The current offering of 166 <application>Man</application> as used in RedHat requires major 167 modifications to both the <application>Man</application> and 168 <application>Groff</application> packages, and still falls short on 169 Chinese, Japanese, and Korean encodings.</para> 170 171 <para>Finally, most distributions, including Debian, are rapidly migrating 172 to all UTF-8 encoded manual pages. Upstream packagers will very likely drop 173 legacy encodings in favor of UTF-8, though adoption has been slow due to 174 the hacks required to make the current <application>Man</application> and 175 <application>Groff</application> packages work correctly together.</para> 176 177 <para>The relationship between language codes and the expected encoding 178 of legacy manual pages is listed below.</para> 179 180 <!-- Origin: man-db-2.5.2/src/encodings.c --> 181 <table> 182 <title>Expected character encoding of legacy 8-bit manual pages</title> 183 <?dbfo table-width="2.5in" ?> 184 185 <tgroup cols="2"> 186 187 <colspec colnum="1" colwidth="1.5in"/> 188 <colspec colnum="2" colwidth="1in"/> 189 190 <thead> 191 <row> 192 <entry>Language (code)</entry> 193 <entry>Encoding</entry> 194 </row> 195 </thead> 196 197 <tbody> 198 <row> 199 <entry>Danish (da)</entry> 200 <entry>ISO-8859-1</entry> 201 </row> 202 <row> 203 <entry>German (de)</entry> 204 <entry>ISO-8859-1</entry> 205 </row> 206 <row> 207 <entry>English (en)</entry> 208 <entry>ISO-8859-1</entry> 209 </row> 210 <row> 211 <entry>Spanish (es)</entry> 212 <entry>ISO-8859-1</entry> 213 </row> 214 <row> 215 <entry>Finnish (fi)</entry> 216 
<entry>ISO-8859-1</entry> 217 </row> 218 <row> 219 <entry>French (fr)</entry> 220 <entry>ISO-8859-1</entry> 221 </row> 222 <row> 223 <entry>Irish (ga)</entry> 224 <entry>ISO-8859-1</entry> 225 </row> 226 <row> 227 <entry>Galician (gl)</entry> 228 <entry>ISO-8859-1</entry> 229 </row> 230 <row> 231 <entry>Indonesian (id)</entry> 232 <entry>ISO-8859-1</entry> 233 </row> 234 <row> 235 <entry>Icelandic (is)</entry> 236 <entry>ISO-8859-1</entry> 237 </row> 238 <row> 239 <entry>Italian (it)</entry> 240 <entry>ISO-8859-1</entry> 241 </row> 242 <row> 243 <entry>Dutch (nl)</entry> 244 <entry>ISO-8859-1</entry> 245 </row> 246 <!-- FIXME: BUG: "no" is deprecated, should use "nb" or "nn" and 247 symlinks --> 248 <row> 249 <entry>Norwegian (no)</entry> 250 <entry>ISO-8859-1</entry> 251 </row> 252 <!-- END BUG --> 253 <row> 254 <entry>Portuguese (pt)</entry> 255 <entry>ISO-8859-1</entry> 256 </row> 257 <row> 258 <entry>Swedish (sv)</entry> 259 <entry>ISO-8859-1</entry> 260 </row> 261 <!-- Languages below require patched groff --> 262 <row> 263 <entry>Bulgarian (bg)</entry> 264 <entry>CP1251</entry> 265 </row> 266 <row> 267 <entry>Czech (cs)</entry> 268 <entry>ISO-8859-2</entry> 269 </row> 270 <row> 271 <entry>Croatian (hr)</entry> 272 <entry>ISO-8859-2</entry> 273 </row> 274 <row> 275 <entry>Hungarian (hu)</entry> 276 <entry>ISO-8859-2</entry> 277 </row> 278 <row> 279 <entry>Japanese (ja)</entry> 280 <entry>EUC-JP</entry> 281 </row> 282 <row> 283 <entry>Korean (ko)</entry> 284 <entry>EUC-KR</entry> 285 </row> 286 <row> 287 <entry>Polish (pl)</entry> 288 <entry>ISO-8859-2</entry> 289 </row> 290 <row> 291 <entry>Russian (ru)</entry> 292 <entry>KOI8-R</entry> 293 </row> 294 <row> 295 <entry>Slovak (sk)</entry> 296 <entry>ISO-8859-2</entry> 297 </row> 298 <row> 299 <entry>Serbian (sr)</entry> 300 <entry>ISO-8859-5</entry> 301 </row> 302 <row> 303 <entry>Turkish (tr)</entry> 304 <entry>ISO-8859-9</entry> 305 </row> 306 <row> 307 <entry>Simplified Chinese (zh_CN)</entry> 308 
<entry>GBK</entry> 309 </row> 310 <row> 311 <entry>Simplified Chinese,Singapore} (zh_SG)</entry> 312 <entry>GBK</entry> 313 </row> 314 <row> 315 <entry>Traditional Chinese (zh_TW)</entry> 316 <entry>BIG5</entry> 317 </row> 318 <row> 319 <entry>Traditional Chinese, Hong Kong (zh_HK)</entry> 320 <entry>BIG5HKSCS</entry> 321 </row> 322 </tbody> 323 324 </tgroup> 325 326 </table> 327 328 <note> 329 <para>Manual pages in languages not in the list are not supported. 330 Norwegian does not work because of the transition from no_NO to 331 nb_NO locale, and will be fixed in the next release of 332 <application>Man-DB</application>. Korean is currently non functional 333 because of incomplete fixes in the Groff patch.</para> 334 </note> 335 336 337 <para>If upstream distributes the manual pages in a legacy encoding, 338 the manual pages can simply be copied to 384 385 <para>If upstream distributes the manual pages in a legacy encoding, the 386 manual pages can simply be copied to 339 387 <filename class="directory">/usr/share/man/<replaceable><language 340 388 code></replaceable></filename>. For example, <ulink … … 354 402 355 403 <para>For example, to install <ulink 356 url="http:// ditec.um.es/~piernas/manpages-es/man-pages-es-1.55.tar.bz2">357 Spanish manual pages</ulink> in the legacy encoding, use the following404 url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2"> 405 French manual pages</ulink> in the legacy encoding, use the following 358 406 commands:</para> 359 407 360 <screen role="nodump"><userinput>mv man7/iso_8859-7.7{,X} 361 convert-mans UTF-8 ISO-8859-1 man?/*.? 362 mv man7/iso_8859-7.7{X,} 363 make install</userinput></screen> 364 365 <note> 366 <para>The <filename>man7/iso_8859-7.7</filename> file needs to be 367 exclueded from the conversion process because it is already in 368 ISO-8859-1 format. This is a packaging bug in man-pages-es-1.55. 
369 Future versions should not require this workaround.</para> 370 </note> 371 372 <para>Finally, as an example installation of UTF-8 manual pages, the <ulink 373 url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2"> 374 French manual pages</ulink> can be installed with the following 375 commands:</para> 408 <screen role="nodump"><userinput>convert-mans UTF-8 ISO-8859-1 man?/*.? 409 mkdir -p /usr/share/man/fr 410 cp -rv man? /usr/share/man/fr</userinput></screen> 411 412 <note><para>The French manual pages ship with ready made scripts to do the 413 same conversion. The above instructions are used only as an example for 414 use of the <command>convert-mans</command> script.</para></note> 415 416 <para>Finally, as an example installation of UTF-8 manual pages, again, the 417 French manual pages could be installed with the following commands:</para> 376 418 377 419 <screen role="nodump"><userinput>mkdir -p /usr/share/man/fr.UTF-8
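Illustration (not part of the changeset): the directory-name rule described in the new text can be sketched as a small shell function. The name man_dir_encoding is hypothetical and this is not how Man-DB itself is implemented; it only mirrors the rule stated above, assuming that an extension in the directory name takes precedence and that the built-in table shown in the diff is used otherwise.

```shell
# Hypothetical helper, not part of Man-DB: print the encoding that the
# text above says is expected for a manual page directory.
man_dir_encoding() {
    dir=$(basename "$1")
    case "$dir" in
        # An extension such as ".UTF-8" names the encoding directly.
        *.*) echo "${dir#*.}"; return ;;
    esac
    # Otherwise fall back to the table of legacy encodings from the diff.
    case "$dir" in
        da|de|en|es|fi|fr|ga|gl|id|is|it|nl|no|pt|sv) echo ISO-8859-1 ;;
        cs|hr|hu|pl|sk) echo ISO-8859-2 ;;
        bg) echo CP1251 ;;
        ja) echo EUC-JP ;;
        ko) echo EUC-KR ;;
        ru) echo KOI8-R ;;
        sr) echo ISO-8859-5 ;;
        tr) echo ISO-8859-9 ;;
        zh_CN|zh_SG) echo GBK ;;
        zh_TW) echo BIG5 ;;
        zh_HK) echo BIG5HKSCS ;;
        *) echo unknown ;;   # languages not in the table are not supported
    esac
}
```

For example, man_dir_encoding /usr/share/man/fr.UTF-8 reports UTF-8 because of the extension, while man_dir_encoding /usr/share/man/ru falls back to the table and reports KOI8-R.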