Changeset 7f89db8


Timestamp:
10/25/2008 09:31:14 PM (16 years ago)
Author:
DJ Lucas <dj@…>
Branches:
10.0, 10.0-rc1, 10.1, 10.1-rc1, 11.0, 11.0-rc1, 11.0-rc2, 11.0-rc3, 11.1, 11.1-rc1, 11.2, 11.2-rc1, 11.3, 11.3-rc1, 12.0, 12.0-rc1, 12.1, 12.1-rc1, 6.4, 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.5-systemd, 7.6, 7.6-systemd, 7.7, 7.7-systemd, 7.8, 7.8-systemd, 7.9, 7.9-systemd, 8.0, 8.1, 8.2, 8.3, 8.4, 9.0, 9.1, arm, bdubbs/gcc13, ml-11.0, multilib, renodr/libudev-from-systemd, s6-init, trunk, xry111/arm64, xry111/arm64-12.0, xry111/clfs-ng, xry111/lfs-next, xry111/loongarch, xry111/loongarch-12.0, xry111/loongarch-12.1, xry111/mips64el, xry111/pip3, xry111/rust-wip-20221008, xry111/update-glibc
Children:
bc81164
Parents:
7d6c9a6
Message:

Updated Man-DB text to account for recent Man-DB development. Many thanks to Alexander Patrakov for patiently guiding me through this.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@8698 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689

Files:
2 edited

  • chapter01/changelog.xml

    r7d6c9a6 r7f89db8  
    3838-->
    3939    <listitem>
     40      <para>2008-10-25</para>
     41      <itemizedlist>
     42        <listitem>
     43          <para>[dj] - Updated the text on the Man-DB page to account for recent
     44          changes in Man-DB.  Thanks to Alexander Patrakov for providing most
     45          of the included text, explanations, and examples.</para>
     46        </listitem>
     47      </itemizedlist>
     48    </listitem>
     49
     50    <listitem>
    4051      <para>2008-10-23</para>
    4152      <itemizedlist>
  • chapter06/man-db.xml

    r7d6c9a6 r7f89db8  
    112112<screen><userinput remap="install">make install</userinput></screen>
    113113
     114  </sect2>
     115
     116  <sect2>
     117    <title>Non-English Manual Pages in LFS</title>
     118<!--
    114119    <para>Some packages provide UTF-8 manual pages, which previous versions of
    115     <application>Man-DB</application> were unable to display.  This limitation
    116     has been fixed in recent versions, and <application>Man-DB</application>
    117     can now convert manual pages from legacy encodings to UTF-8
    118     (and vice-versa) on the fly.  This used to be a rather annoying
    119     problem across different distributions, as packages written for one
    120     distribution would require changes to work on another. The following
    121     script will allow you to convert manual pages to and from legacy and UTF-8
    122     encodings.</para>
     120    <application>Man-DB</application> were unable to display correctly because
     121    the expected (8-bit) encoding for each language was hard-coded in the
     122    source of <application>Man-DB</application>.
     123    <application>Man-DB</application> now uses the extension of the directory
     124    name in order to determine the encoding of the manual pages stored within.
     125    If no extension exists, <application>Man-DB</application> uses a built-in
     126    table (see below) to determine the encoding.  E.g., because of "UTF-8" in
     127    the directory name, it knows that all manual pages residing in
     128    <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
     129    encoded and, according to the built-in table, expects all manual pages
     130    residing in <filename class="directory">/usr/share/man/ru</filename> to
     131    be encoded using KOI8-R.</para>
     132
     133    <para>Linux distributions have different policies concerning the character
     134    encoding in which manual pages are stored in the filesystem. E.g., RedHat
     135    stores all manual pages in UTF-8, while Debian previously used
     136    language-specific (mostly 8-bit) encodings. Many other distributions simply
     137    ignore the problem altogether.  LFS also used the legacy encodings in
     138    previous versions of the book. This was chosen because of the ease of
     139    configuration associated with <application>Man-DB</application>.
     140    Additionally, <application>Man-DB</application> provided support for
     141    Chinese and Japanese locales, and limited support for Korean, whereas
     142    <application>Man</application> did not at that time.</para>
     143
     144    <para>In contrast, the setup in Fedora Core expects all manual pages
     145    to be UTF-8 encoded, and stored in directories without suffixes.
     146    Disagreement about the expected encoding of manual pages amongst
     147    distribution vendors has led to confusion for upstream package maintainers.
     148    Some packages contain UTF-8 manual pages, while others ship with manual
     149    pages in legacy encodings.  Unlike the
     150    <application>Man</application>/<application>Groff</application> setup in
     151    Fedora Core, <application>Man-DB</application> can make very good decisions
     152    about the on-disk encoding and present the information to the user in their
     153    preferred format, without complex configurations.</para>
     154
     155    <para><application>Man-DB</application> has, for the most part, made this
     156    problem completely transparent to end users, as long as the manual pages
     157    are installed into the correct directory.  There may be times, however,
     158    where one encoding is preferred over the other.  For this purpose, the
     159    <command>convert-mans</command> script was written. It will convert manual
     160    pages to another encoding before (or after) installation.  Install the
     161    <command>convert-mans</command> script with the following
     162    instructions:</para>
     163-->
     164    <para>Some packages provide non-English manual pages. They are displayed
     165    correctly only if their location and encoding match the expectations of
     166    the "man" program. However, different Linux distributions have different
     167    policies (expressed in the choice of the <command>man</command> program,
     168    its configuration and patches applied to it) concerning the character
     169    encoding in which manual pages are stored in the filesystem.</para>
     170
     171    <para>E.g., Debian previously required Russian manual pages to be encoded
     172    in KOI8-R and to be placed in
     173    <filename class="directory">/usr/share/man/ru</filename>. Now, in addition,
     174    their <command>man</command> program (<application>Man-DB</application>)
     175    searches for UTF-8 encoded Russian manual pages in
     176    <filename class="directory">/usr/share/man/ru.UTF-8</filename>. On the
     177    other hand, Fedora uses UTF-8 encoded manual pages exclusively. Russian
     178    manual pages are found in
     179    <filename class="directory">/usr/share/man/ru</filename> and their
     180    <command>man</command> program doesn't acknowledge
     181    <filename class="directory">/usr/share/man/ru.UTF-8</filename>.  Many
     182    other distributions ignore the on-disk encodings completely, leaving the
     183    end user with a mix of improperly encoded manual pages for their
     184    configuration. When <command>man</command> processes the requested page,
     185    it will display the contents as configured, resulting in completely
     186    unreadable text if the on-disk encoding is not what is expected for that
     187    configuration.</para>
     188
     189    <para>Disagreement about the expected encoding of manual pages amongst
     190    distribution vendors has led to confusion for upstream package
     191    maintainers. One package may contain UTF-8 manual pages, while another
     192    ships with manual pages in legacy encodings. <command>man</command>
     193    searches for manual pages based on the user's locale settings.
     194    <application>Man-DB</application> uses a built-in table (see below) to
     195    determine the on-disk encoding of manual pages found for a user's
     196    locale, only if the directories found do not have an extension that
     197    describes the encoding. E.g., because of ".UTF-8" in the directory name,
     198    <application>Man-DB</application> knows that all manual pages residing in
     199    <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
     200    encoded and, according to the built-in table, expects all manual pages
     201    residing in <filename class="directory">/usr/share/man/ru</filename> to
     202    be encoded using KOI8-R.</para>
     203
     204    <!-- Origin: man-db-2.5.2/src/encodings.c -->
     205    <table>
     206      <title>Expected character encoding of legacy 8-bit manual pages</title>
     207      <?dbfo table-width="2.5in" ?>
     208
     209      <tgroup cols="2">
     210
     211        <colspec colnum="1" colwidth="1.5in"/>
     212        <colspec colnum="2" colwidth="1in"/>
     213
     214        <thead>
     215          <row>
     216            <entry>Language (code)</entry>
     217            <entry>Encoding</entry>
     218          </row>
     219        </thead>
     220
     221        <tbody>
     222          <row>
     223            <entry>Danish (da)</entry>
     224            <entry>ISO-8859-1</entry>
     225          </row>
     226          <row>
     227            <entry>German (de)</entry>
     228            <entry>ISO-8859-1</entry>
     229          </row>
     230          <row>
     231            <entry>English (en)</entry>
     232            <entry>ISO-8859-1</entry>
     233          </row>
     234          <row>
     235            <entry>Spanish (es)</entry>
     236            <entry>ISO-8859-1</entry>
     237          </row>
     238          <row>
     239            <entry>Finnish (fi)</entry>
     240            <entry>ISO-8859-1</entry>
     241          </row>
     242          <row>
     243            <entry>French (fr)</entry>
     244            <entry>ISO-8859-1</entry>
     245          </row>
     246          <row>
     247            <entry>Irish (ga)</entry>
     248            <entry>ISO-8859-1</entry>
     249          </row>
     250          <row>
     251            <entry>Galician (gl)</entry>
     252            <entry>ISO-8859-1</entry>
     253          </row>
     254          <row>
     255            <entry>Indonesian (id)</entry>
     256            <entry>ISO-8859-1</entry>
     257          </row>
     258          <row>
     259            <entry>Icelandic (is)</entry>
     260            <entry>ISO-8859-1</entry>
     261          </row>
     262          <row>
     263            <entry>Italian (it)</entry>
     264            <entry>ISO-8859-1</entry>
     265          </row>
     266          <row>
     267            <entry>Dutch (nl)</entry>
     268            <entry>ISO-8859-1</entry>
     269          </row>
     270          <!-- FIXME: BUG: "no" is deprecated, should use "nb" or "nn" and
     271          symlinks -->
     272          <row>
     273            <entry>Norwegian (no)</entry>
     274            <entry>ISO-8859-1</entry>
     275          </row>
     276          <!-- END BUG -->
     277          <row>
     278            <entry>Portuguese (pt)</entry>
     279            <entry>ISO-8859-1</entry>
     280          </row>
     281          <row>
     282            <entry>Swedish (sv)</entry>
     283            <entry>ISO-8859-1</entry>
     284          </row>
     285          <!-- Languages below require patched groff -->
     286          <row>
     287            <entry>Bulgarian (bg)</entry>
     288            <entry>CP1251</entry>
     289          </row>
     290          <row>
     291            <entry>Czech (cs)</entry>
     292            <entry>ISO-8859-2</entry>
     293          </row>
     294          <row>
     295            <entry>Croatian (hr)</entry>
     296            <entry>ISO-8859-2</entry>
     297          </row>
     298          <row>
     299            <entry>Hungarian (hu)</entry>
     300            <entry>ISO-8859-2</entry>
     301          </row>
     302          <row>
     303            <entry>Japanese (ja)</entry>
     304            <entry>EUC-JP</entry>
     305          </row>
     306          <row>
     307            <entry>Korean (ko)</entry>
     308            <entry>EUC-KR</entry>
     309          </row>
     310          <row>
     311            <entry>Polish (pl)</entry>
     312            <entry>ISO-8859-2</entry>
     313          </row>
     314          <row>
     315            <entry>Russian (ru)</entry>
     316            <entry>KOI8-R</entry>
     317          </row>
     318          <row>
     319            <entry>Slovak (sk)</entry>
     320            <entry>ISO-8859-2</entry>
     321          </row>
     322          <row>
     323            <entry>Serbian (sr)</entry>
     324            <entry>ISO-8859-5</entry>
     325          </row>
     326          <row>
     327            <entry>Turkish (tr)</entry>
     328            <entry>ISO-8859-9</entry>
     329          </row>
     330          <row>
     331            <entry>Simplified Chinese (zh_CN)</entry>
     332            <entry>GBK</entry>
     333          </row>
     334          <row>
     335            <entry>Simplified Chinese, Singapore (zh_SG)</entry>
     336            <entry>GBK</entry>
     337          </row>
     338          <row>
     339            <entry>Traditional Chinese (zh_TW)</entry>
     340            <entry>BIG5</entry>
     341          </row>
     342          <row>
     343            <entry>Traditional Chinese, Hong Kong (zh_HK)</entry>
     344            <entry>BIG5HKSCS</entry>
     345          </row>
     346        </tbody>
     347
     348      </tgroup>
     349
     350    </table>
     351
     352    <note>
     353      <para>Manual pages in languages not in the list are not supported.
     354      Norwegian does not work because of the transition from no_NO to
     355      nb_NO locale, and will be fixed in the next release of
     356      <application>Man-DB</application>.  Korean is currently non-functional
     357      because of incomplete fixes in the Debian
     358      <application>Groff</application> patch applied in LFS.</para>
     359    </note>
     360
     361    <para>Packages may install manual pages into an improperly named directory,
     362    depending on which distributions the author develops the package for. To
     363    assist in the conversion of the manual pages to the proper encoding for the
     364    directory in which they are installed, the <command>convert-mans</command>
     365    script was written. It will convert manual pages to another encoding before
     366    (or after) installation.  Install the <command>convert-mans</command>
     367    script with the following instructions:</para>
    123368
    124369<screen><userinput remap="install">cat &gt;&gt; convert-mans &lt;&lt; "EOF"
     
    137382install -m755 convert-mans  /usr/bin</userinput></screen>
    138383
    139     <para>Additional information regarding the compression of
    140     man and info pages can be found in the BLFS book at
    141     <ulink url="&blfs-root;view/cvs/postlfs/compressdoc.html"/>.</para>
    142 
    143   </sect2>
    144 
    145   <sect2>
    146     <title>Non-English Manual Pages in LFS</title>
    147 
    148     <para>Linux distributions have different policies concerning the character
    149     encoding in which manual pages are stored in the filesystem. E.g., RedHat
    150     stores all manual pages in UTF-8, while Debian previously used
    151     language-specific (mostly 8-bit) encodings. As mentioned above, this leads
    152     to incompatibility of packages with manual pages designed for different
    153     distributions.</para>
    154 
    155     <para>LFS previously used the same convention as Debian. This was chosen
    156     because <application>Man-DB</application> did not understand manual pages
    157     stored in UTF-8 at the time of its introduction into LFS.  For our purposes
    158     at that time, <application>Man-DB</application> was preferable to
    159     <application>Man</application> as it worked without any additional
    160     configuration in any locale.  This is still true today as
    161     <application>Man-DB</application> with Debian patched
    162     <application>Groff</application> will now dynamically convert UTF-8 encoded
    163     manual pages to the user's locale.  Additionally, this combination provides
    164     support for Chinese and Japanese locales, and limited support for Korean,
    165     whereas <application>Man</application> does not. The current offering of
    166     <application>Man</application> as used in RedHat requires major
    167     modifications to both the <application>Man</application> and
    168     <application>Groff</application> packages, and still falls short on
    169     Chinese, Japanese, and Korean encodings.</para>
    170 
    171     <para>Finally, most distributions, including Debian, are rapidly migrating
    172     to all UTF-8 encoded manual pages. Upstream packagers will very likely drop
    173     legacy encodings in favor of UTF-8, though adoption has been slow due to
    174     the hacks required to make the current <application>Man</application> and
    175     <application>Groff</application> packages work correctly together.</para>
    176 
    177     <para>The relationship between language codes and the expected encoding
    178     of legacy manual pages is listed below.</para>
    179 
    180     <!-- Origin: man-db-2.5.2/src/encodings.c -->
    181     <table>
    182       <title>Expected character encoding of legacy 8-bit manual pages</title>
    183       <?dbfo table-width="2.5in" ?>
    184 
    185       <tgroup cols="2">
    186 
    187         <colspec colnum="1" colwidth="1.5in"/>
    188         <colspec colnum="2" colwidth="1in"/>
    189 
    190         <thead>
    191           <row>
    192             <entry>Language (code)</entry>
    193             <entry>Encoding</entry>
    194           </row>
    195         </thead>
    196 
    197         <tbody>
    198           <row>
    199             <entry>Danish (da)</entry>
    200             <entry>ISO-8859-1</entry>
    201           </row>
    202           <row>
    203             <entry>German (de)</entry>
    204             <entry>ISO-8859-1</entry>
    205           </row>
    206           <row>
    207             <entry>English (en)</entry>
    208             <entry>ISO-8859-1</entry>
    209           </row>
    210           <row>
    211             <entry>Spanish (es)</entry>
    212             <entry>ISO-8859-1</entry>
    213           </row>
    214           <row>
    215             <entry>Finnish (fi)</entry>
    216             <entry>ISO-8859-1</entry>
    217           </row>
    218           <row>
    219             <entry>French (fr)</entry>
    220             <entry>ISO-8859-1</entry>
    221           </row>
    222           <row>
    223             <entry>Irish (ga)</entry>
    224             <entry>ISO-8859-1</entry>
    225           </row>
    226           <row>
    227             <entry>Galician (gl)</entry>
    228             <entry>ISO-8859-1</entry>
    229           </row>
    230           <row>
    231             <entry>Indonesian (id)</entry>
    232             <entry>ISO-8859-1</entry>
    233           </row>
    234           <row>
    235             <entry>Icelandic (is)</entry>
    236             <entry>ISO-8859-1</entry>
    237           </row>
    238           <row>
    239             <entry>Italian (it)</entry>
    240             <entry>ISO-8859-1</entry>
    241           </row>
    242           <row>
    243             <entry>Dutch (nl)</entry>
    244             <entry>ISO-8859-1</entry>
    245           </row>
    246           <!-- FIXME: BUG: "no" is deprecated, should use "nb" or "nn" and
    247           symlinks -->
    248           <row>
    249             <entry>Norwegian (no)</entry>
    250             <entry>ISO-8859-1</entry>
    251           </row>
    252           <!-- END BUG -->
    253           <row>
    254             <entry>Portuguese (pt)</entry>
    255             <entry>ISO-8859-1</entry>
    256           </row>
    257           <row>
    258             <entry>Swedish (sv)</entry>
    259             <entry>ISO-8859-1</entry>
    260           </row>
    261           <!-- Languages below require patched groff -->
    262           <row>
    263             <entry>Bulgarian (bg)</entry>
    264             <entry>CP1251</entry>
    265           </row>
    266           <row>
    267             <entry>Czech (cs)</entry>
    268             <entry>ISO-8859-2</entry>
    269           </row>
    270           <row>
    271             <entry>Croatian (hr)</entry>
    272             <entry>ISO-8859-2</entry>
    273           </row>
    274           <row>
    275             <entry>Hungarian (hu)</entry>
    276             <entry>ISO-8859-2</entry>
    277           </row>
    278           <row>
    279             <entry>Japanese (ja)</entry>
    280             <entry>EUC-JP</entry>
    281           </row>
    282           <row>
    283             <entry>Korean (ko)</entry>
    284             <entry>EUC-KR</entry>
    285           </row>
    286           <row>
    287             <entry>Polish (pl)</entry>
    288             <entry>ISO-8859-2</entry>
    289           </row>
    290           <row>
    291             <entry>Russian (ru)</entry>
    292             <entry>KOI8-R</entry>
    293           </row>
    294           <row>
    295             <entry>Slovak (sk)</entry>
    296             <entry>ISO-8859-2</entry>
    297           </row>
    298           <row>
    299             <entry>Serbian (sr)</entry>
    300             <entry>ISO-8859-5</entry>
    301           </row>
    302           <row>
    303             <entry>Turkish (tr)</entry>
    304             <entry>ISO-8859-9</entry>
    305           </row>
    306           <row>
    307             <entry>Simplified Chinese (zh_CN)</entry>
    308             <entry>GBK</entry>
    309           </row>
    310           <row>
    311             <entry>Simplified Chinese, Singapore (zh_SG)</entry>
    312             <entry>GBK</entry>
    313           </row>
    314           <row>
    315             <entry>Traditional Chinese (zh_TW)</entry>
    316             <entry>BIG5</entry>
    317           </row>
    318           <row>
    319             <entry>Traditional Chinese, Hong Kong (zh_HK)</entry>
    320             <entry>BIG5HKSCS</entry>
    321           </row>
    322         </tbody>
    323 
    324       </tgroup>
    325 
    326     </table>
    327 
    328     <note>
    329       <para>Manual pages in languages not in the list are not supported.
    330       Norwegian does not work because of the transition from no_NO to
    331       nb_NO locale, and will be fixed in the next release of
    332       <application>Man-DB</application>.  Korean is currently non-functional
    333       because of incomplete fixes in the Groff patch.</para>
    334     </note>
    335 
    336 
    337     <para>If upstream distributes the manual pages in a legacy encoding,
    338     the manual pages can simply be copied to
     384
     385    <para>If upstream distributes the manual pages in a legacy encoding, the
     386    manual pages can simply be copied to
    339387    <filename class="directory">/usr/share/man/<replaceable>&lt;language
    340388    code&gt;</replaceable></filename>. For example, <ulink
     
    354402
    355403    <para>For example, to install <ulink
    356     url="http://ditec.um.es/~piernas/manpages-es/man-pages-es-1.55.tar.bz2">
    357     Spanish manual pages</ulink> in the legacy encoding, use the following
     404    url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
     405    French manual pages</ulink> in the legacy encoding, use the following
    358406    commands:</para>
    359407
    360 <screen role="nodump"><userinput>mv man7/iso_8859-7.7{,X}
    361 convert-mans UTF-8 ISO-8859-1 man?/*.?
    362 mv man7/iso_8859-7.7{X,}
    363 make install</userinput></screen>
    364 
    365     <note>
    366       <para>The <filename>man7/iso_8859-7.7</filename> file needs to be
    367       excluded from the conversion process because it is already in
    368       ISO-8859-1 format.  This is a packaging bug in man-pages-es-1.55.
    369       Future versions should not require this workaround.</para>
    370     </note>
    371 
    372     <para>Finally, as an example installation of UTF-8 manual pages, the <ulink
    373     url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
    374     French manual pages</ulink> can be installed with the following
    375     commands:</para>
     408<screen role="nodump"><userinput>convert-mans UTF-8 ISO-8859-1 man?/*.?
     409mkdir -p /usr/share/man/fr
     410cp -rv man? /usr/share/man/fr</userinput></screen>
     411
     412    <note><para>The French manual pages ship with ready-made scripts to do the
     413    same conversion. The above instructions are used only as an example of
     414    using the <command>convert-mans</command> script.</para></note>
     415
     416    <para>Finally, as an example installation of UTF-8 manual pages, again, the
     417    French manual pages could be installed with the following commands:</para>
    376418
    377419<screen role="nodump"><userinput>mkdir -p /usr/share/man/fr.UTF-8
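
The lookup order described in the new Man-DB text (an extension on the directory name first, the built-in table second) can be illustrated with a small shell sketch. This is not Man-DB code: guess_encoding is a hypothetical helper written for this illustration, the table entries are copied from the table in the diff above, and it assumes the directory name contains at most one extension (for example fr.UTF-8). Man-DB itself may treat other suffixes differently.

```shell
#!/bin/sh
# Illustrative sketch only: guess_encoding is a hypothetical helper,
# not a command shipped with Man-DB.
guess_encoding() {
    # Keep only the last path component, e.g. "fr.UTF-8" or "ru".
    dir=$(basename "$1")

    # 1. An extension on the directory name states the encoding directly.
    case "$dir" in
        *.*)
            printf '%s\n' "${dir#*.}"
            return 0
            ;;
    esac

    # 2. Otherwise fall back to the language-to-encoding table
    #    (entries taken from the table in the changeset).
    case "$dir" in
        da|de|en|es|fi|fr|ga|gl|id|is|it|nl|no|pt|sv) echo ISO-8859-1 ;;
        bg)             echo CP1251 ;;
        cs|hr|hu|pl|sk) echo ISO-8859-2 ;;
        ja)             echo EUC-JP ;;
        ko)             echo EUC-KR ;;
        ru)             echo KOI8-R ;;
        sr)             echo ISO-8859-5 ;;
        tr)             echo ISO-8859-9 ;;
        zh_CN|zh_SG)    echo GBK ;;
        zh_TW)          echo BIG5 ;;
        zh_HK)          echo BIG5HKSCS ;;
        *)              return 1 ;;   # language not in the table
    esac
}

guess_encoding /usr/share/man/fr.UTF-8   # prints UTF-8
guess_encoding /usr/share/man/ru         # prints KOI8-R
```

A non-zero exit status from the helper corresponds to the note in the diff that manual pages in languages not in the list are not supported.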