Timestamp:
01/06/2006 01:59:08 AM
Author:
Jeremy Huntwork <jhuntwork@…>
Branches:
10.0, 10.0-rc1, 10.1, 10.1-rc1, 11.0, 11.0-rc1, 11.0-rc2, 11.0-rc3, 11.1, 11.1-rc1, 11.2, 11.2-rc1, 11.3, 11.3-rc1, 12.0, 12.0-rc1, 12.1, 12.1-rc1, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.5-systemd, 7.6, 7.6-systemd, 7.7, 7.7-systemd, 7.8, 7.8-systemd, 7.9, 7.9-systemd, 8.0, 8.1, 8.2, 8.3, 8.4, 9.0, 9.1, arm, bdubbs/gcc13, ml-11.0, multilib, renodr/libudev-from-systemd, s6-init, trunk, xry111/arm64, xry111/arm64-12.0, xry111/clfs-ng, xry111/lfs-next, xry111/loongarch, xry111/loongarch-12.0, xry111/loongarch-12.1, xry111/mips64el, xry111/pip3, xry111/rust-wip-20221008, xry111/update-glibc
Children:
abf1f62
Parents:
60e34b5
Message:

Initial support of UTF-8. Thanks Alexander Patrakov.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@7245 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689

File:
1 edited

  • chapter07/profile.xml

--- chapter07/profile.xml (r60e34b5)
+++ chapter07/profile.xml (rfa21b3d)
@@ -70,5 +70,6 @@
   <replaceable>[CC]</replaceable> with the two-letter code for the appropriate
   country (e.g., <quote>GB</quote>). <replaceable>[charmap]</replaceable> should
-  be replaced with the canonical charmap for your chosen locale.</para>
+  be replaced with the canonical charmap for your chosen locale. Optional
+  modifiers such as <quote>@euro</quote> may also be present.</para>
 
   <para>The list of all locales supported by Glibc can be obtained by running
@@ -77,8 +78,10 @@
 <screen role="nodump"><userinput>locale -a</userinput></screen>
 
-  <para>Locales can have a number of synonyms, e.g. <quote>ISO-8859-1</quote>
+  <para>Charmaps can have a number of aliases, e.g., <quote>ISO-8859-1</quote>
   is also referred to as <quote>iso8859-1</quote> and <quote>iso88591</quote>.
-  Some applications cannot handle the various synonyms correctly, so it is
-  safest to choose the canonical name for a particular locale. To determine
+  Some applications cannot handle the various synonyms correctly (e.g., require
+  that <quote>UTF-8</quote> is written as <quote>UTF-8</quote>, not
+  <quote>utf8</quote>), so it is safest in most
+  cases to choose the canonical name for a particular locale. To determine
   the canonical name, run the following command, where <replaceable>[locale
   name]</replaceable> is the output given by <command>locale -a</command> for
@@ -116,4 +119,5 @@
   Glibc.</para>
 
+  <!-- FIXME: the xlib example will became obsolete real soon -->
   <para>Some packages beyond LFS may also lack support for your chosen locale. One
   example is the X library (part of the X Window System), which outputs the
@@ -140,5 +144,5 @@
 <literal># Begin /etc/profile
 
-export LANG=<replaceable>[ll]</replaceable>_<replaceable>[CC]</replaceable>.<replaceable>[charmap]</replaceable>
+export LANG=<replaceable>[ll]</replaceable>_<replaceable>[CC]</replaceable>.<replaceable>[charmap]</replaceable><replaceable>[@modifiers]</replaceable>
 export INPUTRC=/etc/inputrc
 
@@ -146,16 +150,26 @@
 EOF</userinput></screen>
 
-  <note>
-    <para>The <quote>C</quote> (default) and <quote>en_US</quote> (the
-    recommended one for United States English users) locales are different.</para>
-  </note>
+  <para>The <quote>C</quote> (default) and <quote>en_US</quote> (the recommended
+  one for United States English users) locales are different. <quote>C</quote>
+  uses the US-ASCII 7-bit character set, and treats bytes with the high bit set
+  as invalid characters. That's why, e.g., the <command>ls</command> command
+  substitutes them with question marks in that locale. Also, an attempt to send
+  mail with such characters from Mutt or Pine results in non-RFC-conforming
+  messages being sent (the charset in the outgoing mail is indicated as <quote>unknown
+  8-bit</quote>). So you can use the <quote>C</quote> locale only if you are sure that
+  you will never need 8-bit characters.</para>
 
-  <para>Setting the keyboard layout, screen font, and locale-related environment
-  variables are the only internationalization steps needed to support locales
-  that use ordinary single-byte encodings and left-to-right writing direction.
-  More complex cases (including UTF-8 based locales) require additional steps
-  and additional patches because many applications tend to not work properly
-  under such conditions. These steps and patches are not included in the LFS
-  book and such locales are not yet supported by LFS.</para>
+  <para>UTF-8 based locales are not supported well by many programs. E.g., the
+  <command>watch</command> program displays only ASCII characters in UTF-8
+  locales and has no such restriction in traditional 8-bit locales like en_US.
+  Without patches and/or installing software beyond BLFS, in UTF-8 based locales
+  you will not be able to do such basic tasks as printing plain-text files from
+  the command line, recording Windows-readable CDs with filenames containing
+  non-ASCII characters, viewing ID3v1 tags in MP3 files and so on. Work is in
+  progress to document and, if possible, fix such problems, see
+  <ulink url="&blfs-root;view/svn/introduction/locale-issues.html"/>.
+  It is, however, safe to use UTF-8 based locales if you are going to use only
+  KDE or GNOME and never open the terminal.</para>
+  <!-- All abovementioned problems except "watch" have a known fix beyond BLFS -->
 
 </sect1>
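
The LANG template introduced by this changeset, [ll]_[CC].[charmap][@modifiers], may be easier to follow with a concrete sketch. The helper name compose_lang and the sample values below are hypothetical and do not appear in the book. The block only shows how the four parts could be joined, on the assumption that the modifier is optional and is prefixed with "@" only when present.

```shell
#!/bin/sh
# Hypothetical helper (not part of the LFS book): join the parts of a
# LANG value as ll_CC.charmap, appending @modifier only when one is given.
compose_lang() {
    ll=$1
    cc=$2
    charmap=$3
    modifier=${4:-}
    # ${modifier:+@$modifier} expands to nothing when modifier is empty.
    printf '%s_%s.%s%s\n' "$ll" "$cc" "$charmap" "${modifier:+@$modifier}"
}

compose_lang en GB ISO-8859-1        # prints en_GB.ISO-8859-1
compose_lang de DE ISO-8859-15 euro  # prints de_DE.ISO-8859-15@euro
```

The sketch does not check that the resulting locale is installed. The canonical charmap spelling still has to come from the system, as the edited text describes.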