Opened 18 years ago

Closed 17 years ago

#1694 closed defect (fixed)

UTF-8 Nitpicks

Reported by: tushar@…
Owned by: Jeremy Huntwork
Priority: lowest
Milestone: 6.2
Component: Book
Version: SVN
Severity: normal
Keywords:

Description (last modified by Jeremy Huntwork)

See URL for discussion.

Some nitpicks related to the UTF-8 patch:

  • /usr/bin/zsoelim from groff is overwritten by man-db (three cheers for the pkg-user hint :).
  • Mention that users can choose the man package instead of man-db (with a pointer to the man home-page/freshmeat-page).
  • Mention that folks can choose gdbm instead of Berkeley DB, with a pointer to BLFS's gdbm page.
  • The man-db page details the two approaches to UTF-8: Redhat and Debian. There should be an explanation of why the Redhat approach was not considered for the book.

Change History (6)

comment:1 by Jeremy Huntwork, 18 years ago

Description: modified (diff)
Milestone: 6.2

comment:2 by Jeremy Huntwork, 18 years ago

Just testing the new email settings - please ignore.

comment:3 by Jeremy Huntwork, 17 years ago

IIUC, we chose the Debian convention (not storing the man pages in UTF-8) for greater flexibility/compatibility with all locale encodings - so that Man-DB can convert on-the-fly to any required locale encoding. Man-DB expects the man pages to exist in a certain encoding, but converts them to the user-specified encoding when it needs to. Better to store the pages in the encoding that Man-DB expects.

Alexander, is this correct? Can you clarify and/or write up a short explanation of our chosen path for the book?

comment:4 by alexander@…, 17 years ago

1) zsoelim: Trivial, please use zsoelim from Man-DB.

2) Man instead of Man-DB: this will bring us a lot of support issues like "I don't want the DB, I installed Man, and now I see can't-render-this squares in all manual pages instead of hyphens!" if we don't add (very long) configuration instructions for Man. Thus, WONTFIX.

3) Pointer to GDBM is OK.

4) Edit the text below as needed and add to the book.

The Debian convention has been chosen because of the following considerations:

  • Compatibility with the existing BLFS instructions. Almost all BLFS packages that supply translated manual pages provide them in the old 8-bit encodings. Storing manuals in UTF-8 would require adding conversion instructions to all of those BLFS packages.
  • Man-DB is very rigid and understands no other convention, but (unlike Man) it works without configuration in any locale.
  • There is simply no fully working implementation of the RedHat convention. RedHat's groff misformats simple test cases (it adds spaces in the middle of a word).
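
For anyone editing this into the book, the on-the-fly conversion that the Debian convention relies on can be sketched roughly as below. This is an illustration only, not Man-DB's actual code: the function name `convert_page` is hypothetical, and KOI8-R is assumed as the stored encoding of a Russian page purely for the example.

```python
# Illustrative sketch only; this is NOT how Man-DB is implemented.
# convert_page is a hypothetical helper, and KOI8-R is an assumed
# on-disk encoding chosen for the example.

def convert_page(raw: bytes, stored_encoding: str, locale_encoding: str) -> bytes:
    """Re-encode a manual page from its on-disk encoding to the locale encoding."""
    text = raw.decode(stored_encoding)
    # Characters the target encoding cannot represent are replaced with "?".
    return text.encode(locale_encoding, errors="replace")


# A fragment of a page stored on disk in a legacy 8-bit encoding.
stored = "ИМЯ".encode("koi8_r")

# The same stored bytes serve users in different locales.
utf8_out = convert_page(stored, "koi8_r", "utf-8")
cp1251_out = convert_page(stored, "koi8_r", "cp1251")
```

The point of the sketch is that the stored bytes never change: only the final encoding step depends on the user's locale, which is why a single fixed storage encoding per language is enough.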

comment:5 by Jeremy Huntwork, 17 years ago

Owner: changed from lfs-book@… to Jeremy Huntwork
Status: new → assigned

Thanks Alexander. Agreed on all points, and I'll work on getting your explanation into the book.

comment:6 by Jeremy Huntwork, 17 years ago

Resolution: fixed
Status: assigned → closed

Done as of r7499
