Opened 18 years ago

Closed 17 years ago

Last modified 17 years ago

#2102 closed defect (fixed)

Audit translated manual pages

Reported by: alexander@…
Owned by: Randy McMurchy
Priority: normal
Milestone: 6.2.0
Component: BOOK
Version: SVN
Severity: normal
Keywords:
Cc:

Description

LFS expects that manual pages are in the language-specific (usually 8-bit) encoding, as specified on http://www.linuxfromscratch.org/lfs/view/6.2/chapter06/man-db.html. However, some packages install translated manual pages in UTF-8 encoding (e.g., Shadow, already dealt with), or manual pages in languages not in the table. All BLFS packages have to be audited for this, but I can't do this myself due to the high cost of Internet traffic.

Manual pages that are for unsupported countries can be spotted by looking at subdirectories of /usr/share/man. If they are found, please modify the book in order to avoid their installation, as they would not be displayed correctly and would hide the (readable) corresponding English manual page.
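The directory check described above could be sketched roughly as follows. The function name list_unsupported_langs and the idea of passing the supported languages as a space-separated argument are assumptions made for illustration; the authoritative list is the table on the LFS man-db page.

```shell
#!/bin/sh
# Sketch only: print subdirectories of a man root that are neither section
# directories (man1, man3p, cat1, ...) nor in a caller-supplied list of
# supported language codes. The function name and arguments are hypothetical.
list_unsupported_langs() {
    root=$1
    supported=" $2 "            # e.g. "de fr ru", taken from the LFS table
    for dir in "$root"/*/
    do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        case $name in
            man[0-9ln]*|cat[0-9ln]*) continue ;;   # section directories
        esac
        case $supported in
            *" $name "*) continue ;;               # language is supported
        esac
        echo "$name"
    done
}
```

For example, list_unsupported_langs /usr/share/man "de fr ru" would print one directory name per line for anything else it finds.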

Manual pages that are in UTF-8 can be found by running this simple shell script after installing the package:

#!/bin/sh
# Begin checkman.sh
# Usage: find /usr/share/man | xargs ./checkman.sh
for a in "$@"
do
    # echo "Checking $a..."
    # Pure-ASCII manual page (possibly except comments) is OK
    grep -v '.\\"' "$a" | iconv -f US-ASCII -t US-ASCII >/dev/null 2>&1 && continue
    # Non-UTF-8 manual page is OK
    iconv -f UTF-8 -t UTF-8 "$a" >/dev/null 2>&1 || continue
    # If we got here, we found UTF-8 manual page, bad.
    echo "UTF-8 manual page: $a" >&2
done
# End checkman.sh

If this script finds anything, a separate ticket should be opened with the subject "Package foo installs UTF-8 manual pages". I will provide instructions for each such bug.

Progress is tracked at CheckManualPages. This ticket should be closed when all BLFS packages are mentioned there as installing manual pages correctly.

Change History (15)

comment:1 by alexander@…, 18 years ago

whoops, I meant "find /usr/share/man -type f"
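With that correction, the whole check can also be written as one self-contained function, which avoids depending on where checkman.sh is installed. This is only a sketch: the name find_utf8_pages is made up here, and the -a option to grep (treat input as text) is a GNU/BSD extension that the original script does not use.

```shell
#!/bin/sh
# Sketch only: report regular files under a man root that are valid UTF-8
# but not pure ASCII, using the same two iconv tests as checkman.sh.
# find_utf8_pages is a hypothetical name; file names with newlines are not
# handled.
find_utf8_pages() {
    root=${1:-/usr/share/man}
    # -type f skips directories, as corrected in comment:1
    find "$root" -type f | while IFS= read -r page
    do
        # Pure ASCII once roff comments are dropped: OK
        grep -a -v '.\\"' "$page" |
            iconv -f US-ASCII -t US-ASCII >/dev/null 2>&1 && continue
        # Not valid UTF-8, so presumably a legacy 8-bit encoding: OK
        iconv -f UTF-8 -t UTF-8 "$page" >/dev/null 2>&1 || continue
        echo "UTF-8 manual page: $page"
    done
}
```

Like the original script, this looks at file contents as they are on disk, so compressed pages would need to be decompressed first.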

comment:2 by Randy McMurchy, 18 years ago

With all due respect to Alexander's efforts on this bug, wouldn't it be easier to put a caution note on the locale related issues page saying to simply delete an affected man page (providing elementary instructions to look in the /usr/share/man/xx tree)?

I mean what Alex is asking would actually require this check each and every time each and every package is added/updated. This seems excessive.

If a user is using a locale that has issues, she's bound to visit the locale related issues page at least once and see the caution note about incorrectly formatted man pages. So, if she pulls up a man page that doesn't format correctly in the native encoding, she'll know immediately that an English page probably exists and to delete the man page and/or country directory.

Another thought that doesn't require BLFS editors to be involved with every single package update is to provide instructions, *at the beginning of the book* to create unsupported country directory entries in the /usr/share/man tree which are actually symlinks to /dev/null.

This would be a one-time fix, not requiring editor checking if /usr/share/man/xx -> /dev/null. Any package that would install a page as /usr/share/man/xx/manx/manpage would continue with the installation process, even though the unsupported man pages are not really being installed.

Just a couple of thoughts to consider (but I do like the /dev/null idea).

comment:3 by alexander@…, 18 years ago

> I mean what Alex is asking would actually require this check each and every time each and every package is added/updated. This seems excessive.

What's wrong with looking for bugs in advance, especially given that LFS-type users are lazy at reporting bugs when the fix is trivial?

In fact, someone is going to build as much as possible of BLFS just before a release. What I am asking for is to run the script from the description of this ticket just after that. I am sure that not many issues will be found (not more than 5). But the remaining ones still have to be resolved.

Your /dev/null idea doesn't work, because when a package tries to install a file /usr/share/man/ko/man1/foo.1, it will get a "not a directory" error and "make install" will fail.
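The failure is easy to reproduce in a scratch directory without touching /usr/share/man. The sketch below is an illustration only; the function name try_devnull_langdir is invented, and mkdir -p plus touch stand in for whatever a real "make install" would run.

```shell
#!/bin/sh
# Sketch only: show that a language directory symlinked to /dev/null makes
# an install into it fail. Everything happens under a temporary directory.
try_devnull_langdir() {
    root=$(mktemp -d) || return 2
    ln -s /dev/null "$root/ko"
    # Stand-in for a package installing $root/ko/man1/foo.1
    if mkdir -p "$root/ko/man1" 2>/dev/null &&
       touch "$root/ko/man1/foo.1" 2>/dev/null
    then
        result=installed
    else
        result=failed           # "Not a directory" on a typical Linux system
    fi
    rm -rf "$root"              # removes the symlink, not /dev/null itself
    echo "$result"
}
```

Calling try_devnull_langdir should print "failed", because /dev/null is a character device and cannot have path components below it.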

comment:4 by Randy McMurchy, 18 years ago

Yes, the /dev/null idea as I described is flawed. I notice you did not comment about my first suggestion, of letting the users handle this on their own.

A couple of questions just to help get me up to speed with mandb:

Is this an LFS-specific issue? For instance, are there distros that would not require this, where the man pages would be considered "supported"?

Additionally, could one 'fix' the LFS installation so that the country that is currently unsupported could be made to be supported?

I suppose what I'm looking for is a way that we can put this onus on the user and wouldn't require the editors to have to remember to run a script after updating each and every package, each and every time.

Surely, there is some way to accomplish this. Another suggestion:

Provide a crontab entry for users to automatically run a script (both provided on the locale issues page) which removes any files that may exist. I'm not saying my suggestions are the optimum solution, I'm just, as I've mentioned, interested in pushing this off on the users.
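A cleanup script of that kind might look roughly like the sketch below. Nothing here comes from the book: the function name prune_lang_dirs, the script path in the sample crontab line, and the dry-run default are all assumptions chosen to keep the example safe to try.

```shell
#!/bin/sh
# Sketch only. Hypothetical crontab entry (weekly, Sunday 04:00):
#   0 4 * * 0  /usr/local/sbin/prune-man-langs.sh
# prune_lang_dirs ROOT "KEEP LIST" [apply]
# Without "apply" it only prints what it would remove.
prune_lang_dirs() {
    root=$1
    keep=" $2 "                 # supported language codes, space separated
    mode=${3:-dry-run}
    for dir in "$root"/*/
    do
        # Only real directories; leave symlinks alone
        [ -d "$dir" ] && [ ! -L "${dir%/}" ] || continue
        name=$(basename "$dir")
        case $name in man[0-9ln]*|cat[0-9ln]*) continue ;; esac
        case $keep in *" $name "*) continue ;; esac
        if [ "$mode" = apply ]
        then
            rm -rf -- "${dir%/}" && echo "removed $name"
        else
            echo "would remove $name"
        fi
    done
}
```

Running it once without the third argument shows what would be deleted before anything is scheduled in cron.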

comment:5 by alexander@…, 18 years ago

To the "leave it to the user" suggestion: acceptable, as long as he is asked to report bugs. I.e., something like this on locale-related issues page:


LFS expects that manual pages are in the language-specific (usually 8-bit) encoding, as specified on http://www.linuxfromscratch.org/lfs/view/6.2/chapter06/man-db.html. However, some packages install translated manual pages in UTF-8 encoding (e.g., Shadow, already dealt with), or manual pages in languages not in the table. Not all BLFS packages have been audited for conformance with the requirements put in LFS. If you find a manual page installed by any of BLFS packages that is obviously in the wrong encoding, please remove or convert it as needed, and report this to BLFS team as a bug.
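For the "convert it as needed" part of that text, a conversion with iconv could be sketched as follows. The function name convert_page and the temporary ".tmp" suffix are illustrative, and the target encoding has to be whatever the LFS table lists for the page's language; ISO-8859-1 below is only an example.

```shell
#!/bin/sh
# Sketch only: convert one manual page from UTF-8 to a legacy 8-bit encoding
# in place. convert_page is a hypothetical helper, not part of any package.
convert_page() {
    page=$1
    enc=$2                      # e.g. ISO-8859-1; must match the LFS table
    # Write to a temporary file first so the original survives if iconv
    # reports a failure
    if iconv -f UTF-8 -t "$enc" "$page" > "$page.tmp"
    then
        mv "$page.tmp" "$page"
    else
        rm -f "$page.tmp"
        return 1
    fi
}
```

A call such as convert_page /usr/share/man/de/man1/foo.1 ISO-8859-1 (the path is made up) rewrites the page; note that mv replaces the file, so its permissions follow the current umask.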


To the "is this LFS-specific" question: no, it is not. Encodings of manual pages should be consistent in any given distro (but the convention varies from distro to distro).

To the "crontab entry" suggestion: Fedora does something very similar with their man setup, so the idea is probably good.

comment:6 by dnicholson@…, 18 years ago

I actually don't mind doing this. It's just one more little task at the end of a package install.

make DESTDIR=`pwd`/dest install
find `pwd`/dest/usr/share/man -type f | xargs checkman.sh

I checked on my system, which is a pretty fleshed out desktop system, with some servers. Results were actually really good.

The culprits were ImageMagick, git, xf86-input-evdev, elinks and shadow. Shadow is already fixed, and git and elinks aren't in the book. ImageMagick (/usr/share/man/man1/ImageMagick.1) and xf86-input-evdev (/usr/share/man/man4/evdev.4) are only one man page each.

I would be OK with doing this. Alexander has already provided the tools to make it simple.

comment:7 by alexander@…, 18 years ago

Both ImageMagick and evdev driver manual pages are indeed in UTF-8. The effect is:

ImageMagick:

"Bézier curves" became "BÃ©zier curves"

evdev driver:

"Kristian Høgsberg" became "Kristian HÃ¸gsberg"

So this only manifests itself as a minor typo. The rest of the text is certainly readable.

Both manual pages are fixable:

sed -i "s/\xc3\xa9/\\\\['e]/" /usr/share/man/man1/ImageMagick.1
sed -i "s/\xc3\xb8/\\\\[\/o]/" /usr/share/man/man4/evdev.4

or equivalent seds in sources.
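The effect of the first substitution can be checked on a sample string before touching any installed page. The sketch below is an assumption-laden illustration: fix_e_acute is an invented name, and the UTF-8 bytes are produced with printf octal escapes instead of sed's \x notation so that it does not rely on GNU sed.

```shell
#!/bin/sh
# Sketch only: replace UTF-8 "é" (bytes 0xC3 0xA9) on standard input with
# the groff escape \['e], as the ImageMagick sed above does.
fix_e_acute() {
    # Octal escapes are understood by any POSIX printf
    e_acute=$(printf '\303\251')
    # "\\\\" reaches sed as "\\", which sed emits as a single backslash
    LC_ALL=C sed "s/$e_acute/\\\\['e]/g"
}
```

For instance, printf 'B\303\251zier curves\n' | fix_e_acute prints B\['e]zier curves, which is pure ASCII and so passes the first test in checkman.sh.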

comment:8 by dnicholson@…, 18 years ago

Indeed your fixes work. I should be honest that I don't install all of X, so there could be more. I guess this still needs discussion, but it seems OK to me.

comment:9 by Randy McMurchy, 17 years ago

Type: changed from task to defect

==========================================================
LFS expects that manual pages are in the language-specific (usually 8-bit) encoding, as specified on http://www.linuxfromscratch.org/lfs/view/6.2/chapter06/man-db.html. However, some packages install translated manual pages in UTF-8 encoding (e.g., Shadow, already dealt with), or manual pages in languages not in the table. Not all BLFS packages have been audited for conformance with the requirements put in LFS. If you find a manual page installed by any of BLFS packages that is obviously in the wrong encoding, please remove or convert it as needed, and report this to BLFS team as a bug.
==========================================================

I like this. Can we agree on the following:

  1. Place the text above on the locale related issues page.
  2. Provide fixes to the sources for packages that install bad pages (easy enough to get the list of packages that need it).
  3. Periodically, Editors should run the check to ensure no other bad pages have snuck in.

If we can agree on the above, or something similar if someone notices the strategy might be flawed, I'll gladly pick up the ticket and get this done.

comment:10 by alexander@…, 17 years ago

No objections. The proposed text is OK. Thanks for the idea to add it to the locale related issues page.

comment:11 by Randy McMurchy, 17 years ago

Owner: changed from blfs-book@… to Randy McMurchy
Status: changed from new to assigned

It was your suggestion, Alexander, I was just trying to summarize.

:-)

comment:12 by dnicholson@…, 17 years ago

I checked what I have on my newly built system (with a little tweak to see where they come from). Here goes:

UTF-8 manual page /usr/share/man/man4/evdev.4: xf86-input-evdev-1.1.5-1
UTF-8 manual page /usr/share/man/man4/radeon.4: xf86-video-ati-6.6.3-1
UTF-8 manual page /usr/share/man/man4/fbdev.4: xf86-video-fbdev-0.3.1-1

evdev.4 is because of Kristian Høgsberg. The other two are because of Michel Dänzer. We could do the convert-mans after the fact on the Xorg Drivers page so that the build commands aren't broken up.

Additionally, I seem to have screwed up my fixes for shadow because they're all showing as UTF-8. I'm sure that's an error on my part, though.

comment:13 by Randy McMurchy, 17 years ago

I've added the text discussed above to the 'Locale Related Issues' page and added commands to the ImageMagick and Xorg evdev packages to fix bad pages.

At this point, this ticket can probably be closed and new instances of bad pages should be described in separate tickets (see #2214 and #2215).

Dan also mentioned a couple more Xorg pages, but I didn't see fixes for these.

Keeping the bug open until Alexander has a chance to provide comment about the changes.

comment:14 by alexander@…, 17 years ago

Resolution: fixed
Status: changed from assigned to closed

Well done, closing.

comment:15 by dnicholson@…, 17 years ago

It would be very nice if the checkman.sh script was referenced somewhere instead of dying with this ticket. I believe the best place would be in the LFS book with man-db, but it's too late for LFS-6.2 right now. Maybe in the BLFS book, or even in the Editor's Guide.
