Opened 14 years ago

Closed 14 years ago

#2722 closed task (wontfix)

Mistake on the linux console page

Reported by: splotz90
Owned by: Matthew Burgess
Priority: normal
Milestone: 6.8
Component: Book
Version: SVN
Severity: normal
Keywords:
Cc:

Description

The book says:

"In UTF-8 mode, the kernel uses the application character map for conversion of composed 8-bit key codes in the keymap to UTF-8, and thus the argument of the "-m" parameter should be set to the encoding of the composed key codes in the keymap."

That's wrong. The man page of setfont says:

"If the console is in utf8 mode (see unicode_start(1)) then the kernel expects that user program output is coded as UTF-8 (see utf-8(7)), and converts that to Unicode (ucs2). Otherwise, a translation table is used from the 8-bit program output to 16-bit Unicode values. Such a translation table is called a Unicode console map."

The console always uses Unicode internally (regardless of whether we're in Unicode mode or not): http://www.mjmwired.net/kernel/Documentation/unicode.txt

So this table is used to convert the output of programs that use "legacy charsets" to Unicode. It doesn't convert the key codes in the keymap. It also converts the input from the keyboard. Example:

program (8859-1 output) --> unicode console map (unicode output) --> font --> character on screen

and:

keyboard --> scancode --> kernel keymap (unicode output) --> unicode console map (8859-1 output) --> program

This conversion only takes place if we're NOT in Unicode mode. I've tested all these things ...


Some comments on the examples in the book:

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console

KEYMAP="de-latin1"
KEYMAP_CORRECTIONS="euro2"
FONT="lat0-16 -m 8859-15"

# End /etc/sysconfig/console
EOF

KEYMAP_CORRECTIONS="euro2" is not necessary because de-latin1 already includes euro2.map
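
One way to check this claim on an installed system is to list the "include" lines of the keymap. The following is a minimal sketch: keymap_includes is a hypothetical helper, and the keymap path is an assumption that varies with the kbd version and installation prefix.

```shell
# Hypothetical helper: list the files a keymap pulls in via "include" lines.
# Works on plain or gzip-compressed keymap files, because "gzip -dcf"
# passes uncompressed input through unchanged.
keymap_includes() {
    gzip -dcf "$1" | sed -n 's/^[[:space:]]*include[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Assumed location of the keymap; adjust for your kbd installation.
map=/lib/kbd/keymaps/i386/qwertz/de-latin1.map.gz
if [ -r "$map" ]; then
    keymap_includes "$map"
fi
```

If the euro2 map shows up in the output, the separate KEYMAP_CORRECTIONS entry is redundant for this keymap.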

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console

UNICODE="1"
KEYMAP="de-latin1"
KEYMAP_CORRECTIONS="euro2"
LEGACY_CHARSET="iso-8859-15"
FONT="LatArCyrHeb-16 -m 8859-15"

# End /etc/sysconfig/console
EOF

KEYMAP_CORRECTIONS="euro2" is not necessary because de-latin1 already includes euro2.map

FONT="LatArCyrHeb-16 -m 8859-15" --> "-m 8859-15" is not meaningful because the kernel doesn't use this translation table in unicode mode.

LEGACY_CHARSET="iso-8859-15" is not necessary because the de-latin1 keymap doesn't contain any charset specific character code values. The symbolic values will be converted into unicode in the right way (without help of this conversion). See src/ksyms.c in the kbd source.

The dumpkeys man page says:

-ccharset --charset=charset This instructs dumpkeys to interpret character code values according to the specified character set. This affects only the translation of character code values to symbolic names. Valid values for charset currently are iso-8859-X, Where X is a digit in 1-9. If no charset is specified, iso-8859-1 is used as a default. This option produces an output line 'charset "iso-8859-X"', telling loadkeys how to interpret the keymap. (For example, "division" is 0xf7 in iso-8859-1 but 0xba in iso-8859-8.)
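
The "division" example from the man page can be reproduced with iconv. This is only a sketch to illustrate why a charset matters for raw character code values: to_utf8_hex is a hypothetical helper, and it assumes an iconv that knows both ISO-8859-1 and ISO-8859-8.

```shell
# Hypothetical helper: convert one byte from a legacy charset to UTF-8
# and print the resulting bytes as hex.
# $1 = charset name, $2 = the byte value in octal
to_utf8_hex() {
    printf "\\$2" | iconv -f "$1" -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

to_utf8_hex ISO-8859-1 367; echo   # 0xf7 in iso-8859-1
to_utf8_hex ISO-8859-8 272; echo   # 0xba in iso-8859-8
```

Both calls print c3b7, the UTF-8 encoding of U+00F7 (the division sign). Symbolic names in a keymap avoid this ambiguity, which is the point made above about de-latin1.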


I suggest adding the following examples for a German LFS to the book:

For a system with programs that use "legacy charsets":

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console

KEYMAP="de-latin1"
FONT="lat0-16 -m 8859-15"

# End /etc/sysconfig/console
EOF

For a system with no "legacy software" (only UTF-8 compatible programs):

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console

UNICODE="1"
KEYMAP="de-latin1"
FONT="lat0-16"

# End /etc/sysconfig/console
EOF

It should be pointed out that the "-m $charset" option is only meaningful if "legacy software" with old charsets needs to be supported.
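
To make the rule concrete, here is a rough sketch of how a script could drop the "-m" option when UNICODE is set. setfont_args is a hypothetical helper written for this ticket; it is not what the LFS console bootscript actually does.

```shell
# Hypothetical helper: print the setfont arguments to use.
# $1 = value of UNICODE ("1" for UTF-8 mode), remaining args = words of FONT.
# In UTF-8 mode every "-m <map>" pair is dropped; otherwise FONT is unchanged.
setfont_args() {
    unicode=$1
    shift
    out=
    while [ $# -gt 0 ]; do
        if [ "$unicode" = "1" ] && [ "$1" = "-m" ]; then
            shift                       # drop "-m"
            if [ $# -gt 0 ]; then
                shift                   # drop its argument as well
            fi
            continue
        fi
        out="${out:+$out }$1"
        shift
    done
    printf '%s\n' "$out"
}

setfont_args 0 lat0-16 -m 8859-15   # prints: lat0-16 -m 8859-15
setfont_args 1 lat0-16 -m 8859-15   # prints: lat0-16
```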


Finally, some references:

http://linux.die.net/man/1/dumpkeys

http://linux.die.net/man/8/setfont

http://www.mjmwired.net/kernel/Documentation/unicode.txt

http://gunnarwrobel.de/wiki/Linux-and-the-keyboard.html

http://freeworld.thc.org/papers/writing-linux-kernel-keylogger.txt

Attachments (1)

console.txt (1.2 KB ) - added by splotz90 14 years ago.


Change History (14)

comment:1 by splotz90, 14 years ago

Correction:

I've updated my suggestion for the second example (unicode):

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console

UNICODE="1"
KEYMAP="de-latin1"
FONT="LatArCyrHeb-16"

# End /etc/sysconfig/console
EOF

And I meant "changing the examples" (not adding) ...

comment:2 by splotz90, 14 years ago

Second correction:

"It also converts the input from the keyboard."

and

"keyboard --> scancode --> kernel keymap (unicode output) --> unicode console map (8859-1 output) --> program"

isn't true.

If we're in ASCII mode (kbd_mode -a) the application will always get the 8 bit value from the keymap. Example: it will always receive E4 (hex) if I press "ä" on the keyboard. The Unicode console map doesn't influence this value.

If we're in Unicode mode (kbd_mode -u) the application will receive the UTF-8 value.

But the rest of my ticket should be true ... ;-)
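
The two byte sequences described in this comment can be shown without touching the console mode. This is a sketch only: a_umlaut_bytes is a hypothetical helper that simulates what the application would read, using iconv for the UTF-8 case.

```shell
# Hypothetical helper: print, as hex, the bytes an application would read
# when "ä" (U+00E4) is pressed.
# $1 = "ascii"   (keyboard in ASCII mode, kbd_mode -a)
#      "unicode" (keyboard in Unicode mode, kbd_mode -u)
a_umlaut_bytes() {
    case $1 in
        ascii)   printf '\344' ;;                                  # raw 8-bit value
        unicode) printf '\344' | iconv -f ISO-8859-1 -t UTF-8 ;;   # UTF-8 sequence
    esac | od -An -tx1 | tr -d ' \n'
}

a_umlaut_bytes ascii; echo     # prints: e4
a_umlaut_bytes unicode; echo   # prints: c3a4
```

On a real text console, running showkey -a after switching modes with kbd_mode should let you compare against these values.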

comment:3 by willimm, 14 years ago

Oh, and why don't we add something for us US/UK English people? Here are a couple examples:

Non Unicode:

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console for non-Unicode English systems
KEYMAP="us"
FONT="lat1-16 -m 8859-1"
# End /etc/sysconfig/console for non-Unicode English systems
EOF

In Unicode:

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console for Unicode English systems
UNICODE="1"
KEYMAP="us"
# On my (willimm) system, and Ken's systems too, FONT is sigma-general-8x16.
FONT="lat1-16"
# End /etc/sysconfig/console for Unicode English systems
EOF

For UK English users, KEYMAP is uk instead of us.

in reply to:  3 ; comment:4 by bdubbs@…, 14 years ago

Replying to willimm:

Oh, and why don't we add something for us US/UK English people?

I can't speak for a UK configuration, but for US,

"This section discusses how to configure the console and consolelog bootscripts that set up the keyboard map, console font and console kernel log level. If non-ASCII characters (e.g., the copyright sign, the British pound sign and Euro symbol) will not be used and the keyboard is a U.S. one, much of this section can be skipped."

I skip the entire section. I've never noticed anything more needed.

in reply to:  4 ; comment:5 by willimm, 14 years ago

Replying to bdubbs@…:

I can't speak for a UK configuration, but for US,

"This section discusses how to configure the console and consolelog bootscripts that set up the keyboard map, console font and console kernel log level. If non-ASCII characters (e.g., the copyright sign, the British pound sign and Euro symbol) will not be used and the keyboard is a U.S. one, much of this section can be skipped."

I skip the entire section. I've never noticed anything more needed.

Well, let's say that you want a Unicode setup. Or you need to change the console log level. Or you want to use a prettier font. In that case, then I would NOT skip the entire section.

What would work best is to get rid of the above note and provide setups for US/UK English. Then provide default implementations for US/UK english systems.

in reply to:  5 comment:6 by bdubbs@…, 14 years ago

Replying to willimm:

Well, let's say that you want a Unicode setup.

For what application?

"The /etc/sysconfig/console file only controls the Linux text console localization. It has nothing to do with setting the proper keyboard layout and terminal fonts in the X Window System, with ssh sessions or with a serial console."

Or you need to change the console log level. Or you want to use a prettier font. In that case, then I would NOT skip the entire section.

Of course. That's one reason the section is there. The vast majority of US users can skip it.

What would work best is to get rid of the above note and provide setups for US/UK English. Then provide default implementations for US/UK english systems.

That would imply it's needed when it's not.

in reply to:  2 comment:7 by splotz90, 14 years ago

Replying to splotz90:

If we're in ASCII mode (kbd_mode -a) the application will always get the 8 bit value from the keymap.

Should be: "If we're in ASCII mode (kbd_mode -a), the application will always receive the lower 8 bits from the keymap."


I think adding an example for UK could be meaningful, because this keyboard layout differs from the US layout (http://en.wikipedia.org/wiki/Keyboard_layout#United_Kingdom).

in reply to:  1 comment:8 by splotz90, 14 years ago

Replying to splotz90:

Correction:

I've updated my suggestion for the second example (unicode)

LatArCyrHeb-16 shouldn't be used for a German LFS because it doesn't contain the € symbol ...

So this is better:

cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console

UNICODE="1"
KEYMAP="de-latin1"
FONT="lat0-16"

# End /etc/sysconfig/console
EOF

comment:9 by splotz90, 14 years ago

What about an extra hint to provide more examples? That would make life easier for the user ...

Here is an example. I would create such a hint if it is considered useful.

by splotz90, 14 years ago

Attachment: console.txt added

comment:10 by Matthew Burgess, 14 years ago

Owner: changed from lfs-book@… to Matthew Burgess
Status: changed from new to assigned

comment:11 by Matthew Burgess, 14 years ago

I've started a discussion for this ticket at http://www.linuxfromscratch.org/pipermail/lfs-dev/2010-August/064239.html. If you wouldn't mind, splotz90, could you follow up there please?

comment:12 by bdubbs@…, 14 years ago

Milestone: changed from 6.7 to 6.8

Moving to 6.8 for now, but we may get it in before 6.7.

comment:13 by Matthew Burgess, 14 years ago

Resolution: wontfix
Status: changed from assigned to closed

There didn't seem to be much input/consensus in the thread referenced in comment 11, so I'm closing as wontfix.
