Opened 14 years ago
Closed 14 years ago
#2722 closed task (wontfix)
Mistake on the linux console page
Reported by: | splotz90 | Owned by: | Matthew Burgess |
---|---|---|---|
Priority: | normal | Milestone: | 6.8 |
Component: | Book | Version: | SVN |
Severity: | normal | Keywords: | |
Cc: |
Description
The book says:
"In UTF-8 mode, the kernel uses the application character map for conversion of composed 8-bit key codes in the keymap to UTF-8, and thus the argument of the "-m" parameter should be set to the encoding of the composed key codes in the keymap."
That's wrong. The man page of setfont says:
"If the console is in utf8 mode (see unicode_start(1)) then the kernel expects that user program output is coded as UTF-8 (see utf-8(7)), and converts that to Unicode (ucs2). Otherwise, a translation table is used from the 8-bit program output to 16-bit Unicode values. Such a translation table is called a Unicode console map."
Internally, the console always uses Unicode (regardless of whether we're in Unicode mode or not): http://www.mjmwired.net/kernel/Documentation/unicode.txt
So this table is used to convert the output of programs that use "legacy charsets" to Unicode. It doesn't convert the key codes in the keymap. It also converts the input from the keyboard. Example:
program (8859-1 output) --> unicode console map (unicode output) --> font --> character on screen
and:
keyboard --> scancode --> kernel keymap (unicode output) --> unicode console map (8859-1 output) --> program
This conversion only takes place if we're NOT in Unicode mode. I've tested all these things ...
Some comments on the examples in the book:
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console
KEYMAP="de-latin1"
KEYMAP_CORRECTIONS="euro2"
FONT="lat0-16 -m 8859-15"
# End /etc/sysconfig/console
EOF
KEYMAP_CORRECTIONS="euro2" is not necessary because de-latin1 already includes euro2.map
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console
UNICODE="1"
KEYMAP="de-latin1"
KEYMAP_CORRECTIONS="euro2"
LEGACY_CHARSET="iso-8859-15"
FONT="LatArCyrHeb-16 -m 8859-15"
# End /etc/sysconfig/console
EOF
KEYMAP_CORRECTIONS="euro2" is not necessary because de-latin1 already includes euro2.map
FONT="LatArCyrHeb-16 -m 8859-15" --> "-m 8859-15" is not meaningful because the kernel doesn't use this translation table in unicode mode.
LEGACY_CHARSET="iso-8859-15" is not necessary because the de-latin1 keymap doesn't contain any charset specific character code values. The symbolic values will be converted into unicode in the right way (without help of this conversion). See src/ksyms.c in the kbd source.
The dumpkeys man page says:
-ccharset --charset=charset
This instructs dumpkeys to interpret character code values according to the specified character set. This affects only the translation of character code values to symbolic names. Valid values for charset currently are iso-8859-X, Where X is a digit in 1-9. If no charset is specified, iso-8859-1 is used as a default. This option produces an output line 'charset "iso-8859-X"', telling loadkeys how to interpret the keymap. (For example, "division" is 0xf7 in iso-8859-1 but 0xba in iso-8859-8.)
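To make the "division" example from the man page concrete, here is a small sketch. It assumes an iconv that knows both ISO-8859-1 and ISO-8859-8; the helper name to_utf8_hex is made up for illustration and is not part of kbd.

```shell
# Sketch only: shows that the same symbol ("division", U+00F7) sits at
# different byte values in different legacy charsets.
# to_utf8_hex is a hypothetical helper, not a kbd tool.
to_utf8_hex() {
    # $1 = source charset, $2 = byte value in octal
    printf "\\$2" | iconv -f "$1" -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

to_utf8_hex ISO-8859-1 367; echo   # octal 367 = 0xf7 in iso-8859-1
to_utf8_hex ISO-8859-8 272; echo   # octal 272 = 0xba in iso-8859-8
```

Both calls should yield the same UTF-8 bytes, which is why a keymap with numeric character codes needs a charset declaration while one with symbolic names does not.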
I suggest adding the following examples for a German LFS to the book:
For a system with programs that use "legacy charsets":
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console
KEYMAP="de-latin1"
FONT="lat0-16 -m 8859-15"
# End /etc/sysconfig/console
EOF
For a system with no "legacy software" (only UTF-8 compatible programs):
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console
UNICODE="1"
KEYMAP="de-latin1"
FONT="lat0-16"
# End /etc/sysconfig/console
EOF
It should be pointed out that the "-m $charset" option is only meaningful if "legacy software" with old charsets needs to be supported.
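The rule above can be sketched as a tiny generator that only emits "-m $charset" for legacy setups. This is an illustration, not something from the LFS bootscripts: the function name gen_console_config and its argument order are assumptions, and it prints to stdout instead of writing /etc/sysconfig/console.

```shell
# Hypothetical helper (not part of LFS): prints a console config and adds
# "-m <charset>" to FONT only when Unicode mode is off.
gen_console_config() {
    unicode=${1:-0}    # "1" for a UTF-8 only system, anything else for legacy
    keymap=${2:-}      # e.g. de-latin1
    font=${3:-}        # e.g. lat0-16
    charset=${4:-}     # e.g. 8859-15, only used for legacy setups
    echo "# Begin /etc/sysconfig/console"
    if [ "$unicode" = "1" ]; then
        echo 'UNICODE="1"'
        echo "KEYMAP=\"$keymap\""
        echo "FONT=\"$font\""
    else
        echo "KEYMAP=\"$keymap\""
        echo "FONT=\"$font -m $charset\""
    fi
    echo "# End /etc/sysconfig/console"
}

gen_console_config 0 de-latin1 lat0-16 8859-15   # legacy charsets
gen_console_config 1 de-latin1 lat0-16           # UTF-8 only
```

The two calls reproduce the two suggested examples, so the only difference between them is the UNICODE line and the "-m" argument.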
Finally, some references:
http://linux.die.net/man/1/dumpkeys
http://linux.die.net/man/8/setfont
http://www.mjmwired.net/kernel/Documentation/unicode.txt
http://gunnarwrobel.de/wiki/Linux-and-the-keyboard.html
http://freeworld.thc.org/papers/writing-linux-kernel-keylogger.txt
Attachments (1)
Change History (14)
follow-up: 8 comment:1 by , 14 years ago
follow-up: 7 comment:2 by , 14 years ago
Second correction:
"It also converts the input from the keyboard."
and
"keyboard --> scancode --> kernel keymap (unicode output) --> unicode console map (8859-1 output) --> program"
isn't true.
If we're in ASCII mode (kbd_mode -a), the application will always get the 8-bit value from the keymap. Example: it will always receive E4 (hex) if I press "ä" on the keyboard. The Unicode console map doesn't influence this value.
If we're in Unicode mode (kbd_mode -u), the application will receive the UTF-8 value.
But the rest of my ticket should be true ... ;-)
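As a quick illustration of the two cases, this sketch prints the bytes an application would read for "ä" (U+00E4) in each mode. It does not touch the console or call kbd_mode; the function name and its "ascii"/"unicode" arguments are made up for the example, and the byte values are simply the ISO-8859-1 and UTF-8 encodings of that character.

```shell
# Sketch only: emits the byte sequence for "ä" (U+00E4) as hex.
# bytes_for_a_umlaut is a hypothetical name, not a kbd tool.
bytes_for_a_umlaut() {
    case "$1" in
        ascii)   printf '\344' ;;       # kbd_mode -a: single byte 0xE4
        unicode) printf '\303\244' ;;   # kbd_mode -u: UTF-8 bytes 0xC3 0xA4
    esac | od -An -tx1 | tr -d ' \n'
}

bytes_for_a_umlaut ascii; echo
bytes_for_a_umlaut unicode; echo
```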
follow-up: 4 comment:3 by , 14 years ago
Oh, and why don't we add something for us US/UK English people? Here are a couple examples:
Non Unicode:
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console for non-Unicode English systems
KEYMAP="us"
FONT="lat1-16 -m 8859-1"
# End /etc/sysconfig/console for non-Unicode English systems
EOF
In Unicode:
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console for Unicode English systems
UNICODE="1"
KEYMAP="us"
# On my (willimm) system, and Ken's systems too, FONT is sigma-general-8x16.
FONT="lat1-16"
# End /etc/sysconfig/console for Unicode English systems
EOF
For UK English users, KEYMAP is "uk" instead of "us".
follow-up: 5 comment:4 by , 14 years ago
Replying to willimm:
Oh, and why don't we add something for us US/UK English people?
I can't speak for a UK configuration, but for US,
"This section discusses how to configure the console and consolelog bootscripts that set up the keyboard map, console font and console kernel log level. If non-ASCII characters (e.g., the copyright sign, the British pound sign and Euro symbol) will not be used and the keyboard is a U.S. one, much of this section can be skipped."
I skip the entire section. I've never noticed anything more needed.
follow-up: 6 comment:5 by , 14 years ago
Replying to bdubbs@…:
I can't speak for a UK configuration, but for US,
"This section discusses how to configure the console and consolelog bootscripts that set up the keyboard map, console font and console kernel log level. If non-ASCII characters (e.g., the copyright sign, the British pound sign and Euro symbol) will not be used and the keyboard is a U.S. one, much of this section can be skipped."
I skip the entire section. I've never noticed anything more needed.
Well, let's say that you want a Unicode setup. Or you need to change the console log level. Or you want to use a prettier font. In that case, then I would NOT skip the entire section.
What would work best is to get rid of the above note and provide setups for US/UK English. Then provide default implementations for US/UK English systems.
comment:6 by , 14 years ago
Replying to willimm:
Well, let's say that you want a Unicode setup.
For what application?
"The /etc/sysconfig/console file only controls the Linux text console localization. It has nothing to do with setting the proper keyboard layout and terminal fonts in the X Window System, with ssh sessions or with a serial console."
Or you need to change the console log level. Or you want to use a prettier font. In that case, then I would NOT skip the entire section.
Of course. That's one reason the section is there. The vast majority of US users can skip it.
What would work best is to get rid of the above note and provide setups for US/UK English. Then provide default implementations for US/UK English systems.
That would imply it's needed when it's not.
comment:7 by , 14 years ago
Replying to splotz90:
If we're in ASCII mode (kbd_mode -a) the application will always get the 8-bit value from the keymap.
Should be: "If we're in ASCII mode (kbd_mode -a), the application will always receive the lower 8 bits of the keymap value."
I think adding an example for UK could be meaningful, because this keyboard layout differs from the US layout (http://en.wikipedia.org/wiki/Keyboard_layout#United_Kingdom).
comment:8 by , 14 years ago
Replying to splotz90:
Correction:
I've updated my suggestion for the second example (unicode)
LatArCyrHeb-16 shouldn't be used for a German LFS because it doesn't contain the € symbol ...
So this is better:
cat > /etc/sysconfig/console << "EOF"
# Begin /etc/sysconfig/console
UNICODE="1"
KEYMAP="de-latin1"
FONT="lat0-16"
# End /etc/sysconfig/console
EOF
comment:9 by , 14 years ago
What about an extra hint to provide more examples? That would make life easier for the user ...
Here is an example. I would write such a hint if it is considered worthwhile.
by , 14 years ago
Attachment: | console.txt added |
---|
comment:10 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:11 by , 14 years ago
I've started a discussion for this ticket at http://www.linuxfromscratch.org/pipermail/lfs-dev/2010-August/064239.html. If you wouldn't mind, splotz90, could you follow up there please?
comment:12 by , 14 years ago
Milestone: | 6.7 → 6.8 |
---|
Moving to 6.8 for now, but we may get it in before 6.7.
comment:13 by , 14 years ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
There didn't seem to be much input/consensus in the thread referenced in comment 11, so I'm closing as wontfix.
Correction:
I've updated my suggestion for the second example (unicode):
And I meant "changing the examples" (not adding) ...