#4624 closed enhancement (duplicate)
Defects found through ICA
Reported by: | Pierre Labastie | Owned by: | Pierre Labastie |
---|---|---|---|
Priority: | normal | Milestone: | 10.0 |
Component: | Book | Version: | SVN |
Severity: | normal | Keywords: | |
Cc: |
Description
ICA (Iterative Comparison Analysis) compares several builds of LFS, each build being done with the system produced by the previous build. If some dependency or action was missing on the first build and available on the following ones, it shows up immediately.
It may not be necessary for LFS to rebuild itself completely identically, but if the second build is too different from the first, it may be an indication of incorrectness.
Leaving priority and severity at normal. They will be adjusted depending on what is found.
Note that what we have allows building the whole of BLFS, so it is unlikely that priority and severity will move in the "high" direction.
Change History (26)
comment:1 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:2 by , 5 years ago
ranlib and ar (binutils) are different between the first build and the second
Using objdump -x shows that libfl.so (from flex) is added to the dependencies for the second build. This is to be expected since flex is not built in chapter 5, and is built long after binutils in chapter 6.
Solution build flex in chapter 5 and link /usr/lib/libfl.so* to /tools/lib in "Creating Essential Files and Symlinks".
Possible side effect Several other packages may benefit from having flex executable to generate some files. But it does not show up in ICA.
Question it is not clear how the fl library is used in ar and ranlib. So is it useful?
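A minimal sketch of the comparison described above, assuming the same program from both builds is available side by side. The helper names `needed_libs` and `new_needed` and the paths in the usage comment are hypothetical, not part of ICA; the helpers only filter the NEEDED lines that `objdump -x` prints in the dynamic section.

```shell
# Print the library names from the NEEDED lines of `objdump -x` output
# read on stdin, sorted so that two lists can be compared with comm.
needed_libs() {
    awk '$1 == "NEEDED" { print $2 }' | sort -u
}

# Hypothetical helper: print the libraries needed by the second build only.
# $1 = program from the first build, $2 = same program from the second build.
new_needed() {
    nn_first=$(mktemp) && nn_second=$(mktemp) || return 1
    objdump -x "$1" | needed_libs > "$nn_first"
    objdump -x "$2" | needed_libs > "$nn_second"
    comm -13 "$nn_first" "$nn_second"
    rm -f "$nn_first" "$nn_second"
}

# Usage (paths are assumptions):
#   new_needed /mnt/build1/usr/bin/ar /mnt/build2/usr/bin/ar
```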
comment:3 by , 5 years ago
Some shadow programs incorrectly reference /bin/passwd after first build
login, expiry, and su hardcode /bin/passwd instead of /usr/bin/passwd. This may make it impossible to renew a password once it has expired.
Explanation Shadow's configure tests whether /usr/bin/passwd exists, and if not, hardcodes /bin/passwd. This is close to a bug: it should hardcode $prefix/bin/passwd. Anyway...
Solution touch /usr/bin/passwd before building shadow.
Possible side effect None that I can think of: the empty file is almost immediately replaced when installing shadow.
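As a sketch of the workaround, assuming it is run inside the chroot just before shadow's configure: the helper names `ensure_placeholder` and `passwd_paths` are hypothetical. The second one only filters a list of strings (for example the output of `strings` on one of the affected programs) to show which passwd path ended up in the binary.

```shell
# Create an empty placeholder only if nothing exists at that path yet,
# so an already installed program is left alone.
ensure_placeholder() {
    [ -e "$1" ] || : > "$1"
}

# Keep only the lines of stdin that mention a passwd path.
passwd_paths() {
    grep -F '/bin/passwd'
}

# Usage (assumed to run in the chroot, before building shadow):
#   ensure_placeholder /usr/bin/passwd
# After the build, check what was hardcoded:
#   strings /bin/su | passwd_paths
```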
comment:4 by , 5 years ago
new{g,u}idmap (shadow) differ between first and second pass
Using objdump -T, it is found that on the second pass, those programs may call three more functions: prctl, seteuid, and capset. capset is from libcap, so it is not linked on the first build because libcap is built after shadow.
Solution Move libcap before shadow. Actually, it is logical to have attr, acl and libcap grouped.
Possible side effect None yet.
Question capset is more likely to be used with pam or systemd. In both cases, we rebuild shadow in BLFS, so this problem (and actually the one described in comment:3) disappears then. So, is it useful to cure this? OTOH, it seems innocuous.
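The symbol comparison can be sketched as follows, assuming the usual `objdump -T` layout in which imported symbols are marked `*UND*` and the symbol name is the last field. `undefined_syms`, the file names and the paths in the usage comment are hypothetical.

```shell
# Print the names of undefined (imported) dynamic symbols from
# `objdump -T` output read on stdin; the name is the last field.
undefined_syms() {
    awk '/\*UND\*/ { print $NF }' | sort -u
}

# Usage (paths are assumptions):
#   objdump -T /mnt/build1/usr/bin/newuidmap | undefined_syms > pass1.syms
#   objdump -T /mnt/build2/usr/bin/newuidmap | undefined_syms > pass2.syms
#   comm -13 pass1.syms pass2.syms   # symbols imported on second pass only
```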
follow-up: 10 comment:5 by , 5 years ago
bison differs between first and second build
Using objdump -x shows that the second build is linked to libtextstyle.so (gettext)
Solution Move gettext before bison
Possible side effect none found yet
Question What does bison use this for? Actually, it seems it is to add the possibility of translated messages in bison sources. Is it useful? OTOH, moving gettext to be built early does not seem to have adverse effects.
comment:6 by , 5 years ago
readline.pc file references termcap on first build
It has Requires.private: termcap, while it is Requires.private: ncurses on second build. This is because ncurses is not found by configure on first build (nothing references /tools/lib when running configure).
Solution Adding --with-curses to configure options is enough.
Possible side effect None.
Question Are there programs using Requires.private in pkgconfig files? OTOH, minor change, that does not impact anything else.
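A small sketch for checking the field on each pass; `requires_private` is a hypothetical helper name and the path in the usage comment is an assumption. As far as I know, pkg-config consults Requires.private when asked for static linking flags (`--static`), so the field is not purely cosmetic.

```shell
# Print the value of the Requires.private field of a pkg-config file.
requires_private() {
    sed -n 's/^Requires\.private:[[:space:]]*//p' "$1"
}

# Usage (path is an assumption):
#   requires_private /usr/lib/pkgconfig/readline.pc
```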
follow-up: 16 comment:7 by , 5 years ago
.la files change the libraries that are linked into programs
Because of a bug in ICA (now fixed), .la files were deleted after the first pass, but not after the second. This led to various programs being linked differently between the 2nd and 3rd pass, and also to some RPATH entries being added on the 3rd pass.
Solution .la files are evil! Strongly suggest deleting them, but "your distro, your rules".
Possible side effect Several if .la files are not removed (especially because of the mixture between meson and libtool).
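A minimal sketch of the removal, assuming it is acceptable to delete every libtool archive below a directory. `remove_la_files` is a hypothetical helper name, and the sketch does not try to preserve any .la file that a package might load at run time.

```shell
# Delete every libtool archive (*.la) below the given directory.
remove_la_files() {
    find "$1" -type f -name '*.la' -exec rm -f {} +
}

# Usage (directory is an assumption), after installing a package:
#   remove_la_files /usr/lib
```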
comment:8 by , 5 years ago
circular dependency between eudev and util-linux
util-linux uses libudev (eudev) in lsblk (with a fallback to libblkid from util-linux if libudev has not been linked). And udev uses libblkid to find information about some block devices (especially cdrom_id, ata_id, etc). Note that this is well known and cured in the systemd book.
Solution Build util-linux in chapter 5 for sysv book, as for the systemd book (just remove revision='systemd' ), move eudev before util-linux in chapter 6, and link /tools/lib/pkgconfig/blkid.pc to /usr/lib/pkgconfig before building eudev. Also use LD_LIBRARY_PATH=/tools/lib when running udevadm hwdb --update.
Possible side effect None found yet, except it adds one build in chapter 5.
Question What is the improvement? lsblk gives more detailed information if built against libudev, depending on hardware.
follow-up: 17 comment:9 by , 5 years ago
gcc compilers (c++, gcc, cc1, cc1plus, ...) differ between first and second build
Using objdump -d, it is found that the assembly code differs between first and second build (different use of registers, use of different opcodes, ...). I suspect it has something to do with LTO, which would be applied on second pass and not on first, but it's highly speculative.
Solution None yet... (any idea welcome!)
follow-up: 11 comment:10 by , 5 years ago
Replying to pierre.labastie:
bison differs between first and second build
Using objdump -x shows that the second build is linked to libtextstyle.so (gettext)
Solution Move gettext before bison
Possible side effect none found yet
Well, there's one: gettext hardcodes bison's localedir as /tools/share/locale. Reason: gettext's configure uses bison --print-localedir, which points to /tools/share/locale since bison has not been rebuilt yet.
Will try make BISON_LOCALEDIR=/usr/share/locale
comment:11 by , 5 years ago
Replying to pierre.labastie:
Replying to pierre.labastie:
bison differs between first and second build
Using objdump -x shows that the second build is linked to libtextstyle.so (gettext)
Solution Move gettext before bison
Possible side effect none found yet
Well, there's one: gettext hardcodes bison's localedir as /tools/share/locale. Reason: gettext's configure uses bison --print-localedir, which points to /tools/share/locale since bison has not been rebuilt yet.
Will try make BISON_LOCALEDIR=/usr/share/locale
Seems to do the trick
follow-up: 15 comment:12 by , 5 years ago
Several files in the Python dir differ between 1st and 2nd pass
Namely: /usr/lib/python3.8/pyconfig.h has new defines related to UUID and /usr/lib/python3.8/_sysconfigdata__linux_x86_64-linux-gnu.py has two more entries related to UUID too. Furthermore there is a new file on 2nd pass: /usr/lib/python3.8/lib-dynload/_uuid.cpython-38-x86_64-linux-gnu.so.
Solution (in progress) It looks like configure (and then setup.py, when building the associated module) checks for the header file <uuid/uuid.h>. Since I now have util-linux in chapter 5, I'll just try to symlink /tools/include/uuid to /usr/include during the python build (removing it at the end). But maybe it also needs libuuid.h in /usr/lib, don't know.
comment:13 by , 5 years ago
several files just differ in hardcoded compile time
For the record: vim, libcrypto.so.1.1, libgdbm.so.6.0.0, /usr/lib/perl5/<version>/<machine>-thread-multi/CORE/libperl.so, and libpython3.8.so.1.0
comment:14 by , 5 years ago
Libraries that are separated from their dbg symbols differ
Just for the record... They just differ in the vicinity of the string containing the name of the .dbg file (not sure why). Seems harmless.
comment:15 by , 5 years ago
Replying to pierre.labastie:
Several files in the Python dir differ between 1st and 2nd pass Solution Looks like configure checks for the header file (and then setup.py when building the associated module) <uuid/uuid.h>. Since I now have util-linux in chapter 5, I'll just try to symlink /tools/include/uuid to /usr/include during python build (removing at the end). But maybe it also needs libuuid.h in /usr/lib, don't know.
Yes, the library (libuuid.so*, not libuuid.h, me stupid) is needed, otherwise the module is built but withdrawn, because it cannot be imported.
comment:16 by , 5 years ago
Replying to pierre.labastie:
.la files change the libraries that are linked into programs
Because of a bug in ICA (now fixed), .la files were deleted after the first pass, but not after the second. This led to various programs being linked differently between the 2nd and 3rd pass, and also to some RPATH entries being added on the 3rd pass.
Solution .la files are evil! Strongly suggest deleting them, but "your distro, your rules".
Possible side effect Several if .la files are not removed (especially because of the mixture between meson and libtool).
.la files are really evil. In BLFS, libxml2.la makes everything that depends on libxml2.so.2 also depend on libicu*.so.${version}. And when icu is upgraded (its soname often changes), tons of packages have to be rebuilt. Now I remove those .la files after installing any package.
follow-ups: 18 19 comment:17 by , 5 years ago
Replying to pierre.labastie:
gcc compilers (c++, gcc, cc1, cc1plus, ...) differ between first and second build
Using objdump -d, it is found that the assembly code differs between first and second build (different use of registers, use of different opcodes, ...). I suspect it has something to do with LTO, which would be applied on second pass and not on first, but it's highly speculative.
Solution None yet... (any idea welcome!)
Possible solution
Do a 3-stage bootstrap for GCC (removing --disable-bootstrap).
comment:19 by , 5 years ago
Replying to xry111:
Replying to pierre.labastie:
gcc compilers (c++, gcc, cc1, cc1plus, ...) differ between first and second build
Using objdump -d, it is found that the assembly code differs between first and second build (different use of registers, use of different opcodes, ...). I suspect it has something to do with LTO, which would be applied on second pass and not on first, but it's highly speculative.
Solution None yet... (any idea welcome!)
Possible solution
Do a 3-stage bootstrap for GCC (removing --disable-bootstrap).
Or build BLFS GCC as soon as possible (with or without bootstrap). But I think we'd rather avoid bootstrapping. Building gcc three times is already rather time consuming. Thinking more about this: if nothing else, try a 2-stage bootstrap. Maybe it is enough. Those are just random thoughts though. I'm building a 2-stage ICA saving build directories of gcc. Hopefully, something should emerge from the comparison of the build directories.
comment:20 by , 5 years ago
All systemd binaries differ between first and second pass
objdump -s shows that there is a string containing /tools/lib in .dynstr on first build and not on the others. Don't know much more yet.
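A sketch for locating the affected files, with one substitution stated plainly: it uses `readelf -p .dynstr` rather than `objdump -s`, because the hex dump of objdump -s can split a string across lines. The helper names and the paths in the usage comment are assumptions.

```shell
# Keep only the lines of a string dump (stdin) that mention /tools.
tools_refs() {
    grep -F '/tools'
}

# Hypothetical helper: list files below a directory whose .dynstr
# section mentions /tools. Non-ELF files are silently skipped.
list_tools_refs() {
    find "$1" -type f | while IFS= read -r f; do
        if readelf -p .dynstr "$f" 2>/dev/null | tools_refs > /dev/null; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage (paths are assumptions):
#   list_tools_refs /lib/systemd
#   readelf -d /lib/systemd/systemd | grep -E 'RPATH|RUNPATH'
```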
comment:21 by , 5 years ago
I've made little progress on the gcc problem. The only thing is that it occurs just in a few .o files. So, not an lto problem, and very unlikely to be a general optimization problem (otherwise almost all .o files would be affected). I'll try to see if those files are compiled with special compilation options.
comment:22 by , 5 years ago
I've been able to set up an example for the gcc differences, using one of the affected files, which is spellcheck.o. On a finished lfs, cd /sources and unpack gcc, cd to the unpacked tree, then:
    mkdir build1 && cd build1
    CC=/tools/bin/gcc \
    CXX=/tools/bin/g++ \
    ../configure --prefix=/usr \
        --enable-languages=c,c++ \
        --disable-multilib \
        --disable-bootstrap \
        --with-system-zlib
    make -j4 all-build
    make -j4 configure-host
    cd gcc
    make -j4 c-family/c-spellcheck.o
    cd ../..
    mkdir build2 && cd build2
    ../configure --prefix=/usr #... then the same thing again in build2 without setting CC and CXX
On my machine the produced .o files are different, as can be seen by diffing the output of objdump -d.
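The diff can be sketched like this, assuming the two build directories from the steps above. The helper names are hypothetical. The "file format" line is dropped because it contains the file name, which always differs between build1 and build2.

```shell
# Drop the header line of `objdump -d` output (stdin) that names the file.
strip_objdump_header() {
    grep -v 'file format'
}

# Hypothetical helper: succeed if two object files disassemble identically.
same_disassembly() {
    sd_a=$(mktemp) && sd_b=$(mktemp) || return 1
    objdump -d "$1" | strip_objdump_header > "$sd_a"
    objdump -d "$2" | strip_objdump_header > "$sd_b"
    if diff "$sd_a" "$sd_b" > /dev/null; then sd_status=0; else sd_status=1; fi
    rm -f "$sd_a" "$sd_b"
    return "$sd_status"
}

# Usage, from the unpacked gcc tree:
#   same_disassembly build1/gcc/c-family/c-spellcheck.o \
#                    build2/gcc/c-family/c-spellcheck.o || echo "objects differ"
```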
comment:23 by , 5 years ago
The problem is the following: the header files which are included, are not the same in each case.
The reason is that the search directory order is not the same, because for /tools/bin/g++, /usr/include is first, while for g++, it comes after /usr/include/c++/9.3.0. And somehow, with this different order, one file is included in the second case, which is not included in the first, containing a lot of use instructions. I guess this is the reason why the generated code is not the same (some different types used).
Solution Use -idirafter instead of -isystem in the specs file. That puts /usr/include at the end of the search. This solves the above problem, and headers for programs which are only in chapter 6 can be found in /usr/include. For those which are in chapter 5 too, /tools/include will be used instead. But this is not a problem, since those headers are the same.
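To look at the search order described above, one possible sketch is to filter the verbose output of each compiler. `include_dirs` is a hypothetical helper name; it relies on the "search starts here" and "End of search list." markers that gcc prints with -v.

```shell
# Print the <...> include search directories, in order, from the
# verbose output of gcc/g++ read on stdin.
include_dirs() {
    awk '/^End of search list/ { inlist = 0 }
         inlist { sub(/^ +/, ""); print }
         /^#include <\.\.\.> search starts here:/ { inlist = 1 }'
}

# Usage: compare the order for the two compilers (-v writes to stderr).
#   /tools/bin/g++ -x c++ -E -v /dev/null 2>&1 >/dev/null | include_dirs
#   g++ -x c++ -E -v /dev/null 2>&1 >/dev/null | include_dirs
```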
comment:24 by , 5 years ago
The -idirafter solution is not totally satisfactory: I happen to have built on two different distros, and one of them generated gcc-pass2 with intl support from glibc, while the other did not.
But with -idirafter, the glibc headers from /tools are used instead of those from /usr. Those in /usr do have intl support, but not those in /tools. So some programs are slightly different between first pass (using /tools) and second pass (using /usr).
comment:25 by , 5 years ago
Resolution: | → duplicate |
---|---|
Status: | assigned → closed |
Each of the comments in this ticket signals a problem identified with ICA. For each, an explanation is given, sometimes tentatively, and a solution is proposed. Each solution is open to discussion, so please comment!