Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#4624 closed enhancement (duplicate)

Defects found through ICA

Reported by: Pierre Labastie Owned by: Pierre Labastie
Priority: normal Milestone: 10.0
Component: Book Version: SVN
Severity: normal Keywords:
Cc:

Description

ICA (iterative comparison analysis) compares several successive builds of LFS, each one done with the system produced by the previous build. If some dependency or action was missing on the first build and is available on the following ones, it shows up immediately.

It may not be necessary for LFS to rebuild itself completely identically, but if the second build is too different from the first, it may be an indication of incorrectness.

Leaving priority and severity to normal. It will be adjusted depending on what is found.

Note that what we have already allows building the whole of BLFS, so priority and severity are unlikely to move toward "high".
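The comparison step described above can be sketched roughly as follows. This is not the actual ICA tooling: ica_compare is a hypothetical helper, and both arguments are assumed to be copies of the installed file tree saved after two successive builds.

```shell
# Minimal sketch of one ICA comparison step (not the real ICA scripts).
# ica_compare is a hypothetical name; $1 and $2 are assumed to be saved
# copies of the file tree from two successive builds.
ica_compare() {
    pass1=$1
    pass2=$2
    # -r: recurse into subdirectories
    # -q: only report which files differ or exist on one side only
    # diff exits with status 1 when it finds differences; that is the
    # expected case here, so it is not treated as an error.
    diff -rq "$pass1" "$pass2" || true
}
```

Each line of output names a file that deserves a closer look, for example with objdump.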

Change History (26)

comment:1 by Pierre Labastie, 5 years ago

Owner: changed from lfs-book to Pierre Labastie
Status: new → assigned

Each of the following comments signals a problem identified with ICA. For each, an explanation is given, sometimes tentatively, and a solution is proposed. Each solution is open to discussion, so please comment!

comment:2 by Pierre Labastie, 5 years ago

ranlib and ar (binutils) differ between the first build and the second

Using objdump -x shows that libfl.so (from flex) is added to the dependencies in the second build. This is to be expected, since flex is not built in chapter 5 and is built well after binutils in chapter 6.

Solution: Build flex in chapter 5 and link /usr/lib/libfl.so* to /tools/lib in "Creating Essential Files and Symlinks".

Possible side effect: Several other packages may benefit from having the flex executable available to generate some files, but this does not show up in ICA.

Question: It is not clear how the fl library is used in ar and ranlib. So is it useful?
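For reference, the kind of check used here can be sketched as below. needed_libs is a hypothetical helper, and the pass1/ and pass2/ paths in the usage comment are assumed locations of the two saved builds.

```shell
# needed_libs (hypothetical name) reads `objdump -x` output on stdin and
# prints the shared libraries listed as NEEDED in the dynamic section.
needed_libs() {
    awk '$1 == "NEEDED" { print $2 }' | sort
}

# Assumed usage, with pass1/ and pass2/ holding the two builds:
#   objdump -x pass1/usr/bin/ar | needed_libs > ar.pass1
#   objdump -x pass2/usr/bin/ar | needed_libs > ar.pass2
#   diff ar.pass1 ar.pass2
```

A library that appears only in the second list points at a package that was not yet available during the first build.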

comment:3 by Pierre Labastie, 5 years ago

Some shadow programs incorrectly reference /bin/passwd after first build

login, expiry, and su hardcode /bin/passwd instead of /usr/bin/passwd. This may make it impossible to renew an expired password.

Explanation: Shadow's configure tests whether /usr/bin/passwd exists and, if not, hardcodes /bin/passwd. This is close to a bug: it should hardcode $prefix/bin/passwd. Anyway...

Solution: touch /usr/bin/passwd before building shadow.

Possible side effect: None that I can think of: the empty file is almost immediately replaced when installing shadow.
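A minimal sketch of that workaround follows. prepare_passwd_stub is a hypothetical helper, and its optional root argument exists only so the step can be tried on a scratch directory; inside the chroot it would be called without arguments, just before running shadow's configure.

```shell
# prepare_passwd_stub (hypothetical name): make sure <root>/usr/bin/passwd
# exists so that shadow's configure finds it. With no argument, root is
# empty and the real /usr/bin is used.
prepare_passwd_stub() {
    root=${1:-}
    mkdir -p "$root/usr/bin"
    # Create an empty placeholder only if passwd is not installed yet,
    # so an existing binary is left untouched.
    [ -e "$root/usr/bin/passwd" ] || touch "$root/usr/bin/passwd"
}
```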

comment:4 by Pierre Labastie, 5 years ago

new{g,u}idmap (shadow) differ between first and second pass

Using objdump -T, it is found that on the second pass those programs may call three more functions: prctl, seteuid, and capset. capset is from libcap, so it is not linked on the first build because libcap is built after shadow.

Solution: Move libcap before shadow. Actually, it is logical to have attr, acl and libcap grouped.

Possible side effect: None yet.

Question: capset is more likely to be used with pam or systemd. In both cases, we rebuild shadow in BLFS, so this problem (and actually the one described in comment:3) disappears then. So, is it useful to cure this? OTOH, it seems innocuous.
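The symbol comparison above can be approximated as follows. Both helper names are hypothetical, and the sketch assumes the usual objdump -T layout in which undefined symbols are marked *UND* and the symbol name is the last field on the line.

```shell
# undefined_syms (hypothetical name) reads `objdump -T` output on stdin and
# prints the names of undefined dynamic symbols, one per line, sorted.
undefined_syms() {
    awk '/\*UND\*/ { print $NF }' | sort -u
}

# new_syms (hypothetical name) prints the symbols that appear only in the
# second of two sorted lists produced by undefined_syms.
new_syms() {
    comm -13 "$1" "$2"
}

# Assumed usage, with pass1/ and pass2/ holding the two builds:
#   objdump -T pass1/usr/bin/newuidmap | undefined_syms > syms.pass1
#   objdump -T pass2/usr/bin/newuidmap | undefined_syms > syms.pass2
#   new_syms syms.pass1 syms.pass2
```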

comment:5 by Pierre Labastie, 5 years ago

bison differs between first and second build

Using objdump -x shows that the second build is linked to libtextstyle.so (gettext).

Solution: Move gettext before bison.

Possible side effect: None found yet.

Question: What does bison use this for? Actually, it seems it is to add the possibility of translated messages in bison sources. Is it useful? OTOH, moving gettext to be built early does not seem to have adverse effects.

comment:6 by Pierre Labastie, 5 years ago

readline.pc file references termcap on first build

It has Requires.private: termcap, while it is Requires.private: ncurses on second build. This is because ncurses is not found by configure on first build (nothing references /tools/lib when running configure).

Solution: Adding --with-curses to the configure options is enough.

Possible side effect: None.

Question: Are there programs using Requires.private in pkgconfig files? OTOH, this is a minor change that does not impact anything else.
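A quick way to check which value ended up in the installed file might look like this. pc_requires_private is a hypothetical helper, and the readline.pc location in the usage comment is an assumption.

```shell
# pc_requires_private (hypothetical name) prints the value of the
# Requires.private field of a pkg-config file, or nothing if it is absent.
pc_requires_private() {
    sed -n 's/^Requires\.private:[[:space:]]*//p' "$1"
}

# Assumed usage (the path may differ on a given system):
#   pc_requires_private /usr/lib/pkgconfig/readline.pc
```

According to the observation above, this would print termcap after the first build and ncurses after the second.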

comment:7 by Pierre Labastie, 5 years ago

.la files change the libraries that are linked into programs

Because of a bug in ICA (now fixed), .la files were deleted after the first pass, but not after the second. This led to various programs being linked differently between the 2nd and 3rd passes, and also to some RPATHs being added on the 3rd pass.

Solution: .la files are evil! I strongly suggest deleting them, but "your distro, your rules".

Possible side effect: Several if the .la files are not removed (especially because of the mixture of meson and libtool).
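Removing the files after each installation could be sketched like this. remove_la_files is a hypothetical helper, the directories to clean are left to the caller, and the -delete action is assumed to be supported by the installed find (it is in GNU findutils).

```shell
# remove_la_files (hypothetical name) deletes libtool archive files below
# the directories given as arguments.
remove_la_files() {
    # -type f: regular files only; -delete: remove each match
    find "$@" -name '*.la' -type f -delete
}

# Assumed usage after installing a package:
#   remove_la_files /usr/lib /usr/libexec
```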

comment:8 by Pierre Labastie, 5 years ago

circular dependency between eudev and util-linux

util-linux uses libudev (eudev) in lsblk (with a fallback to libblkid from util-linux if libudev has not been linked). And udev uses libblkid to find information about some block devices (especially cdrom_id, ata_id, etc). Note that this is well known and cured in the systemd book.

Solution: Build util-linux in chapter 5 for the sysv book, as for the systemd book (just remove revision='systemd'), move eudev before util-linux in chapter 6, and link /tools/lib/pkgconfig/blkid.pc to /usr/lib/pkgconfig before building eudev. Also use LD_LIBRARY_PATH=/tools/lib when running udevadm hwdb --update.

Possible side effect: None found yet, except it adds one build in chapter 5.

Question: What is the improvement? lsblk gives more detailed information if built against libudev, depending on hardware.

comment:9 by Pierre Labastie, 5 years ago

gcc compilers (c++, gcc, cc1, cc1plus, ...) differ between first and second build

Using objdump -d, it is found that the assembly code differs between the first and second builds (different use of registers, use of different opcodes, ...). I suspect it has something to do with LTO, which would be applied on the second pass and not on the first, but it's highly speculative.

Solution: None yet... (any idea welcome!)

in reply to:  5 comment:10 by Pierre Labastie, 5 years ago

Replying to pierre.labastie:

bison differs between first and second build

Using objdump -x shows that the second build is linked to libtextstyle.so (gettext).

Solution: Move gettext before bison.

Possible side effect: None found yet.

Well, there's one: gettext hardcodes the bison localedir as /tools/share/locale. Reason: gettext's configure uses bison --print-localedir, which points to /tools/share/locale since bison has not been rebuilt yet.

Will try make BISON_LOCALEDIR=/usr/share/locale

in reply to:  10 comment:11 by Pierre Labastie, 5 years ago

Replying to pierre.labastie:

Will try make BISON_LOCALEDIR=/usr/share/locale

Seems to do the trick

comment:12 by Pierre Labastie, 5 years ago

Several files in the Python dir differ between 1st and 2nd pass

Namely: /usr/lib/python3.8/pyconfig.h has new defines related to UUID and /usr/lib/python3.8/_sysconfigdata__linux_x86_64-linux-gnu.py has two more entries related to UUID too. Furthermore there is a new file on 2nd pass: /usr/lib/python3.8/lib-dynload/_uuid.cpython-38-x86_64-linux-gnu.so.

Solution (in progress): It looks like configure checks for the header file <uuid/uuid.h> (and so does setup.py when building the associated module). Since I now have util-linux in chapter 5, I'll just try to symlink /tools/include/uuid to /usr/include during the python build (removing it at the end). But maybe it also needs libuuid.h in /usr/lib; I don't know.

comment:13 by Pierre Labastie, 5 years ago

Several files just differ in hardcoded compile time

For the record: vim, libcrypto.so.1.1, libgdbm.so.6.0.0, /usr/lib/perl5/<version>/<machine>-thread-multi/CORE/libperl.so, and libpython3.8.so.1.0

comment:14 by Pierre Labastie, 5 years ago

Libraries that are separated from their dbg symbols differ

Just for the record... They just differ in the vicinity of the string containing the name of the .dbg file (not sure why). Seems harmless.

in reply to:  12 comment:15 by Pierre Labastie, 5 years ago

Replying to pierre.labastie:

Several files in the Python dir differ between 1st and 2nd pass

Solution: Looks like configure checks for the header file (and then setup.py when building the associated module) <uuid/uuid.h>. Since I now have util-linux in chapter 5, I'll just try to symlink /tools/include/uuid to /usr/include during python build (removing at the end). But maybe it also needs libuuid.h in /usr/lib, don't know.

Yes, the library (libuuid.so*, not libuuid.h, me stupid) is needed, otherwise the module is built but withdrawn, because it cannot be imported.

in reply to:  7 comment:16 by Xi Ruoyao, 5 years ago

Replying to pierre.labastie:

.la files change the libraries that are linked into programs

Because of a bug in ICA (now fixed), .la files were deleted after the first pass, but not after the second. This led to various programs being linked differently between the 2nd and 3rd passes, and also to some RPATHs being added on the 3rd pass.

Solution: .la files are evil! I strongly suggest deleting them, but "your distro, your rules".

Possible side effect: Several if the .la files are not removed (especially because of the mixture of meson and libtool).

.la files are really evil. In BLFS, libxml2.la makes everything depending on libxml2.so.2 also depend on libicu*.so.${version}. And when icu is upgraded (its soname often changes), tons of packages have to be rebuilt. Now I remove those .la files after installing any package.

in reply to:  9 comment:17 by Xi Ruoyao, 5 years ago

Replying to pierre.labastie:

gcc compilers (c++, gcc, cc1, cc1plus, ...) differ between first and second build

Using objdump -d, it is found that the assembly code differs between the first and second builds (different use of registers, use of different opcodes, ...). I suspect it has something to do with LTO, which would be applied on the second pass and not on the first, but it's highly speculative.

Solution: None yet... (any idea welcome!)

Possible solution

Do a 3-stage bootstrap for GCC. (removing --disable-bootstrap).

in reply to:  17 comment:18 by Xi Ruoyao, 5 years ago

deleted (unintentional duplicate)

in reply to:  17 comment:19 by Pierre Labastie, 5 years ago

Replying to xry111:

Possible solution

Do a 3-stage bootstrap for GCC. (removing --disable-bootstrap).

Or build BLFS GCC as soon as possible (with or without bootstrap). But I think we'd rather avoid bootstrapping: building gcc three times is already rather time consuming. Thinking more about this: if nothing else, try a 2-stage bootstrap. Maybe it is enough. Those are just random thoughts though. I'm running a 2-stage ICA that saves the gcc build directories. Hopefully, something will emerge from comparing them.

comment:20 by Pierre Labastie, 5 years ago

All systemd binaries differ between first and second pass

objdump -s shows that there is a string containing /tools/lib in .dynstr on first build and not on the others. Don't know much more yet.
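To list which binaries still carry such a string, a cruder check than objdump -s may be enough. The sketch below swaps in a plain byte search with grep rather than inspecting .dynstr specifically, so it can also match the string in other sections; refs_tools is a hypothetical helper and the -a option assumes GNU grep.

```shell
# refs_tools (hypothetical name) prints the names of the given files that
# contain the string /tools/lib anywhere in their contents.
refs_tools() {
    # -a: treat binary files as text; -l: print only matching file names
    grep -la '/tools/lib' "$@"
}

# Assumed usage on a saved first-pass tree (paths are illustrative):
#   refs_tools pass1/usr/bin/* pass1/usr/lib/systemd/*
```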

comment:21 by Pierre Labastie, 5 years ago

I've made little progress on the gcc problem. The only finding is that it occurs in just a few .o files. So it is not an LTO problem, and it is very unlikely to be a general optimization problem (otherwise almost all .o files would be affected). I'll try to see if those files are compiled with special compilation options.

comment:22 by Pierre Labastie, 5 years ago

I've been able to set up an example for the gcc differences, using one of the affected files, which is spellcheck.o. On a finished lfs, cd /sources and unpack gcc, cd to the unpacked tree, then:

mkdir build1 && cd build1
CC=/tools/bin/gcc                     \
CXX=/tools/bin/g++                    \
../configure --prefix=/usr            \
             --enable-languages=c,c++ \
             --disable-multilib       \
             --disable-bootstrap      \
             --with-system-zlib
make -j4 all-build
make -j4 configure-host
cd gcc
make -j4 c-family/c-spellcheck.o
cd ../..
mkdir build2 && cd build2
../configure --prefix=/usr  #... then the same thing again in build2 without setting CC and CXX

On my machine, the produced .o files are different, as can be seen by diffing the output of objdump -d.

comment:23 by Pierre Labastie, 5 years ago

The problem is the following: the header files that get included are not the same in each case.

The reason is that the search directory order is not the same, because for /tools/bin/g++, /usr/include is first, while for g++, it comes after /usr/include/c++/9.3.0. And somehow, with this different order, one file is included in the second case, which is not included in the first, containing a lot of use instructions. I guess this is the reason why the generated code is not the same (some different types used).

Solution: Use -idirafter instead of -isystem in the specs file. That puts /usr/include at the end of the search path. This solves the above problem, and headers for programs which are only in chapter 6 can still be found in /usr/include. For those which are in chapter 5 too, /tools/include will be used instead. But this is not a problem, since those headers are the same.
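The search order of each compiler can be inspected from its verbose preprocessor output. The sketch below assumes the usual GCC layout, where the system include list sits between the lines "#include <...> search starts here:" and "End of search list."; include_dirs is a hypothetical helper.

```shell
# include_dirs (hypothetical name) reads the verbose output of a GCC
# driver on stdin and prints the <...> include search directories in order.
include_dirs() {
    sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./p' |
        sed '1d;$d' |      # drop the two marker lines
        sed 's/^ *//'      # strip the leading space before each directory
}

# Assumed usage, comparing both compilers:
#   /tools/bin/g++ -E -x c++ -v - </dev/null 2>&1 | include_dirs
#   g++ -E -x c++ -v - </dev/null 2>&1 | include_dirs
```

Comparing the two lists shows where /usr/include falls relative to the C++ header directories for each compiler.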

comment:24 by Pierre Labastie, 5 years ago

The -idirafter solution is not totally satisfactory: I happen to have built on two different distros, and one of them generated gcc-pass2 with intl support from glibc, while the other did not.

But with -idirafter, the glibc headers from /tools are used instead of those from /usr. Those in /usr do have intl support, but not those in /tools. So some programs are slightly different between first pass (using /tools) and second pass (using /usr).

comment:25 by Bruce Dubbs, 5 years ago

Resolution: duplicate
Status: assigned → closed

Split into tickets #4631 through #4642. Closing as duplicate.

comment:26 by Bruce Dubbs, 5 years ago

Milestone: 9.2 → 10.0

Milestone renamed
