Timestamp:
10/01/2022 08:03:20 AM
Author:
Xi Ruoyao <xry111@…>
Branches:
xry111/clfs-ng
Children:
ef1f48b
Parents:
259794e (diff), 2bf32ff (diff)
Note: this is a merge changeset, the changes displayed below correspond to the merge itself.
Use the (diff) links above to see all the changes relative to each parent.
Message:

Merge remote-tracking branch 'origin/trunk' into xry111/clfs-ng

File:
1 edited

  • part3intro/toolchaintechnotes.xml

    r259794e rdd61c77  
    1212
    1313  <para>This section explains some of the rationale and technical details
    14   behind the overall build method. It is not essential to immediately
     14  behind the overall build method. Don't try to immediately
    1515  understand everything in this section. Most of this information will be
    16   clearer after performing an actual build. This section can be referred
    17   to at any time during the process.</para>
     16  clearer after performing an actual build. Come back and re-read this chapter
     17  at any time during the build process.</para>
    1818
    1919  <para>The overall goal of <xref linkend="chapter-cross-tools"/> and <xref
    20   linkend="chapter-temporary-tools"/> is to produce a temporary area that
    21   contains a known-good set of tools that can be isolated from the host system.
    22   By using <command>chroot</command>, the commands in the remaining chapters
    23   will be contained within that environment, ensuring a clean, trouble-free
     20  linkend="chapter-temporary-tools"/> is to produce a temporary area
     21  containing a set of tools that are known to be good, and that are isolated from the host system.
     22  By using the <command>chroot</command> command, the compilations in the remaining chapters
     23  will be isolated within that environment, ensuring a clean, trouble-free
    2424  build of the target LFS system. The build process has been designed to
    25   minimize the risks for new readers and to provide the most educational value
     25  minimize the risks for new readers, and to provide the most educational value
    2626  at the same time.</para>
    2727
    28   <para>The build process is based on the process of
     28  <para>This build process is based on
    2929  <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
    30   for building a compiler and its toolchain for a machine different from
    31   the one that is used for the build. This is not strictly needed for LFS,
     30  to build a compiler and its associated toolchain for a machine different from
     31  the one that is used for the build. This is not strictly necessary for LFS,
    3232  since the machine where the new system will run is the same as the one
    33   used for the build. But cross-compilation has the great advantage that
     33  used for the build. But cross-compilation has one great advantage:
    3434  anything that is cross-compiled cannot depend on the host environment.</para>
    3535
     
    4040    <note>
    4141      <para>
    42         The LFS book is not, and does not contain a general tutorial to
    43         build a cross (or native) toolchain. Don't use the command in the
    44         book for a cross toolchain which will be used for some purpose other
     42        The LFS book is not (and does not contain) a general tutorial to
     43        build a cross (or native) toolchain. Don't use the commands in the
     44        book for a cross toolchain for some purpose other
    4545        than building LFS, unless you really understand what you are doing.
    4646      </para>
    4747    </note>
    4848
    49     <para>Cross-compilation involves some concepts that deserve a section on
    50     their own. Although this section may be omitted in a first reading,
    51     coming back to it later will be beneficial to your full understanding of
     49    <para>Cross-compilation involves some concepts that deserve a section of
     50    their own. Although this section may be omitted on a first reading,
     51    coming back to it later will help you gain a fuller understanding of
    5252    the process.</para>
    5353
    54     <para>Let us first define some terms used in this context:</para>
     54    <para>Let us first define some terms used in this context.</para>
    5555
    5656    <variablelist>
    57       <varlistentry><term>build</term><listitem>
     57      <varlistentry><term>The build</term><listitem>
    5858        <para>is the machine where we build programs. Note that this machine
    59         is referred to as the <quote>host</quote> in other
    60         sections.</para></listitem>
     59        is also referred to as the <quote>host</quote>.</para></listitem>
    6160      </varlistentry>
    6261
    63       <varlistentry><term>host</term><listitem>
     62      <varlistentry><term>The host</term><listitem>
    6463        <para>is the machine/system where the built programs will run. Note
    6564        that this use of <quote>host</quote> is not the same as in other
     
    6766      </varlistentry>
    6867
    69       <varlistentry><term>target</term><listitem>
     68      <varlistentry><term>The target</term><listitem>
    7069        <para>is only used for compilers. It is the machine the compiler
    71         produces code for. It may be different from both build and
    72         host.</para></listitem>
     70        produces code for. It may be different from both the build and
     71        the host.</para></listitem>
    7372      </varlistentry>
    7473
     
    7675
    7776    <para>As an example, let us imagine the following scenario (sometimes
    78     referred to as <quote>Canadian Cross</quote>): we may have a
     77    referred to as <quote>Canadian Cross</quote>): we have a
    7978    compiler on a slow machine only, let's call it machine A, and the compiler
    80     ccA. We may have also a fast machine (B), but with no compiler, and we may
    81     want to produce code for another slow machine (C). To build a
    82     compiler for machine C, we would have three stages:</para>
     79    ccA. We also have a fast machine (B), but no compiler for (B), and we
     80    want to produce code for a third, slow machine (C). We will build a
     81    compiler for machine C in three stages.</para>
    8382
    8483    <informaltable align="center">
     
    9695          <row>
    9796            <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
    98             <entry>build cross-compiler cc1 using ccA on machine A</entry>
     97            <entry>Build cross-compiler cc1 using ccA on machine A.</entry>
    9998          </row>
    10099          <row>
    101100            <entry>2</entry><entry>A</entry><entry>B</entry><entry>C</entry>
    102             <entry>build cross-compiler cc2 using cc1 on machine A</entry>
     101            <entry>Build cross-compiler cc2 using cc1 on machine A.</entry>
    103102          </row>
    104103          <row>
    105104            <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
    106             <entry>build compiler ccC using cc2 on machine B</entry>
     105            <entry>Build compiler ccC using cc2 on machine B.</entry>
    107106          </row>
    108107        </tbody>
     
    110109    </informaltable>
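
The three stages in the table above follow one rule: a compiler that will run on the host can only be built by a compiler that runs on the build machine and generates code for that host. The Python sketch below checks that rule for the Canadian Cross example. The names `Stage`, `chain_is_buildable`, and `CANADIAN_CROSS` are illustrative assumptions, not part of the book or of any build system.

```python
from typing import NamedTuple


class Stage(NamedTuple):
    compiler: str  # name of the compiler produced by this stage
    build: str     # machine the stage is run on
    host: str      # machine the produced compiler will run on
    target: str    # machine the produced compiler generates code for


def chain_is_buildable(native_machine, stages):
    """Return True if every stage has a compiler able to build it."""
    # Each entry is (runs_on, generates_code_for). We start with the
    # native compiler (ccA), which runs on and targets the same machine.
    available = {(native_machine, native_machine)}
    for stage in stages:
        # Building a compiler that runs on `host` requires a compiler
        # that runs on `build` and generates code for `host`.
        if (stage.build, stage.host) not in available:
            return False
        available.add((stage.host, stage.target))
    return True


# Hypothetical encoding of the three table rows.
CANADIAN_CROSS = [
    Stage("cc1", build="A", host="A", target="B"),
    Stage("cc2", build="A", host="B", target="C"),
    Stage("ccC", build="B", host="C", target="C"),
]

print(chain_is_buildable("A", CANADIAN_CROSS))
```

Dropping the first stage makes the check fail, because no compiler running on A and generating code for B would exist to build cc2.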
    111110
    112     <para>Then, all the other programs needed by machine C can be compiled
     111    <para>Then, all the programs needed by machine C can be compiled
    113112    using cc2 on the fast machine B. Note that unless B can run programs
    114     produced for C, there is no way to test the built programs until machine
    115     C itself is running. For example, for testing ccC, we may want to add a
     113    produced for C, there is no way to test the newly built programs until machine
     114    C itself is running. For example, to run a test suite on ccC, we may want to add a
    116115    fourth stage:</para>
    117116
     
    130129          <row>
    131130            <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
    132             <entry>rebuild  and test ccC using itself on machine C</entry>
     131            <entry>Rebuild and test ccC using ccC on machine C.</entry>
    133132          </row>
    134133        </tbody>
     
    147146
    148147    <note>
    149       <para>Almost all the build systems use names of the form
    150       cpu-vendor-kernel-os referred to as the machine triplet. An astute
    151       reader may wonder why a <quote>triplet</quote> refers to a four component
    152       name. The reason is history: initially, three component names were enough
    153       to designate a machine unambiguously, but with new machines and systems
    154       appearing, that proved insufficient. The word <quote>triplet</quote>
    155       remained. A simple way to determine your machine triplet is to run
    156       the <command>config.guess</command>
     148      <para>All packages involved with cross compilation in the book use an
     149      autoconf-based build system.  The autoconf-based build system
     150      accepts system types in the form cpu-vendor-kernel-os,
     151      referred to as the system triplet.  Since the vendor field is mostly
     152      irrelevant, autoconf lets you omit it. An astute reader may wonder
     153      why a <quote>triplet</quote> refers to a four-component name. The
     154      reason is that the kernel field and the os field originated from one
     155      <quote>system</quote> field.  Such a three-field form is still valid
     156      today for some systems, for example
     157      <literal>x86_64-unknown-freebsd</literal>.  But for other systems,
     158      two systems can share the same kernel but still be too different to
     159      use the same triplet for them.  For example, Android running on a
     160      mobile phone is completely different from Ubuntu running on an ARM64
     161      server, even though both run on the same type of CPU (ARM64) and
     162      use the same kernel (Linux).
     163      Without an emulation layer, you cannot run an
     164      executable for the server on the mobile phone or vice versa.  So the
     165      <quote>system</quote> field is separated into kernel and os fields to
     166      designate these systems unambiguously.  For our example, the Android
     167      system is designated <literal>aarch64-unknown-linux-android</literal>,
     168      and the Ubuntu system is designated
     169      <literal>aarch64-unknown-linux-gnu</literal>.  The word
     170      <quote>triplet</quote> remained. A simple way to determine your
     171      system triplet is to run the <command>config.guess</command>
    157172      script that comes with the source for many packages. Unpack the binutils
    158173      sources and run the script: <userinput>./config.guess</userinput> and note
    159174      the output. For example, for a 32-bit Intel processor the
    160175      output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit
    161       system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>
    162 
    163       <para>Also be aware of the name of the platform's dynamic linker, often
     176      system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>. On most
     177      Linux systems the even simpler <command>gcc -dumpmachine</command> command
     178      will give you similar information.</para>
     179
     180      <para>You should also be aware of the name of the platform's dynamic linker, often
    164181      referred to as the dynamic loader (not to be confused with the standard
    165182      linker <command>ld</command> that is part of binutils). The dynamic linker
    166       provided by Glibc finds and loads the shared libraries needed by a
     183      provided by the glibc package finds and loads the shared libraries needed by a
    167184      program, prepares the program to run, and then runs it. The name of the
    168185      dynamic linker for a 32-bit Intel machine is <filename
    169       class="libraryfile">ld-linux.so.2</filename> and is <filename
    170       class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems. A
     186      class="libraryfile">ld-linux.so.2</filename>; it's <filename
     187      class="libraryfile">ld-linux-x86-64.so.2</filename> on 64-bit systems. A
    171188      sure-fire way to determine the name of the dynamic linker is to inspect a
    172189      random binary from the host system by running: <userinput>readelf -l
    173190      &lt;name of binary&gt; | grep interpreter</userinput> and noting the
    174191      output. The authoritative reference covering all platforms is in the
    175       <filename>shlib-versions</filename> file in the root of the Glibc source
     192      <filename>shlib-versions</filename> file in the root of the glibc source
    176193      tree.</para>
    177194    </note>
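
The field layout described in the note can be illustrated with a small Python sketch that splits a triplet into CPU, vendor, and system parts. This is a simplification under stated assumptions: `split_triplet` and `KNOWN_KERNELS` are hypothetical names, and the one-entry kernel list only serves to spot an omitted vendor field. The real normalization is done by the `config.sub` script, which handles many more cases.

```python
# Assumption: a tiny, incomplete list used only to detect an omitted vendor.
KNOWN_KERNELS = {"linux"}


def split_triplet(name):
    """Split a system triplet into (cpu, vendor, system).

    `system` is either a single field ("freebsd") or a kernel-os
    pair ("linux-gnu"). `vendor` is None when it was omitted.
    """
    parts = name.split("-")
    if len(parts) == 4:
        # cpu-vendor-kernel-os
        return parts[0], parts[1], "-".join(parts[2:])
    if len(parts) == 3 and parts[1] in KNOWN_KERNELS:
        # cpu-kernel-os, vendor omitted
        return parts[0], None, "-".join(parts[1:])
    if len(parts) == 3:
        # cpu-vendor-system, the older three-field form
        return parts[0], parts[1], parts[2]
    raise ValueError(f"unrecognized triplet: {name!r}")


for example in ("aarch64-unknown-linux-android",
                "aarch64-unknown-linux-gnu",
                "x86_64-unknown-freebsd"):
    print(example, "->", split_triplet(example))
```

The two ARM64 examples from the note share the CPU and kernel and differ only in the os part of the system field.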
     
    179196    <para>In order to fake a cross compilation in LFS, the name of the host triplet
    180197    is slightly adjusted by changing the &quot;vendor&quot; field in the
    181     <envar>LFS_TGT</envar> variable. We also use the
     198    <envar>LFS_TGT</envar> variable so it says &quot;lfs&quot;. We also use the
    182199    <parameter>--with-sysroot</parameter> option when building the cross linker and
    183200    cross compiler to tell them where to find the needed host files. This
    184201    ensures that none of the other programs built in <xref
    185202    linkend="chapter-temporary-tools"/> can link to libraries on the build
    186     machine. Only two stages are mandatory, and one more for tests:</para>
     203    machine. Only two stages are mandatory, plus one more for tests.</para>
    187204
    188205    <informaltable align="center">
     
    200217          <row>
    201218            <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
    202             <entry>build cross-compiler cc1 using cc-pc on pc</entry>
     219            <entry>Build cross-compiler cc1 using cc-pc on pc.</entry>
    203220          </row>
    204221          <row>
    205222            <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
    206             <entry>build compiler cc-lfs using cc1 on pc</entry>
     223            <entry>Build compiler cc-lfs using cc1 on pc.</entry>
    207224          </row>
    208225          <row>
    209226            <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
    210             <entry>rebuild and test cc-lfs using itself on lfs</entry>
     227            <entry>Rebuild and test cc-lfs using cc-lfs on lfs.</entry>
    211228          </row>
    212229        </tbody>
     
    214231    </informaltable>
    215232
    216     <para>In the above table, <quote>on pc</quote> means the commands are run
     233    <para>In the preceding table, <quote>on pc</quote> means the commands are run
    217234    on a machine using the already installed distribution. <quote>On
    218235    lfs</quote> means the commands are run in a chrooted environment.</para>
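
The vendor substitution behind <envar>LFS_TGT</envar> can be sketched as follows. The book sets the variable in the shell, so this Python fragment only illustrates the string manipulation; `with_lfs_vendor` is a hypothetical helper and it assumes a full four-field triplet such as the ones printed by <command>config.guess</command>.

```python
def with_lfs_vendor(triplet):
    """Return `triplet` with its vendor field replaced by "lfs".

    Assumes the cpu-vendor-kernel-os form; shorter names are rejected.
    """
    parts = triplet.split("-", 2)
    if len(parts) != 3 or "-" not in parts[2]:
        raise ValueError(f"expected cpu-vendor-kernel-os, got {triplet!r}")
    cpu, _vendor, system = parts
    return f"{cpu}-lfs-{system}"


print(with_lfs_vendor("x86_64-pc-linux-gnu"))
print(with_lfs_vendor("i686-pc-linux-gnu"))
```

Because the result differs from the triplet of the build machine, the build systems treat the compilation as a cross compilation even though the CPU and operating system are the same.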
     
    220237    <para>Now, there is more about cross-compiling: the C language is not
    221238    just a compiler, but also defines a standard library. In this book, the
    222     GNU C library, named glibc, is used. This library must
    223     be compiled for the lfs machine, that is, using the cross compiler cc1.
     239    GNU C library, named glibc, is used (there is an alternative, &quot;musl&quot;). This library must
     240    be compiled for the LFS machine; that is, using the cross compiler cc1.
    224241    But the compiler itself uses an internal library implementing complex
    225     instructions not available in the assembler instruction set. This
    226     internal library is named libgcc, and must be linked to the glibc
     242    subroutines for functions not available in the assembler instruction set. This
     243    internal library is named libgcc, and it must be linked to the glibc
    227244    library to be fully functional! Furthermore, the standard library for
    228     C++ (libstdc++) also needs being linked to glibc. The solution to this
    229     chicken and egg problem is to first build a degraded cc1 based libgcc,
    230     lacking some functionalities such as threads and exception handling, then
    231     build glibc using this degraded compiler (glibc itself is not
    232     degraded), then build libstdc++. But this last library will lack the
    233     same functionalities as libgcc.</para>
    234 
    235     <para>This is not the end of the story: the conclusion of the preceding
     245    C++ (libstdc++) must also be linked with glibc. The solution to this
     246    chicken and egg problem is first to build a degraded cc1-based libgcc,
     247    lacking some functionalities such as threads and exception handling, and then
     248    to build glibc using this degraded compiler (glibc itself is not
     249    degraded), and also to build libstdc++. This last library will lack some of the
     250    functionality of libgcc.</para>
     251
     252    <para>This is not the end of the story: the upshot of the preceding
    236253    paragraph is that cc1 is unable to build a fully functional libstdc++, but
    237254    this is the only compiler available for building the C/C++ libraries
    238255    during stage 2! Of course, the compiler built during stage 2, cc-lfs,
    239256    would be able to build those libraries, but (1) the build system of
    240     GCC does not know that it is usable on pc, and (2) using it on pc
    241     would be at risk of linking to the pc libraries, since cc-lfs is a native
    242     compiler. So we have to build libstdc++ later, in chroot.</para>
     257    gcc does not know that it is usable on pc, and (2) using it on pc
     258    would create a risk of linking to the pc libraries, since cc-lfs is a native
     259    compiler. So we have to re-build libstdc++ later as a part of
     260    gcc stage 2.</para>
     261
     262    <para>In &ch-final; (or <quote>stage 3</quote>), all packages needed for
     263    the LFS system are built. Even if a package is already installed into
     264    the LFS system in a previous chapter, we still rebuild the package
     265    unless we are completely sure it's unnecessary.  The main reason for
     266    rebuilding these packages is to make them stable: if we reinstall an LFS
     267    package on a complete LFS system, the installed content of the package
     268    should be the same as the content of the same package installed in
     269    &ch-final;.  The temporary packages installed in &ch-tmp-cross; or
     270    &ch-tmp-chroot; cannot satisfy this expectation because some of them
     271    are built without optional dependencies installed, and autoconf cannot
     272    perform some feature checks in &ch-tmp-cross; because of cross
     273    compilation, causing the temporary packages to lack optional features
     274    or use suboptimal code routines. Additionally, a minor reason for
     275    rebuilding the packages is to allow running their test suites.</para>
    243276
    244277  </sect2>
     
    253286
    254287    <para>Binutils is installed first because the <command>configure</command>
    255     runs of both GCC and Glibc perform various feature tests on the assembler
     288    runs of both gcc and glibc perform various feature tests on the assembler
    256289    and linker to determine which software features to enable or disable. This
    257     is more important than one might first realize. An incorrectly configured
    258     GCC or Glibc can result in a subtly broken toolchain, where the impact of
     290    is more important than one might realize at first. An incorrectly configured
     291    gcc or glibc can result in a subtly broken toolchain, where the impact of
    259292    such breakage might not show up until near the end of the build of an
    260293    entire distribution. A test suite failure will usually highlight this error
     
    275308    will show all the files successfully opened during the linking.</para>
    276309
    277     <para>The next package installed is GCC. An example of what can be
     310    <para>The next package installed is gcc. An example of what can be
    278311    seen during its run of <command>configure</command> is:</para>
    279312
     
    282315
    283316    <para>This is important for the reasons mentioned above. It also
    284     demonstrates that GCC's configure script does not search the PATH
     317    demonstrates that gcc's configure script does not search the PATH
    285318    directories to find which tools to use. However, during the actual
    286319    operation of <command>gcc</command> itself, the same search paths are not
     
    296329
    297330    <para>Next installed are sanitized Linux API headers. These allow the
    298     standard C library (Glibc) to interface with features that the Linux
     331    standard C library (glibc) to interface with features that the Linux
    299332    kernel will provide.</para>
    300333
    301     <para>The next package installed is Glibc. The most important
    302     considerations for building Glibc are the compiler, binary tools, and
    303     kernel headers. The compiler is generally not an issue since Glibc will
     334    <para>The next package installed is glibc. The most important
     335    considerations for building glibc are the compiler, binary tools, and
     336    kernel headers. The compiler is generally not an issue since glibc will
    304337    always use the compiler relating to the <parameter>--host</parameter>
    305338    parameter passed to its configure script; e.g. in our case, the compiler
     
    314347    and the use of the <parameter>-nostdinc</parameter> and
    315348    <parameter>-isystem</parameter> flags to control the compiler's include
    316     search path. These items highlight an important aspect of the Glibc
     349    search path. These items highlight an important aspect of the glibc
    317350    package&mdash;it is very self-sufficient in terms of its build machinery
    318351    and generally does not rely on toolchain defaults.</para>
    319352
    320     <para>As said above, the standard C++ library is compiled next, followed in
    321     <xref linkend="chapter-temporary-tools"/> by all the programs that need
    322     themselves to be built. The install step of all those packages uses the
    323     <envar>DESTDIR</envar> variable to have the
    324     programs land into the LFS filesystem.</para>
     353    <para>As mentioned above, the standard C++ library is compiled next, followed in
     354    <xref linkend="chapter-temporary-tools"/> by other programs that need
     355    to be cross compiled to break circular dependencies at build time.
     356    The install step of all those packages uses the
     357    <envar>DESTDIR</envar> variable to force installation
     358    in the LFS filesystem.</para>
    325359
    326360    <para>At the end of <xref linkend="chapter-temporary-tools"/> the native
    327     lfs compiler is installed. First binutils-pass2 is built,
    328     with the same <envar>DESTDIR</envar> install as the other programs,
    329     then the second pass of GCC is constructed, omitting libstdc++
    330     and other non-important libraries.  Due to some weird logic in GCC's
     361    LFS compiler is installed. First binutils-pass2 is built,
     362    in the same <envar>DESTDIR</envar> directory as the other programs,
     363    then the second pass of gcc is constructed, omitting some
     364    non-critical libraries.  Due to some weird logic in gcc's
    331365    configure script, <envar>CC_FOR_TARGET</envar> ends up as
    332     <command>cc</command> when the host is the same as the target, but is
     366    <command>cc</command> when the host is the same as the target, but
    333367    different from the build system. This is why
    334     <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is put explicitly into
    335     the configure options.</para>
     368    <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is declared explicitly
     369    as one of the configuration options.</para>
    336370
    337371    <para>Upon entering the chroot environment in <xref
    338     linkend="chapter-chroot-temporary-tools"/>, the first task is to install
    339     libstdc++. Then temporary installations of programs needed for the proper
     372    linkend="chapter-chroot-temporary-tools"/>,
     373    the temporary installations of programs needed for the proper
    340374    operation of the toolchain are performed. From this point onwards, the
    341375    core toolchain is self-contained and self-hosted. In