Changeset 675606b for chapter05/toolchaintechnotes.xml
Timestamp: 06/16/2020 11:56:28 AM (4 years ago)
Branches: 10.0, 10.0-rc1, 10.1, 10.1-rc1, 11.0, 11.0-rc1, 11.0-rc2, 11.0-rc3, 11.1, 11.1-rc1, 11.2, 11.2-rc1, 11.3, 11.3-rc1, 12.0, 12.0-rc1, 12.1, 12.1-rc1, arm, bdubbs/gcc13, ml-11.0, multilib, renodr/libudev-from-systemd, s6-init, trunk, xry111/arm64, xry111/arm64-12.0, xry111/clfs-ng, xry111/lfs-next, xry111/loongarch, xry111/loongarch-12.0, xry111/loongarch-12.1, xry111/mips64el, xry111/pip3, xry111/rust-wip-20221008, xry111/update-glibc
Children: 9a05e45
Parents: 560065f, 1cd5961
Note: this is a merge changeset; the changes displayed below correspond to the merge itself, not to all the changes relative to each parent.
Files: 1 edited
chapter05/toolchaintechnotes.xml
Diff from r560065f to r675606b:

Context:

    to at any time during the process.</para>

Removed (r560065f):

    <para>The overall goal of <xref linkend="chapter-temporary-tools"/> is to
    produce a temporary area that contains a known-good set of tools that can be
    isolated from the host system. By using <command>chroot</command>, the
    commands in the remaining chapters will be contained within that environment,
    ensuring a clean, trouble-free build of the target LFS system. The build
    process has been designed to minimize the risks for new readers and to provide
    the most educational value at the same time.</para>

    <note>
      <para>Before continuing, be aware of the name of the working platform,
      often referred to as the target triplet. A simple way to determine the
      name of the target triplet is to run the <command>config.guess</command>
      script that comes with the source for many packages. Unpack the Binutils
      sources and run the script: <userinput>./config.guess</userinput> and note
      the output. For example, for a 32-bit Intel processor the
      output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit
      system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>

      <para>Also be aware of the name of the platform's dynamic linker, often
      referred to as the dynamic loader (not to be confused with the standard
      linker <command>ld</command> that is part of Binutils). The dynamic linker
      provided by Glibc finds and loads the shared libraries needed by a program,
      prepares the program to run, and then runs it. The name of the dynamic
      linker for a 32-bit Intel machine will be <filename
      class="libraryfile">ld-linux.so.2</filename> (<filename
      class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems). A
      sure-fire way to determine the name of the dynamic linker is to inspect a
      random binary from the host system by running: <userinput>readelf -l
      <name of binary> | grep interpreter</userinput> and noting the
      output.
      The authoritative reference covering all platforms is in the
      <filename>shlib-versions</filename> file in the root of the Glibc source
      tree.</para>
    </note>

    <para>Some key technical points of how the <xref
    linkend="chapter-temporary-tools"/> build method works:</para>

    <itemizedlist>
      <listitem>
        <para>Slightly adjusting the name of the working platform, by changing
        the "vendor" field of the target triplet by way of the
        <envar>LFS_TGT</envar> variable, ensures that the first build of Binutils
        and GCC produces a compatible cross-linker and cross-compiler. Instead of
        producing binaries for another architecture, the cross-linker and
        cross-compiler will produce binaries compatible with the current
        hardware.</para>
      </listitem>
      <listitem>
        <para>The temporary libraries are cross-compiled. Because a
        cross-compiler by its nature cannot rely on anything from its host
        system, this method removes potential contamination of the target
        system by lessening the chance of headers or libraries from the host
        being incorporated into the new tools. Cross-compilation also allows for
        the possibility of building both 32-bit and 64-bit libraries on 64-bit
        capable hardware.</para>
      </listitem>
      <listitem>
        <para>Careful manipulation of the GCC source tells the compiler which
        target dynamic linker will be used.</para>
      </listitem>
    </itemizedlist>

    <para>Binutils is installed first because the <command>configure</command>
    runs of both GCC and Glibc perform various feature tests on the assembler
    and linker to determine which software features to enable or disable. This
    is more important than one might first realize. An incorrectly configured
    GCC or Glibc can result in a subtly broken toolchain, where the impact of
    such breakage might not show up until near the end of the build of an
    entire distribution.
    A test suite failure will usually highlight this error
    before too much additional work is performed.</para>

    <para>Binutils installs its assembler and linker in two locations,
    <filename class="directory">/tools/bin</filename> and <filename
    class="directory">/tools/$LFS_TGT/bin</filename>. The tools in one
    location are hard linked to the other. An important facet of the linker is
    its library search order. Detailed information can be obtained from
    <command>ld</command> by passing it the <parameter>--verbose</parameter>
    flag. For example, <userinput>ld --verbose | grep SEARCH</userinput>
    will illustrate the current search paths and their order. It shows which
    files are linked by <command>ld</command> by compiling a dummy program and
    passing the <parameter>--verbose</parameter> switch to the linker. For example,
    <userinput>gcc dummy.c -Wl,--verbose 2>&1 | grep succeeded</userinput>
    will show all the files successfully opened during the linking.</para>

    <para>The next package installed is GCC. An example of what can be
    seen during its run of <command>configure</command> is:</para>

    <screen><computeroutput>checking what assembler to use... /tools/i686-lfs-linux-gnu/bin/as
    checking what linker to use... /tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>

    <para>This is important for the reasons mentioned above. It also demonstrates
    that GCC's configure script does not search the PATH directories to find which
    tools to use. However, during the actual operation of <command>gcc</command>
    itself, the same search paths are not necessarily used. To find out which
    standard linker <command>gcc</command> will use, run:
    <userinput>gcc -print-prog-name=ld</userinput>.</para>

    <para>Detailed information can be obtained from <command>gcc</command> by
    passing it the <parameter>-v</parameter> command line option while compiling
    a dummy program.
    For example, <userinput>gcc -v dummy.c</userinput> will show
    detailed information about the preprocessor, compilation, and assembly stages,
    including <command>gcc</command>'s included search paths and their order.</para>

    <para>Next installed are sanitized Linux API headers. These allow the standard
    C library (Glibc) to interface with features that the Linux kernel will
    provide.</para>

    <para>The next package installed is Glibc. The most important considerations
    for building Glibc are the compiler, binary tools, and kernel headers. The
    compiler is generally not an issue since Glibc will always use the compiler
    relating to the <parameter>--host</parameter> parameter passed to its
    configure script; e.g. in our case, the compiler will be
    <command>i686-lfs-linux-gnu-gcc</command>. The binary tools and kernel
    headers can be a bit more complicated. Therefore, take no risks and use the
    available configure switches to enforce the correct selections. After the run
    of <command>configure</command>, check the contents of the
    <filename>config.make</filename> file in the <filename
    class="directory">glibc-build</filename> directory for all important details.
    Note the use of <parameter>CC="i686-lfs-gnu-gcc"</parameter> to control which
    binary tools are used and the use of the <parameter>-nostdinc</parameter> and
    <parameter>-isystem</parameter> flags to control the compiler's include
    search path.
    These items highlight an important aspect of the Glibc
    package—it is very self-sufficient in terms of its build machinery and
    generally does not rely on toolchain defaults.</para>

    <para>During the second pass of Binutils, we are able to utilize the
    <parameter>--with-lib-path</parameter> configure switch to control
    <command>ld</command>'s library search path.</para>

    <para>For the second pass of GCC, its sources also need to be modified to
    tell GCC to use the new dynamic linker. Failure to do so will result in the
    GCC programs themselves having the name of the dynamic linker from the host
    system's <filename class="directory">/lib</filename> directory embedded into
    them, which would defeat the goal of getting away from the host. From this
    point onwards, the core toolchain is self-contained and self-hosted. The
    remainder of the <xref linkend="chapter-temporary-tools"/> packages all build
    against the new Glibc in <filename
    class="directory">/tools</filename>.</para>

    <para>Upon entering the chroot environment in <xref
    linkend="chapter-building-system"/>, the first major package to be
    installed is Glibc, due to its self-sufficient nature mentioned above.
    Once this Glibc is installed into <filename
    class="directory">/usr</filename>, we will perform a quick changeover of the
    toolchain defaults, and then proceed in building the rest of the target
    LFS system.</para>

Added (r675606b):

    <para>The overall goal of this chapter and <xref
    linkend="chapter-temporary-tools"/> is to produce a temporary area that
    contains a known-good set of tools that can be isolated from the host system.
    By using <command>chroot</command>, the commands in the remaining chapters
    will be contained within that environment, ensuring a clean, trouble-free
    build of the target LFS system.
    The build process has been designed to
    minimize the risks for new readers and to provide the most educational value
    at the same time.</para>

    <para>The build process is based on the process of
    <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
    for building a compiler and its toolchain for a machine different from
    the one that is used for the build. This is not strictly needed for LFS,
    since the machine where the new system will run is the same as the one
    used for the build. But cross-compilation has the great advantage that
    anything that is cross-compiled cannot depend on the host environment.</para>

    <sect2 id="cross-compile" xreflabel="About Cross-Compilation">

      <title>About Cross-Compilation</title>

      <para>Cross-compilation involves some concepts that deserve a section on
      their own. Although this section may be omitted in a first reading, it
      is strongly suggested to come back to it later in order to get a full
      grasp of the build process.</para>

      <para>Let us first define some terms used in this context:</para>

      <variablelist>
        <varlistentry><term>build</term><listitem>
          <para>is the machine where we build programs. Note that this machine
          is referred to as the <quote>host</quote> in other
          sections.</para></listitem>
        </varlistentry>

        <varlistentry><term>host</term><listitem>
          <para>is the machine/system where the built programs will run. Note
          that this use of <quote>host</quote> is not the same as in other
          sections.</para></listitem>
        </varlistentry>

        <varlistentry><term>target</term><listitem>
          <para>is only used for compilers. It is the machine the compiler
          produces code for.
          It may be different from both build and
          host.</para></listitem>
        </varlistentry>

      </variablelist>

      <para>As an example, let us imagine the following scenario: we may have a
      compiler on a slow machine only, let's call the machine A, and the compiler
      ccA. We may also have a fast machine (B), but with no compiler, and we may
      want to produce code for another slow machine (C). Then, to build a
      compiler for machine C, we would have three stages:</para>

      <informaltable align="center">
        <tgroup cols="5">
          <colspec colnum="1" align="center"/>
          <colspec colnum="2" align="center"/>
          <colspec colnum="3" align="center"/>
          <colspec colnum="4" align="center"/>
          <colspec colnum="5" align="left"/>
          <thead>
            <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
            <entry>Target</entry><entry>Action</entry></row>
          </thead>
          <tbody>
            <row>
              <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
              <entry>build cross-compiler cc1 using ccA on machine A</entry>
            </row>
            <row>
              <entry>2</entry><entry>A</entry><entry>B</entry><entry>B</entry>
              <entry>build cross-compiler cc2 using cc1 on machine A</entry>
            </row>
            <row>
              <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
              <entry>build compiler ccC using cc2 on machine B</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>

      <para>Then, all the other programs needed by machine C can be compiled
      using cc2 on the fast machine B. Note that unless B can run programs
      produced for C, there is no way to test the built programs until machine
      C itself is running.
      For example, for testing ccC, we may want to add a
      fourth stage:</para>

      <informaltable align="center">
        <tgroup cols="5">
          <colspec colnum="1" align="center"/>
          <colspec colnum="2" align="center"/>
          <colspec colnum="3" align="center"/>
          <colspec colnum="4" align="center"/>
          <colspec colnum="5" align="left"/>
          <thead>
            <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
            <entry>Target</entry><entry>Action</entry></row>
          </thead>
          <tbody>
            <row>
              <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
              <entry>rebuild and test ccC using itself on machine C</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>

      <para>In the example above, only cc1 and cc2 are cross-compilers, that is,
      they produce code for a machine different from the one they are run on.
      The other compilers ccA and ccC produce code for the machine they are run
      on. Such compilers are called <emphasis>native</emphasis> compilers.</para>

    </sect2>

    <sect2 id="lfs-cross">
      <title>Implementation of Cross-Compilation for LFS</title>

      <note>
        <para>Almost all the build systems use names of the form
        cpu-vendor-kernel-os referred to as the machine triplet. An astute
        reader may wonder why a <quote>triplet</quote> refers to a four component
        name. The reason is history: initially, three component names were enough
        to designate unambiguously a machine, but with new machines and systems
        appearing, that proved insufficient. The word <quote>triplet</quote>
        remained. A simple way to determine your machine triplet is to run
        the <command>config.guess</command>
        script that comes with the source for many packages. Unpack the binutils
        sources and run the script: <userinput>./config.guess</userinput> and note
        the output. For example, for a 32-bit Intel processor the
        output will be <emphasis>i686-pc-linux-gnu</emphasis>.
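The four components of such a name can be picked apart with plain shell tools. The following is an illustrative sketch with a hard-coded example triplet, not a command from the book:

```shell
# Split a machine "triplet" into its cpu-vendor-kernel-os components.
# The value is a hard-coded example; on a real host it would come from
# running config.guess.
triplet=x86_64-pc-linux-gnu
IFS=- read -r cpu vendor kernel os <<EOF
$triplet
EOF
echo "cpu=$cpu vendor=$vendor kernel=$kernel os=$os"
# prints: cpu=x86_64 vendor=pc kernel=linux os=gnu
```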
        On a 64-bit
        system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>

        <para>Also be aware of the name of the platform's dynamic linker, often
        referred to as the dynamic loader (not to be confused with the standard
        linker <command>ld</command> that is part of binutils). The dynamic linker
        provided by Glibc finds and loads the shared libraries needed by a
        program, prepares the program to run, and then runs it. The name of the
        dynamic linker for a 32-bit Intel machine will be <filename
        class="libraryfile">ld-linux.so.2</filename> (<filename
        class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems). A
        sure-fire way to determine the name of the dynamic linker is to inspect a
        random binary from the host system by running: <userinput>readelf -l
        <name of binary> | grep interpreter</userinput> and noting the
        output. The authoritative reference covering all platforms is in the
        <filename>shlib-versions</filename> file in the root of the Glibc source
        tree.</para>
      </note>

      <para>In order to fake a cross compilation, the name of the host triplet
      is slightly adjusted by changing the "vendor" field in the
      <envar>LFS_TGT</envar> variable. We also use the
      <parameter>--with-sysroot</parameter> option when building the cross linker and
      cross compiler to tell them where to find the needed host files. This
      ensures that none of the other programs built in <xref
      linkend="chapter-temporary-tools"/> can link to libraries on the build
      machine.
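The vendor-field adjustment can be sketched as follows. The host triplet is hard-coded for illustration; the book derives <envar>LFS_TGT</envar> from the host machine itself rather than this way:

```shell
# Replace the second (vendor) field of the host triplet with "lfs".
# The host triplet is a hard-coded example here.
host_triplet=x86_64-pc-linux-gnu
LFS_TGT=$(echo "$host_triplet" | sed 's/^\([^-]*\)-[^-]*/\1-lfs/')
echo "$LFS_TGT"
# prints: x86_64-lfs-linux-gnu
```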
      Only two stages are mandatory, and one more for tests:</para>

      <informaltable align="center">
        <tgroup cols="5">
          <colspec colnum="1" align="center"/>
          <colspec colnum="2" align="center"/>
          <colspec colnum="3" align="center"/>
          <colspec colnum="4" align="center"/>
          <colspec colnum="5" align="left"/>
          <thead>
            <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
            <entry>Target</entry><entry>Action</entry></row>
          </thead>
          <tbody>
            <row>
              <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
              <entry>build cross-compiler cc1 using cc-pc on pc</entry>
            </row>
            <row>
              <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
              <entry>build compiler cc-lfs using cc1 on pc</entry>
            </row>
            <row>
              <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
              <entry>rebuild and test cc-lfs using itself on lfs</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>

      <para>In the above table, <quote>on pc</quote> means the commands are run
      on a machine using the already installed distribution. <quote>On
      lfs</quote> means the commands are run in a chrooted environment.</para>

      <para>Now, there is more about cross-compiling: the C language is not
      just a compiler, but also defines a standard library. In this book, the
      GNU C library, named glibc, is used. This library must
      be compiled for the lfs machine, that is, using the cross compiler cc1.
      But the compiler itself uses an internal library implementing complex
      instructions not available in the assembler instruction set. This
      internal library is named libgcc, and must be linked to the glibc
      library to be fully functional! Furthermore, the standard library for
      C++ (libstdc++) also needs to be linked to glibc.
      The solution
      to this chicken and egg problem is to first build a degraded cc1-based libgcc,
      lacking some functionalities such as threads and exception handling, then
      build glibc using this degraded compiler (glibc itself is not
      degraded), then build libstdc++. But this last library will lack the
      same functionalities as libgcc.</para>

      <para>This is not the end of the story: the conclusion of the preceding
      paragraph is that cc1 is unable to build a fully functional libstdc++, but
      this is the only compiler available for building the C/C++ libraries
      during stage 2! Of course, the compiler built during stage 2, cc-lfs,
      would be able to build those libraries, but (1) the build system of
      GCC does not know that it is usable on pc, and (2) using it on pc
      would be at risk of linking to the pc libraries, since cc-lfs is a native
      compiler. So we have to build libstdc++ later, in chroot.</para>

    </sect2>

    <sect2 id="other-details">

      <title>Other procedural details</title>

      <para>The cross-compiler will be installed in a separate <filename
      class="directory">$LFS/tools</filename> directory, since it will not
      be part of the final system.</para>

      <para>Binutils is installed first because the <command>configure</command>
      runs of both GCC and Glibc perform various feature tests on the assembler
      and linker to determine which software features to enable or disable. This
      is more important than one might first realize. An incorrectly configured
      GCC or Glibc can result in a subtly broken toolchain, where the impact of
      such breakage might not show up until near the end of the build of an
      entire distribution.
      A test suite failure will usually highlight this error
      before too much additional work is performed.</para>

      <para>Binutils installs its assembler and linker in two locations,
      <filename class="directory">$LFS/tools/bin</filename> and <filename
      class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one
      location are hard linked to the other. An important facet of the linker is
      its library search order. Detailed information can be obtained from
      <command>ld</command> by passing it the <parameter>--verbose</parameter>
      flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>
      will illustrate the current search paths and their order. It shows which
      files are linked by <command>ld</command> by compiling a dummy program and
      passing the <parameter>--verbose</parameter> switch to the linker. For
      example,
      <command>$LFS_TGT-gcc dummy.c -Wl,--verbose 2>&1 | grep succeeded</command>
      will show all the files successfully opened during the linking.</para>

      <para>The next package installed is GCC. An example of what can be
      seen during its run of <command>configure</command> is:</para>

      <screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/as
      checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>

      <para>This is important for the reasons mentioned above. It also
      demonstrates that GCC's configure script does not search the PATH
      directories to find which tools to use. However, during the actual
      operation of <command>gcc</command> itself, the same search paths are not
      necessarily used.
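The dummy-program technique can be tried on the host before any cross-toolchain exists. This sketch uses the host <command>gcc</command> rather than the <command>$LFS_TGT-</command> prefixed tools, and assumes a working native toolchain:

```shell
# Create a minimal program and ask the linker to report every file it
# successfully opens during the link. During the LFS build the same
# commands would be run with the $LFS_TGT- prefixed tools instead.
cat > dummy.c <<'EOF'
int main(void) { return 0; }
EOF
gcc dummy.c -Wl,--verbose 2>&1 | grep succeeded
# Show which ld binary gcc itself will invoke:
gcc -print-prog-name=ld
rm -f dummy.c a.out
```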
      To find out which standard linker <command>gcc</command>
      will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>.</para>

      <para>Detailed information can be obtained from <command>gcc</command> by
      passing it the <parameter>-v</parameter> command line option while compiling
      a dummy program. For example, <command>gcc -v dummy.c</command> will show
      detailed information about the preprocessor, compilation, and assembly
      stages, including <command>gcc</command>'s included search paths and their
      order.</para>

      <para>Next installed are sanitized Linux API headers. These allow the
      standard C library (Glibc) to interface with features that the Linux
      kernel will provide.</para>

      <para>The next package installed is Glibc. The most important
      considerations for building Glibc are the compiler, binary tools, and
      kernel headers. The compiler is generally not an issue since Glibc will
      always use the compiler relating to the <parameter>--host</parameter>
      parameter passed to its configure script; e.g. in our case, the compiler
      will be <command>$LFS_TGT-gcc</command>. The binary tools and kernel
      headers can be a bit more complicated. Therefore, take no risks and use
      the available configure switches to enforce the correct selections. After
      the run of <command>configure</command>, check the contents of the
      <filename>config.make</filename> file in the <filename
      class="directory">build</filename> directory for all important details.
      Note the use of <parameter>CC="$LFS_TGT-gcc"</parameter> (with
      <envar>$LFS_TGT</envar> expanded) to control which binary tools are used
      and the use of the <parameter>-nostdinc</parameter> and
      <parameter>-isystem</parameter> flags to control the compiler's include
      search path.
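The effect of <parameter>-nostdinc</parameter> and <parameter>-isystem</parameter> can be demonstrated on the host with a header that GCC itself provides. This is a sketch, not a command from the book; it assumes a working host gcc and uses <command>gcc -print-file-name=include</command> to locate GCC's private include directory:

```shell
# stddef.h is shipped in GCC's private include directory, so it is a
# convenient probe: -nostdinc hides it, -isystem brings it back.
cat > inc.c <<'EOF'
#include <stddef.h>
int main(void) { size_t n = 0; return (int)n; }
EOF
gcc -nostdinc -c inc.c 2>/dev/null || echo "stddef.h hidden by -nostdinc"
gcc -nostdinc -isystem "$(gcc -print-file-name=include)" -c inc.c
rm -f inc.c inc.o
```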
      These items highlight an important aspect of the Glibc
      package—it is very self-sufficient in terms of its build machinery
      and generally does not rely on toolchain defaults.</para>

      <para>As said above, the standard C++ library is compiled next, followed
      in Chapter 6 by all the other programs that need to be built. The install
      step of libstdc++ uses the <envar>DESTDIR</envar> variable to have the
      programs land into the LFS filesystem.</para>

      <para>In Chapter 7 the native lfs compiler is built. First binutils-pass2,
      with the same <envar>DESTDIR</envar> install as the other programs, is
      built, and then the second pass of GCC is constructed, omitting libstdc++
      and other non-important libraries. Due to some weird logic in GCC's
      configure script, <envar>CC_FOR_TARGET</envar> ends up as
      <command>cc</command> when the host is the same as the target, but is
      different from the build system. This is why
      <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is put explicitly into
      the configure options.</para>

      <para>Upon entering the chroot environment in <xref
      linkend="chapter-chroot-temporary-tools"/>, the first task is to install
      libstdc++. Then temporary installations of programs needed for the proper
      operation of the toolchain are performed. Programs needed for testing
      other programs are also built. From this point onwards, the
      core toolchain is self-contained and self-hosted. In
      <xref linkend="chapter-building-system"/>, final versions of all the
      packages needed for a fully functional system are built, tested and
      installed.</para>

    </sect2>

Context:

    </sect1>
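The <command>readelf</command> check described in both versions of the note can be exercised against any dynamically linked host binary; <filename>/bin/sh</filename> is assumed to be one here:

```shell
# Print the PT_INTERP entry naming the dynamic linker of /bin/sh.
readelf -l /bin/sh | grep interpreter
```

On a 64-bit glibc host the reported interpreter is typically ld-linux-x86-64.so.2, matching the note above.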