source: part3intro/toolchaintechnotes.xml@ 3c4e129

Last change on this file since 3c4e129 was 3c4e129, checked in by David Bryant <davidbryant@…>, 6 months ago

Make minor corrections to English idiom / style.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "" [
  <!ENTITY % general-entities SYSTEM "../general.ent">
  %general-entities;
]>
<sect1 id="ch-tools-toolchaintechnotes" xreflabel="Toolchain Technical Notes">
  <?dbhtml filename="toolchaintechnotes.html"?>
  <title>Toolchain Technical Notes</title>
  <para>This section explains some of the rationale and technical details
  behind the overall build method. Don't try to immediately
  understand everything in this section. Most of this information will be
  clearer after performing an actual build. Come back and re-read this section
  at any time during the build process.</para>
  <para>The overall goal of <xref linkend="chapter-cross-tools"/> and <xref
  linkend="chapter-temporary-tools"/> is to produce a temporary area
  containing a set of tools that are known to be good, and that are isolated from the host system.
  By using the <command>chroot</command> command, the compilations in the remaining chapters
  will be isolated within that environment, ensuring a clean, trouble-free
  build of the target LFS system. The build process has been designed to
  minimize the risks for new readers, and to provide the most educational value
  at the same time.</para>
  <para>This build process is based on
  <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
  to build a compiler and its associated toolchain for a machine different from
  the one that is used for the build. This is not strictly necessary for LFS,
  since the machine where the new system will run is the same as the one
  used for the build. But cross-compilation has one great advantage:
  anything that is cross-compiled cannot depend on the host environment.</para>
  <sect2 id="cross-compile" xreflabel="About Cross-Compilation">
    <title>About Cross-Compilation</title>
    <note>
      <para>
        The LFS book is not (and does not contain) a general tutorial on
        building a cross (or native) toolchain. Don't use the commands in the
        book to build a cross toolchain for any purpose other
        than building LFS, unless you really understand what you are doing.
      </para>
    </note>
    <para>Cross-compilation involves some concepts that deserve a section of
    their own. Although this section may be omitted on a first reading,
    coming back to it later will help you gain a fuller understanding of
    the process.</para>
    <para>Let us first define some terms used in this context.</para>
    <variablelist>
      <varlistentry><term>The build</term><listitem>
        <para>is the machine where we build programs. Note that this machine
        is also referred to as the <quote>host</quote>.</para></listitem>
      </varlistentry>
      <varlistentry><term>The host</term><listitem>
        <para>is the machine/system where the built programs will run. Note
        that this use of <quote>host</quote> is not the same as in other
        sections.</para></listitem>
      </varlistentry>
      <varlistentry><term>The target</term><listitem>
        <para>is only used for compilers. It is the machine the compiler
        produces code for. It may be different from both the build and
        the host.</para></listitem>
      </varlistentry>
    </variablelist>
    <para>As an example, let us imagine the following scenario (sometimes
    referred to as <quote>Canadian Cross</quote>): we have a
    compiler only on a slow machine; let's call the machine A and the compiler
    ccA. We also have a fast machine (B), but no compiler for it, and we
    want to produce code for a third, slow machine (C). We will build a
    compiler for machine C in three stages.</para>
    <informaltable align="center">
      <tgroup cols="5">
        <colspec colnum="1" align="center"/>
        <colspec colnum="2" align="center"/>
        <colspec colnum="3" align="center"/>
        <colspec colnum="4" align="center"/>
        <colspec colnum="5" align="left"/>
        <thead>
          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
               <entry>Target</entry><entry>Action</entry></row>
        </thead>
        <tbody>
          <row>
            <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
            <entry>Build cross-compiler cc1 using ccA on machine A.</entry>
          </row>
          <row>
            <entry>2</entry><entry>A</entry><entry>B</entry><entry>C</entry>
            <entry>Build cross-compiler cc2 using cc1 on machine A.</entry>
          </row>
          <row>
            <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
            <entry>Build compiler ccC using cc2 on machine B.</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>
    <para>Then, all the programs needed by machine C can be compiled
    using cc2 on the fast machine B. Note that unless B can run programs
    produced for C, there is no way to test the newly built programs until machine
    C itself is running. For example, to run a test suite on ccC, we may want to add a
    fourth stage:</para>
    <informaltable align="center">
      <tgroup cols="5">
        <colspec colnum="1" align="center"/>
        <colspec colnum="2" align="center"/>
        <colspec colnum="3" align="center"/>
        <colspec colnum="4" align="center"/>
        <colspec colnum="5" align="left"/>
        <thead>
          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
               <entry>Target</entry><entry>Action</entry></row>
        </thead>
        <tbody>
          <row>
            <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
            <entry>Rebuild and test ccC using ccC on machine C.</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>
    <para>In the example above, only cc1 and cc2 are cross-compilers, that is,
    they produce code for a machine different from the one they are run on.
    The other compilers ccA and ccC produce code for the machine they are run
    on. Such compilers are called <emphasis>native</emphasis> compilers.</para>
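    <para>As a rough sketch of how the three roles map onto a build command:
    autoconf-based packages accept <parameter>--build</parameter>,
    <parameter>--host</parameter>, and <parameter>--target</parameter> options
    on their <command>configure</command> script. The invocation below
    corresponds to stage 2 of the example. The three triplets are hypothetical
    placeholders for machines A, B, and C; they are not commands from this
    book, and the stage only works if the cross-compiler cc1 from stage 1 is
    already available in the <envar>PATH</envar>.</para>

<screen role="nodump"><userinput># Illustrative only: built on A, runs on B, produces code for C.
../configure --build=i686-pc-linux-gnu   \
             --host=x86_64-pc-linux-gnu  \
             --target=aarch64-unknown-linux-gnu</userinput></screen>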
  </sect2>
  <sect2 id="lfs-cross">
    <title>Implementation of Cross-Compilation for LFS</title>
    <note>
      <para>Almost all build systems use names of the form
      cpu-vendor-kernel-os, referred to as the machine triplet. (Sometimes,
      the vendor field is omitted.) An astute
      reader may wonder why a <quote>triplet</quote> refers to a four-component
      name. The reason is historical: initially, three-component names were enough
      to designate a machine unambiguously, but as new machines and systems
      proliferated, that proved insufficient. The word <quote>triplet</quote>
      remained. A simple way to determine your machine triplet is to run
      the <command>config.guess</command>
      script that comes with the source for many packages. Unpack the binutils
      sources and run the script: <userinput>./config.guess</userinput> and note
      the output. For example, for a 32-bit Intel processor the
      output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit
      system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>. On most
      Linux systems the even simpler <command>gcc -dumpmachine</command> command
      will give you similar information.</para>
      <para>You should also be aware of the name of the platform's dynamic linker, often
      referred to as the dynamic loader (not to be confused with the standard
      linker <command>ld</command> that is part of binutils). The dynamic linker
      provided by package glibc finds and loads the shared libraries needed by a
      program, prepares the program to run, and then runs it. The name of the
      dynamic linker for a 32-bit Intel machine is <filename
      class="libraryfile"></filename>; it's <filename
      class="libraryfile"></filename> on 64-bit systems. A
      sure-fire way to determine the name of the dynamic linker is to inspect a
      random binary from the host system by running: <userinput>readelf -l
      &lt;name of binary&gt; | grep interpreter</userinput> and noting the
      output. The authoritative reference covering all platforms is in the
      <filename>shlib-versions</filename> file in the root of the glibc source
      tree.</para>
    </note>
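    <para>As a quick illustration of the two checks described in the note
    above, the commands below print the triplet known to the host compiler and
    the dynamic linker requested by one binary. This is only a sketch:
    <filename>/bin/sh</filename> is an arbitrary choice, and it is assumed to
    be a dynamically linked ELF file on the host.</para>

<screen role="nodump"><userinput>gcc -dumpmachine
readelf -l /bin/sh | grep interpreter</userinput></screen>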
    <para>In order to fake a cross compilation in LFS, the name of the host triplet
    is slightly adjusted by changing the &quot;vendor&quot; field in the
    <envar>LFS_TGT</envar> variable so it says &quot;lfs&quot;. We also use the
    <parameter>--with-sysroot</parameter> option when building the cross linker and
    cross compiler to tell them where to find the needed host files. This
    ensures that none of the other programs built in <xref
    linkend="chapter-temporary-tools"/> can link to libraries on the build
    machine. Only two stages are mandatory, plus one more for tests.</para>
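    <para>For illustration, an adjusted triplet of this kind can be assembled
    from the CPU name reported by <command>uname</command>. This is a minimal
    sketch that assumes a glibc-based Linux build machine (so the kernel-os
    part is <literal>linux-gnu</literal>); the variable's actual definition
    belongs with the environment setup elsewhere in the book.</para>

<screen role="nodump"><userinput># Assumption: the kernel-os part of the triplet is linux-gnu.
LFS_TGT=$(uname -m)-lfs-linux-gnu
echo $LFS_TGT</userinput></screen>

    <para>Because the vendor field differs from the one reported by
    <command>gcc -dumpmachine</command> on the host, the build systems treat
    the two names as different machines.</para>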
    <informaltable align="center">
      <tgroup cols="5">
        <colspec colnum="1" align="center"/>
        <colspec colnum="2" align="center"/>
        <colspec colnum="3" align="center"/>
        <colspec colnum="4" align="center"/>
        <colspec colnum="5" align="left"/>
        <thead>
          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
               <entry>Target</entry><entry>Action</entry></row>
        </thead>
        <tbody>
          <row>
            <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
            <entry>Build cross-compiler cc1 using cc-pc on pc.</entry>
          </row>
          <row>
            <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
            <entry>Build compiler cc-lfs using cc1 on pc.</entry>
          </row>
          <row>
            <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
            <entry>Rebuild and test cc-lfs using cc-lfs on lfs.</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>
    <para>In the preceding table, <quote>on pc</quote> means the commands are run
    on a machine using the already installed distribution. <quote>On
    lfs</quote> means the commands are run in a chrooted environment.</para>
    <para>There is more to cross-compiling: the C language involves not
    just a compiler; it also defines a standard library. In this book, the
    GNU C library, named glibc, is used (there is an alternative, &quot;musl&quot;). This library must
    be compiled for the LFS machine; that is, using the cross compiler cc1.
    But the compiler itself uses an internal library implementing complex
    subroutines for functions not available in the assembler instruction set. This
    internal library is named libgcc, and it must be linked to the glibc
    library to be fully functional! Furthermore, the standard library for
    C++ (libstdc++) must also be linked with glibc. The solution to this
    chicken-and-egg problem is first to build a degraded cc1-based libgcc,
    lacking some functionality such as threads and exception handling, then
    to build glibc using this degraded compiler (glibc itself is not
    degraded), and finally to build libstdc++. This last library will lack some of the
    functionality of libgcc.</para>
    <para>This is not the end of the story: the upshot of the preceding
    paragraph is that cc1 is unable to build a fully functional libstdc++, but
    this is the only compiler available for building the C/C++ libraries
    during stage 2! Of course, the compiler built during stage 2, cc-lfs,
    would be able to build those libraries, but (1) the build system of
    gcc does not know that it is usable on pc, and (2) using it on pc
    would create a risk of linking to the pc libraries, since cc-lfs is a native
    compiler. So we have to re-build libstdc++ twice later on: as a part of
    gcc stage 2, and then again in the chroot environment (gcc stage 3).</para>
  </sect2>
  <sect2 id="other-details">
    <title>Other procedural details</title>
    <para>The cross-compiler will be installed in a separate <filename
    class="directory">$LFS/tools</filename> directory, since it will not
    be part of the final system.</para>
    <para>Binutils is installed first because the <command>configure</command>
    runs of both gcc and glibc perform various feature tests on the assembler
    and linker to determine which software features to enable or disable. This
    is more important than one might realize at first. An incorrectly configured
    gcc or glibc can result in a subtly broken toolchain, where the impact of
    such breakage might not show up until near the end of the build of an
    entire distribution. A test suite failure will usually highlight this error
    before too much additional work is performed.</para>
    <para>Binutils installs its assembler and linker in two locations,
    <filename class="directory">$LFS/tools/bin</filename> and <filename
    class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one
    location are hard linked to the other. An important facet of the linker is
    its library search order. Detailed information can be obtained from
    <command>ld</command> by passing it the <parameter>--verbose</parameter>
    flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>
    will illustrate the current search paths and their order. To see which
    files are linked by <command>ld</command>, compile a dummy program and
    pass the <parameter>--verbose</parameter> switch to the linker. For
    example,
    <command>$LFS_TGT-gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep succeeded</command>
    will show all the files successfully opened during the linking.</para>
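    <para>The two linker checks can be tried together as follows. This is a
    sketch only: it assumes <envar>LFS_TGT</envar> is set and that the cross
    toolchain, including glibc, is already installed so that the link step can
    succeed. The one-line <filename>dummy.c</filename> is a stand-in created
    just for this check.</para>

<screen role="nodump"><userinput>echo 'int main(){}' &gt; dummy.c
$LFS_TGT-ld --verbose | grep SEARCH
$LFS_TGT-gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep succeeded
rm -f dummy.c a.out</userinput></screen>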
    <para>The next package installed is gcc. An example of what can be
    seen during its run of <command>configure</command> is:</para>
<screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/as
checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>
    <para>This is important for the reasons mentioned above. It also
    demonstrates that gcc's configure script does not search the PATH
    directories to find which tools to use. However, during the actual
    operation of <command>gcc</command> itself, the same search paths are not
    necessarily used. To find out which standard linker <command>gcc</command>
    will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>.</para>
    <para>Detailed information can be obtained from <command>gcc</command> by
    passing it the <parameter>-v</parameter> command line option while compiling
    a dummy program. For example, <command>gcc -v dummy.c</command> will show
    detailed information about the preprocessor, compilation, and assembly
    stages, including <command>gcc</command>'s include search paths and their
    order.</para>
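    <para>Since the <parameter>-v</parameter> output is long, it can help to
    cut it down to the include search list. The following is a sketch that
    relies on the marker lines <command>gcc</command> prints around that list;
    <filename>dummy.c</filename> is again a throwaway file created for the
    purpose.</para>

<screen role="nodump"><userinput>echo 'int main(){}' &gt; dummy.c
gcc -v dummy.c 2&gt;&amp;1 | sed -n '/search starts here/,/End of search list/p'
rm -f dummy.c a.out</userinput></screen>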
    <para>Next installed are sanitized Linux API headers. These allow the
    standard C library (glibc) to interface with features that the Linux
    kernel will provide.</para>
    <para>The next package installed is glibc. The most important
    considerations for building glibc are the compiler, binary tools, and
    kernel headers. The compiler is generally not an issue since glibc will
    always use the compiler relating to the <parameter>--host</parameter>
    parameter passed to its configure script; e.g. in our case, the compiler
    will be <command>$LFS_TGT-gcc</command>. The binary tools and kernel
    headers can be a bit more complicated. Therefore, we take no risks and use
    the available configure switches to enforce the correct selections. After
    the run of <command>configure</command>, check the contents of the
    <filename>config.make</filename> file in the <filename
    class="directory">build</filename> directory for all important details.
    Note the use of <parameter>CC="$LFS_TGT-gcc"</parameter> (with
    <envar>$LFS_TGT</envar> expanded) to control which binary tools are used
    and the use of the <parameter>-nostdinc</parameter> and
    <parameter>-isystem</parameter> flags to control the compiler's include
    search path. These items highlight an important aspect of the glibc
    package&mdash;it is very self-sufficient in terms of its build machinery
    and generally does not rely on toolchain defaults.</para>
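    <para>One possible way to pull those settings out of
    <filename>config.make</filename> is sketched below, run from the glibc
    <filename class="directory">build</filename> directory. The variable names
    <literal>CC</literal> and <literal>SYSINCLUDES</literal> are assumptions
    about how the file is laid out; if they do not match your glibc version,
    read the file directly instead.</para>

<screen role="nodump"><userinput># Assumed variable names; adjust if config.make differs.
grep -E '^(CC|SYSINCLUDES) ' config.make</userinput></screen>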
    <para>As mentioned above, the standard C++ library is compiled next, followed in
    <xref linkend="chapter-temporary-tools"/> by other programs that need
    to be cross compiled for breaking circular dependencies at build time.
    The install step of all those packages uses the
    <envar>DESTDIR</envar> variable to force installation
    in the LFS filesystem.</para>
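    <para>In its simplest form, such an install step might look like the
    following. This is a sketch, assuming a package that has already been
    configured and built, and that <envar>LFS</envar> holds the mount point of
    the LFS filesystem. Every installed path is then prefixed with that
    directory instead of landing on the build machine.</para>

<screen role="nodump"><userinput>make DESTDIR=$LFS install</userinput></screen>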
    <para>At the end of <xref linkend="chapter-temporary-tools"/> the native
    LFS compiler is installed. First binutils-pass2 is built and installed
    in the same <envar>DESTDIR</envar> directory as the other programs;
    then the second pass of gcc is built, omitting some
    non-critical libraries. Due to some weird logic in gcc's
    configure script, <envar>CC_FOR_TARGET</envar> ends up as
    <command>cc</command> when the host is the same as the target, but
    different from the build system. This is why
    <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is declared explicitly
    as one of the configuration options.</para>
    <para>Upon entering the chroot environment in <xref
    linkend="chapter-chroot-temporary-tools"/>,
    the temporary installations of programs needed for the proper
    operation of the toolchain are performed. From this point onwards, the
    core toolchain is self-contained and self-hosted. In
    <xref linkend="chapter-building-system"/>, final versions of all the
    packages needed for a fully functional system are built, tested and
    installed.</para>
  </sect2>
</sect1>