source: chapter05/toolchaintechnotes.xml@ 784dc13

Last change on this file since 784dc13 was 784dc13, checked in by Pierre Labastie <pieere@…>, 4 years ago

Fix a reference in toolchain notes

git-svn-id: http://svn.linuxfromscratch.org/LFS/branches/cross2@11899 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689

1<?xml version="1.0" encoding="ISO-8859-1"?>
2<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
4 <!ENTITY % general-entities SYSTEM "../general.ent">
5 %general-entities;
6]>
7
8<sect1 id="ch-tools-toolchaintechnotes">
9 <?dbhtml filename="toolchaintechnotes.html"?>
10
11 <title>Toolchain Technical Notes</title>
12
13 <para>This section explains some of the rationale and technical details
14 behind the overall build method. It is not essential to immediately
15 understand everything in this section. Most of this information will be
16 clearer after performing an actual build. This section can be referred
17 to at any time during the process.</para>
18
19 <para>The overall goal of <xref linkend="chapter-temporary-tools"/> is to
20 produce a temporary area that contains a known-good set of tools that can be
21 isolated from the host system. By using <command>chroot</command>, the
22 commands in the remaining chapters will be contained within that environment,
23 ensuring a clean, trouble-free build of the target LFS system. The build
24 process has been designed to minimize the risks for new readers and to provide
25 the most educational value at the same time.</para>
26
27 <para>The build process is based on the process of
28 <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
29 for building a compiler and its toolchain for a machine different from
30 the one that is used for the build. This is not strictly needed for LFS,
31 since the machine where the new system will run is the same as the one
32 used for the build. But cross-compilation has the great advantage that
33 anything that is cross-compiled cannot depend on the host environment.</para>
34
35 <sect2 id="cross-compile" xreflabel="About Cross-Compilation">
36
37 <title>About Cross-Compilation</title>
38
39 <para>Cross-compilation involves some concepts that deserve a section of
40 their own. Although this section may be omitted on a first reading,
41 coming back to it later is strongly recommended in order to get a full
42 grasp of the build process.</para>
43
44 <para>Let us first define some terms used in this context:</para>
45
46 <variablelist>
47 <varlistentry><term>build</term><listitem>
48 <para>is the machine where we build programs. Note that this machine
49 is referred to as the <quote>host</quote> in other
50 sections.</para></listitem>
51 </varlistentry>
52
53 <varlistentry><term>host</term><listitem>
54 <para>is the machine/system where the built programs will run. Note
55 that this use of <quote>host</quote> is not the same as in other
56 sections.</para></listitem>
57 </varlistentry>
58
59 <varlistentry><term>target</term><listitem>
60 <para>is only used for compilers. It is the machine the compiler
61 produces code for. It may be different from both build and
62 host.</para></listitem>
63 </varlistentry>
64
65 </variablelist>
66
67 <para>As an example, let us imagine the following scenario: we have a
68 compiler on a slow machine only; let us call the machine A and the compiler
69 ccA. We also have a fast machine (B), but with no compiler, and we
70 want to produce code for another slow machine (C). To build a
71 compiler for machine C, we would have three stages:</para>
72
73 <informaltable align="center">
74 <tgroup cols="5">
75 <colspec colnum="1" align="center"/>
76 <colspec colnum="2" align="center"/>
77 <colspec colnum="3" align="center"/>
78 <colspec colnum="4" align="center"/>
79 <colspec colnum="5" align="left"/>
80 <thead>
81 <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
82 <entry>Target</entry><entry>Action</entry></row>
83 </thead>
84 <tbody>
85 <row>
86 <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
87 <entry>build cross-compiler cc1 using ccA on machine A</entry>
88 </row>
89 <row>
90 <entry>2</entry><entry>A</entry><entry>B</entry><entry>C</entry>
91 <entry>build cross-compiler cc2 using cc1 on machine A</entry>
92 </row>
93 <row>
94 <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
95 <entry>build compiler ccC using cc2 on machine B</entry>
96 </row>
97 </tbody>
98 </tgroup>
99 </informaltable>
100
101 <para>Then, all the other programs needed by machine C can be compiled
102 using cc2 on the fast machine B. Note that unless B can run programs
103 produced for C, there is no way to test the built programs until machine
104 C itself is running. For example, for testing ccC, we may want to add a
105 fourth stage:</para>
106
107 <informaltable align="center">
108 <tgroup cols="5">
109 <colspec colnum="1" align="center"/>
110 <colspec colnum="2" align="center"/>
111 <colspec colnum="3" align="center"/>
112 <colspec colnum="4" align="center"/>
113 <colspec colnum="5" align="left"/>
114 <thead>
115 <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
116 <entry>Target</entry><entry>Action</entry></row>
117 </thead>
118 <tbody>
119 <row>
120 <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
121 <entry>rebuild and test ccC using itself on machine C</entry>
122 </row>
123 </tbody>
124 </tgroup>
125 </informaltable>
126
127 <para>In the example above, only cc1 and cc2 are cross-compilers, that is,
128 they produce code for a machine different from the one they are run on.
129 The other compilers ccA and ccC produce code for the machine they are run
130 on. Such compilers are called <emphasis>native</emphasis> compilers.</para>
131
132 </sect2>
133
134 <sect2 id="lfs-cross">
135 <title>Implementation of Cross-Compilation for LFS</title>
136
137 <note>
138 <para>Almost all build systems use names of the form
139 cpu-vendor-kernel-os, referred to as the machine triplet. An astute
140 reader may wonder why a <quote>triplet</quote> refers to a four-component
141 name. The reason is historical: initially, three-component names were enough
142 to designate a machine unambiguously, but with new machines and systems
143 appearing, that proved insufficient. The word <quote>triplet</quote>
144 remained. A simple way to determine your machine triplet is to run
145 the <command>config.guess</command>
146 script that comes with the source for many packages. Unpack the Binutils
147 sources and run the script: <userinput>./config.guess</userinput> and note
148 the output. For example, for a 32-bit Intel processor the
149 output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit
150 system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>
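
 <para>As a minimal sketch of how the four fields fit together, the
 following commands split a sample triplet with <command>cut</command>.
 The variable names and the hard-coded sample value are illustrative
 assumptions and are not part of the build instructions:</para>

<screen><userinput># Sample value only; substitute the output of ./config.guess.
triplet=x86_64-pc-linux-gnu
# Fields are separated by dashes: cpu-vendor-kernel-os.
cpu=$(echo "$triplet" | cut -d- -f1)
vendor=$(echo "$triplet" | cut -d- -f2)
kernel=$(echo "$triplet" | cut -d- -f3)
os=$(echo "$triplet" | cut -d- -f4)
echo "cpu=$cpu vendor=$vendor kernel=$kernel os=$os"</userinput></screen>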
151
152 <para>Also be aware of the name of the platform's dynamic linker, often
153 referred to as the dynamic loader (not to be confused with the standard
154 linker <command>ld</command> that is part of Binutils). The dynamic linker
155 provided by Glibc finds and loads the shared libraries needed by a
156 program, prepares the program to run, and then runs it. The name of the
157 dynamic linker for a 32-bit Intel machine will be <filename
158 class="libraryfile">ld-linux.so.2</filename> (<filename
159 class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems). A
160 sure-fire way to determine the name of the dynamic linker is to inspect a
161 random binary from the host system by running: <userinput>readelf -l
162 &lt;name of binary&gt; | grep interpreter</userinput> and noting the
163 output. The authoritative reference covering all platforms is in the
164 <filename>shlib-versions</filename> file in the root of the Glibc source
165 tree.</para>
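
 <para>As an illustration, the check can be run against
 <filename>/bin/sh</filename>, chosen here only as an example of a
 dynamically linked binary that is likely to exist on the host. The
 variable name is likewise an assumption made for this sketch:</para>

<screen><userinput># /bin/sh is an arbitrary example binary; any dynamically linked program works.
interp=$(readelf -l /bin/sh | grep interpreter)
# The matching line names the dynamic linker requested by the binary.
echo "$interp"</userinput></screen>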
166 </note>
167
168 <para>In order to fake a cross compilation, the name of the host triplet
169 is slightly adjusted by changing the &quot;vendor&quot; field in the
170 <envar>LFS_TGT</envar> variable. We also use the
171 <parameter>--with-sysroot</parameter> option when building the cross linker and
172 cross compiler, to tell them where to find the needed host files. This
173 ensures none of the other programs built in <xref
174 linkend="chapter-temporary-tools"/> can link to libraries on the build
175 machine. Only two stages are mandatory, and one more for tests:</para>
176
177 <informaltable align="center">
178 <tgroup cols="5">
179 <colspec colnum="1" align="center"/>
180 <colspec colnum="2" align="center"/>
181 <colspec colnum="3" align="center"/>
182 <colspec colnum="4" align="center"/>
183 <colspec colnum="5" align="left"/>
184 <thead>
185 <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
186 <entry>Target</entry><entry>Action</entry></row>
187 </thead>
188 <tbody>
189 <row>
190 <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
191 <entry>build cross-compiler cc1 using cc-pc on pc</entry>
192 </row>
193 <row>
194 <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
195 <entry>build compiler cc-lfs using cc1 on pc</entry>
196 </row>
197 <row>
198 <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
199 <entry>rebuild and test cc-lfs using itself on lfs</entry>
200 </row>
201 </tbody>
202 </tgroup>
203 </informaltable>
204
205 <para>In the above table, <quote>on pc</quote> means the commands are run
206 on a machine using the already installed distribution. <quote>On
207 lfs</quote> means the commands are run in a chrooted environment.</para>
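
 <para>The three stages can also be read in terms of the standard
 <parameter>--build</parameter>, <parameter>--host</parameter> and
 <parameter>--target</parameter> configure options. The following is
 only a sketch: the sample triplet, the <command>sed</command> expression
 and the variable names are illustrative assumptions, and the commands
 actually used in the book set <envar>LFS_TGT</envar> in their own way and
 pass further options:</para>

<screen><userinput># Hypothetical triplet of the installed distribution ("pc").
pc_triplet=x86_64-pc-linux-gnu
# Replace the vendor field to obtain the "lfs" triplet.
lfs_triplet=$(echo "$pc_triplet" | sed 's/-[^-]*/-lfs/')
# Print the build/host/target combination of each stage in the table.
echo "stage 1: --build=$pc_triplet --host=$pc_triplet --target=$lfs_triplet"
echo "stage 2: --build=$pc_triplet --host=$lfs_triplet --target=$lfs_triplet"
echo "stage 3: --build=$lfs_triplet --host=$lfs_triplet --target=$lfs_triplet"</userinput></screen>

 <para>With the sample value, <literal>lfs_triplet</literal> becomes
 x86_64-lfs-linux-gnu. Because the two triplets differ, stage 1 produces a
 cross-compiler and stage 2 is handled as a cross-compilation, even though
 everything runs on the same hardware.</para>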
208
209 <para>There is more to cross-compiling: a C implementation is more than
210 a compiler, since the language also defines a standard library. In this book, the
211 GNU C library, named glibc, is used. This library must
212 be compiled for the lfs machine, that is, using the cross compiler cc1.
213 But the compiler itself uses an internal library implementing complex
214 instructions not available in the assembler instruction set. This
215 internal library is named libgcc, and must be linked to the glibc
216 library to be fully functional. Furthermore, the standard library for
217 C++ (libstdc++) also needs to be linked to glibc. The solution
218 to this chicken-and-egg problem is to first build a degraded cc1+libgcc,
219 lacking some functionality such as threads and exception handling, then
220 build glibc using this degraded compiler (glibc itself is not
221 degraded), then build libstdc++. But this last library will lack the
222 same functionality as libgcc.</para>
223
224 <para>This is not the end of the story: the conclusion of the preceding
225 paragraph is that cc1 is unable to build a fully functional libstdc++, but
226 this is the only compiler available for building the C/C++ libraries
227 during stage 2! Of course, the compiler built during stage 2, cc-lfs,
228 would be able to build those libraries, but (i) the build system of
229 gcc does not know that it is usable on pc, and (ii) using it on pc
230 would risk linking to the pc libraries, since cc-lfs is a native
231 compiler. So we have to build libstdc++ later, in chroot.</para>
232
233 </sect2>
234
235 <sect2 id="other-details">
236
237 <title>Other procedural details</title>
238
239 <para>The cross-compiler will be installed in a separate <filename
240 class="directory">$LFS/tools</filename> directory, since it will not
241 be part of the final system.</para>
242
243 <para>Binutils is installed first because the <command>configure</command>
244 runs of both GCC and Glibc perform various feature tests on the assembler
245 and linker to determine which software features to enable or disable. This
246 is more important than one might first realize. An incorrectly configured
247 GCC or Glibc can result in a subtly broken toolchain, where the impact of
248 such breakage might not show up until near the end of the build of an
249 entire distribution. A test suite failure will usually highlight this error
250 before too much additional work is performed.</para>
251
252 <para>Binutils installs its assembler and linker in two locations,
253 <filename class="directory">$LFS/tools/bin</filename> and <filename
254 class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one
255 location are hard linked to the other. An important facet of the linker is
256 its library search order. Detailed information can be obtained from
257 <command>ld</command> by passing it the <parameter>--verbose</parameter>
258 flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>
259 will illustrate the current search paths and their order. To see which
260 files are linked by <command>ld</command>, compile a dummy program and
261 pass the <parameter>--verbose</parameter> switch to the linker. For
262 example,
263 <command>$LFS_TGT-gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep succeeded</command>
264 will show all the files successfully opened during the linking.</para>
265
266 <para>The next package installed is GCC. An example of what can be
267 seen during its run of <command>configure</command> is:</para>
268
269<screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/as
270checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>
271
272 <para>This is important for the reasons mentioned above. It also
273 demonstrates that GCC's configure script does not search the PATH
274 directories to find which tools to use. However, during the actual
275 operation of <command>gcc</command> itself, the same search paths are not
276 necessarily used. To find out which standard linker <command>gcc</command>
277 will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>.</para>
278
279 <para>Detailed information can be obtained from <command>gcc</command> by
280 passing it the <parameter>-v</parameter> command line option while compiling
281 a dummy program. For example, <command>gcc -v dummy.c</command> will show
282 detailed information about the preprocessor, compilation, and assembly
283 stages, including <command>gcc</command>'s included search paths and their
284 order.</para>
285
286 <para>Next installed are sanitized Linux API headers. These allow the
287 standard C library (Glibc) to interface with features that the Linux
288 kernel will provide.</para>
289
290 <para>The next package installed is Glibc. The most important
291 considerations for building Glibc are the compiler, binary tools, and
292 kernel headers. The compiler is generally not an issue since Glibc will
293 always use the compiler relating to the <parameter>--host</parameter>
294 parameter passed to its configure script; e.g. in our case, the compiler
295 will be <command>$LFS_TGT-gcc</command>. The binary tools and kernel
296 headers can be a bit more complicated. Therefore, take no risks and use
297 the available configure switches to enforce the correct selections. After
298 the run of <command>configure</command>, check the contents of the
299 <filename>config.make</filename> file in the <filename
300 class="directory">build</filename> directory for all important details.
301 Note the use of <parameter>CC="$LFS_TGT-gcc"</parameter> (with
302 <envar>$LFS_TGT</envar> expanded) to control which binary tools are used
303 and the use of the <parameter>-nostdinc</parameter> and
304 <parameter>-isystem</parameter> flags to control the compiler's include
305 search path. These items highlight an important aspect of the Glibc
306 package&mdash;it is very self-sufficient in terms of its build machinery
307 and generally does not rely on toolchain defaults.</para>
308
309 <para>As said above, the standard C++ library is compiled next, followed
310 by all the programs that require themselves in order to be built. The install step
311 uses the <envar>DESTDIR</envar> variable to have the programs land in
312 the LFS filesystem.</para>
313
314 <para>Then the native lfs compiler is built. First Binutils Pass 2, with
315 the same <envar>DESTDIR</envar> install as the other programs, then the
316 second pass of GCC, omitting libstdc++ and other unimportant libraries.
317 Due to some weird logic in GCC's configure script,
318 <envar>CC_FOR_TARGET</envar> ends up as <command>cc</command> when the host
319 is the same as the target but different from the build. This is why
320 <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is put explicitly into
321 the configure options.</para>
322
323 <para>Upon entering the chroot environment in <xref
324 linkend="chapter-chroot-temporary-tools"/>, the first task is to install
325 libstdc++. Then temporary installations of programs needed for the proper
326 operation of the toolchain are performed. Programs needed for testing
327 other programs are also built. From this point onwards, the
328 core toolchain is self-contained and self-hosted. In
329 <xref linkend="chapter-building-system"/>, final versions of all the
330 packages needed for a fully functional system are built, tested and
331 installed.</para>
332
333 </sect2>
334
335</sect1>