source: part3intro/toolchaintechnotes.xml@ 12fff1e

Last change on this file since 12fff1e was 12fff1e, checked in by Pierre Labastie <pieere@…>, 18 months ago

Slightly change the layout in part III, so that the preliminary material
appear separated. Minor rewrites for accounting for the new layout

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@11949 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
  <!ENTITY % general-entities SYSTEM "../general.ent">
  %general-entities;
]>

<sect1 id="ch-tools-toolchaintechnotes">
  <?dbhtml filename="toolchaintechnotes.html"?>

  <title>Toolchain Technical Notes</title>

  <para>This section explains some of the rationale and technical details
  behind the overall build method. It is not essential to immediately
  understand everything in this section. Most of this information will be
  clearer after performing an actual build. This section can be referred
  to at any time during the process.</para>

  <para>The overall goal of this chapter and <xref
  linkend="chapter-temporary-tools"/> is to produce a temporary area that
  contains a known-good set of tools that can be isolated from the host system.
  By using <command>chroot</command>, the commands in the remaining chapters
  will be contained within that environment, ensuring a clean, trouble-free
  build of the target LFS system. The build process has been designed to
  minimize the risks for new readers and to provide the most educational value
  at the same time.</para>

  <para>The build process is based on the process of
  <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
  for building a compiler and its toolchain for a machine different from
  the one that is used for the build. This is not strictly needed for LFS,
  since the machine where the new system will run is the same as the one
  used for the build. But cross-compilation has the great advantage that
  anything that is cross-compiled cannot depend on the host environment.</para>

  <sect2 id="cross-compile" xreflabel="About Cross-Compilation">

    <title>About Cross-Compilation</title>

    <para>Cross-compilation involves some concepts that deserve a section of
    their own. Although this section may be omitted on a first reading,
    coming back to it later is strongly recommended in order to get a full
    grasp of the build process.</para>

    <para>Let us first define some terms used in this context:</para>

    <variablelist>
      <varlistentry><term>build</term><listitem>
        <para>is the machine where we build programs. Note that this machine
        is referred to as the <quote>host</quote> in other
        sections.</para></listitem>
      </varlistentry>

      <varlistentry><term>host</term><listitem>
        <para>is the machine/system where the built programs will run. Note
        that this use of <quote>host</quote> is not the same as in other
        sections.</para></listitem>
      </varlistentry>

      <varlistentry><term>target</term><listitem>
        <para>is only used for compilers. It is the machine the compiler
        produces code for. It may be different from both build and
        host.</para></listitem>
      </varlistentry>

    </variablelist>

    <para>As an example, let us imagine the following scenario: we have a
    compiler only on a slow machine (call the machine A and the compiler
    ccA). We also have a fast machine (B) with no compiler, and we want to
    produce code for another slow machine (C). To build a compiler for
    machine C, we would need three stages:</para>

    <informaltable align="center">
      <tgroup cols="5">
        <colspec colnum="1" align="center"/>
        <colspec colnum="2" align="center"/>
        <colspec colnum="3" align="center"/>
        <colspec colnum="4" align="center"/>
        <colspec colnum="5" align="left"/>
        <thead>
          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
               <entry>Target</entry><entry>Action</entry></row>
        </thead>
        <tbody>
          <row>
            <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
            <entry>build cross-compiler cc1 using ccA on machine A</entry>
          </row>
          <row>
            <entry>2</entry><entry>A</entry><entry>B</entry><entry>C</entry>
            <entry>build cross-compiler cc2 using cc1 on machine A</entry>
          </row>
          <row>
            <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
            <entry>build compiler ccC using cc2 on machine B</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>

    <para>Then, all the other programs needed by machine C can be compiled
    using cc2 on the fast machine B. Note that unless B can run programs
    produced for C, there is no way to test the built programs until machine
    C itself is running. For example, for testing ccC, we may want to add a
    fourth stage:</para>

    <informaltable align="center">
      <tgroup cols="5">
        <colspec colnum="1" align="center"/>
        <colspec colnum="2" align="center"/>
        <colspec colnum="3" align="center"/>
        <colspec colnum="4" align="center"/>
        <colspec colnum="5" align="left"/>
        <thead>
          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
               <entry>Target</entry><entry>Action</entry></row>
        </thead>
        <tbody>
          <row>
            <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
            <entry>rebuild and test ccC using itself on machine C</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>

    <para>In the example above, only cc1 and cc2 are cross-compilers, that is,
    they produce code for a machine different from the one they are run on.
    The other compilers ccA and ccC produce code for the machine they are run
    on. Such compilers are called <emphasis>native</emphasis> compilers.</para>
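
    <para>The distinction between native and cross-compilers can be sketched
    in a few lines of shell. This is only an illustration: the function name
    <command>classify</command> is made up for this example, and cc2 is taken
    to run on machine B and produce code for machine C, as the prose
    describes.</para>

```shell
# classify HOST TARGET (hypothetical helper, not part of any package):
# a compiler is native when it produces code for the machine it runs on,
# and a cross-compiler otherwise.
classify() {
  if [ "$1" = "$2" ]; then
    echo native
  else
    echo cross
  fi
}

# Entries are name:host:target for the four compilers of the example.
for entry in ccA:A:A cc1:A:B cc2:B:C ccC:C:C; do
  name=${entry%%:*}
  machines=${entry#*:}
  host=${machines%%:*}
  target=${machines#*:}
  echo "$name runs on $host, produces code for $target: $(classify "$host" "$target")"
done
```

    <para>Note that the build machine does not enter into the test: whether a
    compiler is native or cross depends only on where it runs and what it
    produces code for.</para>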

  </sect2>

  <sect2 id="lfs-cross">
    <title>Implementation of Cross-Compilation for LFS</title>

    <note>
      <para>Almost all build systems use names of the form
      cpu-vendor-kernel-os, referred to as the machine triplet. An astute
      reader may wonder why a <quote>triplet</quote> refers to a four-component
      name. The reason is historical: initially, three-component names were
      enough to designate a machine unambiguously, but with new machines and
      systems appearing, that proved insufficient. The word
      <quote>triplet</quote> remained. A simple way to determine your machine
      triplet is to run the <command>config.guess</command> script that comes
      with the source for many packages. Unpack the binutils sources, run the
      script with <userinput>./config.guess</userinput>, and note the output.
      For example, for a 32-bit Intel processor the output will be
      <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit system it will be
      <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>

      <para>Also be aware of the name of the platform's dynamic linker, often
      referred to as the dynamic loader (not to be confused with the standard
      linker <command>ld</command> that is part of binutils). The dynamic linker
      provided by Glibc finds and loads the shared libraries needed by a
      program, prepares the program to run, and then runs it. The name of the
      dynamic linker for a 32-bit Intel machine will be <filename
      class="libraryfile">ld-linux.so.2</filename> (<filename
      class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems). A
      sure-fire way to determine the name of the dynamic linker is to inspect a
      random binary from the host system by running: <userinput>readelf -l
      &lt;name of binary&gt; | grep interpreter</userinput> and noting the
      output. The authoritative reference covering all platforms is in the
      <filename>shlib-versions</filename> file in the root of the Glibc source
      tree.</para>
    </note>
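
    <para>As a rough illustration of the note above, the following shell
    sketch splits a four-component triplet into its fields and reports the
    dynamic linker name for the two cases mentioned. The function names are
    invented for this example, and only the two Intel cases named in the note
    are covered; anything else is deliberately reported as unknown.</para>

```shell
# split_triplet TRIPLET (hypothetical helper): print the four fields of a
# cpu-vendor-kernel-os name. The body runs in a subshell so that the
# changes to IFS and globbing do not leak into the caller.
split_triplet() (
  IFS=-
  set -f
  set -- $1
  echo "cpu=$1 vendor=$2 kernel=$3 os=$4"
)

# dynamic_linker_for TRIPLET (hypothetical helper): map a triplet to the
# dynamic linker names given in the note for 32-bit and 64-bit Intel.
dynamic_linker_for() {
  case $1 in
    i?86-*)   echo ld-linux.so.2 ;;
    x86_64-*) echo ld-linux-x86-64.so.2 ;;
    *)        echo "unknown: see shlib-versions in the Glibc source tree" ;;
  esac
}

split_triplet x86_64-pc-linux-gnu
dynamic_linker_for x86_64-pc-linux-gnu
```

    <para>On a real system, the output of <command>config.guess</command> and
    of <command>readelf</command> remains the reference; the sketch only shows
    how the pieces of the name relate to each other.</para>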

    <para>In order to fake a cross-compilation, the name of the host triplet
    is slightly adjusted by changing the &quot;vendor&quot; field in the
    <envar>LFS_TGT</envar> variable. We also use the
    <parameter>--with-sysroot</parameter> option when building the cross-linker
    and cross-compiler to tell them where to find the needed host files. This
    ensures that none of the other programs built in <xref
    linkend="chapter-temporary-tools"/> can link to libraries on the build
    machine. Only two stages are mandatory; a third is used for tests:</para>

    <informaltable align="center">
      <tgroup cols="5">
        <colspec colnum="1" align="center"/>
        <colspec colnum="2" align="center"/>
        <colspec colnum="3" align="center"/>
        <colspec colnum="4" align="center"/>
        <colspec colnum="5" align="left"/>
        <thead>
          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
               <entry>Target</entry><entry>Action</entry></row>
        </thead>
        <tbody>
          <row>
            <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
            <entry>build cross-compiler cc1 using cc-pc on pc</entry>
          </row>
          <row>
            <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
            <entry>build compiler cc-lfs using cc1 on pc</entry>
          </row>
          <row>
            <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
            <entry>rebuild and test cc-lfs using itself on lfs</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>

    <para>In the above table, <quote>on pc</quote> means the commands are run
    on a machine using the already installed distribution. <quote>On
    lfs</quote> means the commands are run in a chrooted environment.</para>
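
    <para>The vendor-field adjustment that makes the pc and lfs triplets
    differ can be sketched as follows. This is an illustration only: the
    helper name <command>with_lfs_vendor</command> and the variable
    <envar>tgt</envar> are made up here, and the book sets
    <envar>LFS_TGT</envar> itself elsewhere.</para>

```shell
# with_lfs_vendor TRIPLET (hypothetical helper): replace the second
# (vendor) field of a cpu-vendor-kernel-os name with "lfs".
with_lfs_vendor() {
  echo "$1" | sed 's/^\([^-]*\)-[^-]*-/\1-lfs-/'
}

tgt=$(with_lfs_vendor x86_64-pc-linux-gnu)
echo "$tgt"
```

    <para>Because the resulting name differs from the one reported for the
    build machine, configure scripts treat the build as a cross-compilation
    even though the hardware is the same.</para>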

    <para>There is more to cross-compiling than the compiler itself: the C
    language also defines a standard library. In this book, the
    GNU C library, named glibc, is used. This library must
    be compiled for the lfs machine, that is, using the cross-compiler cc1.
    But the compiler itself uses an internal library implementing complex
    instructions not available in the assembler instruction set. This
    internal library is named libgcc, and it must be linked to the glibc
    library to be fully functional. Furthermore, the standard library for
    C++ (libstdc++) also needs to be linked to glibc. The solution
    to this chicken-and-egg problem is to first build a degraded, cc1-based
    libgcc, lacking some functionality such as threads and exception handling,
    then build glibc using this degraded compiler (glibc itself is not
    degraded), then build libstdc++. But this last library will lack the
    same functionality as libgcc.</para>

    <para>This is not the end of the story: the conclusion of the preceding
    paragraph is that cc1 is unable to build a fully functional libstdc++, but
    this is the only compiler available for building the C/C++ libraries
    during stage 2. Of course, the compiler built during stage 2, cc-lfs,
    would be able to build those libraries, but (1) the build system of
    GCC does not know that it is usable on pc, and (2) using it on pc
    would risk linking to the pc libraries, since cc-lfs is a native
    compiler. So we have to build libstdc++ later, in chroot.</para>

  </sect2>

  <sect2 id="other-details">

    <title>Other procedural details</title>

    <para>The cross-compiler will be installed in a separate <filename
    class="directory">$LFS/tools</filename> directory, since it will not
    be part of the final system.</para>

    <para>Binutils is installed first because the <command>configure</command>
    runs of both GCC and Glibc perform various feature tests on the assembler
    and linker to determine which software features to enable or disable. This
    is more important than one might first realize. An incorrectly configured
    GCC or Glibc can result in a subtly broken toolchain, where the impact of
    such breakage might not show up until near the end of the build of an
    entire distribution. A test suite failure will usually highlight this error
    before too much additional work is performed.</para>

    <para>Binutils installs its assembler and linker in two locations,
    <filename class="directory">$LFS/tools/bin</filename> and <filename
    class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one
    location are hard linked to the other. An important facet of the linker is
    its library search order. Detailed information can be obtained from
    <command>ld</command> by passing it the <parameter>--verbose</parameter>
    flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>
    will illustrate the current search paths and their order. To see which
    files <command>ld</command> links, compile a dummy program and
    pass the <parameter>--verbose</parameter> switch to the linker. For
    example,
    <command>$LFS_TGT-gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep succeeded</command>
    will show all the files successfully opened during the linking.</para>
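
    <para>The search path output is easier to read with one directory per
    line. The following filter is a minimal sketch, assuming the linker
    prints its search path as a series of SEARCH_DIR("...") entries, with a
    leading = on sysroot-relative directories. The function name
    <command>search_dirs</command> is made up for this example, and
    <command>grep -o</command> is assumed to be available, as it is with GNU
    grep.</para>

```shell
# search_dirs (hypothetical helper): read linker --verbose output on
# standard input and print each SEARCH_DIR entry on its own line, in
# order, dropping the wrapper syntax and any leading "=" marker.
search_dirs() {
  grep -o 'SEARCH_DIR("[^"]*")' | sed 's/^SEARCH_DIR("=\{0,1\}//; s/")$//'
}

# Intended use once the cross linker is installed (not run here):
#   $LFS_TGT-ld --verbose | search_dirs
```

    <para>The order of the lines is the order in which the linker searches
    the directories, which is the property of interest here.</para>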

    <para>The next package installed is GCC. An example of what can be
    seen during its run of <command>configure</command> is:</para>

<screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/as
checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>

    <para>This is important for the reasons mentioned above. It also
    demonstrates that GCC's configure script does not search the PATH
    directories to find which tools to use. However, during the actual
    operation of <command>gcc</command> itself, the same search paths are not
    necessarily used. To find out which standard linker <command>gcc</command>
    will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>.</para>

    <para>Detailed information can be obtained from <command>gcc</command> by
    passing it the <parameter>-v</parameter> command line option while compiling
    a dummy program. For example, <command>gcc -v dummy.c</command> will show
    detailed information about the preprocessor, compilation, and assembly
    stages, including <command>gcc</command>'s include search paths and their
    order.</para>

    <para>Next installed are sanitized Linux API headers. These allow the
    standard C library (Glibc) to interface with features that the Linux
    kernel will provide.</para>

    <para>The next package installed is Glibc. The most important
    considerations for building Glibc are the compiler, binary tools, and
    kernel headers. The compiler is generally not an issue since Glibc will
    always use the compiler relating to the <parameter>--host</parameter>
    parameter passed to its configure script; e.g. in our case, the compiler
    will be <command>$LFS_TGT-gcc</command>. The binary tools and kernel
    headers can be a bit more complicated. Therefore, take no risks and use
    the available configure switches to enforce the correct selections. After
    the run of <command>configure</command>, check the contents of the
    <filename>config.make</filename> file in the <filename
    class="directory">build</filename> directory for all important details.
    Note the use of <parameter>CC="$LFS_TGT-gcc"</parameter> (with
    <envar>$LFS_TGT</envar> expanded) to control which binary tools are used
    and the use of the <parameter>-nostdinc</parameter> and
    <parameter>-isystem</parameter> flags to control the compiler's include
    search path. These items highlight an important aspect of the Glibc
    package: it is very self-sufficient in terms of its build machinery
    and generally does not rely on toolchain defaults.</para>

    <para>As said above, the standard C++ library is compiled next, followed in
    Chapter 6 by all the programs that need themselves in order to be built.
    The install step of libstdc++ uses the <envar>DESTDIR</envar> variable to
    have the programs land in the LFS filesystem.</para>

    <para>In Chapter 7 the native lfs compiler is built. First binutils-pass2
    is built, with the same <envar>DESTDIR</envar> install as the other
    programs, and then the second pass of GCC is constructed, omitting libstdc++
    and other unimportant libraries. Due to some weird logic in GCC's
    configure script, <envar>CC_FOR_TARGET</envar> ends up as
    <command>cc</command> when the host is the same as the target but
    different from the build system. This is why
    <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is put explicitly into
    the configure options.</para>

    <para>Upon entering the chroot environment in <xref
    linkend="chapter-chroot-temporary-tools"/>, the first task is to install
    libstdc++. Then temporary installations of programs needed for the proper
    operation of the toolchain are performed. Programs needed for testing
    other programs are also built. From this point onwards, the
    core toolchain is self-contained and self-hosted. In
    <xref linkend="chapter-building-system"/>, final versions of all the
    packages needed for a fully functional system are built, tested and
    installed.</para>

  </sect2>

</sect1>