Timestamp:
09/18/2023 06:59:45 PM (9 months ago)
Author:
Xi Ruoyao <xry111@…>
Branches:
12.1, ken/TL2024, ken/tuningfonts, lazarus, plabs/newcss, python3.11, rahul/power-profiles-daemon, renodr/vulkan-addition, trunk, xry111/llvm18
Children:
33c1959a
Parents:
a8a3c9d8
git-author:
Xi Ruoyao <xry111@…> (09/18/2023 06:17:29 PM)
git-committer:
Xi Ruoyao <xry111@…> (09/18/2023 06:59:45 PM)
Message:

building-notes: Note how to use cgroup for limiting resource usage

We were saying "-jN means using N cores (or N threads)". This is
completely wrong. "-jN" only tells the build system to run N jobs
simultaneously, but each job can start its own subprocesses or threads,
and there is no way for the build system to know how many subprocesses
or threads a job will start.

This caused a lot of misunderstanding and encouraged users to wrongly
blame build systems.

Fix the description of -jN, and describe how to use a cgroup to control
the usage of CPU cores and system RAM.

On a systemd-based system, systemd is the cgroup manager and manually
operating on cgroups may confuse systemd, so use systemd-run to create
and set up the cgroup. On a SysV-based system, create and set up the
cgroup manually.

Location:
introduction/important
Files:
1 added
1 edited

  • introduction/important/building-notes.xml

    ra8a3c9d8 rb6d54494  
    201201    compilation time for a package can be reduced by performing a "parallel
    202202    make" by either setting an environment variable or telling the make program
    203     how many processors are available. For instance, a Core2Duo can support two
    204     simultaneous processes with: </para>
    205 
    206     <screen><userinput>export MAKEFLAGS='-j2'</userinput></screen>
     203    to simultaneously execute multiple jobs.</para>
     204
     205    <para>For instance, an Intel Core i9-13900K CPU contains 8 performance
     206    (P) cores and 16 efficiency (E) cores, and the P cores support SMT
     207    (Simultaneous MultiThreading, also known as
     208    <quote>Hyper-Threading</quote>) so each P core can run two threads
     209    simultaneously and the Linux kernel will treat each P core as two
     210    logical cores.  As a result, there are 32 logical cores in total.
     211    To utilize all these logical cores when running <command>make</command>, we
     212    can set an environment variable to tell <command>make</command> to
     213    run 32 jobs simultaneously:</para>
     214
     215    <screen><userinput>export MAKEFLAGS='-j32'</userinput></screen>
    207216
    208217    <para>or just building with:</para>
    209218
    210     <screen><userinput>make -j2</userinput></screen>
     219    <screen><userinput>make -j32</userinput></screen>
    211220
    212221    <para>
     
    215224    </para>
    216225
    217     <screen><userinput>export NINJAJOBS=2</userinput></screen>
     226    <screen><userinput>export NINJAJOBS=32</userinput></screen>
    218227
    219228    <para>
     
    221230    </para>
    222231
    223     <screen><userinput>ninja -j2</userinput></screen>
    224 
    225     <para>
    226       but for ninja, the default number of jobs is N + 2, if
    227       the number of logical processors N is greater than 2; or N + 1 if
     232    <screen><userinput>ninja -j32</userinput></screen>
     233
     234    <para>
     235      If you are not sure about the number of logical cores, run the
     236      <command>nproc</command> command.
     237    </para>
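    <para>
      For example, on the Intel Core i9-13900K mentioned above,
      <command>nproc</command> would print 32; the exact value depends on
      your CPU, so the output below is only illustrative:
    </para>

    <screen role="nodump"><userinput>nproc</userinput>
<computeroutput>32</computeroutput></screen>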
     238
     239    <para>
     240      For <command>make</command>, the default number of jobs is 1.  But
     241      for <command>ninja</command>, the default number of jobs is N + 2 if
     242      the number of logical cores N is greater than 2; or N + 1 if
    228243      N is 1 or 2.  The reason to use a number of jobs slightly greater
    229       than the number of logical processors is keeping all logical
     244      than the number of logical cores is keeping all logical
    230245      processors busy even if some jobs are performing I/O operations.
     246    </para>
     247
     248    <para>
     249      Note that the <option>-j</option> switch only limits the parallel
     250      jobs started by <command>make</command> or <command>ninja</command>,
     251      but each job may still spawn its own processes or threads.  For
     252      example, <command>ld.gold</command> will use multiple threads for
     253      linking, and some package tests can spawn multiple threads for
     254      testing thread safety properties.  There is no generic way for the
     255      build system to know the number of processes or threads spawned by
     256      a job.  So generally we should not consider the value passed with
     257      <option>-j</option> a hard limit on the number of logical cores to
     258      use.  Read <xref linkend='build-in-cgroup'/> if you want to set such
     259      a hard limit.
    231260    </para>
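    <para>
      As a purely hypothetical illustration (the archive name is made up):
      if a Makefile rule invokes a multi-threaded command such as
      <command>xz</command>, even a single <command>make</command> job can
      keep every logical core busy, so <option>-j1</option> would not limit
      the build to one core:
    </para>

    <screen role="nodump"><userinput># -T0 starts one compression thread per logical core,
# regardless of the -j value passed to make
xz -T0 -k big-archive.tar</userinput></screen>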
    232261
     
    249278    will override the similar setting in the <envar>MAKEFLAGS</envar>
    250279    environment variable.</para>
    251 <!-- outdated
    252     <note><para>When running the package tests or the install portion of the
    253     package build process, we do not recommend using an option greater than
    254     '-j1' unless specified otherwise.  The installation procedures or checks
    255     have not been validated using parallel procedures and may fail with issues
    256     that are difficult to debug.</para></note>
    257 -->
     280
    258281    <important>
    259282      <para>
     
    272295      </para>
    273296    </important>
     297  </sect2>
     298
     299  <sect2 id="build-in-cgroup">
     300    <title>Use Linux Control Group to Limit the Resource Usage</title>
     301
     302    <para>
     303      Sometimes we want to limit the resource usage when we build a
     304      package.  For example, when we have 8 logical cores, we may want
     305      to use only 6 cores for building the package and reserve the other
     306      2 cores for playing a movie.  The Linux kernel provides a feature
     307      called control groups (cgroup) for such a need.
     308    </para>
     309
     310    <para>
     311      Enable control group support in the kernel configuration, then rebuild the
     312      kernel and reboot if necessary:
     313    </para>
     314
     315    <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
     316      href="cgroup-kernel.xml"/>
     317
     318    <!-- We need cgroup2 mounted at /sys/fs/cgroup.  It's done by
     319         systemd itself in LFS systemd, mountvirtfs script in LFS sysv. -->
     320
     321    <para revision='systemd'>
     322      Ensure <xref linkend='systemd'/> and <xref linkend='shadow'/> have
     323      been rebuilt with <xref linkend='linux-pam'/> support (if you are
     324      interacting via an SSH or graphical session, also ensure the
     325      <xref linkend='openssh'/> server or the desktop manager has been
     326      built with <xref linkend='linux-pam'/>).  As the &root; user, create
     327      a configuration file to allow resource control without &root;
     328      privilege, and instruct <command>systemd</command> to reload the
     329      configuration:
     330    </para>
     331
     332    <screen revision="systemd" role="nodump"><userinput>mkdir -pv /etc/systemd/system/user@.service.d &amp;&amp;
     333cat &gt; /etc/systemd/system/user@.service.d/delegate.conf &lt;&lt; EOF &amp;&amp;
     334<literal>[Service]
     335Delegate=memory cpuset</literal>
EOF
     336systemctl daemon-reload</userinput></screen>
     337
     338    <para revision='systemd'>
     339      Then log out and log in again.  Now to run <command>make -j5</command>
     340      with the first 4 logical cores and 8 GB of system memory, issue:
     341    </para>
     342
     343    <screen revision="systemd" role="nodump"><userinput>systemctl   --user start dbus                &amp;&amp;
     344systemd-run --user --pty --pipe --wait -G -d \
     345            -p MemoryHigh=8G                 \
     346            -p AllowedCPUs=0-3               \
     347            make -j5</userinput></screen>
     348
     349    <para revision='sysv'>
     350      Ensure <xref linkend='sudo'/> is installed.  To run
     351      <command>make -j5</command> with the first 4 logical cores and 8 GB
     352      of system memory, issue:
     353    </para>
     354
     355    <!-- "\EOF" because we expect $$ to be expanded by the "bash -e"
     356         shell, not the current shell.
     357
     358         TODO: can we use elogind to delegate the controllers (like
     359         systemd) to avoid relying on sudo?  -->
     360    <screen revision="sysv" role="nodump"><userinput>bash -e &lt;&lt; \EOF
     361  sudo mkdir /sys/fs/cgroup/$$
     362  sudo sh -c \
     363    "echo +memory +cpuset > /sys/fs/cgroup/cgroup.subtree_control"
     364  sudo sh -c \
     365    "echo 0-3 > /sys/fs/cgroup/$$/cpuset.cpus"
     366  sudo sh -c \
     367    "echo $((8 &lt;&lt; 30)) > /sys/fs/cgroup/$$/memory.high"
     368  (
     369    sudo sh -c "echo $BASHPID > /sys/fs/cgroup/$$/cgroup.procs"
     370    exec make -j5
     371  )
     372  sudo rmdir /sys/fs/cgroup/$$
     373EOF</userinput></screen>
     374
     375    <para>
     376      With
     377      <phrase revision='systemd'>
     378        <parameter>MemoryHigh=8G</parameter>
     379      </phrase>
     380      <phrase revision='sysv'>
     381        <literal>8589934592</literal> (expanded from
     382        <userinput>$((8 &lt;&lt; 30))</userinput>) in the
     383        <filename>memory.high</filename> entry
     384      </phrase>, a soft limit on memory usage is set.
     385      If the processes in the cgroup (<command>make</command> and all of
     386      its descendants) use more than 8 GB of system memory in total,
     387      the kernel will throttle down the processes and try to reclaim the
     388      system memory from them.  But they can still use more than 8 GB of
     389      system memory.  If you want to make a hard limit instead, replace
     390      <phrase revision='systemd'>
     391        <parameter>MemoryHigh</parameter> with
     392        <parameter>MemoryMax</parameter>.
     393      </phrase>
     394      <phrase revision='sysv'>
     395        <filename>memory.high</filename> with
     396        <filename>memory.max</filename>.
     397      </phrase>
     398      But doing so will cause the processes to be killed if 8 GB is not
     399      enough for them.
     400    </para>
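    <para>
      As an illustration, on a systemd-based system the hard-limit variant
      of the earlier command would look like the following (on a SysV-based
      system, change <filename>memory.high</filename> to
      <filename>memory.max</filename> in the script above instead):
    </para>

    <screen revision="systemd" role="nodump"><userinput>systemd-run --user --pty --pipe --wait -G -d \
            -p MemoryMax=8G                  \
            -p AllowedCPUs=0-3               \
            make -j5</userinput></screen>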
     401
     402    <para>
     403      <phrase revision='systemd'>
     404        <parameter>AllowedCPUs=0-3</parameter>
     405      </phrase>
     406      <phrase revision='sysv'>
     407        <literal>0-3</literal> in the <filename>cpuset.cpus</filename>
     408        entry
     409      </phrase> makes the kernel only run the processes in the cgroup on
     410      the logical cores with numbers 0, 1, 2, or 3.  You may need to
     411      adjust this setting based on the mapping between the logical cores and the
     412      physical cores.  For example, with an Intel Core i9-13900K CPU,
     413      the logical cores 0, 2, 4, ..., 14 are mapped to the first threads of
     414      the eight physical P cores, the logical cores 1, 3, 5, ..., 15 are
     415      mapped to the second threads of the physical P cores, and the logical
     416      cores 16, 17, ..., 31 are mapped to the 16 physical E cores.  So if
     417      we want to use four threads from four different P cores, we need to
     418      specify <literal>0,2,4,6</literal> instead of <literal>0-3</literal>.
     419      Note that other CPU models may use a different mapping scheme.
     420      If you are not sure about the mapping between the logical cores
     421      and the physical cores, run <command>grep -E '^processor|^core'
     422      /proc/cpuinfo</command> which will output logical core IDs in the
     423      <computeroutput>processor</computeroutput> lines, and physical core
     424      IDs in the <computeroutput>core id</computeroutput> lines.
     425    </para>
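    <para>
      For instance, on a hypothetical CPU with two SMT-capable physical
      cores the output might look like the lines below, indicating that
      logical cores 0 and 1 share physical core 0 while logical cores 2
      and 3 share physical core 1 (the exact IDs vary between CPU models):
    </para>

    <screen role="nodump"><computeroutput>processor : 0
core id   : 0
processor : 1
core id   : 0
processor : 2
core id   : 1
processor : 3
core id   : 1</computeroutput></screen>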
     426
     427    <para>
     428      When the <command>nproc</command> or <command>ninja</command> command
     429      runs in a cgroup, it will use the number of logical cores assigned to
     430      the cgroup as the <quote>system logical core count</quote>.  For
     431      example, in a cgroup with logical cores 0-3 assigned,
     432      <command>nproc</command> will print
     433      <computeroutput>4</computeroutput>, and <command>ninja</command>
     434      will run 6 (4 + 2) jobs simultaneously if no <option>-j</option>
     435      setting is explicitly given.
     436    </para>
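    <para revision="systemd">
      A quick way to observe this (assuming the delegation setup above has
      already been done) is to run <command>nproc</command> inside such a
      scope; it should print <computeroutput>4</computeroutput>:
    </para>

    <screen revision="systemd" role="nodump"><userinput>systemd-run --user --pty --pipe --wait -G \
            -p AllowedCPUs=0-3 nproc</userinput></screen>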
     437
     438    <para revision="systemd">
     439      Read the man pages <filename>systemd-run(1)</filename> and
     440      <filename>systemd.resource-control(5)</filename> for a detailed
     441      explanation of the parameters used in the command.
     442    </para>
     443
     444    <para revision="sysv">
     445      Read the <filename>Documentation/admin-guide/cgroup-v2.rst</filename>
     446      file in the Linux kernel source tree for a detailed explanation of
     447      the <systemitem class="filesystem">cgroup2</systemitem> pseudo file
     448      system entries referred to in the command.
     449    </para>
     450
    274451  </sect2>
    275452
     
    9621139      <para>
    9631140        Like <command>ninja</command>, by default <command>cargo</command>
    964         uses all logical processors.  This can often be worked around,
     1141        uses all logical cores.  This can often be worked around,
    9651142        either by exporting
    9661143        <envar>CARGO_BUILD_JOBS=<replaceable>&lt;N&gt;</replaceable></envar>