source: BLFS/libs/func_dependencies@ 141d072

ablfs-more legacy trunk
Last change on this file since 141d072 was 141d072, checked in by Pierre Labastie <pierre.labastie@…>, 3 years ago

BLFS Dependencies: Fix pass2 not run

With the way we manage "first" deps, it may happen that no node
unreference the pass2 node. In that case, we add it to root.
Add some documentation for what we do for those deps, too.

  • Property mode set to 100644
File size: 25.8 KB
Line 
1#!/bin/bash
2#
3
4#-----------------------------------------------------------------------------#
5# This is a set of (recursive) functions for manipulating a dependency graph. #
6# We use algorithms and definitions from chapter 4 (mainly section 4.2) of #
7# https://algs4.cs.princeton.edu/. The graph we manipulate is the directed #
8# graph of the dependencies: nodes are packages in the BLFS book. A node A is #
9# connected to a node B if package A depends on B. A topological order (rev- #
10# erted) is exactly what we want for a build order. But a topological order #
11# only exists if the graph is acyclic. We'll therefore have to remove cycles. #
12# There are a number of other features we want to consider: #
13# - edges are weighted according to the dependency requirement: #
14# 1 for required #
15# 2 for recommended #
16# 3 for optional #
17# 4 for external #
18# We should consider only edges with weight lower or equal to that #
19# specified by the user, but see below. #
20# - we do not want to build the whole book. The user specifies a set of #
21# packages, and we have to consider only nodes reachable from this set #
22# using edges of weight not exceeding the specified weight. #
23# - when doing the topological sort, we want to consider all the edges and #
24# not only those not exceeding the specified weight: If a package A in the #
25# reachable subgraph depends optionally on another package B in the same #
26# subgraph, we want to build B before A. But this means we'll have to #
27# remove cycles for all weights. #
28# - dependencies have another qualifier: before or after. The problem: if a #
29# package A depends on B with an "after" qualifier, and a package C depends #
30# on A with a "before" qualifier, C may need B to be able to use A. So the #
31# only safe way to consider "after" qualifiers is to consider that they are #
32# "before" deps for any parent of the packages considered. There is an #
33# exception to that rule: if B depends on C (possibly through a chain of #
34# several dependencies), then C should still be built before B. #
35# We'll therefore have a 3 pass procedure. First build the set of nodes #
36# reachable from the root set. Second, remove dangling edges (those pointing #
37# to packages outside the node set), and move "after" edges to "before" edges #
38# originating from the parents. Third remove cycles and generate a #
39# topological sort. #
40# #
41# Pass 1: graph generation #
42# ======================== #
43# Data layout for pass 1 #
44# ---------------------- #
45# A node of the tree is represented by a text file <nodeName>.dep. Each edge #
46# starting from this node is represented by a line in this file. We keep #
47# those files in the same directory. We introduce a special node named root, #
48# whose edges point to the list of nodes requested by the user. Each line #
49# contains three fields: #
50# - the weight of the edge #
51# - the qualifier: "before" (b), "after" (a), or "first" (f) #
52# - the name of the destination of the edge (without the ".dep" extension) #
53# #
54# Recursive function "generate_subgraph" #
55# -------------------------------------- #
56# This function treats a node of the graph that is not a leaf and that is #
57# seen for the first time in the DFS. The dependencies of this node are #
58# known, and stored in a .dep file. For each dependency in that file, there #
59# are three cases: #
60# - the weight of the edge leading to that dependency is higher than #
61# requested. This dependency is discarded (some information printed) #
62# - the weight of the edge is lower or equal to requested, but the node #
63# has already been visited (the .dep file exists). Discard too (some #
64# information printed) #
65# - the weight of the edge is lower or equal to requested, and the node #
66# has not been seen: then the dependencies of that node are generated, #
67# and there are two cases: #
68# - the node has no dependencies: just create an empty .dep file, so #
69# that we know the node has been visited #
70# - the node has dependencies: call generate_subgraph for that node #
71# #
72# This function takes four parameters: #
73# - The node filename: this is the only one useful for the algorithm #
74# - The depth: number of steps starting from root (for pretty print only) #
75# - The weight of the edge leading to that node (for printing) #
76# - The qualifier (for printing) #
77# #
78# Pass 2: graph transformation #
79# ============================ #
80# We now have three loops over nodes of the graph #
81# Loop 1: Remove dead edges #
82# ------------------------- #
83# Since some nodes have not been created because the edge leading to them #
84# had too high a weight, the edges leading to them have to be suppressed. #
85# For each existing node file, we make a list of lines to remove by #
86# testing whether the destination exists. We then remove the lines. #
87# Another approach would be to make a temporary file and output only #
88# lines that should stay, then rename the file. This would save a loop. #
89# All in all it is an N*e process, where N is the number of nodes and e #
90# the average number of edges originating from a node. #
91# Loop 2: Treat "after" edges #
92# --------------------------- #
93# If a node is the origin of edges qualified as "after", we want the #
94# nodes which are the destination of those edges to be built after #
95# the origin node, but before any node that depend on the origin #
96# node. For that, the general rule is to change: #
97# P---b--->A---a--->D #
98# to: #
99# P---b--->Agroupxx---b--->A #
100# | #
101# ---b--->D #
102# But there is a problem if D depends on P, possibly through a chain, #
103# because we create a cycle which shouldn't exist. If this is the case, #
104# we leave A as a dependency of P: #
105# P---b--->A #
106# #
107# Agroupxx---b--->A #
108# | #
109# ---b--->D #
110# Doing so, it may happen that Agroupxx has no parent. We then add #
111# Agroupxx as a dependency of root. The problem with this algorithm is #
112# the search for paths from D to A, which may be exponential in the #
113# number of nodes in the graph. #
114# TODO: document other passes #
115# TODO: needs also to document the .tree files #
116# TODO: The following is obsolete #
117# Circular dependencies: #
118# #
119# In case we find a cirdular dependency, it has the form : #
120# parent->dependency_0->...->dependency_n->dependency_0 #
121# If we want to build dependency_n before dependency_0, no problem: #
122# we just prune the tree at dependency_n. If we want to build first #
123# dependency_0, we need to put dependency_n as a dependency of parent, #
124# then erase and rebuild the subtree from there. Now, we may have met #
125# another circular dependency in the subtree, and erasing the tree makes #
126# us forget the decision which was made. So, after first generating the #
127# list of dependencies from packages.xml, we keep the generated list in #
128# a file <nodeName>.odep, which we modify according to the decision which #
129# was made. #
130#---------------------------------------------------------------------------#
131
132# Global variables:
133# A string of spaces for indenting:
134declare -a spaceSTR=" "
135# When we are backing up from a circular dependency, `parentNode'
136# contains the node which has an edge entering the cycle
137declare parentNode
138
139#---------------------#
140generate_subgraph() { #
141#---------------------#
142: <<inline_doc
143 function: Create a subgraph of all the nodes reachable from the node
144 represented by the file whose name is $1. The edges considered
145 are those with maximal weight DEP_LEVEL (recursive function).
146 input vars: $1 : file name corresponding to the node whose edges will be
147 followed for the DFS
148 $2 : weight of the edge leading to this node
149 $3 : depth (root is 1)
150 $4 : qualifier (a for after, b for before, f for first)
151 externals: vars: DEP_LEVEL contains 1 if we want to build the
152 tree only for required dependencies,
153 2 if we want also recommended ones,
154 3 if we want also optional ones, but only
155 for the requested packages,
156 4 if we want all the dependencies
157 (excluding external of course)
158 MAIL_SERVER contains the name of the MTA we want to use.
159 files: ../xsl/dependencies.xsl: stylesheet for creating the
160 .dep files
161 ../packages.xml: File containing packages id
162 and dependencies
163 returns: 0 if the tree has been successfully created
164 output: files: for each node reachable from $1, a file <node>.dep.
165 on error: nothing
166 on success: nothing
167inline_doc
168
169local depFile=$1
170local -i weight=$2
171local -i depth=$3
172local qualifier=$4
173local -i spacing=0
174local priostring
175local buildstring
176local id_of_dep
177local prio_of_dep
178local build_of_dep
179local dep_level
180
181if (( depth < 10 )); then spacing=1; fi
182case $weight in
183 1) priostring=required ;;
184 2) priostring=recommended ;;
185 3) priostring=optional ;;
186esac
187case $qualifier in
188 a) buildstring=runtime ;;
189 b|f) buildstring= ;;
190esac
191dep_level=$DEP_LEVEL
192if [ "$dep_level" = 3 ] && [ "$depth" -gt 2 ]; then dep_level=2; fi
193if [ "$dep_level" -gt 3 ]; then dep_level=3; fi
194echo -en "\nNode: $depth${spaceSTR:0:$(( depth + spacing ))}${RED}${depFile%.dep}${OFF} $priostring $buildstring"
195
196depth=$(( depth + 1 ))
197if (( depth < 10 )); then spacing=1; else spacing=0; fi
198# Start of loop
199{
200while read prio_of_dep build_of_dep id_of_dep; do
201 case $prio_of_dep in
202 1) priostring=required ;;
203 2) priostring=recommended ;;
204 3) priostring=optional ;;
205 4) priostring=external ;;
206 esac
207 case $build_of_dep in
208 a ) buildstring=runtime ;;
209 b|f) buildstring= ;;
210 esac
211# Has this entry already been seen?
212# TODO: no there is no special case!
213# We have a special case here: if the entry has been seen at depth > 2
214# and now depth=2 and DEP_LEVEL=3, optional deps have not been processed.
215# If this is the case, just consider it has not been seen.
216 if [ -f ${id_of_dep}.dep ] ; then
217 case $depth$DEP_LEVEL in
218 23) ;;
219 *)
220# Just display it and proceed.
221 echo -en "\nEdge: $depth${spaceSTR:0:$((depth + spacing))}${MAGENTA}${id_of_dep}${OFF} $priostring $buildstring"
222 continue
223 ;;
224 esac
225 fi
226# Is the weight higher than requested?
227 if [ "$prio_of_dep" -gt $dep_level ]; then
228# Just display it and proceed.
229 echo -en "\n Out: $depth${spaceSTR:0:$((depth + spacing))}${YELLOW}${id_of_dep}${OFF} $priostring $buildstring"
230 continue
231 fi
232# Otherwise, let's build the corresponding subgraph.
233 xsltproc --stringparam idofdep "$id_of_dep" \
234 --stringparam MTA "$MAIL_SERVER" \
235 -o ${id_of_dep}.dep \
236 ../xsl/dependencies.xsl ../packages.xml
237
238 if [[ -s ${id_of_dep}.dep ]]; then # this dependency has dependencies
239 generate_subgraph ${id_of_dep}.dep $prio_of_dep $depth $build_of_dep
240 else # id_of_dep has no dependencies, just touch the file and display
241 touch ${id_of_dep}.dep
242 echo -en "\nLeaf: $depth${spaceSTR:0:$((depth + spacing))}${CYAN}${id_of_dep}${OFF} $priostring $buildstring"
243 fi
244done
245} <$depFile
246depth=$(( depth - 1 ))
247if (( depth < 10 )); then spacing=1; else spacing=0; fi
248echo -en "\n End: $depth${spaceSTR:0:$((depth + spacing))}${GREEN}${depFile%.dep}${OFF}"
249return 0
250}
251
252#-----------#
253path_to() { #
254#-----------#
255: <<inline_doc
256 function: check whether there is a path from $1 to $2 on the graph
257 input vars: $1 contains the filename of the starting node.
258 $2 contains the name of the node to reach
259 $3 contains the weight above which we do not want to
260 follow an edge
261 $seen (global) contains the list of already seen nodes.
262 It must ve set to " " prior to calling the function
263 returns: 0 if the node has been found
264 1 if not
265 on error: nothing
266 on success: nothing
267inline_doc
268local start=$1
269local seek=$2
270local prio=$3
271local prio_of_dep
272local build_of_dep
273local id_of_dep
274local r
275
276if test "${start%.dep}" = "$seek"; then return 0; fi
277seen="$seen${start%.dep} "
278if test -s $start; then
279 {
280 while read prio_of_dep build_of_dep id_of_dep; do
281 if test "$prio" -lt "$prio_of_dep"; then continue; fi
282 if ! test "${seen% $id_of_dep *}" = "$seen"; then continue; fi
283 if path_to ${id_of_dep}.dep $seek $prio; then return 0; fi
284 done
285 } < $start
286fi
287return 1
288}
289#------------------#
290clean_subgraph() { #
291#------------------#
292: <<inline_doc
293 function: Remove dangling edges and create groups of deps for "after"
294 deps: A-before->B-after->C becomes:
295 A -before-> Bgroupxx -before-> B
296 \
297 -before-> C
298 the name of the group is chosen so that it is unlikely as
299 a package name (so that it is removed when building the
300 xml book).
301 Also change the "first" qualifier so that a cycle:
302 A -first-> B ---chain---> A becomes:
303 B ---chain---> A -before-> B-pass1
304 and we remove all the dependencies which have a chain to
305 A in B-pass1.
306 Since we do not change anything else, it may happen that
307 nothing depends on B. In that case, B is appended to root.
308 input vars: None
309 files: <node>.dep files containing dangling edges and
310 "after" qualifiers
311 returns: 0
312 output: files: <node>.dep files containing no dangling edges and
313 no "after" qualifiers
314 on error: nothing
315 on success: nothing
316inline_doc
317
318local node
319local id_of_dep
320local prio_of_dep
321local build_of_dep
322local lines_to_remove
323local lines_to_change
324local parent
325local p
326local b
327local start
328local seen
329
330for node in $(ls *.dep); do
331 if test $node = root.dep; then continue; fi
332 echo Cleaning $node
333 lines_to_remove=
334 {
335 while read prio_of_dep build_of_dep id_of_dep; do
336 if ! test -f ${id_of_dep}.dep; then
337 lines_to_remove="$lines_to_remove $id_of_dep"
338 continue
339 fi
340 done
341 } <$node
342 for id_of_dep in $lines_to_remove; do
343 sed "/\ $id_of_dep\$/d" -i $node
344 done
345done
346for node in $(grep -l ' a ' *.dep); do
347 lines_to_remove=
348 echo Process "runtime" deps in $node
349 if ! [ -e ${node%.dep}groupxx.dep ]; then
350 b=0 # Nothing depends on <node>groupxx
351 for parent in $(grep -l ${node%.dep}\$ *); do
352 p=0 # No "after" dependency depends on this parent
353 for start in $(grep ' a ' $node | cut -d' ' -f3); do
354 seen=" " # global variable used in "path_to"
355 if path_to ${start}.dep ${parent%.dep} 3; then p=1; break; fi
356 done
357 if test $p = 0; then
358 b=1
359 sed -i "s/\ ${node%.dep}\$/&groupxx/" $parent
360 fi
361 done
362 echo "1 b ${node%.dep}" > ${node%.dep}groupxx.dep
363 if test $b = 0; then echo "1 b ${node%.dep}groupxx" >> root.dep; fi
364 fi
365 {
366 while read prio_of_dep build_of_dep id_of_dep; do
367 if test $build_of_dep = a; then
368 if ! grep -q ${id_of_dep} ${node%.dep}groupxx.dep; then
369 echo "$prio_of_dep b ${id_of_dep}" >> ${node%.dep}groupxx.dep
370 fi
371 lines_to_remove="$lines_to_remove $id_of_dep"
372 fi
373 done
374 } <$node
375 for id_of_dep in $lines_to_remove; do
376 sed "/\ $id_of_dep\$/d" -i $node
377 done
378done
379for node in $(grep -l ' f ' *); do
380 echo Process "first" deps in $node
381 lines_to_change=
382 {
383 while read prio_of_dep build_of_dep id_of_dep; do
384 if test $build_of_dep = f; then
385 if ! test -f ${id_of_dep}-pass1.dep; then
386 cp ${id_of_dep}{,-pass1}.dep;
387 fi
388 lines_to_change="$lines_to_change $id_of_dep"
389 unset lr # lines to remove in -pass1
390 {
391 while read p b start; do
392 seen=" " # global variable used in "path_to"
393 if path_to ${start}.dep ${node%.dep} $p; then
394 lr="$lr $start"
395 fi
396 done
397 } < ${id_of_dep}-pass1.dep
398 for p in $lr; do
399 sed "/\ $p\$/d" -i ${id_of_dep}-pass1.dep
400 done
401 fi
402 done
403 } <$node
404 for id_of_dep in $lines_to_change; do
405 sed "/\ $id_of_dep\$/"'{s/[[:digit:]] f /1 b /;s/$/-pass1/}' -i $node
406 if ! grep -q " $id_of_dep\$" *.dep; then
407 echo 1 b $id_of_dep >>root.dep
408 fi
409 done
410done
411} # End clean_subgraph
412
413#----------------------------#
414generate_dependency_tree() { #
415#----------------------------#
416: <<inline_doc
417 function: Create a subtree of the dependency tree
418 (recursive function)
419 input vars: $1 : file with a list of targets (child nodes)
420 the first line of the file is the link
421 $2 : priority (1=req, 2=rec, 3=opt)
422 returns: 0 if the tree has been successfully created
423 1 if we are backing up to the parent of a circular dep
424 and there are only required deps in the cycle
425 2 if we are backing up to the parent of a circular dep
426 and there are recommended deps and no optional deps in the
427 cycle
428 3 if we are backing up to the parent of a circular dep
429 and there are optiional deps in the cycle
430 modifies: vars: ParentNode is set when return is not 0
431 output: files: for each <pkg> with dependencies in $1,
432 a file <pkg>.tree and its dependencies
433 on error: nothing
434 on success: nothing
435inline_doc
436
437local depFile=$1
438local priority=$2
439local -a rootlink
440local -a priolink
441local -a otherlink
442local -i depth
443local -i count=0
444local id_of_dep
445local build_of_dep
446local prio_of_dep
447local parent
448local lines_to_remove=
449local srootlink
450local priostring
451local dpriostring
452local i
453
454{
455read -a rootlink
456depth=${#rootlink[*]}
457read -a priolink
458srootlink="${rootlink[*]} "
459case $priority in
460 1) priostring=required ;;
461 2) priostring=recommended ;;
462 3) priostring=optional ;;
463esac
464# start of depFile
465echo -en "\nNode: $depth${spaceSTR:0:$depth}${RED}${depFile%.tree}${OFF} $priostring"
466
467while read prio_of_dep build_of_dep id_of_dep; do
468 case $prio_of_dep in
469 1) dpriostring=required ;;
470 2) dpriostring=recommended ;;
471 3) dpriostring=optional ;;
472 esac
473# count entries in file
474 (( count++ ))
475# Has this entry already been seen?
476 if [ -f ${id_of_dep}.tree ]; then # found ${id_of_dep}.tree already in tree
477 otherlink=($(head -n 1 ${id_of_dep}.tree))
478 if [ -z "${otherlink[*]}" ]
479 then echo otherlink empty for $id_of_dep.tree
480 echo This should not happen, but happens to happen...
481 exit 1
482 fi
483#Do not use "${rootlink[*]}" =~ "${otherlink[*]}": case rootlink=(1 11)
484# and otherlink=(1 1)
485 if [[ ${srootlink#"${otherlink[*]} "} != ${srootlink} ]]; then # cir. dep
486 echo -en "\nCirc: $((depth+1))${spaceSTR:0:$((depth+1))}${YELLOW}${id_of_dep}${OFF} $dpriostring"
487# Find lowest priority in branch from parent to depFile:
488 p2=0
489 for (( i=${#otherlink[*]}; i < $depth ; i++ )) ; do
490 if (( ${priolink[i]} > $p2 )); then p2=${priolink[i]}; fi
491 done
492 if (( $prio_of_dep >= $p2 )); then # prune
493 lines_to_remove="$lines_to_remove $id_of_dep"
494 sed -i "/$id_of_dep/d" ${depFile/.tree/.dep}
495 else # find and set parent, then return lowest priority
496# The parent has the same link without the last entry.
497# We do not need otherlink anymore so just destroy the last element
498 unset otherlink[-1]
499 parentNode=$(grep ^"${otherlink[*]}"\$ -l *)
500 return $p2
501 fi
502 else # not circular: prune tree (but not .dep, since it may happen that
503 # the tree is destroyed and rebuilt in another order)
504 lines_to_remove="$lines_to_remove $id_of_dep"
505 fi # circular or not
506 continue # this dependency has already been seen, and the associated
507 # subtree computed. We are done
508 fi # Has this entry already been seen?
509# So, this entry has not already been seen. Let's build the corresponding
510# subtree. First check there is a subtree...
511# Use -s, because it may happen that after removing lines, .dep exists
512# but is empty.
513 if [[ -s ${id_of_dep}.dep ]]; then # this dependency has dependencies
514 sed "1i${rootlink[*]} $count\\
515${priolink[*]} $prio_of_dep" < ${id_of_dep}.dep \
516 > ${id_of_dep}.tree # add link and priolink
517 generate_dependency_tree ${id_of_dep}.tree $prio_of_dep
518# Test return value, in case we exchange dependencies
519 p2=$?
520 case $p2 in
521 0) # Normal return
522 ;;
523 $prio_of_dep) # we remove this dep, but since it may become unreachable,
524 # move it to be built later (as a dep of parent).
525 tree_erase ${id_of_dep}.tree
526 lines_to_remove="$lines_to_remove $id_of_dep"
527 sed -i "/${id_of_dep}/d" ${depFile/.tree/.dep}
528 echo "$prio_of_dep b $id_of_dep" >> $parentNode
529# must be added to .dep in case parentNode is destroyed when erasing
530# the tree
531 echo "$prio_of_dep b $id_of_dep" >> ${parentNode/.tree/.dep}
532 continue
533 ;;
534 *) # We are backing up
535 return $p2
536 ;;
537 esac
538 else # id_of_dep has no dependencies, just record the link in a file
539 # and print
540 echo "${rootlink[*]} $count" > ${id_of_dep}.tree
541 echo -en "\nLeaf: $(($depth+1))${spaceSTR:0:$(($depth+1))}${CYAN}${id_of_dep}${OFF} $dpriostring"
542 fi
543done
544echo -en "\n End: $depth${spaceSTR:0:$depth}${GREEN}${depFile%.tree}${OFF}"
545} <$depFile
546# It may happen that a file is created with several times
547# the same line. Normally, all those lines but one
548# would be flagged to be removed (or all of them if
549# the dependency appeared before). A simple sed /$line/d
550# destroys all the lines. We should instead remove
551# only one for each appearance of it in lines_to_remove.
552# so first get the position of last line and then delete
553# that line
554for line in $lines_to_remove
555 do lineno=$(sed -n /^[[:digit:]]\ b\ $line\$/= $depFile | tail -n1)
556 sed -i ${lineno}d $depFile
557done
558return 0
559}
560
561
562#---------------#
563tree_browse() { #
564#---------------#
565local file=$1
566local f
567
568#echo file=$file
569for f in $(grep '[^0-9 ]' $file | sed 's/.* //'); do
570# echo f=$f
571 if grep -q '[^0-9 ]' ${f}.tree ; then
572 tree_browse ${f}.tree
573 fi
574 echo $f
575done
576}
577
578#--------------#
579tree_erase() { #
580#--------------#
581local file=$1
582local f
583local rootlink
584local rootlink2
585
586#echo file=$file
587rootlink="$(head -n1 $file) "
588for f in $(grep '[^0-9 ]' $file | sed 's/.* //'); do
589 if [ -f ${f}.tree ]; then
590 rootlink2="$(head -n1 ${f}.tree) "
591# We want two things:
592# i) do not erase the file if it is in another branch
593# ii) do not erase the file if there is a circular dependency
594# for case i), we test that rootlink is contained in rootlink2
595# for case ii), we test that rootlink2 is not contained in
596# rootlink.
597# See comment above about srootlink
598 if [[ ${rootlink2#${rootlink}} != ${rootlink2} &&
599 ${rootlink#${rootlink2}} == ${rootlink} ]] ; then
600 tree_erase ${f}.tree
601 fi
602 fi
603done
604rm -f $file
605}
Note: See TracBrowser for help on using the repository browser.