source: BLFS/libs/func_dependencies@ a78cfc9

ablfs-more legacy trunk
Last change on this file since a78cfc9 was a78cfc9, checked in by Pierre Labastie <pierre.labastie@…>, 3 years ago

Better document the loop on "after" deps

Add also a description of that loop in the general header

  • Property mode set to 100644
File size: 25.3 KB
Line 
1#!/bin/bash
2#
3
4#-----------------------------------------------------------------------------#
5# This is a set of (recursive) functions for manipulating a dependency graph. #
6# We use algorithms and definitions from chapter 4 (mainly section 4.2) of #
7# https://algs4.cs.princeton.edu/. The graph we manipulate is the directed #
8# graph of the dependencies: nodes are packages in the BLFS book. A node A is #
9# connected to a node B if package A depends on B. A topological order (rev- #
10# erted) is exactly what we want for a build order. But a topological order #
11# only exists if the graph is acyclic. We'll therefore have to remove cycles. #
12# There are a number of other features we want to consider: #
13# - edges are weighted according to the dependency requirement: #
14# 1 for required #
15# 2 for recommended #
16# 3 for optional #
17# 4 for external #
18# We should consider only edges with weight lower or equal to that #
19# specified by the user, but see below. #
20# - we do not want to build the whole book. The user specifies a set of #
21# packages, and we have to consider only nodes reachable from this set #
22# using edges of weight not exceeding the specified weight. #
23# - when doing the topological sort, we want to consider all the edges and #
24# not only those not exceeding the specified weight: If a package A in the #
25# reachable subgraph depends optionally on another package B in the same #
26# subgraph, we want to build B before A. But this means we'll have to #
27# remove cycles for all weights. #
28# - dependencies have another qualifier: before or after. The problem: if a #
29# package A depends on B with an "after" qualifier, and a package C depends #
30# on A with a "before" qualifier, C may need B to be able to use A. So the #
31# only safe way to consider "after" qualifiers is to consider that they are #
32# "before" deps for any parent of the packages considered. There is an #
33# exception to that rule: if B depends on C (possibly through a chain of #
34# several dependencies), then C should still be built before B. #
35# We'll therefore have a 3 pass procedure. First build the set of nodes #
36# reachable from the root set. Second, remove dangling edges (those pointing #
37# to packages outside the node set), and move "after" edges to "before" edges #
38# originating from the parents. Third remove cycles and generate a #
39# topological sort. #
40# #
41# Pass 1: graph generation #
42# ======================== #
43# Data layout for pass 1 #
44# ---------------------- #
45# A node of the tree is represented by a text file <nodeName>.dep. Each edge #
46# starting from this node is represented by a line in this file. We keep #
47# those files in the same directory. We introduce a special node named root, #
48# whose edges point to the list of nodes requested by the user. Each line #
49# contains three fields: #
50# - the weight of the edge #
51# - the qualifier: "before" (b), "after" (a), or "first" (f) #
52# - the name of the destination of the edge (without the ".dep" extension) #
53# #
54# Recursive function "generate_subgraph" #
55# -------------------------------------- #
56# This function treats a node of the graph that is not a leaf and that is #
57# seen for the first time in the DFS. The dependencies of this node are #
58# known, and stored in a .dep file. For each dependency in that file, there #
59# are three cases: #
60# - the weight of the edge leading to that dependency is higher than #
61# requested. This dependency is discarded (some information printed) #
62# - the weight of the edge is lower or equal to requested, but the node #
63# has already been visited (the .dep file exists). Discard too (some #
64# information printed) #
65# - the weight of the edge is lower or equal to requested, and the node #
66# has not been seen: then the dependencies of that node are generated, #
67# and there are two cases: #
68# - the node has no dependencies: just create an empty .dep file, so #
69# that we know the node has been visited #
70# - the node has dependencies: call generate_subgraph for that node #
71# #
72# This function takes four parameters: #
73# - The node filename: this is the only one useful for the algorithm #
74# - The depth: number of steps starting from root (for pretty print only) #
75# - The weight of the edge leading to that node (for printing) #
76# - The qualifier (for printing) #
77# #
78# Pass 2: graph transformation #
79# ============================ #
80# We now have three loops over nodes of the graph #
81# Loop 1: Remove dead edges #
82# ------------------------- #
83# Since some nodes have not been created because the edge leading to them #
84# had too high a weight, the edges leading to them have to be suppressed. #
85# For each existing node file, we make a list of lines to remove by #
86# testing whether the destination exists. We then remove the lines. #
87# Another approach would be to make a temporary file and output only #
88# lines that should stay, then rename the file. This would save a loop. #
89# All in all it is an N*e process, where N is the number of nodes and e #
90# the average number of edges originating from a node. #
91# Loop 2: Treat "after" edges #
92# --------------------------- #
93# If a node is the origin of edges qualified as "after", we want the #
94# nodes which are the destination of those edges to be built after #
95# the origin node, but before any node that depend on the origin #
96# node. For that, the general rule is to change: #
97# P---b--->A---a--->D #
98# to: #
99# P---b--->Agroupxx---b--->A #
100# | #
101# ---b--->D #
102# But there is a problem if D depends on P, possibly through a chain, #
103# because we create a cycle which shouldn't exist. If this is the case, #
104# we leave A as a dependency of P: #
105# P---b--->A #
106# #
107# Agroupxx---b--->A #
108# | #
109# ---b--->D #
110# Doing so, it may happen that Agroupxx has no parent. We then add #
111# Agroupxx as a dependency of root. The problem with this algorithm is #
112# the search for paths from D to A, which may be exponential in the #
113# number of nodes in the graph. #
114# TODO: document other passes #
115# TODO: needs also to document the .tree files #
116# TODO: The following is obsolete #
117# Circular dependencies: #
118# #
119# In case we find a cirdular dependency, it has the form : #
120# parent->dependency_0->...->dependency_n->dependency_0 #
121# If we want to build dependency_n before dependency_0, no problem: #
122# we just prune the tree at dependency_n. If we want to build first #
123# dependency_0, we need to put dependency_n as a dependency of parent, #
124# then erase and rebuild the subtree from there. Now, we may have met #
125# another circular dependency in the subtree, and erasing the tree makes #
126# us forget the decision which was made. So, after first generating the #
127# list of dependencies from packages.xml, we keep the generated list in #
128# a file <nodeName>.odep, which we modify according to the decision which #
129# was made. #
130#---------------------------------------------------------------------------#
131
132# Global variables:
133# A string of spaces for indenting:
134declare -a spaceSTR=" "
135# When we are backing up from a circular dependency, `parentNode'
136# contains the node which has an edge entering the cycle
137declare parentNode
138
139#---------------------#
140generate_subgraph() { #
141#---------------------#
142: <<inline_doc
143 function: Create a subgraph of all the nodes reachable from the node
144 represented by the file whose name is $1. The edges considered
145 are those with maximal weight DEP_LEVEL (recursive function).
146 input vars: $1 : file name corresponding to the node whose edges will be
147 followed for the DFS
148 $2 : weight of the edge leading to this node
149 $3 : depth (root is 1)
150 $4 : qualifier (a for after, b for before, f for first)
151 externals: vars: DEP_LEVEL contains 1 if we want to build the
152 tree only for required dependencies,
153 2 if we want also recommended ones,
154 3 if we want also optional ones, but only
155 for the requested packages,
156 4 if we want all the dependencies
157 (excluding external of course)
158 MAIL_SERVER contains the name of the MTA we want to use.
159 files: ../xsl/dependencies.xsl: stylesheet for creating the
160 .dep files
161 ../packages.xml: File containing packages id
162 and dependencies
163 returns: 0 if the tree has been successfully created
164 output: files: for each node reachable from $1, a file <node>.dep.
165 on error: nothing
166 on success: nothing
167inline_doc
168
169local depFile=$1
170local -i weight=$2
171local -i depth=$3
172local qualifier=$4
173local -i spacing=0
174local priostring
175local buildstring
176local id_of_dep
177local prio_of_dep
178local build_of_dep
179local dep_level
180
181if (( depth < 10 )); then spacing=1; fi
182case $weight in
183 1) priostring=required ;;
184 2) priostring=recommended ;;
185 3) priostring=optional ;;
186esac
187case $qualifier in
188 a) buildstring=runtime ;;
189 b|f) buildstring= ;;
190esac
191dep_level=$DEP_LEVEL
192if [ "$dep_level" = 3 ] && [ "$depth" -gt 2 ]; then dep_level=2; fi
193if [ "$dep_level" -gt 3 ]; then dep_level=3; fi
194echo -en "\nNode: $depth${spaceSTR:0:$(( depth + spacing ))}${RED}${depFile%.dep}${OFF} $priostring $buildstring"
195
196depth=$(( depth + 1 ))
197if (( depth < 10 )); then spacing=1; else spacing=0; fi
198# Start of loop
199{
200while read prio_of_dep build_of_dep id_of_dep; do
201 case $prio_of_dep in
202 1) priostring=required ;;
203 2) priostring=recommended ;;
204 3) priostring=optional ;;
205 4) priostring=external ;;
206 esac
207 case $build_of_dep in
208 a ) buildstring=runtime ;;
209 b|f) buildstring= ;;
210 esac
211# Has this entry already been seen?
212# TODO: no there is no special case!
213# We have a special case here: if the entry has been seen at depth > 2
214# and now depth=2 and DEP_LEVEL=3, optional deps have not been processed.
215# If this is the case, just consider it has not been seen.
216 if [ -f ${id_of_dep}.dep ] ; then
217 case $depth$DEP_LEVEL in
218 23) ;;
219 *)
220# Just display it and proceed.
221 echo -en "\nEdge: $depth${spaceSTR:0:$((depth + spacing))}${MAGENTA}${id_of_dep}${OFF} $priostring $buildstring"
222 continue
223 ;;
224 esac
225 fi
226# Is the weight higher than requested?
227 if [ "$prio_of_dep" -gt $dep_level ]; then
228# Just display it and proceed.
229 echo -en "\n Out: $depth${spaceSTR:0:$((depth + spacing))}${YELLOW}${id_of_dep}${OFF} $priostring $buildstring"
230 continue
231 fi
232# Otherwise, let's build the corresponding subgraph.
233 xsltproc --stringparam idofdep "$id_of_dep" \
234 --stringparam MTA "$MAIL_SERVER" \
235 -o ${id_of_dep}.dep \
236 ../xsl/dependencies.xsl ../packages.xml
237
238 if [[ -s ${id_of_dep}.dep ]]; then # this dependency has dependencies
239 generate_subgraph ${id_of_dep}.dep $prio_of_dep $depth $build_of_dep
240 else # id_of_dep has no dependencies, just touch the file and display
241 touch ${id_of_dep}.dep
242 echo -en "\nLeaf: $depth${spaceSTR:0:$((depth + spacing))}${CYAN}${id_of_dep}${OFF} $priostring $buildstring"
243 fi
244done
245} <$depFile
246depth=$(( depth - 1 ))
247if (( depth < 10 )); then spacing=1; else spacing=0; fi
248echo -en "\n End: $depth${spaceSTR:0:$((depth + spacing))}${GREEN}${depFile%.dep}${OFF}"
249return 0
250}
251
252#-----------#
253path_to() { #
254#-----------#
255: <<inline_doc
256 function: check whether there is a path from $1 to $2 on the graph
257 input vars: $1 contains the filename of the starting node.
258 $2 contains the name of the node to reach
259 $3 contains the weight above which we do not want to
260 follow an edge
261 $seen (global) contains the list of already seen nodes.
262 It must ve set to " " prior to calling the function
263 returns: 0 if the node has been found
264 1 if not
265 on error: nothing
266 on success: nothing
267inline_doc
268local start=$1
269local seek=$2
270local prio=$3
271local prio_of_dep
272local build_of_dep
273local id_of_dep
274local r
275
276if test "${start%.dep}" = "$seek"; then return 0; fi
277seen="$seen${start%.dep} "
278if test -s $start; then
279 {
280 while read prio_of_dep build_of_dep id_of_dep; do
281 if test "$prio" -lt "$prio_of_dep"; then continue; fi
282 if ! test "${seen% $id_of_dep *}" = "$seen"; then continue; fi
283 if path_to ${id_of_dep}.dep $seek $prio; then return 0; fi
284 done
285 } < $start
286fi
287return 1
288}
289#------------------#
290clean_subgraph() { #
291#------------------#
292: <<inline_doc
293 function: Remove dangling edges and create groups of deps for "after"
294 deps: A-before->B-after->C becomes:
295 A -before-> Bgroupxx -before-> B
296 \
297 -before-> C
298 the name of the group is chosen so that it is unlikely as
299 a package name (so that it is removed when building the
300 xml book).
301 input vars: None
302 files: <node>.dep files containing dangling edges and
303 "after" qualifiers
304 returns: 0
305 output: files: <node>.dep files containing no dangling edges and
306 no "after" qualifiers
307 on error: nothing
308 on success: nothing
309inline_doc
310
311local node
312local id_of_dep
313local prio_of_dep
314local build_of_dep
315local lines_to_remove
316local lines_to_change
317local parent
318local p
319local b
320local start
321local seen
322
323for node in $(ls *.dep); do
324 if test $node = root.dep; then continue; fi
325 echo Cleaning $node
326 lines_to_remove=
327 {
328 while read prio_of_dep build_of_dep id_of_dep; do
329 if ! test -f ${id_of_dep}.dep; then
330 lines_to_remove="$lines_to_remove $id_of_dep"
331 continue
332 fi
333 done
334 } <$node
335 for id_of_dep in $lines_to_remove; do
336 sed "/\ $id_of_dep\$/d" -i $node
337 done
338done
339for node in $(grep -l ' a ' *.dep); do
340 lines_to_remove=
341 echo Process "runtime" deps in $node
342 if ! [ -e ${node%.dep}groupxx.dep ]; then
343 b=0 # Nothing depends on <node>groupxx
344 for parent in $(grep -l ${node%.dep}\$ *); do
345 p=0 # No "after" dependency depends on this parent
346 for start in $(grep ' a ' $node | cut -d' ' -f3); do
347 seen=" " # global variable used in "path_to"
348 if path_to ${start}.dep ${parent%.dep} 3; then p=1; break; fi
349 done
350 if test $p = 0; then
351 b=1
352 sed -i "s/\ ${node%.dep}\$/&groupxx/" $parent
353 fi
354 done
355 echo "1 b ${node%.dep}" > ${node%.dep}groupxx.dep
356 if test $b = 0; then echo "1 b ${node%.dep}groupxx" >> root.dep; fi
357 fi
358 {
359 while read prio_of_dep build_of_dep id_of_dep; do
360 if test $build_of_dep = a; then
361 if ! grep -q ${id_of_dep} ${node%.dep}groupxx.dep; then
362 echo "$prio_of_dep b ${id_of_dep}" >> ${node%.dep}groupxx.dep
363 fi
364 lines_to_remove="$lines_to_remove $id_of_dep"
365 fi
366 done
367 } <$node
368 for id_of_dep in $lines_to_remove; do
369 sed "/\ $id_of_dep\$/d" -i $node
370 done
371done
372for node in $(grep -l ' f ' *); do
373 echo Process "first" deps in $node
374 lines_to_change=
375 {
376 while read prio_of_dep build_of_dep id_of_dep; do
377 if test $build_of_dep = f; then
378 if ! test -f ${id_of_dep}-pass1.dep; then
379 cp ${id_of_dep}{,-pass1}.dep;
380 fi
381 lines_to_change="$lines_to_change $id_of_dep"
382 unset lr # lines to remove in -pass1
383 {
384 while read p b start; do
385 seen=" " # global variable used in "path_to"
386 if path_to ${start}.dep ${node%.dep} $p; then
387 lr="$lr $start"
388 fi
389 done
390 } < ${id_of_dep}-pass1.dep
391 for p in $lr; do
392 sed "/\ $p\$/d" -i ${id_of_dep}-pass1.dep
393 done
394 fi
395 done
396 } <$node
397 for id_of_dep in $lines_to_change; do
398 sed "/\ $id_of_dep\$/"'{s/[[:digit:]] f /1 b /;s/$/-pass1/}' -i $node
399 done
400done
401} # End clean_subgraph
402
403#----------------------------#
404generate_dependency_tree() { #
405#----------------------------#
406: <<inline_doc
407 function: Create a subtree of the dependency tree
408 (recursive function)
409 input vars: $1 : file with a list of targets (child nodes)
410 the first line of the file is the link
411 $2 : priority (1=req, 2=rec, 3=opt)
412 returns: 0 if the tree has been successfully created
413 1 if we are backing up to the parent of a circular dep
414 and there are only required deps in the cycle
415 2 if we are backing up to the parent of a circular dep
416 and there are recommended deps and no optional deps in the
417 cycle
418 3 if we are backing up to the parent of a circular dep
419 and there are optiional deps in the cycle
420 modifies: vars: ParentNode is set when return is not 0
421 output: files: for each <pkg> with dependencies in $1,
422 a file <pkg>.tree and its dependencies
423 on error: nothing
424 on success: nothing
425inline_doc
426
427local depFile=$1
428local priority=$2
429local -a rootlink
430local -a priolink
431local -a otherlink
432local -i depth
433local -i count=0
434local id_of_dep
435local build_of_dep
436local prio_of_dep
437local parent
438local lines_to_remove=
439local srootlink
440local priostring
441local dpriostring
442local i
443
444{
445read -a rootlink
446depth=${#rootlink[*]}
447read -a priolink
448srootlink="${rootlink[*]} "
449case $priority in
450 1) priostring=required ;;
451 2) priostring=recommended ;;
452 3) priostring=optional ;;
453esac
454# start of depFile
455echo -en "\nNode: $depth${spaceSTR:0:$depth}${RED}${depFile%.tree}${OFF} $priostring"
456
457while read prio_of_dep build_of_dep id_of_dep; do
458 case $prio_of_dep in
459 1) dpriostring=required ;;
460 2) dpriostring=recommended ;;
461 3) dpriostring=optional ;;
462 esac
463# count entries in file
464 (( count++ ))
465# Has this entry already been seen?
466 if [ -f ${id_of_dep}.tree ]; then # found ${id_of_dep}.tree already in tree
467 otherlink=($(head -n 1 ${id_of_dep}.tree))
468 if [ -z "${otherlink[*]}" ]
469 then echo otherlink empty for $id_of_dep.tree
470 echo This should not happen, but happens to happen...
471 exit 1
472 fi
473#Do not use "${rootlink[*]}" =~ "${otherlink[*]}": case rootlink=(1 11)
474# and otherlink=(1 1)
475 if [[ ${srootlink#"${otherlink[*]} "} != ${srootlink} ]]; then # cir. dep
476 echo -en "\nCirc: $((depth+1))${spaceSTR:0:$((depth+1))}${YELLOW}${id_of_dep}${OFF} $dpriostring"
477# Find lowest priority in branch from parent to depFile:
478 p2=0
479 for (( i=${#otherlink[*]}; i < $depth ; i++ )) ; do
480 if (( ${priolink[i]} > $p2 )); then p2=${priolink[i]}; fi
481 done
482 if (( $prio_of_dep >= $p2 )); then # prune
483 lines_to_remove="$lines_to_remove $id_of_dep"
484 sed -i "/$id_of_dep/d" ${depFile/.tree/.dep}
485 else # find and set parent, then return lowest priority
486# The parent has the same link without the last entry.
487# We do not need otherlink anymore so just destroy the last element
488 unset otherlink[-1]
489 parentNode=$(grep ^"${otherlink[*]}"\$ -l *)
490 return $p2
491 fi
492 else # not circular: prune tree (but not .dep, since it may happen that
493 # the tree is destroyed and rebuilt in another order)
494 lines_to_remove="$lines_to_remove $id_of_dep"
495 fi # circular or not
496 continue # this dependency has already been seen, and the associated
497 # subtree computed. We are done
498 fi # Has this entry already been seen?
499# So, this entry has not already been seen. Let's build the corresponding
500# subtree. First check there is a subtree...
501# Use -s, because it may happen that after removing lines, .dep exists
502# but is empty.
503 if [[ -s ${id_of_dep}.dep ]]; then # this dependency has dependencies
504 sed "1i${rootlink[*]} $count\\
505${priolink[*]} $prio_of_dep" < ${id_of_dep}.dep \
506 > ${id_of_dep}.tree # add link and priolink
507 generate_dependency_tree ${id_of_dep}.tree $prio_of_dep
508# Test return value, in case we exchange dependencies
509 p2=$?
510 case $p2 in
511 0) # Normal return
512 ;;
513 $prio_of_dep) # we remove this dep, but since it may become unreachable,
514 # move it to be built later (as a dep of parent).
515 tree_erase ${id_of_dep}.tree
516 lines_to_remove="$lines_to_remove $id_of_dep"
517 sed -i "/${id_of_dep}/d" ${depFile/.tree/.dep}
518 echo "$prio_of_dep b $id_of_dep" >> $parentNode
519# must be added to .dep in case parentNode is destroyed when erasing
520# the tree
521 echo "$prio_of_dep b $id_of_dep" >> ${parentNode/.tree/.dep}
522 continue
523 ;;
524 *) # We are backing up
525 return $p2
526 ;;
527 esac
528 else # id_of_dep has no dependencies, just record the link in a file
529 # and print
530 echo "${rootlink[*]} $count" > ${id_of_dep}.tree
531 echo -en "\nLeaf: $(($depth+1))${spaceSTR:0:$(($depth+1))}${CYAN}${id_of_dep}${OFF} $dpriostring"
532 fi
533done
534echo -en "\n End: $depth${spaceSTR:0:$depth}${GREEN}${depFile%.tree}${OFF}"
535} <$depFile
536# It may happen that a file is created with several times
537# the same line. Normally, all those lines but one
538# would be flagged to be removed (or all of them if
539# the dependency appeared before). A simple sed /$line/d
540# destroys all the lines. We should instead remove
541# only one for each appearance of it in lines_to_remove.
542# so first get the position of last line and then delete
543# that line
544for line in $lines_to_remove
545 do lineno=$(sed -n /^[[:digit:]]\ b\ $line\$/= $depFile | tail -n1)
546 sed -i ${lineno}d $depFile
547done
548return 0
549}
550
551
552#---------------#
553tree_browse() { #
554#---------------#
555local file=$1
556local f
557
558#echo file=$file
559for f in $(grep '[^0-9 ]' $file | sed 's/.* //'); do
560# echo f=$f
561 if grep -q '[^0-9 ]' ${f}.tree ; then
562 tree_browse ${f}.tree
563 fi
564 echo $f
565done
566}
567
568#--------------#
569tree_erase() { #
570#--------------#
571local file=$1
572local f
573local rootlink
574local rootlink2
575
576#echo file=$file
577rootlink="$(head -n1 $file) "
578for f in $(grep '[^0-9 ]' $file | sed 's/.* //'); do
579 if [ -f ${f}.tree ]; then
580 rootlink2="$(head -n1 ${f}.tree) "
581# We want two things:
582# i) do not erase the file if it is in another branch
583# ii) do not erase the file if there is a circular dependency
584# for case i), we test that rootlink is contained in rootlink2
585# for case ii), we test that rootlink2 is not contained in
586# rootlink.
587# See comment above about srootlink
588 if [[ ${rootlink2#${rootlink}} != ${rootlink2} &&
589 ${rootlink#${rootlink2}} == ${rootlink} ]] ; then
590 tree_erase ${f}.tree
591 fi
592 fi
593done
594rm -f $file
595}
Note: See TracBrowser for help on using the repository browser.