Opened 6 years ago

Closed 5 years ago

#11286 closed enhancement (fixed)

mdadm-4.1 (wait until 4.2)

Reported by: Bruce Dubbs
Owned by: thomas
Priority: normal
Milestone: hold
Component: BOOK
Version: SVN
Severity: normal
Keywords:
Cc:

Description

New minor version.

Change History (17)

comment:1 by thomas, 6 years ago

Owner: changed from blfs-book to thomas
Status: new → assigned

sed to avoid compile error no longer needed.

Checking the sed for test suite...

comment:2 by thomas, 6 years ago

sed for testsuite has no effect, could drop it.

mdassemble no longer exists


comment:3 by Douglas R. Reno, 6 years ago

That's kind of a weird binary to remove. I know we don't use it in our initramfs, but it's still important for some systems that are currently in production. I know my server uses it, at least.

comment:4 by thomas, 6 years ago

man mdassemble of mdadm-4.0 says "... Invoking mdassemble has the same effect as invoking mdadm --assemble --scan. ...". In 4.1 that manpage disappeared, too.

I assume that mdadm can be used instead. Running "make everything" also generates an mdadm.static, which {sh,c}ould be used in the initrd.

comment:5 by Bruce Dubbs, 6 years ago

This is a ticket from the filesystems chapter that has been hanging for about a month. If you want, I can take it.

comment:6 by Bruce Dubbs, 6 years ago

ANNOUNCE-4.1:

The update constitutes more than one year of enhancements and bug fixes including for IMSM RAID, Partial Parity Log, clustered RAID support, improved testing, and gcc-8 support.

comment:7 by thomas, 6 years ago

I have the feeling that more has changed than just a few bugfixes (mdassemble disappeared), and the changes should be tested somehow. I have a machine with RAID5 volumes, but it's the main node (hosting all the VMs and such things), so testing is not as easy as building another LFS instance. But of course, if you want to upgrade right now, just do it.

From time to time the array gets rebuilt on a normal reboot (the system is then more or less unusably slow for 1.5-2h, which makes testing not much fun). Months ago I sent a patch for the LFS-bootscripts to the LFS-dev ML which prevents mdadm from being killed during shutdown while the devices are still in use. After applying it, things are much better, but it still happens for whatever reason. With the new mdadm I'd like to poke around in this area again. That's the reason why the ticket remains open, but yes, that's independent of which version is in the book.

So, if you want to upgrade right now, go for it. Maybe a one-liner script could be added to simulate a mdassemble for compatibility reasons (if it would make sense).
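The compatibility one-liner suggested above could look roughly like the following. This is only a sketch: it relies on the mdadm-4.0 man page statement quoted in comment:4 that mdassemble has the same effect as mdadm --assemble --scan. The MDADM variable is a hypothetical hook added here so the wrapper can be dry-run, and nothing like this ships with mdadm-4.1.

```shell
#!/bin/sh
# Sketch of a stand-in for the removed mdassemble binary.
# Assumption: per the mdadm-4.0 man page, mdassemble is equivalent to
# "mdadm --assemble --scan". MDADM is a hypothetical override for dry runs.
mdassemble() {
    "${MDADM:-mdadm}" --assemble --scan
}
```

Saved as a script, a final line calling mdassemble would complete it. Setting MDADM=echo prints the arguments instead of touching any arrays.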

comment:8 by Bruce Dubbs, 6 years ago

You shouldn't need to use RAID to run the tests. The tests should use things like /dev/loop or create /dev/md0 and use that. I tried to run the tests on a system that did not have mdadm installed and got an error (mdadm command not found!). So I ran the tests after installing, but still got a lot of failures:

I got 21 failures (out of 80):

fail01r5integ.log
fail01raid6integ.log
fail04r5swap.log
fail05r1-re-add-nosuper.log
fail07autoassemble.log
fail07autodetect.log
fail07changelevelintr.log
fail07changelevels.log
fail07reshape5intr.log
fail07revert-grow.log
fail07revert-inplace.log
fail07revert-shrink.log
fail07testreshape5.log
fail09imsm-create-fail-rebuild.log
fail09imsm-overlap.log
fail10ddf-assemble-missing.log
fail10ddf-fail-create-race.log
fail10ddf-fail-twice.log
fail10ddf-incremental-wrong-order.log
fail19raid6auto-repair.log
fail19raid6repair.log

Thomas, did you get something different?

The errors may have something to do with the amount of RAM on the system. I have 16G of RAM but still got several errors like: dd: error writing '/dev/loop2': No space left on device

Note that I have 16G on my test system.
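A list like the one above can be produced from the log directory instead of by hand. The following is a minimal sketch, assuming each failed test leaves a file named fail<testname>.log; the function name and the default directory test-logs are assumptions for illustration.

```shell
# List failed tests by looking for fail*.log files in a log directory.
# Assumption: each failed test leaves "fail<testname>.log" there, as in
# the listing above. The default directory name is hypothetical.
list_failed_tests() {
    logdir="${1:-test-logs}"
    for f in "$logdir"/fail*.log; do
        [ -e "$f" ] || continue      # glob matched nothing
        name="${f##*/fail}"          # strip directory and "fail" prefix
        echo "${name%.log}"          # strip ".log" suffix
    done
}
```

Piping the result through wc -l gives the failure count, and redirecting it to a file makes it easy to compare runs from different machines.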

comment:9 by thomas, 6 years ago

You must not even have mdadm active on the system where you want to run the tests. That makes it even harder. I have a 32GB machine, but all disks are RAID5 (well, yes, that design is open to discussion), so it's not easy for me to run the test suite.

In 8GB VMs (where no mdadm is active) I also see several tests failing. I'm just re-running the tests to compare the results to your list.

The tests I spoke about are about getting mdadm to run stably in a real environment. That includes the questions of when and how to terminate the mdadm process at shutdown, and what causes an array to be rebuilt at startup even though there was no crash, just a clean and normal reboot.

comment:10 by thomas, 6 years ago

I have different results. It seems that the tests do not even get past test 10ddf-create. The test run terminated after this failure.

/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00linear... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00multipath... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00names... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00raid0... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00raid1... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00raid10... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00raid4... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00raid5... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00raid6... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/00readonly... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/01r1fail... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/01r5fail... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/01r5integ... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/01raid6integ... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/01replace... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/02lineargrow... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/02r1add... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/02r1grow... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/02r5grow... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/02r6grow... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/03assem-incr... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/03r0assem... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/03r5assem... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/03r5assem-failed... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/03r5assemV1... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/04r0update... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/04r1update... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/04r5swap... FAILED - see test-logs/04r5swap.log and test-logs/fail04r5swap.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/04update-metadata... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/04update-uuid... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-add-internalbitmap... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-add-internalbitmap-v1a... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-add-internalbitmap-v1b... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-add-internalbitmap-v1c... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-bitmapfile... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-grow-external... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-grow-internal... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-grow-internal-1... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-internalbitmap... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-internalbitmap-v1a... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-internalbitmap-v1b... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-internalbitmap-v1c... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-n3-bitmapfile... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-re-add... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-re-add-nosuper... FAILED - see test-logs/05r1-re-add-nosuper.log and test-logs/fail05r1-re-add-nosuper.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-remove-internalbitmap... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-remove-internalbitmap-v1a... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-remove-internalbitmap-v1b... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r1-remove-internalbitmap-v1c... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r5-bitmapfile... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r5-internalbitmap... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r6-bitmapfile... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/05r6tor0... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/06name... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/06sysfs... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/06wrmostly... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07autoassemble... FAILED - see test-logs/07autoassemble.log and test-logs/fail07autoassemble.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07autodetect... FAILED - see test-logs/07autodetect.log and test-logs/fail07autodetect.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07changelevelintr... FAILED - see test-logs/07changelevelintr.log and test-logs/fail07changelevelintr.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07changelevels... FAILED - see test-logs/07changelevels.log and test-logs/fail07changelevels.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07layouts... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07reshape5intr... FAILED - see test-logs/07reshape5intr.log and test-logs/fail07reshape5intr.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07revert-grow... FAILED - see test-logs/07revert-grow.log and test-logs/fail07revert-grow.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07revert-inplace... FAILED - see test-logs/07revert-inplace.log and test-logs/fail07revert-inplace.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07revert-shrink... FAILED - see test-logs/07revert-shrink.log and test-logs/fail07revert-shrink.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/07testreshape5... FAILED - see test-logs/07testreshape5.log and test-logs/fail07testreshape5.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/09imsm-assemble... succeeded
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/09imsm-create-fail-rebuild... FAILED - see test-logs/09imsm-create-fail-rebuild.log and test-logs/fail09imsm-create-fail-rebuild.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/09imsm-overlap... FAILED - see test-logs/09imsm-overlap.log and test-logs/fail09imsm-overlap.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/10ddf-assemble-missing... FAILED - see test-logs/10ddf-assemble-missing.log and test-logs/fail10ddf-assemble-missing.log for details
/home/lfs/tmp/mdadm/build/mdadm-4.1/tests/10ddf-create... 
	ERROR: dmesg prints errors when testing 10ddf-create! 

FAILED - see test-logs/10ddf-create.log and test-logs/fail10ddf-create.log for details
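To compare a run like this with someone else's list, the failed test names can be pulled out of the saved runner output. This is a rough sketch that assumes the line layout shown above; failed_from_output is a hypothetical helper name.

```shell
# Print the names of tests marked FAILED in saved runner output.
# Assumption: lines look like ".../tests/<name>... FAILED - see ..." as in
# the output above. A test whose FAILED marker is on a separate line
# (10ddf-create above) is not matched by this simple pattern.
failed_from_output() {
    sed -n 's|^.*/tests/\(.*\)\.\.\. FAILED.*$|\1|p' "$@"
}
```

With two such lists sorted into files, comm -3 shows the tests that failed on only one of the two systems.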

comment:11 by Bruce Dubbs, 6 years ago

It seems pretty odd that we would get these types of failures. About the only thing I can think of is that we have different kernel configurations. I did look at mine a while ago, but I can't figure out what it would be.

One thing we might try is running the build/test on a mainstream distro like Debian and comparing the test log from that.

comment:12 by Pierre Labastie, 6 years ago

Thomas, looks like the last test which failed complained about dmesg printing messages. Maybe you could try to issue dmesg -D. I'm presently running the tests under debian on my machine.

comment:13 by Pierre Labastie, 6 years ago

Forgot to say: don't forget to issue dmesg -E after the tests run...
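The dmesg -D / dmesg -E pair suggested in the last two comments can be wrapped so that the second step is not forgotten. This is a minimal sketch, not a confirmed fix for the 10ddf-create failure; with_console_quiet and the DMESG override are hypothetical names introduced here.

```shell
# Run a command with kernel messages to the console switched off, and
# switch them back on afterwards even if the command fails.
# Assumption: DMESG is a hypothetical override (default: dmesg) for dry runs.
with_console_quiet() {
    "${DMESG:-dmesg}" -D || return 1
    "$@"
    status=$?
    "${DMESG:-dmesg}" -E
    return "$status"
}
```

The command to run is passed as arguments, so the wrapper returns the exit status of the test run itself.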

comment:14 by Bruce Dubbs, 6 years ago

I tried running the tests on a fresh Debian 9.6.0 install and got essentially the same error messages you did. We could say that the package does not come with a working test suite, or say that many tests fail for unknown reasons.

I did post a message upstream a few years ago and they were only marginally helpful. Should we try again?

https://www.spinics.net/lists/raid/msg51535.html

comment:15 by Bruce Dubbs, 6 years ago

Milestone: 8.4 → hold
Summary: mdadm-4.1 → mdadm-4.1 (wait until 4.2)

I'm moving this to hold. There are just too many issues identified by the regression tests. I have spent a lot of time trying to determine why so many tests fail. The failures are for different reasons.

For instance, 05r1-re-add-nosuper fails because a script variable, $dir, is not defined. Once it is defined, the test passes.
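The $dir failure is an instance of a general shell pitfall: an unset variable silently expands to an empty string. The following is a generic sketch of a guard, not the change that makes the test pass. The helper name require_var is hypothetical, and the value $dir should have is not stated in this ticket.

```shell
# Hypothetical helper: fail early if the named variable is unset or empty.
require_var() {
    # Indirect lookup of the variable whose name is in $1.
    eval "[ -n \"\${$1:-}\" ]" || {
        echo "variable '$1' is not set" >&2
        return 1
    }
}
```

A script would call require_var dir || exit 1 before its first use of $dir.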

I checked 07autoassemble and the failure is more complex. What is happening is that it is creating two md devices:

  • mdadm -CR /dev/md1 -l1 -n2 /dev/loop0 /dev/loop1 --homehost=testing
  • mdadm -CR /dev/md0 -l0 -n2 /dev/md1 /dev/loop2 --homehost=testing

and then stopping and auto-reassembling:

  • mdadm -Ss
  • mdadm -As -c /dev/null --homehost=testing

The problem here is that it reassembles md0 before md1 and fails.
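The ordering dependency can be made explicit by assembling the inner array first. This sketch is only an illustration of the dependency and not what the test does, since the test exercises auto-assembly. The device names are taken from the commands above, and the MDADM override and function name are hypothetical.

```shell
# Assemble the nested arrays from the test explicitly, inner array first.
# Assumptions: device names are those from the commands above; MDADM is a
# hypothetical override (default: mdadm) so the order can be dry-run.
assemble_nested() {
    # md1 (RAID1 over loop0+loop1) is a member of md0, so it must exist first.
    "${MDADM:-mdadm}" --assemble /dev/md1 /dev/loop0 /dev/loop1 || return 1
    # md0 (RAID0 over md1+loop2) can only be assembled once md1 is running.
    "${MDADM:-mdadm}" --assemble /dev/md0 /dev/md1 /dev/loop2
}
```

Reversing the two steps reproduces the situation described above: md0 is attempted while its member /dev/md1 does not exist yet.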

I'll also note that the package was released Oct 27, 2018, and Arch has not updated yet, although it is flagged out of date. AFAICT Fedora and Debian have not updated yet either.

comment:16 by Bruce Dubbs, 6 years ago (in reply to: comment:8)

Replying to bdubbs:

You shouldn't need to use RAID to run the tests.

I found this comment in the gentoo instructions:

# The tests edit values in /proc and run tests on software raid devices.
# Thus, they shouldn't be run on systems with active software RAID devices.
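A pre-flight check along the lines of the Gentoo note might look like this. It is a sketch only, assuming the usual /proc/mdstat layout in which each array appears on a line such as "md0 : active raid1 ..."; the function name and the optional file argument are hypothetical.

```shell
# Succeed (exit 0) if the given mdstat file lists at least one active array.
# Assumption: arrays appear as lines like "md0 : active raid1 ..." in the
# usual /proc/mdstat layout. The file argument is a hypothetical hook for
# testing against a saved copy.
has_active_md() {
    mdstat="${1:-/proc/mdstat}"
    [ -r "$mdstat" ] && grep -q '^md[^ ]* : active' "$mdstat"
}
```

A build script could refuse to start the test suite whenever has_active_md succeeds.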

comment:17 by Bruce Dubbs, 5 years ago

Resolution: fixed
Status: assigned → closed

Fixed at revision 22192.
