Opened 16 years ago

Closed 16 years ago

#2189 closed defect (fixed)

cleanfs may not clean /var files when system clock is localtime

Reported by: dnicholson@…
Owned by: DJ Lucas
Priority: normal
Milestone: 6.4
Component: Bootscripts
Version: SVN
Severity: normal
Keywords: bootscripts
Cc:

Description

Continuation from #2160. The cleanfs script checks if files are older than /proc before removing them from /var/run and /var/lock. However, /proc is mounted before the system clock is synced from the hardware clock in the clock script. This can cause the modification time of /proc to be inaccurate, especially when the hardware clock is in localtime, in which case the time on /proc will be off by the offset from UTC.

This can cause files in /var from a previous boot to appear newer than /proc, so they will not be removed by cleanfs. Whether they are pid or lock files, the result is stale files that confuse the other bootscripts.
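A minimal sketch of the failure mode, using a scratch directory instead of the real /proc and /var/run. It assumes GNU touch (for `-d '... ago'`) and a hardware clock kept in a localtime five hours west of UTC; `list_removable` and all file names are made up for illustration and are not taken from cleanfs.

```shell
#!/bin/sh
# Sketch only: reproduce the "! -newer MARKER" test with a skewed marker.
# Assumes GNU touch; the helper name and file names are hypothetical.

# list_removable DIR MARKER: print the regular files in DIR that are not
# newer than MARKER, i.e. the ones a marker-based cleanup would delete.
list_removable() {
    find "$1" -xdev -type f ! -newer "$2"
}

demo=$(mktemp -d)
mkdir "$demo/run"

# A pid file left over from the previous boot, written an hour ago.
touch -d '1 hour ago' "$demo/run/stale.pid"

# Marker stamped before the clock was corrected: it appears to be five
# hours in the past, so the stale pid file looks newer than the marker.
touch -d '5 hours ago' "$demo/proc-marker"

# Marker stamped with the correct current time, for comparison.
touch "$demo/good-marker"

echo "skewed marker:"
list_removable "$demo/run" "$demo/proc-marker"   # stale.pid is not listed
echo "correct marker:"
list_removable "$demo/run" "$demo/good-marker"   # stale.pid is listed
```

With the skewed marker the stale file survives the cleanup, which is the behaviour reported here.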

There are two possible solutions:

  1. Don't use a marker file at all. I noticed that on Fedora they just reap everything in /var/run and /var/lock during sysinit. One drawback would be that no processes could write to those directories before cleanfs, but that shouldn't be a problem if cleanfs is ordered right after mountfs. Since /var isn't guaranteed to be mounted until mountfs, you wouldn't lose anything.
  2. Choose a different marker file. I think that /etc/mtab is a good candidate. Again, if you put cleanfs right after mountfs, that's the first time /var is guaranteed to be writable. And /etc/mtab is updated during mountfs, so you would be sure that everything older than mtab is from a previous boot. However, if you run cleanfs sometime after boot, that criterion is probably not accurate anymore.

I'd personally rather go with 1.
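Option 1 could look roughly like the following. This is a sketch under assumptions, not the actual cleanfs code: `reap_state_files` is a hypothetical helper, and the optional root prefix exists only so it can be tried against a scratch tree.

```shell
#!/bin/sh
# Sketch only: option 1, remove every regular file with no marker comparison.
# reap_state_files is a hypothetical helper, not the real cleanfs script.
# With no argument it targets the real /var/run and /var/lock; pass a
# scratch directory as the root prefix to try it safely.
reap_state_files() {
    root=${1:-}
    for dir in "$root/var/run" "$root/var/lock"; do
        [ -d "$dir" ] || continue
        # Keep the directory tree, delete only the files inside it.
        find "$dir" -xdev -type f -exec rm -f {} +
    done
}
```

Because nothing is compared against a timestamp, the result does not depend on when the system clock is set. It is only safe if it runs before any service writes to those directories, and any file that must exist afterwards would have to be recreated by the script.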

Change History (17)

comment:1 by bdubbs@…, 16 years ago

Just a thought.

Why not mount /var/run and /var/lock as tmpfs and do away with cleaning those directories?

in reply to:  1 ; comment:2 by DJ Lucas, 16 years ago

Replying to bdubbs@linuxfromscratch.org:

Just a thought.

I think it is a good idea for most users. I personally will not use tmpfs. I actually use the bootscript output in a job, but I'm probably a very strange exception. Also, make sure that utmp is created if it doesn't exist...something from the depinit maintainer's notes. Probably doesn't affect us, but I'm not sure.

This still doesn't fix #2160 (but I think Dan's proposed patch does, I just haven't used the main scripts in a long time), and Gerrard mentioned moving setclock up in the order in another part of the thread on LFS-Dev. setclock should still be moved if for nothing other than better logging IMO.

in reply to:  2 comment:3 by dnicholson@…, 16 years ago

Replying to dj@linuxfromscratch.org:

This still doesn't fix #2160 (but I think Dan's proposed patch does, I just haven't used the main scripts in a long time), and Gerrard mentioned moving setclock up in the order in another part of the thread on LFS-Dev. setclock should still be moved if for nothing other than better logging IMO.

There are actually three issues here.

The patch for #2160 is still needed in any situation, I think. Whether or not there's goofiness from initial boot with stale pid files or it happens at a later time when you're trying to start services, I think it's best to just ignore pid files that point to dead or incorrect processes and remove them.
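As a rough illustration of that idea (not the patch from #2160 itself), a helper could check whether the recorded pid still exists before trusting the file. `remove_if_stale` is a hypothetical name, and the sketch covers only dead processes, not a pid that has been reused by an unrelated program.

```shell
#!/bin/sh
# Sketch only: drop a pid file when the process it names is gone.
# remove_if_stale is a hypothetical helper, not the patch from #2160.
remove_if_stale() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0

    # Take the first line and strip any whitespace around the number.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')

    case $pid in
        ''|*[!0-9]*)
            # Empty or non-numeric content cannot name a process.
            rm -f "$pidfile"
            return 0
            ;;
    esac

    # kill -0 sends no signal; it only reports whether the pid can be
    # signalled. Caveat: it also fails for live processes owned by another
    # user unless the caller is root, so run this as root.
    if ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$pidfile"
    fi
    return 0
}
```

A pid file that names a running process is left alone; one that names a process that has exited is removed.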

Then there's the cleanfs issue with trying to remove state files that are from a previous boot. Mounting /var/run and /var/lock as tmpfs would solve that issue. I think it's a bit of overkill, but I wouldn't be opposed to it.

Then there's the issue of getting the clock synchronized as early as possible. It wouldn't affect either of the above two issues directly, but there are plenty of other good reasons to do it.

comment:5 by bdubbs@…, 16 years ago

Milestone: 7.0 → 6.4

comment:6 by bdubbs@…, 16 years ago

Component: Book → Bootscripts

comment:7 by DJ Lucas, 16 years ago

Owner: changed from lfs-book@… to DJ Lucas
Status: new → assigned

comment:8 by DJ Lucas, 16 years ago

I disagree about the third comment in the last entry regarding how it would affect the cleanfs script. Looking at the following output, ISTM that setclock could actually be run even before mountkernfs or udev. The advantage is for bootlogging, *all* logged output will have (at least according to the hwclock) the correct date and time.

[dj@name25 lfs-bootscripts-20081023]# sudo /sbin/hwclock --test --debug --directisa --noadjfile --hctosys --utc
hwclock from util-linux-2.12r
Using direct I/O instructions to ISA clock.
Assuming hardware clock is kept in UTC time.
Waiting for clock tick...
...got clock tick
Time read from Hardware Clock: 2008/10/26 06:04:26
Hw clock time : 2008/10/26 06:04:26 = 1225001066 seconds since 1969
Calling settimeofday:
	tv.tv_sec = 1225001066, tv.tv_usec = 0
	tz.tz_minuteswest = 300
Not setting system clock because running in test mode.
[dj@name25 lfs-bootscripts-20081023]# 

The only problem is the command as given is only valid on x86. The --directisa switch is not valid on other arches, however, it can be left out by redirecting stderr to the static node in /lib/udev/devices/null.

[dj@name25 lfs-bootscripts-20081023]# sudo mv /dev/rtc /dev/rtc-bak
[dj@name25 lfs-bootscripts-20081023]# sudo /sbin/hwclock --test --debug --noadjfile --hctosys --utc
hwclock from util-linux-2.12r
hwclock: Open of /dev/rtc failed, errno=2: No such file or directory.
Using direct I/O instructions to ISA clock.
<snip otherwise similar output>
[dj@name25 lfs-bootscripts-20081023]# sudo /sbin/hwclock --test --debug --noadjfile --hctosys --utc 2>/lib/udev/devices/null
hwclock from util-linux-2.12r
Using direct I/O instructions to ISA clock.
<snip>
[dj@name25 lfs-bootscripts-20081023]# sudo mv /dev/rtc-bak /dev/rtc

Off to testing...

in reply to:  8 comment:9 by bryan@linuxfromscratch.org, 16 years ago

Replying to dj@linuxfromscratch.org:

The only problem is the command as given is only valid on x86.

And not even x86-64! No, it's not supported yet, but I thought it was planned for 7.0?

(I've just tried it there, with a 64-bit hwclock from an older build of CLFS:

# mv /dev/rtc /dev/rtc-old
# hwclock --test --debug --noadjfile --hctosys --utc
hwclock from util-linux-2.12r
hwclock: Open of /dev/rtc failed, errno=2: No such file or directory.
No usable clock interface found.
Cannot access the Hardware Clock via any known method.
# echo $?
1

to confirm.)

The --directisa switch is not valid on other arches, however, it can be left out by redirecting stderr to the static node in /lib/udev/devices/null.

It's not that the switch can be left out on other arches (it doesn't make my 64-bit hwclock fail, for instance). It's that "direct ISA" mode doesn't work at all on other arches. ;-)

I think directly after udev makes the most sense. No, hwclock doesn't require any particular device node on x86-32, but x86-32 is pretty much almost-dead at this point anyway.

(I suspect, but am not sure, that the lack of direct-ISA support is due to the 64-bit kernel, not the 64-bit hwclock. But I haven't tried to confirm or refute this.)

comment:10 by DJ Lucas, 16 years ago

OK. No go on that solution. I'm tempted to suggest adding a device in the root filesystem before the tmpfs is mounted, and may do so if the community decides to go with lsb-v3 for 7.0 to cover the accurate boot logging issue. Would be nice if we could choose the rtc. For now, however, stock lfs-bootscripts do not have a need for it. Immediately after udev is the correct place. Thanks for testing this.

comment:11 by DJ Lucas, 16 years ago

Oops. Lost track of the focus of this bug. The above change does nothing about the unclean /var because mountkernfs is still run before the system clock is set. Getting rid of '! -newer /proc' and moving cleanfs to just after mountfs is the best solution for this ticket.

OT until after 6.4: I'll ping -dev later, but a suggestion for the future is to create /dev/rtc on the root filesystem, so that setclock can be run first thing on boot. It will later be 'covered' by the tmpfs mount. To test would be something like this from a rescue CD or the host environment (not runlevel 1):

mount $LFS &&
mknod $LFS/dev/rtc c 10 135 &&
mv $LFS/etc/rc.d/rcsysinit.d/S{25,00}setclock &&
mv $LFS/etc/rc.d/rcsysinit.d/S0{0,1}mountkernfs &&
sed -i '/hctosys/s@/dev/null@/lib/udev/devices/null@' $LFS/etc/init.d/setclock

And of course to undo, 'rm $LFS/dev/rtc' and reverse all 3 substitutions above. :-)

comment:12 by bdubbs@…, 16 years ago

DJ, Let's just fix it now. Go ahead and add /dev/rtc to

  • 6.2.1. Creating Initial Device Nodes

And move the boot script order.

We'll let -rc1 stay out a while and use that to ID problems.

comment:13 by DJ Lucas, 16 years ago

I'm not sure if that works on x86_64, though I can't see why it wouldn't. I realize that x86_64 is not the official target, but I doubt there are many left that use only 32 (except for me). I don't have any flavor of Linux on my 64 boxes at this time. I really need to remedy that. Anyway, would somebody mind testing the above changes?

in reply to:  13 comment:14 by bdubbs@…, 16 years ago

Replying to dj@linuxfromscratch.org:

I'm not sure if that works on x86_64, though I can't see why it wouldn't. I realize that x86_64 is not the official target, but I doubt there are many left that use only 32 (except for me).

I don't have an x86_64 and I don't know why I would want one. An Intel(R) Pentium(R) 4 CPU 3.20GHz with Hyper-Threading works quite well for everything I need. I run dual 24" monitors, 2G RAM, and have lots of disk space, both local and via nfs. That lets me run a couple of vmware instances, browser, mail, and compiles without breaking a sweat.

comment:15 by bryan@linuxfromscratch.org, 16 years ago

/dev/rtc should work fine for x86_64. That's what setclock uses even if you do pass --directisa (since --directisa doesn't work).

HOWEVER: Which /dev/rtc are you planning on using? There's more than one -- my kernels use the new (well, new as of 2.6.20something IIRC) RTC class devices; specifically, rtc-cmos.

/dev/rtc is a symlink to rtc0 (via udev), and rtc0 is currently major 252, minor 0. This doesn't match the old /dev/rtc misc-device (major 10, minor 135), whose support is disabled in my kernel. (The ioctl and read/write interface to user programs is the same, just the kernel side support is different.) And we can't create both as /dev/rtc.

No matter which device gets created, users of one or the other of those setups will break...
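To see which of the two a given system provides, the device numbers behind /dev/rtc can be read directly. The following is a small sketch that assumes GNU stat (whose `%t` and `%T` formats print the major and minor numbers in hex); `dev_numbers` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print "major:minor" in decimal for a device node.
# Assumes GNU stat; dev_numbers is a hypothetical helper name.
dev_numbers() {
    # -L follows symlinks, so /dev/rtc -> rtc0 reports rtc0's numbers.
    nums=$(stat -L -c '%t %T' "$1") || return 1
    major=${nums% *}
    minor=${nums#* }
    # stat prints hex digits without a prefix; add 0x so printf converts them.
    printf '%d:%d\n' "0x$major" "0x$minor"
}
```

Running `dev_numbers /dev/rtc` on a booted system would show whether the node is the old misc device (10:135) or an RTC class device such as the 252:0 mentioned above.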

(Could just run setclock from the initramfs, of course, after udev finishes. That will still be well before any other bootscripts, and should work with any /dev/rtc. ;-) )

comment:16 by DJ Lucas, 16 years ago

There goes that idea. The host's /dev/rtc is not reliable because of the kernel config option (BTW, is this HPET or something I haven't seen?). x86_64 and Macs (I guess all macs) have no alternate device or direct I/O option, and without sysfs or proc, there is no easy way to find out what the correct device should be. Hopefully direct I/O is introduced for at least x86_64 in the future. At any rate, this particular problem is fixed by the original suggestion, and was inadvertently never introduced into the lsb-v3 scripts.

comment:17 by DJ Lucas, 16 years ago

Add any further info on this to #2266 please.

comment:18 by DJ Lucas, 16 years ago

Resolution: fixed
Status: assigned → closed

Fixed in r8721 and r8722.
