Jiffies, uptime, WTF?!?!
Friday, February 17th, 2006
A week or so ago, I noticed that a machine I admin had been rebooted 50 days ago, and another had been rebooted 10 days ago. “Bit odd!”, thinks I, “Who did that?” I’m enormously busy at the moment though, so I just had a look every now and then through the week to see if I could understand why. The two boxes that had been rebooted were Debian Sarges, while an OpenBSD box next to them happily reported 500+ days of uptime.
Tonight I had a closer look. I noticed some curious bits. Continuum thought it had been up for 45 days on a machine that had been up 10 days. In fact, top thought that it had had 12 days of CPU time, on a machine that had been up 10 days. Damn, that’s impressive. The ps command was no help though: its idea of when things had started was pathetic, and it thought Continuum had been up since 2007.
Same kind of ps issues on the other machine. Cron had been up since 2007, along with some others. Bizarre. Something messed up in the process stuff?
Then I had my brainwave. Each machine can be randomly screwed up, but it’s very unlikely that they’d be screwed up in sync. /var/log/dmesg shows the boot messages for a machine - I’d thought it curious that each machine should still show the boot messages from 500+ days ago. So I compared them. The difference between their start times 2 years ago? 40 days. The difference between the surprising events? 40 days. Huh?
A friend of mine who’d been hearing all this over IM (Thanks Jon!) did the right thing and had a quick google. Lo and behold, 497 days of uptime is the point at which your Linux OS will start lying to you - http://www.uwsg.iu.edu/hypermail/linux/kernel/0202.2/0337.html.
You live and learn. And curse and blaspheme and feel down for a week. As for the jiffies? I’ve no idea - some kernel thing and at 3am I’ve lost the mental focus to understand.
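For anyone curious where the 497 comes from, here is a rough sketch of the arithmetic. It assumes the kernel counts uptime in ticks ("jiffies") held in a 32-bit unsigned counter, and that the counter advances 100 times a second; I haven't checked either on these boxes. The true uptimes of 507 and 547 days are made-up values, picked only because they match the 10 and 50 days the two machines reported.

```python
# Sketch only. Assumptions, not verified on the machines in this post:
#   - uptime is tracked in a 32-bit unsigned tick counter
#   - the counter advances HZ = 100 times per second
HZ = 100
COUNTER_BITS = 32
SECONDS_PER_DAY = 86_400

WRAP_TICKS = 2 ** COUNTER_BITS  # number of ticks before the counter rolls over


def wrap_period_days(hz=HZ):
    """Days of uptime after which the tick counter wraps back to zero."""
    return WRAP_TICKS / hz / SECONDS_PER_DAY


def reported_uptime_days(true_uptime_days, hz=HZ):
    """Uptime a tool would show if it only saw the wrapped counter."""
    ticks = int(true_uptime_days * SECONDS_PER_DAY * hz)
    return (ticks % WRAP_TICKS) / hz / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"counter wraps after about {wrap_period_days():.1f} days")
    # Hypothetical true uptimes for the two boxes, 40 days apart.
    for true_days in (507, 547):
        shown = reported_uptime_days(true_days)
        print(f"really up {true_days} days, looks like {shown:.1f} days")
```

Under those assumptions, two machines booted 40 days apart hit the wrap 40 days apart too, which would explain why the two "reboots" lined up so neatly with the dmesg timestamps.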
