Having now worked so much with Linux, I have developed an unhealthy approach to file system care. Initially by accident, then by design (to test), and now by habit, when I shut down a Linux machine I just slam the lid, or more often simply pull the power on it (I'm talking laptops here - not really advisable on servers, especially live production ones).
Well, I've been waiting for some kind of corruption, longing for the fsck to be forced on me, waiting to sit there for the entire afternoon typing "yes", because I've forgotten to provide the -y option (and worrying that ^C would upset it further - which it wouldn't!).
Nothing! A bit of a delay on subsequent boots, but no disaster of any description.
This week we are working on porting our generic UNIX admin course onto SPARCs with Solaris 9 (I'm still not up to speed with the new Solaris 10 features, but that's a totally different story).
We are working at the brand new offices in Birmingham (on the edge of Chinatown - wonderful, but that's yet another different story). The machines were shipped from London. Once connected - power on, and... three out of four boxes booted OK, the last one would not.
Guess what: I had to do a manual fsck on the /var partition. Guess what: I forgot the -y option. Except that I did interrupt it this time ;-)
Fsck declared the partition healthy - boot into runlevel 3 - no joy: utmpx file missing. Back to single user: it turned out that most of the contents of the /var partition were missing! Once I had recreated most of it by hand, the machine came up - eventually.
Now then - I might be totally unfair to Solaris at this point, but the corruption has no known history or explanation. The machines had been installed at the same time from a jumpstart (logs confirm it), then they were shut down and shipped. Did the engineer pull the power? Very likely. Did the engineer remove some of the /var tree by hand? I doubt it very much.
Well, at least I had a chance to reacquaint myself with the good old fsck!