I like driving in my clown-car.
Sep. 18th, 2010 04:00 pm
(This will be incoherent and likely contain swearing. I am still mostly running on snot and bile.)
There's a montage sequence in 'The First of the Few' where RJ Mitchell is smoking a pipe at a problem and the fellows in the brown coats on the shop-floor are working on lathes and vices in order to build one or other of his fine seaplanes. However, there's some confusion over the routing of the oil-lines, so one of the engineer types has to beetle off to the design shop and ask Mr Mitchell about it. Yer man taps the drawings with his pipe, admits that it's not at all clear and promises to have a new design by the AM.
Nothing particularly strange going on there. A new thing is carefully considered over pipes and pints of mild, mock-ups are tested and drawings filled with well-specified terms are made up such that many examples of the new thing can be made without (by and large) the designer being on hand to personally oversee each one.
So I think it would be really quite nice if system administration could perhaps consider getting a clue and using tools and practices that have been with us since the industrial revolution.
There are no particularly good reasons for machines that are hand-built and/or infrastructures that lack DNS, SSO, central logging, patch-management, security management, trivially repeatable machine instantiation or useful reporting/instrumentation.
And yet. The attitude that such things are a bit hard or strange is part of the background noise.
For instance. Puppet's got the makings of quite a useful machine-management tool. However, one of the early types and/or examples was for login management by hand-hacking the passwd file and copying around SSH keys. Which, what?
I believe we've had working Kerberos + LDAP for the thick end of a decade, and yet people still think that keyed SSH access is pretty swish? Jayzus.
Still, I suppose it's not a set of shared root passwords of different classes, depending on machine type. No-one's daft enough to use that any more...
And. Why are people still surprised when disks fill up? I can see that some Java job going bugfuck and filling the /var partition (Oh, wait, we're all on fucking Linux now so it's all one big / partition. D'oh!) might be something of a black swan, but taking some readings and spotting a change in disk-usage delta isn't entirely rocket science.
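For what it's worth, the "spot a change in the delta" bit can be sketched in a few lines of Python. This is a rough illustration rather than a finished monitor: it assumes you already collect used-bytes readings at a regular interval, and the names (`usage_deltas`, `unusual_growth`) and the `factor` and `min_history` knobs are made up for the example.

```python
from statistics import median


def usage_deltas(readings):
    """Return the change between each consecutive pair of readings."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


def unusual_growth(readings, factor=3.0, min_history=3):
    """Return indices of readings whose growth since the previous reading
    is more than `factor` times the median of all earlier growth.

    `factor` and `min_history` are arbitrary illustrative defaults.
    """
    deltas = usage_deltas(readings)
    flagged = []
    for i in range(min_history, len(deltas)):
        # Baseline is the typical growth seen so far; a shrinking or
        # static disk gives a baseline of zero, so any growth is flagged.
        baseline = max(median(deltas[:i]), 0)
        if deltas[i] > 0 and deltas[i] > factor * baseline:
            flagged.append(i + 1)  # index into `readings`, not `deltas`
    return flagged


if __name__ == "__main__":
    # Steady 10-unit growth, then something goes wrong on the last sample.
    samples = [100, 110, 120, 130, 140, 400]
    print(unusual_growth(samples))
```

The median is used instead of the mean so that one earlier spike doesn't drag the baseline up and hide the next one.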
And. Machine specification and swapfile sizing is still bloody voodoo.
no subject
Date: 2010-09-18 03:18 pm (UTC)
I'm currently maintaining configs for sixty or so Linux and Solaris systems using cfengine and kickstart/jumpstart; I had to try not to be snotty about being able to reinstall a system from bare metal with basically no interaction beyond doing whatever it takes to PXE-boot it...
I'm still surprised when even huge companies who you'd think would know better obviously haven't bothered with any kind of management infrastructure. Not to name any names, but I worked for six months at a place you've heard of and spent WEEKS doing more-or-less by-hand system installs on production systems, which were then never patched or updated again. News of their financial difficulties was not unexpected, considering how much effort and expense they put into doing things in as difficult a way as possible.
no subject
Date: 2010-09-18 03:57 pm (UTC)
I mean, I run a nameserver & DHCP on my home network because both of those things are slightly simpler than falling off a log.
Kerberos is worth the bother as soon as someone leaves and you have to change all the passwords.
Running a local repo/package-management rig is a right faff the first time, but...
Central logging happens the day after a box gets owned.
Etc.
no subject
Date: 2010-09-18 03:25 pm (UTC)
Much of my time before then was spent running machines that had to keep working, and you had to be able to log into them no matter what else was broken or not, so centralised login stuff was out. It can be hacked up so things will keep working in outages (cached creds) but mostly it's still more effort than it's worth unless you've got quite large scales or you really want to keep a bunch of machines identically useful as workstations and such.
no subject
Date: 2010-09-18 03:51 pm (UTC)
Thus in some ways I'm a bit late to the party and this is the sound of me working things out in longhand.
It seems to me that the backup plan for SSO going bugfuck (which generally means that the network's expired and you have bigger problems than not being able to login as yourself) was OOB remote login and the big list of root p/ws that was kept in the fire safe (in a different building, obv). Certainly that approach Worked For Us when Slammer melted one network segment, but there are always corner cases.
no subject
Date: 2010-09-18 04:16 pm (UTC)
Sure centralised login is a lower risk to a machine than most for services but when you only have a small sysadmin team with accounts and odd side effects happen it can be sensible to keep them reliant on nothing else (in that case I had a system to sync accounts out for the sysadmins) so on the odd occasion when everything goes down you can be sure they come up cleanly... which is important for central infrastructure.
no subject
Date: 2010-09-18 04:01 pm (UTC)
no subject
Date: 2010-09-18 04:09 pm (UTC)
In theory, it shouldn't be a problem since we have resizeable filesystems on most useful kit. Also, if you've an automated build process, not including some basic machine-state monitoring is a bit careless.
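As a rough idea of how little "basic machine-state monitoring" can mean, here is a minimal Python sketch that samples a few mount points and flags the full ones. The function names, the 90% limit and the list of paths are assumptions for the sake of the example; only `shutil.disk_usage` is the real standard-library call.

```python
import shutil


def percent_used(total, used):
    """Percentage of the filesystem in use; 0.0 for an empty/unknown total."""
    return 100.0 * used / total if total else 0.0


def check_mounts(paths, limit=90.0, usage=shutil.disk_usage):
    """Map each path to (percent used, over-limit flag).

    `usage` is injectable so the check can be exercised without a real
    disk; by default it is shutil.disk_usage, which returns an object
    with total, used and free fields in bytes.
    """
    report = {}
    for path in paths:
        sample = usage(path)
        pct = percent_used(sample.total, sample.used)
        report[path] = (round(pct, 1), pct >= limit)
    return report


if __name__ == "__main__":
    # Hypothetical mount list; adjust to whatever the build lays down.
    for mount, (pct, too_full) in check_mounts(["/"]).items():
        print(f"{mount}: {pct}% used{' (over limit)' if too_full else ''}")
```

Something like this could be dropped in by the automated build and run from cron, with the output sent to whatever central logging is already in place.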
no subject
Date: 2010-09-20 11:36 am (UTC)
But in essence, all I'm trying to do is put /var on a separate partition and then ln -s /var/home /home
The problem is that whilst partitioning is a sensible thing to do, there aren't any good defaults other than "on a desktop install you probably just want everything on one partition". For any value of server... it depends.
no subject
Date: 2010-09-18 05:04 pm (UTC)
C'est tout.
no subject
Date: 2010-09-18 05:21 pm (UTC)
A lot of my work has been decrufting our chain in general. It's amazing the festering shite I've found. Like discovering the six month old hamburger that fell behind the cooker. "Ah, that's where the maggots are coming from!"
I was looking at Puppet and cfengine and couldn't quite work out from their descriptions if they would do what I was thinking of. I'm seriously this close to writing my own configuration pusher using svn, shell scripts and scp.
no subject
Date: 2010-09-18 06:21 pm (UTC)
The cfengine docs are kind of stupid.
no subject
Date: 2010-09-18 06:27 pm (UTC)
At the moment I want to type "do this please" (whatever the command is) and have everything I need set up for me.
no subject
Date: 2010-09-18 10:27 pm (UTC)
The cool kids are using Chef. The seriously cool ones are using Kokki. Probably.
no subject
Date: 2010-09-19 09:35 am (UTC)
no subject
Date: 2010-09-19 09:58 am (UTC)
However, I've gone well past the number of machines where I'm happy going 'for $machine in @list; do ssh $machine, etc' because it's shit.
.bash_history is not a substitute for documentation.
no subject
Date: 2010-09-19 11:21 am (UTC)