"That which is overdesigned, too highly specific, anticipates outcome; the anticipation of outcome guarantees, if not failure, the absence of grace."
-- William Gibson, All Tomorrow's Parties
January 22, 2004

I've been thinking recently (again) about how to securely connect to a machine you have to administer.

For users, this isn't a major problem. You can have random passwords for each host you need to connect to as long as you set up ssh keys with passphrases and, optionally, ssh-agent to deal with the annoying parts of actually connecting to the machine. As long as you have "trusted" hosts which house the private ssh keys, you're good.

For admins, it's different.

January 23, 2004

I've been slowly gearing up for a major security kick as of late. So much bad, and not enough good. The SSH Trust Web idea is just part of it. Later today, probably after I get some sleep, I'll write a "secure" server policy, which details mount points, kernel security settings (grsecurity, etc) and the like.

Last week I scheduled downtime for all the production servers at work, as they all need reboots for kernel upgrades.

If I get the security policy written in time for a cursory approval from some of the more security conscious people I know, I'll reinstall the LAN firewall following it. I already know it's going to be somewhat of a pain, as keeping all suid binaries on their own partition tends to be a minor annoyance. However, like all things, it's easily scripted around.
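Scripting around a dedicated suid partition starts with knowing which binaries actually carry the bit. A minimal sketch, assuming a find(1) with the standard -xdev and -perm options; the function name is made up for illustration and is not part of any policy described here:

```sh
# list_setid_files DIR
# Print every regular file under DIR that has the setuid or setgid bit set,
# without crossing into other mounted filesystems (-xdev).
list_setid_files() {
    find "$1" -xdev -type f \( -perm -4000 -o -perm -2000 \) -print
}
```

Running it against / before and after a reinstall gives a list to diff, which is about as much as this sketch tries to do.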

A few months ago I played around with grsecurity, but at the time didn't care enough to consider implementing it on real machines. I guess I care now, and recompiled eos.int.walnutfactory.org's kernel, enabling just about every option that looked sane. I'm curious to see how usable the machine is, for day-to-day use, but none of the PWF kids currently use the machine for anything (as it's still new to the Factory).

There are a few things I need to dig into with regards to grsecurity, the most interesting being the "learning mode" for ACLs.

Speaking of which, the most time-intensive aspect of enabling grsecurity is going to be writing a sane ACL policy, assuming I don't let it figure it out on its own, and then actually turning ACLs on. In one respect, it's good that the majority of my machines are Debian GNU/Linux. Of course, generally speaking, completely homogeneous networks are not the love, but it sure does make it easy when rolling out new technologies and scripting for system administration.

Considering the amount of documentation I'm going to have to produce in a short amount of time, I really should consider starting to write it all in LaTeX.

After a few minutes of looking around, I've discovered quite a bit of useful documentation, most of it describing the use of shiny things which Just Work.

I've been using ssh forever, and yet never knew about the ForwardAgent feature... because I've never really thought to use ssh-agent. Well, it's nice to think "Hm, it'd be useful if..." and have it already done.

This series of articles written by the Gentoo Linux Chief Architect describes a few nifty tricks with ssh and ssh-agent, using the Gentoo-supplied keychain program.

This howto has a few useful aliases in it for starting keychain on login and sourcing the env vars so you can get access to your agent's identities. Don't forget the "-q" switch so you don't get that Gentoo-default green and blue spam screen. What's with Gentoo and green and blue console messages, anyway?
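The login-time part can be sketched roughly like this. It assumes keychain writes the agent's environment to ~/.keychain/<hostname>-sh, which is its usual layout; the function name and the key path in the comment are hypothetical:

```sh
# load_agent_env
# Source the environment file keychain leaves behind so this shell can
# talk to the already-running ssh-agent. Returns 1 if no file is found.
# Assumption: the file lives at ~/.keychain/<hostname>-sh.
load_agent_env() {
    env_file="${HOME}/.keychain/$(uname -n)-sh"
    if [ -r "$env_file" ]; then
        . "$env_file"
    else
        return 1
    fi
}

# Possible use in ~/.profile (key path is an example, not a recommendation):
#   command -v keychain >/dev/null 2>&1 && keychain -q "$HOME/.ssh/id_rsa"
#   load_agent_env
```

Keeping the sourcing in its own function means a missing file fails quietly instead of breaking the login.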

At any rate, all of this will be included in my eventual HOW-TO for work, which will be released here as well.

So much useful software, and so easy to use.

February 1, 2004

Spent the majority of the day yesterday hanging out with Kyle and Pete at Factory. Went to lunch at the supergood Mexican place on ~9th and Washington, then chilled at the space until 2000 or so.

Got SMTP-TLS working on the Factory mailserver, did a little more work on gate, the new firewall, and spent a couple hours reading bash.org.

With regards to SMTP-TLS, a couple years ago I waded through getting Postfix TLS and sasldb working for a machine at work. This run through, I just used The Perfect Setup HOWTO and was done with it.

The only thing that really bugged me was having to use a couple of backports for libsasl2, which possibly I didn't need (since I'm using the pwcheck daemon, authenticating against /etc/shadow), but I didn't think about it too much.

There are also a few useful notes here.

Around 2000, Ian showed up with his friend Mike and Samid. Ian had his LinuxWorld swag, including a copy of Sun's Java Desktop System, which we've all been very interested in seeing.

Shortly after that, Pete and I took off, as it was getting late and it was already hovering around 0 degrees out.

Ah, Pennsylvania winters. How I love you like truck.

March 19, 2004

Sometime yesterday afternoon, the test fileserver's netatalk install decided to stop displaying directories in the root share. I'd stayed home to code, so when my co-worker called and informed me of this, of course I had just put my clothes (all of them) in the wash.

So a few hours later, when I had pants to wear, I headed down to Factory to see what was up. After some screwing around, I couldn't determine if I'd managed to fix the problem or not. So I had to come into work (where I am now, half-awake and reading Perl docs), getting in just before it started snowing in earnest.

The problem was caused, I think, by having shares within shares (we have a root volume, with "client" volumes underneath, which are just directories containing jobs for a specific client). Pretty sure it made the .AppleDB databases sad in some way. Unfortunately I don't know enough about netatalk (I suppose I could take eniak's approach and read the source, but, gar, reading C gives me a headache) to be sure.

Luckily the resource forks (which live in .AppleDouble) didn't explode.

Volumes that didn't have shares below them didn't exhibit any problems, so my solution was to move the .AppleDB directory out of the way, let it get recreated, and remove all the sub-shares.
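The "move it out of the way" step could look something like the sketch below. It renames rather than deletes, so the old database can be put back if the rebuilt one is worse. The function name is made up, and stopping netatalk before touching the directory is an assumption on my part, not something verified here:

```sh
# set_aside_appledb VOLUME_PATH
# Rename VOLUME_PATH/.AppleDB to a timestamped name so a fresh one gets
# created on the next access. Returns 1 if there is no .AppleDB directory.
# Assumption: the file daemon is stopped before this runs.
set_aside_appledb() {
    db="$1/.AppleDB"
    [ -d "$db" ] || return 1
    mv "$db" "$db.old.$(date +%Y%m%d%H%M%S)"
}
```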

As I'm looking to move the primary fileserver to Linux/netatalk, hopefully we won't be running into too many of these issues...

My favorite part is where I google for the error I'm getting, and all I find is some German bulletin board.

Lovely, that.

April 15, 2004

Machines get compromised. Pretty much just the way things are, out here on the Internet. However, hastur getting owned was, to the best of my knowledge, the first time one of my UNIX machines has been popped.

hastur runs mirrorshades.org/net, foreword.com, amongthechosen.com, mail and DNS for all of it. It was a lame install, about two years old, before I started enacting filesystem-level security measures (half a dozen partitions, locked down mount options, filesystem checking utilities like AIDE). It was running Snort, but Snort can only detect so much, and looking back at the logs (which are emailed to me every morning -- obviously not the best solution, as they can be munged by an attacker who gains root), I don't see anything that would suggest the attack.

Which isn't Snort's fault, as this was an application-level fault.

But let's back up.

May 10, 2004

For the past several weeks I've been working as an admin for the metawire.org project, a free shell/hosting service. It's been an interesting experience so far.

I saw the undeadly.org story and signed up. zerash remembered TDYC! and the happybox, which I mentioned in my signup application, and we got to talking. I wrote a couple quick specs for a planned upgrade, and have been helping out with administration tasks since.

It's something of a challenge. The machine is running OpenBSD (3.5, as zee upgraded it over the weekend), and overall is set up okay. They're running custom user admin utilities, which we're slowly working on re-writing to be more abstract and portable (which reminds me, I need to get working on Unix::Admin this week, it's still very larval). I wrote a "hardening" script for OpenBSD, and we ran that on the box, locking file perms down pretty tightly.

metawire has a couple dozen domains attached to it, so users have a pretty good choice of "where" they want their stuff served from. We aren't doing actual virtual domains for mail yet, but that'll come along in a few weeks. There are some issues with CGI and PHP (namely, it's not running as CGI), but those will also be fixed as we have more time to work on the machine. It's already very popular (I'd guess ~100 applications a day), and now it's just a matter of defining where the issues lie and repairing them.

The challenge for me really comes in once you take the users into account. About 80% of the kids are using the machine to good purpose, but the rest are punks from various countries around the world. The majority of them are just script kiddies, but there have been one or two with some amount of skill. Finding the smart ones has proven to be a bit of a luck thing, which bothers me. The kiddies are simple to find. They all use the same stupid tricks, and seem to go from not having a clue how to use a shell to downloading exploits and running them (after some work) against remote hosts.

There've been a few problems with mailbombing as well, which annoys the crap out of me. Luckily Postfix is love, so it throttles and just keeps on trucking no matter what you throw at it.

I think zee, blister, mjc (someone also new to the project I suggested be added as an admin) and I will eventually start working on a known-sploit finder. mjc had the idea of doing binary checks for shellcode, which is a good idea, but might be sort of slow considering the number of files we're going to have to be checking (~2500 users on the box, perhaps a tenth who actually use it on a regular basis; that's still a lot of users). My idea was to just maintain an archive of MD5 sums of found exploit code and binaries. There are a lot of problems with this method, unfortunately. I can't think of anything better without figuring out how to do fuzzy matching, and I'm pretty damn sure I'm not smart enough for that. O'Donnell will have some good suggestions, I'm sure.
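The MD5-archive idea is simple enough to sketch. This assumes GNU md5sum and a plain text file with one known-bad digest per line; both the file format and the function name are hypothetical, and on OpenBSD md5(1) would stand in for md5sum with a different output format:

```sh
# scan_known_sploits SUMFILE DIR
# SUMFILE holds one known-bad MD5 hex digest per line (assumed format).
# Prints the path of every regular file under DIR whose digest is listed.
# Limitation: filenames containing newlines or backslashes are not handled.
scan_known_sploits() {
    sums=$1
    dir=$2
    find "$dir" -type f -exec md5sum {} + |
        while read -r digest path; do
            if grep -qxF "$digest" "$sums"; then
                printf '%s\n' "$path"
            fi
        done
}
```

It also shows the weakness of exact matching: change one byte of the exploit and the digest no longer matches, which is why fuzzy matching would be the better answer.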

Anyway, I've been meaning to write about this for a while, but hadn't been able to find the time. If you're interested in shell communities at all, check out #metawire on irc.metawire.org, and sign up for an account.

Try to make sure that your signup application reason doesn't involve running BNC or "learning Linux", and you should be okay. ;-)

The planned upgrade is going to take some donations, which we seem to be doing okay on. If after a few weeks you find the service useful, try to drop us a few bucks to make it better. Jordan also had the idea of throwing a logo contest and starting to sell metawire.org wares, which is a pretty good idea. That's on-going. I haven't seen any of the submissions yet, but hopefully someone will hook us up with something good.

I'm enjoying working on metawire; it's going to get me to actually write useful software, I think, and it's a big boon to my actually learning stuff I haven't had a lot of access to in the past for whatever reason.

Anyway, check it out.

May 20, 2004

Spent three hours this afternoon trying to install OpenBSD on an Ultra10. There's a known issue where the damn things don't like booting floppies. So I grab the OpenBSD boot CD, and try to boot it. It refuses. So instead of trying the obvious (as Harry eventually did with one of his own U10s) and swapping the CD-ROMs out -- the first thing I would have done on x86 -- I screw around with it for hours. If nothing else, at least I got openboot flashed and all up to date and happy. This machine is actually pretty nice. 450MHz, gig of RAM, two 20GB drives (though one of those will be pulled, as it's unneeded here).

Yeah, anyway. Grabbing the install sets now. I sure feel pretty stupid.

This is really my first time working on a Sun workstation form factor, and it sort of weirded me out that I had to turn the box upside down to pull the case off. Odd.

May 23, 2004

I wrote a little script to generate a Postfix-style virtual table file (as opposed to the Sendmail-style/alias I had been using) yesterday (and had a couple problems with hash assignments... note that list and hash context? Yeah. Important!) and, I assume with his interest piqued, Eric installed Postfix and a random front-end he found to play around with it.

Three (going on four) years ago I wrote this horrible front-end for administering Postfix, Apache and FrontPage (on Apache! guh!) in PHP, feeding into a MySQL backend, with a scary, scary (my first big) Perl script to generate the flatfiles. It's pretty horrible. The high5 postfixadmin app blows it out of the water (considering the simplicity there, that should describe how scary my app is).

(I tried several times to re-write the scary "web panel" app, in PHP, but it never went anywhere because -- in my opinion -- writing big applications in PHP is just too annoying. Writing it in Perl with CGI::Application, Class::DBI and the Template Toolkit would be almost trivial, though.)

That front-end also comes with a HOW-TO, which details installing Postfix+MySQL+IMAP. Decent howto, it looks like. Postfix+IMAP is sort of old hand to me now, though I still view the whole thing as being slightly magickal, though that's entirely due to the eight thousand ways to do auth for the POP/IMAP daemon. Not deep voodoo, just kind of obnoxious, I think.

(The whole high5.net project seems to be pretty cool, in fact.)

Not sure how I feel about throwing my virtual tables into a relational database. The lookup overhead would, I think, tax the machine unduly (though I sort of suspect that Postfix is smart enough to do caching -- I haven't really looked into it, but Postfix hasn't ever done anything that made me think it was in any way stupid). The current mailserver at work gets hit enough (what with the spam processing) to skew the clock without ntpd running (this didn't use to happen when the machine was a webserver).

The reason I installed OpenBSD on that Sparc, in fact, was to be a backup mailserver while I reinstalled our current mailserver at work, which is a three-year-old mess. It's sort of amazing the things you can learn about processes, automation, programming, and systems in three years.

The most important thing I've learned, though, is how much there is still to learn...

May 25, 2004

The majority of my mailservers to date have run Debian and Postfix, and most of the machines running local MTAs have dnscache bound to the loopback. So earlier tonight I noticed an OpenBSD machine I had installed last week hadn't been sending me logcheck reports after I had moved it from the internal network (where I do installs) to the DMZ.

I go check, and it appears that it's unable to do DNS lookups to get MX records. After a few minutes of screwing around, I notice this error:

May 25 05:22:14 clortho postfix/postfix-script: warning: /var/spool/postfix/etc/resolv.conf and /etc/resolv.conf differ

Easily fixed, and then I go look at Debian's Postfix init script:

FILES="etc/localtime etc/services etc/resolv.conf etc/hosts \
for file in $FILES; do
[ -d ${file%/*} ] || mkdir -p ${file%/*}
if [ -f /${file} ]; then rm -f ${file} && cp /${file} ${file}; fi
if [ -f ${file} ]; then chmod a+rX ${file}; fi

...yup. Debian is awesome, because it does so much for you. And Debian is bad, because it can make your brain lazy, leaving you to wonder why something you haven't had any issues with previously is suddenly acting weird.

This is why I'm really starting to like OpenBSD, I think. No coddling without being obnoxious.
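On a box that doesn't do the copying for you, the same idea can be sketched as a small function. This is not Debian's script, just a hand-rolled equivalent; the function name and argument order are made up, and the chroot path in the usage note is taken from the warning above:

```sh
# sync_chroot_files ROOT CHROOT FILE...
# For each relative FILE that exists under ROOT, copy it to the same
# relative path under CHROOT if it is missing there or differs.
sync_chroot_files() {
    root=$1
    chroot_dir=$2
    shift 2
    for file in "$@"; do
        src="$root/$file"
        dst="$chroot_dir/$file"
        [ -f "$src" ] || continue          # nothing to copy
        mkdir -p "$(dirname "$dst")"
        if ! cmp -s "$src" "$dst" 2>/dev/null; then
            cp "$src" "$dst"
            chmod a+rX "$dst"              # daemon must be able to read it
        fi
    done
    return 0
}
```

Something like `sync_chroot_files "" /var/spool/postfix etc/resolv.conf etc/hosts etc/services` before starting Postfix would have avoided the warning. An empty ROOT just means the real /.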

Harry brought a mailserver down to Factory last night, and we spent two hours getting it installed. The problem is something to do with the ISP we get our connection from... there is some incredible weirdness going on.

The layout is like this: We have an uplink from the ISP plugged into our external switch (which has recently been acting up --- some ports have been dying on any packets larger than 206 bytes; kudos to Eric on figuring that one out last week), a firewall with three NICs: WAN, LAN, DMZ.

The ISP has kindly given us a number of IPs... the problem is if you start swapping machines and addresses, whatever upstream hub/switch/router from our switch seems to not refresh its ARP cache. Ever. So when Harry brought his mailserver down, we found that none of our remaining IPs would route past our switch.

Andrew and I spent several hours the other night trying to figure this one out, and somehow managed to get it to work, when we installed the new firewall (OpenBSD 3.5 box). It could have just been coincidence, I really don't know. You can sit there watching arp traffic and you'll see the router ask "Who has $x?" and the host respond "Me!", and then the router proceeds to ask again. The host in question can sit there and see some traffic from the network downstairs (upstream), like netbios, other arp traffic... but not all. I suspect that's because there's some device segmenting the network, and the router and other core stuff is on one side of that, and random other stuff (like workstations and printservers) is on our side.

It's the weirdest thing and we don't have access to the ISP's equipment (obviously) to fix it.

So last night Harry and I ran into this problem, and eventually I just gave up on trying to force a refresh on the router, or whatever the hell is upstream of us (I wish I were cool enough to figure out timing to accurately map a network of transparent devices...), and just plugged his mailserver into the (so far unused, though this will change once we figure out the arp thing) DMZ port of the firewall, and just port forwarded for it. So ghetto, but it worked fine, and I would have felt worse if he had had to take his machine home.

The only hitch was me forgetting the following rule while hacking out the new pf rules:

pass in log on $wan_if inet proto tcp from any to port $smtp_services keep state

Because I'd forgotten how pf translated IPs, or more specifically, when.

And then Harry ran into one or two problems getting the ldap server on the machine up... but he quickly fixed that.

It was pretty nice debugging with Harry, actually. He knows stuff, and we swapped position at my laptop to work on the various problems without any issues. Sort of like eXtreme systems administration or something. :)

May 26, 2004

Just wrote a very bare-bones HOWTO (if you can even call it that) for installing Snort on OpenBSD 3.5.

Harry brought up the uid issue (my useradd statement will just add _snort as a user, and not within the daemon uid range, typically the 500s), so I checked out Postfix's pkg INSTALL script:

useradd \
-g =uid \
-c "Postfix Daemon" \
-d /nonexistent \
-s /sbin/nologin \
-u 507 _postfix

The fact that it's hard-coded suggests to me that there's a daemon to uid map for OpenBSD somewhere, but I'll be damned if I know how to find it.
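Until that map turns up, picking an unused uid in the daemon range by hand is easy enough. A sketch, with the caveat that this is not how OpenBSD's packages assign uids; it only finds the first gap in a passwd-format file, and the function name and range are made up:

```sh
# next_free_uid PASSWD_FILE MIN MAX
# Print the lowest uid in [MIN, MAX] that no line of PASSWD_FILE uses.
# Exits non-zero if the whole range is taken.
next_free_uid() {
    passwd_file=$1
    min=$2
    max=$3
    awk -F: -v min="$min" -v max="$max" '
        { used[$3] = 1 }                   # third field is the uid
        END {
            for (uid = min; uid <= max; uid++)
                if (!(uid in used)) { print uid; exit }
            exit 1
        }' "$passwd_file"
}
```

For example, `useradd -u "$(next_free_uid /etc/passwd 500 599)" ... _snort` would at least keep _snort in the same neighborhood as _postfix.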

I'll ask mjc (a fellow metawire.org admin and OpenBSD monkey).

June 8, 2004

Spent yesterday fighting with our crappy backup staging server (where things go before they're taped, and stay live for a period of time) at work. The thing is a junk Gateway "server" box that was super cheap (always a prevailing concern for purchasing hardware there if we can't offset the cost somehow), but has since proven to be an enormous pain in the butt (get what you pay for).

The machine has, at various points, had its motherboard replaced, its RAM, its CPU, and finally I get the thing working, and the primary IDE bus blows out. Pretty awesome, but a minor fix, as it has three (getting that third one to work is something I really should have documented; it was a pain, iirc).

So initially the machine was running Linux with Reiserfs on a software RAID across three IDE disks on dedicated busses. Slow as hell, but it worked.

So the reiser journals blew their trees all over the place, and since the data is taped anyway, I figured I'd give OpenBSD's software RAID (RAIDFrame, ported from NetBSD) a whirl.

I unplugged the RAID disks (habit) so I wouldn't get confused during the install (a good habit), pointed the installer at an FTP mirror and went and did other work for a while.

After the machine installed itself (not counting download times, about 10 minutes of work... Much less-than-three to OBSD) I recompiled the kernel, pinning wd0 to the first channel of the secondary IDE bus (it wanted to boot off the , and started fighting with getting the machine to get the third IDE bus recognized (the BIOS and bootloader saw the drive on it fine, but the kernel refused to see it). I spent an hour or so trying to figure out how to get the bus's device attributes (what device number it is, etc), and failed pretty badly. I don't remember how I got it for the previous Linux install, though I did try booting Debian as I have vague recollections of the default Debian kernel seeing it.

Eventually gave up on that and threw another PCI IDE card in the machine. Pinned the RAID disks in place (config -e with -o is pretty great) so they couldn't move around on me ever, and started setting up the array as described in raidctl(8). Pretty simple stuff, though I have to admit that it seemed odd (at first) that I needed to have a FS_RAID type disklabel on the array's drives.

Get the RAID device formatted, mount it, and... it's somehow managed to lose about 100GB of space. There are three drives: two 160s, and one 120. So there should be somewhere in the vicinity of 410GB usable space, 440 total.

Machine would only see 330GB tops. I pulled the 120 out, fed it another 160, rebuilt the array... yeah. Pulled the 160 out, built the array up, and it would only see 300... There's an error stating that it's "truncating" the last disk, but googling for the warning returns nothing.

Next stop is mailing lists and looking at the raidframe code to see what causes it to happen.

Obviously I'm doing something incredibly stupid here. My initial thought was block size, but... That seems unlikely considering the amount of space involved here.

So yeah. If anyone has any ideas on this one, I'd appreciate it before I just reinstall Linux on the box (tomorrow, heh, as I'm tired of not having an easy place to do backups to).

June 9, 2004

This is somewhat depressing. I took a few minutes this morning to play with SpamStats on work's mailserver.

These stats start at 0630 Jun 02:

Total number of emails processed by the spam filter : 60038
Number of spams : 43548 ( 72.53%)
Number of clean messages : 16490 ( 27.47%)
Average message analysis time : 3.61 seconds
Average spam analysis time : 3.37 seconds
Average clean message analysis time : 4.17 seconds
Average message score : 7.02
Average spam score : 10.43
Average clean message score : -1.20
Total spam volume : 102 Mbytes
Total clean volume : 69 Mbytes

It's also a default, non-tweaked install of SpamAssassin, so I would wager somewhere between a third and half of those "clean" messages really aren't.

My next step is going to be to finally throw a.mx at our colo and have it dump anything over a 7, then relay the rest to b.mx at our offices.
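The decision a.mx would make is small enough to sketch. This assumes the score is read from SpamAssassin's X-Spam-Status header, written as hits=N or score=N depending on the release; the function name and the "discard"/"relay" labels are made up, and a real setup would do this inside the MTA or content filter rather than in a shell function:

```sh
# spam_action LIMIT
# Read one message on stdin, look only at its headers, and print
# "discard" if the X-Spam-Status score is above LIMIT, else "relay".
# Messages without a score are relayed.
spam_action() {
    awk -v limit="$1" '
        /^$/ { exit }                      # blank line: headers are over
        /^X-Spam-Status:/ {
            if (match($0, /(score|hits)=-?[0-9]+(\.[0-9]+)?/)) {
                value = substr($0, RSTART, RLENGTH)
                sub(/^[a-z]+=/, "", value)
                score = value + 0
                found = 1
            }
        }
        END {
            action = (found && score > limit) ? "discard" : "relay"
            print action
        }
    '
}
```

With the averages above (spam around 10, clean around -1), a cutoff of 7 leaves a fair margin on the clean side.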


June 10, 2004

Spent yesterday working on moving the services living on hastur.mirrorshades.net to ligur.mirrorshades.net. Considering some of the deals that ServerBeach offers, it was a pretty simple decision to make for Dan and me.

I had some initial issues with the machine... like the kernel I compiled didn't take, then grub's fallback mechanism didn't kick in. Apparently the NIC driver I used (eepro100, which is the module the current, vendor-supplied kernel is using) wasn't happy and thus networking didn't come back up. I had planned on writing a little at(1) job to run when the machine comes up to check if it can get 'net, and reboot into a known-good kernel if it can't, but hit a wall with regards to available time.
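The job that never got written might have looked something like this sketch. The check and fallback are passed in as command strings because both are site-specific; the function name, the defaults, and the commands in the usage note are all hypothetical:

```sh
# net_or_fallback CHECK FALLBACK [TRIES] [DELAY]
# Run CHECK up to TRIES times, DELAY seconds apart. If it ever succeeds,
# return 0 and do nothing else. If every try fails, run FALLBACK.
net_or_fallback() {
    check=$1
    fallback=$2
    tries=${3:-5}
    delay=${4:-30}
    n=0
    while [ "$n" -lt "$tries" ]; do
        if sh -c "$check" >/dev/null 2>&1; then
            return 0                       # network answered; keep this kernel
        fi
        n=$((n + 1))
        if [ "$n" -lt "$tries" ]; then
            sleep "$delay"
        fi
    done
    sh -c "$fallback"
}
```

Run at boot as, say, `net_or_fallback 'ping -c 1 192.0.2.1' '/usr/local/sbin/boot-known-good'`, where the second command is whatever flips the bootloader default and reboots. Retrying a few times keeps a slow-to-negotiate link from triggering a needless reboot.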

Yesterday was crazy at work, so the migration only got a few minutes here and there of my time. Overall it was a simple process, and took maybe an hour and a half of my actual attention. Go UNIX. Try doing this shit with Windows, eh.

I kept a relatively information-less log of what I did for the migration (this being what, the fourth time? for this machine) if anyone cares that much.

As always, the easiest portion of the migration involved Postfix. So much love for that piece of software.

Anyway, I should probably shower and get ready for another twelve hour day at work...

June 11, 2004

I just spent the last hour fighting with raidframe on another machine (the new production backup server).

The error message:

raidlookup on device: /dev/wd0a failed !

The config:

START array
# numRow numCol numSpare
1 2 0

START disks

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_0
64 1 1 0

START queue
fifo 100

All looks okay, right?

Wait... look at the error message again...

"wd0a "


Adam O'Donnell: i would patch the source
Adam O'Donnell: and submit the patch.
Bryan Allen: http://monkey.org/openbsd/archive/misc/0010/msg01366.html
Adam O'Donnell: because no one wrote the patch
Adam O'Donnell: i will do it with you sometime soon if you want
Adam O'Donnell: there is no "strip()" in C.
Bryan Allen: I know. Or =~ s/^\s+|\s+$//gs;

heh. :)
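Short of patching the parser, the config can be cleaned before raidctl ever sees it. A sketch of the shell equivalent of that Perl substitution, applied line by line; the function name and the file names in the usage note are made up:

```sh
# strip_ws
# Remove leading and trailing whitespace from every line on stdin.
strip_ws() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
```

Something like `strip_ws < raid0.conf > raid0.conf.clean` would turn "/dev/wd0a " back into "/dev/wd0a" without touching the spacing inside a line.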

July 1, 2004

Why is mail such a pain in the ass?



So many pieces for something that is actually relatively simple. And the problem is that the pieces themselves are actually relatively well designed. (In the case of say, Postfix, exceedingly well-designed.)

But taken as a whole, it suddenly becomes something that requires a flowchart with various colors denoting things.

And that doesn't even get us to the point where we're talking about being able to authenticate users sixteen different ways bi-directionally or relaying mail between machines or talking about backups or anything remotely interesting.

For something so simple, it sure does get complex quickly.

That said, it took roughly an hour to get IMAP-SSL running on the new company mailserver today. That includes compiling. And spending a half hour not working on it. So.

Easy. But it still seems way too involved for some reason.


July 8, 2004

These last two weeks have not been my superhappyfuntime.

The company I work at is merging with another company and their IT guy, who was not only lazy and shall we say, somewhat cavalier with regards to his duties as a systems administrator, but well... the emphasis here is was.

So at the moment I sort of have two jobs. I'm getting tired of 12-16 hour days.

Not to mention the two 24 hour days.

The biggest problem I have is that their entire shop is Windows-based, except for two Macs in pre-press and two in design. That leaves about two dozen Windows workstations, half of which are infected with various forms of viruses, and three Windows servers.

Including, for some as-yet-to-be-determined reason, an MSSQL machine.

I suppose that explains the "FixBlaster.exe" binary on the PDC's desktop.

I'm just not used to a Windows environment, I think. Tuesday, network connectivity was being super spotty; "Crap," I think. "That 486 junk firewall I replaced their horrible SonicWall with is dying on me." So I go and steal a disk out of a machine whose processor fan had recently failed, install OpenBSD on it, and waste half an hour of bandwidth and a half hour of my time (counting interruptions to deal with other stuff) only to find that the network is still thrashing.

"Bloody Hell," I says, watching it take six packets to get anything anywhere. "hm, are my pf rules screwy?" pf is turned off and the connection is again happy. "Well, I suppose that hopefully rules out the NICs and the hardware," I thinks to meself.

So I finally do what I should have done in the first place:

tcpdump -eni ep0


"Golly gee, that's a lot of 135 and 445 traffic going to space... space that doesn't exist. Invalid subnets. Damnit!"

And let's not forget the 6667 traffic fleeing outbound to the world, doing gods know what...

So I quickly block all egress traffic save for a few required ports, and connectivity is somewhat happier, though only barely. So I ponder to myself, "Ponder ponder, what's the probl-- oh. Queues."

Yes indeedy. It was taking half a dozen to a dozen packets to fall up the goddamn stack and get routed. Luckily I'm a complainer and Andrew quickly suggested that I just block all non-valid traffic on the internal interface, so the junk never gets processed.

Word to Andrew.

tcpdump -c 50000 -eni xl0 src net and dst net and dst net \! > infected_hosts ; awk '{print $6}' infected_hosts |sed -e /.....$/s///|sort|uniq

(My regexp sucks so much.)
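A less fragile version of the tail end of that pipeline, for the record. Chopping five characters only works when the port happens to be four digits. This sketch strips whatever numeric port is on the end instead, assuming the input is tcpdump -n style "address.port" tokens, one per line; the function name is made up:

```sh
# unique_hosts
# Read "a.b.c.d.port" tokens (optionally with a trailing colon) on stdin,
# drop the port, and print each host once.
unique_hosts() {
    sed -e 's/:$//' -e 's/\.[0-9][0-9]*$//' | sort -u
}
```

So the last stage becomes `awk '{print $6}' infected_hosts | unique_hosts`, and sort -u replaces the sort|uniq pair.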

That was an adventure!

And not the only one for that day, but the only one that I can remember, because it involved me being stupid. And I always remember those stories.

Today was also pretty awful, but I got a lot done. It's funny how that works. I spent about an hour swapping machines because one of the managers decided to upgrade a piece of software on an operating system that doesn't support... something or other the new version of the application needs. So yesterday Adam installed it on a Win2k box, which is what it wanted.

Only the guy forgot to mention that some printers needed to sort of be hooked up to that box... "Looks like a job for bda!"

So this new machine is actually one of our old ones, but it's been at the new building for maybe three months. And it was caked with dust. And the older box that I was swapping out? Oh. I think at one point, it was probably that stupid tan color old machines all are. But it was grey.

And my clothes? Well, they were black when I started. By 1130, though, they were white.


Luckily I still had a box of Christmas clothes in my cube at the other building, so I could change and not be covered in goddamn dust all day. Whee!

The only thing I feel even remotely good about is that the new mailserver appears to be operating optimally. There was some issue with IMAP and Mail.app... namely, if you create a folder on the IMAP server, then add a message to it... delete the message... and then delete the folder, Mail.app cries. "Can't SELECT!" Because it doesn't refresh after deleting and before opening again. And it was connecting way too much.

But I realized I was blocking the UDP ports IMAP wants on the box, and that seems to have fixed the issue. I didn't look too much into it... tomorrow I'll see exactly why that might be. It seems... odd to use UDP for those operations. But what do I know.

I'm not even going to get into the dozen or so "omfg!" fires that people came to me about, causing me to not clean the infected Windows machines. argh. You'd think that'd be my priority, and it is, but it still hasn't happened. Gods willing, I'll get to that tomorrow morning and afternoon.

What else, what else.

Apparently the previous IT guy's default responses to anything anyone ever asked him to do were:

  • No.

  • I can't do that.

And if you came to him with something broken?

  • Deal with it.

Needless to say, this did not go over well with the users (You know, his fellow employees? The people he was being paid to assist?), and they are all somewhat shocked, I think, to find Adam (who has been at the new building for a month or so now, and also helping them out) and myself somewhat... helpful.

And pleasant.

And useful.

And they seem truly astounded, perhaps not by our annoyance at the broken state of affairs, but by our wanting to make things better.

For instance! Two sales guys have a printer in their office, a big HP 8500. Nice printer. It speaks JetDirect. The two designers, who use Macs, find it with no problem. Humans ask the "sysadmin" if they can print to it. He tells them, "No, you can't. Windows can't print to that printer."

A week after his ass gets canned, the matter is brought to Adam's attention, who says "wtf?" and yesterday asks me to take care of it today.

I poke around for a few minutes, having absolutely no idea how to get a printer without a real printserver to work on Windows. In OS X-land, it's trivial to get it working (and, I assume, just as trivial with AppleShare/AppleTalk in OS 9 or whatever, as that's what the designers use). However, I am a somewhat astute observer of human behavior, so I check the "sysadmin's" WindowsXP workstation, which I have access to.

Lo and behold, he has the printer added. I check to see how it's configured, and apparently you add the thing as a local printer, then configure the port via IP... pretty silly, I think to myself, but exceedingly straight-forward.

I go to add the printer on the sales guys' workstations, and one of them tells me that the "sysadmin" had told him once: "Yeah, you can use that printer. You just have to install the drivers and figure out the IP. I'm sure you can do it." And walked away.

This is, of course, while the machine I was touching was pulling the drivers off the fucking printer and installing them.

In all, this process took perhaps fifteen minutes, five of which I had spent poking at the printer itself like a retarded monkey with a dopamine problem.

(And then a Mac OS 9 box ate its "Volume Header", which I presume to be some sort of MBR analogue, and after I screwed with Open Firmware for ten minutes, I got someone to bust out a Norton Utilities CD and that fixed it right up.)

So that's what I'm up against. Years of that kind of "administration." The place is an enormous mess, and I think it's going to drive me insane. That was just an example. I could go into detail about the problems with the network itself... but it would all be stupid stuff like the gateway's IP being

The only light on my horizon at this point is that I've been promised an Xserve and a terabyte XRAID, with which I can get rid of the NT4 PDC and manage both the Macs (which will outnumber the Windows boxes once my company finally gets into the new building) and the Windows boxes.

Joy. Network authentication and control and gods willing some form of remote patch management.

Also, Hunter and the company librarian (the guy who deals with backups) and I finally managed to get together and have a nice productive meeting about Archivist, the NetBackup replacement (and job management, and archiver, and possibly some form of remote data access and preview functionality stuff) I designed and started writing months ago... and then stopped because this merger started happening. But with Hunter coding, it should actually get somewhere, and become useful, and with Adam driving Hunter, it should get done. The backend stuff is all designed following the Postfix model... which is to say, the UNIX model, which is to say... Hopefully I won't fuck up a good thing.

And it'll be OSS. Yay.

During this meeting the owner was sitting at his desk (his office is in the conference room) and was half-listening to us. At one point we were talking about system failures and he said "Woah, I don't want to hear that talk?" "What?" "I don't tolerate system failures." "No, you plan for them."

"Bah!" says he.

And now? Now I'm going to sleep. Because I deserve it.

(I realize the above examples seem somewhat trivial and probably childish. But fuck you. It's obnoxious. It's Windows. I am a UNIX ADMINISTRATOR DAMNIT. I'll whine if I want to while I'm getting all this Microsoft garbage shoved down my throat.)

/* Oh. And I'm missing HOPE because the things I mentioned are 15% of what I wanted to get done this week, and because I want to start moving into my new apartment with my friend Pete this weekend. Which is also something to look forward to. To put it mildly. */

August 2, 2004

It really annoys the hell out of me when an application has an "Import" function, but refuses to let you point it at an arbitrary directory.

Say you're swapping machines for a user, and want to get them off Outlook. You have to run Outlook first, copy the pst file to the correct location, and then have Thunderbird import. That's asinine.

I realize I could have just used some other tool to convert the pst to mbox and dump the files into the Thunderbird directory, but... why? Why not just let me say "Import THIS file"?


August 5, 2004

I'm sure Engler posted about this at some point (as he knew what I needed when I was bitching earlier), but: FileMon.

Useful for when you have an application that's writing temp data to some random directory, you have no idea where, but seems to require Admin group privs on the local box.

I hate Windows. But Windows with lsof is slightly less obnoxious.

Sort of.

Many other useful tools on sysinternals as well.

August 11, 2004

Got bored today and decided to install Solaris 10 beta 5 on some boxes. Keeping in mind that my experiences with commercial UNIX have always left a sour taste in my mouth (IRIX, AIX), and that I have very specific ideas about what UNIX is, you greybeards may want to take this with a shot of J.D. or something. Also keep in mind that this is beta software.

August 12, 2004

10:20 <@bda> 916 qtimageser 79.2% 0:23.08 1 36 104 344M+ 2.80M 189M- 1.94G
10:20 <@bda> wtf is that.
10:20 <@bda> Oh.
10:20 <@bda> Jesus fuck.
10:20 <@bda> I love how Mac OS will continue to thumbnail files even after
10:20 <@bda> So unmounting the volume is unpossible.
10:20 <@bda> This is such bullshit.
10:20 <@bda> Finder--
10:21 < mdxi> this is what happens when you put the user first
10:21 <@bda> Yes.
10:21 * bda kills it.

August 17, 2004

So we're a pre-press shop. Everyone uses OS9 or Classic with OS X.

I get a call this morning from one of the operators who tells me that "Classic crashed, and won't start again."

So I go over there and mess with the machine for three or four hours. I copy System Folders from other boxes, none of them get recognized as being bootable.


16:15 < solios> copy FROM the running OS 9 box TO the share.
16:15 < solios> do it the other way around and you'll get Pain in your face.

And that works. OS9 and Classic are happy. Unfortunately all the prefs, serial numbers, etc, from the original System are angry.

So I start copying crap around, becoming more and more annoyed with the situation.

And then Mark, another operator comes back from a smoke and says:

"You know, this used to happen... and we would just copy System Folder:System from another machine and replace the local copy and it would be okay again."

So yeah. I hate computers.

(Also, Norton 7.0 will destroy symlinks in / for OS X. Just a heads up. Fixing them is easy enough: Just re-symlink them from /private. Annoying, but hey. How often do you get to see single-user mode, eh, Mac guy?)

August 27, 2004

I've always wondered how to remount / read-write, and I needed to know this morning.

mount -u -rw /

Muchas gracias to ejp for taking five minutes to read the man page while I was putting out local fires.

September 15, 2004

I could install a cluster of UNIX machines with a pair of tweezers and a magnet faster than I can do a single Windows workstation install.

I'm going to laugh pretty hard if this machine gets owned by some host on the LAN I missed in last week's annual Trojaned Bitch sweep before I get it updated.

...and then I'd go the hell home.

September 24, 2004

I have an ext3 drive with 119G of junk on it that I'd like to feed onto this XRAID we just got. Of course, doing that over a network will take ages, so being able to read ext2 from Mac OS X would be pretty keen.

Enter ext2fsx.

Thanks to ejp for the link.

September 27, 2004

Fighting with OS X Server is something I have decided I do not enjoy.

There are two problems:

  1. The tools suck.

  2. I don't entirely understand how OpenDirectory is interfacing with those tools. Because changes made in one tool don't seem to actually propagate into Actual Use, even though it's Apparently Working. So not cool.

So who's stupid? Me or the software?

A little from column A, a little from column B...

September 30, 2004

Totally awesome.

Just ran into this.

I am doing the most ghetto backup solution I think I have ever done. Details later.

October 11, 2004

I've been fighting with the OS X Server since we got it. Getting it backed up has proven to be damn near unpossible within the context of our current backup system.

This is an email I just threw together after I spent the weekend troubleshooting the machine and various pieces of software that have mashed together.

Just installed OpenBSD on a box. Hardware: one Adaptec 29160, one 3WARE IDE RAID controller. Made sure that the controller I wanted to boot off got detected by the BIOS first (the Adaptec) and booted off floppy35.fs (as it supports both the Adaptec and the 3WARE card, as well as the awful, awful onboard Broadcom NIC -- it's pretty awesome that all three of those random devices are supported in GENERIC).

Do the install, no problems. It detects the single drive on the Adaptec chain as sd0, as expected. Reboot the machine.

Comes up complaining that rsd0* has not been configured. I say whadafuh and get a sneaking suspicion...

Sure enough, dmesg confirms that the damn thing swapped the Adaptec and 3WARE cards, so the RAID card is now sd0. What the hell?

Mount the filesystems, edit /etc/fstab, and one %s/sd0/sd1/g line later, all is well.
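The same fix can be done without opening an editor. This is only a sketch, assuming sed instead of vi; the function name and the a-p partition-letter range are made up for illustration, and it prints the rewritten file rather than editing in place so you can eyeball it first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite /dev/<old><partition> to /dev/<new><partition>
# in an fstab-style file and print the result to stdout.
rename_disk() {
    old=$1 new=$2 file=$3
    # Only match the disk name when a partition letter (a-p) follows it,
    # so renaming sd1 would leave sd10 entries alone.
    sed "s|/dev/${old}\([a-p]\)|/dev/${new}\1|g" "$file"
}
```

Something like `rename_disk sd0 sd1 /etc/fstab > /etc/fstab.new` followed by a diff against the original would do it; if both controllers had entries in fstab you would need a real swap through a temporary name, which this does not attempt.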

The whole day has been like this, though.

First the IDE bus in that machine dies, and then...

Man. I don't even want to talk about it any more. I just want to go home and play X-Men Legends.

October 13, 2004

Use NFS.

That's really all I have to say.

If you're a printshop, you probably have a bunch of idiotic characters in your filenames. Some of these idiotic filenames have ":" or possibly even "\" in them, and the ":"? They're probably actually part of some stupid character's hex analog. And you can't just mangle the filename on copy because it'll cause linking problems with say, Quark.

So what do you do?

You stop using Samba and use NFS.

Just note that you'll need to use the -P option with mount_nfs on the OS X box.

I'll have a wrapper script for ditto by tomorrow probably so it'll do update syncs as opposed to simply copying every damn thing all the time (gah). Considering how slow this already is, I can only imagine that it's going to get a lot slower. :\

From a default OS X install:

[bda@10-1-2-74]:[~]$ grep NFS /etc/daily
# Clean up NFS turds. May be useful on NFS servers.


[via esch]

October 15, 2004

Here's that ditto wrapper script I mentioned.

Re-wrote it this morning as I had no Internet access (I'm at Steve's, in Plattsburgh, for his wedding) and therefore actually got work done. Funny how that works.

The comments at the top explain it all pretty well, I think.

Now to see if it actually works in production. The tests were fine.

November 1, 2004

Nick Holland wrote a nifty FAQ on upgrading OpenBSD from 3.5 to 3.6, focusing on keeping system configuration in sync with the release.

I got my CDs a couple weeks ago but have yet to have reason to install it on anything, or upgrade any machines.

November 4, 2004

Lots of runaway disk-bound processes on our OS X Server earlier, so I killed them. The machine killed my shell and network latency started oscillating between 1ms and 9000ms. Then it rebooted itself.

21:24 <@bda> [root@sobek]:[~]# vim
21:24 <@bda> E575: viminfo: Illegal starting char in line: b0VIM 6.2
21:24 <@bda> Hit ENTER or type command to continue
21:24 <@ejp> rjbs: bda has shame?
21:24 <@bda> That's new.
21:24 <@rjbs> bda: delete viminfo
21:24 <@ejp> (moded cows)++
21:25 <@bda> I did.
21:25 * bda isn't dumb. :(
21:25 <@bda> I just don't know what that means is all.
21:25 <@rjbs> (modded cows)==
21:25 <@rjbs> bda: it means you need to delete viminfo
21:25 <@bda> k.
21:25 <@rjbs> bda: viminfo gets corrupt every once in a blue moon
21:25 <@bda> ah.
21:25 <@bda> Well. The machine rebooting itself would definitely cause that.
21:25 <@bda> Though, unfortunately, it does it far more often that once in a blue moon.
21:32 < solios> :|
21:33 <@ejp> maybe you shouldn't pee on it every full moon then?
21:37 <@bda> Naw. It needs it.
21:38 <@bda> Or it won't grow.
21:39 < solios> hahah
21:40 <@ejp> and suddenly I undestand all your computer problems.

November 6, 2004

Updated selene (my PowerBook, PB12A) last night. Seemed fine. Also updated helios (Mystic, dual G4 500), and it has displayed no weird behavior.

This morning, I woke selene up to check email and do some admin stuff. I was connected via wlan via AirPort. Had Keychain Access open, and was in the process of opening TextEdit. They both SPOD'd, which is really, really weird (never seen it happen before), so I killed them, restarted KA -- and all my keychains were gone. KA keeps references to files, and doesn't actually try to manage the files unless you insist, so the actual keychain files had not been touched. So I went to re-add login.keychain (which is kind of important), and it failed silently. At this point I'm more than a little annoyed, so I close all my apps and go to log out.

The machine boots itself into single user mode. Awesome.

After failing to log in a couple times, I hardboot it and it comes back up. I reconnect to our wlan network, and kick open some apps. They start displaying the same behavior. I kill them, turn off AirPort, and plug in a wire. Everything is fine.

So either someone has something unreported and is spamming at my machine via wlan, or Apple managed to fuck up AirPort somehow with 10.3.6.

As Adam said, "Occam's Razor."


November 8, 2004

I was whining about OS X and caching, and Rik bothered to think for two seconds:

<@rjbs> lookupd -flushcache


November 11, 2004

Had to crack a Win2k box this afternoon as it was "appropriated" and no one knew the Administrator password.

I used Austrumi, which in turn uses (I think) ntpasswd to do the actual password changing.

I'm sure I've made posts similar to this in the past, but I never actually bothered doing any of this (generally speaking, any workstations we would have gotten from elsewhere are riddled with viruses, trojans, and random stupid crap users install on machines, so it's easier to just reinstall most of the time).

Anyway, it worked well.

November 16, 2004

pyopenbsd, a set of Python classes for interfacing with OpenBSD and associated libs.

<@newsham> awesome. now you dont have to be a C programmer to enjoy the diverging APIs of unix systems!!

Had an idea to do the same thing with a set of Perl modules. Got so far as registering the namespace on the CPAN before getting distracted.

This was, of course, six months ago.


November 30, 2004
December 6, 2004

Finally got around to "installing" dovecot.

I say "installing" because it was just a make install, then editing the config file to change the available daemons from the default (imap,imaps) to imaps only.

That was very possibly the most painless piece of server software I have ever installed.

December 13, 2004

02:18 -!- yuckf00 [yuckf00@west.philly.ghetto.org] has quit [Read error: Connection reset by peer]
02:18 -!- pthread [pthread@west.philly.ghetto.org] has quit [Read error: Connection reset by peer]
02:18 -!- devi0us [devi0us@west.philly.ghetto.org] has quit [Read error: Connection reset by peer]
02:18 -!- asm_ [asm@west.philly.ghetto.org] has quit [Write error: Connection reset by peer]
02:18 -!- javaman [javaman@west.philly.ghetto.org] has quit [Write error: Connection reset by peer]
02:18 -!- |8^D [enkrypted@west.philly.ghetto.org] has quit [Write error: Connection reset by peer]
02:18 -!- sonic [sonic@west.philly.ghetto.org] has quit [Read error: Connection reset by peer]
02:18 -!- binary [binary@west.philly.ghetto.org] has quit [Write error: Broken pipe]
02:18 <@bda> whups.

It's funny when you enable pf on a box with live connections, because it starts with an empty state table.

December 15, 2004

Friend of mine asked me to pull his data off a busted Windows install, so I said sure.

Finally get around to actually plugging the drive into my workstation (which is a Mac) and then go looking for Linux PPC LiveCDs that don't suck. Well. They all had problems. The Knoppix images I tried didn't even boot all the way. The Gentoo image would boot, then start throwing "vt: argh data_driver is NULL !" errors. Booting with noapic fixed it half the time, at least. But then, of course, NTFS wasn't supported by the kernel and I had just about no interest in recompiling their kernel and re-burning the image for it.

So I do a quick google run for third-party NTFS apps for Mac OS X and hit the Mac OS X Filesystems list, which seems very comprehensive.

And of course OS X has NTFS read-only support. So I boot and start pulling data off to a network drive. It's really not the fastest thing in the world, but it appears to be mostly working (there are some I/O errors getting thrown around; I dunno if that's physical or driver, though).

Anyway, it's saved me a trip to Factory or work in 23 degree weather, so I'm okay with it taking all damn day if it wants to.

December 18, 2004

Stayed at work late last night with the intention of rebuilding the LAN firewall, and replacing the router with an OpenBSD box.

Unfortunately I had to move the mx to get at the firewall... and the mx has a really twitchy root drive, which finally decided to kill itself. Managed to pull the passwd file and some of the postfix configs off.

The upshot of this is that I spent about five hours building machines and migrating users and data.

  • Wrote a really lame Linux (Sixth Edition) password file to BSD master.passwd converter. I had been up since 0630 and could barely see by this point, which was kind of fun. The awk line was ripped off from this.
  • Documented my default actions during an OpenBSD install.
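A minimal sketch of that kind of converter, not the script from that night. It assumes a Linux passwd file with the hashes split out into a shadow file, and it fills the BSD-only fields (login class, password change time, account expiry) with empty/0/0. The function name is hypothetical, and whether the copied hashes are usable depends on the crypt schemes both systems understand.

```shell
#!/bin/sh
# Hypothetical converter: Linux passwd + shadow -> BSD master.passwd lines.
# master.passwd fields: name:password:uid:gid:class:change:expire:gecos:home:shell
# Usage: linux_to_master_passwd /path/to/passwd /path/to/shadow
linux_to_master_passwd() {
    awk -F: -v OFS=: '
        # First file operand is the shadow file: remember each hash.
        FILENAME == ARGV[1] { hash[$1] = $2; next }
        # Second file operand is passwd: emit the ten-field BSD layout.
        {
            pw = ($1 in hash) ? hash[$1] : "*"   # no shadow entry: lock it
            print $1, pw, $3, $4, "", 0, 0, $5, $6, $7
        }
    ' "$2" "$1"
}
```

The output still has to be merged into /etc/master.passwd by hand (watch for UID collisions with the system accounts), after which `pwd_mkdb -p /etc/master.passwd` rebuilds the databases.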

Got everything up and running (and users shouldn't notice any changes, except being asked to save a new cert if they're using pop3s) around 0330.

Walked to the train station and saw more creepy people in Camden last night than I think I have in four years of late nights. Got home around 0445 and slept for six hours before my next door neighbor decided it was a good time to start shooting aliens and woke me up.

(Good news is today is Sophy and Adam's potluck.)

Let me know what you think about the obsd install doc (though it's more script than doc). I'm not sure about the harden_obsd.pl script any more, but.

An addendum... I forgot to make a separate /var/mail partition for mailspools as I've gotten used to delivering to Maildir in the user home directories. But I've gotten in the habit, lately, of having spillover drive space mounted to /vol/scratch for just such occurrences.

# umount /vol/scratch
# postfix stop
# mv /var/mail /var/mail.foo
# mkdir /var/mail
# disklabel -E /dev/wd0c

Kill the scratch partition, add /var/mail, re-add /vol/scratch...

# vi /etc/fstab
# newfs /dev/wd0{$x}
# newfs /dev/wd0{$y}
# mount /vol/scratch
# mount /var/mail
# mv /var/mail.foo/* /var/mail
# rm -r /var/mail.foo
# postfix start


December 20, 2004

So earlier today I noticed that a 'pop3' process on our (newly installed, thanks to the old one's root drive burning itself out) user mailserver was eating CPU and thrashing.

After a few minutes investigation (and a suggestion from ejp), it seems that the mbox index cache that dovecot builds got corrupted, and spun out of control. Blowing the cache away fixed the problem.

I also upgraded the user's MUA from Thunderbird 0.9 to 1.0, though I figure it's sort of unlikely that was the cause.

Worries me that dovecot will do that in the first place, though, and I suspect this may be one of the things Harry was talking about when he said that dovecot doesn't scale...

(Note: You can turn caching off.)

  • Some jackass turned off the machine that serves the estimating software.
  • Had to troubleshoot that fun dovecot bug.
  • Co-worker's machine got owned and was hammering random machines on the Interweb with ssh brute force attacks. Common forensics software doesn't see shit (linux), but I'm not surprised.
  • Had to set up the firewall rules for reflection, as I forgot to do it on Friday.
  • Discovered that rsync_hfs apparently does not work with netatalk2, in that it does not keep resource forks. Which is its entire purpose for being. psync, ditto, CpMac, etc, all work as expected. Rewrote wrapper scripts to use psync. Last two weeks of data have no resforks. Will be very entertaining when someone asks for a restore. Note to self: Test all software more thoroughly after upgrades. Just because it doesn't cause OS X to reboot itself doesn't mean it's more gooder.
  • It was so windy and cold this morning that the side of my neck which took the brunt of the wind is still red, seven hours later.
  • Haven't got to work on my graphing project at all, which is annoying but fair.
  • Need to set up OpenVPN on the company firewall.
  • Need to set up OpenVPN on PWF's firewall and set up tunnels to CCCP and the Hasty Pastry. Need to read up on routing protocols (or just leave it up to porkchop who seems way more interested in it than I do).


January 3, 2005

Someone spammed this to misc@openbsd yesterday. Unattended OpenBSD install media. Awesome. Will definitely be playing with this once I get home.

Pulling the config based on system stuff is definitely something I might be interested in working on as well.

January 4, 2005

I just added the metawire.org Apache logs to newsyslog.conf:

find /var/www/logs -name "*_log" |sort |sed 's/$/ root:daemon 640 10 * 24 Z "apachectl stop ; apachectl start"/' >> /etc/newsyslog.conf

newsyslog -v -f /etc/newsyslog.conf

And it took a good few minutes, as they've never been rotated and weighed in around 1.2G. The loadavg kicked up to 80 while the files were being compressed, which was pretty entertaining.

A more sane solution to the apachectl command above would be a script that stops Apache, waits until any httpd-related ports aren't being returned by netstat, and then starts it back up.
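A rough sketch of what such a script could look like. The port list, the timeout, and both function names are assumptions for illustration. It also only waits for the LISTEN sockets to disappear, not for every lingering connection, since TIME_WAIT entries can hang around for minutes.

```shell
#!/bin/sh
# Sketch only: the port list and timeout are assumptions, not site config.
PORTS="${PORTS:-80 443}"
TIMEOUT="${TIMEOUT:-60}"

# Reads `netstat -an` output on stdin; succeeds if something is still
# listening on port $1. Handles BSD-style (*.80) and Linux-style
# (0.0.0.0:80) local addresses.
port_listening() {
    grep LISTEN | grep -Eq "[.:]$1[[:space:]]"
}

# Stop Apache, wait for its listening sockets to go away, start it again.
restart_apache() {
    apachectl stop || return 1
    waited=0
    for port in $PORTS; do
        while netstat -an | port_listening "$port"; do
            if [ "$waited" -ge "$TIMEOUT" ]; then
                echo "port $port still in use after ${TIMEOUT}s" >&2
                return 1
            fi
            sleep 1
            waited=$((waited + 1))
        done
    done
    apachectl start
}
```

Saved somewhere like /usr/local/sbin/restart-apache (a made-up path) with a final `restart_apache` call, it would replace the quoted "apachectl stop ; apachectl start" in the newsyslog.conf lines.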

January 21, 2005

The other day I refactored the PWF network, changed internal addressing, set up a DMZ off the firewall, etc. I also stole conduit Mk I (the PWF mailserver) and reinstalled it for use as the fileserver Kyle has been talking about for a while. Did another install on some other random box for conduit Mk II. Spent a few hours down here, made sure it all worked, then went home around 2200...

Kyle msg'd me this afternoon and told me he couldn't get to conduit Mk II. As everything was working fine last night, I was somewhat confused as to why it would be broken. But yeah, it was inaccessible. Figuring it was a hardware problem, I became suitably annoyed, since I have yet to set up decent automated OpenBSD install stuff. Showered and got to Factory around 1520, plugged a keyboard and head into the box, logged in. Everything seemed fine. I let Kyle know, then went out to Sev for some soda.

Came back and noticed that conduit's power light was blinking. I tried to ping the box, and hey... nothing. So I smacked the keyboard that was still plugged into it. Display woke up, but still no network. Power light went solid. The damn thing was sleeping.

Rebooted, fixed the BIOS...

I don't think I've ever had that happen before. Possibly that's because I usually removed APM stuff from the Linux boxes I admin (unless it's required for P4 HT, but those are all server motherboards anyway), and most of the OBSD installs I've done have been on previously known-good hardware. This was just one of the random Andrew-hoarded junk PCs we have laying around here.

I took some pictures. That first one is my bedroom. The rest are of Factory. Ian noted that the date on the camera is wrong. It reset after I put new batteries in it, and I guess I wasn't paying enough attention while licking the buttons. Pressing. Pressing the buttons.

Couple hours later Bryce came down and we cleaned up a bit. By "clean" I suppose I mean we cleaned off the desks and Bryce did some organizing. I suppose I should take some "after" pictures, but that seems pointless.

My next few Factory projects: mail filter box, list server, VPN...


And the latch on my PowerBook broke. That sucks.

January 26, 2005

Harry bitched at me for making that Red Hat joke the other day, so just to be an ass I went ahead and downloaded the Fedora Core 3 ISOs. Finally got around to installing it on a machine today:

Dual P3 900-something, 512MB RAM, SCSI, Ensoniq something or other, NVidia something, Intel EEPro.

It's a Penguin Computing workstation, so all the parts are pretty much guaranteed to work, or they're bad.

Anyway, it booted, saw stuff, installed.

So far it's not awful. I just went with the Workstation install, just to screw with it, since it's just a toy to me. The up2date tool is nice. The fact that it's in the menubar at launch is good stuff. I just pulled 130MB of updates and it's installing them.

Netfilter defaults to on, as does SELinux, so there's actual Workstation Security stuff going on, which is pretty awesome.

GNOME 2.8 is fast. The menu layout still sucks. After using OS X for the past two years, I'm not used to wading through menus to get at things anymore, especially not for simple configuration/preferences. There should be some sort of central location for that stuff. (Is there? gconf doesn't count.)

The keychain icon in the toolbar when you auth to root is good stuff as well.

Overall, thus far, I would say it's a pretty good product.

That said, some people have run into problems with the install or various other things. Perhaps I'll hit those, but probably not before I install something else on the machine. :)

It should also be noted that the Windows Browser thing is still broken. I've never seen a distro where it actually does work, though, so you can't really hold it against RH (I guess).

I had initially intended on getting some RH server action and doing a real review, and how it stood up against other server OSes (Sol10, OpenBSD, etc), but obviously I can't get at the RH Enterprise bits, and reviewing FC3:Server against those just doesn't really seem fair.

I would recommend it to someone who just wants a workstation, anyway.

I just got done fumbling around creating a ccd on OpenBSD; spent about an hour on it, or a little more.

Background: This is a machine I'm sure I've complained about in the past. Gateway "server" with three dead IDE busses. In its current iteration, it's meant to be used as a mirror of our production data and server backups. These will get taped off nightly.

I "repurposed" a 200G SCSI drive that had been hanging off the O2000 a couple months ago. But it'd been laying on the server room floor (sigh) for a while, so it was up for grabs. I didn't realize it was 200G until I mounted /vol/scratch, though. Bit of a shock.

Anyway, creating a ccd is super trivial. It's in GENERIC, so there's no need to recompile. By default, you have four available ccd's (ccd0-ccd3).

First, create disklabels on the component devices. Make sure your track offset is 2. This is what bit my ass for over an hour, because I wasn't thinking.

I had to read this to actually get it. And then it was all made clear.

Anyway, this machine was meant to eat four 200G IDE drives, but there's no way I can fit the fourth drive in there; the IDE cables just won't have it. If I had some velcro I could ghettohack it, but I haven't got any. So, anyway.

Once you have your disklabels made, it's just a matter of:

[root@dua]:[~]# cat /etc/ccd.conf
# $OpenBSD: ccd.conf,v 1.1 1996/08/24 20:52:22 deraadt Exp $
# Configuration file for concatenated disk devices
# ccd ileave flags component devices
#ccd0 16 none /dev/sd2e /dev/sd3e
ccd0 16 none /dev/wd0a /dev/wd1a /dev/wd2a

[root@dua]:[~]# ccdconfig -C
[root@dua]:[~]# ccdconfig -g
ccd0 16 8 /dev/wd0a /dev/wd1a /dev/wd2a

ccdconfig creates a non-zero partition table... "c", which usually symbolizes the whole disk, is in this case an actual partition encompassing the full disk.

If you want to cut the ccd up into smaller partitions:

disklabel -E ccd0

and use the "z" command to zero the partitions and then create your partitions as you normally would. The FAQ fails to mention this, and it was not immediately obvious to me (but that's probably simply because I'm stupid and miss the obvious at times). ccd(4) and ccdconfig(8) do not mention it either, though, so...

Anyway, once you have your partitions set up:

[root@dua]:[~]# newfs /dev/ccd0c
[root@dua]:[~]# mount /dev/ccd0c /vol/backups/dam
[root@dua]:[~]# df -h |grep dam
/dev/ccd0c 550G 2.0K 522G 0% /vol/backups/dam

Pretty easy.

January 27, 2005

So I'm installing the machine that will replace both hastur and ligur, named ligur Mk II. I'm installing postfix, and when it pulls the tls/ipv6/pf patch, it throws a checksum error. "What the hell," says I, and grab an md5 of the file. Sure enough, it doesn't match the checksum listed in distinfo. So I go check on another box, and sure enough... so then I uncompress the two patches, get digests, and they're the same. I copy the patches to a third machine and diff the "bad" and known good patches. No differences.

Same filesize, same chars, same digest. So I recompress the "good" patch on the working box, and copy it over the new box. Same checksum error.

After a few minutes of screwing around, I think to myself...

[bda@selene]:[~]$ touch foobar ; gzip foobar ; md5 foobar.gz
MD5 (foobar.gz) = 36b0031ef3f51c3ceaa0700d8546de41
[bda@selene]:[~]$ rm foobar.gz; touch foobar ; gzip foobar ; md5 foobar.gz
MD5 (foobar.gz) = 997d552d8d6835a6f2b4ea719ba350d5

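Apparently gzip stores the original file's name and mtime in its header, so the same data compressed at a different time gets a different digest even though the payload is identical. Useful so you know if a file has been recompressed (which must have happened on the mirror I pulled the patch from originally). gzip -n leaves the name and timestamp out.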
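A small sketch of how to check this kind of thing next time. The gzip header carries the original file's timestamp (and name), which is why the digests of two otherwise identical archives can differ. The function names are made up, and cksum is used only because it is everywhere; swap in md5 or whatever digest tool the box has.

```shell
#!/bin/sh
# Succeeds if two .gz files decompress to the same bytes, regardless of
# header differences such as the stored name or timestamp.
same_payload() {
    a=$(gzip -dc "$1" | cksum) || return 2
    b=$(gzip -dc "$2" | cksum) || return 2
    [ "$a" = "$b" ]
}

# Compress to stdout without storing the original name and timestamp
# (gzip -n), so the same input bytes give a byte-identical archive.
reproducible_gzip() {
    gzip -nc "$1"
}
```

So `same_payload good.patch.gz bad.patch.gz` answers "is this really the same patch?", and anything built with `gzip -n` will keep matching its published checksum across recompressions, at least with the same gzip version and compression level.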

February 26, 2005

As an ADC member, I get access to the latest Mac OS 10.4 builds. I can't talk about them due to that whole NDA thing, but they cause me to reinstall my machines on a fairly regular basis.

8a393 came out today and I installed it on my laptop when I got home. I went to install it on my PowerMac (a dual G4 gigE, "Mystic"), and the DVD-RAM in it pretty much said "No. Piss off." This is pretty common with the junk-ass drive, so I thought about it for a few minutes.

First I figured I would just dd the Tiger image onto my spare 2g 10G iPod. This didn't work so well.

Then I realized... you can boot off firewire drives. So it stands to reason you can fucking install onto them. I have a firewire enclosure.

A few minutes of screwing around in the Mystic's insides later, I had the root drive out, attached to the enclosure, and plugged into my laptop. A reboot later, and I installed off the DVD. The install rebooted and asked me if I had another Mac I wanted to sync off of for this new install... as a matter of fact, I did: The laptop's boot volume. Ten minutes later (sigh, slow drives), I had a nice mirror of my laptop.

And now? Booted off the firewire drive with my keychain, my Mail, iChat, Safari, Terminal, etc, etc, settings all happy.

Sure it's just a matter of ditto/cpMac'ing files around... but damn. When it's so easy I don't have to think about it, just say "Yes, do that thing" and it works?

Well, that's why all my workstations are Macs these days.

Word up.

March 21, 2005

So Pete brought home a copy of WoW for me to waste my life on. Super, I think, but me being me and unable to just leave well enough alone, I don't want to wait for my MacMini to arrive. So I go to install it on my laptop, as my PowerMac a) has no CD-ROM and b) is a dual G4 500, not something WoW will be happy on.

Unfortunately WoW says "wtf is this Tiger bullshit? You ADC membership-having sucker, I won't play nice with this", so I figure, fine, 8a414 came out last week, I need to update anyway. I'll just dump my data onto my workstation and install Panther before going to bed, then burn the new Tiger build tomorrow and install it on another partition.


I boot my laptop into target disk mode, as that will be much faster than copying 25G over ethernet. No big. Then I hear the enclosure my workstation's root drive is in go "Pop!" and suddenly get real quiet. Anything not currently in memory on my workstation stops working. "Awesome!" I think, and reboot.

It boots my laptop OS. "Not awesome! But whatever!"

So figuring I'll just go ahead and continue on with dumping my data onto the PowerMac's data drives, I roll my chair back to grab the power cord for the laptop... and crack. I roll over it. "Fucking damnit!" says I, and look at it. It looks sad, but plugs in and charges the machine happily. "Whoo," I think.

So now I am copying all my crap off my laptop into my workstation which has no OS drive on it. Likely I just blew the enclosure and not the drive itself (which seems to spin up okay from what I can hear), so I'll just shove it back onto IDE later.

Well, now the machine has a free IDE channel. I spent the majority of yesterday moving data around... the new mirrorshades.net box (ligur Mk II) now has two 200G Maxtors in it, concatenated into around 355G. Yay.

I like that all of this happens at midnight. I really should know better by now...

April 23, 2005

Well, the ligur to crowley migration seems to be mostly over. Just a few little things left to do but it's answering DNS, serving web pages, and eating mail, which in my book means it's pretty much done.

As a few of the users are used to Linux rather than BSD (crowley runs OpenBSD), there are some issues there. :)

Installing amavis was just as big a pain in the ass as I remember, but I just dug out my link to the Fairly Secure Anti-Spam Wiki and ran with it. Some modifications to their stuff... I need to clean up the script I generated to actually install the stuff, but eh.

Took about four hours to set up the new box and move everything over, I think (data had already been getting sync'd). Meh.

April 24, 2005

Migrating the POP accounts was quite painless. Now, getting away from that commercial webmail client with the obfuscated format... that kind of sucked.

May 20, 2005

The gutted PowerMac on the floor *was* running a Tiger beta, which had some interesting issues (DNS would stop resolving. AFP enjoyed eating a CPU and not responding to requests -- taking any other hosts which had it mounted with it), so I figured I would reinstall it tonight.

Realized that I would either have to find a DVD drive, or pull the root disk out and plug it into one of my other Macs and install it via firewire...

Figured it was all too much of a pain in the ass and OpenBSD is just so much easier to deal with.

I love OS X, but freakin' Finder should have been replaced in 10.4. Punkass bitches.

May 21, 2005

From the BitTorrent FAQ.

On some unices, BSD libc has a bug that causes BitTorrent to be very processor intensive. Run the client with the "--enable_bad_libc_workaround 1" option to fix this.

Apparently OS X/Darwin is not one of those libcs, but OpenBSD is. Good to know.

June 2, 2005

Really tired of this bug in OS X where if I sleep my laptop, wake it up, sleep it again without authenticating, the next time I wake it and do auth, it will go back to sleep. Usually it resets the brightness to the lowest level as well.

If someone could fix it, that'd be super kthx.

July 16, 2005

Someone emailed in response to this misc@openbsd post asking for pointers on getting AV and spam filtering running on OpenBSD. I've gone ahead and cleaned my notes up slightly and dumped them in my scripts dir...

Here are my amavis install on OpenBSD notes.

As I've said before, I've used the Fairly Secure Anti-Spam Wiki as a basis.

Like I told Charles... YMMV. :-)

I've become a big proponent of TRAC in the last month or so. It's a very simple, very efficient project management system and svn client. It's good stuff. Many projects (including Catalyst) have adopted it.

I got bored this morning and decided to install a personal copy on mnet, which required installing mod_python and setting up a bunch of other junk for it.

So here are some more "Installing stuff on OpenBSD" docs:

Installing mod_python on OpenBSD
Installing TRAC on OpenBSD

If you find any issues with them, drop me a line.

July 19, 2005

dmesg getting filled with SCSI media error garbage and screwing with line output can cause pretty funny things:

SCOpenBSD 3.6-stable (GENERIC) #1: Thu Jan 13 07:57:07 EST 2005

July 20, 2005

I just had mod_bonjour cause httpd to crash. This is after using it all day... I wonder if someone is spamming bad Bonjour/Rendezvous packets. iChat doesn't seem affected, but, eh.

Commenting out the module fixed it. Here's the trace.

Killing my connection and trying to start Apache didn't help any. mod_bonjour doesn't appear to have been corrupted (md5 same as on my home machine).

That sucked.

September 5, 2005

Couple security fixes in OpenSSH 4.2 so it was time to go on an update spree. I have:

  1. breen
  2. gordon
  3. kleiner
  4. citadel
  5. philtered
  6. ghetto
  7. valve
  8. conduit
  9. punchclock
  10. hyperion
  11. gibson
  12. hastur

A few of those are still running 3.6, and OSSH 4.2 hit 3.6 and 3.8 a few days ago, so they were already updated. But overall? 10 minutes to update those hosts (counting cvsup time), manually, with no script (which would be trivial to do).

Nowhere near the number of machines I had while working at DCI, but there I would have just scripted the updates.

And of course now I have to wait for the few Debian boxes I still maintain, whenever the debsec team releases a package... grr.

[root@kleiner]:[~]# cvsup -g /etc/cvs-supfile
[root@kleiner]:[~]# cd /usr/src/usr.bin/ssh
[root@kleiner]:[/usr/src/usr.bin/ssh]# make clean && make depend \
 && make && make install
[root@kleiner]:[/usr/src/usr.bin/ssh]# cp ssh_config sshd_config /etc/ssh
[root@kleiner]:[/usr/src/usr.bin/ssh]# pkill -f /usr/sbin/sshd
[root@kleiner]:[/usr/src/usr.bin/ssh]# /usr/sbin/sshd

If you made changes to the ssh config files you might want to do a little diff action.

And test.

[bda@eos]:[~]$ ssh kleiner
Last login: Mon Sep  5 23:50:48 2005 from
OpenBSD 3.7-stable (GENERIC) #0: Thu Aug 25 16:30:04 EDT 2005

[bda@kleiner]:[~]$ ssh -V
OpenSSH_4.2, OpenSSL 0.9.7d 17 Mar 2004

Teh yay.
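
The manual run above is easy to loop over a host list. A minimal sketch, assuming root ssh access to every host and the same supfile path everywhere; update_ssh_cmd and update_hosts are made-up names, and DRY_RUN=1 only prints what would be run:

```shell
#!/bin/sh
# Sketch only: loop the manual OpenSSH rebuild over a list of hosts.
# Root ssh access and the supfile path are assumptions, not givens.

# Print the remote command line (the same build steps as the manual run).
update_ssh_cmd() {
    printf '%s' 'cvsup -g /etc/cvs-supfile && cd /usr/src/usr.bin/ssh && make clean && make depend && make && make install'
}

# Run (or, with DRY_RUN=1, just print) the update for each host given.
update_hosts() {
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh root@$host '$(update_ssh_cmd)'"
        else
            ssh "root@$host" "$(update_ssh_cmd)" || echo "FAILED: $host" >&2
        fi
    done
}
```

Something like `DRY_RUN=1 update_hosts breen gordon kleiner` shows the plan without touching anything. Copying the config files and restarting sshd are left out on purpose, since those want a diff and a pair of eyes first.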

December 13, 2005

I've been around. I've done lots of stupid shit with computers. Upgrading the RAM in my Mac mini might just have been the most annoying one.

I was about to order a putty knife (which everyone suggests using, but which damages the case) when ejp reminded me that a while back I had posted a link to some guys who used a strand of CAT5 to get the case open. That single side tab on the front was the worst one.

Twenty minutes later, I had the case open, the RAM swapped, and booted the machine. All was happy.

But Buddha on a pogo-stick, man. What a pain.

As an added bonus, I pulled the 512 stick out of the mini and dumped it into my x86 workstation that never gets used. All of my machines now have a gig or more of RAM in them.

selene (PowerBook) 1.25G
eos (Mac mini) 1G
hyperion (Dell SC1400, fw/fserv) 1G
helios (x86 bitchbox) 1G

Kinda neat. Weird, though.

January 21, 2006

pkg_find is a nice little shell script from Michael Erdely which lets you search packages for a given string and returns a list of matching packages (one of which you can then choose to install). It keeps a local copy of the index, and updates it every n days. There is a port tarball available.

It's only a couple hundred lines, and replaces the ghetto manual index grep I've been doing for a while. I kept meaning to write something exactly like this, but yay apathy. According to the comments on the post, he wrote it to get away from exactly that. :)

The next version of pkg_add will include -i, which will apparently do the same thing. Marc Espie has been kicking ass with the package system.
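
For comparison, the "manual index grep" idea fits in a few lines. This is not pkg_find itself, just a rough sketch that assumes a cached index file with one package name per line; the function names and the file layout are invented for illustration:

```shell
#!/bin/sh
# Sketch only: search a locally cached package index.
# Assumes one package name per line in the index file.

# Succeeds when the cached index is missing or more than N days old
# (by find's -mtime rounding), i.e. when it should be fetched again.
index_is_stale() {
    # $1 = index file, $2 = maximum age in days
    [ ! -f "$1" ] || [ -n "$(find "$1" -mtime +"$2" 2>/dev/null)" ]
}

# Case-insensitive search of the cached index for a string.
pkg_search() {
    # $1 = index file, $2 = search string
    grep -i -- "$2" "$1"
}
```

How the index gets fetched is left out, since that depends on the mirror you use.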

I should probably start tracking -CURRENT somewhere.

January 22, 2006

Untarring src.tar.gz into /usr instead of /usr/src kind of sucks.

It sure is a good thing OpenBSD's ftp is statically compiled. Just had to grab base38.tgz to another box (gzip likes to link to libraries more than directories; blowing out /usr/lib kind of sucks), uncompress it, copy it over to the hosed box, and untar it in /tmp. Copy /usr/local to /tmp/usr/local, as local shouldn't have been touched, then just get rid of the hosed /usr and copy over /tmp/usr.

The other solution would be to build the new /usr on a fresh partition (I like to have one or two spare at the end of the root disk), and then swap them.

Note to self: Have a static sshd and ssh handy. And start using screen-static instead of screen. :)
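
A cheap guard against this kind of accident is to look at a tarball's top-level entries before extracting it. A minimal sketch; tar_top_levels is a made-up helper name:

```shell
#!/bin/sh
# Sketch only: look before you untar.

# List the distinct top-level entries of a gzipped tarball.
tar_top_levels() {
    # Strip a leading "./", cut everything after the first path component,
    # drop the empty line left by the "./" entry itself, then de-duplicate.
    tar -tzf "$1" | sed -e 's|^\./||' -e 's|/.*||' | grep -v '^$' | sort -u
}
```

If the output is a list like bin, lib and sbin rather than a single wrapping directory, the archive is meant to be unpacked inside its own directory, not one level up.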

So for a while now I've been trying to find a decent way of rotating virtual host logs... finally tonight I got around to spending the twenty minutes to write a couple little scripts to deal with it for me.

There are a few problems with Apache logs, especially if you've got logs dumping on a per-vhost basis, and are running some form of stat generation against them. Previously I was doing an extraordinarily lame ghetto hack, with every vhost having two entries in /etc/newsyslog.conf, one for the access log, the other for the error log, and doing a svc -t /service/httpd after every log rotation (svc is part of djb's daemontools; -t sends the service a TERM, after which supervise restarts it). Needless to say, that's pretty crap; more or less the same as restarting Apache n times a night.

The rotatelogs program seems to be far from awesome, seeing as how it doesn't seem to want to fork() per-vhost, and anyway it looks like it bases rotation time from server restart... which is totally useless for cron jobs. Maybe I'm wrong. I suspect I don't care. newsyslog slipped me a twenty, so.

The solution I came up with is rather simple. I have the following script generate a newsyslog.conf for my Apache configs:


#!/usr/bin/perl

use strict;
use warnings;

# Usage: apache_newsyslog_build.pl domains.txt > apache-newsyslog.conf
my $file = $ARGV[0] or die "usage: $0 domains-file\n";

# One domain per line; skip blank lines and strip the newlines.
open (my $fh, '<', $file) or die "Couldn't open $file: $!\n";
chomp (my @domains = grep { /\S/ } <$fh>);
close $fh;

# Server-wide logs first. \$D0 stays a literal $D0 in the output.
my $log_string = <<EOF;
/var/www/logs/access_log root:daemon 640 10 * \$D0 Z
/var/www/logs/error_log root:daemon 640 10 * \$D0 Z
/var/www/logs/ssl_engine_log root:daemon 640 10 * \$D0 Z
/var/www/logs/suexec_log root:daemon 640 10 * \$D0 Z
EOF

# Then an access/error pair for every vhost.
for my $domain (@domains) {
    $log_string .= <<EOF;
/var/www/logs/vhosts/$domain/access_log root:daemon 640 10 * \$D0 Z
/var/www/logs/vhosts/$domain/error_log root:daemon 640 10 * \$D0 Z
EOF
}

print $log_string;


# /root/bin/apache_newsyslog_build.pl /root/etc/domains.txt > /root/etc/apache-newsyslog.conf

And then in root's crontab:

00 0 * * * newsyslog -f /root/etc/apache-newsyslog.conf && /usr/local/bin/svc -t /service/httpd && /root/bin/run_webalizer.pl

Where run_webalizer.pl is:


#!/usr/bin/perl

# This script expects webalizer configs to be in the format:
# domain.com.conf

use strict;
use warnings;

my $www = "/var/www";
my $vhosts = "$www/vhosts";
my $vhosts_logs = "$www/logs/vhosts";
my $access_log = "access_log.0";

my $webalizer = "/usr/local/bin/webalizer";
my $webalizer_confs = "/root/etc/webalizer";
my $gunzip = "/usr/bin/gunzip";
my $gzip = "/usr/bin/gzip";

opendir (DIR,$webalizer_confs) or die ("Couldn't open $webalizer_confs: $!\n");
my @configs = grep (!/^\..*/, readdir (DIR));
closedir (DIR);

foreach my $domain (@configs) {
    $domain =~ s/\.conf$//;
    if (-d "$vhosts_logs/$domain") {
        print "Found $vhosts_logs/$domain.\n";
        if (-d "$vhosts/$domain") {
            print "Analyzing $domain... ";
            unless (-d "$vhosts/$domain/stats") {
                print "Creating stats dir. ";
                mkdir("$vhosts/$domain/stats"); chmod 0755, "$vhosts/$domain/stats";
            }
            # newsyslog has already gzipped the rotated log; unpack it for webalizer.
            if (-f "$vhosts_logs/$domain/$access_log.gz") { my $decompress = `$gunzip $vhosts_logs/$domain/$access_log.gz`; }
            my $analyze = `$webalizer -c $webalizer_confs/$domain.conf`;
            # Put the log back the way newsyslog left it.
            if (-f "$vhosts_logs/$domain/$access_log") { my $compress = `$gzip $vhosts_logs/$domain/$access_log`; }
            print "Done.\n";
        }
        else { print "Skipping $domain.\n"; }
    }
}

Pretty simple.

Anyway, I guess we'll see how well it works tonight. :)

February 19, 2006

My away msg until a few minutes ago:

I like companies who "live test" their BRAND NEW power backup system by pulling the mains on the customer racks. I'm in New Jersey right now, bringing up a couple machines that didn't take too fucking kindly to a hard reboot. Thanks, hostremote.net, for your awesome service! If you were answering your support line I'm sure you would have plenty of useful word-sounding noises to make at me!

Not. Happy.

How do I know they did the above? Well, the uptimes on all the machines I have there are the same! Hmm!

And it killed hastur as well, which was just hosting a couple small CVS/SVN repos. So hastur is now sitting on my floor getting hacked on instead of in Jersey hosting things. Now I get to figure out where to put the crap that was on this machine -- because it is not going back to them.

Had to go to New Jersey with a hangover and fix stuff. Rage.

February 21, 2006

Installed Solaris 10 on my Dell PowerEdge 1400SC the other day. Just got around to logging into it. Man, I don't remember anything about Solaris except ps -ef. Pretty sad. It took a damn long time to get installed, and did not enjoy playing with my cheaper, far more generic, bitch system. Had to swap the PE out of being a fileserver so I could get Solaris installed.

Going to be playing with the Sun LDAP server, methinks. If Solaris Zones actually did virtualization, I would consider using them for some virtual server stuff I'm going to need to do, but pity, they don't. Still, they're pretty awesome. I just don't know what I'd want to give someone a shell on a Solaris box for. ;-)

(Note: Image does not actually inspire confidence.)

March 25, 2006

I guess almost two months ago now I started playing around with Solaris 10. I spent a lot of time reading up on it, and even ordered a SunFire X2100 because I figured I might actually want to run it in production (things like Zones, DTrace, SMF, etc, just are that awesome). I probably wouldn't have thought of getting the SunFire, but mdxi was very positive about his experience with the machine.

As I get older, I seem to notice the noises the computers in my room make more and more. Whenever my fileserver (which is across the room -- but it's a really small room) runs its cron jobs at night (which are all very heavy I/O) it sounds like a small motor is chunking through a duck or something. Last night I finally got tired enough of it that I cleaned out my closet with the intention of moving any machines that don't require a display in there. I also figured I would take the opportunity to do something productive with Solaris 10: Replace my current OpenBSD Samba server.

April 12, 2006

So since I turned my MacMini into a server box (previous system was dying), I've been using my laptop as a workstation at home. When plugged into a 24" LCD it has a fun tendency to kick the fan on high when I do, oh, just about anything at all. The sound was driving me nuts so I figured I'd pick up another MacMini. Well, after seeing Half-Life 2 running on an iMac I changed my mind and ordered a new Intel iMac, decked out for gaming, now that Macs have an easy way to dual-boot Windows.

It got delivered this afternoon, and I've got it set up pretty nicely. The only issue I'm having is freaking Keychain Access not letting me add existing keychains. I've had this problem before, but I can't remember what the deal was. Driving me nuts as KA is one of the apps I live in.

I did just notice something kind of odd, and yuckf00 pointed out the likely cause:

On the iMac:

[bda@moneta]:[~]$ uname -m
[bda@moneta]:[~]$ dmesg
Unable to obtain kernel buffer: Operation not permitted
usage: sudo dmesg
[bda@moneta]:[~]$ ls -l /sbin/dmesg
-r-xr-xr-x 1 root wheel 34908 Feb 21 16:24 /sbin/dmesg*

On a PowerBook:

[bda@selene]:[~]$ uname -m
Power Macintosh
[bda@selene]:[~]$ dmesg | head -n 2
standard timeslicing quantum is 10000 us
vm_page_bootstrap: 316093 free pages
[bda@selene]:[~]$ ls -l /sbin/dmesg
-r-xr-sr-x 1 root kmem 14752 Mar 20 2005 /sbin/dmesg*

Kinda weird.
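
The visible difference between the two listings is the setgid bit (the "s" in -r-xr-sr-x) and the kmem group on the PowerBook's binary. A small sketch for checking that bit on any file; the function names are invented:

```shell
#!/bin/sh
# Sketch only: report whether a file carries the setgid bit.

is_setgid() {
    # test -g is true when the file exists and has its setgid bit set.
    [ -g "$1" ]
}

describe_setgid() {
    if is_setgid "$1"; then
        echo "$1: setgid"
    else
        echo "$1: not setgid"
    fi
}
```

Running `describe_setgid /sbin/dmesg` on both machines would show the same split as the ls output.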

Anyway, woot:

May 11, 2006

Stopping Spotlight Indexing

While I don't really have a problem with metadata indexing in theory, I really dislike having mds kick up to 50% CPU on my laptop, which engages the Fan Whose Pitch Is Just Right to Set My Teeth On Edge (Of Doom County). While I'm not sure if I'll actually disable indexing, the Spotlight overview there was pretty informative.

Unlike, say, this, which is just full of retardedness.

May 30, 2006

Friday I installed OBSD 3.9 on two Dell 1850s and configured CARP and pfsync. It was amazingly trivial. If you need failover systems of pretty much any sort, this is the way to go.

To quote the OpenBSD FAQ page:

CARP is the Common Address Redundancy Protocol. Its primary purpose is to allow multiple hosts on the same network segment to share an IP address. CARP is a secure, free alternative to the Virtual Router Redundancy Protocol and the Hot Standby Router Protocol.

It takes about five minutes to set up, and about fifteen minutes playing "plug/unplug the systems and watch the ifconfig state change, tee-hee!". Kind of like that episode of the Simpsons where Homer keeps pulling on the pig's tail.

"Curly! Straight! Curly! Straight!"

Only CARP just does what it does instead of biting your face off like a certain piggy.

pfsync is, simply, a way to sync your firewall state tables to a group of hosts on a trusted network of some sort. So when your primary firewall/proxy/whatever dies, and a backup kicks in, your users don't notice anything -- they don't lose their sessions. Quite awesome.

Firewall Failover with pfsync and CARP

PF: Firewall Redundancy with CARP and pfsync

The PF page there is pretty much all you need. Getting it working is maddeningly easy and it Just Works.
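
To give a feel for how little there is to it, here is a rough sketch that only prints the commands for one node rather than running them. Every value is a placeholder (interface names, address, vhid, password), the netmask is hardcoded for the example, and the option names should be checked against ifconfig(8), carp(4) and pfsync(4) for your release:

```shell
#!/bin/sh
# Sketch only: print the sysctl/ifconfig commands for one CARP + pfsync node.
# All arguments are placeholders; nothing here is run against the system.
carp_node_cmds() {
    # $1 = physical interface      $2 = dedicated sync interface
    # $3 = shared IP address       $4 = vhid
    # $5 = shared password         $6 = advskew (lower wins the election)
    echo "sysctl net.inet.carp.preempt=1"
    echo "ifconfig carp0 create"
    echo "ifconfig carp0 vhid $4 pass $5 carpdev $1 advskew $6 $3 netmask 255.255.255.0"
    echo "ifconfig pfsync0 syncdev $2"
    echo "ifconfig pfsync0 up"
}
```

A master might use `carp_node_cmds em0 em1 192.168.0.1 1 secret 0` and the backup the same line with an advskew of 100. pf also has to pass the carp and pfsync protocols on the relevant interfaces; the PF page covers those rules.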

August 21, 2006

So I was putting together a test backup server using rdiff-backup last week, and I wanted to (for some strange reason) back up the various OpenBSD machines I have installed since starting there.

It's pretty trivial:

pkg_add popt
pkg_add -i python

wget http://easynews.dl.sourceforge.net/sourceforge/librsync/librsync-0.9.7.tar.gz
wget http://savannah.nongnu.org/download/rdiff-backup/rdiff-backup-1.0.4.tar.gz

tar -xzf librsync-0.9.7.tar.gz
cd librsync-0.9.7
./configure
make all check
make install
cd ..

tar -xzf rdiff-backup-1.0.4.tar.gz
cd rdiff-backup-1.0.4
python setup.py install --prefix=/usr/local --librsync-dir=/usr/local

If you are using 64-bit hardware, you'll need to pass --with-pic to librsync's ./configure.

The next step involves a hacked-up version of the littlest backup wrapper script that could, resync 0.3, and bang, done.

rdiff-backup is pretty sweet. Check out the examples, this howto on unattended backups, maybe this arstech article, and this here wiki.

I need to clean up resync a bit (getting it back in VCS will give me an excuse to try out git, too) and then I'll throw it up on code.
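
Until then, the general shape of such a wrapper is simple enough to sketch. This is not resync, just an illustration; the backup host, the directory layout and the retention period are all assumptions, and the functions only build command lines:

```shell
#!/bin/sh
# Sketch only: build rdiff-backup command lines for a push-style backup.
# Host name, repository layout and retention are illustrative assumptions.

# Command to push one local directory to the backup host.
backup_cmd() {
    # $1 = local dir, $2 = backup host, $3 = this client's base dir there
    echo "rdiff-backup $1 $2::$3$1"
}

# Command to drop increments older than a given age (for example 4W).
prune_cmd() {
    # $1 = local dir, $2 = backup host, $3 = base dir, $4 = age
    echo "rdiff-backup --force --remove-older-than $4 $2::$3$1"
}
```

For example, `backup_cmd /etc backuphost /backups/kleiner` prints `rdiff-backup /etc backuphost::/backups/kleiner/etc`, which can be fed to sh from cron once the ssh keys are sorted out.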

November 23, 2006

I don't really want to get into why I did this (let's just say today has sucked), but you can change MAC addrs on OBSD since 3.8 without digging up `sea.c`.

# ifconfig bge0 lladdr 0a:0b:0c:0d:0e:0f
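
If the address itself doesn't matter, a random one can be generated. A minimal sketch, assuming /dev/urandom and od are available; random_lladdr is a made-up name. It sets the locally administered bit and clears the multicast bit in the first octet, the same property the 0a: prefix above has:

```shell
#!/bin/sh
# Sketch only: make up a random, locally administered, unicast MAC address.
random_lladdr() {
    # Six random bytes from /dev/urandom, as decimal numbers.
    set -- $(od -An -N6 -tu1 /dev/urandom)
    # First octet: set the "locally administered" bit (0x02) and clear the
    # multicast bit (0x01) so the address is usable on a single interface.
    first=$(( ($1 | 2) & 254 ))
    printf '%02x:%02x:%02x:%02x:%02x:%02x\n' "$first" "$2" "$3" "$4" "$5" "$6"
}
```

Usage would be along the lines of `ifconfig bge0 lladdr "$(random_lladdr)"`.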


February 5, 2007

For the last week and a half I've been learning up on Solaris 10 again. The last time I touched it, about a year ago, I was just screwing around with no real interest in using it in a production environment. After reading a few posts over at Theo Schlossnagle's blog regarding choosing Solaris over Linux and his OSCON slides relating to the same, both relating to PostgreSQL performance, I became much more interested in Solaris 10.

(hdp made noises about how evidently the company Schlossnagle works for wrote OmniMTA, which is what the Cloudmark Gateway product uses, among other things; evidently it's a small enough world, after all.)

We have a service at work which stores spam for 30 days. We refer to the messages as "discards", because the system has decided you probably don't want to see them, but it's not like we're going to drop the things on the floor. The thing is, it's insanely slow, right to the very edge of usability (and probably beyond for the vast majority of people). Getting results out of the database takes minutes.

There are a number of issues with the system as a whole, but evidently Postgres configuration is not one of them (jcap, my predecessor, set the service up properly, and a PgSQL Guy agreed there wasn't much else could be done on the service end). So that leaves hardware and OS optimizations. The hardware is fine, save for the fact it's running on metadisk, which is running on SATA (read: bloody slow, especially for PgSQL, which is a pig for disk I/O). We'll be fixing that with a SCSI JBOD and a nice SCSI RAID card RSN. The OS is Linux, and has also been optimized... to a point. Screwing with the scheduler would probably get me something. However, based on my own research (I've read both the Basic Administration and Advanced Administration books over at docs.sun, as well as numerous websites, etc), and Schlossnagle's posts, I've made up my mind that Solaris is the way to go here. So what sold me?

Well, there's the standard features all new to Solaris 10:

  • ZFS
  • StorageTek Availability Suite (I can't seem to get away from network block devices... we use DRBD right now, and frankly I've really come to hate it; but the basic idea is sound enough and far too useful to ignore)
  • Fault Management
  • Zones (not very useful to me in this case)
  • Service Management Facility (while not a deal-breaker or maker, it's incredibly nice being able to define service dependencies and milestones, it also ties into FM)
  • DTrace (for me, this is a deal-maker; check out the DTraceToolkit for examples why, compared to debugging problems under Linux, it's a huge win for any admin)
  • Trusted Extensions (while really interesting and hardcore, not something I care much about just yet)
  • Stability (not only in terms of the system itself, but the APIs, ABIs, and the like; you can use any device driver written for the last three major versions of Solaris in 10 -- compare not only the technology there, but the philosophy behind it, to any freenix)
  • RBAC (while not something I'm going to use immediately, it's something that I really want to utilize moving forward)

That's a fair feature-set that should get any admin to perk up and take notice. Of course, if it weren't for OpenSolaris I wouldn't care. Solaris 8 and 9 are sturdy and well-known systems, but I have no interest in them. They don't get me anything except service contracts and annoying interfaces. With OpenSolaris, Sun is actively making progress in the same friendly directions freenixes have always tried for -- while adding some seriously engineered and robust tech into the mix. It's a nice change. A more open development model, with lots of incremental releases (building into an official Solaris 10 release every six months or so), gives me the warm fuzzies.

So, now that the advertisement is out of the way, what are my impressions after a week of configuring and using it?

Well, Solaris with great new features is still Solaris. Config files are in strange places for legacy reasons, there are symlinks to binaries (or binaries) in /etc, SunSSH is ... well. SunSSH (perhaps sometime soon they'll just switch over to OpenSSH and all the evil politicking can be forgotten, yes?). /home is not /home because it's really /export/home.

Commands exist in odd locations that aren't in my path by default, and logging is strange. In short, it's a commercial UNIX. It's steeped in history and the reasons for things are not always immediately clear. The documentation (from docs.sun, OpenSolaris, and the man pages) is excellent. I am not coming to Solaris as a total newb. I've used it before, but not particularly extensively; the learning curve is expectedly high.

As always, UNIX is UNIX. Nothing changes but the dialect and where they put the salad fork.

So, I've got this core system that does lots of really great stuff, some of which is confusing and maybe not so great, but overall it's a pretty obvious win. Unfortunately it has a bunch of tools I'm not used to, or don't like, and it lacks a lot of tools I require. So I need to go out and find a third-party package management utility. Well, you've got Sun Freeware, which is pretty basic. There's Blastwave, which has a large repository of software, a trivial way of interfacing with it all, but seems to have some QA issues (that's an old impression and may have become invalidated).

And then there's pkgsrc, the NetBSD Ports System. And you know what? It's pretty great. After bootstrapping pkgsrc's gcc from Sun Freeware's (Sun packages gcc now, so you have access to a compiler with the OS -- this was not true before -- but apparently Sun's gcc is Weird and not to be trusted), I was building packages with no issues whatsoever. OpenSSH, Postfix, PgSQL, vim7... Anyway, with an hour's worth of work (which only ever need be done once, on one system, to build the packages), you've got all the programs you're used to using, or require. Suddenly the weird and craggy vastness of Solaris -- expat from the world of commercial UNIX -- becomes much more friendly and livable.

A couple simple hints about your environment: Set TERM=dtterm and add TERMINFO=/usr/share/lib/terminfo. The former seems to be the proper terminal type for xterm or xterm-like terminals, and the latter fixes pgup/pgdown in pagers and vim, though not in Sun's vi.
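
In a shell profile shared between machines, those two settings can be applied on Solaris only. A minimal sketch; solaris_term_env is a made-up name, and the caller passes in the output of uname -s:

```shell
#!/bin/sh
# Sketch only: apply the terminal settings on Solaris and nowhere else.
solaris_term_env() {
    # $1 = kernel name as printed by `uname -s` (SunOS on Solaris)
    if [ "$1" = SunOS ]; then
        TERM=dtterm
        TERMINFO=/usr/share/lib/terminfo
        export TERM TERMINFO
    fi
}
```

In the profile this would be called as `solaris_term_env "$(uname -s)"`, leaving TERM alone on every other OS.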

It's also easy to create your own packages -- something we've been wanting to do at work for a long time (before I started, certainly). Moving our current custom "packaging" system to pkgsrc would be tedious, but certainly something we could automate with some work. Standardizing on it would be a big win not just for the Solaris servers, but for the architecture as a whole. So, a double win.

(I would be remiss not to mention Nexenta, a project which merges GNU software into OpenSolaris's base. It's very, very interesting, especially in that they use Ubuntu's repos, but regardless of the purported stability of their alpha releases, I can't say I am very interested in running it on my servers. Still, it's definitely something someone who wants to give Solaris 10 a poke without too much effort should take a look at. The same way that Ubuntu is there for people who want to try out Linux. I imagine, frankly, that eventually they will occupy the same ecological niche.)

As you might have guessed, I'm quite happy with my week and change of testing. All the basic management is similar to my BSD experience, and the vast wealth of information I can trivially get out of the system compared to other UNIXes makes it hard to argue against. pkgsrc means not only I, but our developers, have access to everything they need or are used to. The default firewall is ipf, which I'm not thrilled about (pf is my bag), but is certainly usable, and no doubt an improvement over whatever came before.

My next step is to take a Reduced Networking installation and build it up into a Postgres server running Slony-1 for replication services. I expect it to go pretty smoothly. The step after that will be a JumpStart server to live next to my (now beloved) FAI server.

There are a few things I need to pursue before we roll to live testing, including root RAID (the X2100's nvidia "RAID" is not supported by Solaris 10, weirdly enough). ZFS root is apparently possible, though tedious and weird. It would give me a trivial way to do mirroring, though. An install server would probably make it easier to do (though that's just a guess). Barring that, I'm guessing that a ZFS pool for system volumes (/usr, /var) and a ZFS pool for data/applications would be good enough, with / mirrored in the typical way (which certainly appears to be simple) until ZFS on root becomes common and supported.

I expect I'll drop some more updates as I move forward. Hopefully with good news for our PgSQL performance. ;-)

<bda> "Every block is checksummed to prevent silent data corruption, and the data is self-healing in replicated (mirrored or RAID) configurations. If one copy is damaged, ZFS will detect it and use another copy to repair it."
* bda sighs.
<bda> It's so dreamy.
<kitten-> You really need a girlfriend.
<bda> I doubt she'd come with fault management and a scriptable kernel debugger.
<kitten-> I suppose you're right.

February 8, 2007

Continuing on my "pkgsrc is pretty awesome" schtick, here's how easy it is to get it running on OS X. The most annoying part is downloading XCode (if you haven't got it already).

Once you have the DevTools/XCode installed, you'll need to create a case-sensitive volume for pkgsrc. Until somewhat recently you couldn't resize volumes, and this would have been far more annoying. However, these days it's pretty trivial.

Another option would be to create a disk image and run pkgsrc out of that. The documentation suggests this course. It has the added bonus of being portable (pkgsrc on your iPod?); I'm just doing this on my laptop and don't want to have to deal with managing a dmg, so I'm going the resize route.

[root@selene]:[~]# diskutil list
#: type name size identifier
0: GUID_partition_scheme *74.5 GB disk0
1: EFI 200.0 MB disk0s1
2: Apple_HFS selene 74.2 GB disk0s2
[root@selene]:[~]# diskutil resizeVolume disk0s2 70G "Case-sensitive Journaled HFS+" pkgsrc 4.2G

Once it's done resizing the volume, you'll need to reboot.

The reboot is required because we're monkeying around with the boot volume. If you're doing this on an external disk, you can just refresh diskarbitrationd with disktool -r.

[root@selene]:[~]# diskutil list

#: type name size identifier
0: GUID_partition_scheme *74.5 GB disk0
1: EFI 200.0 MB disk0s1
2: Apple_HFS selene 70.0 GB disk0s2
3: Apple_HFS 4.1 GB disk0s3

Well, leetsauce, our volume exists. But it's not mounted, because resizeVolume doesn't actually format it.

[root@selene]:[~]# diskutil eraseVolume "Case-sensitive Journaled HFS+" pkgsrc disk0s3
Started erase on disk disk0s3


Mounting Disk

Finished erase on disk disk0s3 pkgsrc

And now it shows up happily:

[root@selene]:[~]# mount
/dev/disk0s2 on / (local, journaled)
devfs on /dev (local)
fdesc on /dev (union)
<volfs> on /.vol
automount -nsl [182] on /Network (automounted)
automount -fstab [189] on /automount/Servers (automounted)
automount -static [189] on /automount/static (automounted)
/dev/disk0s3 on /Volumes/pkgsrc (local, journaled)

And it is indeed case-sensitive:

[root@selene]:[/Volumes/pkgsrc]# touch foo Foo
[root@selene]:[/Volumes/pkgsrc]# ls -l ?oo
-rw-r--r-- 1 root wheel 0 Feb 8 02:09 Foo
-rw-r--r-- 1 root wheel 0 Feb 8 02:09 foo

If you care to, you can change the mountpoint using Netinfo Manager, but I'm lazy and just symlinked /Volumes/pkgsrc -> /usr/pkg. You could also change --prefix=/Volumes/pkgsrc, but as I'm using it on other OSes, I like having it all in the same place. Personal preference.

(When Netinfo Manager stops SPODing at startup I'll probably change the mountpoint.)


[root@selene]:[~]# cd /usr/pkg
[root@selene]:[/usr/pkg]# curl -O ftp://ftp.NetBSD.org/pub/pkgsrc/pkgsrc-2006Q3/pkgsrc-2006Q3.tar.gz
[root@selene]:[/usr/pkg]# tar -xzf pkgsrc-2006Q3.tar.gz
[root@selene]:[/usr/pkg]# cd pkgsrc/bootstrap
[root@selene]:[/usr/pkg/pkgsrc/bootstrap]# ./bootstrap

And off it goes.

===> bootstrap started: Thu Feb 8 02:37:50 EST 2007
===> bootstrap ended: Thu Feb 8 02:40:15 EST 2007
[root@selene]:[/usr/pkg/pkgsrc/bootstrap]# mkdir ../../etc/
[root@selene]:[/usr/pkg/pkgsrc/bootstrap]# cp /usr/pkg/pkgsrc/bootstrap/work/mk.conf.example ../../etc/mk.conf

Prepend /usr/pkg/bin:/usr/pkg/sbin to your $PATH.
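
A small sketch of doing that from a shell profile without stacking up duplicates every time the profile is sourced; prepend_path is a made-up helper name:

```shell
#!/bin/sh
# Sketch only: prepend the pkgsrc directories to PATH, once each.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;     # otherwise put it at the front
    esac
}

# sbin first, then bin, so the result starts /usr/pkg/bin:/usr/pkg/sbin:...
prepend_path /usr/pkg/sbin
prepend_path /usr/pkg/bin
export PATH
```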

The main reason I did this was to upgrade vim to 7.x, because I want tab support (I never really got the hang of managing buffers, much to the dismay of all my Elite Vim hax0r Friends).


[root@selene]:[/usr/pkg]# cd pkgsrc/editors/vim
[root@selene]:[/usr/pkg/pkgsrc/editors/vim]# bmake package

And after downloading a few billion patches and compilation...

[root@selene]:[/usr/pkg/pkgsrc/editors/vim]# which vim
[root@selene]:[/usr/pkg/pkgsrc/editors/vim]# vim --version
VIM - Vi IMproved 7.0 (2006 May 7, compiled Feb 8 2007 02:59:09)


The LSI SCSI card for our new discards database server (Sun X2100 M2) came in today. I ran up to UniCity to get some cables from Harry (thanks, Harry!) and plugged the Dell PowerVault 210S we got into the thing. All the disks showed up happily in cfgadm -al, but not in format. Telling cfgadm to configure didn't seem to do much for me, so I decided to be lazy and touch /reconfigure and reboot. Once that was done, all seemed to be quite happy.

The goal here is to create a ZFS pool on the JBOD for Postgres to live on. As you can see, it is super easy:

[root@mimas]:[~]# zpool create data2 raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t8d0 c3t9d0 c3t10d0 c3t11d0 c3t12d0 c3t13d0
[root@mimas]:[~]# zpool status
pool: data
state: ONLINE
scrub: none requested

data ONLINE 0 0 0
mirror ONLINE 0 0 0
c1d0p3 ONLINE 0 0 0
c2d0p3 ONLINE 0 0 0

errors: No known data errors

pool: data2
state: ONLINE
scrub: none requested

data2 ONLINE 0 0 0
raidz1 ONLINE 0 0 0
c3t0d0 ONLINE 0 0 0
c3t1d0 ONLINE 0 0 0
c3t2d0 ONLINE 0 0 0
c3t3d0 ONLINE 0 0 0
c3t4d0 ONLINE 0 0 0
c3t5d0 ONLINE 0 0 0
c3t8d0 ONLINE 0 0 0
c3t9d0 ONLINE 0 0 0
c3t10d0 ONLINE 0 0 0
c3t11d0 ONLINE 0 0 0
c3t12d0 ONLINE 0 0 0
c3t13d0 ONLINE 0 0 0

errors: No known data errors
[root@mimas]:[~]# zfs create data2/postgresql
[root@mimas]:[~]# zfs set mountpoint=/var/postgresql2 data2/postgresql
[root@mimas]:[~]# mv /var/postgresql/* /var/postgresql2/
[root@mimas]:[~]# zfs umount data/postgresql
[root@mimas]:[~]# zfs destroy data/postgresql
[root@mimas]:[~]# zfs umount data2/postgresql
[root@mimas]:[~]# zfs set mountpoint=/var/postgresql data2/postgresql
[root@mimas]:[~]# zfs mount data2/postgresql
[root@mimas]:[~]# zfs list
data 885M 124G 24.5K /data
data/pkg 885M 124G 885M /usr/pkg
data2 17.1G 346G 44.8K /data2
data2/postgresql 17.1G 346G 17.1G /var/postgresql

RAID-Z is explained in the zpool man page.

While I am a little worried that we really want hardware RAID, I didn't want to spend the extra cash. If we do end up needing the extra performance, trust me, the system can be repurposed easily. ;-)


February 10, 2007

In which our intrepid sysadmin may as well have been eating wall candy as doing work for all it got him.

I spent all of Friday and most of last night trying to figure out why pgbench was giving me what seemed to be totally insane results.

  • >1000 tps on SATA 3.0Gb/s (Linux 2.6, XFS, metadisk + LVM)
  • ~200 tps on SATA 3.0Gb/s (Solaris 10 11/06, UFS)
  • ~500 tps on SATA 3.0Gb/s (Solaris 10 11/06, ZFS mirror)
  • <100 tps on 10x U320 SCSI (Solaris 10 11/06, ZFS RAID-Z)

Setting recordsize=8192 for the ZFS volume is helpful. Or whatever your PostgreSQL blocksize is compiled with.
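A minimal sketch of that tuning step. The helper name recordsize_cmd is made up for illustration, and it only prints the command so you can look at it before running anything. One thing worth knowing: recordsize only applies to files written after it is set, so it wants to happen before the data gets moved over.

```shell
# Hypothetical helper: print (not run) the zfs command that matches a
# dataset's recordsize to a PostgreSQL block size given in bytes.
recordsize_cmd() {
    dataset=$1
    block_size=$2
    echo "zfs set recordsize=$block_size $dataset"
}

# 8192 is the default PostgreSQL block size; on a live server the
# compiled-in value can be read with:  psql -At -c 'SHOW block_size'
recordsize_cmd data2/postgresql 8192
```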

It basically came down to Postgres fsync() absolutely murdering ZFS's write performance. I couldn't understand why XFS would perform 10 times as well as ZFS on blocking writes. Totally brainlocked.

And of course the bonnie++ benchmarks were exactly what I expected: the SCSI array was kicking ass all over the place... except with -b enabled, while the lone SATA drive just sort of sucked air next to it on any OS/filesystem.

The zpool iostat <pool> <interval> command is seriously useful. Give it -v to get per-disk stats.

Anyway, I was banging my head against the wall literally the entire day. The ticket for "Test PgSQL on Sol10" is more pages long than I'd like to think about. Finally I bitched about it in #slony, not out of any particular thought that I'd get an answer, just frustration. mastermind pointed out that, hey, cheap SATA drives have, you know... write caches. And like? SCSI drives? They have it like, turned off by default, because reliability is far more important than performance for most people. (Valley-girlese added to relate how stupid I felt once he mentioned it.) And anyway, that's what battery-backed controllers are for: So you get the perf and reliability.

Once he said "write cache", it clicked, total epiphany, and I felt like a complete jackass for spending more than ten hours fighting with what amounts to bad data. In the process I read a fair amount on good database architecture, though, and that will be very helpful in the coming weeks for making our discards database(s) not suck.

Live and learn, I suppose.

The whole experience has really driven home the point that getting information out of a Solaris box is so much less tedious than getting it out of a Linux box, though. While I was under the impression that the SATA drives were not actually lying to me about their relative performance, I kept thinking "These Sol10 tools are so frakking nice, I don't want to get stuck with Linux tools for this stuff!" Especially the ZFS interfaces.

February 22, 2007

OpenSolaris/Solaris Relationships

A useful entry for people who are confused by how OpenSolaris turns into Solaris 10, and what the differences actually are.

Mercurial: a fast, lightweight Source Control Management system designed for efficient handling of very large distributed projects.

dragorn pointed this out the other day. After having to endure the fist-shaking of my co-workers at svn/svk perhaps they will be satiated. Or not.

OpenSolaris also uses hg, it seems.

hdp noted that since our databases are now entirely InnoDBized (all the MyISAMisms have been ripped out), we no longer need to lock the entire database to get a consistent dump.

Others have written about MySQL backups, so I will just say that it makes me happy I no longer have to entirely rely on the validity of our replicas to get good backups. (Or lock the master for half an hour to get a dump.)
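For the record, the lock-free dump looks roughly like this. It is a sketch, not our actual backup script: the function name and the output path are placeholders. The --single-transaction flag only gives a consistent snapshot for transactional tables like InnoDB, which is exactly why the MyISAMisms had to go first.

```shell
# Dump every database inside one transaction, so InnoDB tables are read
# from a consistent snapshot without a global read lock. Any extra
# arguments are passed straight through to mysqldump.
consistent_dump() {
    mysqldump --single-transaction --all-databases "$@"
}

# Usage (the output path is a placeholder):
#   consistent_dump | bzip2 > /path/to/backup.sql.bz2
```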

(Of course, all may not be light and fairies in the land of MySqueel, but if you've used it extensively for any period of time, I guess you know that already.)

And yes, it terrifies me to consider that the replicas are out of sync with the master, but errors definitely creep in. It saddens me.

In happier news, there is a new Slony-I site in the works, it seems.

February 26, 2007

[Full-disclosure] Local user to root escalation in apache 1.3.34 (Debian only)

Version 1.3.34-4 of Apache in the Debian Linux distribution contains a hole that allows a local user to access a root shell if the webserver has been restarted manually. This bug does not exist in the upstream apache distribution, and was patched in specifically by the Debian distribution. The bug report is located at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=357561 . At the time of writing (over a month since the root hole was clarified), there has been no official acknowledgement. It is believed that most of the developers are tied up in more urgent work, getting the TI-86 distribution of Debian building in time for release.

Unlike every other daemon, apache does not abdicate its controlling tty on startup, and allows it to be inherited by a cgi script (for example, a local user's CGI executed using suexec). When apache is manually restarted, the inherited ctty is the stdin of the (presumably root) shell that invoked the new instance of apache. Any process is permitted to invoke the TIOCSTI ioctl on the fd corresponding to its ctty, which allows it to inject characters that appear to come from the terminal master. Thus, a user created CGI script can inject and have executed any input into the shell that spawned apache.

As a Debian user, this concerns me greatly, as any non-privileged user would be able to install non-free documentation (GFDL) on any system I run.


March 14, 2007

So as part of freeing up some rackspace at work, I'm throwing a bunch of systems into Solaris Zones. However, some of these systems, while not "mission critical" are pretty important and their IP addresses really shouldn't change (DNS propagation lag would suck).

So my Solaris Zones box is sitting on one of our subnets at the colo, the one with the most free addresses. Two of these other systems, however, are on another subnet. There's currently no good way to add a default route for a local zone when the global zone is not also part of that network. I could either waste an IP in that subnet (which I don't want to do), or follow this suggestion and ghetto-hack around it:

[root@chironex]:[~]# cat /etc/hostname.nge0\:99
[root@chironex]:[~]# ifconfig nge0:99 plumb up
[root@chironex]:[~]# ifconfig -a
nge0:99: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet netmask ff000000
[root@chironex]:[~]# zonecfg -z ircd info
zonename: ircd
zonepath: /export/zones/ircd
autoboot: true
dir: /opt
special: /opt
raw not specified
type: lofs
options: [ro,nodevices]
address: A.B.C.D
physical: nge0
[root@chironex]:[~]# ifconfig nge0:99 A.B.C.D netmask A.B.C.248
[root@chironex]:[~]# route add default
add net default: gateway
[root@chironex]:[~]# ifconfig nge0:99 netmask
[root@chironex]:[~]# zoneadm -z ircd boot
[root@chironex]:[~]# ifconfig -a
nge0:5: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
zone ircd
inet A.B.C.D netmask fffffff8 broadcast

Works just fine, though.

(If it came down to some network-contention problems, I could pull the same trick on bge0, another physical device in the system... but it won't.)

March 15, 2007

< bda> I hate MySQL.
< confound> it crashed?
< confound> it crashed at 8:55
< confound> no idea why it was restarted at 9:55.
< bda> I don't think there are any scripts that know supervise.
< confound> I'm sure it wasn't that
< confound> it looks crash-related
< confound> it may even have been another crash at 9:55. it's hard to tell
< bda> Yeah, that looks likely. Or at least it came back up enough to be replicating.
< bda> (at 0855)
< bda> "You trust your data to that pile of junk? You're braver than I thought."
< bda> Sometimes MySQL can't make the jump to lightspeed, and you have to find a princess to get out and push...

March 16, 2007
March 17, 2007

Picked up off osnews. Yeah, I should know better. I might catch something.

Solaris 10 11/06 (Update 3) Review

Solaris Express is coming along; and for those who do want bleeding edge, ultra-super-duper features, then Solaris probably isn't your best bet, then again, assuming you're into that stuff, you'd be better catered for by the likes of Gentoo for example - for those of us who would prefer to have stability above features, then give Solaris a go and if you can make a contribution to Solaris by way of code contributions, then by all means do so.

Recommending Gentoo over Solaris.


To stay in context, though, he is talking about the desktop market. So why did he review Solaris 10? Why not Nexenta, which is geared for exactly that?

And Gentoo. Instead of say... Ubuntu.

Also, Solaris lacking features that Linux has? That's a bloody joke and a half:


There was this noise a minute ago? Weird wooshing sound? Right over someone's head.

This has been your monthly blog rant against some other blog post some blogging guy somewhere wrote about something.

April 8, 2007

Well, it looks like Debian 4.0 (Etch) has been released. And they have a new project leader. And they're talking about trying to get releases out every two years.

* bda peeks out the window, looking for amphibian precipitation or airborne porcine.

The whole dunc-tank thing was, in a lot of ways, the final straw for me. Not the fact that some Debian leads and devs got paid for their work. Who cares, as long as they were doing the work? No, the fact that a bunch of essentially commie programmers jumped ship from the leading commie Linux distro to work on Ubuntu, which is pretty damn far from the Debian project's ideals (regardless of the noise Ubuntu people make).

But when it all comes down to it, I don't care about this crap anymore. I don't care that an OpenBSD dev goofed up and committed GPL'd code to the public CVS repo, I don't care that there was a huge flame-out on linux-wireless@, I don't care about ridiculous community in-fighting.

At the end of the day I want two things:

  • Something that works
  • Something with a stable release cycle

Maybe Debian can get there again, though as Ian Murdock recently said during one of his interviews about being hired at Sun, Debian is all about the process these days. And their process is broken.

April 12, 2007

While I was at the colo tonight doing other stuff, I installed Debian 4.0 on one of our SuperMicros (older rev SATA cards which aren't supported by Solaris). The install was relatively painless. I got my metadisks and volumes set up with ease, it didn't ask any stupid questions, and there wasn't any post-install setup.

I chose the "standard" install, as I didn't want www, mx, or anything else going on. I just wanted the standard base Debian install I've been used to for the last ten years. The system gets to a login prompt, I unplug the display, and go back to my other tasks.

When I finally get home, I log into the... wait. What?

[bda@selene]:[~]$ ssh root@moon
ssh: connect to host moon port 22: Connection refused

I... What?

So I think to myself: Maybe I am crazy. Maybe there were some post-install setup questions and I just wasn't paying attention. After a quick install into a Parallels virtual machine, it's quite apparent that, at least in this particular context, I am not insane.

No OpenSSH by default in Debian 4.0.

But hey, nfs-common, portmap, and inetd are all running! So ... that's something.

It's like Debian is saying "We need to be more like Ubuntu. How can we do that? Hey, they don't ship with sshd by default, let's do that!"

This is a load of bollocks. It's an incredibly basic policy change (one I've relied on for as long as I've used Debian -- ten fucking years!) and it wasn't mentioned in any of the fucking release notes or announcements.

This is total bullshit.

So a dude uses Cocktail to screw around with his system, and discovers /usr is now visible in Finder (it is not by default). So he deletes it and then blames Apple for not protecting him against his own malicious stupidity.

After a couple dozen people point and laugh, he cries out:

"Regardless, I expected folks within the Apple realm to be of help, rather than be righteous and condescending."

And of course anyone who has actually ever met a Mac user laughs insanely.

Amid all the justified "you are an idiot" and "lol rm -rf /" noise, there is this gem:

"You can't have your userland cake and eat your hacker cake too."

One wonders what flavor and toppings the respective cakes consist of.

April 16, 2007

Live Upgrade for Idiots

Huzzah, I was going to start looking for something like that. :-)

Added that site to my feeds as well. woop.

[via stevel]

May 12, 2007

Ben Rockwood responds to Paul Boutin talking smack? maybe? about sysadmins.

My own experience with becoming a system administrator did not involve drawing Sun logos onto my Trapper Keepers or memorizing IBM hardware line-ups instead of important Civil War battles. I wasn't even really aware of those things. No, I fell into it from the bottom-up. The first UNIX box I touched was Linux, and I didn't even really understand there was a whole ecosphere of UNIXes out there for a couple years after that. I knew they existed, I suppose, but they were like funny birds you hear about in far-off countries.

Looking back on the (almost) ten years of my "career", it's only now that I actually feel I'm edging up onto the ramp of competency. It's more an understanding of all the things I don't know than a pride in the things I do, though. In some respects that's heartening, because it means I'm becoming good enough to know what I'm not good at, instead of simply being blindly ignorant. It's also disheartening, though, because there are a great many areas in which I know I need a great deal of improvement.

When I get down about this, I'll break out my (rather beat up) copy of Hagakure and read the following excerpt:

A certain swordsman in his declining years said the following:

In one's life, there are levels in the pursuit of study. In the lowest level, a person studies but nothing comes of it, and he feels that both he and others are unskillful. At this point he is worthless. In the middle level he is still useless but is aware of his own insufficiencies and can also see the insufficiencies of others. In a higher level he has pride concerning his own ability, rejoices in praise from others, and laments the lack of ability in his fellows. This man has worth. In the highest level a man has the look of knowing nothing.

These are the levels in general. But there is one transcending level, and this is the most excellent of all. This person is aware of the endlessness of entering deeply into a certain Way and never thinks of himself as having finished. He truly knows his own insufficiencies and never in his whole life thinks that he has succeeded. He has no thoughts of pride but with self-abasement knows the Way to the end. It is said that Master Yagyu once remarked, "I do not know the way to defeat others, but the way to defeat myself."

Throughout your life advance daily, becoming more skillful than yesterday, more skillful than today. This is never-ending.

It doesn't necessarily make me feel better, but it usually makes me hate the world less. Getting back to even, maybe.

My job is often stressful, and mainly seems to be ever more rare islands of "ok, now this is cool" amongst a sea of frustrations. I wish I could say this condition makes it easy to lose sight of why I started down this path -- but like I said: I fell into it. I'm still here because I don't know anything else. I suppose it's enjoyable enough, and typically pays well enough, that the Irish genes kick in.

Like Colin Sullivan says in The Departed:

I'm fucking Irish, I'll deal with something being wrong for the rest of my life.

(Still not sure what it means, being California Irish by way of pretty much everywhere.)

And lately, anytime something ridiculously stupid occurs, it's harder to treat it as nothing more than a challenge. Now it's just another reason to stumble off this path and find another.

Good luck with that, me.

May 18, 2007

I've been spending a lot of time working on consolidating our services. It's tedious, because we have been a Linux shop since the company was started: there are many Linux and GNUisms. I have yet to question the decision to move as much as I can to Solaris 10, however. Consider:

We have, in the past, had two (more or less) dedicated systems for running MySQL backups for our two production databases. These replicas would stop their slave threads, dump, compress, and archive to a central system. Pretty common. But they were both taking up resources and rackspace that could otherwise be utilized.

Enter Solaris Zones. There's no magic code required for mysqldump and bzip2, so moving them was trivial. The most annoying part of building any new MySQL replica is waiting on the import. But, if you admin MySQL clusters you're probably already really, really used to that annoyance.

So I built a new database backup zone to run both instances of MySQL. Creatively, I named it dbbackup. It ran on zhost1 (hostnames changed to protect the innocent). Unfortunately, zhost1 also runs all our development zones (bug tracking, source control, pkgsrc and par builds) as well as our @loghost. Needless to say, the addition of two MySQL dbs writing InnoDB pretty much killed I/O (this is on an X2100 M1 with mirrored 500GB Seagate SATA 3.0Gb/s drives), making the system annoying to use.

This week I deployed two new systems to act as zone hosts, one of which is slated for non-interactive use. So last night I brought down the database backup zone and migrated it over.

This document details the process, which is ridiculously trivial. No, really. The most annoying part was waiting on the data transfer (60GB of data is slow anywhere at 0300).

My one piece of extra advice is: Make sure both systems are running the same patch level before you start. PCA makes this pretty trivial to accomplish.

This is a sparse-root zone, but there are two complications:

  • I delegate a ZFS dataset to the zone, so there are a bunch of ZFS volumes hanging off it. However, they all exist under the same root as the zone itself, so it's not really a big deal.
  • I have a ZFS legacy volume set up for pkgsrc. By default pkgsrc lives in /usr/pkg, /usr is not writable since it's a sparse zone, and I don't really want to deal with moving it. It needs to be mounted at boot time (before the lofs site_perl mounts which contain all our Perl modules in the global zone), however, and after a little bit of poking I couldn't figure out how to manipulate zvol boot orders. Legacy volumes get precedence over lofs, though, so. Ghetto, I know.

The volume set up looks like this:

[root@dbbackup]:[~]# zfs list
data 85.3G 348G 24.5K /data
data/zones 41.2G 348G 27.5K /export/zones
data/zones/dbbackup 40.4G 348G 135M /export/zones/dbbackup
data/zones/dbbackup/tank 39.9G 348G 25.5K none
data/zones/dbbackup/tank/mysql 39.9G 348G 8.49G /var/mysql
data/zones/dbbackup/tank/mysql/db2 24.5G 348G 24.5G /var/mysql/db2
data/zones/dbbackup/tank/mysql/db1 6.92G 348G 6.92G /var/mysql/db1

So, my process?

First, shut down and detach the zone in question.

[root@zhost1]:[~]# zlogin dbbackup shutdown -y -i0
[root@zhost1]:[~]# zoneadm -z dbbackup detach

Make a recursive snapshot of the halted zone. This will create a snapshot of each child hanging off the given root, with the vanity name you specify.

[root@zhost1]:[~]# zfs snapshot -r data/zones/dbbackup@migrate

Next, use zfs send to write each snapshotted volume to a file.

[root@zhost1]:[~]# zfs send data/zones/dbbackup@migrate > /export/scratch/dbbackup@migrate
[root@zhost1]:[~]# zfs send data/zones/dbbackup/pkgsrc@migrate > /export/scratch/dbbackup-pkgsrc\@migrate
[root@zhost1]:[~]# zfs send data/zones/dbbackup/tank@migrate > /export/scratch/dbbackup-tank\@migrate
[root@zhost1]:[~]# zfs send data/zones/dbbackup/tank/mysql@migrate > /export/scratch/dbbackup-tank-mysql\@migrate
[root@zhost1]:[~]# zfs send data/zones/dbbackup/tank/mysql/db2@migrate > /export/scratch/dbbackup-tank-mysql-db2\@migrate
[root@zhost1]:[~]# zfs send data/zones/dbbackup/tank/mysql/db1@migrate > /export/scratch/dbbackup-tank-mysql-db1\@migrate

Now, copy each of the dumped filesystem images to the new zone host (zhost2), using scp or whatever suits you. Stare at the ceiling for two hours. Or catch up on Veronica Mars and ReGenesis. Whichever.

Once that's finished, use zfs receive to import the images into an existing zpool on the new system.

[root@zhost2]:[/export/scratch]# zfs receive data/zones/dbbackup < dbbackup\@migrate
[root@zhost2]:[/export/scratch]# zfs receive data/zones/dbbackup/pkgsrc < dbbackup-pkgsrc\@migrate
[root@zhost2]:[/export/scratch]# zfs receive data/zones/dbbackup/tank < dbbackup-tank\@migrate
[root@zhost2]:[/export/scratch]# zfs receive data/zones/dbbackup/tank/mysql < dbbackup-tank-mysql\@migrate
[root@zhost2]:[/export/scratch]# zfs receive data/zones/dbbackup/tank/mysql/db2 < dbbackup-tank-mysql-db2\@migrate
[root@zhost2]:[/export/scratch]# zfs receive data/zones/dbbackup/tank/mysql/db1 < dbbackup-tank-mysql-db1\@migrate
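Typing six sends and six receives by hand is fine for one zone, but the names follow a pattern, so the commands can be generated instead. What follows is a sketch under a few assumptions: the function names are invented, the data/zones/ prefix and /export/scratch path are taken from the transcripts above, and zfs list is assumed to print parents before their children (as it does in the listing earlier) so the receives happen in a workable order. The functions only print commands; look them over, then pipe to sh.

```shell
# Map a dataset name to the scratch-file name used above, e.g.
# data/zones/dbbackup/tank/mysql -> dbbackup-tank-mysql@migrate
image_name() {
    echo "${1#data/zones/}@migrate" | tr '/' '-'
}

# Read dataset names on stdin, print one "zfs send" command per dataset.
send_commands() {
    while read -r ds; do
        echo "zfs send $ds@migrate > '/export/scratch/$(image_name "$ds")'"
    done
}

# Same for the receiving side (meant to run from /export/scratch on zhost2).
receive_commands() {
    while read -r ds; do
        echo "zfs receive $ds < '$(image_name "$ds")'"
    done
}

# Usage on the old host (-t filesystem keeps the snapshots out of the list):
#   zfs list -H -r -t filesystem -o name data/zones/dbbackup | send_commands
```

Saving the dataset list to a file on the old host and feeding the same file to receive_commands on the new one keeps both sides in step.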

Before I could attach the zone, I needed to set the mountpoints for the dataset and legacy volumes properly.

[root@zhost2]:[~]# zfs set mountpoint=legacy data/zones/dbbackup/pkgsrc
[root@zhost2]:[~]# zfs set mountpoint=none data/zones/dbbackup/tank

Also, since the zone was living on a different network than the host system, I needed to add a default route for that network to the interface. I talked about this earlier, and it's a workaround that should be going away once NIC virtualization makes it into Solaris proper from OpenSolaris (I would guess u5?).

[root@zhost2]:[~]# ifconfig nge0:99 A.B.C.D netmask
[root@zhost2]:[~]# route add default A.B.C.1
add net default: gateway A.B.C.1
[root@zhost2]:[~]# ifconfig nge0:99 netmask

Now, create a stub entry for the zone with zonecfg.

[root@zhost2]:[~]# zonecfg -z dbbackup
dbbackup: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:dbbackup> create -a /export/zones/dbbackup
zonecfg:dbbackup> exit

And that's pretty much it.

Attach the zone and boot it and you're done.

[root@zhost2]:[~]# zoneadm -z dbbackup attach
[root@zhost2]:[~]# zoneadm -z dbbackup boot

Once it was up, I logged in, made sure the MySQL instances were replicating happily and closed the ticket for moving the zone.

This level of flexibility and ease of use is key. Combined with the other technologies included in Solaris 10, you'd be crazy not to be using it. (Even with the annoying bits still lurking in Sol10, it's absolutely worth the effort.)

And it's only going to get better.

May 31, 2007

One of my major blocking tasks right now is to rebuild our Perl tree (about 600 modules) as PARs for easy distribution. I'm not using SVR4 or pkgsrc packages for them because I want to have one build system for both our Linux and Solaris systems. Much of our code relies on version-specific behaviors, so the only way for me to actually get stuff ported (without going totally batshit) from the entrenched Linux systems to the Solaris boxes is to rebuild all those modules, at those versions.

Yesterday, I spent most of the day compiling one Perl module (Math::Pari, which relies on the pari math libraries, and engages me in an indecent amount of skullfuckery whenever I try to build it. This particular adventure into stupidity was caused mainly by braindead pkgsrc dependencies... pari relies on teTeX -- to build its documentation -- which relied on X11. I spent a good portion of time trying to get it working the way it wanted until finally just ripping the X bits out. Guess how long that took. Yup. Bare minutes.). Finally I just built pari by hand to /opt and linked Math::Pari against that. Could have saved hours and hours...

Spent all of today building the rest of our Perl modules. Got down from ~600 to 67. Pulling the modules was easy enough once hdp mentioned the by-module listing on the CPAN. The vast majority of modules were well-behaved; it was simply a matter of iterating over the modules, running

perl Makefile.PL --skipdeps && \
make && make test && \
cd blib && \
zip -r $DEST/$MODULE-i386-solaris-5.8.8.par *

and then installing it. Not a big deal. Some of them were tenacious and obnoxious, though, and ate up a lot of time. We can theoretically (this has not proven to be completely true) skip deps as any dependencies should exist in our local tree. I'm sure once I'm done I'll have to check to make sure all the modules actually have their deps, but the vast majority should.
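The iteration itself can be sketched as a pair of shell functions. This is an illustration rather than the script I actually ran: the function names are made up, it assumes each module is already unpacked into its own directory, and DEST has to be an absolute path because the zip runs from inside blib.

```shell
# Name of the PAR for a module distribution, following the
# $MODULE-i386-solaris-5.8.8.par convention used above.
par_name() {
    echo "$1-i386-solaris-5.8.8.par"
}

# Build one unpacked distribution directory into $DEST. The subshell keeps
# the cd from leaking, and any failing step stops that module's build.
build_par() {
    dir=$1
    module=$(basename "$dir")
    (
        cd "$dir" &&
        perl Makefile.PL --skipdeps &&
        make && make test &&
        cd blib &&
        zip -r "$DEST/$(par_name "$module")" *
    )
}

# Build every directory given as an argument; list the failures at the end.
build_all() {
    failed=""
    for dir in "$@"; do
        build_par "$dir" || failed="$failed $(basename "$dir")"
    done
    echo "failed:$failed"
}
```

Collecting the failures instead of stopping at the first one is the point: the well-behaved majority builds unattended, and the tenacious ones end up on one line to be dealt with by hand.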

Tomorrow I get to finish this up and then maybe get some working code running in some zones. Huz-freakin'-zah.

June 6, 2007

OpenSolaris: Five updates conservative developers should make

Some really good points listed therein.

I dream about the days when our code all runs happily on Solaris, is packaged up in SVR4 streams, and I can open "add DTrace providers to stuff" tickets...

July 19, 2007

Got spam from Dell this morning, about their new lineup of small business laptops and workstations:

No trialware.

Customers said they hated trialware, so we took it away. Vostro systems come without annoying trialware pre-installed. You only get the software you want.

Really? This is an advertisable feature and not a standard?

The amount of useless crap that comes with OS X systems has thankfully been kept to a minimum -- and you can just uncheck any that is there during install.

Poor Windows OEM carriers.

September 1, 2007

bda's maxim for the night:

Availability of tools is never a badness.

September 9, 2007

<@s4rk> It's hard for me to imagine what a "good" SA is like.
< pthread> s4rk: I'll give you a clue, he's a flaming asshole.
<@bda> bah.
<@bda> A good SA is not a flaming asshole.
<@bda> That's a stereotype perpetrated by jerkface programmers.

October 1, 2007

After talking about it for at least half a year, last night I finally started really reading up on Puppet. After watching a BayLISA talk by Luke Kanies (p1, p2), I installed it on one of my Solaris 10 test boxes, installed a couple test zones, and started screwing around with it.

I gotta say, it's super easy to get it up and running. The hardest part, conceptually, is going to be modeling the environment in such a way that won't require repeated major refactoring every other week. Minor tweaking, sure, but ripping walls down would get old quick. Thankfully there are documents like Puppet Best Practices to get you going. There's also a fair amount of code under the hood already, and determining how much of it will be usable to me is going to be fun. The zones management type looks really, really useful considering how heavily we currently use zones, and how that isn't going to do anything but increase.

This week I really hope to have all my system tests (written in Test::More) ported over to Puppet in some relatively sane manner.

Configuration management systems are one of those things that simply make your life less hateful.

The way the last few weeks have gone, I'm going to have to start focusing more heavily on automating everything that can be automated, or spiral further into frustrated insanity.

October 4, 2007

After messing around with plain old Jumpstart for a day, I got sick of it and decided to try out the Jumpstart Enterprise Toolkit (JET) after eryc mentioned it: a bunch of code living on top of Jumpstart, meant to make lives easier. It does. Getting things set up, adding hosts, etc, goes from being kind of tedious to trivial. The real killer for me was dealing with Solaris's DHCP manager. Man, what a weird, annoying thing.

So now I have Jumpstart set up in Parallels on my laptop (that's 30GB I won't be getting back anytime soon), which is a pretty useful thing to have. I suppose next I'll set an FAI VM for those Debian boxes I still haven't replaced...

Here is the HOWTO I used as a starting point, and also the JET wiki.

Someone in #opensolaris yesterday mentioned they had a Debian Etch branded zone running. And it looks pretty trivial to do, too.

Derek Crudgington, of Joyent, has a post over on his blog about using DTrace to instrument MySQL (which does not have any DTrace probes). As long as you know the function names you're interested in, you can get some really useful information out of it.

The fact that you can get that information, which would typically get you a major performance hit from MySQL itself, without MySQL having to be touched, restarted, or impaired, is just another example of how great DTrace is.
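To give a flavor of what that looks like, here is a sketch using the pid provider, which can put a probe on the entry of any function in a running process. This is not Derek's script. The helper name is invented, and mysql_parse is just my assumption of an interesting function name; substitute whatever your mysqld build actually contains. The wildcards are there because the C++ symbol names are mangled.

```shell
# Hypothetical helper: print a D program that counts calls to any function
# in the target process whose (possibly mangled) name contains the string.
count_calls_probe() {
    printf 'pid$target::*%s*:entry { @[probefunc] = count(); }\n' "$1"
}

# Usage, as root, against a running mysqld (Ctrl-C prints the counts):
#   dtrace -n "$(count_calls_probe mysql_parse)" -p "$(pgrep -x mysqld)"
count_calls_probe mysql_parse
```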

October 9, 2007

Several months ago, after watching Bryan Cantrill's DTrace talk at Google, I went looking for the then-current state of DTrace userstack helpers for Perl. We're a big Perl shop; being able to get ustacks out of Perl would be a pretty major thing for me. I came across a blog post by Alan Burlison who had patched Perl 5.8.8 with subroutine entry/return probes, but couldn't, at the time, find a patch for it. So I forgot about it.

The other day I re-watched that talk and went looking again. Discovering, in the process, that Richard Dawe had reproduced Alan's work and released a diff. Awesome!

So the basic process is this:

  • Get a clean copy of Perl 5.8.8
  • Get Richard's patch
  • Read the instructions in the patch file
    • note that you have to build with a dynamic libperl!
  • Use gpatch to patch the source, and configure Perl as usual
$ cd perl-5.8.8
$ gpatch -p1 -i ../perl-5.8.8-dtrace-20070720.patch
$ sh Configure

Noted by Brendan Gregg, you'll also need to add a perldtrace.o target to two lines in the Makefile (line numbers may differ):

274          -@rm -f miniperl.xok
275          $(LDLIBPTH) $(CC) $(CLDFLAGS) -o miniperl \
276              miniperlmain$(OBJ_EXT) opmini$(OBJ_EXT) $(LLIBPERL) $(libs) perldtrace.o
277          $(LDLIBPTH) ./miniperl -w -Ilib -MExporter -e '' || $(MAKE) minitest
279  perl$(EXE_EXT): $& perlmain$(OBJ_EXT) $(LIBPERL) $(DYNALOADER) $(static_ext) ext.libs $(PERLEXPORT)
280          -@rm -f miniperl.xok
281          $(SHRPENV) $(LDLIBPTH) $(CC) -o perl$(PERL_SUFFIX) $(PERL_PROFILE_LDFLAGS) $(CLDFLAGS) $(CCDLFLAGS) perlmain$(OBJ_EXT) $(DYNALOADER) $(static_ext) $(LLIBPERL) `cat ext.libs` $(libs) perldtrace.o

As the patch instructions state, you'll need to generate a DTrace header file, running:

$ make perldtrace.h
/usr/sbin/dtrace -h -s perldtrace.d -o perldtrace.h
dtrace: illegal option -- h
Usage: dtrace [-32|-64] [-aACeFGHlqSvVwZ] [-b bufsz] [-c cmd] [-D name[=def]]

Ouch, ok, apparently dtrace -h is broken on Solaris 10u3. I mentioned this on #dtrace, and Brendan suggested I find a Perl script posted to dtrace-discuss by Adam Leventhal to emulate dtrace -h behavior.

But I'm lazy and have Solaris 10u4 boxes, so I just generate the header file on one of those and copy it over to the u3 box.

Once you have perldtrace.h in place, run make as normal, get a cuppa, whatever.

As soon as your make is done running, check the patch file for instructions on running a simple test to see if it's working. I have yet to have any issues.

Now, as Alan mentions in his blog, there's a chance you could eat a 5% performance hit. For me, that would be worth it, due to the complexity of our codebase and the fact I am sometimes (though thankfully not recently) called upon to debug something I am wholly unfamiliar with at ungodly hours of the night. Digging around for the problem is hard as adding debugging to running production code is simply not going to happen. With a DTrace-aware Perl, it's simply a matter of crafting proper questions to ask and writing wrappers to make the inquiries.

I'm certainly not at a point where I can do that, but I reckon it won't be long after I've deployed our rebuilt Perl packages that I'll be learning "A is for Apple ... D is for DTrace".

To simply quantify that performance hit, rjbs suggested we run the Perl test suite on various builds. Below I have (again, very simple) metrics on how long each build took to run the tests. As DTrace requires a dynamic libperl, which is going to be a performance hit of some (unknown to me) value, I have both static and dynamic vanilla (no DTrace patch) build times listed.

Build type                                        real        user       sys
Vanilla Perl, static libperl                      8m44.880s   3m44.770s  1m41.745s
Vanilla Perl, dynamic libperl                     9m41.212s   4m32.217s  1m49.256s
Patched Perl, dynamic libperl, not instrumented   10m17.740s  4m32.825s  1m49.017s

If the test suite is indeed a useful metric, the hit is certainly not nothin'. I suspect there would be ways to mitigate that hit, though.
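To put rough numbers on that hit, the wall-clock ("real") times above can be turned into percentages. This is a minimal sketch in Python (my choice for illustration; the helper names to_seconds and overhead_pct are made up), using only the times from the table:

```python
import re

def to_seconds(stamp):
    """Convert a time(1)-style stamp such as '8m44.880s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError("unrecognised time stamp: %r" % stamp)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def overhead_pct(baseline, candidate):
    """Relative slowdown of candidate over baseline, in percent."""
    base = to_seconds(baseline)
    return (to_seconds(candidate) - base) / base * 100.0

# "real" times copied from the table above
static_vanilla = "8m44.880s"
dynamic_vanilla = "9m41.212s"
dynamic_patched = "10m17.740s"

print("dynamic vs static:  %.1f%%" % overhead_pct(static_vanilla, dynamic_vanilla))
print("patched vs dynamic: %.1f%%" % overhead_pct(dynamic_vanilla, dynamic_patched))
print("patched vs static:  %.1f%%" % overhead_pct(static_vanilla, dynamic_patched))
```

On these single runs the patch costs roughly 6% on top of the dynamic build, which is in the same neighborhood as the 5% Alan mentions. One run per build is not much of a sample, so treat the figures as a ballpark.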

As soon as I gain some clue (or beg someone in #dtrace for the answer), I'll run the same tests while instrumenting the Perl processes. Just need to figure out how to do something like

/execname == "perl"/
  self->follow = 1;

perl$1:::sub-entry, perl$1:::sub-return
{ ... }

when the Perl processes I want to trace are completely ephemeral.

October 10, 2007

Noticing the question in my previous post about ephemeral processes, seanmcg in #dtrace suggested I write something akin to this, which did occur to me, vaguely, as a possibility. But it seemed like far more complexity than I wanted to create, and starting/stopping processes to kick off watchers sounded like a good way to impact performance in an already loaded environment (read: our mailservers). I knew there had to be a better way to do it than wrapping DTrace up in Perl so I could monitor Perl, but I couldn't figure out how to do it with the pid::: provider. Well, you can't. But!

< brendang> the wildcard "*" doesn't work properly for the pid provider, but does work for the USDT language providers
< brendang> most of the language examples in the new DTraceToolkit use perl*:::, mysql*:::, javascript*:::, etc

Obviously DTT should have been the first place I looked, instead of whining. :-)

So if you are trying to follow something specific with the pid:::, seanmcg's method is certainly viable. I just wanted to glob onto all Perl processes, though.

Brendan also offered the following (as I was thinking about it backwards in my previous post):

#!/usr/sbin/dtrace -Zs

perl*:::sub-entry
{
        self->sub = copyinstr(arg0);
}

syscall:::entry
/self->sub != NULL/
{
        printf("Perl %s() called syscall %s()", self->sub, probefunc);
}

perl*:::sub-return
{
        self->sub = 0;
}

Start 'er up in Terminal A:

[20071010-00:10:31]:[root@mako]:[~]# ./perlsubs.d
dtrace: script './perlsubs.d' matched 232 probes

Kick off one of our simple but venerable helper scripts, with shebang set to the patched Perl:

[20071010-00:10:34]:[root@mako]:[~]# ./spool-sizes.pl -h

usage: spool-sizes.pl [-tabcdimsvh]
-t: global threshold (default = 1000 messages)
-a: active spool threshold (default = $threshold)
-H: hold spool threshold (default = $threshold)
-c: corrupt spool threshold (default = $threshold)
-d: deferred spool threshold (default = $threshold)
-i: incoming spool threshold (default = $threshold)
-n: no mail (do not mail, but create file in /var/tmp/spool-sizes)
-T: add a composite "total" spool
-v: visual (i.e. output to console vs. file and do not mail)
-h: help (this message)

And, back in Terminal A:

0 40463 stat64:entry Perl BEGIN() called syscall stat64()
0 40463 stat64:entry Perl BEGIN() called syscall stat64()
0 40463 stat64:entry Perl BEGIN() called syscall stat64()
0 40097 close:entry Perl BEGIN() called syscall close()
0 40325 systeminfo:entry Perl hostname() called syscall systeminfo()
0 40185 ioctl:entry Perl usage() called syscall ioctl()
0 40467 fstat64:entry Perl usage() called syscall fstat64()
0 40093 write:entry Perl usage() called syscall write()
0 40093 write:entry Perl usage() called syscall write()
0 40093 write:entry Perl usage() called syscall write()
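
Output like this gets long quickly, so a small wrapper that tallies it is handy. Below is a minimal sketch in Python (not part of the original setup; count_syscalls and LINE_RE are names I made up) that counts how often each Perl sub triggered each syscall, fed with the lines shown above:

```python
import re
from collections import Counter

# Matches the printf() format used by the D script:
#   "Perl <sub>() called syscall <name>()"
LINE_RE = re.compile(r"Perl (\S+)\(\) called syscall (\S+)\(\)")

def count_syscalls(lines):
    """Count (perl_sub, syscall) pairs in DTrace output lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

sample = """\
0 40463 stat64:entry Perl BEGIN() called syscall stat64()
0 40463 stat64:entry Perl BEGIN() called syscall stat64()
0 40463 stat64:entry Perl BEGIN() called syscall stat64()
0 40097 close:entry Perl BEGIN() called syscall close()
0 40325 systeminfo:entry Perl hostname() called syscall systeminfo()
0 40185 ioctl:entry Perl usage() called syscall ioctl()
0 40467 fstat64:entry Perl usage() called syscall fstat64()
0 40093 write:entry Perl usage() called syscall write()
0 40093 write:entry Perl usage() called syscall write()
0 40093 write:entry Perl usage() called syscall write()
"""

counts = count_syscalls(sample.splitlines())
for (sub, syscall), n in counts.most_common():
    print("%-10s %-12s %d" % (sub, syscall, n))
```

The same tally could be kept inside DTrace with an aggregation such as @[self->sub, probefunc] = count(); instead of the printf(), which avoids post-processing entirely.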

And here's the output of Alan B's example script:

[20071010-00:10:41]:[root@mako]:[~]# ./perlsubs2.d
dtrace: script './perlsubs2.d' matched 7 probes
0 2 :END 2 import /opt/perl/perl5.8.8/lib/5.8.8/warnings.pm
3 import /opt/perl/perl5.8.8/lib/5.8.8/strict.pm
6 BEGIN /opt/perl/perl5.8.8/lib/5.8.8/vars.pm
6 bits /opt/perl/perl5.8.8/lib/5.8.8/strict.pm
11 import /opt/perl/perl5.8.8/lib/5.8.8/AutoLoader.pm
25 import /opt/perl/perl5.8.8/lib/5.8.8/Exporter.pm
26 BEGIN /opt/perl/perl5.8.8/lib/5.8.8/i86pc-solaris/Sys/Hostname.pm
32 load /opt/perl/perl5.8.8/lib/5.8.8/i86pc-solaris/XSLoader.pm
62 AUTOLOAD /opt/perl/perl5.8.8/lib/5.8.8/i86pc-solaris/POSIX.pm
68 BEGIN /opt/perl/perl5.8.8/lib/5.8.8/warnings.pm
85 BEGIN ./spool-sizes.pl

This won't be useful at all. Tomorrow I'm going to try and get back to porting our MX dispatching software to Solaris. hdp says all the tests pass, so it should just be a matter of making sure each of the associated daemons work properly, have manifests, etc.

And then, the fun part: Writing a little something I've been referring to as mailflow.d...

October 13, 2007
October 15, 2007

Based on Albert Lee's howto:

[20071015-08:38:52]:[root@clamour]:[~]# uname -a
SunOS clamour 5.10 Generic_120012-14 i86pc i386 i86pc
[20071015-08:38:53]:[root@clamour]:[~]# zoneadm list -cv
0 global running / native shared
3 control running /export/zones/control native shared
4 lunix running /export/zones/lunix lx shared
[20071015-08:38:56]:[root@clamour]:[~]# zlogin lunix
[Connected to zone 'lunix' pts/5]
Last login: Mon Oct 15 12:37:28 2007 from zone:global on pts/4
Linux lunix 2.4.21 BrandZ fake linux i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

After I stop laughing hysterically, visions of collapsing Linux boxes into Solaris zones dancing through my twitching little mind, I'll have to see how twitchy the install itself is. Already it appears that some stuff is unhappy, though most of it seems to revolve around things that don't matter (ICMP oddities, console oddities wrt determining how smart it is for restarting services -sigh- and a few other easily surmountable or ignorable things).

Overall: Hello, awesome.

(Update: It appears that 6591535 makes this a non-starter. I am now, again, a very sad bda with a bunch of crappy hardware and nowhere to move their services to.)

November 1, 2007

So I have to say that out of all the features hyped in Leopard, Time Machine seems to be worthy of it. It really does make backups trivial.

I had a bunch of tarballs in ~/Desktop which were basically transient. I didn't want them backed up, but backed up they got, eating a couple gigs of space pointlessly. My initial reaction was to just rm them from the backup dirs, but I was told (as root) I could not. I don't really like being told I can't do things when I'm root, but ok, maybe there's other magic required. Maybe it's not as flat-filesystem-y as one is led to believe.

A quick Googling later, and the answer was thankfully trivial. The Time Machine UI is ... totally weird enough ... that I've been avoiding it. But nice to see it's so easy to do something so obvious.

So the "OpenSolaris Developer Preview" was released last night. I spent a few minutes with it, and it has generated a frankly ridiculous amount of controversy inside the community (to the point where my inbox has tripled in size). So what's the deal?

Well, it's actually a pretty decent first release. It has ZFS on root (awesome!), the Image Packaging System (which is way cool), and is almost as trivial to install as Ubuntu. People are whining about a bunch of nonsense (wah, the default shell, wah, no KDE in the first release, wah), but by far the biggest complaints center around Indiana taking the OpenSolaris name. This gets a big fat whatever from me.

I'm wondering if anyone complaining has actually read the FAQ.

Dennis Clarke has some screenshots over at Blastwave.

If you are thinking of giving it a try, you should probably read through the immigrants page. benr++

I installed it without issues in a Parallels VM. It panicked on boot a couple times, though I suspect that has more to do with Parallels than Solaris. I need to make an image of the compiler tools so I can get the NIC supported, but that is a pretty trivial thing. I suspect I will not bother and just build a workstation at the office and throw Indiana on there.

Overall I think this is a fairly exciting milestone for OpenSolaris. Their release schedule of every six months is encouraging, as it works very well for certain other high-quality projects. The barrier for adoption has fallen and now that code has been thrown over the wall, perhaps people can start contributing instead of ... not.

Well, once they get over accusing the Indiana guys of "stabbing the community in the back" and eating babies...

[bda@moneta]:[/usr/bin]$ uname -a
Darwin moneta.int.mirrorshades.net 9.0.0 Darwin Kernel Version 9.0.0: Tue Oct 9 21:35:55 PDT 2007; root:xnu-1228~1/RELEASE_I386 i386
[bda@moneta]:[/usr/bin]$ ./iopending
dtrace: failed to initialize dtrace: DTrace requires additional privileges
[bda@moneta]:[/usr/bin]$ sudo ./iopending 2
Tracing... Please wait.
2007 Nov 1 12:49:23, load: 0.34, disk_r: 0 KB, disk_w: 0 KB

value ------------- Distribution ------------- count
< 0 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 5618
1 | 0

2007 Nov 1 12:49:25, load: 0.31, disk_r: 8 KB, disk_w: 4 KB

value ------------- Distribution ------------- count
< 0 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2810
1 | 16
2 | 0

2007 Nov 1 12:49:27, load: 0.31, disk_r: 0 KB, disk_w: 400 KB

value ------------- Distribution ------------- count
< 0 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2873
1 | 0


[bda@moneta]:[/usr/bin]$ sudo find . -type f -exec grep -H '^#!/usr/sbin/dtrace' {} \; | wc -l
[bda@moneta]:[/usr/bin]$ sudo find . -type f -exec grep -H '^#!/usr/sbin/dtrace' {} \; | awk -F: '{print $1}'
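
For comparison, here is roughly what those two find/grep pipelines do, sketched in Python. The function name find_dtrace_scripts is mine, and like the grep pattern it matches any line that starts with the shebang string, not just the first line of a file:

```python
import os

SHEBANG = b"#!/usr/sbin/dtrace"

def find_dtrace_scripts(root):
    """Return sorted paths of regular files under root that contain a
    line starting with the DTrace shebang."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Mirror `find -type f`: skip symlinks and non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    if any(line.startswith(SHEBANG) for line in handle):
                        hits.append(path)
            except OSError:
                # Unreadable file (e.g. no permission); skip it.
                continue
    return sorted(hits)

# Example: scripts = find_dtrace_scripts("/usr/bin"); len(scripts) is the
# count, and the list itself is what the awk pipeline prints.
```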


The scripts living in /usr/bin is kinda weird. /usr/sbin is really a better place for them if they're going to live in the default $PATH.

MacTech has an introduction to DTrace which is really decent (except for the page autoreloading, ugh), and worth reading.

His investigation of ln and why it wouldn't let him create hardlink directories (yeah, OS X has .. hardlink directories support now; required for Time Machine, and a really elegant solution to the problem, but still kinda weird) is pretty entertaining.

If you're me, anyway.

November 3, 2007

onlamp has an interview with the OpenBSD devs on what's new in 4.2. Basically: Lots.

The highlights for me are doubled pf performance, IP load-balancing with CARP, and layer 7 hoststated support (with HTTP/SSL hackery). Marc Espie's continuing improvements to the package management system are also no doubt going to continue making my life easier.

All in all, a very exciting release! Go buy your CD!

November 16, 2007

I have an OpenSolaris box in pilot at the moment, running build 74. It uses Lori Alt's patched miniroot so I can set up a rootpool and do a profile (network, via Jumpstart) install. It works really well.

Yesterday the box went into a reboot loop, and as there appear to be issues with b74, I figured I would finally get around to learning how to use BFU (which is a change-aware wrapper around cpio that writes to /; it's not something you can back out of). But before I did that, I would need to figure out how to boot from a ZFS clone. If the BFU goes south, or if the new build bricks the box, I need a way to boot back into the old system. It's the poor man's LiveUpgrade, I suppose, but it's still way cool and (I think) much easier.

So that's the goal here: Take a snapshot of the current system, clone the snapshot so it's writable, and then upgrade the clone. This way we can BFU the system and still have a fallback in the event that the BFU fails, or the new OS/Net build bricks our box.

Tim Foster had already written a blog post about how easy this was, so I wasn't expecting to run into any problems.

First, grab the ON build tools and the BFU archives for the build you care about.

[root@octopus]:[~]# cd /tmp
[root@octopus]:[/tmp]# wget http://dlc.sun.com/osol/on/downloads/b75/SUNWonbld.i386.tar.bz2
[root@octopus]:[/tmp]# wget http://dlc.sun.com/osol/on/downloads/b75/on-bfu-nightly-osol-nd.i386.tar.bz2

You probably want to do that in /tmp (which is swap) so when you take your snapshots, big random files are not littering the filesystem forever.

Set up your build environment:

[root@octopus]:[/tmp]# bunzip2 on-bfu-nightly-osol-nd.i386.tar.bz2
[root@octopus]:[/tmp]# tar -xf on-bfu-nightly-osol-nd.i386.tar
[root@octopus]:[/tmp]# bunzip2 SUNWonbld.i386.tar.bz2
[root@octopus]:[/tmp]# tar -xf SUNWonbld.i386.tar 
[root@octopus]:[/tmp]# cd onbld/
[root@octopus]:[/tmp/onbld]# pkgadd -d . SUNWonbld 
[root@octopus]:[/tmp/onbld]# cd
[root@octopus]:[~]# export FASTFS="/opt/onbld/bin/i386/fastfs"
[root@octopus]:[~]# export GZIPBIN="/usr/bin/gzip"
[root@octopus]:[~]# export BFULD="/opt/onbld/bin/`uname -p`/bfuld"
[root@octopus]:[~]# export PATH="/opt/onbld/bin:/opt/onbld/bin/`uname -p`:$PATH"

Now we need to take a snapshot of our current rootfs, clone it so it is writable, and mount it. In my setup, the rootpool is a legacy mount, and anything under it is also going to inherit the legacy mount property.

[root@octopus]:[~]# zfs snapshot rootpool/b74@upgrade
[root@octopus]:[~]# zfs clone rootpool/b74@upgrade rootpool/b75
[root@octopus]:[~]# zfs set mountpoint=/rootpool/b75 rootpool/b75

Now it's time to do the actual upgrade. I ran into two very minor snags here. First, I don't have BIND installed, so I needed to pass -f to bfu. Secondly, I don't have D-BUS installed, and had to comment that check out of the bfu script. Once that's done, it goes off and does its thing happily.

Once the BFU has finished you'll be put into a safe environment with tools built to work regardless of how horribly the BFU may have messed up your system (not an issue here, as we aren't actually modifying our current rootfs). As soon as it's done, you'll need to resolve the conflicts it lists; thus far I have not had an issue with using Automated Conflict Resolution to merge those files.

[root@octopus]:[~]# bfu -f /tmp/archives-nightly-osol-nd/i386 /rootpool/b75
bfu# /opt/onbld/bin/acr /rootpool/b75

And that's it. Your clone has now been upgraded using BFU. Create a boot archive of the new BE and set it legacy again.

[root@octopus]:[~]# bootadm archive-update -R /rootpool/b75
[root@octopus]:[~]# zfs set mountpoint=legacy rootpool/b75
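
The steps above are the same every time, so they are easy to template. Here is a minimal sketch in Python that only builds and prints the command list for a given pool and pair of boot environments; it does not run anything. The function name bfu_clone_plan and its parameters are mine, and the commands are just the ones shown in this post:

```python
def bfu_clone_plan(pool, old_be, new_be, archives):
    """Return, in order, the commands used above to BFU a clone of old_be."""
    old_fs = "%s/%s" % (pool, old_be)
    new_fs = "%s/%s" % (pool, new_be)
    snapshot = "%s@upgrade" % old_fs
    mountpoint = "/%s" % new_fs
    return [
        ["zfs", "snapshot", snapshot],
        ["zfs", "clone", snapshot, new_fs],
        ["zfs", "set", "mountpoint=%s" % mountpoint, new_fs],
        # -f because BIND is not installed on this box (see above)
        ["bfu", "-f", archives, mountpoint],
        # acr is run from the post-BFU shell, not the normal one
        ["/opt/onbld/bin/acr", mountpoint],
        ["bootadm", "archive-update", "-R", mountpoint],
        ["zfs", "set", "mountpoint=legacy", new_fs],
    ]

plan = bfu_clone_plan("rootpool", "b74", "b75",
                      "/tmp/archives-nightly-osol-nd/i386")
for command in plan:
    print(" ".join(command))
```

Printing rather than executing is deliberate: bfu drops you into its own shell, so blindly running the list from a script would need more care than this sketch takes.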

You have a couple options for managing your boot environments at this point. You can either modify /rootpool/boot/grub/menu.lst yourself, or use Tim Foster's zfs-bootadm.sh to do it for you. The script relies on a property to determine which zfs fs are bootable, so you'll need to set that.

[root@octopus]:[~]# ./zfs-bootadm.sh
Usage: zfs-bootadm.sh [command]

where command is one of:
Creates a new bootable dataset as a clone
of the existing one.
Sets a bootable dataset as the next
dataset to be booted from.
Destroys a bootable dataset. This must not
be the active dataset.
Lists the known bootable datasets.

[root@octopus]:[~]# zfs set bootable:=true rootpool/b75
[root@octopus]:[~]# ./zfs-bootadm.sh list
b74 (current)
[root@octopus]:[~]# ./zfs-bootadm.sh activate b75
Currently booted from bootable dataset rootpool/b74
On next reboot, bootable dataset rootpool/b75 will be activated.
[root@octopus]:[~]# reboot

The box reboots, and...

[bda@moneta]:[~]$ ssh root@octopus
Last login: Fri Nov 16 02:50:21 2007 from
Sun Microsystems Inc. SunOS 5.11 snv_75 October 2007
bfu'ed from /tmp/archives-nightly-osol-nd/i386 on 2007-11-16
Sun Microsystems Inc. SunOS 5.11 snv_74 October 2007
[root@octopus]:[~]# uname -a
SunOS octopus 5.11 snv_75 i86pc i386 i86pc

Pretty dang cool stuff!

My initial test here was to BFU from b74 to b76. There was some fumbling about with where the menu.lst file was (I knew it was stored on the rootpool from reading Lori Alt's weblog and various presentations, but rootpool was a legacy mount, so stupid tired me was confused for a good ten minutes). The BFU and acr themselves appeared to be fine, and when I finally got the BE to boot, it... panicked.

I was somewhat discouraged, but booted right back into b74 and BFU'd happily to b75.

Which was the entire point of the exercise: To upgrade the system and have a safe way to fall back to a previous build if the system becomes unusable. As I said, it's the poor man's LiveUpgrade, but LU doesn't currently support zfsboot. And, really, this just seems much quicker and easier to deal with.

There are plenty of little things to figure out still (like which filesystems are required to be on the BE for the BFU to work, so I don't end up with data being snapshotted forever), how to deal with package upgrades, and the like. But overall... very, very cool.

Another thing to note is that everything above was gleaned not just from documentation but from the blogs of the developers.

November 22, 2007
December 15, 2007

I have been a big fan of Patch Check Advanced, as it makes patching Solaris systems not an incredible pain in the ass.

Noted on the news section there is pcapatch, which evidently aims to safely automate pca patch installation.

I suspect I might be a big fan of that as well.

December 16, 2007

To be clear, my understanding of everything I'm about to say is very basic. It's all built on implementing work others did a few months ago, and reading up last night and this morning. If I say something ridiculous, I call nubs.

(As an aside, it appears that TCL supports USDT probes; news to me!)

Bryan Cantrill mailed me the other day after finding my previous post regarding DTrace and Perl via a post by Sven Dowideit. Bryan noted that Alan's patch pre-dated Adam Leventhal's work on is-enabled probes, which are highly useful for dynamic languages: Code is only executed when DTrace is actively tracing a given probe.

When it isn't, there should be no perf hit; the caveat seems to be that when tracing is enabled when using is-enabled probes, the hit is going to be higher than the previous standard static probes.
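
To make the difference concrete, here is a toy analogy in Python. It is not a DTrace binding and nothing in it is a real API; Probe, describe_sub and the two enter_sub_* functions are invented purely to show where the argument setup cost lands in each style:

```python
class Probe:
    """Toy stand-in for a USDT probe; not a real DTrace interface."""
    def __init__(self):
        self.enabled = False  # would flip when a consumer starts tracing
        self.fired = []

    def fire(self, *args):
        if self.enabled:
            self.fired.append(args)

setup_calls = {"count": 0}

def describe_sub(name):
    """Pretend this is the costly argument setup (sub name, file, line)."""
    setup_calls["count"] += 1
    return (name, "script.pl", 1)

def enter_sub_standard(probe, name):
    # Standard static probe: the arguments are prepared on every call,
    # whether or not anything is tracing.
    probe.fire(*describe_sub(name))

def enter_sub_is_enabled(probe, name):
    # is-enabled probe: check first, and only then pay for the arguments.
    if probe.enabled:
        probe.fire(*describe_sub(name))

probe = Probe()  # nobody is tracing

for _ in range(1000):
    enter_sub_standard(probe, "usage")
standard_cost = setup_calls["count"]

setup_calls["count"] = 0
for _ in range(1000):
    enter_sub_is_enabled(probe, "usage")
is_enabled_cost = setup_calls["count"]

print(standard_cost, is_enabled_cost)
```

With tracing off, the standard style still does the setup work on every call and the is-enabled style does none. With tracing on, the is-enabled style pays for the extra check on top of the setup, which matches the caveat above.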

In the current state of DTrace in Perl (as far as I am aware), there are only two probes: sub-entry and sub-return. Compare to Joyent's work on Ruby, which has about a dozen (the diff for Ruby is over 20,000 lines, though, so obviously there's a lot more going on than just throwing some USDT probes in). When you are only interested in seeing what objects are being destroyed, for instance, you don't want to have the function probe toggled.

So this morning after reading a very helpful USDT example, I went ahead and modified Alan Burlison's patch for is-enabled probes.

[20071216-10:56:50]:[bda@drove]:[~/dtrace/perl]$ diff -u perl-5.8.8-dt-alanb/cop.h perl-5.8.8-isenabled/cop.h
--- perl-5.8.8-dt-alanb/cop.h Sat Dec 15 17:15:14 2007
+++ perl-5.8.8-isenabled/cop.h Sun Dec 16 10:56:49 2007
@@ -126,6 +126,7 @@
* decremented by LEAVESUB, the other by LEAVE. */

#define PUSHSUB_BASE(cx) \
CopFILE((COP*)CvSTART(cv)), \
CopLINE((COP*)CvSTART(cv))); \
@@ -180,6 +181,7 @@

#define POPSUB(cx,sv) \
PERL_SUB_RETURN(GvENAME(CvGV((CV*)cx->blk_sub.cv)), \
CopFILE((COP*)CvSTART((CV*)cx->blk_sub.cv)), \
CopLINE((COP*)CvSTART((CV*)cx->blk_sub.cv))); \

Yeah, that was really it. I know, right?

So, now, what do my numbers look like for running the Perl test suite?

Note that all I'm doing is firing on sub-entry and sub-return with no other processing, in destructive mode (otherwise DTrace bottoms out due to systemic unresponsiveness).

static libperl, unpatched:

real 5m42.162s
user 2m28.597s
sys 0m30.161s

dynamic libperl, unpatched:

real 6m31.771s
user 3m16.823s
sys 0m31.698s

dynamic libperl, patched, standard probes, not instrumented:

real 6m33.610s
user 3m12.911s
sys 0m33.445s

dynamic libperl, patched, standard probes, instrumented:

real 9m1.302s
user 3m15.186s
sys 2m47.087s

dynamic libperl, patched, is-enabled probes, not instrumented:

real 6m44.823s
user 3m18.589s
sys 0m43.765s

dynamic libperl, patched, is-enabled probes, instrumented:

real 9m27.597s
user 3m16.791s
sys 3m6.972s

Not that big of a difference, really.
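
For a slightly less hand-wavy comparison, here is the arithmetic on the patched builds, as a small Python sketch (helper names are mine; the numbers are copied from the runs above):

```python
def seconds(minutes, secs):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + secs

# (real, sys) in seconds for the patched dynamic builds above
runs = {
    "standard, idle":     (seconds(6, 33.610), seconds(0, 33.445)),
    "standard, traced":   (seconds(9, 1.302),  seconds(2, 47.087)),
    "is-enabled, idle":   (seconds(6, 44.823), seconds(0, 43.765)),
    "is-enabled, traced": (seconds(9, 27.597), seconds(3, 6.972)),
}

def slowdown_pct(idle, traced):
    """Wall-clock slowdown of the traced run over the idle run, in percent."""
    return (traced - idle) / idle * 100.0

results = {}
for kind in ("standard", "is-enabled"):
    idle_real, idle_sys = runs[kind + ", idle"]
    traced_real, traced_sys = runs[kind + ", traced"]
    results[kind] = (slowdown_pct(idle_real, traced_real),
                     traced_sys - idle_sys)
    print("%-10s real +%.1f%%, extra sys time %.0fs"
          % (kind, results[kind][0], results[kind][1]))
```

Both styles land in the high-30s to 40% range once instrumented, and nearly all of the extra wall-clock time shows up as sys time. These are single runs, so a couple of percent either way is probably noise.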

What's really interesting (to me, anyway) about the above is how dynamic libperl and both sets of patches take basically the same amount of time to complete. In my previous "tests" the patched build took an extra ~40s, as opposed to ~10s here. Here I am using Sun Studio 12; previously I had been using gcc. I imagine that might make a difference.

I suspect, though, that a number of further factors are at play: the fact that the Perl test suite's behavior is (hopefully?) nothing remotely akin to what you'd see in production, the fact that we're only instrumenting a single set of probes as opposed to having entry points in other places for comparison... Most importantly, though, I imagine whatever changes were made to Ruby might have analogies here as well.

Still, I'm interested enough now to start digging through Joyent's Ruby diff and investigating Perl's internals to determine other probe points.

Maybe in a week or so I'll have something worth showing off to #p5p as Rik suggests.

Or my C ignorance will bite me horribly and I'll be forced to commit seppubukkake to save.. face?

December 17, 2007
January 3, 2008

Ben Rockwood expounds upon the joys of IPMI.

As someone who was only using it to reboot his systems (and configure the SP when I'd forgotten to do so during build), it's a pretty enlightening article.

January 20, 2008

Transactional Debian Upgrades with ZFS on Nexenta

Bloody amazing is what that is. Not because the concept is revolutionary (it's been possible with hacked ONNV installs for a while now, Indiana is doing something similar, and a few "other" operating systems have had similar capabilities), but because it's integrated and the interface itself is so obvious. It looks as easy to use as apt(8) and zfs(1M).

Very exciting stuff.

June 2, 2008

The other day I ran into an issue where bootstrapping pkgsrc 2008q1 would hang while running bmake regression tests.

The fix is here.

June 6, 2008
June 23, 2008

[20080622-13:36:33]:[bda@mako]:[~]$ pfexec pkg refresh
[20080622-13:43:58]:[bda@mako]:[~]$ pfexec pkg install pkg:/SUNWipkg@0.5.11,5.11-0.91
Completed 1/1 93/93 0.84/0.84

Removal Phase 2/2
Update Phase 87/87
Install Phase 9/9
[20080622-13:48:44]:[bda@mako]:[~]$ pfexec pkg image-update
Completed 547/547 5585/5585 504.11/504.11

Removal Phase 3098/3098
Update Phase 7617/7617
Install Phase 3367/3367
A clone of opensolaris-1 exists and has been updated and activated. On next boot the Boot Environment opensolaris-2 will be mounted on '/'. Reboot when ready to switch to this updated BE.
[20080622-13:52:38]:[bda@mako]:[~]$ beadm list

BE Active Active on Mountpoint Space
Name reboot Used
---- ------ --------- ---------- -----
opensolaris-2 no yes - 5.25G
opensolaris-1 yes no legacy 89.5K
opensolaris no no - 59.10M
[20080622-13:52:41]:[bda@mako]:[~]$ zfs list
rpool 7.18G 63.7G 61K /rpool
rpool@install 19.5K - 55K -
rpool/ROOT 5.31G 63.7G 18K /rpool/ROOT
rpool/ROOT@install 15K - 18K -
rpool/ROOT/opensolaris 59.1M 63.7G 2.41G legacy
rpool/ROOT/opensolaris-1 89.5K 63.7G 2.57G legacy
rpool/ROOT/opensolaris-1/opt 0 63.7G 595M /opt
rpool/ROOT/opensolaris-2 5.25G 63.7G 2.78G legacy
rpool/ROOT/opensolaris-2@install 5.83M - 2.22G -
rpool/ROOT/opensolaris-2@static:-:2008-06-09-19:03:02 110M - 2.41G -
rpool/ROOT/opensolaris-2@static:-:2008-06-22-17:17:20 532M - 2.57G -
rpool/ROOT/opensolaris-2/opt 595M 63.7G 595M /opt
rpool/ROOT/opensolaris-2/opt@install 72K - 3.60M -
rpool/ROOT/opensolaris-2/opt@static:-:2008-06-09-19:03:02 0 - 595M -
rpool/ROOT/opensolaris-2/opt@static:-:2008-06-22-17:17:20 0 - 595M -
rpool/ROOT/opensolaris/opt 33K 63.7G 595M /opt
rpool/data 18K 63.7G 18K /rpool/data
rpool/export 1.87G 63.7G 19K /export
rpool/export@install 15K - 19K -
rpool/export/home 1.87G 63.7G 1.87G /export/home
rpool/export/home@install 19K - 21K -
[20080622-13:52:51]:[bda@mako]:[~]$ init 6

Well... that's easy.

August 7, 2008

Building A Solaris Cluster Express Cluster in VirtualBox

Pretty interesting stuff. VBox on OS X is not incredibly useful to me (the lack of host networking is a killer), but I run OpenSolaris on my desktop at work.

Very cool stuff.

So for a while now I've been struggling with an older Xeon system which becomes more and more unresponsive until it finally hangs, when under a moderate amount of I/O load.

I asked zfs-discuss@ about it, and received a very helpful response from Marc Bevand.

Now the kernel heap bounces between 1.2GB (idle) and 1.4GB (loaded). The ARC has maxed around 400MB, but I haven't been doing any major reads off the box yet, just a lot of write I/O, so I don't think that's particularly surprising.


(This experience really reminds me that I need to re-read Solaris Internals. I could have solved this problem myself, if I refreshed on those books periodically.)

August 13, 2008

So this morning has been... annoying.

A box was rebooted and didn't come back up. Network came up (pingable) but not ssh. Based on previous idiocy with this system, I suspected it had something to do with filesystems not being able to mount at boot. I shot off a mail to the NOC monkeys, not expecting much (and four hours later, still no response from them), and then started trying to get into the system myself to fix it.

The box in question is a Sun X4150; a really nice system (though now that I've had a T5120 for a while, I have to say I really do much prefer SPARCs simply for ease of administration), with a really lame-ass LOM (ELOM). But: Whatever. So I go to start the console via the LOM... no joy. Apparently console is not redirecting. So, ok, I should be able to get at glass (thanks for the reminder, dlg) via the web interface.

Of course there's no VPN at that site. So I kick open netcat and don't have much in the way of luck. After a few minutes of screwing around with it, I give up and download haproxy. In about three minutes I have it compiled, configured, and forwarding :80 and :443 for me.

listen proxy1
mode http
balance roundrobin
server test
contimeout 3000
clitimeout 150000
srvtimeout 150000
maxconn 60000
retries 3
grace 3000
option forwardfor
option httplog
option dontlognull

listen ssl-relay
option ssl-hello-chk
balance source
server inst1

I log into the LOM, start the redirection Java app, and... nothing.

And... Mac OS X Java bullshit.

So I start an old Parallels OpenSolaris image I had laying around, connect to the LOM that way and... get an I/O connection error. Figuring that the KVM was running on another port, I sniffed off my firewall and discovered that yes, it wanted :8890 as well.

listen ssl-relay
option ssl-hello-chk
balance source
server inst1

Did that, got into the box and discovered the problem was...

[20080813-05:43:12]:[root@brood]:[~]# tail -2 /etc/vfstab
/dev/zvol/dsk/data/zones/lb-arc/root /dev/zvol/rdsk/data/zones/lb-arc/root /zones/lb-arc ufs 1 yes logging
/dev/zvol/dsk/data/zones/lb-arc/root /dev/zvol/rdsk/data/zones/lb-arc/root /zones/lb-arc ufs 1 yes logging


A svcadm clear filesystem/local later, and all was well.
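
If the problem was what it looks like above (the same filesystem listed twice), it is the kind of thing a quick check before a reboot would catch. Here is a minimal sketch in Python; duplicate_mount_points is a name I made up, and it assumes the usual vfstab layout where the third field is the mount point:

```python
from collections import defaultdict

def duplicate_mount_points(vfstab_text):
    """Map each mount point listed more than once to its line numbers."""
    seen = defaultdict(list)
    for lineno, line in enumerate(vfstab_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        fields = stripped.split()
        if len(fields) < 3:
            continue  # malformed line; not this check's business
        mount_point = fields[2]  # device, raw device, mount point, ...
        if mount_point == "-":
            continue  # swap entries have no mount point
        seen[mount_point].append(lineno)
    return {mp: nums for mp, nums in seen.items() if len(nums) > 1}

sample = """\
/dev/zvol/dsk/data/zones/lb-arc/root /dev/zvol/rdsk/data/zones/lb-arc/root /zones/lb-arc ufs 1 yes logging
/dev/zvol/dsk/data/zones/lb-arc/root /dev/zvol/rdsk/data/zones/lb-arc/root /zones/lb-arc ufs 1 yes logging
"""

duplicates = duplicate_mount_points(sample)
for mount_point, lines in sorted(duplicates.items()):
    print("%s listed on lines %s" % (mount_point, lines))
```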


August 27, 2008

Recently I moved our x86-64 pkgsrc build zone to another system. When I did so, I had forgotten I had built the original zone as full, to get around an annoying install(1M) bug. Basically, when you tried to build a package, it would attempt to recursively mkdir /usr/pkg. On sparse zones, /usr is shared read-only from the global zone.

So the install would fail, because it couldn't create /usr for obvious reasons. At the time, I thought I had tried various install programs, but given that the problem was being re-addressed and I didn't feel like reprovisioning a zone, I figured I would tackle it again.

After some minor discussion on #pkgsrc and grepping through mk/ I "discovered" the following variable:

TOOLS_PLATFORM.install?= /usr/pkg/bin/ginstall

Added to mk.conf and all is good. Mainly because ginstall actually uses mkdir -p, so...

The contents of pkgsrc/mk/platform/ are very useful if you aren't on NetBSD.

November 1, 2008

Solaris 10 10/08 (Update 6) was released yesterday. Release notes here.

I grabbed SPARC media and headed down to the colo yesterday to reinstall our T5120 (previously running b93). Fed the media in, consoled in via the SP, booted the system, and then left.

From much more comfortable environs, I got the system installed (service processors really are the best thing ever) without issue, and then, thanks to hilarity with my laptop, lost the randomized password I'd set for root. So whatever, I boot single-user and ... get asked for root's password. This is very similar to most Linux single-user boots these days, and more recently OpenSolaris.

I really, really didn't expect Solaris to follow suit. At least not for .. a while.

Very annoying. At dlg's suggestion, I tried booting -m milestone=none, but still had no joy. Ended up just booting cdrom -s and munging /etc/shadow that way.

Very annoying.

Anyway, having ZFS root in Solaris GA is pretty great. There are a number of really awesome features putback from Nevada this release, along with zfsboot. Check out the release notes. Good stuff.


Ceri Davies corrects me:

Just a note, because it sounds as if you think otherwise, that this behaviour has been present since at least update 3; ie. at least two years. You can turn it off by creating /etc/default/sulogin with the line PASSREQ=NO.

I don't recall seeing this behavior with u4 or u5, so evidently I am a crazy person. Thanks to Ceri for the info.

See sulogin(1M) for further details.

November 12, 2008

Finally got around to doing a Jumpstart for 10/08 today. After one little hitch (u6 renames the cNdN devices in my X2100s to the more proper cNtNdN), it all worked as expected.

fdisk c1t0d0 solaris delete
fdisk c1t1d0 solaris delete

fdisk c1t0d0 solaris all
fdisk c1t1d0 solaris all

install_type initial_install
pool rpool auto auto auto mirror c1t0d0s0 c1t1d0s0
bootenv installbe bename sol10u6

Yay, ZFS root!

December 12, 2008
December 27, 2008
March 1, 2009

Over the last two weeks we (read: rjbs) migrated our Subversion repositories to git on GitHub. I was not very pleased with this for the first week or so. By default, I am grumpy when things that (to me) are working just fine are changed, especially at an (even minor) inconvenience to me. That is just the grumpy, beardy sysadmin in me.

After a bit more talking to by rjbs, things are again smooth sailing. I can do the small amount of VCS work I need to do, and more importantly: I am assured things I don't care about will make the developers' lives much, much less painful, which is something I am certainly all for.

git is much faster than Subversion ever was, and I can see some features as being useful to me eventually. Overall, though, what I use VCS for is pretty uninteresting, so I don't have much else to say about it.

I had a couple basic mental blocks that rjbs was able to explain away in a 20 minute talk he gave during our bi-weekly iteration meeting. It was quite productive. There are pictures.

Work has otherwise consisted of a lot of consolidation. I have finally reduced the number of horrible systems to two. Yes. Two. Both of which are slated for destruction in the next iteration. Not only that, I have found some poor sucker (hi, Cronin!) to take them all off our hands. Of course, they'll be upgrading from PIIIs, so...

I also cleaned up our racks. A lot. They are almost clean enough to post pictures of, though I'll wait until I've used up more of the six rolls of velcro Matt ordered before doing that.

Pretty soon we'll have nothing but Sun, a bit of IBM, and a very small number of SuperMicros. My plans are to move our mail storage from the existing SCSI arrays to a Sun J4200 (hopefully arriving this coming week). 6TB raw disk, and it eats 3.5" SATA disks, which are ridiculously cheap these days. I really, really wanted an Amber Roads (aka OpenStorage) J7110, but at 2TB with the cost of 2.5" SAS, it was impossible to justify. If they sold a SATA version at the low-end... there has been some noise about conversion kits for Thumpers, but that's also way outside our price range.

I doubt conversion support will become more common, but if I could turn one of our X4100s and the J4200 into an OpenStorage setup, I would be incredibly happy. If you haven't tried out the OpenStorage Simulator, I suggest you do so. Analytics is absolutely amazing.

People on zfs-discuss@ and #opensolaris have been talking about possible GSoC projects. I suggested a zpool/filesystem "interactive" attribute, or "ask before destroy." However you want to think of it. Someone else expanded on that, suggesting that -t be allowed to ensure that only specified resource types can be destroyed. I have yet to bone myself with a `zfs destroy` or `zpool destroy` but the day will come, and I will cry.
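
Until something like that exists, a wrapper can fake it. This is a rough sketch and not the proposed attribute: the function name `zfs_destroy_ask` and the `ZFS_CMD` override are made up for illustration, and it only covers the plain `zfs destroy <dataset>` form.

```shell
# Make the caller type the dataset name back before destroying it.
# ZFS_CMD is a hypothetical override for testing (e.g. ZFS_CMD=echo).
zfs_destroy_ask() {
    dataset="$1"
    printf 'Really destroy %s? Type the name to confirm: ' "$dataset" >&2
    read -r answer || answer=""
    if [ "$answer" = "$dataset" ]; then
        ${ZFS_CMD:-zfs} destroy "$dataset"
    else
        echo "not destroying $dataset" >&2
        return 1
    fi
}
```

Typing the name rather than y/n is deliberate: muscle memory will happily answer "y" to anything.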

I see a pkgsrc upgrade in my near future. I've been working on linking all our Perl modules against it, and I want to get the rest of our internal code linking against it as well. It will make OS upgrades so, so much easier. Right now, most code is either linked to OS libraries or to an internal tree (most of which also links to OS libraries).

We've almost gotten rid of all our Debian 3.1 installs, which is... well. You know. Debian 5.0 just came out, and we've barely gotten moved to 4.0 yet. Getting the upgrade path there sorted out will thankfully just be tedious, and require nothing clever.

I really hope that the Cobbler guys get Debian partitioning down soon, and integrate some Solaris support. I tried redeploying FAI over Christmas and man, did it so not work out of the box. I used to use FAI, and was quite happy with it. I had to hack it up, but... it worked pretty well. Until it stopped.

If Cobbler had Solaris support, I would seriously consider moving our remaining Linux installs to CentOS. We use puppet already, so in many ways Cobbler is a no-brainer. We are not really tied to any particular Linux distribution, and having all our infrastructure under a single management tool's ken would be really nice. To put it mildly.

30% curious about OpenSolaris's Automated Installer project, but it's so far off the radar as to be a ghost.

I picked up John Allspaw's The Art of Capacity Planning, and it's next on my book queue. Flipping through it makes me think it's going to be as useful as Theo S.'s Scalable Internet Architectures.

March 18, 2009

So Linux has a history of hosed db interfaces. Apache worked around this about ten years ago by including their own SDBM in their distribution.

pkgsrc separates their Apache packages into DSOs. So mod_perl, mod_fastcgi, mod_ssl, etc, are built as separate packages. However, when you compile Apache1 with no SSL, it disables SDBM, so mod_ssl (which requires some sort of DBM) fails.

The PR is here.

My workaround was to do this:

ap-ssl$ bmake patch

ap-ssl$ vi /usr/pkg/pkgsrc/www/ap-ssl/work/mod_ssl-2.8.31-1.3.41/pkg.sslmod/libssl.module

Search for the first instance of APXS.

Add the following two lines above it:



And ap-ssl will compile happily.

April 1, 2009

So I have a device failing in one of my zpools:

extended device statistics ---- errors ---
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 fd0
0.0 2.0 0.0 8.0 0.0 0.0 0.0 0.1 0 0 1 0 0 1 c0t0d0
0.0 2.0 0.0 8.0 0.0 0.0 0.0 0.1 0 0 1 0 0 1 c0t1d0
0.0 0.0 0.0 0.0 0.0 10.0 0.0 0.0 0 100 1 3 4 8 c0t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c0t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t4d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t5d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c2t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c3t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 6 2 0 8 c4t0d0
extended device statistics ---- errors ---
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 fd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c0t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c0t1d0
0.0 0.0 0.0 0.0 0.0 10.0 0.0 0.0 0 100 1 3 4 8 c0t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c0t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t3d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t4d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 1 0 0 1 c1t5d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c2t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 c3t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 6 2 0 8 c4t0d0


It's part of a mirror:

pool: tank
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: none requested

        tank        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t2d0  ONLINE       0     6     2
            c0t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0

errors: No known data errors

So I reckon I'll just offline it and go replace it.

[20090401-17:20:12]::[root@shoal]:[~]$ zpool offline tank c0t2d0
cannot offline c0t2d0: no valid replicas

err... what?

So I detach it from the mirror instead, which does work.

I ask jmcp if he has any insight into why this might be, and after a few minutes he asks if disconnecting the device works.

[20090401-18:01:57]::[root@shoal]:[~]$ cfgadm -c disconnect c0::dsk/c0t2d0
cfgadm: Hardware specific failure: operation not supported for SCSI device

So that's the culprit, I think. A disconnect is implicit when doing a zpool offline?

Not a good error to throw back to the user, either.
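
For what it's worth, the detach route plus re-attaching after the swap would go roughly like this. A sketch only: the `run` helper and the DRY_RUN variable are my own convention (it prints the commands unless DRY_RUN=0), and it assumes the new disk shows up under the same device name as the old one.

```shell
# Dry-run helper (my own convention): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Swap a failing mirror member: detach it, wait for the physical swap,
# then attach the new disk to the surviving side so ZFS resilvers onto it.
swap_mirror_disk() {
    pool="$1"; good="$2"; bad="$3"
    run zpool detach "$pool" "$bad" || return 1
    if [ "${DRY_RUN:-1}" = "0" ]; then
        printf 'Swap the disk, then press Enter: ' >&2
        read -r _ignored
    fi
    run zpool attach "$pool" "$good" "$bad" || return 1
    run zpool status "$pool"
}
```

For the pool above that would be `swap_mirror_disk tank c0t3d0 c0t2d0`.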

April 8, 2009

I've been meaning to blog this for a while. Very useful in Jumpstart finish scripts.

eeprom console=ttyb
eeprom ttyb-mode="115200,8,n,1,-"
echo "name=\"asy\" parent=\"isa\" reg=1, 0x2f8 interrupts=3;" >> /kernel/drv/asy.conf
svccfg -s system/console-login setprop ttymon/label = 115200
svcadm refresh system/console-login
svcadm restart system/console-login
perl -pi -e 's/^splashimage/#splashimage/' /rpool/boot/grub/menu.lst
perl -pi -e 's/\$ZFS-BOOTFS$/\$ZFS-BOOTFS,console=ttyb/' /rpool/boot/grub/menu.lst
bootadm update-archive


April 16, 2009

A nice high-level writeup by OmniTI's Mark Harrison on Zones, ZFS, and Zetaback.

[via Theo S.]

July 1, 2009

Someone on Sun managers asked for advice on moving from Linux to Solaris and tips on living with Solaris in general. I guess I kind of have a lot to say about it, actually..

One thing I forgot to mention is using SMF. You may have two software repositories (Sun's and pkgsrc), but you only want one place to manage the actual services. Write SMF manifests! It's easy, and you can use puppet to manage it all.

From: Bryan Allen <bda@mirrorshades.net>
To: Jussi Sallinen
Subject: Re: Looking for tips: Migrating Linux>Solaris10
Reply-To: bda@mirrorshades.net
In-Reply-To: <20090624113312.GA32749@unikko>
AIM: packetdump

| On 2009-06-24 14:33:12, Jussi Sallinen wrote:
| Im new to Solaris and about to start migrating Linux (Gentoo) based E450 server
| to V240 Solaris 10.
| Currently running:
| -Apache2
| -Postfix
| -Dovecot
| -MySQL
| About 70 users using WWW and email services.
| So, to the point:
| In case you have tips and tricks, or good to know stuff please spam me with
| info regarding migration.

A quick note: I work for a company where I migrated all our services from Linux
on whiteboxes to Solaris 10 on Sun hardware. It was a major effort, but
garnered us many benefits:

* Consolidation. Thanks to the faster hardware and Zones, we are down from 50+
Linux boxes to a dozen Sun systems. And for honestly not that much money.
* Much greater introspection (not only mdb or DTrace; the *stat tools are
just that much better)
* Before ZFS, we were mostly sitting on reiserfs (before my time) and XFS
(which I migrated as much as I could to before getting it on ZFS). ZFS has
been a huge, huge win in terms of both reliability and availability.

This turned out to be quite an article, but here are some "quick" thoughts on
using Solaris particularly, and systems administration in general:

* Read the System Administrator Guides on docs.sun.com if you are new to
Solaris.
* No, seriously. Go read them. They are incredibly useful and easy to parse.
* Follow OpenSolaris development, either via the mailing lists or #opensolaris
on freenode. This gives you a heads-up on stuff that might be getting into
the next Solaris 10 Update, so you can plan accordingly.

* Use a ZFS root instead of UFS (text installer only, but you really want to
use JET -- see below)
* Use rpool for operating system and zoneroots only
* Set up a tank pool on separate disks
* Delegate tank/filesystems to zones doing the application work

This minimizes the impact of random I/O on the root disks for data and vice
versa (just a good practice in general, but some people just try to use a
single giant pool).

It also negates the issue where one pool has become full and is spinning
platters looking for safe blocks to write to, impacting the operating system or
application data.

* Use Martin Paul's pca for patching

The Sun patching tools all suck. pca is good stuff. You get security and
reliability patches for free from Sun; just sign up for a sun.com account.

You don't usually get new features from free patches (you do for paid patches),
but regardless all patches are included in the next system Update.

* Learn to love LiveUpgrade

With ZFS roots, LiveUpgrade became a lot faster to use. You don't have a real
excuse anymore for not building an alternative boot environment when you are
patching the system.

Some patches suck and will screw you. Being able to reboot back into your
previous boot environment is of enormous use.
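
A sketch of that workflow, assuming a ZFS root (so lucreate needs nothing more than a name). The `run` helper and DRY_RUN variable are my own convention, and the BE name, patch directory and patch ID in the usage line are placeholders, not real patches.

```shell
# Dry-run helper (my own convention): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Patch a fresh boot environment instead of the running one.
# be: name for the new BE; patchdir: directory holding unpacked patches;
# remaining arguments: the patch IDs to apply.
patch_new_be() {
    be="$1"; patchdir="$2"; shift 2
    run lucreate -n "$be" || return 1
    run luupgrade -t -n "$be" -s "$patchdir" "$@" || return 1
    run luactivate "$be" || return 1
    echo "# now reboot with: init 6"
}
```

Usage would be something like `patch_new_be patched-20090701 /var/tmp/patches 123456-01`. If the new environment misbehaves, luactivate the old one and reboot again.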

* Use NetBSD's pkgsrc

Solaris 10 lacks a lot of niceties you and your users are going to miss.
screen, vim, etc. You can use Blastwave, but it has its own problems. pkgsrc
packages will compile basically everything without a problem; they are good
quality, easy to administer, and easy to upgrade.

If you aren't doing this on a single box, but several machines, you would have
a dedicated build zone/host, and use PKG_PATH to install the packages on other
systems. Since you are using a single machine, see below about loopback
mounting the pkgsrc directory into zones: Compile once, use everywhere.

The services you listed are available from pkgsrc and work fine. The one thing
you might want to consider instead is using Sun's Webstack and the MySQL
package, as they are optimized for Solaris and 64bit hardware.

In addition to the above, we use pkgsrc on our (dwindling number of) remaining
Linux hosts. It means we have a *single version* of software that may be
running on both platforms. It segments the idea of "system updates" and
"application updates" rather nicely with little overhead.

* Use Solaris Zones

Keep the global zone as free of user cruft as possible. If you segment your
services and users properly, zones make it incredibly easy to see what activity
is going on where (prstat -Z).

It also makes it easy to manage resources (CPU, RAM) for a given set of
services (you can do this with projects also, but to me it's easier to do at
the zone level).

Install all your pkgsrc packages in the global zone and loopback mount it in
each zone. This saves on space and time when upgrading pkgsrc packages. It also
means you have one set of pkgsrc packages to maintain, not N. It's the same
concept as...

* Use Sparse Zones

They are faster to build, patch and manage than full root zones. If you have
recalcitrant software that wants to write to something mounted read-only from
the global zone, use loopback mounts within the global zone to mount a zfs
volume read-write to where it wants (e.g., if something really wants to write
to /usr/local/yourface).

I also install common software in the global zone (e.g., Sun's compiler,
Webstack or MySQL) and then loopback mount the /opt directory into each zone
that needs it (every zone gets SSPRO).

* Delegate a ZFS dataset to each zone

This allows the zone administrator to create ZFS filesystems inside the zone
without asking the global admin. Something like rpool/zones/www1/tank. It's
easier to manage programmatically too, if you are using something like Puppet
(see below) to control your zones. You only have to edit a single class (the
zones) when migrating the zone between systems.
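
The loopback mounts and the dataset delegation from the last two items both come down to short zonecfg invocations. A sketch, not our actual tooling: the function names, the `run` helper and DRY_RUN are made up here, and www1 is just an example zone name.

```shell
# Dry-run helper (my own convention): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Loopback-mount a global-zone directory read-only at the same path in a zone.
zone_add_lofs() {
    zone="$1"; dir="$2"
    run zonecfg -z "$zone" "add fs; set dir=$dir; set special=$dir; set type=lofs; add options [ro]; end"
}

# Create a dataset and delegate it to a zone.
# Assumes the parent dataset already exists.
zone_delegate_dataset() {
    zone="$1"; dataset="$2"
    run zfs create "$dataset" || return 1
    run zonecfg -z "$zone" "add dataset; set name=$dataset; end"
}
```

For example: `zone_add_lofs www1 /opt` and `zone_delegate_dataset www1 rpool/zones/www1/tank`. As far as I know the zone only picks up either change the next time it boots.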

* Use ZFS Features

No, really. Make sure your ZFS pools are in a redundant configuration! ZFS
can't automatically repair file errors if it doesn't have another copy of the

But: ZFS does more for you than just checksumming your data and ensuring it's
valid. You also have compression, trivial snapshots, and the ability to send
those snapshots to other Solaris systems.

Writing a script that snapshots, zfs sends | ssh host zfs recvs is trivial. I
have one in less than 50 lines of shell. It gives you streaming, incremental
backups with basically no system impact (depending on your workload,

Note that if disk bandwidth is your major bottleneck, enabling compression can
give you a major performance boost. We had a workload constantly
rewriting 30,000 sqlite databases (which reads the file into memory, creates
temp files, and writes the entire file to disk -- which are between 5MB and
2GB). It was incredibly slow until I enabled compression, which gave us a 4x
write boost.

You can also delegate ZFS filesystems to your users. This lets them take a
snapshot of their homedir before they do something scary, or whatever.
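
The core of a snapshot-and-send script is tiny. This is not my actual script, just a sketch of the idea: it prints the commands for one run so you can eyeball them before piping them to sh. The function name, argument order and the `recv -F` on the far side are choices made for this example, and it assumes names without spaces.

```shell
# Print the commands for one snapshot-and-send run.
# Arguments: source filesystem, destination host, destination filesystem,
# new snapshot name, and (optionally) the previous snapshot name.
# Without a previous snapshot the first, full stream is sent.
zfs_backup_cmds() {
    fs="$1"; dest_host="$2"; dest_fs="$3"; now="$4"; prev="${5:-}"
    echo "zfs snapshot $fs@$now"
    if [ -n "$prev" ]; then
        # Incremental: only the blocks changed since the previous snapshot.
        echo "zfs send -i $fs@$prev $fs@$now | ssh $dest_host zfs recv -F $dest_fs"
    else
        echo "zfs send $fs@$now | ssh $dest_host zfs recv -F $dest_fs"
    fi
}
```

Something like `zfs_backup_cmds tank/mail backuphost backup/mail 20090701 20090630 | sh` would then do the incremental run. The real work in a script like this is remembering which snapshot was sent last and pruning old ones.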

* Use the Jumpstart Enterprise Tool

Even though you only have one Solaris system, if you're new to Solaris, the
chances are you're going to screw up your first couple installs. I spent months
trying to get mine just the way I wanted. And guess what, installing Solaris is
time-consuming and boring.

Using JET (a set of wrappers around Jumpstart, which can also be annoying to
configure), you have a trivial way of reinstalling your system just the way you
want. I run JET in a virtual machine, but most large installs would have a
dedicated install VLAN their install server is plugged into.

Solaris installs have a concept of "clusters", which define which packages are
installed. I use RNET, the smallest one. It basically has nothing. I tell JET to
install my extra packages, and the systems are configured exactly how I want.

You use the finish scripts to do basic configuration after the install, and
to configure the *rest* of the system and applications, you...

* Use a centralized configuration management tool

I use Puppet. It makes it trivial to configure the system programmatically,
manage users and groups, and install zones. It's a life and timesaver. In
addition to making your system configuration reproducible, it *documents* it.

Puppet manages both our Solaris and Linux boxes, keeping each in a known,
documented configuration. It's invaluable.

I also store all my user skel in source control (see next), and distribute them
with Puppet. Users may be slightly annoyed that they have to update the
repository whenever they want to change ~/.bash_profile, but it will be the
same on *every* host/zone they have access to, without them doing any work,
which will make them very happy.

* Store your configs in a source control manager

Both your change management and your system configuration should all be
versioned. Usefully, you can use your change management to manage your system

We have an internal directory called /sw where we deploy all our software to.
Older services have configs hard-coded to other locations, so we use Puppet to
ensure symlinks exist as appropriate. We deploy to /sw with a script that
checks the tree out of git and rsyncs it to all machines. It's pretty trivial,
and very useful if you have more than, say, two hosts.

/sw is also a loopback mount into every zone, and read-only. It enforces the
idea that all config changes must go into the repository, *not* be changed
locally... because developers can't write to /sw just to fix something quickly.
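
The deploy script really is about this simple. A sketch rather than the real thing: the staging directory, the STAGE_DIR override, the `run` helper and the repository URL in the usage line are all invented for the example.

```shell
# Dry-run helper (my own convention): print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Check a clean tree out of git and rsync it to /sw on each host.
# First argument is the repository, the rest are hostnames.
deploy_sw() {
    repo="$1"; shift
    stage="${STAGE_DIR:-/var/tmp/sw-stage}"
    run rm -rf "$stage" || return 1
    run git clone --depth 1 "$repo" "$stage" || return 1
    for host in "$@"; do
        # --delete keeps /sw an exact mirror of the repository contents.
        run rsync -a --delete --exclude .git "$stage/" "$host:/sw/" || return 1
    done
}
```

Usage: `deploy_sw git://git.example.com/sw.git web1 web2` shows what would run, and the same with DRY_RUN=0 does it.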

* Solaris Sucks At: Logging, by default

The default logging setup is awful. Install syslog-ng from pkgsrc, and write
your logs to both a remote syslog server and the local disk (enable compression
on your logs ZFS filesystem!)

* Solaris Sucks At: Firewalling

ipf is a pain in the butt. Unless you absolutely have to do host-based
firewalling, set up an OpenBSD system and use pf.


I'm sure I could think of quite a lot more (DTrace, Brendan Gregg's DTrace
Toolkit, RBAC, mdb), but it's dinnertime. :)

Hopefully the above will prove somewhat useful!
cyberpunk is dead. long live cyberpunk.

August 29, 2009

Our build files live on a Solaris 10 NFS server. The build client lives in a zone on a separate host. The build files are exported via v3 and tcp to the client.

Periodically the client would hang and require a zone reboot. Needless to say, this was astoundingly annoying if you didn't realize it had hung until you had started your build or push processes. An init 6 always fixed it... for a while.

Looking at snoop on the NFS server, it looked like a bunch of tcp:664 packets came in and went... nowhere. They hit the interface and vanished. Gee, I thought. That's odd.

Finally I got sick of this, and Googled around and found some references to port 623, a Linux bug that sounded pretty similar, and other Solaris users experiencing the same problem.

The first post is really the most useful. Different port, but same behavior.

After creating the rmcp dummy service in inetd, and restarting the zone, the problem has not resurfaced.

It's pretty interesting that this particular bug manifests because a chip on the motherboard eats traffic silently. "Interesting", anyway.

September 25, 2009

I've spun off my work-related ramblings over here. You can tell it's hardcore, because it's green text. Like jwz.