"That which is overdesigned, too highly specific, anticipates outcome; the anticipation of outcome guarantees, if not failure, the absence of grace."
-- William Gibson, All Tomorrow's Parties
February 9, 2004

Spent Friday night rewriting my daemon prototype. Around 0300 (Saturday morning) I was twitchy enough to realize my mistake:

First off, following JdBP's common mistakes guidelines, I don't have the daemon background itself (which is something that should be left up to a daemon manager, like inetd or, preferably, djb's daemontools). So my prototype starts up, attaches to a socket, and starts listening.

When a request from a client kicks off, it spawns off a child using a while() loop. The while loop is structured in such a way that it's only running as long as the process is listening to the socket. So once the last child exits (non-blocking I/O, etc, so multiple clients can talk to multiple children, yadda yadda -- no max children options yet, though, as I haven't started on my queueing code), the parent process dies. It doesn't go through the cleanup code, so it's actually hitting the child exit statement.

This was a problem.

After rewriting the damn thing, however, I realized that for the parent to keep running, I should have an until() around the child while() loop.

e.g.: until ($term) { ... } where $term is a global that stays 0 until the TERM signal is caught (via $SIG{TERM}).

So after that all was happy.
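
For the record, the shape of that fix looks roughly like this. It's a minimal sketch, not the prototype itself: run_parent() and the $serve callback are made-up names standing in for the real accept-and-fork while() loop, and only the until ($term) wrapper and the TERM handler come from the description above.

```perl
use strict;
use warnings;

# Global flag: stays 0 until a TERM signal is caught.
our $term = 0;
$SIG{TERM} = sub { $term = 1 };

# $serve is a hypothetical stand-in for the inner while() loop that
# accepts a client and forks a child to talk to it.
sub run_parent {
    my ($serve) = @_;
    my $passes = 0;

    until ($term) {
        $serve->();    # the inner accept/fork loop would live here
        $passes++;
    }

    # Only the parent gets here, so this is where cleanup code belongs.
    return $passes;
}
```

The point is just that the inner loop finishing no longer ends the parent; only a TERM does.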

The next goal for this prototyping project is to class it all out into a module which does Actual Useful Stuff, and takes commands from clients and whatnot. The module/daemon will actually be a monitoring suite of some sort: it'll ping hosts, store their stats, etc. Minor stuff.

Once that's complete, I'll move on to writing a CGI::Application application (the prototype will probably be for a small kbase type thing).

And after that, well, I'll be putting all the techniques together and writing a new backup solution for work, which will replace our Veritas NetBackup setup.

Rather intimidating, really.

February 27, 2004

Spent the last two days working with CGI::Application and Template to create the UI::CGI framework for my NetBackup replacement, Archivist.

I am really not much of a programmer, and I hate doing web stuff, but I'm mildly enjoying myself with this. The code is roughly 200 lines, decently structured, very clear (because I'm not smart enough to do complex things).

In fact, the only complex thing I tried to do was use method calls as the CGI::Application run modes. That way I can just subclass all the functionality and not have to maintain the parent class for anything except the %runModes hash.

However, as rjbs pointed out (because he is a programmer), CGI::App is Just an Object, gets passed the run modes as a hash, and invokes $self to call the appropriate subroutine. So there went that idea.
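
A toy model of the dispatch rjbs described, as I understand it. This is not CGI::Application's code, and MiniApp and its methods are names I made up for illustration; it only shows the idea of run modes living in a hash and being called through $self.

```perl
use strict;
use warnings;

package MiniApp;

sub new {
    my ($class) = @_;
    return bless { run_modes => {} }, $class;
}

# Register run modes as a plain hash: mode name => method name.
sub run_modes {
    my ($self, %modes) = @_;
    $self->{run_modes} = { %{ $self->{run_modes} }, %modes };
    return $self;
}

# Look the mode up in the hash and call the named method on $self.
sub run {
    my ($self, $mode) = @_;
    my $method = $self->{run_modes}{$mode}
        or die "No such run mode: $mode\n";
    return $self->$method();
}

sub show_list { return 'listing things' }
sub show_form { return 'showing a form' }

package main;

my $app = MiniApp->new->run_modes(list => 'show_list', add => 'show_form');
print $app->run('list'), "\n";    # listing things
```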

Not a huge thing to maintain a sub for each section, which can then call to the subclass, but eh. I think I spent two hours on that, when I could have just looked at the damn CGI::App code and figured it out myself.

In an effort to follow the application design philosophies in The Pragmatic Programmer (which is a book everyone who codes ever should read), I showed the framework to my manager and our librarian. Seemed to go over well, for something as simplistic as it is. No major issues with the overall UI design, which is good. It's simple enough to make changes to later if something crops up, though.

Noting rjbs and mdxi talking about PAUSE last night, I signed up for an account. Got my confirmation today (along with the username "bda", which is supercool), and I'll probably be uploading what I have of Archivist after confirming (again) with the COO that it's cool to GPL this stuff (he has never had any problems in the past with doing so), and checking to see how people package things that aren't just modules.

All in all, a pretty productive and instructive last few days.

March 10, 2004

Work on archivist proceeds, about as slowly as I expected, considering the amount of Normal Work that comes up during the course of a day. The CGI interface needs to be populated with forms but otherwise works well. Just need to get that whole data entry thing going on. Relatively certain I'm writing clean code.

The SQL schema is half-written and I've just finished compiling Postgres on my laptop (G3 400 Pismo). This should be amusing.

Object orientedness is lurve.

March 11, 2004

archivist makes use of Template Toolkit, and I use vim for damn near everything. So recently I'd gotten incredibly annoyed with the lack of syntax highlighting for TT files, whined at rjbs, who thumbed his nose at me and told me to go google for it.

Came up with this, which is an imperfect solution, but possibly the start of a Good Solution.

14:10 < rjbs> well, this guy solved his own problem. and gave the solution away.
14:10 < rjbs> now you can munge it to fix yours.
14:10 <@bda> Yeah, yeah. :P
14:10 < rjbs> I bet he still will have solved the harder bits:
14:10 <@bda> Doesn't mean I'm not allowed to whine about it.
14:10 < rjbs> fucking vmethods, etc.
14:10 <@bda> Heh.
14:10 < rjbs> no, but I can try to make you feel bad about it.
14:11 <@bda> pft.
14:11 <@bda> Good luck with that.

So if I get overly annoyed with the fact that I have no HTML highlighting unless my TT file starts with <html> (which is absurd), I may try and fix it. I've tried to do vim syhi files in the past... it's less than fun.

March 16, 2004

I love spending a morning writing a database abstraction layer only to have Ricardo go: "Why not just use Class::DBI? It's practically core."

And yeah. It does exactly what I was doing, only better.


March 19, 2004

All tutorials should be written like this.

A few choice lines:

Let's say "Gone With The Wind" is released on DVD with some shocking
new footage of a bizarre Scarlet/sheep scene.



A new object can be made with all the info of another, except the
primary key. This is a very useful feature, given how little
imagination Hollywood posesses.

my $yojimbo = Film->retrieve('Yojimbo');
my $fistful = $yojimbo->copy('Fistful Of Dollars');

Worth reading even if you know What's Up.

A fun little comment block in netatalk's 1.64 etc/afpd/file.c, discovered by eniak:

* What a fucking mess. First thing: DID and FNUMs are
* in the same space for purposes of enumerate (and several
* other wierd places). While we consider this Apple's bug,
* this is the work-around: In order to maintain constant and
* unique DIDs and FNUMs, we monotonically generate the DIDs
* during the session, and derive the FNUMs from the filesystem.
* Since the DIDs are small, we insure that the FNUMs are fairly
* large by setting thier high bits to the device number.

Of course, they generate the damn IDs the same way, which as eniak also discovered, requires that your total inodes be under 24 bits.

Which means I get to have some fun once we get our terabyte RAID array in.

March 24, 2004

I just Grokked Class::DBI.

I've spent the last three days trying to hack it into being something other than what it's Meant To Be. I finally got tired of trying to use a hammer as a lever and set it up how it wants: One class per database table. Holy Hell on toast.

I got inserts, updates, and reads working in the space of three minutes (the templates were already set up).

Each table subclass is about 20 lines, e.g.:

package Archivist::DB::RunModes;

# $Id: RunModes.pm,v 1.1 2004/03/24 19:54:37 bda Exp $

use strict;
use base 'Archivist::DB';

# Set the table.

__PACKAGE__->columns(All => qw/id mode_name display title status comment created modified/);
__PACKAGE__->columns(Primary => 'id');


sub new {
    my $class = shift;
    return bless {}, $class;
}

1;


As you can see, the table subclass just defines how the database table is actually built.

And the code to add a new Run Mode (this is from Archivist::Config):

sub addMode {
    my $self = shift;
    my %modeInfo = @_;    # named arguments: column => value

    require Archivist::DB::RunModes;
    my $db = Archivist::DB::RunModes->new;
    my $query = $db->create(\%modeInfo);
    $query = $db->update;
    $query = $db->dbi_commit;

    return $query;
}


Obviously I'm not doing any input validation there, but! So easy!

The table subclasses inherit connection information from the Archivist::DB overclass, which just contains a typical DSN and calls Class::DBI's set_db() method.

Now maybe I can actually get some work done.

March 27, 2004

I was talking about daemons in #perl the other day, asking if anyone had any common gotchas for writing them in Perl. A fellow Philly hacker suggested Network Programming With Perl.

I was also directed to Stem, by the author. Stem looks super interesting for writing network applications. I think Eric and I are going to be checking it out for a (so far) theoretical NMS (network monitoring system) we've been talking about the last few days. I've been talking to Uri, the author, for a couple days now, and it seems like a good fit for a bunch of stuff we'd have to write anyway to get the nodes and aggregators talking to each other.

Going to try and have a brainstorming session with Eric tomorrow about it over lunch or something.

Work on archivist proceeds. I am a subclassing fool; encapsulation is god.

00:36 <Danelope> You're the Bipolar Programmer.

00:36 <Danelope> <bda> OMG WTF HATEal;skja;lkjsd;lkasjdl;akjd WHOA DBI++

This is essentially true. I have a serious problem with getting the initial concepts of something down, tend to get really frustrated, yell and bite people... and then that epiphany thing happens. I may not get the fine details, but I get enough of the Big Picture to use whatever it is I've been fighting with. And that's totally enough.

Stem is neat because it seems to act in a way that makes sense to me already. So we'll see how that goes.

At any rate. Back to archivist.

March 31, 2004

The two most annoying mistakes I make at least once a day are:


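%hash = (%newhash, %hash);

Which clobbers the values from the new hash, because later pairs win and the old %hash comes last. The correct way to write this is:


%hash = (%hash, %newhash);

(The parentheses matter, too: without them Perl parses the line as (%hash = %newhash), %hash.)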

This is really useful if you're setting defaults for a method and want to override them by passing named arguments, or if you're calling an accessor which returns new values you want to access via the original hash (e.g., you're making a database call which updates the hash with a new row's id or a status message or whatever).

%hash = (%hash, $foo->addHottie(
    chica => "scarlett johansson",
));


Of course, you could just dump %hash to addHottie(), but there are some cases where you don't want an accessor to have access to certain variables (like if you're doing an update, but don't want to risk the possibility of clobbering any of your unique values -- like your primary key!). For an insert, that probably doesn't really matter too much, if you're doing some other form of input validation.
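
Here's the defaults-then-overrides case as a tiny runnable sketch. connect_opts() and its host/port defaults are made up for illustration; the only real point is the order of the two hashes inside the parentheses.

```perl
use strict;
use warnings;

# Hypothetical helper: defaults first, caller's named arguments second,
# so anything the caller passes overrides the default.
sub connect_opts {
    my %args     = @_;
    my %defaults = (host => 'localhost', port => 5432);
    return (%defaults, %args);
}

my %opts = connect_opts(port => 6543);
print "$opts{host}:$opts{port}\n";    # localhost:6543
```

Swap the two hashes in the return and the caller's port silently loses to the default, which is exactly the once-a-day mistake.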

The other mistake also involves hashes...


sub foo {
my $self = shift;
my %hash = shift;

Hmm! What's wrong with that? Gee, probably I actually meant:


my %hash = @_;

Grrr. Simple little mistakes, but I keep making them!
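
To see what the shift version actually does, here's a small sketch (wrong() and right() are illustrative names, and the warning is captured rather than printed so the result is easy to inspect). With shift, the hash gets exactly one element, so Perl warns about an odd number of elements and the first key ends up with an undefined value.

```perl
use strict;
use warnings;

sub wrong {
    my $self = shift;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my %hash = shift;    # takes only the first remaining argument
    return (\%hash, scalar @warnings);
}

sub right {
    my $self = shift;
    my %hash = @_;       # takes every remaining key/value pair
    return \%hash;
}

my ($bad, $warned) = wrong('obj', name => 'archivist', id => 1);
my $good = right('obj', name => 'archivist', id => 1);
printf "wrong: %d key(s), %d warning(s); right: %d key(s)\n",
    scalar(keys %$bad), $warned, scalar(keys %$good);
```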

I need to get them tattoo'd on my forearms or something.

May 20, 2004

Started writing my Nagios config generation scripts this morning. Got about halfway done with them, as I first decided the most sane way to do it was to sweep the network with nmap (dumping to XML) and use that for a base.

Nmap::Parser is pretty decent stuff, though I did feel there was weirdness between using get_host_list() and get_host_objects(), but overall I'm pretty happy with it. Having to grab expat manually was sort of annoying, but what are you gonna do? :-)

Anyway, I did notice some scary stuff on our network. I thought I had completed my host-based firewall project several months ago, but apparently some of the workstations decided to make a liar out of me.

Next project, after this is done: Network integrity checker, like what AIDE or Tripwire do for filesystems, only for service changes. Should be pretty trivial to implement.
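
A rough sketch of what I mean by "trivial": keep the last scan around and diff it against the new one. Everything here is an assumption for illustration (diff_services() is a made-up name, and the snapshots are plain hashes of host => [open ports] rather than anything Nmap::Parser hands back).

```perl
use strict;
use warnings;

# Each snapshot maps a host to an array ref of open ports.
sub diff_services {
    my ($old, $new) = @_;
    my @changes;
    my %hosts = map { ($_ => 1) } keys %$old, keys %$new;

    for my $host (sort keys %hosts) {
        my %was = map { ($_ => 1) } @{ $old->{$host} || [] };
        my %now = map { ($_ => 1) } @{ $new->{$host} || [] };

        for my $port (sort { $a <=> $b } keys %now) {
            push @changes, "${host}: port $port opened" unless $was{$port};
        }
        for my $port (sort { $a <=> $b } keys %was) {
            push @changes, "${host}: port $port closed" unless $now{$port};
        }
    }
    return @changes;
}

print "$_\n" for diff_services(
    { web => [ 22, 80 ] },
    { web => [ 22, 443 ] },
);
```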

Then it's back to Archivist.

July 3, 2004

Profiling Perl

I think Mark Fowler's talk at YAPC mentioned this, and I meant to figure out why Mail::Postfix::Virtual was taking ages to build the virtual addresses hash.

Yet another reminder to do so.

July 6, 2004


I wish I were smart enough to figure this stuff out on my own.

But I'm not.

Thank the gods for perlmonks.

August 16, 2004
August 24, 2004

Mike and I sat down and discussed Archivist for a good four hours today. We nailed down topologies, messaging structure, client authentication methods, and various other things. Tomorrow I need to type it all up and make it understandable. We also came up with valid prototypes that will easily translate into core functionality for the application.

For instance, when you install a new Archivist node, all you copy over is a bootstrap daemon with a message queue. Once the bootstrapper is up, its controller registers its existence and pushes down its config. If this node is supposed to deal with tape control, you configure that via an admin interface, and it'll push the needed code to the node via the bootstrapper (which also deals with daemon management, upgrade/downgrade of code, etc). The bootstrapper forks a copy of itself and sends that child a message telling it what it's supposed to do. Bang, now you have a tape control daemon running.
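
The fork-and-tell-it-what-to-be part can be sketched in a few lines. This is a toy under loud assumptions: spawn_role(), the pipe-based "message", and the handler table are all invented here, and a real bootstrapper wouldn't block waiting on the child the way this does.

```perl
use strict;
use warnings;

# Fork a child, send it a one-line message naming its role, and
# return the child's exit status.
sub spawn_role {
    my ($role, $handlers) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: read the message, then become whatever it says.
        close $writer;
        my $msg = <$reader>;
        $msg = '' unless defined $msg;
        chomp $msg;
        my $handler = $handlers->{$msg} or exit 1;
        exit $handler->();
    }

    # Parent: tell the child its role, then wait (for the demo only).
    close $reader;
    print {$writer} "$role\n";
    close $writer;
    waitpid($pid, 0);
    return $? >> 8;
}

my $status = spawn_role('tape', { tape => sub { return 0 } });
print "tape child exited with status $status\n";
```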

We also discussed things like replicated read-only databases for public-facing clients, data caches for unreachable nodes (via restricted VPNs, etc), and a good deal of other stuff that we definitely won't need initially but will be easy to implement once the core is in place.

What we're really writing is an event-driven message framework that has the facility to make new functionality trivial to install. This is going to blossom out into a full network management tool.

The biggest problem we ran into was message routing, mainly because I was making it way more complex than it really needed to be. Finally we decided on two types of messaging: direct and policy-based. A direct message will follow a routing table (and isn't something that will ever originate on a client). A policy-based message is a request or status message that is destined for a specific controller that the end-point doesn't know about (for instance, when a client asks for a file to be restored, it doesn't need to know how to contact the tape controller directly: it just needs to know to pass the message to its configured local message exchange, which knows to pass it up based on a defined policy).

The wiggy stuff about messaging, though, is that sometimes you'll get a situation where you're replying to something that the end-point doesn't know about. A status update request from the primary controller to a specific node is a good example... we got around this by defining a message state table for the message exchangers.

All messages from a node are going to get passed to its local message exchange: the client is given a message ID which it passes as part of the message (probably as a REFERENCE declaration) and that MX keeps state for messages that pass through it, so it knows that message ID $x is supposed to get passed back up to MX2 (where it initially came from) as ID $y. (The message IDs should change as they pass through an MX to avoid collisions -- there is a VIA header that will log the route a message takes).
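
Until the spec exists, here's a minimal sketch of the state table idea. The MX class, its field names (id, reference, via), and the ID format are all assumptions made up for illustration; the only parts taken from our discussion are that an MX rewrites message IDs, remembers where each message came from, and appends itself to a VIA list.

```perl
use strict;
use warnings;

package MX;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, next_id => 1, state => {} }, $class;
}

# Pass a message along: give it a fresh local ID, remember who sent it
# and under which ID, and log ourselves in the VIA list.
sub forward {
    my ($self, $msg, $from) = @_;
    my $id = $self->{name} . '-' . $self->{next_id}++;
    $self->{state}{$id} = { from => $from, orig_id => $msg->{id} };
    return +{
        %$msg,
        id  => $id,
        via => [ @{ $msg->{via} || [] }, $self->{name} ],
    };
}

# Route a reply: translate its REFERENCE back to the original ID and
# report which neighbour it should be handed to.
sub route_reply {
    my ($self, $reply) = @_;
    my $entry = delete $self->{state}{ $reply->{reference} };
    return unless $entry;
    return ($entry->{from}, +{ %$reply, reference => $entry->{orig_id} });
}

package main;

my $mx  = MX->new('mx1');
my $out = $mx->forward({ id => 'mx2-7', body => 'status?' }, 'mx2');
my ($to, $reply) = $mx->route_reply({ reference => $out->{id}, body => 'ok' });
print "reply goes to $to as $reply->{reference}\n";    # reply goes to mx2 as mx2-7
```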

The message format looks unsurprisingly like SMTP, but what simple messaging protocol wouldn't? I could easily envision using something like Postfix with some transport hacks as the messaging system, in fact, but uhm, no. :)

All of the above, and plenty more, needs to be defined in a spec, which I'll hopefully start working on tomorrow. I also need to make the diagrams for the network topologies we came up with (I think there were three, ranging from very simple to pointlessly-complex-but-still-stupidly-doable).

Mike is going to start working on the messaging framework and bootstrap daemon code, and I'm going to re-write the junky web UI I sketched out a couple months ago. Probably I'll just end up re-writing it.

We also need to tack down database schema and some other stuff.

But I'm hopeful. Mike has a lot of experience (ten years of C), and doesn't seem to think anything we're trying to do is all that complicated. I tend to agree, from a philosophical standpoint, as most of it is currently far beyond my skill as a programmer. Apparently that will be changing, at least to some extent.

April 20, 2005

I've been in the habit of using Data::Dumper to print data structures for a while now... it's especially useful when you're interacting with a module whose documentation is somewhat lacking and you want to see exactly what it's handing back to you.
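
The habit in question, as a tiny sketch. dump_it() is just an illustrative wrapper name; Data::Dumper itself is core, and Sortkeys and Indent are its standard configuration variables.

```perl
use strict;
use warnings;
use Data::Dumper;

# Return a readable dump of whatever a module handed back.
sub dump_it {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable key order
    local $Data::Dumper::Indent   = 1;    # compact indentation
    return Dumper($data);
}

print dump_it({ id => 7, tags => [ 'a', 'b' ] });
```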

So of course I'm rambling on about how useful this is in #tildedot and Rik has to one-up me: Data::Dumper::Simple.

From the pod:

my $foo = { hash => 'ref' };
my @foo = qw/foo bar baz/;
warn Dumper ($foo, \@foo);


$foo = {
         'hash' => 'ref'
       };
$foo = [
         'foo',
         'bar',
         'baz'
       ];


The autodump function looks pretty sweet as well.

May 27, 2005

Jose's Guide for creating Perl Modules.

Not very useful to me, but I may be needing to nudge, all subtle-like, at a co-worker whose stuff I am ninja-rewriting.


I've never touched the Exporter, though. It's scary. OO->4("life")!

June 7, 2005

Got bored, uploaded the archiving script for resync.sh, the silly little backup script I wrote for DCI. It's not much of anything, but it Works.

It should be smarter on a number of levels, but... eh.

I vaguely recall a couple #dotnet'ers using it, I think, or else I probably wouldn't care.

September 4, 2005

Went ahead and cleaned up Invitare (the invite system I wrote for pumpcon) a very little bit, wrote some lame install docs, and packaged it up.

Stuck it on my work box, since I'm too lazy to set up svn/trac/etc, so it's gettable from gordon.

Patches, bugs, patches, welcome.

Also patches.

December 5, 2005

Just wrote my first trac plugin and boy, was it hard.

I needed to add a couple permissions to trac for my back-end scripts (SOURCE_COMMIT and IS_GROUP) for Toolbox, my svn/trac provisioning app.

trac pulls its perms from the components/plugins it loads, so I just wrote a little Python egg that returns the perms. So easy. No skills involved at all.

from trac.core import *
from trac.perm import IPermissionRequestor

class ToolboxModule(Component):
    """Provides some Toolbox-specific stuff."""

    implements(IPermissionRequestor)

    # IPermissionRequestor methods

    def get_permission_actions(self):
        return ['SOURCE_COMMIT', 'IS_GROUP']

I'd never written a Python egg before, and uh, it is quite trivial.

December 13, 2005

Code that doesn't act the same on alternating runs using the same data, and which doesn't display the behavior you expect when in debug mode. (Note that debug simply doesn't make any changes to the filesystem -- it's still manipulating all the relevant data as part of its execution.)

I gave up working on it this morning as it was being deployed to a semi-production box (which is essentially only used by me at the moment), but now I get to go back and figure out what the hell it was doing.

Ugh. Programming.

December 27, 2005

So as part of Toolbox's group membership management, I need to iterate over existing projects, grab all users for those projects, compare to current UNIX group membership, and then usermod accordingly. When I first prototyped all the Toolbox "system" stuff a couple weeks ago, I ended up with seven or eight scripts that did it all (with lots of redundant code). Then I spent about a week reimplementing it all in some libraries, and grafting a little front-end Perl script to them.
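
The compare step boils down to a set difference. A minimal sketch, with the caveat that membership_changes() and its argument shapes are invented here and are not Toolbox's actual library code:

```perl
use strict;
use warnings;

# Given the users a project should have and the users currently in the
# UNIX group, work out who needs adding and who needs removing.
sub membership_changes {
    my ($wanted, $current) = @_;    # array refs of usernames
    my %want = map { ($_ => 1) } @$wanted;
    my %have = map { ($_ => 1) } @$current;

    my @add    = grep { !$have{$_} } sort keys %want;
    my @remove = grep { !$want{$_} } sort keys %have;

    return (\@add, \@remove);
}

my ($add, $remove) = membership_changes(
    [qw(alice bob carol)],
    [qw(bob dave)],
);
print "add: @$add; remove: @$remove\n";    # add: alice carol; remove: dave
```

The two lists are what would then get fed to usermod.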

January 18, 2006

So this damn thing I've been working on for the past, I dunno, forever, is almost done. Or anyway, it's about to hit beta. Finally. It's nowhere near a release candidate yet, but that won't stop us putting it into internal production (heh).

While I don't usually bother with this stuff, I'm bored:

Creating filelist for bin
Creating filelist for lib
Creating filelist for trac
Categorizing files.
Finding a working MD5 command....
Found a working MD5 command.
Computing results.

SLOC    Directory       SLOC-by-Language (Sorted)
2256    lib             perl=2256
198     bin             perl=198
11      trac            python=11

Totals grouped by language (dominant language first):
perl: 2454 (99.55%)
python: 11 (0.45%)

Total Physical Source Lines of Code (SLOC) = 2,465
Development Effort Estimate, Person-Years (Person-Months) = 0.52 (6.19)
(Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 0.42 (5.00)
(Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule) = 1.24
Total Estimated Cost to Develop = $ 69,671
(average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler
SLOCCount is Open Source Software/Free Software, licensed under the GNU GPL.
SLOCCount comes with ABSOLUTELY NO WARRANTY, and you are welcome to
redistribute it under certain conditions as specified by the GNU GPL license;
see the documentation for details.
Please credit this data as "generated using David A. Wheeler's 'SLOCCount'."

That doesn't include the templates, which contain maybe 150-200 lines of logic.
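
For the curious, the estimate is easy to reproduce from the formulas printed in the report. This is only a sketch: cocomo() is my own wrapper name, and the constants are the ones SLOCCount shows above (2.4, 1.05, 2.5, 0.38, plus the salary and overhead figures).

```perl
use strict;
use warnings;

# Basic COCOMO, using the constants SLOCCount prints in its report.
sub cocomo {
    my ($sloc, $salary, $overhead) = @_;
    my $person_months = 2.4 * (($sloc / 1000) ** 1.05);
    my $sched_months  = 2.5 * ($person_months ** 0.38);
    my %est = (
        person_months => $person_months,
        sched_months  => $sched_months,
        developers    => $person_months / $sched_months,
        cost          => ($person_months / 12) * $salary * $overhead,
    );
    return \%est;
}

my $est = cocomo(2465, 56286, 2.40);
printf "effort %.2f months, schedule %.2f months, %.2f developers\n",
    @{$est}{qw(person_months sched_months developers)};
# effort 6.19 months, schedule 5.00 months, 1.24 developers
```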

I spent a good portion of the day manually tracing the execution path of a "rebuild", which touches a fair amount of the code, and it came out to about four pages. The biggest chunk was rebuilding the UNIX groups from the Trac databases, as it has to bounce around the user modules a bit. I ran down a few bugs I introduced while coding "in the zone", and it refamiliarized me with quite a few things I had forgotten about once they were "finished".

Need to dig through Damian's Best Practices book on how to deal with error handling, because what I'm doing now sucks.

I hate coding. :)