Use numpy to flog your.flowingdata
As noted in a previous post, your.flowingdata.com (yfd) is a handy way to collect personal data. I've been collecting sleep, diaper, etc. data on my newborn son. Although yfd now allows users to calculate durations between specified events, the visualization of the information isn't quite to my liking and it's clear that errors in the data can make for some odd durations (e.g., my son slept for two days!). Numpy to the rescue!
For those of you who don't know, numpy is Python's powerful array package. Rather than loop myself to death, I thought it made more sense to use numpy's slicing and masking features to clean up the data. These features make it easy to find data entry errors.
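To make the masking idea concrete, here is a minimal sketch rather than the actual script. The durations and the 16-hour cutoff are made-up assumptions chosen only to show how a boolean mask separates plausible values from likely data entry errors.

```python
import numpy as np

# Hypothetical sleep durations in hours; 48.0 stands in for a
# "slept for two days" data entry error.
durations = np.array([7.5, 8.0, 48.0, 6.25, 9.0])

# Boolean mask: True where the duration looks plausible.
# The 16-hour cutoff is an assumption, not a value from the original code.
plausible = durations < 16.0

clean = durations[plausible]     # entries that pass the check
suspect = durations[~plausible]  # entries worth inspecting by hand
```

No explicit loop is needed: the comparison runs over the whole array at once, and indexing with the mask pulls out the matching entries.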
I use the Enthought Python distribution for convenience's sake (and because I can't resist all those libraries, most of which I'll never use). Below you'll find some screenshots that step through my little script. Refer to the complete code here. (Well, it's just a start, really.) The code is probably a bit verbose for what it does, but we all start somewhere.
The first step is getting the data into an array you can manipulate. For your reference, your.flowingdata yields data that looks like this:

array([('gmorning', '', '', '2009-10-24 23:45:36'),
       ('gmorning', '', '', '2009-09-30 18:15:04'),
       ('gnight', '', '', '2009-09-23 21:00:03'),
       ('gmorning', '', '', '2009-09-23 19:15:03')])
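As a rough sketch of one way to hold rows like these, a structured array gives each column a name so you can mask on it. The field names (`action`, `value`, `unit`, `timestamp`) are my own labels for illustration; the data above doesn't say what the two empty fields mean.

```python
import numpy as np

# Rows copied from the sample output above.
rows = [
    ('gmorning', '', '', '2009-10-24 23:45:36'),
    ('gmorning', '', '', '2009-09-30 18:15:04'),
    ('gnight', '', '', '2009-09-23 21:00:03'),
    ('gmorning', '', '', '2009-09-23 19:15:03'),
]

# Hypothetical field names; only the first and last columns are
# clearly an event name and a timestamp.
dtype = [('action', 'U16'), ('value', 'U16'),
         ('unit', 'U16'), ('timestamp', 'U19')]
events = np.array(rows, dtype=dtype)

# The sample rows run newest first. Timestamps in this format sort
# chronologically as plain strings, so sorting on that field
# puts the oldest event first.
events = np.sort(events, order='timestamp')

# Masking on the event name splits the two kinds of events.
nights = events[events['action'] == 'gnight']
mornings = events[events['action'] == 'gmorning']
```

Note that this tiny sample has three 'gmorning' rows and only one 'gnight', which is exactly the kind of unpaired entry that produces odd durations later.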
I won't step through all the code here since it's available above, but you get the idea. One thing to watch out for: datetimes. I spent a lot of time trying to figure out the best way to handle the timestamps included with the yfd event data. There are ways to convert between strings, ordinal numbers, and datetime objects, but really I wanted to manipulate the datetime objects directly to take advantage of numpy's array slicing and arithmetic. Luckily, numpy arrays can hold arbitrary Python objects (technically, this is the 'object' dtype). This allows you to subtract one timestamp array from another to get the elapsed time without any conversions (though you'll have to convert at some point if you want to generate a human-readable string). Here's an example of the array you'll get at the end (headers -> sleep duration, start time, end time):
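A minimal sketch of how an array like that could be built, not the original script: the timestamps below are invented, they are assumed to be already paired up as sleep start and end, and the 16-hour cutoff is again an assumption.

```python
import numpy as np
from datetime import datetime, timedelta

FMT = '%Y-%m-%d %H:%M:%S'

# Hypothetical, already-paired timestamps; the second pair mimics a
# missed entry that makes one "night" last two days.
start_strings = ['2009-09-23 21:00:03', '2009-09-24 20:30:00', '2009-09-25 21:15:00']
end_strings = ['2009-09-24 06:15:03', '2009-09-26 20:45:00', '2009-09-26 05:45:00']

# dtype=object keeps real datetime objects inside the arrays.
starts = np.array([datetime.strptime(s, FMT) for s in start_strings], dtype=object)
ends = np.array([datetime.strptime(s, FMT) for s in end_strings], dtype=object)

# Element-wise subtraction yields an object array of timedelta values.
durations = ends - starts

# Convert to float hours only where a plain number is needed.
hours = np.array([d.total_seconds() / 3600.0 for d in durations])

# Columns: sleep duration, start time, end time.
table = np.column_stack((durations, starts, ends))

# Drop rows whose duration is implausibly long (assumed cutoff).
ok = hours < 16.0
clean_table = table[ok]
```

Keeping the datetime objects in the array until the last step means the subtraction and the row filtering both stay one-liners; the conversion to hours is the only place a list comprehension sneaks back in.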




