Monday, July 9, 2012

Expatriate Tracking Software

I am rather delighted that I have now turned in the final edits for my book and the proofs are starting to come back. This means that, for the first time in almost a year, I have free time (how the hell did I keep this blog going?).

So what do I do with my free time? I write software, of course. And I'm using it to track you. No, that's not quite true. I'm using it to track people who have renounced their citizenship under section 6039G of HIPAA (damn, I've memorized that now). They're reported in the Federal Register and I've downloaded their data and am slowly beating said data into shape. Here's a sample record (listed here because it's well known that Eduardo Saverin renounced his US citizenship):


The basic process works like this:
  1. I add the records for all renunciants per quarter.
  2. I run code that does a heuristic check to find out if the person is "newsworthy".
  3. I refuse to share this data.
Much of this turns out to be an extremely tedious process because the renunciation data isn't structured very well. I'm not going to say exactly how I did this because I don't want to violate people's privacy. If you want to duplicate this, you'll have to start from scratch. As a result, this software and the offsite backups are probably not going to be made public.
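
For anyone who does start from scratch, the three steps above could be sketched roughly like this in Python. To be clear, this is not the actual software and gives nothing away about it: the `Renunciant` fields, the `KNOWN_NAMES` set and the `looks_newsworthy` check are all hypothetical stand-ins, and a real heuristic would need far more than an exact name match.

```python
from dataclasses import dataclass

# Hypothetical reference list; a real check would consult something much richer.
KNOWN_NAMES = {"eduardo saverin"}


@dataclass
class Renunciant:
    name: str                    # name as published in the quarterly list
    quarter: str                 # free-form quarter label (format is an assumption)
    wants_privacy: bool = False  # set when the person has asked not to be named


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial spelling variants compare equal."""
    return " ".join(name.lower().split())


def looks_newsworthy(record: Renunciant) -> bool:
    """Crude name-only heuristic: a match here is a lead, not a confirmed identity."""
    return normalize(record.name) in KNOWN_NAMES


def candidates(records):
    """Records worth following up on, skipping anyone who asked for privacy."""
    return [r for r in records if looks_newsworthy(r) and not r.wants_privacy]


if __name__ == "__main__":
    quarter = [
        Renunciant("Eduardo  SAVERIN", "example-quarter"),
        Renunciant("Example Person", "example-quarter"),
    ]
    for r in candidates(quarter):
        print(r.name)
```

Because the only input is a name, anything this kind of check returns still has to be confirmed by hand, and the result stays private either way.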

So why am I doing this? My intent is to slowly compile a list of expatriates whose voices might be powerful enough to have an impact on the expat debate. I want to try to contact them and confirm their identity. If they agree, I want to interview them and post that here. If they ask for privacy, I will respect that (in fact, I've already started building that into the software). I don't want to cause anyone grief, so no one will be "outed" without their consent. I've already kept quiet about one name because the person asked me to.

I have to say that reading through the list of potentially newsworthy expatriates is fascinating. There are some names showing up that have surprised the hell out of me once I started reading about who these people might be, but my heuristic checks are pretty spotty since all I have is a name to go on. There aren't going to be too many false negatives, but there are a lot of false positives. Frankly, I love hacking on data (that's probably why I'm a computer programmer) and this is just fascinating as hell. 