Congressional Redistricting, a new project

I’m starting a long(er)-term project looking at congressional redistricting. This is the process done every decade (following the constitutionally mandated census) to redraw the boundaries of the districts represented in the House of Representatives. It can also be done, with less legitimate cause, at any time a state chooses.

There are a few reasons I find this topic interesting:

  • Republicans maintained control of the House in the 2012 elections. They did this despite more Americans having voted for a Democratic representative than a Republican one. That’s just the way the cookie crumbles sometimes, but it’s been fun seeing people try to explain it and I’m interested in exploring this phenomenon more.
  • I like the idea of multi-member districts. I was first exposed to this idea a few years ago by Matthew Yglesias, and it has stayed with me largely because it’s probably my only chance of ever becoming a congressman.
  • It will finally give me an excuse to play with maps.

I’m not sure where this project is headed, but along the way we’ll get to play with these ideas as well as the Voting Rights Act, the “big house”, gerrymandering and so much more.

2012 NFL Conference Champs

ConferenceChamps_20120204

Yesterday, Super Bowl Sunday, I put out a few charts (found here. CLICK!) comparing the two contenders.  It was a last-minute attempt to see how the 49ers and Ravens stacked up against each other based on their regular-season performance.

In December, I had started mulling over ideas for showing how dominant the Seahawks had been in the second half of the season, but they got knocked out of the playoffs before I pulled the trigger on a graphic.  In that time I came across this difference chart by mbostock and it seemed like a really powerful way to compare teams over time:

mbostockDiffChart

I got the data I needed from Advanced NFL Stats.  There were a couple of other data sets that I ran across in my research, but this one had the entire season (play-by-play) in one handy CSV.

I had to do a little bit of work to get the data into the shape I needed:

  • How do I handle time?  The first decision was to ignore bye weeks completely. This means as you move along the x-axis, the stats don’t match up by date.
  • The other hiccup was how to handle overtime, which each team encountered at least once this season.  I decided to map the game clock (as given in the raw data) to proportion of the total game time for that week.  So a play happening as the first half expired was marked as 0.5 most of the time, but somewhere around 0.35 – 0.4 for the games that went into overtime.
  • There were some data quality issues in the points columns which required manual cleaning.  It is entirely possible I missed something here.
  • I calculated yards (for and against) from a combination of the yard line data and which team was on offense for the preceding and following plays.  It is entirely likely I screwed this up for some edge cases.
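To make the time handling above a bit more concrete, here is a minimal sketch of the two ideas: numbering games in the order they were played (so bye weeks vanish) and mapping the game clock to a proportion of that game's total length.  This is not the code I actually used; the function names, the (period, seconds left) input format and the 15-minute period length are all assumptions for illustration, and the real CSV columns may look different.

```python
# Hypothetical sketch: names and input format are assumptions,
# not the actual columns of the play-by-play data.
QUARTER_SECONDS = 15 * 60  # assumed length of a quarter (and of overtime)


def elapsed_seconds(period, seconds_left):
    """Seconds played so far. Periods 1-4 are quarters, 5 is overtime."""
    return (period - 1) * QUARTER_SECONDS + (QUARTER_SECONDS - seconds_left)


def game_fractions(plays):
    """Map each play to a proportion (0..1) of its game's total length.

    plays: list of (period, seconds_left) tuples for a single game.
    The game's length is the latest play's elapsed time, but never
    less than a full regulation game.
    """
    elapsed = [elapsed_seconds(period, left) for period, left in plays]
    total = max(max(elapsed), 4 * QUARTER_SECONDS)
    return [e / total for e in elapsed]


def game_numbers(weeks_played):
    """Ignore bye weeks: number games 1, 2, 3, ... in the order played."""
    return {week: n for n, week in enumerate(sorted(weeks_played), start=1)}


if __name__ == "__main__":
    # Regulation game: the end of the first half lands at 0.5.
    print(game_fractions([(1, 900), (2, 0), (4, 0)]))
    # Game using a full overtime period: halftime lands earlier.
    print(game_fractions([(2, 0), (5, 0)]))
    # A team with a bye in week 4 plays its 4th game in week 5.
    print(game_numbers([1, 2, 3, 5]))
```

The design choice here is that every game stretches to fill the same 0 to 1 range, so an overtime game gets squeezed rather than running off the end of the axis.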

I waited ’til the last minute to start coding this up and had to cut some corners.  I’ll be revisiting it soon to address these issues:

  • Have you ever heard of DRY coding? Yeah, I completely failed at it on this project.
  • With only 30 minutes left before kickoff I gave up on messing with my HTML layout. I wussed out and used a table instead; it looks tacky (though not as bad as I’d thought it would) but it got the job done.
  • I’ve also, apparently, forgotten how to do bar charts quickly in d3.js.

Code available here.