Recently defeated gubernatorial candidate and former WA Attorney General Rob McKenna wrote an op-ed for the Seattle Times last week discussing ways that NW Republicans can start winning statewide races again.
He fully endorses the need to expand the GOP tent by reaching out to minority communities and young people. And he… well that’s about it. Although he does end the column by noting how close he came to winning (he lost by ~4%).
Here’s a summary of the policy prescriptions he outlines for winning over non-old-white voters:
Not everything has an API.
I haven’t had to do much web scraping in my life, and when I have, it’s been simple and did not need to be reproducible. But there are a few projects that have been floating around in my head that would benefit greatly from repeatedly collecting a lot of data straight from webpages. My search for a good scraping tool led me to the usual places (Stack Overflow and Quora) and I found Scrapy.
Scrapy is a Python-based screen scraping and web crawling framework that is available to fork on GitHub. I currently work on a Windows machine so, like most cool things, it was non-trivial to set up, but luckily they provide a straightforward installation guide with links to all of the dependencies you need to install. They also provide a nice tutorial to help you get a feel for the framework.
So that’s where I am now: everything is up and running, and I feel comfortable with the tutorial project. Now I just need to figure out how to use it for my own (currently ill-defined) projects. Hopefully I’ll be back here soon reporting on some cool results.
“A single ear of wheat in a large field is as strange as a single [habitable] world in infinite space” – Metrodorus
This week I finished a course called Intro to Astrobiology by Professor Charles Cockell (of the UK Centre for Astrobiology at The University of Edinburgh) offered on Coursera.
Astrobiology is an ancient field of thought concerned with the origin, evolution and distribution of life in the universe. It pulls from many disciplines (chemistry, biology, astrophysics, etc.).
- How/why/where did life begin on earth?
- What are the extreme limits of life (temperature, pressure, desiccation) on earth?
- Are these limits universal? Can life exist in ways we haven’t conceived?
- Is there life outside of earth? How can we go about finding it?
There are billions of galaxies, each with billions of stars, many having planets; and we are only beginning to have the technology capable of inspecting them.
Each week a handful of short lecture videos were released as well as a couple of multiple choice quizzes. Meant as an introductory/teaser course, the videos offered an overview of the many disparate parts of the subject. Professor Cockell clearly finds the field fascinating and did a good job of connecting the topics together.
Is life sustainable outside of the comfort of the earth?
This may not be answered for a long time. But eventually the earth will be unable to sustain life, whether through our own actions or because of the expiration of our sun. So it is imperative for us to explore these issues.
Are we alone in the universe? The answer is profound either way.
Yesterday, Super Bowl Sunday, I put out a few charts (found here. CLICK!) comparing the two contenders. It was a last-minute attempt to see how the 49ers and Ravens stacked up against each other based on their regular season performance.
In December, I had started mulling over ideas for showing how dominant the Seahawks had been in the second half of the season, but they got knocked out of the playoffs before I pulled the trigger on a graphic. In that time I came across this difference chart by mbostock and it seemed like a really powerful way to compare teams over time:
I got the data I needed from Advanced NFL Stats. There were a couple of other data sets that I ran across in my research, but this one had the entire season (play-by-play) in one handy CSV.
I had to do a little bit of work to get the data into the shape I needed:
- How do I handle time? The first decision was to ignore bye weeks completely. This means as you move along the x-axis, the stats don’t match up by date.
- The other hiccup was how to handle overtime, which each team encountered at least once this season. I decided to map the game clock (as given in the raw data) to a proportion of the total game time for that week. So a play happening as the first half expired was marked as 0.5 most of the time, but somewhere around 0.35 – 0.4 for the games that went into overtime.
- There were some data quality issues in the points columns which required manual cleaning. It is entirely possible I missed something here.
- I calculated yards (for and against) from a combination of the yard line data and which team was on offense for the preceding and following plays. It is entirely likely I screwed this up for some edge cases.
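The two time-handling decisions above (skipping bye weeks and rescaling overtime games) can be sketched roughly like this. This is a minimal Python illustration, not the code behind the charts: the function names are made up, and it assumes each play already has an elapsed time in seconds and each game has a known total length, neither of which is necessarily how the raw data presents the game clock.

```python
REGULATION_SECONDS = 60 * 60  # four 15-minute quarters


def game_fraction(elapsed_seconds, game_length_seconds):
    """Return a play's position within its game as a value from 0 to 1.

    Assumption: game_length_seconds is the elapsed time at the end of the
    game, so an overtime game is longer than regulation.
    """
    total = max(game_length_seconds, REGULATION_SECONDS)
    return elapsed_seconds / total


def season_position(game_number, elapsed_seconds, game_length_seconds):
    """Place a play on a season-long x-axis.

    Assumption: game_number counts games played (1, 2, 3, ...) rather than
    calendar weeks, which is what makes the bye week disappear.
    """
    return (game_number - 1) + game_fraction(elapsed_seconds, game_length_seconds)


if __name__ == "__main__":
    halftime = 30 * 60
    # Halftime of a regulation game vs. a game with a full overtime period.
    print(game_fraction(halftime, REGULATION_SECONDS))
    print(game_fraction(halftime, REGULATION_SECONDS + 15 * 60))
    # Halftime of a team's third game on the season-long axis.
    print(season_position(3, halftime, REGULATION_SECONDS))
```

Under these assumptions every game takes up exactly one unit on the x-axis no matter how long it ran, which is why the same moment on the game clock lands at a smaller fraction in an overtime game.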
I waited ’til the last minute to start coding this up and had to cut some corners. I’ll be revisiting it soon to address these issues:
- Have you ever heard of DRY coding? Yeah, I completely failed at it on this project.
- With only 30 minutes left before kickoff I gave up on messing with my HTML layout. I wussed out and used a table instead; it looks tacky (though not as bad as I’d thought it would) but got the job done.
- I’ve also, apparently, forgotten how to do bar charts quickly in d3.js.
Code available here.