Category Archives: Best Posts

Beautiful Data book chapter

Posted on August 12, 2009

Today I received my copy of Beautiful Data, a just-released anthology of articles about, well, working with data. Lukas and I contributed a chapter on analyzing social perceptions in web data. See it here. After a long process of drafting, … Continue reading

Announcing TweetMotif for summarizing twitter topics

Posted on May 18, 2009

Update (3/14/2010): There is now a TweetMotif paper. Last week, I, with my awesome friends David Ahn and Mike Krieger, finished hacking together an experimental prototype, TweetMotif, for exploratory search on Twitter. If you want to know what people are … Continue reading

Comparison of data analysis packages: R, Matlab, SciPy, Excel, SAS, SPSS, Stata

Posted on February 23, 2009

Lukas and I were trying to write a succinct comparison of the most popular packages that are typically used for data analysis. I think most people choose one based on what people around them use or what they learn in … Continue reading

Statistics vs. Machine Learning, fight!

Posted on December 3, 2008

10/1/09 update — well, it’s been nearly a year, and I should say not everything in this rant is totally true, and I certainly believe much less of it now. Current take: Statistics, not machine learning, is the real deal, … Continue reading

It is accurate to determine a blog’s bias by what it links to

Posted on October 11, 2008

Here’s a great project from Andy Baio and Joshua Schachter: they assessed the political biases of different blogs based on which articles they tend to link to. Using these political bias scores, they made a cool little Firefox extension that colors … Continue reading

Turker classifiers and binary classification threshold calibration

Posted on June 18, 2008

I wrote a big Dolores Labs blog post a few days ago. Click here to read it. I am most proud of the pictures I made for it:

Are women discriminated against in graduate admissions? Simpson’s paradox via R in three easy steps!

Posted on April 13, 2008

R has a fun built-in package, datasets: a whole bunch of easy-to-use, interesting tables of data. I found the famous UC Berkeley admissions data set, from a 1970s study of whether sex discrimination existed in graduate admissions. It’s famous for … Continue reading
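
For a quick feel of the paradox named in the title, here is a minimal sketch in Python rather than R. It is not the code from the full post. The counts are transcribed by hand from R's built-in UCBAdmissions table, and the helper names (admit_rate, aggregate_rates, per_department_rates) are made up for illustration.

```python
# Counts transcribed from R's built-in UCBAdmissions table:
# department -> sex -> (admitted, rejected).
UCB = {
    "A": {"Male": (512, 313), "Female": (89, 19)},
    "B": {"Male": (353, 207), "Female": (17, 8)},
    "C": {"Male": (120, 205), "Female": (202, 391)},
    "D": {"Male": (138, 279), "Female": (131, 244)},
    "E": {"Male": (53, 138), "Female": (94, 299)},
    "F": {"Male": (22, 351), "Female": (24, 317)},
}


def admit_rate(admitted, rejected):
    """Fraction of applicants admitted."""
    return admitted / (admitted + rejected)


def aggregate_rates(table):
    """Admission rate per sex, pooling all departments together."""
    totals = {}
    for by_sex in table.values():
        for sex, (adm, rej) in by_sex.items():
            a, r = totals.get(sex, (0, 0))
            totals[sex] = (a + adm, r + rej)
    return {sex: admit_rate(a, r) for sex, (a, r) in totals.items()}


def per_department_rates(table):
    """Admission rate per sex within each department."""
    return {
        dept: {sex: admit_rate(*counts) for sex, counts in by_sex.items()}
        for dept, by_sex in table.items()
    }


overall = aggregate_rates(UCB)
by_dept = per_department_rates(UCB)

# Departments where women were admitted at a higher rate than men.
female_higher = sorted(
    dept for dept, rates in by_dept.items() if rates["Female"] > rates["Male"]
)

print("Pooled rates:", {sex: round(r, 3) for sex, r in overall.items()})
for dept in sorted(by_dept):
    print(dept, {sex: round(r, 3) for sex, r in by_dept[dept].items()})
print("Women admitted at a higher rate in:", female_higher)
```

Pooled over all departments, men are admitted at a noticeably higher rate, yet within most individual departments the comparison goes the other way. That reversal between the aggregate and the per-group view is what Simpson's paradox refers to.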

color name study i did

Posted on March 18, 2008

Link: Where does "Blue" end and "Red" begin? I’m writing some posts on blog.doloreslabs.com and this is the best one so far. Methodology-wise, along the lines of my earlier Amazon Mechanical Turk moral decisions survey…

Food Fight

Posted on January 31, 2008

Absolutely amazing — a short film chronicling conflicts from World War II — as food. I think this has to have the highest amount of Wikipedia-linkable references per second of any film I’ve seen. Yes, it’s U.S.-centric, but so is … Continue reading

Moral psychology on Amazon Mechanical Turk

Posted on January 20, 2008

There’s a lot of exciting work in moral psychology right now. I’ve been telling various poor fools who listen to me to read something from Jonathan Haidt or Joshua Greene, but of course there’s a sea of too many articles … Continue reading