Wednesday, November 25, 2015
Even the tiniest error messages can indicate an invalid statistical analysis
Word to the wise: track down the reasons for even the most innocuous-seeming warnings. Every stage of a statistical analysis is important, and small errors anywhere along the way can have huge consequences downstream. Perhaps this is obvious, but you still have to slow down and take care of the details.
(Note that I'm editing this to be a part of my Little Debate series, which discusses the tiny decisions dealing with data that are rarely discussed or scrutinized, but can have a major impact on conclusions.)
Wednesday, August 7, 2013
Joint Statistical Meetings 2013
Every year, the first week of August, we statisticians meet to get our statistics, networking, dancing, and beer on. With thousands in attendance, it's exhausting. I wonder about the quality of statistical work the second week of August.
Each conference seems to have a life of its own, so I tend to reflect on each one. Here's my reflection on this year's:
First, being in Montreal, most of us couldn't use smartphones. Thankfully, Revolution Analytics sponsored free WiFi. They also do great work with R. So, for the most part, we were all able to tweet.
The quality of talks was pretty good this year, and I've learned a lot. We even had one person describe simulations with a flowchart rather than indecipherable equations, and I strongly encourage that practice.
As a member of the biopharmaceutical section, I was struck by how few people take advantage of our awards. Of course, everybody giving a contributed or topic contributed talk is automatically entered into the best contributed paper competition. But we have a poster competition and a student paper competition that have to be explicitly entered, and participation is low. This is a great opportunity.
The highlight of the conference, of course, was Nate Silver's talk, and he delivered admirably. The thousand or so statisticians in attendance needed the message: learn to communicate with journalists, and teach them that numbers need context. I also like his response to the question "statistician or data scientist?", which was, of course, "I don't care what you call yourself, just do good work."
Tuesday, March 12, 2013
Distrust of R
I guess I've been living in a bubble for a bit, but apparently there are a lot of people who still mistrust R. I got asked this week why I used R (and, specifically, the package rpart) to generate classification and regression trees instead of SAS Enterprise Miner. Never mind the fact that rpart code has been around a very long time, and probably has been subject to more scrutiny than any other decision tree code. (And never mind the fact that I really don't like classification and regression trees in general because of their limitations.)
At any rate, if someone wants to pay the big bucks for me to use SAS Enterprise Miner just on their project, they can go right ahead. Otherwise, I have got a bit of convincing to do.