Sunday, June 5, 2016
Little Debate: defining baseline
Defining baseline seems like an easy thing to do, and conceptually it is. Baseline is where you start before some intervention (e.g. treatment, or randomization to treatment or placebo). However, the details of the definition of baseline in a biostatistics setting can get tricky very quickly.
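To make it concrete, here is a minimal sketch in R of one common (but by no means the only) operational definition: baseline is the last non-missing measurement taken on or before the first dose. The tiny data set and all column names are hypothetical; note how subject 1's missing day -1 value pushes the baseline back to an earlier visit, which is exactly where the trouble starts.

## Hypothetical long-format measurements; day is relative to first dose (day 0)
meas <- data.frame(
  subject = c(1, 1, 1, 2, 2),
  day     = c(-14, -1, 7, -7, 14),
  value   = c(120, NA, 118, 135, 128)
)

## Baseline = last non-missing value on or before day 0
pre  <- subset(meas, !is.na(value) & day <= 0)
pre  <- pre[order(pre$subject, pre$day), ]
last <- !duplicated(pre$subject, fromLast = TRUE)
baseline <- data.frame(subject = pre$subject[last], baseline = pre$value[last])

## Change from baseline for the post-dose measurements
post <- merge(subset(meas, day > 0), baseline, by = "subject")
post$change <- post$value - post$baseline
post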
The missing baseline
The average baseline
The extreme baseline
Conclusion
Monday, August 20, 2012
Clinical trials: enrollment targets vs. valid hypothesis testing
The questions raised in this Scientific American article ought to concern all of us, and I want to take some of these questions further. But let me first explain the problem.
Clinical trials and observational studies of drugs, biologics, and medical devices pose huge logistical challenges, not the least of which is finding physicians and patients to participate. The thesis of the article is that the classical methods of finding participants – mostly compensation – lead to perverse incentives to lie about one's medical condition.
I think there is a more subtle issue, and it struck me when one of our clinical people expressed a desire not to put enrollment caps on large hospitals for the sake of fast enrollment. In our race to finish the trial and collect data, we are biasing our studies toward larger centers, where there may be better care. This effect is exactly the opposite of the one posited in the article, where the treatment effect is biased downward. Here, the treatment effect is biased upward, because doctors at the large centers are more familiar with best delivery practices (many of the drugs I study are IV or hospital-based), best treatment practices, and more efficient care.
We statisticians can start to characterize the problem by looking at the treatment effect at different sites, or by using hierarchical models to separate the center effect from the drug effect. But this isn't always a great solution, because low-enrolling sites, by definition, have a lot fewer patients, and pooling is problematic because low-enrolling centers tend to have much more variation in the level and quality of care than high-enrolling centers.
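As a minimal sketch of what I mean by a hierarchical model (simulated data, and the lme4 package is assumed to be installed; this is an illustration, not an analysis from any real trial), one can fit a random intercept for each center and read the treatment effect off the fixed-effect term:

library(lme4)
set.seed(42)

## Simulate uneven enrollment across 20 centers with center-to-center variation
n_sites   <- 20
site_size <- sample(c(5, 10, 20, 60), n_sites, replace = TRUE)
site      <- rep(seq_len(n_sites), times = site_size)
site_eff  <- rnorm(n_sites, sd = 0.5)[site]
trt       <- rbinom(length(site), 1, 0.5)
outcome   <- rbinom(length(site), 1, plogis(-0.5 + 0.4 * trt + site_eff))
dat       <- data.frame(site = factor(site), trt, outcome)

## Random intercept per site separates the center effect from the drug effect
fit <- glmer(outcome ~ trt + (1 | site), data = dat, family = binomial)
summary(fit)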
We can get creative on the statistical analysis end of studies, but I think the best solution is going to involve stepping back at the clinical trial logistics planning stage and recasting the recruitment problem in terms of a generalizability/speed tradeoff.
Tuesday, May 19, 2009
Deep thought of the day
I have a suspicion that clinical trials isn't the only place where this principle applies.
Wednesday, March 25, 2009
Challenges in statistical review of clinical trials
Thursday, May 8, 2008
Can the blind really see?
I've wondered about this question even before the ENHANCE trial came to light, but, since I'm procrastinating on getting out a deliverable (at 11:30 pm!), I'm going to just say that I plan to write about this soon.
Friday, May 2, 2008
Well, why not?
So at any rate, Derek deduces that the problem lies in efficacy. Is it possible to support a marketing claim that the combination is more than the sum of its parts? Merck apparently thinks so, but the FDA does not. Unless there's an advisory committee meeting on this, or the drug eventually gets approved, or efforts to get the results of all clinical trials posted publicly succeed, we won't know for sure. But what I do know is that for one of these combinations to gain marketing approval, at the very least there has to be a statistically significant synergistic effect. That means that the treatment effect of the combination has to be greater than the sum of the treatment effects of the drugs alone. Studies that demonstrate this effect tend to need a lot of patients, especially if multiple dose levels are involved. It isn't easy, and I've known more than one combination development program to fizzle out.
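For a sense of what "more than the sum of its parts" means statistically, here is a minimal sketch with simulated data (a hypothetical 2x2 factorial design, not any real program): the drugA:drugB interaction term is the synergy, and it is that term that has to be statistically significant.

set.seed(7)
n     <- 200                              # patients per arm
drugA <- rep(c(0, 1, 0, 1), each = n)
drugB <- rep(c(0, 0, 1, 1), each = n)

## Continuous response with main effects 0.3 and 0.2 and a small synergy of 0.15
y <- 1.0 + 0.3 * drugA + 0.2 * drugB + 0.15 * drugA * drugB + rnorm(4 * n)

fit <- lm(y ~ drugA * drugB)
summary(fit)$coefficients["drugA:drugB", ]   # the synergy estimate and its p-value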
Update: but see this serious safety concern for Singulair reported by Pharmalot.
Monday, February 25, 2008
The buzz
- O'Brien-Fleming (especially doing this kind of design in SAS)
- Bayesian statistics in R
- noninferiority
- NNT (number needed to treat)
- confidence intervals in SAS
At 2000 hits in a year, this is clearly a narrowly-targeted blog. :D
Saturday, September 1, 2007
Bias in group sequential designs - site effect and the Cochran-Mantel-Haenszel odds ratio
It is well known that estimating treatment effects from a group sequential design results in bias. When you use the Cochran-Mantel-Haenszel statistic to estimate an odds ratio, the number of patients within each site affects the bias in the estimate of the odds ratio. I've presented the results of a simulation study, in which I created a hypothetical trial and then resampled from it 1000 times. I calculated the approximate bias in the log odds ratio (i.e. the log of the CMH odds ratio estimate) and plotted it against the estimated log odds ratio. The line is a cubic smoothing spline, produced by the SAS statement symbol i=sm75ps. The actual values are plotted underneath in light gray circles to give some idea of the variability.
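The code below is not the simulation from the post, but a minimal R sketch of the same idea: simulate a trial with unequal site sizes and a single interim look, estimate the Cochran-Mantel-Haenszel odds ratio from whatever data are available when the trial stops, and compare the average log odds ratio across replicates to the truth. All of the parameters (site sizes, event rate, true odds ratio, stopping boundary) are assumptions for the illustration.

simulate_trial <- function(site_sizes, p_control = 0.30, true_or = 1.5,
                           z_boundary = 2.8) {
  p_treat <- plogis(qlogis(p_control) + log(true_or))
  site  <- factor(rep(seq_along(site_sizes), times = site_sizes))
  trt   <- rbinom(length(site), 1, 0.5)
  resp  <- rbinom(length(site), 1, ifelse(trt == 1, p_treat, p_control))
  dat   <- data.frame(site, trt = factor(trt), resp = factor(resp))

  cmh_log_or <- function(d)
    log(mantelhaen.test(table(d$trt, d$resp, d$site))$estimate)

  ## One interim look at a random half of the patients (a crude stand-in for
  ## staggered accrual); stop early if the CMH test is extreme enough
  half <- dat[sample(nrow(dat), floor(nrow(dat) / 2)), ]
  z    <- qnorm(1 - mantelhaen.test(table(half$trt, half$resp, half$site))$p.value / 2)
  if (z > z_boundary) cmh_log_or(half) else cmh_log_or(dat)
}

set.seed(1)
est <- replicate(1000, simulate_trial(site_sizes = c(120, 60, 30, 15, 10)))
mean(est) - log(1.5)   # approximate bias in the log CMH odds ratio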
Wednesday, August 1, 2007
A good joint statistical meetings week
This morning, I was still feeling a little burned out, but decided to drag myself to a session on Bayesian trials in medical devices. I found the speakers (who came from both industry and the FDA) top notch, and at the end the session turned into a very nice dialog on the CDRH draft guidance.
I then went to a session on interacting with the FDA in a medical device setting, and again the speakers from both the FDA and industry were top notch. Again, the talks turned into very good discussions about how to communicate most effectively with the FDA, especially from a statistician's or statistical consultant's point of view. I asked how to handle the situation where a sponsor wants to kick the statistical consultants out of FDA interactions, even though it's not in the sponsor's best interest. The answer: speak the sponsor's language, which is dollars. Quite frankly, statistics is a major part of any clinical development plan, and unless the focus is specifically on chemistry, manufacturing, and controls (CMC), a statistician needs to be present for any contact with the FDA. (In a few years, that might be true for CMC as well.) If this is not the case, especially if it's consistently not the case throughout the development cycle of the product, the review can be delayed, and time is money. Other great questions were asked about the use of software and the submission of data. We all got an idea of what is required statistically in a medical device submission.
After lunch was a session given by the graphics section and the International Biometric Society (Western North American Region). Why it wasn't cosponsored by the biopharmaceutical section, I'll never know. The talks were all about using graphs to understand the effects of drugs, and how to use graphs effectively to support a marketing application or a medical publication. The underlying message was to get out of the 1960s line-printer era, with its illegible statistical tables, and take advantage of the new tools available. Legibility is key in producing a graph, followed by the ability to present a large amount of data in a small area. In some cases, many dimensions can be included on a graph, so that the human eye can spot potential complex relationships among variables. Some companies, notably big pharma, are far ahead in this arena. (I guess they have well-paid talent to work on this kind of stuff.)
These were three excellent sessions, and worth demanding more of my aching feet. Now I'm physically tired and ready to chill with my family for the rest of the week/weekend before doing "normal" work on Monday. But professionally, I'm refreshed.
Friday, July 20, 2007
Whistleblower on "statistical reporting system"
While I can't really determine whether Novartis is "at fault" from these two stories (and the related echoes throughout the pharma blogs), I can tell you about statistical reporting systems, and why I think these allegations could impact Novartis's bottom line in a major way.
Gone are the days of doing statistics with pencil, paper, and a desk calculator. These days, and especially in commercial work, statistics are all done with a computer. Furthermore, no statistical calculation is done in a vacuum. Especially in a clinical trial, there are thousands of these calculations which must be integrated and presented so that they can be interpreted by a team of scientists and doctors who then decide whether a drug is safe and effective (or, more accurately, whether a drug's benefits outweigh its risks).
A statistical reporting system, briefly, is a collection of standards, procedures, practices, and computer programs (usually SAS macros, but possibly programs in any language) that standardize the computation and reporting of statistics. Assuming they are well written, these processes and programs are general enough to process the data from any kind of study and produce reports that are consistent across all studies and, hopefully, across all product lines in a company. For example, there may be one program to turn raw data into summary statistics (n, mean, median, standard deviation) and present them in a standardized way in a text table. Since this is a procedure we do many times, we'd like to be able to just "do it" without having to fuss over the details. We feed in the variable name (and perhaps a few other details, like the number of decimal places) and, voilà, out comes the table. Not all statistics is that routine (which is good for me, because that means job security), but perhaps 70-80% is, and that part can be made more efficient. Other programs and standards take care of titles, footnotes, column headers, formatting, tracking, and validation in a standardized and efficient way. This saves a lot of time both in programming and in the review and validation of tables.
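As a sketch of what one of those building blocks looks like (the post is talking about SAS macros; this is an R analogue, and the function name and arguments are made up for illustration):

## One standardized routine: descriptive statistics for any numeric variable,
## with the variable name and number of decimal places fed in as parameters
summarize_var <- function(data, var, digits = 1) {
  x <- data[[var]]
  data.frame(
    variable = var,
    n        = sum(!is.na(x)),
    mean     = round(mean(x, na.rm = TRUE), digits),
    median   = round(median(x, na.rm = TRUE), digits),
    sd       = round(sd(x, na.rm = TRUE), digits + 1)
  )
}

## The same call then works for any variable in any study data set, e.g.:
summarize_var(mtcars, "mpg", digits = 1)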
So far, so good. But what happens when these systems break? As you might expect, you have to pay careful attention to statistical reporting systems, even going so far as to apply a software development life cycle methodology to them. If they break, you affect not just one calculation but perhaps thousands. And there is no way of knowing the extent in advance: an obscure bug in the code might affect just 10 studies out of a whole series, while a more serious bug might affect everything. If a broken system is applied to every product in house (and it should probably be general enough to apply to at least one category of products, such as all cancer products), the integrity of the data analysis for a whole series of products is compromised.
Allegations were also made that a contract programmer was told to change dates on adverse events. That could be a benign but bizarre request if the reasons for the change were well documented (it's better to change dates in the database than at the program level, because changes to a database are easier to audit, and hard-coding specific changes to specific dates keeps a program from being generalizable to other, similar circumstances), or an ethical nightmare if the changes were made to make the safety profile of the drug look better. From Pharmalot's report, the latter was alleged.
You might guess the consequences of systematic errors in data submitted to the FDA. The FDA has the authority to kick out an application if it has good reason to believe the data are incorrect; the application then has to be completely redone and go through the resubmission process. (The FDA will only do this if there are systematic problems.) This erodes the reviewers' confidence in the application, and probably in every application submitted by the sponsor who made the errors. That kind of distrust is very costly, resulting in longer review periods, more work to assure the validity of the data, analysis, and interpretation, and, ultimately, lower profits. Much lower.
It doesn't look like the FDA has invoked its Application Integrity Policy on Novartis's Tasigna or any other product. But it has invoked its right to three more months of review time, saying it needs to "review additional data."
So, yes, this is big trouble as of now. Depending on the investigation, it could get bigger. A lot bigger.
Update: Pharmalot has posted a response from Novartis. In it, Novartis reiterates its confidence in the integrity of its data and says it has proactively shared all data with the FDA (as it should). It also says that the extension of the review time for the NDA was to allow the FDA to consider amendments to the submission.
This is a story to watch (and without judgment, for now, since this is currently a matter of "he said, she said"). And, BTW, I think Novartis responded very quickly. (Ed seems to think that 24 hours was too long.)
Wednesday, March 28, 2007
A final word on Number Needed to Treat
So what happens if the treatment has no statistically significant effect (because the sample size is too small or the treatment simply doesn't work)? The confidence interval for the absolute risk reduction will cover 0, say -2.5% to 5%. Taking reciprocals, you get an apparent NNT confidence interval of -40 to 20. A negative NNT is easy enough to interpret: an NNT of -40 means that for every 40 people you "treat" with the failed treatment, you get one fewer favorable outcome. An absolute risk reduction of 0 gives NNT = ∞, so if the confidence interval for the absolute risk reduction covers 0, the confidence interval for the NNT must cover ∞. In fact, in the example above, we get the bizarre confidence set of -∞ to -40 together with 20 to ∞, NOT -40 to 20. The interpretation of this confidence set (it's no longer an interval) is that either you have to treat at least 20 people, and probably a lot more, to help one, or, if you treat 40 or more people, you might harm one. For this reason, for a treatment that doesn't reach statistical significance (i.e. whose interval for absolute risk reduction includes 0), the NNT is often reported as a point estimate only. I would argue that such a point estimate is meaningless. In fact, if it were left up to me, I would not report an NNT at all for a treatment that doesn't reach statistical significance, because the interpretation of statistical non-significance is that you can't show, with the data you have, that the treatment helps anybody.
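The arithmetic is easy to reproduce; here is the example above in a few lines of R (the numbers are the same hypothetical -2.5% to 5% interval):

arr_ci <- c(lower = -0.025, upper = 0.05)   # absolute risk reduction CI covering 0
1 / arr_ci                                  # naive reciprocals: -40 and 20
## Because the ARR interval covers 0 (where the NNT is infinite), the correct
## confidence set for the NNT is (-Inf, -40] together with [20, Inf),
## not the interval (-40, 20).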
Douglas Altman, heavy hitter in medical statistics, has the gory details.
Wednesday, December 13, 2006
Realizations
Most other blogging on this subject seems to come from doctors. I've found very few statistical bloggers, and I'm the only person I know of that is blogging with a purely biostatistical focus. Yet it is the biostatistician who has to determine whether a clinical trial can give a statistically valid answer, and the doctor or pharmacologist who decides if the statistically valid answer has any meaning or relevance. This blog is about clinical trials and other research, and how to extract conclusions from them.
The material in this blog will draw from the news, my own personal experience, and even a bit of research, and is intended for a general and professional audience.
As for the title: in statistics a "realization" is one instance of data. We dream up these statistical models, use mathematics to say what we can about them without looking at real data, and then examine realizations to see how well our models and methods work in practice. The data that comes out of a clinical trial can be said to be a realization.