22 Factors affecting power
It is a truth universally acknowledged, that a researcher who’s studying a real, scientific effect must be in want of enough statistical power to detect it.
On the face of it, it seems like there’s a very simple solution to make sure that you have enough power: why not just have a massive sample every time? Then, we’re probably going to detect even the smallest effect sizes, and we don’t ever need to worry.
Well, it’s not quite so simple as that.
If you’ve done experiments of your own, you’re probably well aware that they’re expensive and time consuming, and it can be unrealistic to keep scaling experiments up (especially without any certainty that the effect of interest is real). In some contexts, such as animal research, an overly large sample can itself be wasteful and undesirable; or it may simply not be feasible, for example if you’re studying a rare disease or condition. A power analysis can help you balance the trade-off between statistical power and feasibility in your research. From the perspective of a funder or supervisor, it also demonstrates that you’ve thought ahead and planned the experiment carefully, including the resources you’ll need for it.
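To make this concrete, here’s a minimal sketch of an a priori power analysis in Python using the statsmodels package (just one of several tools for this; R’s pwr package and G*Power are common alternatives). The effect size, significance level and power below are illustrative assumptions, not recommendations.

```python
# A priori sample size calculation for a two-sample t-test,
# using statsmodels (the input values are illustrative assumptions).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Assumed inputs: a medium effect size (Cohen's d = 0.5),
# the conventional 5% significance level and 80% power.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)

print(f"Required sample size per group: {n_per_group:.1f}")  # roughly 64 per group
```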
22.1 Other factors to consider
Aside from a small effect size, there are other reasons why your sample size may need to be increased, depending on your planned analysis.
You’re expecting a high attrition or exclusion rate
This is often outside of your control as a researcher, but it is important to consider and account for when planning experiments. Attrition and exclusion are both ways in which your sample can shrink once you’ve started collecting data.
Exclusion refers to the deliberate choice of an experimenter to remove observations from the sample, perhaps because the participant fails to meet certain criteria, or the quality of the data is not high enough. For instance, in neuroimaging studies, participants’ data may be removed from the study if they moved too much during the functional scan and caused motion artefacts. (Crucially, these exclusions shouldn’t be related to the outcome variable - i.e., there should be no systematic bias in which observations are excluded from the sample, otherwise your dataset is no longer representative, and then we have an entirely new problem besides reduced power!)
The term attrition is used to refer to all other reductions in sample size that happen throughout a study, which generally aren’t due to the experimenter. This can include scenarios such as human participants choosing to drop out of a study, a number of plants in the greenhouse dying, or cell cultures failing. Attrition is a particular problem in clinical studies, as they recruit from small populations of sometimes vulnerable or unwell participants; further, many clinical studies also use matched samples, so when one participant drops out, their match in the other experimental group(s) must also be excluded, compounding the problem. As a general rule, an attrition rate of >20% is considered problematic in a clinical setting.
The best way to deal with attrition or exclusion in a sample is to prepare for it in advance, by building a “buffer” into your sample size. You’ll often already know that the type of experiment you’re designing is likely to have a high attrition or exclusion rate; hopefully, you’ll also have some indication of roughly what that rate might be, based on similar studies conducted by you or your colleagues. Inflate your target sample size accordingly, so that even if the sample is reduced during data collection or analysis, there will still be sufficient power.
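As a quick worked example (the numbers here are assumptions for illustration): if a power analysis says you need 64 participants per group and you expect to lose about 20% of your sample, you should recruit 64 / (1 − 0.2) = 80 per group.

```python
# Inflating a target sample size to allow for expected attrition/exclusion.
# Both numbers below are illustrative assumptions.
import math

n_required = 64          # per-group sample size from the power analysis
attrition_rate = 0.20    # expected proportion of observations lost

# Recruit enough that, after the expected losses, n_required still remains.
n_to_recruit = math.ceil(n_required / (1 - attrition_rate))

print(n_to_recruit)  # 80
```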
You’re planning to separately analyse subsets of the dataset, or make multiple comparisons
It’s absolutely acceptable to plan to do either of these two things in your analysis. The trick is to plan for them.
If you’re intending to analyse a smaller subset of the data, this means that there needs to be sufficient power within that subset in order to detect the effect(s) of interest. It’s not enough for your overall sample size to be sufficient.
Also - analysing multiple subsets of the data will almost always constitute making multiple comparisons; in fact, it may be one of the most common ways that multiple comparisons are introduced into an analysis pipeline. When making multiple comparisons, it’s typically recommended that you adjust your significance threshold (or your p-values, which is mathematically equivalent) to reduce the chance of making a type I error. Making the significance threshold more stringent necessarily reduces power, however, because of Maths™. So, when making many comparisons, you need a larger sample to compensate, as the sketch below illustrates.
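Here’s a sketch of what that looks like in practice, assuming a Bonferroni correction (only one of several possible adjustments) and the same illustrative two-sample t-test set-up as before.

```python
# How a Bonferroni correction changes the required sample size
# (illustrative numbers; Bonferroni is just one possible adjustment).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

n_comparisons = 5
alpha_overall = 0.05
alpha_adjusted = alpha_overall / n_comparisons  # 0.01 per comparison

n_unadjusted = analysis.solve_power(effect_size=0.5, alpha=alpha_overall, power=0.8)
n_adjusted = analysis.solve_power(effect_size=0.5, alpha=alpha_adjusted, power=0.8)

print(f"n per group at alpha = {alpha_overall}: {n_unadjusted:.0f}")  # roughly 64
print(f"n per group at alpha = {alpha_adjusted}: {n_adjusted:.0f}")   # roughly 96
```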
You know (or suspect) that there will be lots of parameters in your model
This includes, broadly, two scenarios: 1) you have lots of predictors of interest, and/or 2) you have lots of uncontrolled variables that can affect your outcome unpredictably, which you plan to include as covariates of no interest. The overall impact on power is the same in both cases: regardless of whether a variable is a predictor of interest, including it in the model decreases the degrees of freedom, as more parameters need to be estimated, and increases the model’s complexity.
With any luck, you’ll be able to perform model comparison and simplify your model somewhat after building it (as a reminder: we are always looking for the simplest model that does a good job of explaining the variance in the dataset, in the goodness-of-fit vs complexity trade-off). But it’s best to make sure that you account for all possible parameters when performing a priori power analyses.
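If you want to get a feel for this effect, one option is a small simulation: fit a linear model with and without a handful of noise-only covariates, and see how often the predictor of interest comes out as significant. The sketch below (all values are illustrative assumptions) does this with numpy and statsmodels.

```python
# Simulation sketch: power to detect one predictor of interest in a linear
# model, with and without extra noise-only covariates (all values assumed).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

def simulated_power(n_obs, n_noise_covariates, slope=0.5, n_sims=2000, alpha=0.05):
    """Proportion of simulations in which the predictor of interest is significant."""
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n_obs)                            # predictor of interest
        noise = rng.normal(size=(n_obs, n_noise_covariates))  # covariates with no true effect
        y = slope * x + rng.normal(size=n_obs)                # outcome depends only on x
        X = sm.add_constant(np.column_stack([x.reshape(-1, 1), noise]))
        p_value = sm.OLS(y, X).fit().pvalues[1]               # p-value for x (index 0 is the intercept)
        hits += p_value < alpha
    return hits / n_sims

# With the sample size held fixed, the extra parameters cost degrees of freedom and power.
print(simulated_power(n_obs=30, n_noise_covariates=0))
print(simulated_power(n_obs=30, n_noise_covariates=15))
```

With the sample size held fixed, the model carrying the extra covariates detects the true effect less often; in an a priori power analysis, that shortfall is what you compensate for by planning a larger sample.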
22.2 Summary
- Error rates, significance threshold, effect size and power are all related to one another, such that you can calculate your desired sample size a priori
- Multiple factors can decrease power, including making corrections for multiple comparisons and including a large number of parameters in your model
- It’s also important to be aware of decreases in sample size from attrition and exclusion, which will reduce power