Confidence interval
A confidence interval is a range of values used to indicate the degree of uncertainty in a scientific finding. Although it can often be derived from the same computations used to produce statistical significance, it represents a different way of evaluating scientific evidence. Where statistical significance describes uncertainty in terms of a hypothesis-based probability, a confidence interval shows uncertainty directly, without necessarily forcing a decision between competing hypotheses. In some circles, confidence intervals are preferred because they don't require setting up seemingly arbitrary null and alternative hypotheses.[1] The width of a confidence interval reflects the precision of the study's effect estimate: the more precise the estimate, the narrower the interval.
Definition(s)
Conventionally, a confidence interval is a range that contains the true value, such as a population average, with a certain degree of confidence.[2] A 95% confidence interval for the mean height of children with Duchenne muscular dystrophy, computed from a sample, would have a 95% chance of containing the true average height of those children.
The true statistical definition is a little more nuanced and, unlike the conventional one, actually true. A 95% confidence interval says that, across repeated samples, 95% of intervals constructed in this way will contain the true parameter. It does not, as commonly believed, say anything about the one specific sample for which a confidence interval is produced. Got it? It just means that 95% of the time the procedure performs as hoped, and 5% of the time it won't. Still, based on one study, the calculated confidence interval is the best available guess for the range in which the true parameter lies.
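The repeated-sampling idea is easy to check numerically. A minimal sketch (assuming NumPy and SciPy are available; the population, sample size, and seed are arbitrary illustrative choices) builds many 95% intervals from fresh samples and counts how often they cover the true mean:

```python
# Minimal sketch of the repeated-sampling interpretation: build many 95%
# intervals from fresh samples and count how often they cover the true mean.
# Population parameters and sample size are arbitrary illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mean, true_sd, n, trials = 10.0, 3.0, 25, 10_000

t_crit = stats.t.ppf(0.975, df=n - 1)   # two-sided 95% critical value
covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, true_sd, size=n)
    half_width = t_crit * sample.std(ddof=1) / np.sqrt(n)
    if abs(sample.mean() - true_mean) <= half_width:
        covered += 1

print(f"Coverage over {trials} samples: {covered / trials:.1%}")  # about 95%
```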
The Bayesian counterpart of a confidence interval is the credible interval. Rather than describing how an interval-building procedure performs over repeated samples, a 95% credible interval is a range that, given the observed data and a prior, contains the true value-of-interest with 95% probability.
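As a minimal sketch of the Bayesian version (assuming SciPy is available, a flat Beta(1, 1) prior on a proportion, and made-up data of 8 successes in 20 trials):

```python
# Minimal sketch of a Bayesian credible interval for a proportion, assuming a
# flat Beta(1, 1) prior and made-up binomial data (8 successes, 12 failures).
from scipy import stats

successes, failures = 8, 12
posterior = stats.beta(1 + successes, 1 + failures)   # conjugate update

low, high = posterior.ppf(0.025), posterior.ppf(0.975)
print(f"95% credible interval for the proportion: ({low:.2f}, {high:.2f})")
```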
Other kinds of intervals exist that are conceptually related but that account for some other factors.
- A prediction interval is a confidence interval around a single new observation rather than around a parameter (a short numerical sketch follows this list).
- A tolerance interval, rather than containing the true parameter, contains some proportion of the observed subjects.
- A confidence region extends the confidence interval beyond a single dimension.
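As a rough illustration of the first of these, the sketch below (assuming NumPy and SciPy; the data are simulated purely for illustration) compares a 95% confidence interval for a mean with a 95% prediction interval for a single new observation, which is necessarily much wider:

```python
# Minimal sketch contrasting a confidence interval for the mean with a
# prediction interval for one new observation; the data are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(100.0, 15.0, size=40)      # hypothetical measurements

n, mean, sd = len(data), data.mean(), data.std(ddof=1)
t_crit = stats.t.ppf(0.975, df=n - 1)

ci_half = t_crit * sd / np.sqrt(n)           # uncertainty about the true mean
pi_half = t_crit * sd * np.sqrt(1 + 1 / n)   # uncertainty about one new value

print(f"95% CI for the mean:     {mean:.1f} +/- {ci_half:.1f}")
print(f"95% prediction interval: {mean:.1f} +/- {pi_half:.1f}")  # much wider
```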
Relation to statistical significance
The confidence interval does have a direct relationship to hypothesis tests. If the null hypothesis states that there is no difference between groups, a 95% confidence interval of the observed difference corresponds to one of two conditions:
- If the interval contains 0, the null hypothesis would not be rejected (p≥0.05).
- If the interval does not contain 0, the null hypothesis would be rejected with p<0.05.
Interpreting a confidence interval as a hypothesis test runs into exactly the same problems as running the hypothesis test directly.[3] Rather than showing the range of values that could plausibly be the real value, using a confidence interval this way only says that the true value isn't one particular value; something is lost.
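The correspondence is easy to see numerically. A minimal sketch (assuming NumPy and SciPy; the sample is simulated for illustration) computes a one-sample 95% confidence interval alongside the matching two-sided t-test:

```python
# Minimal sketch of the CI / hypothesis-test correspondence for a one-sample
# mean; the data and sample size are made up for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=0.8, scale=2.0, size=30)   # hypothetical measurements

n = len(data)
mean = data.mean()
sem = data.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)            # two-sided 95% critical value

ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem
t_stat, p_value = stats.ttest_1samp(data, popmean=0.0)

# If 0 lies outside [ci_low, ci_high], then p_value < 0.05, and vice versa;
# the interval additionally shows the range of plausible mean values.
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f}), p = {p_value:.4f}")
```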
Confidence intervals in surveys
The most commonly encountered confidence interval comes from survey results. For a given question, results are reported as the percentage responding a particular way (say, in favor of a political candidate) and either a margin of error or a confidence interval. A margin of error is simply how far each end of the confidence interval is from the observed result, while the confidence interval is … well … the confidence interval.[4] The basic formula is relatively simple for a simple random sample.[5] However, in larger surveys, respondents are not assumed to be sampled completely at random. Such confidence intervals require adjustment for over/under-sampling of specific subgroups, a complicated process that guarantees work for survey statisticians.[6]
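For a simple random sample, the usual approximation is margin of error = z * sqrt(p(1-p)/n), with z of about 1.96 for 95% confidence. A minimal sketch with made-up poll numbers:

```python
# Minimal sketch of the simple-random-sample margin of error for a proportion:
# margin = z * sqrt(p(1-p)/n), with z ~ 1.96 for 95% confidence.
# The poll numbers below are made up for illustration.
import math

p_hat = 0.52     # observed share favoring a candidate (hypothetical)
n = 1000         # number of respondents (hypothetical)
z = 1.96         # normal critical value for 95% confidence

margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"{p_hat:.0%} +/- {margin:.1%}")   # prints "52% +/- 3.1%"
print(f"95% CI: ({p_hat - margin:.1%}, {p_hat + margin:.1%})")
```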
In surveys, or in anything reported as a percentage, common forms of the confidence interval are always wider when the sample percentage is close to 50%, all other things being equal. This has nothing to do with the magic of 50%, just with the variance of a binomial proportion. That variance, which is used to compute the confidence interval, peaks at 50% and shrinks as the percentage moves toward 0% or 100%. It's just math. Unfortunately, 50% is often exactly the threshold that decides whether one side is ahead or behind. It's frustrating. Live with it.
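A tiny sketch makes the point: holding the (arbitrary) sample size fixed, the standard error sqrt(p(1-p)/n) is largest at p = 0.5 and shrinks toward the extremes:

```python
# Minimal sketch showing why intervals are widest near 50%: the binomial
# standard error sqrt(p(1-p)/n) peaks at p = 0.5 (n = 1000 is arbitrary).
import math

n = 1000
for p in (0.05, 0.20, 0.50, 0.80, 0.95):
    se = math.sqrt(p * (1 - p) / n)
    print(f"p = {p:.2f}: 95% margin of error ~ {1.96 * se:.1%}")
```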
Unusual confidence intervals
There is no rule that says a confidence interval has to be symmetrical and tidy. In fact, any computed interval that covers 95% of the probability is properly known as a 95% confidence interval, even if it's lopsided. Confidence intervals for odds ratios and relative risks are often oddly shaped, with the side close to 1 being shorter than the side away from 1. Probabilistically, though, they are still centered on the estimate: the usual construction is symmetric on the log scale, with half of the excluded probability in each tail. Putting all the improbable parts to one side instead gives a one-sided confidence interval, which allows a researcher to say that, with 95% confidence, the true value falls somewhere above (or below) a specific value.[7]
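A minimal sketch of the standard Wald construction (made-up 2x2 counts; only the Python standard library is assumed) shows where the lopsidedness comes from: the interval is symmetric on the log scale and asymmetric after exponentiating:

```python
# Minimal sketch of a Wald 95% confidence interval for an odds ratio, built
# symmetrically on the log scale; the 2x2 counts are made up for illustration.
import math

a, b = 30, 70    # exposed group: events, non-events (hypothetical)
c, d = 15, 85    # unexposed group: events, non-events (hypothetical)

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of log(OR)

low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

# Back on the ratio scale, the interval is lopsided around the point estimate.
print(f"OR = {odds_ratio:.2f}, 95% CI ({low:.2f}, {high:.2f})")
```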
References
1. Gardner and Altman 1986. "Confidence intervals rather than P values: estimation rather than hypothesis testing."
2. For 100% confidence, that range would extend from negative infinity to positive infinity.
3. McCormack, Vandermeer, and Allen 2013. "How confidence intervals become confusion intervals."
4. LA Times, "What is a 'margin of sampling error'? What is a 'confidence interval'?"
5. See, for example, "The application of the 95% Confidence interval with ISAT and IMAGE."
6. For example, consider the "brief" overview of variance estimation for the National Health and Nutrition Examination Survey (NHANES).
7. Assuming they use the conventional definition and not the precise definition.