Sample size

In statistics, a sample size is the number of observations in a statistical sample.

The larger the sample size, the more precise the estimate of the population being studied. Therefore, any study involving statistics should use a sample large enough to draw a reasonable conclusion. More specifically, a sample should be large enough that the study is likely to reject the null hypothesis when it is in fact false (i.e. it should have adequate statistical power). This makes it less likely that a real effect is missed, and less likely that what you observe in your study is just a fluke.

Large samples

Large samples are particularly important in medical trials. A large sample not only gives better evidence that the medical treatment is effective across the population, but also allows rare side-effects to be identified. It would be very easy to show something was safe with a small sample of 100, only to give it to a population of millions and discover that one in a thousand died. Hence samples are large and varied, importantly covering all the relevant demographics, such as age, gender and lifestyle.
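The one-in-a-thousand example can be checked with a little arithmetic. The sketch below assumes each participant independently has the same small chance of the side-effect, which is a simplification; the function names are illustrative only.

```python
from math import ceil, log

def prob_no_events(sample_size, event_rate):
    """Chance that a trial of `sample_size` people sees zero events,
    assuming independent participants with the same event rate."""
    return (1 - event_rate) ** sample_size

def sample_size_to_detect(event_rate, detection_prob=0.95):
    """Smallest sample size giving at least `detection_prob` chance
    of observing one or more events."""
    return ceil(log(1 - detection_prob) / log(1 - event_rate))

if __name__ == "__main__":
    # A trial of 100 people and a side-effect hitting 1 in 1000.
    miss = prob_no_events(100, 0.001)
    print(f"Chance a trial of 100 sees no cases: {miss:.1%}")
    print("Sample needed for a 95% chance of seeing a case:",
          sample_size_to_detect(0.001))
```

Under these assumptions a trial of 100 would miss the side-effect entirely roughly nine times out of ten, and a sample of about three thousand is needed before a single case is likely to show up at all.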

On the off chance that a negative result is not suppressed, a large sample gives some assurance that the negative finding has merit. Without a sufficiently large sample, a study can fail to reject the null hypothesis even when the null hypothesis is false.

Small samples

Many pseudosciences rely instead on anecdotal evidence to draw their conclusions. As an anecdote involves a sample size of 1, it cannot be considered conclusive. Furthermore, multiple anecdotes do not constitute a larger sample size, but instead constitute several samples (often selectively chosen for their results), each with a sample size of 1. Therefore, arguments based solely on anecdotes and testimony need to be studied further, but not taken as conclusive in themselves.

It is often very easy to get dramatic results out of small samples because of the ease with which small numbers can, by pure chance alone, buck a real trend and give you the opposite result. If these are picked up by the media, they can start panics over what is in reality nothing. The claim that the MMR vaccine caused autism, which was rife in the UK press from 2001 to around 2005, relied solely on a scientific paper with a sample size of 12. In contrast, one of the more recent studies showing that the vaccine was safe had a sample size of 4000 (which the authors even noted was a bit on the small side).

In addition to pseudoscience, a great deal of legitimate investigative research suffers from small sample sizes. A recurring problem in neuropsychology is the extreme expense of populating functional magnetic resonance imaging (fMRI) studies. Almost all initial fMRI studies have sample sizes smaller than 100, and they rely on large effect sizes to find significance. Sometimes this leads to a kind of accidental p-hacking derived from the number of brain regions examined in each study. This has led to a replication crisis in the field.[1]
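The effect of examining many brain regions can be illustrated with the family-wise error rate: the chance of at least one false positive across several tests. The sketch below assumes the tests are independent, which neighbouring brain regions generally are not, so the numbers are only indicative. The Bonferroni correction is included as one common remedy; it is not taken from the cited study.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across `num_tests`
    independent tests, each run at significance level `alpha`."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise
    error rate at or below `alpha` (Bonferroni correction)."""
    return alpha / num_tests

if __name__ == "__main__":
    for regions in (1, 5, 20, 100):
        rate = familywise_error_rate(regions)
        print(f"{regions:>3} regions tested: "
              f"{rate:.1%} chance of a spurious 'significant' result")
```

With 20 independent tests at the 0.05 level, a spurious finding somewhere is more likely than not, even when no real effect exists.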

Statistical power

Statistical power is the probability that a study will correctly reject the null hypothesis when it is false. In estimation terms, a well-powered study is one that yields a reasonably narrow confidence interval. Statistical power is the other side of statistical significance: significance concerns the risk of rejecting a null hypothesis that is true, while power concerns the chance of rejecting one that is false. While the conventional significance level is quite commonly set at 0.05 (a 5% chance of rejecting the null when it is true), conventional statistical power ranges further. At a minimum, power of 0.8 (an 80% chance of rejecting the null when it is false) is considered adequate. Higher power, lower significance levels, and smaller effect sizes all require larger sample sizes.
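As a rough illustration of how these three quantities drive sample size, the sketch below uses the standard normal approximation for comparing the means of two equal-sized groups with a two-sided test. The effect size is assumed to be a standardized difference (Cohen's d), and the function name is illustrative; a real study design would normally use a dedicated power analysis tool.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate participants needed per group for a two-sided,
    two-sample comparison of means (normal approximation).

    effect_size: standardized mean difference (Cohen's d).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = std_normal.inv_cdf(power)           # power requirement
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

if __name__ == "__main__":
    print("d=0.5, alpha=0.05, power=0.8:", n_per_group(0.5))
    print("higher power (0.9):          ", n_per_group(0.5, power=0.9))
    print("stricter alpha (0.01):       ", n_per_group(0.5, alpha=0.01))
    print("smaller effect (d=0.25):     ", n_per_group(0.25))
```

Each of the last three calls asks for more participants than the first, and halving the effect size roughly quadruples the requirement.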

Sample size snobbery

Sample sizes regularly get brought up in online discourse. In an argument, one tactic is to reject your opponent's claims on the grounds that their sample size is too small. Since in many cases it is impractical to sample the entire population, the sample could always, in theory, have been bigger. It is therefore possible to call out your opponent's sample size as a rhetorical tactic whether it is sufficient or not.

Another tactic is to say that your opponent's sample size is small in comparison to the larger population, but this overstates the importance of population size. In statistics, one regularly assumes that a large population has infinite size, which would make any finite sample infinitesimal in comparison.[2] This does not cause a problem: if there is some underlying probability distribution behind your repeated trials, convergence to that distribution occurs anyway, as long as your sample is random and the individuals in it are chosen independently. For example, when dealing with Bernoulli trials from an infinite population, quadrupling your sample size effectively halves your error.[3] This means that (under reasonable statistical assumptions) you can get within a 5% margin of error at a 99% confidence level with a sample size of 1000, no matter the size of the population. Online tools[4] are available to see this effect.
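The figures above can be reproduced with the usual normal-approximation margin of error for an estimated proportion. The sketch assumes a simple random sample from an effectively infinite population and uses the worst case p = 0.5 by default; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sample_size, confidence=0.95, proportion=0.5):
    """Half-width of the normal-approximation confidence interval
    for a proportion. Population size does not appear anywhere."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(proportion * (1 - proportion) / sample_size)

if __name__ == "__main__":
    for n in (250, 1000, 4000):
        moe = margin_of_error(n, confidence=0.99)
        print(f"n={n:>4}: +/- {moe:.1%} at 99% confidence")
```

A sample of 1000 gives a margin of roughly four percentage points at 99% confidence, and each quadrupling of the sample halves it, whether the population numbers in the thousands or the billions.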

References

[1] Turner, B.O., Paul, E.J., Miller, M.B. et al. Small sample sizes reduce the replicability of task-based fMRI studies. Commun Biol 1, 62 (2018). https://doi.org/10.1038/s42003-018-0073-z
[2] "Statistical Population, Mathworld".
[3] Walker, Helen M (1985). "De Moivre on the law of normal probability". In Smith, David Eugene. A source book in mathematics. Dover. p. 78. ISBN 0-486-64690-4.
[4] "Sample Size Calculator by Creative Research Systems".

This article is issued from Rationalwiki. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
