Are Neuroscience Research Findings Reliable? Experts Weigh In

A new analysis casts doubt on the reliability of many neuroscience studies, pointing to small sample sizes and low statistical power as key problems.

Researchers led by the University of Bristol examined 48 neuroscience meta-analyses published in 2011 and found that the average statistical power of the studies included was roughly 20 percent. In practical terms, that means a typical study had only about a one-in-five chance of detecting a true effect even when that effect existed.
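To make the 20 percent figure concrete, the sketch below approximates the power of a simple two-group comparison using a normal approximation. The effect size (0.5 standard deviations), the group size (10 per group) and the function name are illustrative assumptions, not values or methods taken from the review.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Normal approximation only; an exact t-test calculation would
    give a somewhat lower value for samples this small.
    effect_size is the true difference in means, in standard deviations.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # critical value, about 1.96
    shift = effect_size * sqrt(n_per_group / 2)  # expected test statistic
    # Probability of landing beyond either critical value
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical study: medium effect (0.5 SD), 10 subjects per group
power = approx_power(0.5, 10)
print(f"Approximate power: {power:.2f}")
```

Under these assumed numbers the approximation comes out at about 0.20, in line with the one-in-five chance described above.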

The review, published in Nature Reviews Neuroscience, argues that low-powered studies are widespread across neuroscience and that this endemic underpowering undermines the reliability and efficiency of research in the field. Low statistical power often stems from small sample sizes, small true effect sizes, or a combination of both, and it has several damaging consequences for scientific inference.

Image: Drawing of a brain with a blue question mark. Review authors estimate only about a one-in-five chance that an average study would detect the effect under investigation.

Low power reduces the probability that a study will detect real effects. When significant results are reported from low-powered studies, those findings are more likely to be false positives or to substantially overestimate the true effect size. The review found this pattern across various neuroscience approaches, including brain imaging, genetics, and animal models, where smaller studies tended to report stronger positive results than larger, better-powered studies.
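The link between low power and false positives can be illustrated with the standard positive predictive value (PPV) calculation: the probability that a statistically significant finding reflects a true effect. In the sketch below, the pre-study odds of 0.25 (one true effect for every four null effects tested) are a hypothetical value chosen for illustration, and the function name is an assumption.

```python
def positive_predictive_value(power, pre_study_odds, alpha=0.05):
    """Probability that a significant result reflects a true effect.

    power          : chance of detecting a true effect (1 - beta)
    pre_study_odds : ratio of true to null effects among those tested
                     (hypothetical input; rarely known in practice)
    alpha          : false positive rate of the significance test
    """
    true_positives = power * pre_study_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)

# Assumed pre-study odds of 0.25, for illustration only
for power in (0.20, 0.80):
    ppv = positive_predictive_value(power, pre_study_odds=0.25)
    print(f"power={power:.2f} -> PPV={ppv:.2f}")
```

With these assumed odds, a significant result from a study with 20 percent power is true only about half the time, compared with about 80 percent of the time at 80 percent power.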

The review team included researchers from the University of Bristol, Stanford University, the University of Virginia and the University of Oxford. Key contributors were Kate Button (School of Social and Community Medicine) and Marcus Munafò (School of Experimental Psychology). The authors emphasize that the combination of small sample sizes and modest effect sizes creates a systematic problem: claims of discovery based on underpowered designs are inherently less reliable and more likely to mislead other researchers and the public.

As one of the lead authors explained, many neuroscience studies lack the statistical muscle to provide definitive answers. Even when an effect is real, a study with 20 percent power will miss it four times out of five. That low sensitivity both wastes resources and makes the published literature noisier, because the subset of small, underpowered studies that do happen to report significant results often inflate the perceived strength of an effect.
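The inflation effect can be sketched with a small simulation. The code below is not from the review; it assumes a true effect of 0.5 standard deviations, 10 subjects per group, and a known standard deviation of 1, and it draws each study's estimated effect directly from its sampling distribution. All names and numbers are illustrative.

```python
import random
from math import sqrt
from statistics import mean

def simulate_significant_estimates(true_effect, n_per_group,
                                   n_studies=20000, seed=0):
    """Simulate many small studies and keep only the 'significant' ones.

    Simplifying assumption: outcome SD is known and equal to 1, so each
    study's estimated group difference is normal around the true effect.
    Returns (fraction of studies significant, mean significant estimate).
    """
    rng = random.Random(seed)
    se = sqrt(2 / n_per_group)  # standard error of the difference in means
    z_crit = 1.96               # two-sided 5 percent threshold
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    significant = [e for e in estimates if abs(e) / se > z_crit]
    return len(significant) / n_studies, mean(significant)

fraction, mean_estimate = simulate_significant_estimates(0.5, 10)
print(f"Fraction significant: {fraction:.2f}")
print(f"Mean estimate among significant studies: {mean_estimate:.2f}")
```

Because a study this small only reaches significance when its estimate exceeds roughly 0.88 standard deviations, the significant studies as a group must overstate the assumed true effect of 0.5.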

Beyond the immediate risk of false positives, the review documents how publication practices and analytical choices can amplify problems from low power. Selective reporting, failure to disclose full methods or negative findings, and an emphasis on novel positive results all interact with small sample sizes to skew the evidence base. The authors make the case that many published findings in neuroscience should be treated with caution unless they are replicated in larger, well-powered studies.

To improve reliability and reproducibility in neuroscience, the paper recommends several practical, evidence-based steps. These include planning studies with adequate sample sizes to achieve reasonable statistical power, transparently reporting methods and all results (including null findings), and clearly acknowledging limitations when interpreting outcomes. The authors also advocate collaborative approaches that pool resources and data to increase total sample sizes and statistical power, thereby reducing the chance of misleading conclusions.
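Planning for adequate power usually starts with a sample-size calculation. The sketch below uses a common normal-approximation formula for a two-group comparison; the 80 percent power target, the effect sizes and the function name are assumptions for illustration, not recommendations quoted from the paper.

```python
from math import ceil
from statistics import NormalDist

def n_per_group_for_power(effect_size, target_power=0.80, alpha=0.05):
    """Approximate subjects needed per group for a two-sided comparison.

    Normal approximation; a t-test based calculation would ask for
    slightly more. effect_size is in standard deviation units.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = z.inv_cdf(target_power)   # about 0.84 for 80 percent power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Hypothetical effect sizes: medium (0.5 SD) and small (0.2 SD)
for d in (0.5, 0.2):
    print(f"effect size {d}: {n_per_group_for_power(d)} per group")
```

Under these assumptions a medium effect needs roughly 63 subjects per group, and a small effect needs several hundred, which is one reason pooling data across laboratories can help.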

Improving study design, preregistering analysis plans, and routinely sharing raw data and code are other methodological practices that the review highlights as important for raising the quality of neuroscience research. The core message is that modest changes in standard practice can have a large positive effect on the robustness and usefulness of results across imaging, genetics, animal research, and other subfields.

Notes about this neuroscience research

Contact: Philippa Walker – University of Bristol
Source: University of Bristol press release
Image Source: Illustration of a brain with a question mark created by NeuroscienceNews.com, adapted from a public-domain image credited to Popular Science Monthly, Volume 46, 1894. The image was released back into the public domain.
Original Research: “Power failure: why small sample size undermines the reliability of neuroscience” by Katherine Button, John Ioannidis, Claire Mokrysz, Brian Nosek, Jonathan Flint, Emma Robinson and Marcus Munafò. Nature Reviews Neuroscience. Published online April 10, 2013. DOI: 10.1038/nrn3475