Forking paths problem

The garden of forking paths is a problem in frequentist hypothesis testing in which researchers can unintentionally produce false positives for a tested hypothesis by leaving themselves too many degrees of freedom. In contrast to fishing expeditions such as data dredging, where only expected or apparently significant results are published, the forking paths problem can produce a similar effect even when only one experiment is run, through a series of choices about how to implement methods and analyses that are themselves informed by the data as they are observed and processed.[1]

History

Exploring a forking decision tree while analyzing data was at one point grouped with the multiple comparisons problem as an example of poor statistical method. However, Gelman and Loken demonstrated[2] that the same problem can arise implicitly even when researchers are aware of best practices, make only a single comparison, and evaluate their data only once.

The fallacy is believing an analysis to be free of multiple comparisons despite having had enough degrees of freedom in choosing the method, after seeing some or all of the data, to produce similarly-grounded false positives. Degrees of freedom can include choosing among main effects or interactions, methods for data exclusion, whether to combine different studies, and method of data analysis.
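
The effect described above can be illustrated with a small simulation. The following is a minimal sketch, not taken from Gelman and Loken's paper: the two subgroups, the sample sizes, and the normal-approximation test in the hypothetical helper `two_sample_p` are all assumptions chosen for illustration. Every simulated dataset contains no true effect. One analyst always tests a subgroup chosen in advance; the other looks at the data first, then runs a single test on whichever subgroup shows the larger apparent difference.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=2000, n=30, alpha=0.05, seed=0):
    """Return (prespecified_rate, forking_rate) of false positives under the null."""
    rng = random.Random(seed)
    prespecified_hits = 0
    forking_hits = 0
    for _ in range(n_sims):
        # Two hypothetical subgroups, each with a treatment and a control arm.
        # All values come from the same distribution, so there is no true effect.
        subgroups = [
            ([rng.gauss(0, 1) for _ in range(n)],
             [rng.gauss(0, 1) for _ in range(n)])
            for _ in range(2)
        ]

        # Prespecified path: the subgroup to test was fixed before seeing data.
        if two_sample_p(*subgroups[0]) < alpha:
            prespecified_hits += 1

        # Forking path: the analyst inspects the data, then runs ONE test
        # on the subgroup whose difference in means looks larger.
        chosen = max(subgroups, key=lambda g: abs(mean(g[0]) - mean(g[1])))
        if two_sample_p(*chosen) < alpha:
            forking_hits += 1

    return prespecified_hits / n_sims, forking_hits / n_sims


fixed_rate, forking_rate = simulate()
print(f"prespecified analysis: {fixed_rate:.3f}")
print(f"data-dependent choice: {forking_rate:.3f}")
```

Under these assumptions the prespecified rate should sit near the nominal 5%, while the data-dependent rate should be close to 1 − 0.95² ≈ 0.10, even though each simulated analyst reports only one test.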

Multiverse analysis

A multiverse analysis is an approach that acknowledges the multitude of analytical paths available when analyzing data. The concept is inspired by the metaphorical "garden of forking paths," which represents the multitude of potential analyses that could be conducted on a single dataset. In a multiverse analysis, researchers systematically vary their analytical choices to explore a range of possible outcomes from the same raw data.[3][4][5] This involves altering variables such as data inclusion/exclusion criteria, variable transformations, outlier handling, statistical models, and hypothesis tests to generate a spectrum of results that could have been obtained given different analytic decisions.
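
The procedure can be sketched in a few lines of code. This is an illustrative sketch only, not the method of any cited paper: the data are synthetic, and the particular outlier rules, transformations, and normal-approximation test (`OUTLIER_RULES`, `TRANSFORMS`, `p_value`) are hypothetical choices standing in for whatever decisions a real study would face. Every combination of choices is run on the same raw data and its result recorded.

```python
import itertools
import math
import random
from statistics import NormalDist, mean, stdev, variance


def p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def trim(xs, k):
    """Drop observations more than k standard deviations from the mean."""
    m, s = mean(xs), stdev(xs)
    return [x for x in xs if abs(x - m) <= k * s]


# Hypothetical analytic choices; a real multiverse would list the
# defensible options for the study at hand.
OUTLIER_RULES = {
    "keep_all": lambda xs: xs,
    "drop_beyond_2sd": lambda xs: trim(xs, 2.0),
    "drop_beyond_3sd": lambda xs: trim(xs, 3.0),
}
TRANSFORMS = {
    "raw": lambda x: x,
    "log": math.log,
}


def multiverse(treatment, control):
    """Run every combination of analytic choices on the same raw data."""
    results = []
    for (o_name, rule), (t_name, f) in itertools.product(
        OUTLIER_RULES.items(), TRANSFORMS.items()
    ):
        a = [f(x) for x in rule(treatment)]
        b = [f(x) for x in rule(control)]
        results.append({"outliers": o_name, "transform": t_name,
                        "p": p_value(a, b)})
    return results


# Synthetic, strictly positive data (assumed purely for illustration).
rng = random.Random(0)
treatment = [rng.lognormvariate(0.3, 0.8) for _ in range(40)]
control = [rng.lognormvariate(0.0, 0.8) for _ in range(40)]

results = multiverse(treatment, control)
for r in results:
    print(f"{r['outliers']:<16} {r['transform']:<4} p = {r['p']:.3f}")

share = sum(r["p"] < 0.05 for r in results) / len(results)
print(f"share of specifications with p < 0.05: {share:.2f}")
```

Reporting the full table, rather than a single chosen row, is what distinguishes this from an ordinary analysis: a reader can see how much the conclusion depends on each decision.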

The key benefits of a multiverse analysis include:

  • Transparency. It makes the analytical process more transparent by openly discussing the impact of different analytic choices on the results.
  • Robustness. By examining how conclusions vary across a range of analytical scenarios, researchers can assess the robustness of their findings. If a conclusion holds across many plausible analyses, it is considered more robust and less likely to be a product of arbitrary decision-making.
  • Identifying Consequential Decisions. It helps identify which analytical decisions most strongly influence the outcomes, guiding researchers towards more informed methodological choices in future studies.

This approach is valuable in fields where research findings are sensitive to the methods of data analysis, such as psychology,[4] neuroscience,[5] economics, and social sciences. Multiverse analysis aims to mitigate issues related to reproducibility and replicability by revealing how different analytical choices can lead to different conclusions from the same dataset. Thus, it encourages a more nuanced understanding of data analysis, promoting integrity and credibility in scientific research.

Concepts closely related to multiverse analysis are specification-curve analysis[6] and the assessment of vibration of effects.[7]

See also

  • Statistical hypothesis testing
  • Researcher degrees of freedom

References

  1. ^ "Garden of forking paths". FORRT - Framework for Open and Reproducible Research Training. Retrieved 2023-07-28.
  2. ^ Gelman, Andrew; Loken, Eric (November 14, 2013). "The garden of forking paths: Why multiple comparisons can be a problem, even when there is no 'fishing expedition' or 'p-hacking' and the research hypothesis was posited ahead of time" (PDF).
  3. ^ Steegen, Sara; Tuerlinckx, Francis; Gelman, Andrew; Vanpaemel, Wolf (2016). "Increasing Transparency Through a Multiverse Analysis". Perspectives on Psychological Science. 11 (5): 702–712. doi:10.1177/1745691616658637. ISSN 1745-6916.
  4. ^ a b Harder, Jenna A. (2020). "The Multiverse of Methods: Extending the Multiverse Analysis to Address Data-Collection Decisions". Perspectives on Psychological Science. 15 (5): 1158–1177. doi:10.1177/1745691620917678. ISSN 1745-6916.
  5. ^ a b Clayson, Peter E. (2024-03-01). "Beyond single paradigms, pipelines, and outcomes: Embracing multiverse analyses in psychophysiology". International Journal of Psychophysiology. 197: 112311. doi:10.1016/j.ijpsycho.2024.112311. ISSN 0167-8760.
  6. ^ Simonsohn, Uri; Simmons, Joseph P.; Nelson, Leif D. (2020). "Specification curve analysis". Nature Human Behaviour. 4 (11): 1208–1214. doi:10.1038/s41562-020-0912-z. ISSN 2397-3374.
  7. ^ Patel, Chirag J.; Burford, Belinda; Ioannidis, John P.A. (2015). "Assessment of vibration of effects due to model specification can demonstrate the instability of observational associations". Journal of Clinical Epidemiology. 68 (9): 1046–1058. doi:10.1016/j.jclinepi.2015.05.029. ISSN 0895-4356. PMC 4555355. PMID 26279400.

