Aalbersberg, I. J., Appleyard, T., Brookhart, S., Carpenter, T., Clarke, M., Curry, S., . . . Vazire, S. (2018). Making science transparent by default: Introducing the TOP statement. Retrieved from https://doi.org/10.31219/osf.io/sm78t
Allen, C., & Mehler, D. M. (2019). Open science challenges, benefits and tips in early career and beyond. PLoS Biology, 17, e3000246. doi:10.1371/journal.pbio.3000246
Altman, D. G., & Bland, J. M. (1995). Absence of evidence is not evidence of absence. BMJ, 311(7003), 485. doi:10.1136/bmj.311.7003.485
Baron, J. (2018). Prediction, accommodation and pre-registration. Retrieved from http://judgmentmisguided.blogspot.com/2018/05/prediction-accommodation-and-pre.html?m=1
Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., . . . Johnson, V. E. (2018). Redefine statistical significance. Nature Human Behaviour, 2(1), 6-10. doi:10.1038/s41562-017-0189-z
Bleidorn, W., Arslan, R. C., Denissen, J. J., Rentfrow, P. J., Gebauer, J. E., Potter, J., & Gosling, S. D. (2016). Age and gender differences in self-esteem: A cross-cultural window. Journal of Personality and Social Psychology, 111(3), 396-410. doi:10.1037/pspp0000078
Butzer, B. (2019). Bias in the evaluation of psychology studies: A comparison of parapsychology versus neuroscience. EXPLORE: The Journal of Science & Healing. doi:10.1016/j.explore.2019.12.010
Center for Open Science. (n.d.). What is preregistration? Retrieved from https://cos.io/prereg/
Chambers, C. D., Forstmann, B., & Pruszynski, J. A. (2019). Science in flux: Registered reports and beyond at the European Journal of Neuroscience. European Journal of Neuroscience, 49(1), 4-5. doi:10.1111/ejn.14319
Coffman, L. C., & Niederle, M. (2015). Pre-analysis plans have limited upside, especially where replications are feasible. Journal of Economic Perspectives, 29, 81-98. doi:10.1257/jep.29.3.81
Cortina, J. M., & Dunlap, W. P. (1997). On the logic and purpose of significance testing. Psychological Methods, 2, 161-172. doi:10.1037/1082-989X.2.2.161
Cox, D. R. (1958). Some problems connected with statistical inference. Annals of Mathematical Statistics, 29(2), 357-372. doi:10.1214/aoms/1177706618
Cox, D. R. (1977). The role of significance tests. Scandinavian Journal of Statistics, 4, 49-70. Retrieved from https://www.jstor.org/stable/4615652
Cox, D. R., & Mayo, D. G. (2010). Objectivity and conditionality in frequentist inference. In D. G. Mayo & A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (pp. 276-304). Cambridge: Cambridge University Press.
de Groot, A. D., Wagenmakers, E.-J., Borsboom, D., Verhagen, J., Kievit, R., Bakker, M., . . . van der Maas, H. L. J. (2014). The meaning of “significance” for different types of research. Acta Psychologica, 148, 188-194. doi:10.1016/j.actpsy.2014.02.001
Devezer, B., Navarro, D. J., Vandekerckhove, J., & Buzbas, E. O. (2020). The case for formal methodology in scientific reform. doi:10.1101/2020.04.26.048306
Donkin, C., & Szollosi, A. (2020). Unpacking the disagreement: Guest post by Donkin and Szollosi. Retrieved from http://www.bayesianspectacles.org/unpacking-the-disagreement-guest-post-by-donkin-and-szollosi/
Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statistical inference for psychological research. Psychological Review, 70(3), 193-242. doi:10.1037/h0044139
Feynman, R. (1974). Cargo cult science. Retrieved from http://calteches.library.caltech.edu/51/2/CargoCult.htm
Field, S. M., Wagenmakers, E.-J., Kiers, H. A., Hoekstra, R., Ernst, A. F., & van Ravenzwaaij, D. (2020). The effect of preregistration on trust in empirical research findings: Results of a registered report. Royal Society Open Science, 7(4), 181351. doi:10.1098/rsos.181351
Fisher, R. A. (1935). The design of experiments. London: Oliver & Boyd.
Forstmeier, W., Wagenmakers, E.-J., & Parker, T. H. (2017). Detecting and avoiding likely false-positive findings – a practical guide. Biological Reviews, 92, 1941-1968. doi:10.1111/brv.12315
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Retrieved from http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf
Gelman, A., & Loken, E. (2014). The statistical crisis in science. American Scientist, 102, 460. doi:10.1511/2014.111.460
Giner-Sorolla, R. (2012). Science or art? How aesthetic standards grease the way through the publication bottleneck but undermine science. Perspectives on Psychological Science, 7, 562-571. doi:10.1177/1745691612457576
Hergovich, A., Schott, R., & Burger, C. (2010). Biased evaluation of abstracts depending on topic and conclusion: Further evidence of a confirmation bias within scientific psychology. Current Psychology, 29(3), 188-209. doi:10.1007/s12144-010-9087-5
Hollenbeck, J. R., & Wright, P. M. (2017). Harking, sharking, and tharking: Making the case for post hoc analysis of scientific data. Journal of Management, 43(1), 5-8. doi:10.1177/0149206316679487
Jussim, L., Crawford, J. T., Anglin, S. M., Stevens, S. T., & Duarte, J. L. (2016). Interpretations and methods: Towards a more effectively self-correcting social psychology. Journal of Experimental Social Psychology, 66, 116-133. doi:10.1016/j.jesp.2015.10.003
Kane, M. J., Core, T. J., & Hunt, R. R. (2010). Bias versus bias: Harnessing hindsight to reveal paranormal belief change beyond demand characteristics. Psychonomic Bulletin & Review, 17(2), 206-212. doi:10.3758/PBR.17.2.206
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196-217. doi:10.1207/s15327957pspr0203_4
Kruglanski, A. W., & Ajzen, I. (1983). Bias and error in human judgment. European Journal of Social Psychology, 13(1), 1-44. doi:10.1002/ejsp.2420130102
Lakens, D. (2019). The value of preregistration for psychological science: A conceptual analysis. Japanese Psychological Review, 62(3), 221-230.
Landy, J. F., Jia, M. L., Ding, I. L., Viganola, D., Tierney, W., Dreber, A., . . . the Crowdsourcing Hypothesis Tests Collaboration. (2020). Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. Psychological Bulletin. doi:10.1037/bul0000220
Ledgerwood, A. (2018). The preregistration revolution needs to distinguish between predictions and analyses. Proceedings of the National Academy of Sciences, 115, E10516-E10517. doi:10.1073/pnas.1812592115
Leung, K. (2011). Presenting post hoc hypotheses as a priori: Ethical and theoretical issues. Management and Organization Review, 7, 471-479. doi:10.1017/CBO9781139171434.009
Lew, M. J. (2019). A reckless guide to p-values: Local evidence, global errors. In A. Bespalov, M. C. Michel, & T. Steckler (Eds.), Good research practice in experimental pharmacology (pp. 100-199). Berlin: Springer.
Lewandowsky, S. (2019). Summing up #PSprereg. Retrieved from https://featuredcontent.psychonomic.org/avoiding-nimitz-hill-with-more-than-a-little-red-book-summing-up-psprereg/
Lin, W., & Green, D. P. (2016). Standard operating procedures: A safety net for pre-analysis plans. Political Science & Politics, 49, 495-500. doi:10.1017/S1049096516000810
Locascio, J. J. (2017). Results blind science publishing. Basic and Applied Social Psychology, 39, 239-246. doi:10.1080/01973533.2017.1336093
Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584-585. doi:10.1126/science.aal3618
Mayo, D. G. (1996). Error and the growth of experimental knowledge. Chicago: University of Chicago Press.
Mayo, D. G. (2014). On the Birnbaum argument for the strong likelihood principle. Statistical Science, 29, 227-239. doi:10.1214/13-STS457
Mayo, D. G. (2015). Learning from error: How experiment gets a life (of its own). In M. Boumans, G. Hon, & A. C. Petersen (Eds.), Error and uncertainty in scientific practice (pp. 57-78). London: Routledge.
Mayo, D. G. (2018). Statistical inference as severe testing. Cambridge: Cambridge University Press.
Munroe, R. (2011). Significant. Retrieved from https://xkcd.com/882/
Navarro, D. (2019). Paths in strange places, Part I. Retrieved from https://djnavarro.net/post/paths-in-strange-spaces/
Nosek, B. A., Beck, E. D., Campbell, L., Flake, J. K., Hardwicke, T. E., Mellor, D. T., . . . Vazire, S. (2019). Preregistration is hard, and worthwhile. Trends in Cognitive Sciences, 23(10), 815-818. doi:10.1016/j.tics.2019.07.009
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115, 2600-2606. doi:10.1073/pnas.1708274114
Nosek, B. A., & Lindsay, D. S. (2018). Preregistration becoming the norm in psychological science. APS Observer, 99, 1-4. Retrieved from https://www.psychologicalscience.org/observer/preregistration-becoming-the-norm-in-psychological-science/comment-page-1
Oberauer, K., & Lewandowsky, S. (2019). Addressing the theory crisis in psychology. Psychonomic Bulletin & Review, 26, 1596-1618. doi:10.3758/s13423-019-01645-2
Reid, N., & Cox, D. R. (2015). On some principles of statistical inference. International Statistical Review, 83, 293-308. doi:10.1111/insr.12067
Rossi, J. S. (1990). Statistical power of psychological research: What have we gained in 20 years? Journal of Consulting and Clinical Psychology, 58(5), 646-656. doi:10.1037//0022-006x.58.5.646
Rouder, J. N. (2014). Optional stopping: No problem for Bayesians. Psychonomic Bulletin & Review, 21(2), 301-308. doi:10.3758/s13423-014-0595-4
Rubin, M. (2017a). An evaluation of four solutions to the forking paths problem: Adjusted alpha, preregistration, sensitivity analyses, and abandoning the Neyman-Pearson approach. Review of General Psychology, 21, 321-329. doi:10.1037/gpr0000135
Rubin, M. (2017b). Do p values lose their meaning in exploratory analyses? It depends how you define the familywise error rate. Review of General Psychology, 21, 269-275. doi:10.1037/gpr0000123
Rubin, M. (2017c). When does HARKing hurt? Identifying when different types of undisclosed post hoc hypothesizing harm scientific progress. Review of General Psychology, 21, 308-320. doi:10.1037/gpr0000128
Rubin, M. (2019a). The costs of HARKing. The British Journal for the Philosophy of Science. Advance online publication, 1-15. doi:10.1093/bjps/axz050
Rubin, M. (2019b). What type of Type I error? Contrasting the Neyman-Pearson and Fisherian approaches in the context of exact and direct replications. Synthese. Advance online publication.
Scheel, A. M., Schijen, M., & Lakens, D. (2020). An excess of positive results: Comparing the standard psychology literature with registered reports. doi:10.31234/osf.io/p6e9c
Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., . . . Carlsson, R. (2018). Many analysts, one data set: Making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science, 1, 337-356. doi:10.1177/2515245917747646
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359-1366. doi:10.1177/0956797611417632
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2012). A 21 word solution. Dialogue, 21, 4-7. doi:10.2139/ssrn.2160588
Spanos, A. (2010). Akaike-type criteria and the reliability of inference: Model selection versus statistical model specification. Journal of Econometrics, 158(2), 204-220. doi:10.1016/j.jeconom.2010.01.011
Szollosi, A., & Donkin, C. (2019). Arrested theory development: The misguided distinction between exploratory and confirmatory research. Retrieved from https://psyarxiv.com/suzej/
Szollosi, A., Kellen, D., Navarro, D. J., Shiffrin, R., van Rooij, I., Van Zandt, T., & Donkin, C. (2020). Is preregistration worthwhile? Trends in Cognitive Sciences, 24(2), 94-95. doi:10.1016/j.tics.2019.11.009
Thabane, L., Mbuagbaw, L., Zhang, S., Samaan, Z., Marcucci, M., Ye, C., . . . Goldsmith, C. H. (2013). A tutorial on sensitivity analyses in clinical trials: The what, why, when and how. BMC Medical Research Methodology, 13, 92. doi:10.1186/1471-2288-13-92
Tukey, J. W. (1953). The problem of multiple comparisons. Princeton: Princeton University.
Tukey, J. W. (1991). The philosophy of multiple comparisons. Statistical Science, 6, 100-116. doi:10.1214/ss/1177011945
Vancouver, J. B. (2018). In defense of HARKing. Industrial and Organizational Psychology, 11, 73-80. doi:10.1017/iop.2017.89
Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14(5), 779-804. doi:10.3758/BF03194105
Wagenmakers, E.-J. (2016). [Comment] Why preregistration makes me nervous. Retrieved from https://www.psychologicalscience.org/observer/why-preregistration-makes-me-nervous
Wagenmakers, E.-J. (2019). A breakdown of “preregistration is redundant, at best”. Retrieved from https://www.bayesianspectacles.org/a-breakdown-of-preregistration-is-redundant-at-best/
Wagenmakers, E.-J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010). Bayesian hypothesis testing for psychologists: A tutorial on the Savage-Dickey method. Cognitive Psychology, 60(3), 158-189. doi:10.1016/j.cogpsych.2009.12.001
Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on