By: D. Gage Jordan
“Replication is a central tenet of science; its purpose is to confirm the accuracy of empirical findings, clarify the conditions under which an effect can be observed, and estimate the true effect size” – Klein et al. (2014)
Reproducibility Crisis?
A recent hallmark paper entitled Estimating the reproducibility of psychological science appeared in Science in August 2015. The paper was a collaboration in which many researchers attempted to replicate the effects of a wide variety of published psychological studies. Overall, the results showed a disturbing trend: the original studies reported “significant results” across the board (i.e., p-values below .05, the conventional threshold), whereas their replications yielded a preponderance of “effects” with p-values larger than .05 (i.e., not significant). In addition, the replication effect sizes (correlation coefficients) were typically much smaller than, or even in the opposite direction of, the original studies’ effect sizes. Given the advances in measurement and statistical methodology, the authors concluded that there is sound evidence for concerns about reproducibility in psychological research – adequately powered replication studies, controlling for variation across research teams, found that reproducibility “succeeded” only a minority of the time.
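Why would replications of “significant” findings so often come up short? One mechanism can be shown with a toy simulation (this is an illustration of the general statistical logic, not the Open Science Collaboration’s actual analysis; the effect size, sample size, and use of a simple z-test are all assumptions for the sketch). If only significant results are “published,” the published effect sizes are inflated, and a replication of the same size will frequently fail to reach p < .05 even though the effect is real:

```python
import math
import random

random.seed(42)

def z_test_p(sample):
    """Two-sided p-value for H0: mean = 0, assuming a known SD of 1 (z-test)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

def simulate(true_d, n, runs=2000):
    """Run many studies of a true effect of size `true_d` with `n` subjects each.
    Return (a) the fraction reaching p < .05 (the replication rate) and
    (b) the mean observed effect among the 'published' (significant) studies."""
    sig, published_effects = 0, []
    for _ in range(runs):
        sample = [random.gauss(true_d, 1.0) for _ in range(n)]
        if z_test_p(sample) < 0.05:
            sig += 1
            published_effects.append(sum(sample) / n)
    return sig / runs, sum(published_effects) / len(published_effects)

true_d, n = 0.3, 40  # hypothetical modest true effect, modest sample size
power, pub_effect = simulate(true_d, n)
print(f"replication rate ~ {power:.2f}; mean published effect ~ {pub_effect:.2f} "
      f"(true effect = {true_d})")
```

With these (assumed) numbers, fewer than half of same-sized replications reach significance, and the average “published” effect overstates the true one – two of the patterns the 2015 paper reported, produced here by selection on p < .05 alone.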
Related tenets?
The main tenets of science that emphasize reproducibility and the ability to find a “true” effect must be considered in tandem with other underlying assumptions of scientific research. That is, can we really find a “true” effect or “confirm” any hypothesis? Scientific discourse is laden with words such as “truth” and “confirmation,” implying that rigorous methodology is the be-all and end-all of discovering the truth. However, scientists understand that one never really “confirms” a hypothesis; one merely provides support for it. Nor can one create a theory that is the undeniable truth, as such a theory would not be falsifiable. Can we examine the reproducibility crisis in psychological science under these tenets as well?

Gilbert and colleagues (2016) provide an alternative account of what may be driving this inability to reproduce psychological science, turning to the characteristics of the studies themselves. That is, what happens when we attempt to reproduce a study with the same procedures (e.g., measures) but draw from an entirely new population? A conformity study run today, for example, cannot use the same participants who took part in Asch’s (1951) study. Thus, Gilbert and colleagues argue that it seems plausible that some replications fail to show the same result simply because researchers are drawing from a different pool of subjects. Asch’s study illustrates the point: the 1950s saw a high prevalence of “McCarthyism” (and its attendant fear of Communist subversion). Who wouldn’t want to conform with their peers under the threat of being outed as a “Communist?” Could one assume that college students today differ from students in the 1950s (or do we need science to “confirm” this as well)?
Moving on from general principles of science to specific psychological effects, Sinclair, Hood, and Wright (2014) investigated the reproducibility of the “Romeo and Juliet effect” (Driscoll, Davis, & Lipetz, 1972), wherein couples who reported an increase in parental interference in their relationship also showed an increase in love over the same 6-month period in a longitudinal sample. Sinclair and colleagues sought to replicate this study with the original measures (e.g., Likert-type scales assessing “love” and “commitment”), as well as validated measures of the constructs examined in the original study. Interestingly, the authors did not replicate the original effects of Driscoll and colleagues even when using the original study’s measures. They did, however, find consistent support for the notion that the greater the approval for a relationship, the better the relationship fared over time, an effect that is also consistent with previous research (cf. Wright & Sinclair, 2012).
Widening the net, a study by Klein et al. (2014; emphasis on the “et al.”) attempted to replicate a number of studies examining effects ranging from sex differences in implicit math attitudes (e.g., women holding more negative implicit attitudes toward math) to anchoring (i.e., whether an estimate of how big a distance is would be based on a previously supplied estimate or answer). Interestingly, this large-scale effort successfully replicated 11 of its 13 effects (the two that failed: currency priming influencing system justification, and flag priming influencing conservatism). Taken together, these studies (along with the studies that “fail”) indicate considerable variation in the replicability of psychological effects (Klein et al., 2014).
In conclusion
The reproducibility “crisis” has hardly settled anything in psychological science. Whereas many studies have not been reproduced, several effects from other studies indeed have. Importantly, cultural considerations need to be taken into account, as humans differ between eras, as well as within and between societies (Henrich et al., 2010). In sum, psychological science should take note of the variation among the people it studies, and consider how much we really know about any given effect. Perhaps it is even promising that some studies have failed to replicate – it creates an incentive for more rigorous investigation of what we think is true.
References
- Driscoll, R., Davis, K. E., & Lipetz, M. E. (1972). Parental interference and romantic love: The Romeo & Juliet effect. Journal of Personality and Social Psychology, 24, 1–10.
- Gilbert, D. T., King, G., Pettigrew, S., & Wilson, T. D. (2016). Comment on “Estimating the reproducibility of psychological science.” Science, 351, 1037.
- Henrich, J., Heine, S. J., & Norenzayan, A. (2010). Most people are not WEIRD. Nature, 466, 29.
- Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Jr., Bahník, Š., Bernstein, M. J., … Nosek, B. A. (2014). Investigating variation in replicability: A “many labs” replication project. Social Psychology, 45, 142–152.
- Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349, aac4716.
- Sinclair, H. C., Hood, K. B., & Wright, B. L. (2014). Revisiting the Romeo and Juliet Effect (Driscoll, Davis, & Lipetz, 1972). Social Psychology, 45, 170-178.
- Wright, B. L., & Sinclair, H. C. (2012). Pulling the strings: Effects of friend and parent opinions on dating choices. Personal Relationships, 19, 743-758.