By Leonard P. Freedman, Mark C. Gibson, Richard M. Neve
Irreproducible preclinical research is a global, expensive, and well-recognized problem that contributes to delays and increased costs of drug discovery (1, 2). One published study conservatively estimated the total prevalence of irreproducible preclinical research to exceed 50%, at a cost of $28 billion per year in the United States alone (3), while other researchers have estimated that an astounding 85% of biomedical research is wasted as a result of correctable problems (4). Excluding scientific misconduct, which is not a major source of irreproducibility (5), lack of reproducibility typically results from cumulative errors or flaws in one or more of the following non-discrete categories of the research and publication process: biological reagents and reference materials, study design, laboratory protocols, and data analysis and reporting (3). Although each contributes to a systemic problem that requires extensive changes in the overall scientific culture, taking immediate steps to address biological reagent issues, specifically the use of cultured cell lines, is a relatively straightforward fix that will improve the credibility, reproducibility, and translation of preclinical research. It will also make more efficient use of scarce biomedical research resources.
Immortalized cancer cell lines isolated from various human and other mammalian tissues have been used for decades across multiple areas of biomedical research (6). Their use is central to most drug-discovery projects, from initial target validation studies, through clinical candidate selection, to subsequent translational studies (7). It is essential for drug-discovery scientists to have routine access to a wide variety of high-quality, well-characterized, and contaminant-free mammalian cell lines. For these applications, accurate determination of species, sex, and tissue of origin (i.e., identity) is crucial to interpretation, validity, and translation of research results (8). Cell lines are cultured, passaged, and processed in and among laboratories with widely varying quality control (QC) procedures, and sharing of cell lines is endemic, particularly in academia. For these reasons, misidentification errors, including intraspecies (most commonly by HeLa cells [9]) and interspecies (nonhuman) cross-contamination, as well as labeling and cell naming errors, occur frequently and can persist for years (10). Changes in the genotype and phenotype of cells (i.e., drift) as a result of over-passaging of cell lines or poor culture technique continue to be persistent problems (11).
Contamination of continuous cell cultures by a wide variety of microorganisms, particularly mycoplasma, is also problematic in cell culture laboratories worldwide. Mycoplasma can be difficult to detect, can grow to high densities without adverse effect on cell morphology, and can affect a wide variety of cell functions, including response to therapeutics (12). Commercial tests are available to detect the most common forms of mycoplasma, but no affordable test exists to detect all Mollicute species.
Although multiple organizations (e.g., National Institutes of Health [NIH]) promote or recommend best practices for handling biospecimens and other biologicals, including cell lines (7, 13), none are followed by a majority of biomedical researchers. Efforts to expand the development and use of best practices and consensus-based standards for obtaining and maintaining authenticated and contaminant-free cell lines should also include smaller repositories; otherwise, biological materials in the public domain will likely become compromised over time (8). Table I summarizes factors that can contribute to irreproducible cell-based research.
Use and cost of misidentified and contaminated cell lines
How widespread is the problem? One key review examined the prevalence of contaminated cell lines from 1968 to 2007 and reported combined cell line misidentification and contamination rates ranging from 18% to 36%, with only a small improvement over time (14). A more recent estimate places the cross-contamination rate at 20% (15). While intraspecies contamination receives the majority of attention, approximately 6% of cell cultures are thought to be affected by interspecies cross-contamination (16). To complicate matters, a study of more than 200 biomedical papers found that only 43% of cell lines could be unambiguously identified by their description (e.g., authors provided a name and source for the line such as a repository) (17). This type of problem—coupled with journal-imposed space limitations—is symptomatic of a widespread lack of consensus on the level of detail required to allow adequate documentation of materials and methods in the literature to facilitate external replication of study results.
A search of NIH RePORTER identified 9000 projects and sub-projects that use cell lines at a total estimated taxpayer expenditure of $3.7 billion. If 18-36% of these research projects use misidentified or contaminated cell lines, potentially $660 million to $1.33 billion in research dollars could be affected. Based on two well-known misidentified cell lines, HEp-2 and INT 407, more than 7000 articles have been published that may have inappropriately used one or both cell lines at a total estimated cost of more than $700 million (15). Estimates of the prevalence of mycoplasma contamination of cell cultures vary widely, from 15% to as high as 35% (12, 18). An assessment of mycoplasma contamination in the National Center for Biotechnology Information Sequence Read Archive conservatively found that 11% of projects were contaminated (19). The authors estimated that hundreds of millions of dollars of NIH-funded research using continuous cell lines had been potentially affected.
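The affected-spend range quoted above is simply the total expenditure multiplied by the low and high contamination rates. The short Python sketch below reproduces that arithmetic from the rounded figures given in the text; the variable and function names are illustrative only. With these rounded inputs the low end comes out at about $666 million, so the $660 million quoted presumably reflects rounding or an unrounded total (an inference, not something the sources state).

```python
# Figures quoted in the text above (rounded as published).
TOTAL_SPEND_USD = 3.7e9               # estimated NIH spend on projects using cell lines
AFFECTED_RATE_RANGE = (0.18, 0.36)    # reported misidentification/contamination rates


def affected_spend(total_usd, rate):
    """Return the spend exposed to a given misidentification/contamination rate."""
    return total_usd * rate


# Apply the low and high rates to the same total.
low, high = (affected_spend(TOTAL_SPEND_USD, r) for r in AFFECTED_RATE_RANGE)
print(f"${low / 1e6:,.0f} million to ${high / 1e9:,.2f} billion")
```

The same two-line calculation can be rerun with a different total or rate range to see how sensitive the estimate is to either input.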
Although cell lines contaminated with mycoplasma or other microorganisms constitute an ongoing and expensive concern in cell banks and cell culture labs around the world, the remainder of this paper focuses primarily on misidentification and intraspecies cross-contamination of human cell lines used in preclinical biomedical research.
Authentication is the solution
A cell line can be authenticated (i.e., its identity confirmed) by comparing its genetic signature (profiling or fingerprinting) with established databases (e.g., American Type Culture Collection [ATCC] in the USA, Japanese Collection of Research Bioresources [JCRB] in Japan, Deutsche Sammlung von Mikroorganismen und Zellkulturen [DSMZ] in Germany) to discover misidentified cells (20). It is important to emphasize, however, that profiling comprises only one component of understanding the complex molecular and phenotypic properties of a cell line, which is not a uniform, clonal population (8). Fully characterizing a cell line requires detailed genomic, proteomic, and phenotypic analyses, which remain impractical and costly for most cell banks, let alone typical research laboratories. For this reason, cell line authentication and QC measures such as mycoplasma detection constitute an essential first step to establish and maintain the integrity of cell cultures and to enhance reproducibility of results using cultured cells. An American National Standards Institute (ANSI)-accredited, low-cost (approximately $150 fee for service or $15-30 in-house) standard for cell line authentication based on short tandem repeat (STR) profiling has been available for several years (21). Figure 1 provides a timeline of key events in cell line-based research and authentication.
Figure 1: Timeline of key events in cell line-based research and authentication. Figure courtesy of the authors.
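To make the idea of comparing a genetic signature with a database entry concrete, the Python sketch below scores two STR profiles with a Dice-style percent match (twice the number of shared alleles divided by the total number of alleles in both profiles). This is one commonly described way of scoring STR comparisons, not a transcription of the ASN-0002 standard. The locus names are real STR markers, but the allele values, the profile names, and the 80% cut-off are illustrative assumptions; consult the standard and the reference database for the criteria that actually apply.

```python
# Illustrative cut-off only (assumption); check the applicable standard.
MATCH_THRESHOLD = 80.0


def percent_match(query, reference):
    """Dice-style percent match between two STR profiles.

    Each profile maps a locus name to a set of allele calls (as strings).
    Only loci typed in both profiles are compared.
    """
    shared_loci = query.keys() & reference.keys()
    shared = sum(len(query[locus] & reference[locus]) for locus in shared_loci)
    total = sum(len(query[locus]) + len(reference[locus]) for locus in shared_loci)
    if total == 0:
        raise ValueError("profiles have no typed loci in common")
    return 100.0 * 2 * shared / total


# Hypothetical profiles, invented for this example (not real cell lines).
REFERENCE = {
    "Amelogenin": {"X", "Y"}, "CSF1PO": {"10", "12"}, "D5S818": {"11", "13"},
    "D7S820": {"9", "10"}, "D13S317": {"8", "11"}, "D16S539": {"11", "12"},
    "TH01": {"6", "9.3"}, "TPOX": {"8"}, "vWA": {"17", "18"},
}
# Same origin as REFERENCE, but with two alleles lost (simulated drift).
DRIFTED = {**REFERENCE, "Amelogenin": {"X"}, "D7S820": {"9"}}
# A profile that shares only two alleles with REFERENCE.
UNRELATED = {
    "Amelogenin": {"X"}, "CSF1PO": {"11"}, "D5S818": {"12"},
    "D7S820": {"8", "11"}, "D13S317": {"12", "13"}, "D16S539": {"9", "13"},
    "TH01": {"7", "8"}, "TPOX": {"8", "11"}, "vWA": {"14", "16"},
}

for name, profile in (("DRIFTED", DRIFTED), ("UNRELATED", UNRELATED)):
    score = percent_match(profile, REFERENCE)
    verdict = "consistent with reference" if score >= MATCH_THRESHOLD else "investigate"
    print(f"{name}: {score:.2f}% ({verdict})")
```

With these invented profiles, the drifted sample scores 93.75% and the unrelated sample 12.5%. A score alone does not settle identity: as noted above, a cell line is not a uniform, clonal population, so borderline results need expert interpretation.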
Another DNA profiling test of cell-line identity relies on single nucleotide polymorphisms (SNPs), single-base variations at specific loci that differ between members of the same species and are conserved during evolution (10, 22). Although commercial kits are becoming available, at present there is no ANSI-approved standard or centralized, online database for SNP-based cell line authentication. Table II compares the pros and cons of STR and SNP profiling assays.
Despite the widespread availability of the STR standard and its low cost, there is little evidence that authentication is routinely used in the life sciences, particularly among academic researchers (23). One widely cited survey reported that only one-third of laboratories tested their cell lines for identity (24). A Nature Cell Biology editorial reported that only 19% of papers using cell lines published in the latter months of 2013 conducted (or at least reported conducting) cell-line authentication (25). Although the International Cell Line Authentication Committee (ICLAC) online database of Cross-contaminated or Misidentified Cell Lines is widely considered to be the “go-to” reference in the field (26), fewer than half of the respondents to a 2014 Sigma-Aldrich survey were familiar with the database, and only 11% searched the database during 2013 (27).
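Checking a laboratory's inventory against a register of misidentified lines can be largely automated. The Python sketch below is a minimal illustration under stated assumptions: the register is a hand-entered set containing only the two lines named in this article (HEp-2 and INT 407), the function names are hypothetical, and the name normalization (uppercase, letters and digits only) is a simplification that could produce false hits. A real check would load the current ICLAC database rather than a hard-coded list.

```python
import re

# Hypothetical local extract of a misidentified-cell-line register.
# Only the two lines named in this article are listed (assumption).
MISIDENTIFIED = {"HEp-2", "INT 407"}


def normalize(name):
    """Uppercase the name and drop everything except ASCII letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())


# Map each normalized register name back to its canonical spelling.
_INDEX = {normalize(entry): entry for entry in MISIDENTIFIED}


def flag_misidentified(names):
    """Return {submitted name: register name} for names found in the register."""
    return {n: _INDEX[normalize(n)] for n in names if normalize(n) in _INDEX}


# "LINE-X" is a made-up name that is not in the register.
print(flag_misidentified(["Hep2", "int-407", "LINE-X"]))
```

Here the variant spellings "Hep2" and "int-407" are both flagged, while "LINE-X" is not. A name match only prompts further scrutiny; it does not replace profiling of the cells actually in the flask.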
Many scientists remain unaware or unconvinced of the need to carefully establish and maintain cell cultures, and many do not authenticate their cell lines often enough or at all (8). More worrying is a lack of understanding of how to interpret DNA profiling results. The status quo entails a de facto honor system that assumes all scientists use proper cell culture practices and authenticate their cell lines, as well as a pervasive presumption that misidentified or contaminated cells are a problem “for others” or are inconsequential for the final conclusions. This culture persists because most scientific journals, with few exceptions, do not require authentication as a condition of acceptance of research for publication (8). Merely reporting or attesting that cell lines were authenticated or checked against a database of misidentified or cross-contaminated cell lines is not sufficient. To date, compliance levels and the impact of reporting guidelines to improve study reproducibility have been disappointing (28), but multidisciplinary efforts continue to promote transparency, openness, self-correction, and reproducibility in science reporting (29, 30).
Although expanding the commercial availability of inexpensive assays and fee-for-service providers will help make authentication more universal, a systematic approach with commitment by all key stakeholders that embraces the importance of targeted training and education is needed. At present, there is little or no standardized training on cell-culture best practices and authentication in basic biological research groups, although such training does exist in GLP and GMP labs.
Improving awareness and training
To effect meaningful change, enhancing the reproducibility and translation of biomedical research using best practices for cultured cell lines and authentication must build upon ongoing multi-stakeholder efforts to raise awareness of the issues and solutions (6). The Global Biological Standards Institute (GBSI) #authenticate campaign (www.gbsi.org/authenticate) facilitates this kind of engagement (31). NIH’s proposed Principles and Guidelines for Reporting Preclinical Research, which were developed and are endorsed by many journals and research societies, recommend establishing best practice guidelines for cell lines, such as the need to authenticate cell lines, report the source of the cell lines, and communicate their mycoplasma contamination status (32). More recently, NIH announced clarifications to its expectations of the scientific community regarding the rigor of research proposed in grant applications, as well as additions to the review criteria used to evaluate proposals; these clarifications were developed in an attempt to enhance reproducibility (33). Notably, the latter changes, which will take effect in early 2016, include expectations for authentication of key biological and/or chemical resources, such as cell lines, antibodies, and other biologics.
Additional systematic changes are needed beyond raising awareness, expanding use of reporting guidelines, and revising proposal preparation and review criteria. Development, implementation, and dedicated funding to support targeted training and education are absolutely essential. Toward that end, in 2014 NIH launched an extramural grant initiative, “Training Modules to Enhance Data Reproducibility (R25)” (34).
Understanding existing barriers that prevent implementation of universal cell authentication is central to changing this state of affairs. GBSI conducted an online survey to determine why cell authentication, and the STR standard specifically, are not used more broadly; the results will be shared in 2015. As a central component of a broader educational program to improve the credibility, reproducibility, and translation of life-science research, GBSI is developing an exportable “active learning” training module to reduce cell line misidentification, mislabeling, and contamination errors.
Targeted training, education, and access to reliable and affordable assays are crucial to changing the culture of cell authentication. In conjunction with effective policies and expanded use of standards and best practices for cell culturing and authentication, knowledge of why and how often to perform cell authentication will improve; hundreds of millions of dollars in annual research expenditures will be used more efficiently; and the translation of discoveries from bench to clinical trial to bedside diagnostics and therapies will be accelerated. Considering the billions of dollars spent on cell-based research each year, expanded awareness and adoption of authentication protocols through targeted training is a relatively inexpensive way to considerably increase our annual return on biomedical research investment.
1. C.G. Begley and J.P. Ioannidis, Circ. Res. 116, pp. 116-126 (2015).
2. F.S. Collins and L.A. Tabak, Nature 505, pp. 612-613 (2014).
3. L.P. Freedman et al., PLoS Biol. 13:e1002165 (2015).
4. I. Chalmers and P. Glasziou, Lancet 374, pp. 86-89 (2009).
5. W. Gunn, Nature 505, p. 483 (2014).
6. J.R. Lorsch et al., Science 346, pp. 1452-1453 (2014).
7. J.D. Wrigley et al., Drug Discov. Today 19, pp. 1518-1529 (2014).
8. L.P. Freedman et al., Nat. Meth. 12, pp. 493-497 (2015).
9. D.A. Kniss and T.L. Summerfield, Reprod. Sci. 1933719114522518 [doi] (2014).
10. M. Yu et al., Nature 520, pp. 307-311 (2015).
11. A. Capes-Davis et al., Int. J. Cancer 132, pp. 2510-2519 (2013).
12. S. Rottem et al., “Contamination of Tissue Cultures by Mycoplasmas,” in Biomedical Tissue Culture, L. Ceccherini-Nelli, Ed. (INTECHOPEN, Rijeka, Croatia, DOI: 10.5772/51518, 2012), pp. 35-58.
13. R.J. Geraghty et al., Br. J. Cancer 10.1038/bjc.2014.166 [doi] (2014).
14. P. Hughes et al., Biotechniques 43, pp. 575, 577-578, 581-582 passim (2007).
15. J. Neimark, Science 347, pp. 938-940 (2015).
16. W.G. Dirks and H.G. Drexler, Methods Mol. Biol. 946, pp. 27-38 (2013).
17. N.A. Vasilevsky et al., PeerJ 1:e148 (2013).
18. V. Marx, Nat. Meth. 11, pp. 483-488 (2014).
19. A.O. Olarerin-George and J.B. Hogenesch, bioRxiv, Cold Spring Harbor Laboratory, pp. 1-24 (2014).
20. Y.A. Reid, Methods Mol. Biol. 731, pp. 35-43 (2011).
21. ATCC/SDO. ASN-0002: Authentication of Human Cell Lines: Standardization of STR Profiling, ATCC-Standards Development Organization (SDO), Manassas, Virginia, USA (2012).
22. F. Castro et al., Int. J. Cancer 132, pp. 308-314 (2013).
23. L.K. Riley and B.A. Bauer, “Authentication and Quality Control Guidelines for Human Transplantable Tumors and Cell Lines,” presentation at FELASA 2013 (Barcelona, Spain, 2013).
24. G.C. Buehring et al., In Vitro Cell Dev. Biol. Anim. 40, pp. 211-215 (2004).
25. Editors of Nature, “An update on data reporting standards,” Nat. Cell Biol. 16, p. 385 (2014).
26. ICLAC, “Database of Cross-Contaminated or Misidentified Cell Lines,” http://iclac.org/databases/cross-contaminations/, accessed Aug. 12, 2015.
27. Sigma-Aldrich, “The Second Annual State of Translational Research: 2014 Survey Report,” pp. 1-32 (St. Louis, Missouri, USA, 2014).
28. D. Baker et al., PLoS Biol. 12:e1001756 (2014).
29. B.A. Nosek et al., Science 348, pp. 1422-1425 (2015).
30. B. Alberts et al., Science 348, pp. 1420-1422 (2015).
31. GBSI, “#authenticate,” www.gbsi.org/authenticate, accessed Aug. 12, 2015.
32. NIH, “Principles and Guidelines for Reporting Preclinical Research,” accessed Aug. 12, 2015.
33. NIH, “Enhancing Reproducibility through Rigor and Transparency,” accessed Aug. 12, 2015.
34. NIH, “Training Modules to Enhance Data Reproducibility (R25),” accessed Aug. 12, 2015.
About the Authors
Leonard P. Freedman is president, Global Biological Standards Institute; Mark C. Gibson is policy research analyst, Global Biological Standards Institute; and Richard M. Neve is senior research scientist, Gilead Sciences, Inc.