by Arthur Ruiz
A little bit of bad science can go a long way
Historians of science have always appreciated the collaborative efforts of communities of researchers, whose individual efforts can result in huge aggregate advances. For every revolutionary scientist like Isaac Newton or Nikola Tesla, who almost single-handedly changed the direction of their fields, there are numerous unsung scientists who have pushed a given discipline forward by their joint efforts. Newton and a few other Titans have planted entire mountains into the scientific landscape, while many others have resolutely contributed their pebbles. But pebbles can accumulate, and Newton himself was humble enough to observe: “If I have seen further, it is by standing on the shoulders of giants.” Scientists rely on the results and the insights of both predecessors and peers to drive their own studies forward. As contributors, however minute, to the collective edifice known as Science, we should therefore feel a special affront when we hear of an instance of scientific fraud or malfeasance.
An advance or a development that can take a community of researchers a generation to build up can be compromised with a single bad paper. To understand the repercussions of bad science, we need to consider the trajectory that a bad publication can take. After going through some kind of peer-review process, which itself is susceptible to problems of patronage and factionalism, a paper gets published. Scientists in similar fields have their PubCrawler alerts set to specific keywords that deposit the paper right in their email inboxes. You read the paper, and consider it, and re-read it, and criticize it and scrutinize it and do all the other things that scientists love doing with other scientists’ work. Then the fun gears start turning when you fit the new story into the existing web of knowledge, and begin to consider how this could impact some project or interest of your own.
Good science is one of those incremental pebbles of knowledge that fills in a gap in an existing signaling pathway model or informs you of a new stimulus to which a particular transcription factor responds. It leads you towards a better understanding of your own experimental system, in which you didn’t consider the background levels of some elusive expression factor until you read that one key paper. Good science is a “tide that raises all of our boats.” Bad science releases termites onto our boats that chew holes in the hulls, and we don’t even realize they are there until we become waterlogged. Bad science inspires misguided hypotheses, which lead to flawed experiments and perplexing results. “Bad,” or negative, results are simultaneously among the most frustrating and most useful things that can happen. Bad results confound meticulous planning and informed speculation and well-polished ideas, and force you back to the beginning to figure out what went wrong. And while bad results may reflect poor methodology or technique or materials, all too often they reflect the genuine validity of your null hypothesis. No change, no effect, no statistical significance. Your beautiful idea, elegantly modeling some natural process or phenomenon, is moot.
Troubleshooting at the first level involves checking your reagents, re-calculating your dilutions, calibrating your pH meter and pipettes, looking for microbial contamination in your cell cultures – basic technical “noise.” Troubleshooting at the next level involves your experimental design – are you using the right cell type, will it express the right protein, do you have enough samples to show significance? This level can be tempting to ignore while you keep repeating experiments until you can “get it to work.” But even this is not the most fundamental level of troubleshooting, because your hypothesis and your experimental design and the shRNA that you ordered and the clinical population that you are sampling are largely based on the work of other researchers. And here is where Newton’s “shoulders of giants,” which initially seems like such a convenient shortcut, instead becomes a dizzying precipice, because to truly independently verify every scientific assumption that goes into a project would be to endlessly re-invent every wheel in the world. This implicit trust that every scientist must place in their community can turn sour when peer-reviewed publications are revealed to be flawed.
Science is ultimately empirical, but its consensus shifts slowly. Outdated ideas and paradigms can be propped up by institutional biases and jealously clung to by the old guard – august personages who have built entire careers around accepted archetypes. Change takes time, communication, and persuasion. It requires well-established scientists to consider the flaws and failings in their celebrated works. It takes a willingness by grant funding agencies to change their priorities and directions. Most of all, it takes a weight of well-designed and ethically performed experiments and data analysis to convince open-minded scientists that a new model of describing reality is superior to the existing one. Aristotle’s heavenly spheres yielded to Copernicus’ heliocentric system, Lamarck’s ideas of accumulated biological change gave way to Darwin and Mendel, and Newton’s laws of motion were supplanted by Einstein’s (at frames of reference near the speed of light, at least). With apologies to Dr. Martin Luther King: “The arc of the scientific universe is long, but it bends towards truth.”
So how does bad science affect the trajectory of this arc? Part of it depends on how much traction it gets in the first place. Consider the case of “cold fusion.” In the late 1980s, Martin Fleischmann and Stanley Pons performed experiments using a palladium cathode and heavy water to create an electrolytic cell, set up within a calorimeter to measure generated heat. While running the electrolysis reaction, the power input usually matched the measured output, but they reported intermittent spikes of energy that raised the temperature of the cell, yielding a power output much higher than the input. Fleischmann and Pons saw this as confirmation of their hypothesis that due to the “very high compression and mobility of the dissolved species (deuterium ions, D+) there must therefore be a significant number of close collisions and one can pose the question would nuclear fusion of D+ be feasible under these conditions?” They concluded that the excess heat generated could not be explained by chemical reactions, but instead had to be the result of “cold” (i.e. room temperature) fusion of deuterium atoms in their mixture. Their results were peer-reviewed and published in the Journal of Electroanalytical Chemistry in March 1989, resulting in frenzied media attention. The world was ready and eager for a cheap, clean-energy breakthrough – the 1970s OPEC oil embargo was still in people’s living memory, and the Exxon Valdez tanker spill had underscored the environmental hazard of relying on dirty fossil fuels.
Unfortunately, the reality failed to live up to the hype. Attempts to replicate the experiment were met with failure, and significant problems were identified in the experimental setup and monitoring equipment, including the calorimeter and neutron detector used. The field was quick to conclude, and to publicize, that “cold fusion” was a bust, relegating it back to the realm of science fiction.
A similar case of “post-publication peer review” involves a 2014 Nature publication by a RIKEN group in Japan, which described the potentially groundbreaking discovery of induced pluripotency in differentiated murine leukocytes by a simple incubation in a low-pH solution. The stem cell world was excited at the prospect of a straightforward way to generate induced pluripotent stem cells (iPSC), but attempts to replicate the procedure were fruitless, and the work was quickly revealed to contain both experimental mistakes and two instances of intentional fraud. In both the cold fusion and the iPSC examples, the public saw an efficient and laudable engagement of the “post-publication” peer review process at work, and the steadying hand of the scientific community bringing an enthusiastic public back to earth.
Bad science truly wreaks the most havoc when it becomes part of the “accepted knowledge” in a field. Once the weeds are growing alongside the crops, it can be a nightmarish endeavor to extirpate them. One of the best historical examples is that of the Piltdown Man. By the early 20th century, anthropologists had already made discoveries of early human fossils in Europe, including the Neanderthal finds in Germany and Cro-Magnon in France, as well as the discovery of Homo erectus fossils in Java. British scientists were eager to establish an early human presence in the British Isles, to confirm what many at the time saw as the “inherent superiority” of the British, as illustrated by the vast expanse of their Empire and culture. In this atmosphere of impatient anticipation, Charles Dawson and Arthur Smith Woodward presented the fragments of a human skull recovered from the Piltdown gravel pit to a meeting of the Geological Society of London in 1912. The reconstruction of the skull was notable for its striking combination of human and simian features, specifically a juxtaposition of a human-like cranium with an ape-like jaw. The find was quickly celebrated as a critical “missing link” between humans and apes, and despite some contemporary challenges, was almost universally accepted as a legitimate example of human evolution, which had incidentally occurred within England (and flattered British national sensibilities by virtue of having a larger cranium than its Neanderthal and Cro-Magnon “competitors”).
Subsequent anthropological discoveries pointed to the development of human-like jaws before the development of larger brains, with Piltdown remaining a fiercely defended anomaly. But it took almost four decades for Piltdown to be ousted as a fraud – in 1949, Kenneth Oakley performed fluorine testing on the fossils that revealed their age to be closer to 50,000 years than 500,000, and then in 1953 the skull and jawbone were confirmed to have originated from entirely different species. Almost overnight, the forgery, which included chemically stained bones and teeth filed down to look human, became obvious. Piltdown had led anthropologists down a false road of human evolution for more than a generation, created an artificial controversy between “cranium” and “jawbone” primacy, and provided a shocking example to the public of how nominally objective and empirical scientists could embrace an emotionally appealing false narrative.
A more recent example of entrenched bad science, with a still-unfolding impact on medicine and public health, involves the fraudulent research of Andrew Wakefield regarding the putative “link” between vaccines and autism. In 1998, Wakefield, a gastroenterologist at the Royal Free Hospital School of Medicine in London, was the first author of a publication in the Lancet that described a new condition called “autistic enterocolitis.” This was based on 12 children who had developed bowel disease following an MMR vaccination, 8 of whom went on to develop autistic attributes. Without mincing words, the study was a complete farce. Wakefield falsified the children’s medical histories, used blood samples he took from children at his son’s birthday party, and failed to disclose the hundreds of thousands of dollars in “advisory fees” that he had received from a group of lawyers planning to sue vaccine manufacturers. The General Medical Council of Britain found Wakefield guilty of 30 separate charges of misconduct and barred him from practicing medicine. The impact of Wakefield’s study was immediate and widespread – MMR vaccination rates in Britain dropped to as low as 80% by 2004, and the non-medical opt-out rate in the United States remains alarmingly high.
Wakefield’s report, similar to cold fusion a decade earlier, appealed to an aspect of public sentiment – in this case, a reaction to the increasing rates of autism diagnoses and a demand for an explanation. Generation X had not seen first-hand the scourges of smallpox and polio that their parents and grandparents had dealt with, and many felt free to assign blame to a scary, invasive procedure that stuck dangerous-sounding substances into healthy children. Multiple follow-up studies conducted by the CDC, the American Academy of Pediatrics, the US National Academy of Sciences, and other prominent medical research groups all failed to confirm the link between the MMR vaccine and autism.
Most scientists sensed the link was not legitimate; celebrities and internet bloggers, however, decided to weigh in with their opinions about immunology, epidemiology, and early childhood cognitive development. The Lancet finally retracted the bogus publication in 2010, but we are still dealing with the aftermath of Wakefield’s fraud, as diseases like measles and whooping cough emerge from their 20th century tombs. Perhaps more insidious is the damage Wakefield has done to the public’s perception of science. His original publication seemingly confirmed the paranoid views of certain people that doctors and the medical/pharmaceutical industries were intentionally giving children autism as some sort of moneymaking scheme. And we are still fighting a decades-long public battle to formally repudiate his claims and re-establish vaccines as one of the transformative wonders of the modern age. Yes, empirical science triumphed; yes, the unethical fraudster was uncovered; but at what cost?
What happens when a corporate agenda drives bad science? “Doubt is our product” reads an infamous 1969 tobacco industry memo. Despite the emerging medical consensus by the 1950s that there was a causal link between smoking and a host of diseases, the tobacco industry infamously muddied the waters of public debate by hiring its own scientists to produce “studies” casting doubt on the health risks of smoking. It succeeded in slowing the pace of public health disclosures and tobacco regulations for decades. Today we see energy companies and their political associates using the same strategy of “subsidizing controversy” to create the perception of “debate” in the area of climate science. Despite 97% of established climate scientists supporting the contention that anthropogenic climate change (ACC) is a real phenomenon with a significant impact on the planet, anonymous billionaires with economic interests in fossil fuels have donated hundreds of millions of dollars to anti-climate-change groups, whose purpose is to create a “subsidized controversy” around established science. Gutless politicians who want to avoid any substantial action on this front can then claim that “the science isn’t in yet” or “we need to keep studying this issue,” while global CO2 levels creep ever higher. Recently, the state of North Carolina even banned the use of climate-science projections of sea level rise in policy decisions on issues like coastal development and home insurance assessment!
In general, scientists are pretty good at spotting a fraud. If something seems too good to be true, it usually is, and most scientists are level-headed enough to see this. But scientists are not perfect, and either through well-meaning error or malicious fraud, bad science gets through. The containment level at which the bad science is identified and stopped is critical – most errors and mistakes will be caught at the pre-publication level, and even shoddy results that get published are usually scrutinized by an author’s immediate colleagues. But whereas everyone who reads a paper about cold fusion or induced pluripotency will run out to replicate those experiments, what about the “unsexy” results that people read and tend to accept? How many scientists feel the need to generate their own crystal structures of a complicated enzyme-substrate binding intermediate, or to validate the results of a small molecule inhibitor screen? What about all of the unexamined assumptions we accept, that allow us our vantage point on the shoulders of giants? For example, how many cell biology labs bother to test the identities of the cell lines they use? A recent study has estimated that between 18 and 36% of all cell lines are misidentified or contaminated. An online compendium lists 475 misidentified or cross-contaminated lines. A striking example of the consequences: the breast cancer cell line MDA-MB-435 has metastatic properties that have made it a popular (and frequently shared) cancer line since its isolation in the late 1970s, and it has appeared in more than a thousand published papers. About 15 years ago, scientists working with MDA-MB-435 did DNA testing on the cells and were shocked to find that what they had thought were breast cancer cells were, in fact, cells from a melanoma skin-cancer line. The contamination was found to go back decades, severely compromising the results of most, if not all, of the 1000+ papers using the line.
Scientists need to be willing to spend the time and money to comprehensively validate and calibrate their systems, instead of rushing to the “interesting” experiments that will yield data for their next grant submissions. The real danger of bad science is not the splashy results trumpeted in newspaper headlines, which will inevitably be found out, but the quiet results and unexamined assumptions that incrementally turn the gears in a field behind the scenes.
Arthur is working on his Ph.D. at Einstein, studying the interplay of viral and host factors that result in HIV neurocognitive disorders. He grew up in San Diego, California, and earned a B.S. in Biology at UC San Diego. He worked in biotech for a few years in the field of vaccine development, then earned his M.S. in Biology at NYU before coming to Einstein. Besides science, Arthur has interests in history, politics, public policy and social justice.