PREVIOUS POSTS IN THIS SERIES
- Pay No Attention to That Self Behind the Curtain, OR the Least Behavioristic Thing About Behavior Analysts
- There Are Too Many Behavior Analysis Journals. I Can Prove It.
ACKNOWLEDGEMENT: NICK BURKEY PROVIDED HELPFUL FEEDBACK ON A DRAFT OF THIS POST.
A few years back, in a paper called “The Dead Man Test: A Preliminary Experimental Analysis,” my co-author (a fascinating woman) and I expressed mock outrage that the measurement heuristic called the Dead Man Test “appears to have been widely embraced without critical evaluation.” We presented an empirical analysis of the Dead Man Test that I was certain even casual readers would recognize as tongue-in-cheek. I therefore thought it was editorial overkill when Behavior Analysis in Practice, which was good-naturedly up for publishing the article, asked us to attach a postscript explaining that the article was satirical.
And then, shortly after publication, someone posted disgustedly on social media: “Demonstrating the dead man test works by using 3 mummies in a museum…. Apparently 2 geniuses have done it.” Could this person have taken the paper seriously? Impossible. But then came a series of emails from other behavior analysts, all saying something to the effect of, “Thank you, Dr. Critchfield, for opening my eyes. I’m so embarrassed for having accepted the Dead Man Test without experimental verification.” At first I thought this had to be good-natured, in-kind sarcasm. But there were enough of these messages that I’m now convinced they were unironic. And there was a lot more — see Postscript 3 for details.
This experience got me to thinking about other instances where a non-serious message was taken seriously. I’m eventually going to get around to a modest proposal for our field, but getting there won’t be a straight line, so bear with me.
Gentlemen Prefer Blondes
Back in 1997, neuroscientist V.S. Ramachandran was growing irritated about the field of evolutionary psychology:
The distinction between fact and fiction gets more easily blurred in evolutionary psychology than in any other discipline, a problem that is exacerbated by the fact that most “ev-psych” explanations are completely untestable… One afternoon, in a whimsical mood, I sat down and wrote a spoof of evolutionary psychology just to annoy my colleagues in that field. I wanted to see how far one could go in conjuring up completely arbitrary, ad hoc, untestable evolutionary explanations for aspects of human behavior that most people would regard as ‘cultural’ in origin. The result was a satire titled “Why do gentlemen prefer blondes?”
The core proposition was that natural selection has favored human males who favor blonde female mates. Wearing his writer’s poker face, Ramachandran provided three very serious-sounding just-so stories that purported to explain the evolutionary advantages conferred by men preferring blondes. For example:
Several authors have suggested that certain florid displays of secondary sexual characteristics – such as the peacock’s tail or the rooster’s bright-red wattles – may serve the purpose of ‘informing’ the female that the suitor is healthy and free of dermal parasites. I suggest that being blonde, or light-skinned, serves a similar purpose… Anemia (usually caused by intestinal parasites), cyanosis (a sign of heart disease), jaundice (liver disease) and skin infection are much easier to detect in fair-skinned individuals than in brunettes…. There must have been a considerable selection pressure for the early detection of anemia in a nubile young woman, since anemia can interfere substantially with fertility, pregnancy, and the birth of a healthy child.
A second evolutionary explanation was that, because men prefer fertile partners, fertility is age-dependent, and signs of aging are often obvious in the skin, men prefer blondes, in whose pale skin the signs of aging are especially easy to see. A third explanation: “Certain external signs of sexual interest – such as social embarrassment and blushing – as well as sexual arousal (e.g. the ‘flush’ of orgasm) would be difficult to detect in dark-skinned women; so that the likelihood that one’s courtship gesture will be reciprocated and consummated can be predicted with greater confidence when courting blondes.”
Of course there was no evidence for any of these hypotheses; in fact, as is often the case with evolutionary explanations, definitive evidence is not even possible because the conditions of ancient natural selection cannot be observed or recreated. Illustrating Ramachandran’s point about uncritical acceptance of hypotheses in evolutionary psychology, however, a journal took the paper seriously enough to publish it in 1997. And, according to Google Scholar, that paper has been cited, mostly unironically, 41 times. As best I can tell, a follow-up paper, “The evolutionary psychology of jealousy and envy,” is also an elaborate prank, and it has been cited 37 times. I’m embarrassed to say that a bunch of serious papers I’ve published were cited fewer times.
The Missing Link
This is far from the only instance where scientists embraced something with insufficient evidentiary support. You might be familiar with Piltdown Man, a skull “discovered” by Charles Dawson in 1912 and introduced as both the “Earliest Englishman” and a “missing link” species intermediate between apes and humans. Despite the skull being a fairly obvious forgery (a shoddy mash-up of modern bones from three species), some archaeologists embraced it as critical proof of evolutionary theory.
But that is not all. In a very meta twist to the story, some other archaeologists suspected from the outset that Piltdown Man was bogus. They thought so because Dawson had a long history of creating paleontological forgeries, including my favorite, a ridiculous fossilized toad supposedly found entombed inside a geode. In an attempt to highlight the absurdity of Piltdown Man, one of the skeptics apparently placed a second hoax in the English gravel pit where the original “fossil” was said to have been discovered. Says Wikipedia:
Professor Adrian Lister… has said that “some people have suggested” that there may also have been a second “fraudster” seeking to use outrageous fraud in the hope of anonymously exposing the original frauds… The piece nicknamed the “cricket bat” (a fossilised elephant bone) was such a crudely forged “early tool” that it may have been planted to cast doubt upon the other finds, the “Earliest Englishman” in effect being recovered with the earliest evidence for the game of cricket. This seems to have been part of a wider attempt, by disaffected members of the Sussex archaeological community, to expose Dawson’s activities…. Nevertheless, the “cricket bat” was accepted at the time. (italics added)
Lost in the Confirmation Bias
For the record, I would love to have dreamed up the “earliest evidence for the game of cricket.”
But the “cricket bat” and “gentlemen prefer blondes” episodes highlight something very serious: the confirmation bias. This is a tendency to notice and overvalue “evidence” that conforms to already-held beliefs, coupled with a tendency to ignore and undervalue disconfirmatory evidence. As L.L. Thurstone wrote way back in 1924:
If we have nothing personally at stake in a dispute between people who are strangers to us, we are remarkably intelligent about weighing the evidence and in reaching a rational conclusion. We can be convinced in favor of either of the fighting parties on the basis of good evidence. But let the fight be our own, or let our own friends, relatives, fraternity brothers, be parties to the fight, and we lose our ability to see any other side of the issue than our own. The more urgent the impulse, or the closer it comes to the maintenance of our own selves, the more difficult it becomes to be rational and intelligent.
You have certainly heard of the confirmation bias but, to appreciate its power and scope, it’s worth checking out a fabulous review of the phenomenon. For present purposes, the scary thing is that, as Walter Schumm has observed, scientists can be just as susceptible to the confirmation bias as anyone else: “While science is presumably objective, scholars are humans, with subjective biases. Those biases can lead to distortions in how they develop and use scientific theory.” When archaeologists embraced Piltdown Man and the “cricket bat,” they uncritically accepted shaky supporting evidence about which they should have been skeptical. When evolutionary psychologists bought the “Gentlemen prefer blondes” account, they accepted an idea with no supporting evidence at all.
The Confirmation Bias in Behavior Analysis
The confirmation bias may have tripped up scholars in other disciplines, but surely we behavior analysts, with our rigorous science of behavior, are immune. Right? Perhaps, but recall my “Dead Man” fan fail, which suggested that at least somebody out there was willing to believe that a semi-serious scholar (fitting the stereotype of the nebbish with no common sense) would really bother to observe vitality-challenged individuals (Egyptian mummies) for signs of behavior.
In truth, the confirmation bias operates in every science. Sometimes this means accepting evidence that is too good to be true. Sometimes it means ignoring evidence that challenges cherished ideas. But neither case should trouble us too much: science is iterative and eventually self-corrective, because it turns on the preponderance of evidence that accumulates over time. What we should worry about are circumstances that direct our attention away from the very need for evidence, and I think there is one practice in our discipline that creates a special risk of this.
As background to my concern, let’s start with two key points. Point 1 is that behavior analysts love behavior analysis, and they see in it the potential to understand and solve nearly any problem involving behavior. We are fond of saying that behavior analysis can “save the world” — you can even buy mugs and t-shirts proclaiming as much. Point 2 is that it isn’t easy to empirically study all parts of the world that require saving. There are too few behavior analysts and too many parts of the world, and some problems are so big as to defy our single-case experimental designs and modest resources.
All of that “save the world” enthusiasm needs an outlet, and it has found one in what’s affectionately called a “conceptual analysis,” a narrative-form, plausible interpretation of some everyday phenomenon through the lens of fundamental behavior principles. For this type of exercise, B.F. Skinner was Prometheus, producing influential examples like Science and Human Behavior and Verbal Behavior. These works pioneered the practice of using principles derived from operant experiments to “explain” a vast array of individual behaviors and cultural traditions.
Four Pitfalls
Following Skinner’s lead, countless conceptual analyses have been published over the years in behavior analysis books and journals, and you are no doubt familiar with some of them. It seems that the weightier and more complex a behavioral problem, the more likely our discipline is to employ a conceptual analysis instead of an empirical one. I’m not saying conceptual analyses serve no purpose, but they are laden with pitfalls, of which I’ll mention four.
The first pitfall of conceptual analyses derives from the power of narrative. Conceptual analyses are stories, and people (even behavior analysis people) love a good story. Told skillfully, stories exude comedian Stephen Colbert’s concept of truthiness — they feel true, even if there is no objective evidence to directly back them up. For my money, this is just another way of saying “confirmation bias.”
The second pitfall follows from the first: Once you think you know something, you stop trying to find out. Alan Baron and colleagues warned that in this way conceptual analyses can actually discourage much-needed empirical work: “Interpretations … seem to have had limited impact, except to generate more interpretations and, perhaps, to evoke a sense of self-satisfaction with the apparent scope of the explanatory principle.” In other words, a well-crafted conceptual analysis confers the sense of mission accomplished and thus makes us less curious.
The third pitfall of a conceptual analysis is that, although its story line rarely acknowledges this, the same principles can be used to craft a whole host of different interpretations of the same phenomenon. One of my favorite illustrations of the problem: Psychoanalysts have used the same set of Freudian principles to generate at least 13 competing interpretations of why the painter Vincent Van Gogh cut off part of his ear. Maybe you’re bristling at this example because, in comparison to hazy Freudian ideas, the principles of behavior are precise and well-vetted. But the underlying problem is not with principles. It’s with the interpreting, which involves subjective decisions about what the key facets of the problem at hand are and which specific principles apply to them.
Behavior analysts, in fact, are just as capable of disagreeing on interpretation as Freudians. For instance, the bill-passing behavior of the United States Congress has long shown an annual “scalloped” pattern: very low output early in the year, giving way gradually to an end-of-year frenzy. Some behavior analysts think the scalloped pattern of responding results from naturally occurring fixed-interval schedule dynamics. Others, based on the exact same reinforcement schedule literature, flatly reject the notion that fixed-interval dynamics operate on Congress.
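If you haven’t seen a “scallop” plotted, here is a minimal toy sketch of the idea (my own illustration, with an arbitrary made-up response-rate curve, not data or code from any study discussed here): responding that accelerates toward an end-of-interval deadline yields the characteristic scalloped cumulative record.

```python
# Toy illustration (hypothetical numbers, not Congressional data): a response
# rate that starts near zero and accelerates toward a year-end "deadline"
# produces the scalloped cumulative record associated with fixed-interval
# schedules.
import numpy as np
import matplotlib.pyplot as plt

months = np.arange(1, 13)                    # months of a one-year "interval"
rate = np.exp(0.5 * months)                  # arbitrary accelerating response rate
bills_per_month = rate / rate.sum() * 600    # scale to a made-up 600 bills/year
cumulative = np.cumsum(bills_per_month)

plt.step(months, cumulative, where="post")
plt.xlabel("Month of session")
plt.ylabel("Cumulative bills passed (hypothetical)")
plt.title("A fixed-interval-style 'scallop' (toy simulation)")
plt.show()
```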
The fourth, and most important, pitfall concerns falsifiability. Empirical work includes many checks and balances against sloppy thinking (e.g., formal research designs and rigorous rules for data analysis). This is what makes science self-correcting: You can tell when an idea is wrong. With conceptual analyses, however, the path to disconfirmation is not so clear. Conceptual analyses enthusiastically (and selectively) point to apparent supporting “evidence,” but I have rarely seen one (even from B.F. Skinner) that specifies how to tell if it is wrong. Here is a challenge: Dig up a dozen or so conceptual analyses in behavior analysis journals (this won’t be hard). Check to see how many advance testable predictions and also say how to test them. I predict the answer will be “not many.”
All in all, there can be less to conceptual analyses than meets the eye. When we treat conceptual analyses as scientific progress, we create an echo chamber of disciplinary self-congratulation, a sort of shared delusion about the reach and power of our science. Not too long ago, some colleagues and I got a glimpse of this. To help us compile a list illustrating how diverse applied behavioral research is, we polled colleagues for examples of behavior science being applied to a wide variety of problems, and we stressed our interest in research examples. A surprising number of the suggestions we received, however, were not of empirical research, but of conceptual analyses. Some correspondents seemed not to grasp that a conceptual analysis cannot validate a theory or solve a real-world problem.
Now, don’t get me wrong. I believe that conceptual analysis can be a useful first step when tackling a new topic of inquiry. When Skinner wrote Science and Human Behavior, for instance, pretty much all of his topics were new to behavior analysis. In a void, anything that fuels the imagination can be a boon to science, so I’m all in on that. But it’s critical to think of conceptual analyses not as a source of answers but as a source of motivation for doing research: they can suggest that answers are possible, while only empirical work can verify what the answers really are.
A Modest Proposal
As long as behavior analysis journals continue to feature conceptual analyses as they have traditionally been constructed, the confirmation bias will keep operating, and we will keep telling ourselves that we are doing more to solve vital scientific and practical problems than we really are. For this reason, I advance the following two-pronged proposal for updating the policies of our journals.
My first suggestion is for journals to put the brakes on conceptual articles — not to end them entirely, but to give more careful consideration to when they really advance the literature. In my view, a conceptual analysis should be published only when it is a first leap into some potentially fertile new domain of inquiry — and because our discipline’s basic and applied scientific wings have been around a while, such topics should be somewhat rare. Importantly, once a conceptual analysis reveals the potential to investigate something new, it’s time to start investigating. Further narrative interpretation isn’t useless, but it offers questionable archival benefits and should shift to other discussion venues such as blogs, listservs, and so forth.
My second suggestion is for journals to require the conceptual analyses they do publish to explicitly demonstrate their capacity to serve as a lubricant for empirical investigation. Each should state what evidence would support it and what evidence would contradict it, and specify how to get that evidence. Here’s an example of how this can be accomplished. Some years ago, when my students and I conducted our own study of scalloping in Congressional productivity, we reiterated the fixed-interval interpretation, but we didn’t stop there. We were careful to specify four testable predictions based on factors known to influence fixed-interval behavior and three testable predictions derived from competing ideas in political science. We actually performed the necessary tests (final score: reinforcement hypotheses 4, alternative hypotheses 0), but a first-pass conceptual analysis needn’t go this far. I’m only saying that there is no reason why every conceptual analysis shouldn’t (a) emphasize that it is speculative, and (b) explain, transparently, how to find out if it is wrong.
I believe that these two strategies, if implemented, will make conceptual analyses less prevalent in our literature, but the ones that get published will be more useful in stimulating research and less likely to feed the confirmation bias.
Disagree with my position? Itching to write a rebuttal? Use “Leave a Reply” below, or contact me at tscritc@ilstu.edu.
Postscript 1: A Contemporary Iteration of the Conceptual Article
Recent years have seen an explosion of a new variety of conceptual article related to the practice of applied behavior analysis. This type of article explains and defends potentially important values and skills (e.g., cultural responsiveness, soft skills, trauma-informed practice) and describes “best practices.” The relevant discussions are conceptually consistent with a behavioral perspective, but all of the potential pitfalls of conceptual analyses that I mentioned above apply. Often the articles are light on supporting empirical evidence and say little about how the ideas being advanced could be objectively tested. Because these are new topics, we should extend some grace initially and be grateful for the articles’ intellectual adventurousness. At the same time, we owe it to our discipline to grow impatient if the relevant discussions don’t soon begin showcasing hard evidence.
Postscript 2: On Short-Circuiting Cognitive Biases
There are, of course, a whole host of biases in thinking and action, and their causes and solutions are prime targets for a behavioral analysis. A few, like delay discounting, have actually received fruitful attention from behavior analysts, but most have not. With this oversight in mind, check out recent work (here and here) on how insights from Relational Frame Theory can be used to subvert the correspondence bias, which conceptually has a lot in common with the confirmation bias. Also check out some interesting commentary and research by Howard Rachlin on the sunk-cost effect and other biases.
Postscript 3: More On Taking Silly Things Seriously
A different “Dead Man” paper (also tongue-in-cheek, though this one, ironically and embarrassingly enough for me given the present post, was a sort of conceptual article) was published as a Halloween joke in Perspectives on Behavior Science. The journal’s Editor at the time, Don Hantula, recently reminded me that there was a fair amount of blowback about that article. Here’s Don’s recollection:
Several people complained to ABAI about it. A common concern was that the article was “unethical” — some thought you were advocating studying dead people (and digging them up to do so), which led to concerns about informed consent. Others thought the paper may lead people to think ABAI was advocating killing people to study deceased operant behavior. One person understood that the article was satire, but said that the Dead Man Test was a foundational part of ABA and satirizing it would cause people to doubt ABA, which they concluded was also unethical. That person thought both you and I should be formally sanctioned on this basis.
Unfortunately, this episode circles back to this post’s focus, the confirmation bias, in which people uncritically accept “evidence” that fits their preconceptions. The implication: Folks out there, maybe a lot of them, think me capable of digging up bodies (making relevant this putatively satirical article: “Free dead bodies buried underground seemingly for the taking“) or even killing people in order to study them. All I can say to those of you who think I’m bad is: Check this out.