Forgotten History of Behavior Science (#4): “Porn,” Plungers, and Persistence at the Dawn of Human Behavioral Research

A Peek Inside the Groundbreaking Metropolitan State Hospital Grant Project

Laboratory logo from Annual Technical Report 3 (1955-1956), Office of Naval Research Contracts N5-ori-07662 and Nonr-1866(18). Wouldn’t this make a great nerdy t-shirt?

In the early 1950s, B.F. Skinner helped to secure grant funding (“An experimental analysis of psychotic behavior”) to open a human operant laboratory inside Metropolitan State Hospital in Massachusetts. This may have been the world’s first systematic free-operant research program conducted with humans. Although earlier studies had demonstrated operant effects in people, when the grant was funded “operant research” for the most part meant “animal research.” For context, remember that the wildly speculative Walden Two, which had come out only a few years earlier, in 1948, was criticized in some circles for having no human research to back it up [As one reviewer put it, “It is a long way from ‘rat psychology’ to either child psychology or human social and political psychology”]. When the grant began, the equally speculative Science and Human Behavior had not yet been published.

Ogden Lindsley

Ogden Lindsley bravely agreed to head up the grant, and he faced a daunting task on two fronts. First, all of the existing procedures of operant research had been standardized on nonhumans, so he had to find out what would and wouldn’t work with humans. Second, the participants would be psychiatric patients, not typical individuals, and nobody quite knew the implications of that. With these uncertainties in mind, it’s interesting to take a look under the hood of the project, which we can do because, shortly before he passed away, Murray Sidman sent me a stack of Lindsley’s unpublished grant reports (since composing this post, I’ve learned that reports from 1955-1958 are available in digital form at the Beatrice Barrett Initiative).

Only some of what’s mentioned below is presented in Lindsley’s publications. 

Pull This

Much of the grant work, which spanned years and scores of participants, focused on determining viable methods of study. For instance, it was well understood that pigeons will peck keys and rats will press levers, but what kind of operandum would suit humans?

Cross-species translations of methods are not necessarily simple. To illustrate, here’s Lindsley’s account of designing a way to collect responses from beagles. You don’t have to read the whole thing to get the idea that this was quite a challenge — despite the fact that the leap from rat to beagle is presumably much smaller than that from beagle to human.

In designing an operandum for beagle dogs, I … at first tried a lever very much like a rat lever only larger and at an appropriate height. It promptly produced a chopped rather than a smooth, even VI rate, because the dogs chewed the lever at times rather than pressing it. To rule out chewing, I built a panel at the end of a shallow tube. The dogs put one paw into the tube and pressed the panel at the end. This eliminated chewing entirely, but produced an uneven VI rate as the dogs’ toenails got sore and sometimes caught on the edge of the tube. Next, I mounted a square operandum panel hinged at its top at the right side of the chamber. This eliminated both chewing and nail tenderness, but the rate of response was still uneven, because the dogs shifted their weight and their right paw would tire. Next, I mounted the operandum in the middle of the end of the chamber and the dogs alternated either front paw, both paws, or their muzzles. This operandum produced the widest response class so far, and the rate of response was now almost smooth with the VI schedule. Occasionally a dog would paw at the top of the panel, however, and this pawing would not operate the microswitch because the panel was hinged at that point and would not depress. This added occasional chop in the record that should be smooth, so I hinged the operandum panel on internal struts about 24 in. above the panel on the outside of the chamber end to produce the final successful design. The top of the panel and the bottom moved the same distance to trip the microswitch response counter, and right front paw, left front paw, and muzzle presses were recorded at the top as well as at the bottom and sides of the panel. Now the 1-min VI schedule produced a smooth, even response rate over a period of several hours, and we finally had an appropriate dog operandum that gave the beagles freedom to form their responses.

Lindsley originally thought of levers for humans, too…

But when the levers were operated six hours per day at rates up to 10,000 pulls per hour they soon showed signs of wear…. Children were especially destructive, and constantly tore the apparatus apart in attempts to get the candy that was used as reinforcement. We have designed a standard manipulandum constructed of angle iron and half-inch brass rod which remains operative even when struck with chairs.

Lindsley plunger. Photo courtesy of Aubrey Daniels International.

This was the venerable Lindsley plunger, a spring-loaded rod that had to be pulled to register a response. The device proved almost impossible to damage and suited Lindsley’s purposes well.

[Side note: Beyond the lab at Metropolitan, the Lindsley plunger got only sporadic use in human operant studies, and with the advent of digital technology it has become little more than a museum piece (literally; check out the cool online Behavioral Apparatus Museum hosted by Aubrey Daniels International). But it did have its moments, being featured, for example, in the great Dave Schmitt’s pioneering studies of human social behavior and in a lot of behavioral pharmacology research involving destructive beasts like humans and baboons.]

Milk and Cigarettes

With response measurement solved, Lindsley set out to identify reinforcers. Pigeons will work for grain, and rats for food pellets, but what about humans? Food was an obvious option and a vending machine was modified to deliver “jelly beans, corn-candies, gum drops, sour-balls, peanuts, chiclets, M and M’s, small Hershey bars and Tootsie-rolls” contingent on responding. The machine could also deliver cigarettes. Regarding motivating operations, Lindsley commented that:

Deprivation is useful in increasing the reinforcing properties of a stimulus. Thus, depriving a dog of food increases the rate of food-producing responses. In our case the under-financed State Hospital system and poverty-stricken or disinterested relatives had relatively deprived the patients of candy and cigarettes.

Unsurprisingly, Lindsley found that food and cigarettes were reinforcers for most patients.

By around 1954 or 1955 Lindsley also began exploring social reinforcers:

In our search for useful reinforcers we felt that some of the patients might not respond to produce a reinforcer for themselves, but they might respond to produce one for another organism. In non-technical terms, even though they have “guilt,” they might respond to give “charity” or “help” to another organism. Also, it is a common hospital observation that some of the sickest mental patients will not feed themselves, but will not let an animal pet die — they continue to feed it long after they won’t feed themselves.

 

“You’re ugly.”

In one study, patients were seated at a response console with a plexiglass window in front of them at around eye level. The window was normally darkened, but the space behind could be illuminated to reveal a hungry kitten. On an intermittent schedule, the patient’s plunger pulls delivered milk to the kitten. This proved to be an effective reinforcer for many participants, maintaining hundreds of responses per hour (and thereby demonstrating a sort of “empathy”).

However, perhaps because these were psychiatric patients, there was a certain amount of intersubject variability:

Several of the patients talked to the kitten while they were responding, saying things like, “Pretty little kitty. You’re hungry. I’ll get some milk for you” … Two patients swore continuously at the kitten and struck the plexi-glass window with the chair. One paranoid patient kept saying that the kitten was the devil and that it was saying bad things about him.

Ambling Towards Experimental Design

Examining the grant reports also reminds us that standards of experimental design were very different in the 1950s than they are today. At first Lindsley’s interest was simply in whether orderly behavior patterns could be observed in his atypical participants. Consistent with the approach in Ferster and Skinner’s (1957) Schedules of reinforcement, grant reports often simply described behavior under some particular contingency. As in the following graph, the results sometimes took the form of cumulative records. Here we see one patient’s responding on Day 64 under a simple reinforcement schedule.

[Image: cumulative record of one patient’s responding on Day 64]

This is, of course, not an experimental design but rather a snapshot of behavior-in-the-moment. That the pattern of behavior seems relatively stable across the session suggests good control by a reinforcer, but in isolation the cumulative record demonstrates nothing conclusively except that behavior occurred. In the big picture, however, this record was part of a very lengthy acquisition curve. The patient, who was described as “catatonic,” initially responded much less frequently than normal controls under the same contingency, but over the course of 140 one-hour sessions gradually produced higher response rates that came to look more and more typical. As Lindsley observed, foreshadowing the reinforcement-based future of ABA, “this sort of progressive change is in the direction demanded by therapy.” 
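For readers who have never built one: a cumulative record simply plots the running total of responses against time, so the slope at any point is the response rate. Here is a minimal Python sketch of the idea. The function names, the sampling scheme, and the response times are all made up for illustration; none of it comes from Lindsley's reports or his recording equipment.

```python
from bisect import bisect_right


def cumulative_record(response_times, session_length, bin_width):
    """Sample the running response count every `bin_width` time units.

    Returns a list of (time, cumulative_count) points. Hypothetical helper:
    a real cumulative recorder drew this curve continuously on paper.
    """
    times = sorted(response_times)
    n_bins = int(session_length // bin_width)
    points = []
    for i in range(n_bins + 1):
        t = i * bin_width
        # Number of responses that occurred at or before time t.
        points.append((t, bisect_right(times, t)))
    return points


def overall_rate(points):
    """Average slope of the record: responses per unit time."""
    (t0, c0), (t1, c1) = points[0], points[-1]
    return (c1 - c0) / (t1 - t0)


# Invented data: six responses (in seconds) during a 60-second session.
record = cumulative_record([5, 12, 20, 31, 44, 58], session_length=60, bin_width=20)
print(record)                # [(0, 0), (20, 3), (40, 4), (60, 6)]
print(overall_rate(record))  # 0.1 responses per second
```

A steady slope across the session is what "relatively stable" responding looks like in this format; a flat stretch means the person stopped responding.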

IN THE LAND OF BASELINES, THE ONE-EXPERIMENTAL-CONDITION STUDY IS KING

To evaluate reinforcers, Lindsley usually employed a minimal experimental design (A-B): He would first demonstrate that behavior occurred frequently when some consequence was contingent on it, and then show that it became less frequent when the consequence was removed. Below are representative results from two patients working on a variable-interval 1-minute schedule (the reinforcer was feeding the kitten), then extinction.

[Image: response records for two patients under the variable-interval kitten-feeding schedule, then extinction]
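If the schedule jargon is unfamiliar, the contingency can be sketched in a few lines of Python. Under a variable-interval schedule, a reinforcer becomes available after an unpredictable wait, and the first response after that point produces it; under extinction, no response ever does. The code below is an illustration under stated assumptions, not a description of Lindsley's programming equipment: the function name is hypothetical, the data are invented, and I assume that each interval starts timing when the previous reinforcer is delivered.

```python
def reinforced_responses(response_times, intervals):
    """Return the response times that would earn a reinforcer.

    `intervals` lists successive waits (in seconds). Assumption: each wait
    starts when the previous reinforcer is delivered. An empty list models
    extinction, in which no response is ever reinforced.
    """
    reinforced = []
    waits = iter(intervals)
    first_wait = next(waits, None)
    available_at = first_wait  # None means nothing is scheduled
    for t in sorted(response_times):
        if available_at is not None and t >= available_at:
            reinforced.append(t)
            next_wait = next(waits, None)
            available_at = None if next_wait is None else t + next_wait
    return reinforced


# Invented data: one response every 10 seconds for 4 minutes.
responses = list(range(10, 250, 10))

# Condition A: waits of 30, 90, and 60 s average 60 s, i.e. a VI 1-minute schedule.
print(reinforced_responses(responses, [30, 90, 60]))  # [30, 120, 180]

# Condition B: extinction.
print(reinforced_responses(responses, []))            # []
```

Notice that most responses in condition A go unreinforced. That is the nature of an intermittent schedule, and it is one reason responding can persist for a while after the reinforcer is withdrawn.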

Lindsley’s grant research almost never included within-subject condition replications as demanded by that workhorse of contemporary behavioral methods, the reversal design. This at first seems odd because, unlike in many applied situations and in most laboratory work with free-ranging human volunteers, the participants — residential patients — had nowhere else to go. Thus Lindsley had the luxury of studying them for scores of experimental sessions, and there should have been plenty of time to withdraw and reinstate experimental conditions. But that almost never happened.

There’s a simple explanation for this: At the time the grant operated, it was not common in the basic laboratory to replicate conditions according to a strict A-B-A-B reversal protocol. If you doubt me, leaf through the first 10 volumes or so of Journal of the Experimental Analysis of Behavior (1958-1967), and see how many A-B-A-B designs you find. Lindsley was simply mimicking the designs (or lack thereof) used in a lot of the era’s animal research.

That reversals were not foundational in laboratory research of that time may come as a bit of a surprise if you’re familiar with Baer et al.’s (1968) take, from only a few years later, on how ABA differs from the experimental analysis of behavior, which included this caustic appraisal of laboratory work:

An experimenter has achieved an analysis of a behavior when he can exercise control over it. By common laboratory standards, that has meant an ability of the experimenter to turn the behavior on and off, or up and down, at will. Laboratory standards have usually made this control clear by demonstrating it repeatedly, even redundantly, over time. Applied research, as noted before, cannot often approach this arrogantly frequent clarity of being in control of important behaviors. [emphasis added]

In reality, when Baer et al. wrote, the laboratory research strategy based on “arrogantly frequent clarity of… control” was a relatively new invention. The reversal design seems to have gained popularity gradually after the publication of Sidman’s (1960) influential Tactics of scientific research. Formal reversal designs simply weren’t in vogue when Lindsley did his grant work.

Of course, with those treatment-extinction A-B experiments, Lindsley was clearly on the trail of what would later be called reversal designs. And with a focus on reversals comes a concern for when they can actually be achieved. Is it okay ethically, for instance, to un-do therapeutic gains? If I have reduced a child’s self-injury from 60 instances per minute to nearly zero, should I allow the self-injury to return in order to demonstrate that my intervention really caused the elimination? Because Lindsley was doing basic research without explicit therapeutic goals, the grant reports show no evidence of him losing sleep over the ethics of reversals.

MY KINGDOM FOR A MULTIPLE-BASELINE DESIGN

But a parallel worry is whether it’s really possible to reverse all behavior change. Think of the proverbial case of learning to ride a bike, which is so widely thought to be irreversible that in everyday conversation we use the expression “like riding a bike” to describe a behavioral repertoire that is massively resistant to change (cf. Postscript 1). In at least one instance in the grant reports, you can see Lindsley beginning to grapple with this type of reversibility problem.

[Image: Lindsley’s rendering of the experimental chamber]

As shown here in Lindsley’s rendering, a male patient (P23) was seated at a console where he could press buttons to occasionally reveal color photographs of nude women (see Postscript 2). As the graph below shows (“Room 2” function), in an initial condition, contingent presentation of photos appeared to be a potent reinforcer that maintained roughly 4,000 to 5,500 responses per hour. Once this was established, Lindsley did as he had in many other cases: He withdrew the putative reinforcer, such that button pressing no longer produced pictures.

But P23’s response rate didn’t drop as expected. In fact, some 50 hours of session time later, button pressing was still chugging along at thousands of responses per hour. So what gives? Is this the most impressive example of resistance to extinction you’ve ever seen? Or is something else going on — for instance, is there some other reinforcer operating here besides the nude photos?

Lindsley never found out for certain, but he did eventually intuit that a more complex design would be needed to shed light on the situation. After 100 hours of nonstop button pressing, P23 was placed into a different laboratory room where button pressing had never previously been reinforced (“Room 6” function). There was clear discrimination (that is, limited Room 6 button pressing). This shows that button pressing wasn’t automatically reinforcing in any generic sense, but does not explain what the operative variable in Room 2 may have been (photo reinforcement or something else).

Here we can see Lindsley meandering into some of the logic of multiple baseline designs, which, among other things, evaluate change in presumably independent behaviors. Of course, Lindsley fell short of contemporary standards for that design. He did not begin Room 6 testing concurrent with Room 2 testing, and he did not replicate the Room 2 contingency in Room 6 to see if similar behavioral persistence would result. The importance of those steps would not become clear until the multiple baseline design was fully elaborated about a decade later (see Postscript 3 for Baer et al.’s early description of it).

The Big Picture

So there’s your quick and selective tour of the first human operant lab. By now you may be wondering about the point of all of this reminiscing. That’s a fair question, and I have two answers.

On one level I’m simply offering up an entertaining diversion for history buffs. As someone who’s had more than his share of research false starts, I find the uneven progress of science to be inherently fascinating — certainly more engaging than the myth of infallible geniuses forcing Nature to bend to their will. In real research, we all sometimes feel a bit like good old P23, banging away incessantly on something with no certainty that a payoff is ever coming. 

On a more serious level, it’s good to be reminded that science isn’t static. At some point in the past, things we take for granted today had to be figured out from scratch. The Og Lindsley of the 1950s was a cautious investigator who proceeded incrementally — for instance, when the kitten-feeding apparatus was built, an early step was to test whether simply opening the window (to no kitten) could be a reinforcer. If much of his grant research seems methodologically quaint or conceptually uninteresting, we have to remember that, despite the interspecies generality of principle promised in Science and Human Behavior, in empirical terms Lindsley had few precedents to guide his work.  

This was groundbreaking stuff for its time, which should prompt a healthy dose of contemporary humility. Although today we may think we have a singular recipe for methodological success, to future observers our practices may appear primitive. Examining the past reminds us that science is not a product but an evolving process. As Skinner said in The Shaping of a Behaviorist, “Regard no practice as immutable. Change and be ready to change again.”


Postscripts

(1) Turns out learning to ride a bike may actually BE reversible, albeit only under difficult-to-arrange circumstances. 

(2) I haven’t overlooked an additional way in which early science differs from current standards: ethical considerations. Research oversight by Institutional Review Boards wasn’t required until 1974, so it’s a pretty good bet that no formal process of informed consent was used to enter patients into Lindsley’s studies, and the grant reports don’t talk about how patients were enrolled. It’s also unclear if the patients, once engaged in the research, were able to withdraw from a study or to withdraw permission for their data to be used (technically, since the patients had been committed to a psychiatric hospital, their legal guardians would have had to exercise these rights). We also don’t know where those nude photos came from or whether their subjects knew how they were being used. I could go on… but in general let’s grant a little grace given the pre-IRB times in which the research was conducted. 

(3) Baer et al.’s (1968) early description of the multiple baseline design:

An alternative to the reversal technique may be called the “multiple baseline” technique. This alternative may be of particular value when a behavior appears to be irreversible or when reversing the behavior is undesirable. In the multiple-baseline technique, a number of responses are identified and measured over time to provide baselines against which changes can be evaluated. With these baselines established, the experimenter then applies an experimental variable to one of the behaviors, produces a change in it, and perhaps notes little or no change in the other baselines. If so, rather than reversing the just produced change, he instead applies the experimental variable to one of the other, as yet unchanged, responses. If it changes at that point, evidence is accruing that the experimental variable is indeed effective, and that the prior change was not simply a matter of coincidence. The variable then may be applied to still another response, and so on. The experimenter is attempting to show that he has a reliable experimental variable, in that each behavior changes maximally only when the experimental variable is applied to it.
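The logic in that passage can be expressed as a simple check, sketched here in Python. Everything in it is hypothetical: the function name, the decision rule (a flat baseline followed by a shift in mean level), the `min_change` threshold, and the session-by-session numbers are my own illustration, not a method proposed by Baer et al. Real multiple-baseline data are usually judged by visual inspection, not by a rule this crude.

```python
from statistics import mean


def supports_effect(baselines, min_change):
    """Rough check of multiple-baseline logic.

    `baselines` maps a behavior name to (series, onset), where `onset` is the
    index of the first session in which the experimental variable was applied
    to that behavior. Returns True only if every behavior stayed roughly flat
    before its own onset and shifted by at least `min_change` afterward.
    Assumes each series has at least one session before and after onset.
    """
    for series, onset in baselines.values():
        before, after = series[:onset], series[onset:]
        stable = max(before) - min(before) < min_change
        changed = abs(mean(after) - mean(before)) >= min_change
        if not (stable and changed):
            return False
    return True


# Invented data: the variable is applied to behavior A at session 3 and to
# behavior B at session 6. Each changes only when its own turn comes.
staggered = {
    "A": ([2, 3, 2, 9, 10, 9, 10, 9, 10], 3),
    "B": ([3, 2, 3, 3, 2, 3, 10, 11, 10], 6),
}
print(supports_effect(staggered, min_change=5))   # True

# Invented counterexample: B jumps at session 3, before the variable was
# applied to it, so the change in A could be coincidence.
coincident = {
    "A": ([2, 3, 2, 9, 10, 9, 10, 9, 10], 3),
    "B": ([3, 2, 3, 10, 11, 10, 10, 11, 10], 6),
}
print(supports_effect(coincident, min_change=5))  # False
```

The staggered onsets do the work here: because each baseline is still running while the variable is applied elsewhere, an unrelated event that changed everything at once would show up as an unstable baseline.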

1 thought on “Forgotten History of Behavior Science (#4): “Porn,” Plungers, and Persistence at the Dawn of Human Behavioral Research”

  1. The Fixed Interval

    You make a point in postscript 2 about the ethics of these early experiments. It’s difficult to discuss the ethics of another time period, though it brings to mind two conflicting ideas: on the one hand, there were people at this time, working with these populations, that displayed incredible empathy and had behavior that we would judge today as ethical. On the other hand, society at large was not this way, and in fact virtually nobody was willing to assume that these “warehoused” people could change.
