Teaching Operant Conditioning Principles via Virtual Reality and In-Class Demonstrations

Students often find that learning about basic operant conditioning principles can be challenging. As operant conditioning is the foundation for understanding all basic and applied experiments as well as practical applications of behavior analysis, it is essential that students master this content.

Operant conditioning principles include the three-term contingency, shaping by the method of successive approximations, positive reinforcement, negative reinforcement, operant extinction, and operant discrimination. The three-term contingency provides a framework for explaining behavior with respect to the antecedent stimuli, behavior, and consequences (McSweeney & Murphy, 2017; but see Killeen & Jacobs, 2017 for a discussion of adding a fourth term that represents the organism). In the simplest terms, shaping by the method of successive approximations is a way to produce new behavior that involves identifying the final form of the target behavior and reinforcing topographies that are closer and closer to the final form while withholding reinforcers for more distant topographies. In positive reinforcement, a discriminative stimulus indicates that emitting the target behavior (from a specific response class) will produce a reinforcer. The effect of this contingency is that the target behavior will be more likely to occur in the presence of the discriminative stimulus in the future. The same is true for negative reinforcement, except that an aversive stimulus is removed (escape) or prevented (avoidance) contingent upon the target behavior. In operant extinction, an antecedent stimulus indicates that an experimenter-programmed consequence will no longer be delivered for emitting the target response. The main behavioral outcome of extinction is that the target behavior will be less likely to occur in the future. In operant discrimination, a different response (e.g., pressing a left versus right lever, pressing a lever at a high versus low rate, or pressing a lever versus not pressing a lever) is required in the presence of different antecedent stimuli. We previously gave suggestions about how to teach the reinforcement contingencies (Swisher, 2021, August 1), but instructors can do more than provide examples in lecture to help students understand these concepts.

Because students will be designing experiments or programs that use these principles as professional behavior analysts (or psychologists), it is beneficial for students to experience how these principles function in the classroom before attempting to apply them. Some common methods of providing experience with operant conditioning principles are (from least to most intensive) in-class demonstrations with people, virtual reality software, and live nonhuman animal laboratories. Because so few live nonhuman animal laboratory components remain in university courses, they will not be described here (but see Sundberg, 1985 for an excellent example of students teaching pigeons to tact in a course laboratory).

https://www.pexels.com/photo/people-laughing-while-in-a-meeting-6914348/
[2] Image provided courtesy of Tima Miroschnichenko under Pexels license

In-Class Demonstrations

The Shaping Game. Students can practice some operant conditioning principles, such as shaping by the method of successive approximations, in the classroom with very little (or no) additional equipment. Morgan (1974) described how the shaping game can be used by dividing students into different teams, selecting a task, and reinforcing successive approximations to the task with a clicker (or clapping). The students make a list of behaviors/tasks and attempt to produce in one of their teammates a behavior/task indicated by the other team. Each team member should have an opportunity to be a shaper as well as the person whose behavior is being shaped (i.e., learner). In addition to positive reinforcement on a continuous reinforcement schedule, the shaping game can be used to simulate other schedules of reinforcement (Ferster & Skinner, 1957), operant extinction, and the effects of positive punishment if the click serves as an aversive stimulus. Superstitious behavior can also be generated if clicks are delivered on a time-based schedule irrespective of behavior. Additionally, Keenan and Dillenburger (2000) suggested some changes to the shaping game to demonstrate other operant conditioning principles and to generate meaningful class discussions.
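To make the difference between response-contingent and time-based click delivery concrete, here is a minimal Python sketch. The function names, the response times, and the schedule values are all hypothetical and chosen only for illustration; they do not come from Morgan (1974) or Ferster and Skinner (1957).

```python
def fixed_ratio_clicks(response_times, ratio):
    """Times of clicks when every `ratio`-th response is reinforced.

    A ratio of 1 is continuous reinforcement: every response earns a click.
    """
    return [t for i, t in enumerate(response_times, start=1) if i % ratio == 0]


def fixed_time_clicks(session_seconds, interval):
    """Times of clicks delivered every `interval` seconds, irrespective of behavior."""
    return list(range(interval, session_seconds + 1, interval))


# Hypothetical times (in seconds) at which the learner emitted the target response
responses = [2, 3, 7, 8, 9, 15, 21, 22]

print(fixed_ratio_clicks(responses, 1))  # continuous reinforcement
print(fixed_ratio_clicks(responses, 3))  # every third response: [7, 15]
print(fixed_time_clicks(30, 10))         # time-based: [10, 20, 30]
```

Under the time-based schedule, the clicks arrive at 10, 20, and 30 seconds no matter what the learner is doing at those moments, which is the arrangement under which superstitious behavior may emerge.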

For a simple example of a behavior chain that can be shaped during the shaping game, the learner may start out in a standing position away from the chairs in the room. As the learner leans, looks, or moves in the direction of a chair, the shaper can provide a click (or clap). If the click (or clap) functions as a conditioned reinforcer for the learner, then there should be more leaning, looking, or moving in the same direction. Movements that bring the learner closer and closer to the chair should be reinforced, and earlier approximations (e.g., looking) should be extinguished by no longer following them with clicks. Once the learner is next to the chair, the shaper could reinforce leaning or bending down toward the chair until part of the body touches the chair. Eventually, the learner should sit in the chair and earn a click for that target behavior.
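The chair example above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the learner is reduced to a distance from the chair, each click is assumed to raise the probability of moving toward the chair by a fixed amount, and the function name and all parameter values are hypothetical.

```python
import random


def shape_to_chair(start_distance=10, max_steps=500, seed=0):
    """Toy shaping session: click only when the learner beats the current criterion."""
    rng = random.Random(seed)
    distance = start_distance   # learner's distance from the chair (arbitrary units)
    criterion = start_distance  # closest approximation reinforced so far
    p_toward = 0.5              # assumed probability of moving toward the chair
    clicks = 0
    for step in range(1, max_steps + 1):
        move = -1 if rng.random() < p_toward else 1
        distance = max(0, min(start_distance, distance + move))
        if distance < criterion:
            # New closest approximation: click, then tighten the criterion
            clicks += 1
            criterion = distance
            # Assumption: each click makes movement toward the chair more likely
            p_toward = min(0.95, p_toward + 0.1)
        # Movements that do not beat the criterion earn no click (extinction)
        if distance == 0:
            return {"steps": step, "clicks": clicks, "reached_chair": True}
    return {"steps": max_steps, "clicks": clicks, "reached_chair": False}


print(shape_to_chair())
```

Because the criterion only ever tightens, an approximation that earned a click earlier in the session no longer earns one later, which mirrors the shaper withholding clicks for looking once the learner has started walking.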

PORTL. Hunter and Rosales-Ruiz (2019, December 25) previously wrote a blog post about their Portable Operant Research and Teaching Lab (PORTL) one-on-one tabletop game to teach students the principles of operant conditioning (e.g., shaping, discrimination, generalization, chaining, extinction, reinforcement schedules, and concepts; see Goodhue et al., 2019 for further explanation). Similar to the shaping game, Hunter and Rosales-Ruiz (in press) emphasize providing the student with a basic laboratory experience via PORTL. That is, the student who has the role of teacher (or shaper) in PORTL can only communicate with the learner (or person whose behavior is shaped) via the clicker and small blocks, which serve as conditioned reinforcers. When attempting to shape lever pressing in a rat or key pecking in a pigeon, the teacher cannot prompt, model, or tell the rat or pigeon what to do. However, the teacher can use a handswitch to deliver pellets or mixed grain to reinforce behavior, just as a teacher can use the clicker and small blocks (or tokens) in PORTL to reinforce behavior. (The blocks aren’t true tokens because the blocks aren’t used as part of a token economy in which they would be exchangeable for other items.) Aside from generating the target behavior, the focus of PORTL is to help students design and implement successful teaching protocols with various stimuli (e.g., household items such as a toy car, dice, plastic bricks, paperclips, buttons).

Using the same example of a behavior chain in PORTL, the teacher would use the clicker when the learner moved toward the chair and deliver a token to the learner’s hand, which the learner would drop in a dish. Requiring the learner to “consume” the token is a way to interrupt the free operant behavior in the human laboratory just as the sequence of behaviors is interrupted by reinforcer deliveries in the nonhuman animal laboratory. Once the learner engages in the target behavior, the learner and teacher can discuss the contingencies and the rule-governed behavior that the learner suspected was in effect throughout the experiment. The teacher can use the learner’s information to refine future teaching protocols and produce the intended stimulus control.

With any version of the shaping game in the classroom, the learner needs to move around for the teacher to shape the learner’s behavior. When a learner stands motionless in front of the class, there won’t be any topographies for the teacher to select and build upon. The teacher will also need to reinforce topographies early on that do not look like the final topography.

https://www.pexels.com/photo/a-white-mice-in-close-up-shot-15106525/
[3] Image provided courtesy of Nikolett Emmert under Pexels license

Virtual Reality Software

When students don’t have access to living rats, pigeons, or other experimental subjects, they can participate in an analog nonhuman animal laboratory through virtual reality software. Programs other than the two described below can model similar effects (e.g., Behavior on a Disk and The Box; see Graf, 1995), but those programs won’t be discussed here.

Sniffy the Rat. Graham et al. (1994) designed the Sniffy program for introductory psychology students to learn about operant conditioning procedures such as shaping and partial reinforcement as well as Pavlovian (or classical or respondent) conditioning procedures such as overshadowing and blocking. Sniffy’s behavior is based on images of real rat behavior, but the learning curves and steady-state schedule behaviors displayed in the cumulative record don’t necessarily map onto any one rat’s behavior. Sniffy is an albino rat avatar with a fixed number of responses. Students can select when to deliver food contingent upon Sniffy’s behavior. Students will quickly learn that immediately delivered reinforcers are more effective than delayed reinforcers when shaping behavior. As in a real nonhuman animal laboratory, Sniffy can be magazine trained to learn that the click sound of a food hopper (or magazine) operating indicates the availability of food. Typically, an operant chamber also has a hopper (or magazine) light that comes on when food is available, which, like the sound, serves as a conditioned reinforcer. Food is a primary reinforcer that does not require prior learning to be effective. When Sniffy hears the food pellet in the hopper, it should immediately walk over to eat the pellet out of the hopper. Once Sniffy is reliably magazine trained, students can shape lever pressing on a continuous reinforcement schedule and then explore more complex sequences of behavior in operant conditioning, Pavlovian conditioning, or both simultaneously, as in the conditioned emotional response procedure (for example, see Jakubow, 2007).

CyberRat. Ray (2011/2012) designed the CyberRat program, which is based on interbehavioral psychology and general systems analysis, for psychology students to learn about operant conditioning procedures. Real albino rats in operant chambers were recorded, and those 1,800 video clips are spliced together to give a student feedback about the decisions that are made in training. Ray asserts that the virtual simulations of rat behavior are indistinguishable from a real rat’s behavior based on structural, functional, and operations analyses. The cumulative records generated by CyberRat are authentic and change depending upon differences in establishing operations, number of reinforcers delivered, number of bar presses, rate of responding, and time on the schedule. Ray and colleagues accomplished this by using published parameters for operant and respondent conditioning procedures with rats or by conducting research with real rats to establish the missing parameters. Therefore, students should have the same experience with CyberRat as they would with a living rat in the laboratory. When using CyberRat, students can conduct experiments on reinforcement, satiation, extinction, schedules of reinforcement, operant discrimination, and additional procedures.
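For readers unfamiliar with cumulative records, the following Python sketch shows how one can be built from a list of bar-press times. It is a generic illustration, not a description of how CyberRat or Sniffy computes its records; the function names and the press times are hypothetical.

```python
from bisect import bisect_right


def cumulative_record(response_times, session_seconds, bin_seconds=10):
    """Cumulative number of responses at the end of each time bin."""
    times = sorted(response_times)
    bin_ends = range(bin_seconds, session_seconds + 1, bin_seconds)
    return [bisect_right(times, end) for end in bin_ends]


def response_rate(response_times, session_seconds):
    """Overall rate of responding in responses per minute."""
    return 60 * len(response_times) / session_seconds


# Hypothetical bar-press times (in seconds) from a 60-second session
presses = [3, 8, 12, 14, 19, 25, 26, 27, 41, 58]

print(cumulative_record(presses, 60))  # [2, 5, 8, 8, 9, 10]
print(response_rate(presses, 60))      # 10.0
```

A steep rise in the record (here, the first 30 seconds) reflects a high rate of responding, and a flat stretch (30 to 40 seconds) reflects a pause.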

In deciding whether to use in-class demonstrations with human participants or virtual reality software with rat avatars, it’s helpful to know how students will respond to those activities. Moreover, it is essential that these experiences provide some measurable improvement in learning. Virtual reality software seems to provide a benefit to learning beyond lecture alone. Venneman and Knowles (2005) assessed student performance in an upper-level learning course. All students received lectures on operant conditioning principles. In addition to the lectures, one group of students completed exercises with Sniffy the Virtual Rat, a second group of students studied their materials for 2 hours, and a third group received no additional activities or study time. Venneman and Knowles found that students who completed Sniffy exercises performed better on an exam covering operant conditioning than students in the same course who studied for an extra 2 hours or students who only had the operant conditioning lecture.

These benefits in learning aren’t restricted to virtual reality software and can occur with in-class demonstrations of conditioning. Lewis (2015) compared exam performance in an upper-division learning course for students who completed either a Sniffy exercise or an in-class exercise with human participants on respondent conditioning for one assignment and either a Sniffy exercise or an in-class exercise on operant conditioning for a second assignment. She found that students performed similarly on the assignments when averaged across a semester regardless of whether students completed the virtual or in-class demonstration of conditioning. Students also earned similar exam grades regardless of how they completed the assignment that demonstrated respondent (or operant) conditioning principles. However, students reported that they enjoyed the in-class demonstrations and real-world examples more than the Sniffy exercises and laboratory-based examples. Part of this preference seems to be driven by the fact that students were unfamiliar with the Sniffy program and required more instructions than with the in-class demonstrations. While it is important to provide students with a demonstration to help them learn about operant (and respondent) conditioning principles, instructors can incorporate either virtual reality software or in-class human demonstrations of those principles because, in Lewis’s comparison, the two produced similar assignment and exam performance.

Image Credits:

[1] Cover image provided courtesy of Zen Chung under Pexels license

[2] Image provided courtesy of Tima Miroschnichenko under Pexels license

[3] Image provided courtesy of Nikolett Emmert under Pexels license

References

Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. Appleton-Century-Crofts. https://doi.org/10.1037/10627-000 

Goodhue, R. J., Liu, S. C., & Cihon, T. M. (2019). Incorporating the Portable Operant Research and Teaching Laboratory into undergraduate introduction to behavior analysis courses. Journal of Behavioral Education, 28, 517-541. https://doi.org/10.1007/s10864-019-09323-y

Graf, S. A. (1995). Three nice labs, no real rats: A review of three operant laboratory simulations. The Behavior Analyst, 18, 301-306. https://doi.org/10.1007/BF03392717 

Graham, J., Alloway, T., & Krames, L. (1994). Sniffy, the virtual rat: Simulated operant conditioning. Behavior Research Methods, Instruments, & Computers, 26, 134-141. https://doi.org/10.3758/BF03204606 

Hunter, M., & Rosales-Ruiz, J. (2019, December 25). Shaping students to be better shapers. Behaviorally Educated, ABAI. https://science.abainternational.org/2019/12/25/shaping-students-to-be-better-shapers/ 

Hunter, M. E., & Rosales-Ruiz, J. (in press). The PORTL laboratory. Perspectives on Behavior Science. https://doi.org/10.1007/s40614-023-00369-y 

Jakubow, J. J. (2007). Review of the book Sniffy the Virtual Rat Pro Version 2.0. Journal of the Experimental Analysis of Behavior, 87, 317-323. https://doi.org/10.1901/jeab.2007.07-06 

Keenan, M., & Dillenburger, K. (2000). Images of behavior analysis: The shaping game and the behavioral stream. Behavior and Social Issues, 10, 19-38. https://doi.org/10.5210/bsi.v10i0.132 

Killeen, P. R., & Jacobs, K. W. (2017). Coal is not black, snow is not white, food is not a reinforcer: The roles of affordances and dispositions in the analysis of behavior. The Behavior Analyst, 40, 17-38. https://doi.org/10.1007/s40614-016-0080-7 

Lewis, J. L. (2015). A comparison between two different activities for teaching learning principles: Virtual animal labs versus human demonstrations. Scholarship of Teaching and Learning in Psychology, 1, 182-188. https://doi.org/10.1037/stl0000013

McSweeney, F. K., & Murphy, E. S. (2017). Understanding operant behavior: Still experimental analysis of the three-term contingency. The Behavior Analyst, 40, 39-47. https://doi.org/10.1007/s40614-017-0088-7 

Morgan, W. G. (1974). The shaping game: A teaching technique. Behavior Therapy, 5, 271-272. https://doi.org/10.1016/S0005-7894(74)80144-9

Ray, R. D. (2011/2012). CyberRat, interbehavioral systems analysis, and a “Turing Test” trilogy. Behavior and Philosophy, 39/40, 203-301. https://www.jstor.org/stable/10.2307/behaphil.39-40.203 

Sundberg, M. L. (1985). Teaching verbal behavior to pigeons. The Analysis of Verbal Behavior, 3, 11-18. https://doi.org/10.1007/BF03392804 

Swisher, M. (2021, August 1). Suggestions for teaching the reinforcement contingencies. Behaviorally Educated, ABAI. https://science.abainternational.org/2021/08/01/suggestions-for-teaching-the-reinforcement-contingencies/ 

Venneman, S. S., & Knowles, L. R. (2005). Sniffing out efficacy: Sniffy lite, a virtual animal lab. Teaching of Psychology, 32, 66-68. https://doi.org/10.1207/s15328023top3201_13