When and How to Use Exams in the Classroom

There are many methods to assess learner performance in class, and one of the most common methods is to administer examinations (or exams). Learners are typically given multiple-choice, true/false, short answer, or essay question prompts in exams to help them demonstrate what they have learned in a course on a specific topic. Exams can serve as additional learning opportunities as well as a source of evaluation. Learners receive a standard score from graded exams that can be used to compare individual performance to the group’s performance.

How Frequent Should Knowledge Assessments Be?

Learners tend to prefer fewer quizzes, exams, and assignments, and they will tell the instructor who gives more assessments (that’s usually me) that other sections of the same course have fewer. Learners seem to focus on the time spent studying or working on individual assessments and on how those points are allocated to their overall course grade. They aren’t necessarily considering that these assessments exist to motivate studying, to reduce the contribution of any individual assessment to the overall grade (i.e., low-stakes quizzes), to help learners discriminate important content from peripheral content, to give learners practice with the types of questions they might encounter on an exam, and to give learners feedback on their current level of understanding of those topics. Engaging in retrieval practice (i.e., when a learner studies and tests themselves, or the instructor quizzes them, prior to the actual exam in class) improves performance on verbatim questions as well as inference questions (i.e., questions that require combining ideas in a new way) over the long term relative to repeated study or rereading, even though learners do not expect retrieval practice to be beneficial, as Karpicke (2012) and Roediger (2013) point out.

Schedule-induced response patterns occur for many different types of behavior, including studying. When pop quizzes can occur, studying should conform to a variable interval schedule (assuming that quizzes occur after an average amount of time has passed). That is, students should study for some duration daily (or every other day) because the quizzes are unpredictable. The only reliable signal is elapsed time: the longer it has been since the last quiz, the higher the probability of an impending quiz (cf. unsignaled avoidance procedure; Badia et al., 1974; Sidman, 1953). This assumes that studying is maintained by negative reinforcement where avoiding a failing grade is the negative reinforcer. Regularly scheduled quizzes, by contrast, produce more predictable and less frequent studying of course material. Jarmolowicz et al. (2010) found a typical fixed interval scalloped pattern of responding for most learners completing quizzes in an introductory psychology course prior to their weekly deadlines. That is, when the deadline was farther away, fewer learners completed quiz attempts; as the deadline approached, substantially more learners completed quiz attempts.

When frequent quizzes in class may be untenable or otherwise undesirable, instructors can use online Readiness Assessment Tests (RATs) to help learners prepare for the upcoming lecture and to allow the instructor to adjust the lecture material to address content in the readings that was confusing to learners. These RATs are low-stakes assignments with only a few questions per reading. Typically, learners are expected to read a few pages from a textbook or an article prior to attending class to help them understand the lecture content. Heinicke et al. (2017) found that slightly more than half of their learners in a psychology course thought the RATs helped them to understand the material and preferred RATs to standard reading quizzes. However, more learners in the section of the course that used RATs attended class regularly and earned passing grades on unit exams than learners in the section of the course without RATs. Quizzes are simply one way to assess knowledge and promote learning.

Even when learners do not know the answer to a question when they initially encounter it, they are more likely to remember the answer to the question after they are given a chance to study the information. Grimaldi and Karpicke (2012) gave participants a pretest in which cues were presented without their targets (e.g., tide-?) and compared this to a condition in which participants were given the cue and the target without a pretest (e.g., tide–beach). Participants who were given the pretest were unable to produce the targets initially but recalled a higher proportion of those related targets in a cued recall test than participants who were given the list to study without a pretest. Immediate pretests are better for later recall than delayed pretests. In an immediate pretest, participants have the cue (e.g., tide-?) and immediately get the cue and target to study (e.g., tide–beach). In a delayed pretest, participants get all the cues alone (e.g., tide-?, kite-?, jelly-?) and then the cues plus the targets (e.g., tide–beach, kite–wind, jelly–bread). One application of the pretesting effect is to have low- or no-stakes quizzes before a lecture to promote learning the material delivered in lecture.

Image provided courtesy of cottonbro studio under Pexels license

What Are Cumulative and Noncumulative Exams?

Course topics and their assessments can be arranged in several different ways. Instruction on several topics could comprise one unit, and there could be a content test after each unit. The course could end with a content test after the last unit with no repeated topics from previous exams; this would be a noncumulative exam. Alternatively, the final test after the last unit could include questions from topics across the span of the course; this would be a cumulative exam. Typically, course content is cumulative with the next topic building on the foundations of the previous topic(s). For example, understanding how to calculate the mean and standard deviation is important not only for descriptive statistics but also for calculating t-tests and analysis of variance (ANOVA), which reuse the sum of squares and the degrees of freedom (the numerator and denominator, respectively, of the variance, whose square root is the standard deviation). In an introductory statistics course, we generally cover descriptive statistics for the first exam, t-tests for the second exam, and ANOVAs for the third exam. While some material is repeated on successive exams (i.e., learners calculate the mean, sum of squares, and degrees of freedom with ANOVAs), the unit exams are not cumulative given that questions about t-tests will not occur on the third exam. Because course content builds in this way, it makes sense that assessments of related topics would also be cumulative, with questions from not only the current units (or topics) but also from previously covered (and tested) units.
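
To make the statistics example concrete, here is a minimal Python sketch of how the sum of squares and degrees of freedom learned for descriptive statistics reappear in a one-way ANOVA. The function names and the two small sets of scores are hypothetical and chosen only for illustration; they do not come from any course or study discussed here.

```python
import math


def sum_of_squares(scores):
    """Sum of squared deviations from the mean (SS)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def sample_sd(scores):
    """Sample standard deviation: square root of SS / df, with df = n - 1."""
    ss = sum_of_squares(scores)
    df = len(scores) - 1
    return math.sqrt(ss / df)


def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA, built from the same SS and df pieces."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    # Between-groups SS: each group mean's squared distance from the
    # grand mean, weighted by the number of scores in that group.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-groups SS: the descriptive-statistics SS, summed over groups.
    ss_within = sum(sum_of_squares(group) for group in groups)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for two small groups (illustration only).
group_a = [2, 4, 6]
group_b = [5, 7, 9]

print(sample_sd(group_a))                   # 2.0
print(one_way_anova_f([group_a, group_b]))  # 3.375
```

The within-groups term calls the same `sum_of_squares` function used for the standard deviation, which is the sense in which the third unit depends on skills from the first.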

Learners prefer to study content for unit exams rather than cumulative exams (cf. Lawrence, 2013) because the list of possible topics that could appear on the unit exams is more constrained than for cumulative exams. Learners in higher education, at least, feel that their study efforts were wasted if they learned material that was not assessed on an exam. According to Petrowsky (as cited in Khanna et al., 2013), learners studied the material in previously covered units longer and had an additional opportunity to learn material from previous units when they had cumulative exams. Assuming that the instructor wants learners to retain all the skills and knowledge acquired throughout the course, a cumulative exam is a way to communicate the importance of this material to learners and assess their retention of core information.

Image provided courtesy of Keira Burton under Pexels license

Are Cumulative or Noncumulative Exams Better for Learning?

There aren’t many studies in which cumulative and noncumulative exams and their respective learning outcomes are compared. Despite this apparent dearth of studies, the researchers who have addressed this question indicate that cumulative exams are better for acquisition and retention of course content than noncumulative exams.

Khanna et al. (2013) described the differential effect of cumulative versus noncumulative final exams across several introductory psychology courses in each content area (i.e., social, cognitive, and abnormal) on performance on a departmental content exam. Learners completed either a cumulative final exam or a noncumulative final exam (depending upon instructor) at the end of the course. Then, learners completed a content exam for the department as a measure of teaching effectiveness; performance on this exam had no consequences for learners. Learners who completed cumulative final exams in their courses had 10-13% higher scores on the immediate departmental content exam than learners who completed noncumulative exams in their courses. Furthermore, learners maintained this benefit of knowledge retention from cumulative final exams relative to noncumulative exams on the departmental content exam up to 18 months after completing a course. Cumulative final exams seem to be especially important for learners in introductory psychology courses compared to more advanced psychology courses because learners who have not taken many psychology courses still have much to learn.

Lawrence (2013) taught two different sections of an introductory psychology course and gave learners four exams total. In one section, learners completed three noncumulative (or unit) exams plus a cumulative final exam. In the other section, learners completed four cumulative exams. On the cumulative midterm exams, 10 questions (or 20%) covered material from previous units, and 40 questions (or 80%) covered new material. The final cumulative exam contained more (55%) material from previous units. Learners were given a follow-up test 2 months after the course ended. Lower-performing learners performed better on cumulative exams than noncumulative exams. Higher-performing learners also performed well on cumulative exams. The benefit of learning from cumulative exams also extended to chapter quizzes for the learners with lower scores. It seems that cumulative exams are especially helpful for learners who may be otherwise struggling to understand course content.
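
For instructors who build exams from a question bank, the 20%/80% split described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `assemble_cumulative_exam`, the question pools, and the random selection are hypothetical, and the source does not describe how Lawrence (2013) actually chose questions.

```python
import random


def assemble_cumulative_exam(old_pool, new_pool, total=50,
                             old_fraction=0.20, seed=None):
    """Draw an exam with a fixed share of questions from earlier units.

    old_pool / new_pool are lists of question identifiers (hypothetical).
    """
    rng = random.Random(seed)
    n_old = round(total * old_fraction)  # e.g., 10 of 50 questions
    n_new = total - n_old                # e.g., 40 of 50 questions
    if n_old > len(old_pool) or n_new > len(new_pool):
        raise ValueError("question pool too small for the requested exam")
    # Sample without replacement so no question appears twice.
    exam = rng.sample(old_pool, n_old) + rng.sample(new_pool, n_new)
    rng.shuffle(exam)  # mix old and new questions together
    return exam


# Hypothetical question banks for a previous unit and the current unit.
old_pool = [f"unit1-q{i}" for i in range(1, 31)]
new_pool = [f"unit2-q{i}" for i in range(1, 61)]

exam = assemble_cumulative_exam(old_pool, new_pool, seed=1)
n_from_old = sum(q.startswith("unit1") for q in exam)
print(len(exam), n_from_old)  # 50 10
```

Raising `old_fraction` (for example, to 0.55 for a final exam) shifts the balance toward previously covered material without changing the rest of the procedure.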

Gayman et al. (2023) assessed the effect of cumulative versus noncumulative exams in an online learning course with interteaching as the instructional delivery method. Exams occurred weekly for each section. In the cumulative exams section, 50% of the 20 multiple-choice questions covered material from previous weeks. In the noncumulative exams section, all 20 multiple-choice questions covered material from the current week. The cumulative final exam for both sections had 35 multiple-choice questions. As part of the interteaching component of the course, learners created a preparation guide for the readings, discussed those questions with classmates, indicated to the instructor which questions were most difficult, and viewed lectures that addressed difficult content. For all but one of the weekly exams, learners performed similarly across cumulative and noncumulative exams. On the cumulative final exam, learners in the cumulative exams section earned more As and Bs than learners in the noncumulative exams section. Grade point average (GPA) did not contribute to the difference in performance on the cumulative final exam across sections. Not only are cumulative exams beneficial in traditional lecture courses for learners, but they also improve learning outcomes when combined with interteaching methods.

Part of this benefit of cumulative exams may be due to spaced (or distributed) retrieval practice (Roediger, 2013) and to avoiding massed practice (or cramming) due to procrastination (Imam, 2014). In spaced practice, learners take a break of somewhere between 30 minutes and 2 days between study sessions instead of studying without stopping prior to the criterial test. Learners performed better after spaced practice in two settings: on posttests following weekly quizzes in introductory psychology, learning and behavior, and research methods courses (Imam, 2014), and on final exams following spaced (i.e., cumulative) rather than massed (i.e., noncumulative) quiz questions in an engineering course (Hopkins et al., 2016). That is, learners have to study more frequently (and not procrastinate) when they have regular assessments on the material, although Lawrence (2013) indicated that she did not note any remarkable differences in study habits between learners in the cumulative and noncumulative exams sections. Overall, frequent quizzes (or other assessments) and cumulative exams seem to support learning within courses and long-term retention of course content.

Image credits:

[1] Image provided courtesy of Charlotte May under Pexels license

[2] Image provided courtesy of cottonbro studio under Pexels license

[3] Image provided courtesy of Keira Burton under Pexels license

References:

Badia, P., Culbertson, S. A., & Harsh, J. (1974). Relative aversiveness of signaled vs. unsignaled avoidable and escapable shock situations in humans. Journal of Comparative and Physiological Psychology, 87, 338-346. https://psycnet.apa.org/doi/10.1037/h0036851

Gayman, C. M., Jimenez, S. T., Hammock, S., Taylor, S., & Rocheleau, J. M. (2023). The effects of cumulative and noncumulative exams within the context of interteaching. Journal of Behavioral Education, 32, 261-276. https://doi.org/10.1007/s10864-021-09451-4

Grimaldi, P. J., & Karpicke, J. D. (2012). When and why do retrieval attempts enhance subsequent encoding? Memory & Cognition, 40, 505-513. https://doi.org/10.3758/s13421-011-0174-0

Heinicke, M. R., Zuckerman, C. K., & Cravalho, D. A. (2017). An evaluation of readiness assessment tests in a college classroom: Exam performance, attendance, and participation. Behavior Analysis: Research and Practice, 17, 129-141. http://dx.doi.org/10.1037/bar0000073

Hopkins, R. F., Lyle, K. B., Hieb, J. L., & Ralston, P. A. S. (2016). Spaced retrieval practice increases college students’ short- and long-term retention of mathematics knowledge. Educational Psychology Review, 28, 853-873. https://doi.org/10.1007/s10648-015-9349-8

Imam, A. A. (2014). Beneficial assessment outcomes from frequent testing. The International Journal of Assessment and Evaluation, 20, 15-23. http://dx.doi.org/10.18848/2327-7920/CGP/v20i02/48339

Jarmolowicz, D. P., Hayashi, Y., & St. Peter Pipkin, C. (2010). Temporal patterns of behavior from the scheduling of psychology quizzes. Journal of Applied Behavior Analysis, 43, 297-301. https://doi.org/10.1901/jaba.2010.43-297

Karpicke, J. D. (2012). Retrieval-based learning: Active retrieval promotes meaningful learning. Current Directions in Psychological Science, 21, 157-163. https://doi.org/10.1177/0963721412443552

Khanna, M. M., Badura Brack, A. S., & Finken, L. L. (2013). Short- and long-term effects of cumulative finals on student learning. Teaching of Psychology, 40, 175-182. https://doi.org/10.1177/0098628313487458

Lawrence, N. K. (2013). Cumulative exams in the introductory psychology course. Teaching of Psychology, 40, 15-19. https://doi.org/10.1177/0098628312465858

Roediger, H. L., III. (2013). Applying cognitive psychology to education: Translational educational science. Psychological Science in the Public Interest, 14, 1-3. https://doi.org/10.1177/1529100612454415

Sidman, M. (1953). Avoidance conditioning with brief shock and no exteroceptive warning signal. Science, 118, 157-158. https://doi.org/10.1126/science.118.3058.157

Blog post contributed by Melissa Swisher.