Automated scoring engines hold great promise for reducing scoring costs and time while also ensuring scoring consistency and accuracy in K–12 assessment systems. However, little is known about how to integrate automated scoring effectively into an assessment program, what preparation is needed to train a scoring engine, and how an integrated human/automated scoring system should be monitored. In this presentation, we offer the Automated Scoring Decision Matrix (the AS matrix), which provides a structure for identifying suitable procedures for designing scoring approaches based on the program stakes and the item types included in the test. Our hope is that the AS matrix will enable practitioners to design and implement assessment programs that incorporate automated scoring appropriately.
The presentation will provide examples of decision points for each of three components of an automated scoring system: 1) what type of scoring model is appropriate, ranging from the scoring engine as the primary score provider to humans as the primary score provider; 2) what types of data are used to train the engine and how the data are used in the training process; and 3) what type of monitoring should be conducted. We will show how decisions about these three components can be made based on two factors: the type of item (e.g., essay, technology-enhanced, highly constrained constructed-response, unconstrained constructed-response, and scaffolded/complex items) and the program stakes (e.g., low, medium, and high). We will then walk participants through the process of using the matrix to make these decisions. Following this overview, we will present our experiences with designing and implementing models in high- and low-stakes environments across a variety of item types and illustrate how these testing programs fit into the decision matrix.
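To make the shape of the matrix concrete, the following is a minimal sketch of how it could be represented as a lookup table keyed by item type and program stakes, with each cell holding the three decisions described above. It is not the AS matrix itself: the names `ItemType`, `Stakes`, `ScoringPlan`, `AS_MATRIX`, and `lookup_plan` are hypothetical, and the two filled-in cells are placeholders chosen only to show the two ends of the scoring-model range.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Stakes(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class ItemType(Enum):
    ESSAY = "essay"
    TECHNOLOGY_ENHANCED = "technology-enhanced"
    CONSTRAINED_CR = "highly constrained constructed-response"
    UNCONSTRAINED_CR = "unconstrained constructed-response"
    SCAFFOLDED_COMPLEX = "scaffolded/complex"


@dataclass(frozen=True)
class ScoringPlan:
    """One matrix cell: the three decisions made for an item type and stakes level."""
    scoring_model: str   # who provides the primary score
    training_data: str   # what data train the engine and how they are used
    monitoring: str      # what monitoring is conducted


# ASSUMPTION: these cell contents are illustrative placeholders,
# not recommendations taken from the AS matrix.
AS_MATRIX = {
    (ItemType.CONSTRAINED_CR, Stakes.LOW): ScoringPlan(
        scoring_model="engine as primary score provider",
        training_data="PLACEHOLDER",
        monitoring="PLACEHOLDER",
    ),
    (ItemType.ESSAY, Stakes.HIGH): ScoringPlan(
        scoring_model="humans as primary score provider",
        training_data="PLACEHOLDER",
        monitoring="PLACEHOLDER",
    ),
}


def lookup_plan(item_type: ItemType, stakes: Stakes) -> Optional[ScoringPlan]:
    """Return the plan for a cell, or None if that cell has not been filled in."""
    return AS_MATRIX.get((item_type, stakes))


if __name__ == "__main__":
    plan = lookup_plan(ItemType.ESSAY, Stakes.HIGH)
    print(plan.scoring_model if plan else "no plan defined for this cell")
```

Keying the table on the pair of factors mirrors the way the matrix is used: a practitioner picks an item type and a stakes level, then reads off the three decisions for that cell.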
Participants will be encouraged to ask questions and apply the matrix to assessment programs of interest to them, with an opportunity to discuss each decision point.