This study examines the differences in equating outcomes between two trend score equating designs that result from two scoring strategies when operational constructed-response (CR) items are double-scored: the single group (SG) design, in which each trend CR item is double-scored, and the nonequivalent groups with anchor test (NEAT) design, in which each trend CR item is single-scored during trend score equating. The designs are compared across varying sample sizes (n = 150, 200, 250, 300, 400). The impact of the different anchors on observed score equating was evaluated and compared with respect to systematic error (bias), random equating error (standard errors of equating), and total equating error (root mean squared error, RMSE) using empirical data.
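For reference (the notation here is generic, not the study's own): with \(\hat{e}(x)\) the sample equating function at raw score \(x\) and \(e(x)\) the criterion equating function, the three criteria are conventionally defined as

\[
\mathrm{Bias}(x) = E[\hat{e}(x)] - e(x), \qquad
\mathrm{SEE}(x) = \sqrt{E\big[(\hat{e}(x) - E[\hat{e}(x)])^2\big]}, \qquad
\mathrm{RMSE}(x) = \sqrt{\mathrm{Bias}(x)^2 + \mathrm{SEE}(x)^2}.
\]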
This report proposes an empirical Bayes approach to the problem of equating scores on test forms taken by very small numbers of test takers.
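The report's specific model is not reproduced here; generically, an empirical Bayes equating estimate shrinks the small-sample equating function toward a prior assembled from earlier, similar equatings,

\[
\hat{e}_{EB}(x) = w\,\hat{e}_{new}(x) + (1 - w)\,\hat{e}_{prior}(x), \qquad
w = \frac{\tau^2}{\tau^2 + \sigma^2(x)},
\]

where \(\sigma^2(x)\) is the sampling variance of the new-form estimate and \(\tau^2\) the variance among the prior equatings, so that smaller samples push more weight onto the prior.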
In this paper, we develop a new chained equipercentile equating procedure for the nonequivalent groups with anchor test (NEAT) design under the assumptions of the classical test theory model.
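In the NEAT design, chained equipercentile equating links new form X to old form Y through the anchor A: X is equated to A in the population P that took X, and A is equated to Y in the population Q that took Y. The standard chained composition, of which the proposed procedure is a variant, is

\[
\hat{e}_Y(x) = G_Q^{-1}\!\Big( H_Q\big( H_P^{-1}( F_P(x) ) \big) \Big),
\]

where \(F_P\) and \(G_Q\) are the (continuized) cumulative distribution functions of X in P and of Y in Q, and \(H_P\), \(H_Q\) are those of the anchor in each population.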
This study investigated kernel equating methods by comparing them to operational equatings for two tests in the SAT Subject Tests™ program.
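In standard Gaussian kernel equating, each discrete score distribution is first continuized and the equipercentile transform is then applied to the continuized distributions. With score points \(x_j\) taken with probabilities \(r_j\), mean \(\mu_X\), variance \(\sigma_X^2\), and bandwidth \(h_X\), the continuized cdf of X is

\[
F_{h_X}(x) = \sum_j r_j\, \Phi\!\left( \frac{x - a_X x_j - (1 - a_X)\mu_X}{a_X h_X} \right),
\qquad a_X = \sqrt{\frac{\sigma_X^2}{\sigma_X^2 + h_X^2}},
\]

and the equating function is \(\hat{e}_Y(x) = G_{h_Y}^{-1}(F_{h_X}(x))\) for the analogously continuized cdf \(G_{h_Y}\) of Y.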
In this report, an alternative item response theory (IRT) observed score equating method is developed.
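The report's new method is not detailed here; as background, conventional IRT observed score equating builds model-implied number-correct score distributions with the Lord-Wingersky recursion and then equates them equipercentile-wise. With \(p_r(\theta)\) the model probability of a correct response to item r, the conditional score distribution after r items satisfies

\[
f_r(s \mid \theta) = f_{r-1}(s \mid \theta)\,\big(1 - p_r(\theta)\big) + f_{r-1}(s - 1 \mid \theta)\, p_r(\theta),
\qquad f_0(0 \mid \theta) = 1,
\]

with \(f_{r-1}(s \mid \theta) = 0\) for s outside \([0, r-1]\). The marginal distribution \(f(s) = \int f_n(s \mid \theta)\, g(\theta)\, d\theta\) over the ability density \(g\) is computed for each form before applying conventional equipercentile equating.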