Acknowledgments |
|
xi | |
Preface |
|
xii | |
|
|
1 | (46) |
|
1 Introduction and Overview |
|
|
3 | (24) |
|
|
5 | (1) |
|
|
6 | (2) |
|
What Is Invariant Measurement? |
|
|
8 | (2) |
|
What Is Invariant Measurement with Raters? |
|
|
10 | (4) |
|
|
14 | (2) |
|
What Are Rater-Mediated Wright Maps? |
|
|
16 | (2) |
|
Case Study: Middle School Writing Assessment |
|
|
18 | (6) |
|
|
24 | (3) |
|
2 Progress in the Social Sciences: An Historical and Philosophical Perspective |
|
|
27 | (20) |
|
History and Philosophy of Science |
|
|
29 | (6) |
|
Measurement Problems in the Social Sciences |
|
|
35 | (7) |
|
|
42 | (5) |
|
PART II Theories of Measurement and Judgment for Rating Scales |
|
|
47 | (48) |
|
3 Measurement Models for Rater-Mediated Assessments: A Tale of Two Research Traditions |
|
|
49 | (31) |
|
Research Traditions in Measurement for Rater-Mediated Assessments |
|
|
50 | (9) |
|
Comparative Perspective on the Two Traditions for Rater-Mediated Assessments |
|
|
59 | (14) |
|
|
73 | (7) |
|
4 Lens Models of Human Judgment for Rater-Mediated Assessments |
|
|
80 | (15) |
|
|
82 | (4) |
|
What Are Lens Models for Human Judgment? |
|
|
86 | (1) |
|
How Have Lens Models Been Used for Rater-Mediated Assessments of Student Performance? |
|
|
87 | (3) |
|
|
90 | (5) |
|
PART III Foundational Areas for Rating Scales |
|
|
95 | (114) |
|
5 Validity, Invariant Measurement, and Rater-Mediated Assessments |
|
|
97 | (25) |
|
|
98 | (2) |
|
What Is the Current Consensus Definition of Validity? |
|
|
100 | (4) |
|
How Is Validity Defined for Rater-Mediated Assessments? |
|
|
104 | (2) |
|
What Constitutes Validity Evidence to Support the Interpretation and Use of Rater-Mediated Assessments? |
|
|
106 | (11) |
|
|
117 | (5) |
|
6 Reliability, Precision, and Errors of Measurement for Ratings |
|
|
122 | (25) |
|
|
123 | (4) |
|
What Is the Current Consensus Definition of Reliability? |
|
|
127 | (3) |
|
How Is Reliability Defined for Rater-Mediated Assessments? |
|
|
130 | (2) |
|
What Constitutes Reliability Evidence to Support the Interpretation and Use of Rater-Mediated Assessments? |
|
|
132 | (12) |
|
|
144 | (3) |
|
7 Fairness in Rater-Mediated Assessment: Appropriate Interpretation and Use of Ratings |
|
|
147 | (20) |
|
|
148 | (3) |
|
What Is the Current Consensus Definition of Fairness? |
|
|
151 | (3) |
|
How Is Fairness Defined for Rater-Mediated Assessments? |
|
|
154 | (2) |
|
What Constitutes Fairness Evidence to Support the Interpretation and Use of Rater-Mediated Assessments? |
|
|
156 | (8) |
|
|
164 | (3) |
|
8 Case Study: Evidence for the Validity, Reliability, and Fairness of Ratings on a Middle Grades Writing Assessment |
|
|
167 | (42) |
|
Methodology of Case Study |
|
|
168 | (2) |
|
|
170 | (31) |
|
|
201 | (8) |
|
PART IV Technical Issues and IRT Models for Ratings |
|
|
209 | (46) |
|
9 Models for Ratings Based on Item Response Theory |
|
|
211 | (24) |
|
Historical Perspectives on Polytomous Data |
|
|
212 | (3) |
|
Two Item Response Theory Models for Ratings: Partial Credit and Graded Response Models |
|
|
215 | (3) |
|
Empirical Analyses of Two Item Response Theory Models for Ratings |
|
|
218 | (9) |
|
Issues in Modeling Ordered Categories within Rasch Measurement Theory |
|
|
227 | (4) |
|
|
231 | (4) |
|
10 Parameter Estimation for the Polytomous Rasch Model |
|
|
235 | (20) |
|
What Is Parameter Estimation? |
|
|
236 | (4) |
|
Illustration of a Pairwise Algorithm for Rating Scales |
|
|
240 | (5) |
|
Estimation of Person Locations |
|
|
245 | (6) |
|
|
251 | (4) |
|
|
255 | (70) |
|
11 Model-Data Fit for Polytomous Rating Scale Models |
|
|
257 | (18) |
|
|
258 | (5) |
|
Model-Data Fit for Rater-Mediated Assessments |
|
|
263 | (1) |
|
|
264 | (7) |
|
|
271 | (4) |
|
12 Designing Rater-Mediated Assessment Systems |
|
|
275 | (22) |
|
Building Blocks for Rater-Mediated Assessments |
|
|
277 | (13) |
|
Rasch Models as Equating Models for Rater-Mediated Assessments |
|
|
290 | (1) |
|
|
291 | (2) |
|
|
293 | (1) |
|
|
293 | (4) |
|
13 Examining Rating Scale Functioning |
|
|
297 | (28) |
|
How Can Polytomous Rasch Models Be Used to Examine Rating Scale Functioning? |
|
|
298 | (6) |
|
|
304 | (1) |
|
Evidence of Rating Scale Functioning |
|
|
305 | (17) |
|
|
322 | (3) |
|
|
325 | (12) |
|
14 Invariant Measurement with Raters and Rating Scales: Summary and Discussion |
|
|
327 | (10) |
|
Theories of Measurement and Judgment for Rating Scales |
|
|
329 | (1) |
|
Foundational Areas for Rating Scales |
|
|
330 | (1) |
|
Technical Issues and IRT Models for Ratings |
|
|
330 | (1) |
|
|
331 | (1) |
|
Future Trends and Promising Areas for New Developments |
|
|
332 | (2) |
|
|
334 | (3) |
Glossary |
|
337 | (8) |
Index |
|
345 | |