|
|
1 | (18) |
|
Measurements in the Physical World |
|
|
1 | (1) |
|
Measurements in the Psycho-social Science Context |
|
|
1 | (1) |
|
|
2 | (1) |
|
Formal Definitions of Psycho-social Measurement |
|
|
3 | (1) |
|
|
3 | (3) |
|
|
4 | (1) |
|
|
4 | (1) |
|
|
4 | (1) |
|
|
5 | (1) |
|
Increasing Levels of Measurement in the Meaningfulness of the Numbers |
|
|
5 | (1) |
|
The Process of Constructing Psycho-social Measurements |
|
|
6 | (3) |
|
|
7 | (1) |
|
Distinguish Between a General Survey and a Measuring Instrument |
|
|
7 | (1) |
|
Write, Administer, and Score Test Items |
|
|
8 | (1) |
|
|
9 | (1) |
|
|
9 | (4) |
|
|
10 | (1) |
|
|
11 | (1) |
|
Graphical Representations of Reliability and Validity |
|
|
12 | (1) |
|
|
13 | (1) |
|
|
13 | (2) |
|
|
14 | (1) |
|
|
14 | (1) |
|
|
15 | (2) |
|
|
17 | (1) |
|
|
18 | (1) |
|
2 Construct, Framework and Test Development---From IRT Perspectives |
|
|
19 | (22) |
|
|
19 | (1) |
|
Linking Validity to Construct |
|
|
20 | (1) |
|
Construct in the Context of Classical Test Theory (CTT) and Item Response Theory (IRT) |
|
|
21 | (3) |
|
Unidimensionality in Relation to a Construct |
|
|
24 | (2) |
|
The Nature of a Construct---Psychological Trait or Arbitrarily Defined Construct? |
|
|
24 | (1) |
|
Practical Considerations of Unidimensionality |
|
|
25 | (1) |
|
Theoretical and Practical Considerations in Reporting Sub-scale Scores |
|
|
25 | (1) |
|
|
26 | (1) |
|
Frameworks and Test Blueprints |
|
|
27 | (1) |
|
|
27 | (4) |
|
|
28 | (1) |
|
Number of Options for Multiple-Choice Items |
|
|
29 | (1) |
|
How Many Items Should There Be in a Test? |
|
|
30 | (1) |
|
|
31 | (3) |
|
Awarding Partial Credit Scores |
|
|
32 | (1) |
|
|
33 | (1) |
|
|
34 | (1) |
|
|
35 | (3) |
|
|
38 | (1) |
|
|
38 | (3) |
|
|
41 | (18) |
|
|
41 | (1) |
|
|
41 | (5) |
|
Magnitude of Measurement Error for Individual Students |
|
|
42 | (1) |
|
Scores in Standard Deviation Unit |
|
|
43 | (1) |
|
What Accuracy Is Sufficient? |
|
|
44 | (1) |
|
Summary About Measuring Individuals |
|
|
45 | (1) |
|
|
46 | (2) |
|
Computation of Sampling Error |
|
|
47 | (1) |
|
Summary About Measuring Populations |
|
|
47 | (1) |
|
Placement of Items in a Test |
|
|
48 | (3) |
|
Implications of Fatigue Effect |
|
|
48 | (1) |
|
Balanced Incomplete Block (BIB) Booklet Design |
|
|
49 | (2) |
|
|
51 | (2) |
|
|
53 | (1) |
|
|
54 | (1) |
|
|
54 | (2) |
|
Appendix 1 Computation of Measurement Error |
|
|
56 | (1) |
|
|
57 | (1) |
|
|
57 | (2) |
|
4 Test Administration and Data Preparation |
|
|
59 | (14) |
|
|
59 | (1) |
|
Sampling and Test Administration |
|
|
59 | (5) |
|
|
60 | (2) |
|
|
62 | (2) |
|
Data Collection and Processing |
|
|
64 | (4) |
|
|
64 | (1) |
|
|
65 | (1) |
|
|
66 | (1) |
|
|
67 | (1) |
|
|
68 | (1) |
|
|
69 | (1) |
|
|
69 | (1) |
|
|
70 | (2) |
|
|
72 | (1) |
|
|
72 | (1) |
|
|
73 | (18) |
|
|
73 | (1) |
|
Concepts of Measurement Error and Reliability |
|
|
73 | (3) |
|
Formal Definitions of Reliability and Measurement Error |
|
|
76 | (6) |
|
Assumptions of Classical Test Theory |
|
|
76 | (1) |
|
Definition of Parallel Tests |
|
|
77 | (1) |
|
Definition of Reliability Coefficient |
|
|
77 | (2) |
|
Computation of Reliability Coefficient |
|
|
79 | (2) |
|
Standard Error of Measurement (SEM) |
|
|
81 | (1) |
|
Correction for Attenuation (Dis-attenuation) of Population Variance |
|
|
81 | (1) |
|
Correction for Attenuation (Dis-attenuation) of Correlation |
|
|
82 | (1) |
|
|
82 | (6) |
|
|
82 | (2) |
|
Item Discrimination Measures |
|
|
84 | (1) |
|
Item Discrimination for Partial Credit Items |
|
|
85 | (2) |
|
Distinguishing Between Item Difficulty and Item Discrimination |
|
|
87 | (1) |
|
|
88 | (1) |
|
|
88 | (1) |
|
|
89 | (1) |
|
|
90 | (1) |
|
|
91 | (18) |
|
|
91 | (1) |
|
|
91 | (1) |
|
Ability Estimates Based on Raw Scores |
|
|
92 | (2) |
|
|
94 | (1) |
|
Estimating Ability Using Item Response Theory |
|
|
95 | (7) |
|
Estimation of Ability Using IRT |
|
|
98 | (3) |
|
Invariance of Ability Estimates Under IRT |
|
|
101 | (1) |
|
Computer Adaptive Tests Using IRT |
|
|
102 | (1) |
|
|
102 | (3) |
|
|
105 | (1) |
|
|
105 | (1) |
|
|
105 | (1) |
|
|
106 | (1) |
|
|
106 | (1) |
|
|
107 | (1) |
|
|
107 | (2) |
|
7 Rasch Model (The Dichotomous Case) |
|
|
109 | (30) |
|
|
109 | (1) |
|
|
109 | (2) |
|
Properties of the Rasch Model |
|
|
111 | (11) |
|
|
111 | (1) |
|
Indeterminacy of an Absolute Location of Ability |
|
|
112 | (1) |
|
|
113 | (1) |
|
Indeterminacy of an Absolute Discrimination or Scale Factor |
|
|
113 | (2) |
|
Different Discrimination Between Item Sets |
|
|
115 | (1) |
|
|
116 | (1) |
|
Building Learning Progressions Using the Rasch Model |
|
|
117 | (3) |
|
Raw Scores as Sufficient Statistics |
|
|
120 | (1) |
|
How Different Is IRT from CTT? |
|
|
121 | (1) |
|
Fit of Data to the Rasch Model |
|
|
122 | (1) |
|
Estimation of Item Difficulty and Person Ability Parameters |
|
|
122 | (1) |
|
Weighted Likelihood Estimate of Ability (WLE) |
|
|
123 | (1) |
|
|
124 | (1) |
|
Transformation of Logit Scores |
|
|
124 | (1) |
|
An Illustrative Example of a Rasch Analysis |
|
|
125 | (5) |
|
|
130 | (1) |
|
|
131 | (5) |
|
|
131 | (3) |
|
Task 2 Compare Logistic and Normal Ogive Functions |
|
|
134 | (1) |
|
Task 3 Compute the Likelihood Function |
|
|
135 | (1) |
|
|
136 | (1) |
|
|
137 | (1) |
|
|
138 | (1) |
|
8 Residual-Based Fit Statistics |
|
|
139 | (20) |
|
|
139 | (1) |
|
|
140 | (1) |
|
Residual-Based Fit Statistics |
|
|
141 | (2) |
|
|
143 | (1) |
|
Interpretations of Fit Mean-Square |
|
|
143 | (7) |
|
|
143 | (2) |
|
Not About the Amount of "Noise" Around the Item Characteristic Curve |
|
|
145 | (1) |
|
Discrete Observations and Fit |
|
|
146 | (1) |
|
Distributional Properties of Fit Mean-Square |
|
|
147 | (3) |
|
|
150 | (1) |
|
Item Fit Is Relative, Not Absolute |
|
|
151 | (2) |
|
|
153 | (2) |
|
|
155 | (1) |
|
|
155 | (2) |
|
|
157 | (2) |
|
|
159 | (28) |
|
|
159 | (1) |
|
The Derivation of the Partial Credit Model |
|
|
160 | (1) |
|
PCM Probabilities for All Response Categories |
|
|
161 | (1) |
|
|
161 | (1) |
|
Dichotomous Rasch Model Is a Special Case |
|
|
161 | (1) |
|
The Score Categories of PCM Are "Ordered" |
|
|
162 | (1) |
|
PCM Is not a Sequential Steps Model |
|
|
162 | (1) |
|
The Interpretation of δₖ |
|
|
162 | (5) |
|
Item Characteristic Curves (ICC) for PCM |
|
|
163 | (1) |
|
Graphical Interpretation of the Delta (δ) Parameters |
|
|
163 | (1) |
|
Problems with the Interpretation of the Delta (δ) Parameters |
|
|
164 | (1) |
|
Linking the Graphical Interpretation of δ to the Derivation of PCM |
|
|
165 | (1) |
|
Examples of Delta (δ) Parameters and Item Response Categories |
|
|
165 | (2) |
|
|
167 | (3) |
|
Interpretation of δ· and τₖ |
|
|
168 | (2) |
|
Thurstonian Thresholds, or Gammas (γ) |
|
|
170 | (3) |
|
Interpretation of the Thurstonian Thresholds |
|
|
170 | (1) |
|
Comparing with the Dichotomous Case Regarding the Notion of Item Difficulty |
|
|
171 | (1) |
|
Compare Thurstonian Thresholds with Delta Parameters |
|
|
172 | (1) |
|
Further Note on Thurstonian Probability Curves |
|
|
173 | (1) |
|
Using Expected Scores as Measures of Item Difficulty |
|
|
173 | (2) |
|
Applications of the Partial Credit Model |
|
|
175 | (6) |
|
Awarding Partial Credit Scores to Item Responses |
|
|
175 | (2) |
|
An Example Item Analysis of Partial Credit Items |
|
|
177 | (4) |
|
|
181 | (1) |
|
|
182 | (1) |
|
Generalized Partial Credit Model |
|
|
182 | (1) |
|
|
182 | (1) |
|
|
183 | (1) |
|
|
184 | (1) |
|
|
185 | (1) |
|
|
185 | (2) |
|
10 Two-Parameter IRT Models |
|
|
187 | (20) |
|
|
187 | (1) |
|
Discrimination Parameter as Score of an Item |
|
|
188 | (1) |
|
An Example Analysis of Dichotomous Items Using Rasch and 2PL Models |
|
|
189 | (5) |
|
|
191 | (3) |
|
A Note on the Constraints of Estimated Parameters |
|
|
194 | (2) |
|
A Note on the Parameterisation of Item Difficulty Parameters Under 2PL Model |
|
|
196 | (1) |
|
Impact of Different Item Weights on Ability Estimates |
|
|
196 | (1) |
|
Choosing Between the Rasch Model and 2PL Model |
|
|
197 | (2) |
|
2PL Models for Partial Credit Items |
|
|
197 | (1) |
|
|
198 | (1) |
|
A More Generalised Partial Credit Model |
|
|
199 | (1) |
|
A Note About Item Difficulty and Item Discrimination |
|
|
200 | (3) |
|
|
203 | (1) |
|
|
203 | (1) |
|
|
204 | (1) |
|
|
205 | (2) |
|
11 Differential Item Functioning |
|
|
207 | (20) |
|
|
207 | (1) |
|
|
208 | (2) |
|
|
208 | (2) |
|
Methods for Detecting DIF |
|
|
210 | (7) |
|
|
210 | (2) |
|
|
212 | (1) |
|
Statistical Significance Test |
|
|
213 | (2) |
|
|
215 | (1) |
|
|
216 | (1) |
|
How to Deal with DIF Items? |
|
|
217 | (5) |
|
Remove DIF Items from the Test |
|
|
219 | (1) |
|
Split DIF Items as Two New Items |
|
|
220 | (1) |
|
Retain DIF Items in the Data Set |
|
|
220 | (1) |
|
Cautions on the Presence of DIF Items |
|
|
221 | (1) |
|
A Practical Approach to Deal with DIF Items |
|
|
222 | (1) |
|
|
222 | (1) |
|
|
223 | (1) |
|
|
223 | (2) |
|
|
225 | (1) |
|
|
225 | (2) |
|
|
227 | (18) |
|
|
227 | (2) |
|
Overview of Equating Methods |
|
|
229 | (11) |
|
|
229 | (1) |
|
Checking for Item Invariance |
|
|
229 | (4) |
|
Number of Common Items Required for Equating |
|
|
233 | (1) |
|
Factors Influencing Change in Item Difficulty |
|
|
233 | (1) |
|
|
234 | (1) |
|
|
235 | (1) |
|
Shift and Scale Method by Matching Ability Distributions |
|
|
236 | (1) |
|
|
237 | (1) |
|
The Joint Calibration Method (Concurrent Calibration) |
|
|
237 | (1) |
|
Common Person Equating Method |
|
|
238 | (1) |
|
Horizontal and Vertical Equating |
|
|
239 | (1) |
|
Equating Errors (Link Errors) |
|
|
240 | (2) |
|
How Are Equating Errors Incorporated in the Results of Assessment? |
|
|
241 | (1) |
|
Challenges in Test Equating |
|
|
242 | (1) |
|
|
242 | (1) |
|
|
243 | (1) |
|
|
244 | (1) |
|
|
244 | (1) |
|
|
245 | (16) |
|
|
245 | (1) |
|
DIF Can Be Analysed Using a Facets Model |
|
|
246 | (1) |
|
An Example Analysis of Marker Harshness |
|
|
246 | (8) |
|
Ability Estimates in Facets Models |
|
|
250 | (3) |
|
|
253 | (1) |
|
An Example---Using a Facets Model to Detect Item Position Effect |
|
|
254 | (4) |
|
Structure of the Data Set |
|
|
254 | (1) |
|
Analysis of Booklet Effect Where Test Design Is not Balanced |
|
|
255 | (2) |
|
Analysis of Booklet Effect---Balanced Design |
|
|
257 | (1) |
|
Discussion of the Results |
|
|
257 | (1) |
|
|
258 | (1) |
|
|
258 | (1) |
|
|
259 | (1) |
|
|
259 | (1) |
|
|
259 | (2) |
|
14 Bayesian IRT Models (MML Estimation) |
|
|
261 | (22) |
|
|
261 | (1) |
|
|
262 | (5) |
|
|
266 | (1) |
|
Unidimensional Bayesian IRT Models (MML Estimation) |
|
|
267 | (6) |
|
|
267 | (1) |
|
|
267 | (1) |
|
|
268 | (1) |
|
Simulation 1: 40 Items and 2000 Persons, 500 Replications |
|
|
269 | (2) |
|
Simulation 2: 12 Items and 2000 Persons, 500 Replications |
|
|
271 | (1) |
|
Summary of Comparisons Between JML and MML Estimation Methods |
|
|
272 | (1) |
|
|
273 | (4) |
|
|
274 | (2) |
|
|
276 | (1) |
|
|
277 | (3) |
|
Facets and Latent Regression Models |
|
|
277 | (2) |
|
Relationship Between Latent Regression Model and Facets Model |
|
|
279 | (1) |
|
|
280 | (1) |
|
|
280 | (1) |
|
|
281 | (1) |
|
|
281 | (1) |
|
|
281 | (2) |
|
15 Multidimensional IRT Models |
|
|
283 | (16) |
|
|
283 | (1) |
|
Using Collateral Information to Enhance Measurement |
|
|
284 | (1) |
|
A Simple Case of Two Correlated Latent Variables |
|
|
285 | (3) |
|
Comparison of Population Statistics |
|
|
288 | (3) |
|
Comparisons of Population Means |
|
|
289 | (1) |
|
Comparisons of Population Variances |
|
|
289 | (1) |
|
Comparisons of Population Correlations |
|
|
290 | (1) |
|
Comparison of Test Reliability |
|
|
291 | (1) |
|
Data Sets with Missing Responses |
|
|
291 | (4) |
|
Production of Data Set for Secondary Data Analysts |
|
|
292 | (1) |
|
Imputation of Missing Scores |
|
|
293 | (2) |
|
|
295 | (1) |
|
|
295 | (1) |
|
|
296 | (1) |
|
|
296 | (1) |
|
|
296 | (3) |
Glossary |
|
299 | |