| Acknowledgments | xv |
| A Foreword (in Berkian Style) by Mike Theall | xix |
| Introduction | 1 |
| 1 TOP 13 SOURCES OF EVIDENCE OF TEACHING EFFECTIVENESS | 9 |
|     Teaching Effectiveness: Defining the Construct | 11 |
|     A Unified Conceptualization | 13 |
|     Thirteen Sources of Evidence | 14 |
|     Learning Outcome Measures | 32 |
|     BONUS: 360° Multisource Assessment | 37 |
| 2 CREATING THE RATING SCALE STRUCTURE | 47 |
|     Overview of the Scale Construction Process | 48 |
|     Specifying the Purpose of the Scale | 48 |
|     Delimiting What Is to Be Measured | 50 |
|     Determining How to Measure the "What" | 54 |
|     Commercially Published Student Rating Scales | 56 |
|     Structure of Rating Scale Items | 60 |
| 3 GENERATING THE STATEMENTS | 65 |
|     Rules for Writing Statements | 66 |
|         1. The statement should be clear and direct. | 69 |
|         2. The statement should be brief and concise. | 69 |
|         3. The statement should contain only one complete behavior, thought, or concept. | 71 |
|         4. The statement should be a simple sentence. | 72 |
|         5. The statement should be at the appropriate reading level. | 72 |
|         6. The statement should be grammatically correct. | 73 |
|         7. The statement should be worded strongly. | 73 |
|         8. The statement should be congruent with the behavior it is intended to measure. | 74 |
|         9. The statement should accurately measure a positive or negative behavior. | 74 |
|         10. The statement should be applicable to all respondents. | 75 |
|         11. The respondents should be in the best position to respond to the statement. | 76 |
|         12. The statement should be interpretable in only one way. | 78 |
|         13. The statement should NOT contain a double negative. | 78 |
|         14. The statement should NOT contain universal or absolute terms. | 79 |
|         15. The statement should NOT contain nonabsolute, warm-and-fuzzy terms. | 79 |
|         16. The statement should NOT contain value-laden or inflammatory words. | 80 |
|         17. The statement should NOT contain words, phrases, or abbreviations that would be unfamiliar to all respondents. | 81 |
|         18. The statement should NOT tap a behavior appearing in any other statement. | 81 |
|         19. The statement should NOT be factual or capable of being interpreted as factual. | 82 |
|         20. The statement should NOT be endorsed or given one answer by almost all respondents or by almost none. | 83 |
| 4 SELECTING THE ANCHORS | 85 |
|     Rules for Selecting Anchors | 94 |
|         1. The anchors should be consistent with the purpose of the rating scale. | 94 |
|         2. The anchors should match the statements, phrases, or word topics. | 96 |
|         3. The anchors should be logically appropriate with each statement. | 98 |
|         4. The anchors should be grammatically consistent with each question. | 99 |
|         5. The anchors should provide the most accurate and concrete responses possible. | 100 |
|         6. The anchors should elicit a range of responses. | 101 |
|         7. The anchors on bipolar scales should be balanced, not biased. | 101 |
|         8. The anchors on unipolar scales should be graduated appropriately. | 102 |
| 5 REFINING THE ITEM STRUCTURE | 105 |
|     Preparing for Structural Changes | 106 |
|     Issues in Scale Construction | 108 |
|         1. What rating scale format is best? | 108 |
|         2. How many anchor points should be on the scale? | 109 |
|         3. Should there be a designated midpoint position, such as "Neutral," "Uncertain," or "Undecided," on the scale? | 110 |
|         4. How many anchors should be specified on the scale? | 111 |
|         5. Should numbers be placed on the anchor scale? | 112 |
|         6. Should a "Not Applicable" (NA) or "Not Observed" (NO) option be provided? | 113 |
|         7. How can response set biases be minimized? | 115 |
| 6 ASSEMBLING THE SCALE FOR ADMINISTRATION | 121 |
|     Identification Information | 122 |
|     Paper-Based Administration | 131 |
|     Comparability of Paper-Based and Online Ratings | 137 |
| 7 FIELD TESTING AND ITEM ANALYSES | 141 |
|     Preparing the Draft Scale for a Test Spin | 142 |
|     Stage 1: Item Descriptive Statistics | 148 |
|     Stage 2: Interitem and Item-Scale Correlations | 152 |
| 8 COLLECTING EVIDENCE OF VALIDITY AND RELIABILITY | 161 |
|     Evidence Based on Job Content Domain | 164 |
|     Evidence Based on Response Processes | 168 |
|     Evidence Based on Internal Scale Structure | 169 |
|     Evidence Related to Other Measures of Teaching Effectiveness | 171 |
|     Evidence Based on the Consequences of Ratings | 172 |
|     Classical Reliability Theory | 174 |
|     Summated Rating Scale Theory | 175 |
|     Methods for Estimating Reliability | 176 |
| 9 REPORTING AND INTERPRETING SCALE RESULTS | 185 |
|     Generic Levels of Score Reporting | 186 |
|     Subject Matter/Program-Level State, Regional, and National Norms | 193 |
|     Criterion-Referenced versus Norm-Referenced Score Interpretations | 194 |
|         Criterion-Referenced Interpretations | 195 |
|         Norm-Referenced Interpretations | 196 |
|     Formative, Summative, and Program Decisions | 197 |
| References | 215 |
| Appendices | 241 |
|     A. Sample "Home-Grown" Rating Scales | 241 |
|     B. Sample 360° Assessment Rating Scales | 257 |
|     C. Sample Reporting Formats | 273 |
|     D. Commercially Published Student Rating Scale Systems | 277 |
| Index | 281 |