About the author |
|
xiii | |
What makes this book different? |
|
xv | |
|
1 The conventional method is a flawed fusion |
|
|
1 | (4) |
|
1.1 Three statisticians, two methods, and the mess that should be banned |
|
|
1 | (2) |
|
1.2 Wise use and testing nulls that must be false |
|
|
3 | (1) |
|
1.3 Null hypothesis testing in perspective |
|
|
4 | (1) |
|
|
4 | (1) |
|
2 The point is to generalize beyond our results |
|
|
5 | (6) |
|
2.1 Samples and populations |
|
|
5 | (1) |
|
2.2 Real and hypothetical populations |
|
|
6 | (1) |
|
|
6 | (1) |
|
2.4 Know your population, and do not generalize beyond it |
|
|
7 | (1) |
|
|
8 | (3) |
|
3 Null hypothesis testing explained |
|
|
11 | (16) |
|
3.1 The effect of sampling error |
|
|
11 | (1) |
|
3.2 The logic of testing a null hypothesis |
|
|
12 | (2) |
|
3.3 We should know from the start that many null hypotheses cannot be correct |
|
|
14 | (1) |
|
3.4 The traditional explanation of how to use p |
|
|
15 | (1) |
|
3.5 What use of α accomplishes |
|
|
16 | (1) |
|
3.6 The flawed hybrid in action |
|
|
17 | (1) |
|
3.7 Criticisms of the flawed hybrid |
|
|
17 | (1) |
|
3.8 We should test nulls in a way that answers the criticisms |
|
|
18 | (1) |
|
|
19 | (1) |
|
3.10 Mouse preference, done right this time |
|
|
20 | (1) |
|
3.11 More p-values in action |
|
|
21 | (2) |
|
3.12 What were the nulls and predictions? |
|
|
23 | (1) |
|
3.13 What if p = 0.05000? |
|
|
23 | (1) |
|
3.14 A radical but wise way to use p |
|
|
24 | (1) |
|
|
24 | (1) |
|
|
24 | (3) |
|
4 How often do we get it wrong? |
|
|
27 | (18) |
|
4.1 Distributions around means |
|
|
27 | (2) |
|
4.2 Distributions of test statistics |
|
|
29 | (1) |
|
4.3 Null hypothesis testing explained with distributions |
|
|
30 | (1) |
|
4.4 Type I errors explained |
|
|
31 | (1) |
|
4.5 Probabilities before and after collecting data |
|
|
32 | (1) |
|
4.6 The null's precision explained |
|
|
32 | (1) |
|
4.7 The awkward definition of p explained |
|
|
33 | (1) |
|
|
33 | (2) |
|
4.9 Power and errors in direction |
|
|
35 | (1) |
|
4.10 Manipulating power to lower p-values |
|
|
36 | (1) |
|
4.11 Increasing power with one-tailed tests |
|
|
37 | (3) |
|
4.12 Power and why we should set α to 0.10 or higher |
|
|
40 | (1) |
|
4.13 Power, estimated effect size, and type M errors |
|
|
40 | (2) |
|
4.14 How can we know a population's distribution? |
|
|
42 | (1) |
|
|
42 | (3) |
|
5 Important things to know about null hypothesis testing |
|
|
45 | (12) |
|
5.1 Examples of null hypotheses in proper statistics books and what they really mean |
|
|
45 | (1) |
|
5.2 Categories of null hypotheses? |
|
|
46 | (3) |
|
5.3 What if it is important to accept the null? |
|
|
49 | (1) |
|
|
49 | (2) |
|
5.5 Null hypothesis testing as never explained before |
|
|
51 | (1) |
|
5.6 Effect size: what is it and when is it important? |
|
|
52 | (1) |
|
5.7 We should provide all results, even those not statistically "significant" |
|
|
53 | (2) |
|
|
55 | (2) |
|
|
57 | (10) |
|
6.1 Null hypothesis testing is misunderstood by many |
|
|
57 | (1) |
|
6.2 Statistical "significance" means a difference is large enough to be important-wrong! |
|
|
57 | (2) |
|
6.3 P is the probability of a type I error-wrong! |
|
|
59 | (1) |
|
6.4 If results are statistically "significant," we should accept the alternative hypothesis that something other than the null is correct-wrong! |
|
|
59 | (1) |
|
6.5 If results are not statistically "significant," we should accept the null hypothesis-wrong! |
|
|
60 | (1) |
|
6.6 Based on p we should either reject or fail to reject the null hypothesis-often wrong! |
|
|
60 | (1) |
|
6.7 Null hypothesis testing is so flawed that we should use confidence intervals instead-wrong! |
|
|
61 | (1) |
|
6.8 Power can be used to justify accepting the null hypothesis-wrong! |
|
|
62 | (1) |
|
6.9 The null hypothesis is a statement of no difference-not always |
|
|
63 | (1) |
|
6.10 The null hypothesis is that there will be no significant difference between the expected and observed values-very, very wrong! |
|
|
64 | (1) |
|
6.11 A null hypothesis should not be a negative statement-wrong! |
|
|
64 | (1) |
|
|
64 | (3) |
|
7 The debate over null hypothesis testing and wise use as the solution |
|
|
67 | (10) |
|
7.1 The debate over null hypothesis testing |
|
|
67 | (1) |
|
7.2 Communicate to educate |
|
|
68 | (1) |
|
|
69 | (1) |
|
7.4 Test nulls when appropriate, not promiscuously |
|
|
69 | (1) |
|
7.5 Strike the right balance between what is conventional and what is best |
|
|
70 | (1) |
|
7.6 Think outside of the null hypothesis test |
|
|
71 | (1) |
|
7.7 Encourage our audience to draw their own conclusions |
|
|
71 | (1) |
|
7.8 Allow ourselves to draw our own conclusions |
|
|
72 | (1) |
|
7.9 Strike the right balance when providing our results |
|
|
72 | (1) |
|
7.10 Know the misconceptions and do not fall for them |
|
|
72 | (1) |
|
7.11 Do not say that two groups "differ" or "do not differ" |
|
|
73 | (1) |
|
7.12 Provide all results somehow |
|
|
73 | (1) |
|
7.13 Other reformed methods of null hypothesis testing |
|
|
74 | (2) |
|
|
76 | (1) |
|
8 Simple principles behind the mathematics and some essential concepts |
|
|
77 | (18) |
|
8.1 Why different types of data require different types of tests |
|
|
77 | (2) |
|
8.1.1 Simple principles behind the mathematics |
|
|
77 | (1) |
|
8.1.2 Numerical data exhibit variation |
|
|
77 | (1) |
|
8.1.3 Nominal data do not exhibit variation |
|
|
78 | (1) |
|
8.1.4 How to tell the difference between nominal and numerical data |
|
|
78 | (1) |
|
8.2 Simple principles behind the analysis of groups of measurements and discrete numerical data |
|
|
79 | (4) |
|
8.2.1 Variance: a statistic of huge importance |
|
|
79 | (2) |
|
8.2.2 Incorporating sample size and the difference between our prediction and our outcome |
|
|
81 | (2) |
|
8.3 Drawing conclusions when we knew all along that the null must be false |
|
|
83 | (1) |
|
8.4 Degrees of freedom explained |
|
|
83 | (1) |
|
8.5 Other types of t tests |
|
|
84 | (1) |
|
8.6 Analysis of variance and t tests have certain requirements |
|
|
85 | (1) |
|
8.7 Do not test for equal variances unless |
|
|
85 | (1) |
|
8.8 Simple principles behind the analysis of counts of observations within categories |
|
|
86 | (4) |
|
8.8.1 Counts of observations within categories |
|
|
86 | (1) |
|
8.8.2 When the null hypothesis specifies the prediction |
|
|
86 | (2) |
|
8.8.3 When there is only one degree of freedom |
|
|
88 | (1) |
|
8.8.4 When the null hypothesis does not specify the prediction |
|
|
89 | (1) |
|
8.9 Interpreting p when the null hypothesis cannot be correct |
|
|
90 | (1) |
|
8.10 2 × 2 Designs and other variations |
|
|
90 | (1) |
|
8.11 The problem with chi-squared tests |
|
|
91 | (1) |
|
8.12 The reasoning behind the mathematics |
|
|
92 | (1) |
|
8.13 Rules for chi-squared tests |
|
|
93 | (1) |
|
|
93 | (2) |
|
9 The two-sample t test and the importance of pooled variance |
|
|
95 | (4) |
|
10 Comparing more than two groups to each other |
|
|
99 | (10) |
|
10.1 If we have three or more samples, most say we cannot use two-sample t tests to compare them two samples at a time |
|
|
99 | (1) |
|
10.2 Analysis of variance |
|
|
99 | (3) |
|
10.3 The price we pay is power |
|
|
102 | (1) |
|
10.4 Comparing every group to every other group |
|
|
103 | (2) |
|
10.5 Comparing multiple groups to a single reference, like a control |
|
|
105 | (2) |
|
10.6 Is all of this a load of rubbish? |
|
|
107 | (1) |
|
|
108 | (1) |
|
11 Assessing the combined effects of multiple independent variables |
|
|
109 | (16) |
|
11.1 Independent variables alone and in combination |
|
|
109 | (6) |
|
11.2 No, we may not use multiple t tests |
|
|
115 | (2) |
|
11.3 We have a statistical main effect: now what? |
|
|
117 | (1) |
|
11.4 We have a statistical interaction: things to consider |
|
|
118 | (1) |
|
11.5 We have a statistical interaction and we want to keep testing nulls |
|
|
119 | (1) |
|
11.6 Which is more important, the main effect or the interaction? |
|
|
120 | (1) |
|
11.7 Designs with more than two independent variables |
|
|
121 | (1) |
|
11.8 Use of analysis of variance to reduce variation and increase power |
|
|
122 | (2) |
|
|
124 | (1) |
|
12 Comparing slopes: analysis of covariance |
|
|
125 | (6) |
|
12.1 Analysis of covariance |
|
|
125 | (1) |
|
12.2 Use of analysis of covariance to reduce variation and increase power |
|
|
125 | (3) |
|
12.3 More on the use of analysis of covariance to reduce variation and increase power |
|
|
128 | (1) |
|
12.4 Use of analysis of covariance to limit the effects of a confound |
|
|
129 | (1) |
|
|
130 | (1) |
|
13 When data do not meet the requirements of t tests and analysis of variance |
|
|
131 | (10) |
|
13.1 When do we need to take action? |
|
|
131 | (1) |
|
13.2 Floor effects and the square root transformation |
|
|
132 | (1) |
|
13.3 Floor and ceiling effects and the arcsine transformation |
|
|
133 | (2) |
|
13.4 Not as simple as a floor or ceiling effect-the rank transformation |
|
|
135 | (2) |
|
13.5 Making analysis of variance sensitive to differences in proportion-the logarithmic transformation |
|
|
137 | (1) |
|
|
138 | (1) |
|
13.7 Transforming data changes the question being asked |
|
|
139 | (1) |
|
|
140 | (1) |
|
14 Reducing variation and increasing power by comparing subjects to themselves |
|
|
141 | (3) |
|
14.1 The simple principle behind the mathematics |
|
|
141 | (2) |
|
14.2 Repeated measures analysis of variance |
|
|
143 | (1) |
|
14.3 Multiple comparisons tests on repeated measures |
|
|
143 | (1) |
|
14.4 When subjects are not organisms |
|
|
144 | (7) |
|
14.5 When repeated does not mean repeated over time |
|
|
144 | (1) |
|
14.6 Pretest-posttest designs illustrate the danger of measures repeated over time |
|
|
145 | (1) |
|
14.7 Repeated measures analysis of variance versus t tests |
|
|
145 | (1) |
|
14.8 The problem with repeated measures |
|
|
146 | (3) |
|
14.8.1 The requirement for sphericity |
|
|
146 | (1) |
|
14.8.2 Correcting for a lack of sphericity |
|
|
147 | (1) |
|
14.8.3 Multiple comparisons tests when there is a lack of sphericity |
|
|
148 | (1) |
|
14.8.4 The multivariate alternative to correction |
|
|
148 | (1) |
|
|
149 | (2) |
|
15 What do those error bars mean? |
|
|
151 | (6) |
|
15.1 Confidence intervals |
|
|
151 | (1) |
|
15.2 Testing null hypotheses in our heads |
|
|
152 | (1) |
|
15.3 Plotting confidence intervals |
|
|
153 | (1) |
|
15.4 Error bars and repeated measures |
|
|
154 | (1) |
|
15.5 Plot comparative confidence intervals to make the overlap myth a reality |
|
|
155 | (1) |
|
|
156 | (1) |
|
Appendix A Philosophical objections |
|
|
157 | (10) |
|
A.1 Decades of bitter debate |
|
|
157 | (1) |
|
A.2 We want to know when we are wrong, not how often |
|
|
157 | (1) |
|
A.3 Setting α to 0.05 does not mean that 5% of all null-based decisions are wrong |
|
|
158 | (1) |
|
A.4 There are better ways to analyze and interpret data |
|
|
159 | (1) |
|
A.5 The fallacy of affirming the consequent |
|
|
159 | (1) |
|
A.6 Some say our method cannot be used to determine direction |
|
|
160 | (5) |
|
A.6.1 The return of one-tailed tests |
|
|
160 | (1) |
|
A.6.2 Kaiser's absurd directional two-tailed tests |
|
|
161 | (2) |
|
A.6.3 Invoking power to justify Kaiser's directional two-tailed tests |
|
|
163 | (1) |
|
A.6.4 Fisher did not follow Kaiser's rules |
|
|
164 | (1) |
|
A.6.5 Still not convinced? |
|
|
165 | (1) |
|
|
165 | (2) |
|
Appendix B How Fisher used null hypothesis tests |
|
|
167 | (12) |
|
B.1 Why follow my advice? |
|
|
167 | (1) |
|
B.2 Fisher tested for direction |
|
|
167 | (2) |
|
|
169 | (1) |
|
B.4 Fisher believed α should vary according to the circumstances |
|
|
169 | (1) |
|
B.5 Fisher came close to saying there should be no α at all |
|
|
170 | (1) |
|
B.6 In practice, Fisher did not categorize outcomes |
|
|
171 | (2) |
|
B.7 Fisher's language answers many criticisms of null hypothesis testing |
|
|
173 | (1) |
|
B.8 Except for Fisher's use of "significant" |
|
|
173 | (1) |
|
B.9 Fisher's inconsistency explained |
|
|
174 | (1) |
|
B.10 Fisher's thinking expressed in one word |
|
|
174 | (1) |
|
B.11 We have come a long way since Fisher, but the wrong way? |
|
|
175 | (4) |
Notes |
|
177 | (12) |
|
Appendix C The method attributed to Neyman and Pearson |
|
|
179 | (10) |
|
C.1 Neyman and Pearson with Pearson |
|
|
179 | (1) |
|
C.2 Neyman and Pearson without Pearson |
|
|
180 | (2) |
|
C.3 An important limitation |
|
|
182 | (1) |
|
C.4 Alternatives are always infinitely numerically precise |
|
|
182 | (1) |
|
C.5 The method step-by-step |
|
|
183 | (1) |
|
C.6 The method's influence on the flawed hybrid |
|
|
184 | (1) |
|
C.7 The method's fate in the world of the flawed hybrid |
|
|
185 | (1) |
|
C.8 Power spreads its wings |
|
|
185 | (1) |
|
C.9 Neyman et al.'s method has no place in science |
|
|
186 | (1) |
|
|
187 | (2) |
Index |
|
189 | |