Preface |
|
xiii | |
1 Why Do We Need Another Book on Statistics? |
|
xiii | |
2 Statistics and Scientific Rigour |
|
xiv | |
3 Why Is Statistics Difficult? |
|
xvi | |
4 Looking Down the Observer's End of the Telescope |
|
xvii | |
5 What Do Linguists Need to Know About Statistics? |
|
xix | |
Acknowledgments |
|
xxii | |
A Note on Terminology and Notation |
|
xxiv | |
Contingency Tests for Different Purposes |
|
xxvi | |
|
|
1 | (24) |
|
1 What Might Corpora Tell Us About Language? |
|
|
3 | (22) |
|
|
3 | (3) |
|
1.2 What Might a Corpus Tell Us? |
|
|
6 | (2) |
|
|
8 | (6) |
|
1.3.1 Annotation, Abstraction and Analysis |
|
|
8 | (3) |
|
1.3.2 The Problem of Representational Plurality |
|
|
11 | (1) |
|
1.3.3 ICECUP: A Platform for Treebank Research |
|
|
12 | (2) |
|
1.4 What Might a Richly Annotated Corpus Tell Us? |
|
|
14 | (1) |
|
1.5 External Influences: Modal Shall / Will Over Time |
|
|
15 | (2) |
|
1.6 Interacting Grammatical Decisions: NP Premodification |
|
|
17 | (3) |
|
1.7 Framing Constraints and Interaction Evidence |
|
|
20 | (3) |
|
1.7.1 Framing Frequency Evidence |
|
|
20 | (1) |
|
1.7.2 Framing Interaction Evidence |
|
|
21 | (1) |
|
1.7.3 Framing and Annotation |
|
|
22 | (1) |
|
1.7.4 Framing and Sampling |
|
|
22 | (1) |
|
|
23 | (2) |
|
PART 2 Designing Experiments with Corpora |
|
|
25 | (70) |
|
2 The Idea of Corpus Experiments |
|
|
27 | (20) |
|
|
27 | (1) |
|
2.2 Experimentation and Observation |
|
|
28 | (4) |
|
|
28 | (1) |
|
2.2.2 Research Questions and Hypotheses |
|
|
29 | (2) |
|
2.2.3 From Hypothesis to Experiment |
|
|
31 | (1) |
|
2.3 Evaluating a Hypothesis |
|
|
32 | (5) |
|
2.3.1 The Chi-Square Test |
|
|
33 | (1) |
|
|
34 | (1) |
|
2.3.3 Visualising Proportions, Probabilities and Significance |
|
|
35 | (2) |
|
2.4 Refining the Experiment |
|
|
37 | (2) |
|
2.5 Correlations and Causes |
|
|
39 | (1) |
|
2.6 A Linguistic Interaction Experiment |
|
|
40 | (3) |
|
2.7 Experiments and Disproof |
|
|
43 | (1) |
|
2.8 What Is the Point of an Experiment? |
|
|
44 | (1) |
|
|
44 | (3) |
|
3 That Vexed Problem of Choice |
|
|
47 | (30) |
|
|
47 | (6) |
|
3.1.1 The Traditional 'Per Million Words' Approach |
|
|
47 | (1) |
|
3.1.2 How Did Per Million Word Statistics Become Dominant? |
|
|
48 | (1) |
|
3.1.3 Choice Models and Linguistic Theory |
|
|
49 | (1) |
|
3.1.4 The Vexed Problem of Choice |
|
|
50 | (1) |
|
3.1.5 Exposure Rates and Other Experimental Models |
|
|
51 | (1) |
|
3.1.6 What Do We Mean by 'Choice'? |
|
|
52 | (1) |
|
|
53 | (9) |
|
3.2.1 Types of Mutual Substitution |
|
|
53 | (1) |
|
3.2.2 Multi-Way Choices and Decision Trees |
|
|
54 | (2) |
|
3.2.3 Binomial Statistics, Tests and Time Series |
|
|
56 | (3) |
|
3.2.4 Lavandera's Dangerous Hypothesis |
|
|
59 | (3) |
|
3.3 A Methodological Progression? |
|
|
62 | (5) |
|
|
62 | (1) |
|
3.3.2 Selecting a More Plausible Baseline |
|
|
63 | (2) |
|
3.3.3 Enumerating Alternates |
|
|
65 | (1) |
|
3.3.4 Linguistically Restricting the Sample |
|
|
66 | (1) |
|
3.3.5 Eliminating Non-Alternating Cases |
|
|
67 | (1) |
|
3.3.6 A Methodological Progression |
|
|
67 | (1) |
|
3.4 Objections to Variationism |
|
|
67 | (7) |
|
|
67 | (2) |
|
|
69 | (1) |
|
|
70 | (1) |
|
3.4.4 The Problem of Polysemy |
|
|
71 | (1) |
|
|
72 | (1) |
|
3.4.6 Necessary Reductionism Versus Complex Statistical Models |
|
|
72 | (1) |
|
|
73 | (1) |
|
|
74 | (3) |
|
|
77 | (10) |
|
|
77 | (1) |
|
|
78 | (1) |
|
|
79 | (4) |
|
4.4 Refining Baselines by Type |
|
|
83 | (2) |
|
|
85 | (2) |
|
5 Balanced Samples and Imagined Populations |
|
|
87 | (8) |
|
|
87 | (1) |
|
5.2 A Study in Genre Variation |
|
|
88 | (2) |
|
5.3 Imagining Populations |
|
|
90 | (1) |
|
5.4 Multi-Variate and Multi-Level Modelling |
|
|
91 | (1) |
|
5.5 More Texts -- or Longer Ones? |
|
|
92 | (1) |
|
|
92 | (3) |
|
PART 3 Confidence Intervals and Significance Tests |
|
|
95 | (124) |
|
6 Introducing Inferential Statistics |
|
|
97 | (19) |
|
6.1 Why Is Statistics Difficult? |
|
|
97 | (2) |
|
6.2 The Idea of Inferential Statistics |
|
|
99 | (1) |
|
6.3 The Randomness of Life |
|
|
99 | (14) |
|
6.3.1 The Binomial Distribution |
|
|
99 | (3) |
|
6.3.2 The Ideal Binomial Distribution |
|
|
102 | (1) |
|
6.3.3 Skewed Distributions |
|
|
103 | (1) |
|
6.3.4 From Binomial to Normal |
|
|
104 | (3) |
|
6.3.5 From Gauss to Wilson |
|
|
107 | (4) |
|
6.3.6 Scatter and Confidence |
|
|
111 | (2) |
|
|
113 | (3) |
|
7 Plotting With Confidence |
|
|
116 | (18) |
|
|
116 | (5) |
|
|
118 | (1) |
|
7.1.2 Comparing Observations and Identifying Significant Differences |
|
|
119 | (2) |
|
|
121 | (3) |
|
7.2.1 Step 1. Gather Raw Data |
|
|
121 | (1) |
|
7.2.2 Step 2. Calculate Basic Wilson Score Interval Terms |
|
|
122 | (1) |
|
7.2.3 Step 3. Calculate the Wilson Interval |
|
|
123 | (1) |
|
7.2.4 Step 4. Plotting Intervals on Graphs |
|
|
124 | (1) |
|
7.3 Comparing and Plotting Change |
|
|
124 | (7) |
|
7.3.1 The Newcombe-Wilson Interval |
|
|
124 | (2) |
|
7.3.2 Comparing Intervals: An Illustration |
|
|
126 | (1) |
|
7.3.3 What Does the Newcombe-Wilson Interval Represent? |
|
|
127 | (1) |
|
7.3.4 Comparing Multiple Points |
|
|
127 | (1) |
|
7.3.5 Plotting Percentage Difference |
|
|
128 | (2) |
|
7.3.6 Floating Bar Charts |
|
|
130 | (1) |
|
|
131 | (1) |
|
|
131 | (3) |
|
8 From Intervals to Tests |
|
|
134 | (32) |
|
|
134 | (6) |
|
8.1.1 Binomial Intervals and Tests |
|
|
135 | (1) |
|
8.1.2 Sampling Assumptions |
|
|
135 | (2) |
|
8.1.3 Deriving a Binomial Distribution |
|
|
137 | (2) |
|
|
139 | (1) |
|
8.2 Tests for a Single Binomial Proportion |
|
|
140 | (9) |
|
8.2.1 The Single-Sample z Test |
|
|
140 | (2) |
|
8.2.2 The 2 × 1 Goodness of Fit χ² Test |
|
|
142 | (1) |
|
8.2.3 The Wilson Score Interval |
|
|
143 | (1) |
|
8.2.4 Correcting for Continuity |
|
|
144 | (2) |
|
8.2.5 The 'Exact' Binomial Test |
|
|
146 | (1) |
|
8.2.6 The Clopper-Pearson Interval |
|
|
147 | (1) |
|
8.2.7 The Log-Likelihood Test |
|
|
147 | (1) |
|
8.2.8 A Simple Performance Comparison |
|
|
148 | (1) |
|
8.3 Tests for Comparing Two Observed Proportions |
|
|
149 | (6) |
|
8.3.1 The 2 × 2 χ² and z Test for Two Independent Proportions |
|
|
149 | (2) |
|
8.3.2 The z Test for Two Independent Proportions from Independent Populations |
|
|
151 | (2) |
|
8.3.3 The z Test for Two Independent Proportions with a Given Difference in Population Means |
|
|
153 | (1) |
|
8.3.4 Continuity-Corrected 2 × 2 Tests |
|
|
154 | (1) |
|
8.3.5 The Fisher 'Exact' Test |
|
|
154 | (1) |
|
8.4 Applying Contingency Tests |
|
|
155 | (7) |
|
|
155 | (1) |
|
8.4.2 Analysing Larger Tables |
|
|
156 | (2) |
|
|
158 | (1) |
|
|
159 | (1) |
|
8.4.5 Large Samples and Small Populations |
|
|
160 | (2) |
|
8.5 Comparing the Results of Experiments |
|
|
162 | (1) |
|
|
163 | (3) |
|
9 Comparing Frequencies in the Same Distribution |
|
|
166 | (5) |
|
|
166 | (1) |
|
9.2 The Single-Sample z Test |
|
|
166 | (2) |
|
9.2.1 Comparing Frequency Pairs for Significant Difference |
|
|
168 | (1) |
|
9.2.2 Performing the Test |
|
|
168 | (1) |
|
9.3 Testing and Interpreting Intervals |
|
|
168 | (1) |
|
9.3.1 The Wilson Interval Comparison Heuristic |
|
|
168 | (1) |
|
9.3.2 Visualising the Test |
|
|
169 | (1) |
|
|
169 | (2) |
|
10 Reciprocating the Wilson Interval |
|
|
171 | (7) |
|
|
171 | (1) |
|
10.2 The Wilson Interval of Mean Utterance Length |
|
|
171 | (4) |
|
10.2.1 Scatter and Confidence |
|
|
171 | (1) |
|
10.2.2 From Length to Proportion |
|
|
172 | (1) |
|
10.2.3 Example: Confidence Intervals on Mean Length of Utterance |
|
|
173 | (1) |
|
10.2.4 Plotting the Results |
|
|
174 | (1) |
|
10.3 Intervals on Monotonic Functions of p |
|
|
175 | (1) |
|
|
176 | (2) |
|
11 Competition Between Choices Over Time |
|
|
178 | (17) |
|
|
178 | (1) |
|
|
178 | (2) |
|
11.3 Boundaries and Confidence Intervals |
|
|
180 | (2) |
|
11.3.1 Confidence Intervals for p |
|
|
180 | (1) |
|
11.3.2 Logistic Curves and Wilson Intervals |
|
|
180 | (2) |
|
|
182 | (7) |
|
11.4.1 From Linear to Logistic Regression |
|
|
183 | (1) |
|
11.4.2 Logit-Wilson Regression |
|
|
183 | (1) |
|
11.4.3 Example 1: The Decline of the To-infinitive Perfect |
|
|
184 | (2) |
|
11.4.4 Example 2: Catenative Verbs in Competition |
|
|
186 | (1) |
|
|
186 | (3) |
|
11.5 Impossible Logistic Multinomials |
|
|
189 | (4) |
|
|
190 | (1) |
|
11.5.2 Impossible Multinomials |
|
|
190 | (1) |
|
11.5.3 Possible Hierarchical Multinomials |
|
|
191 | (1) |
|
11.5.4 A Hierarchical Reanalysis of Example 2 |
|
|
191 | (1) |
|
11.5.5 The Three-Body Problem |
|
|
191 | (2) |
|
|
193 | (2) |
|
12 The Replication Crisis and the New Statistics |
|
|
195 | (10) |
|
|
195 | (1) |
|
12.2 A Corpus Linguistics Debate |
|
|
195 | (2) |
|
|
197 | (1) |
|
12.4 The Road Not Travelled |
|
|
198 | (1) |
|
12.5 What Does This Mean for Corpus Linguistics? |
|
|
199 | (2) |
|
12.6 Some Recommendations |
|
|
201 | (2) |
|
12.6.1 Recommendation 1: Include a Replication Step |
|
|
201 | (1) |
|
12.6.2 Recommendation 2: Focus on Large Effects - and Clear Visualisations |
|
|
202 | (1) |
|
12.6.3 Recommendation 3: Play Devil's Advocate |
|
|
202 | (1) |
|
12.6.4 A Checklist for Empirical Linguistics |
|
|
203 | (1) |
|
|
203 | (2) |
|
13 Choosing the Right Test |
|
|
205 | (14) |
|
|
205 | (4) |
|
13.1.1 Choosing a Dependent Variable and Baseline |
|
|
206 | (1) |
|
13.1.2 Choosing Independent Variables |
|
|
207 | (2) |
|
13.2 Tests for Categorical Data |
|
|
209 | (4) |
|
13.2.1 Two Types of Contingency Test |
|
|
209 | (1) |
|
13.2.2 The Benefits of Simple Tests |
|
|
210 | (1) |
|
13.2.3 Visualising Uncertainty |
|
|
210 | (1) |
|
13.2.4 When to Use Goodness of Fit Tests |
|
|
211 | (1) |
|
13.2.5 Tests for Comparing Results |
|
|
212 | (1) |
|
13.2.6 Optimum Methods of Calculation |
|
|
212 | (1) |
|
13.3 Tests for Other Types of Data |
|
|
213 | (4) |
|
13.3.1 t Tests for Comparing Two Independent Samples of Numeric Data |
|
|
213 | (2) |
|
|
215 | (1) |
|
13.3.3 Tests for Other Types of Variables |
|
|
216 | (1) |
|
|
217 | (1) |
|
|
217 | (2) |
|
PART 4 Effect Sizes and Meta-Tests |
|
|
219 | (42) |
|
|
221 | (12) |
|
|
221 | (1) |
|
14.2 Effect Sizes for Two-Variable Tables |
|
|
221 | (3) |
|
|
221 | (1) |
|
14.2.2 The Problem of Prediction |
|
|
222 | (1) |
|
|
223 | (1) |
|
14.2.4 Other Probabilistic Approaches to Dependent Probability |
|
|
224 | (1) |
|
14.3 Confidence Intervals on φ |
|
|
224 | (5) |
|
14.3.1 Confidence Intervals on 2 × 2 φ |
|
|
225 | (1) |
|
14.3.2 Confidence Intervals for Cramér's φ |
|
|
225 | (1) |
|
14.3.3 Example: Investigating Grammatical Priming |
|
|
226 | (3) |
|
14.4 Goodness of Fit Effect Sizes |
|
|
229 | (2) |
|
|
229 | (1) |
|
14.4.2 Variance-Weighted φe |
|
|
229 | (1) |
|
14.4.3 Example: Correlating the Present Perfect |
|
|
230 | (1) |
|
|
231 | (2) |
|
15 Meta-Tests for Comparing Tables of Results |
|
|
233 | (28) |
|
|
233 | (4) |
|
15.1.1 How Not to Compare Test Results |
|
|
234 | (2) |
|
15.1.2 Comparing Sizes of Effect |
|
|
236 | (1) |
|
|
236 | (1) |
|
|
237 | (2) |
|
|
237 | (1) |
|
15.2.2 Correcting for Continuity |
|
|
237 | (2) |
|
15.2.3 Example Data and Notation |
|
|
239 | (1) |
|
15.3 Point and Multi-Point Tests for Homogeneity Tables |
|
|
239 | (4) |
|
15.3.1 Reorganising Contingency Tables for 2 × 1 Tests |
|
|
239 | (1) |
|
15.3.2 The Newcombe-Wilson Point Test |
|
|
240 | (1) |
|
15.3.3 The Gaussian Point Test |
|
|
241 | (1) |
|
15.3.4 The Multi-Point Test for r × c Homogeneity Tables |
|
|
242 | (1) |
|
15.4 Gradient Tests for Homogeneity Tables |
|
|
243 | (6) |
|
15.4.1 The 2 × 2 Newcombe-Wilson Gradient Test |
|
|
244 | (1) |
|
15.4.2 Cramér's φ Interval and Test |
|
|
245 | (1) |
|
15.4.3 r × 2 Homogeneity Gradient Tests |
|
|
246 | (3) |
|
15.4.4 Interpreting Gradient Meta-Tests for Large Tables |
|
|
249 | (1) |
|
15.5 Gradient Tests for Goodness of Fit Tables |
|
|
249 | (3) |
|
15.5.1 The 2 × 1 Wilson Interval Gradient Test |
|
|
250 | (2) |
|
15.5.2 r × 1 Goodness of Fit Gradient Tests |
|
|
252 | (1) |
|
|
252 | (6) |
|
15.6.1 Point Tests for Subsets |
|
|
253 | (2) |
|
15.6.2 Multi-Point Subset Tests |
|
|
255 | (1) |
|
15.6.3 Gradient Subset Tests |
|
|
255 | (1) |
|
15.6.4 Goodness of Fit Subset Tests |
|
|
255 | (3) |
|
|
258 | (3) |
|
PART 5 Statistical Solutions for Corpus Samples |
|
|
261 | (34) |
|
16 Conducting Research with Imperfect Data |
|
|
263 | (14) |
|
|
263 | (1) |
|
16.2 Reviewing Subsamples |
|
|
264 | (5) |
|
16.2.1 Example 1: Get Versus Be Passive |
|
|
264 | (1) |
|
16.2.2 Subsampling and Reviewing |
|
|
265 | (1) |
|
16.2.3 Estimating the Observed Probability p |
|
|
266 | (1) |
|
16.2.4 Contingency Tests and Multinomial Dependent Variables |
|
|
267 | (2) |
|
16.3 Reviewing Preliminary Analyses |
|
|
269 | (5) |
|
16.3.1 Example 2: Embedded and Sequential Postmodifiers |
|
|
269 | (1) |
|
16.3.2 Testing the Worst-Case Scenario |
|
|
270 | (2) |
|
16.3.3 Combining Subsampling with Worst-Case Analysis |
|
|
272 | (2) |
|
16.3.4 Ambiguity and Error |
|
|
274 | (1) |
|
16.4 Resampling and p-hacking |
|
|
274 | (1) |
|
|
275 | (2) |
|
17 Adjusting Intervals for Random-Text Samples |
|
|
277 | (18) |
|
|
277 | (1) |
|
17.2 Recalibrating Binomial Models |
|
|
278 | (2) |
|
17.3 Examples with Large Samples |
|
|
280 | (7) |
|
17.3.1 Example 1: Interrogative Clause Proportion, 'Direct Conversations' |
|
|
280 | (2) |
|
17.3.2 Example 2: Clauses Per Word, 'Direct Conversations' |
|
|
282 | (2) |
|
17.3.3 Uneven-Size Subsamples |
|
|
284 | (1) |
|
17.3.4 Example 1 Revisited, Across ICE-GB |
|
|
284 | (3) |
|
17.4 Alternation Studies with Small Samples |
|
|
287 | (6) |
|
17.4.1 Applying the Large Sample Method |
|
|
288 | (1) |
|
17.4.2 Singletons, Partitioning and Pooling |
|
|
289 | (3) |
|
|
292 | (1) |
|
|
293 | (2) |
|
PART 6 Concluding Remarks |
|
|
295 | (22) |
|
18 Plotting the Wilson Distribution |
|
|
297 | (17) |
|
|
297 | (1) |
|
18.2 Plotting the Distribution |
|
|
298 | (4) |
|
18.2.1 Calculating w⁻(α) from the Standard Normal Distribution |
|
|
298 | (2) |
|
|
300 | (1) |
|
18.2.3 Delta Approximation |
|
|
300 | (2) |
|
|
302 | (5) |
|
18.3.1 Sample Size n = 10, Observed Proportion p = 0.5 |
|
|
302 | (1) |
|
18.3.2 Properties of Wilson Areas |
|
|
303 | (1) |
|
18.3.3 The Effect of p Tending to Extremes |
|
|
303 | (1) |
|
18.3.4 The Effect of Very Small n |
|
|
304 | (3) |
|
18.4 Further Perspectives on Wilson Distributions |
|
|
307 | (1) |
|
18.4.1 Percentiles of Wilson Distributions |
|
|
307 | (1) |
|
18.4.2 The Logit-Wilson Distribution |
|
|
307 | (1) |
|
18.5 Alternative Distributions |
|
|
308 | (2) |
|
18.5.1 Continuity-Corrected Wilson Distributions |
|
|
308 | (2) |
|
18.5.2 Clopper-Pearson Distributions |
|
|
310 | (1) |
|
|
310 | (4) |
|
|
314 | (3) |
|
|
317 | (2) |
|
A The Interval Equality Principle |
|
|
319 | (5) |
|
|
319 | (1) |
|
|
319 | (1) |
|
|
319 | (1) |
|
|
320 | (2) |
|
2.1 Wilson Score Interval |
|
|
320 | (1) |
|
2.2 Wilson Score Interval with Continuity Correction |
|
|
321 | (1) |
|
2.3 Binomial and Clopper-Pearson Intervals |
|
|
321 | (1) |
|
2.4 Log-Likelihood and Other Significance Test Functions |
|
|
321 | (1) |
|
3 Searching for Interval Bounds with a Computer |
|
|
322 | (2) |
|
B Pseudo-Code for Computational Procedures |
|
|
324 | (5) |
|
1 Simple Logistic Regression Algorithm with Logit-Wilson Variance |
|
|
324 | (2) |
|
1.1 Calculate Sum of Squared Errors e for Known m and k |
|
|
324 | (1) |
|
1.2 Find Optimum Value of k by Search for Smallest Error e for Gradient m |
|
|
324 | (1) |
|
1.3 Find Optimum Values of m and k by the Method of Least Squares |
|
|
325 | (1) |
|
|
326 | (1) |
|
2 Binomial and Fisher Functions |
|
|
326 | (3) |
|
|
326 | (1) |
|
2.2 The Clopper-Pearson Interval |
|
|
327 | (2) |
Glossary |
|
329 | (13) |
References |
|
342 | (5) |
Index |
|
347 | |