|
|
1 | (14) |
|
1.1 From Introspective to Corpus-Informed Judgments |
|
|
1 | (2) |
|
1.2 Looking for Corpus Linguistics |
|
|
3 | (12) |
|
1.2.1 What Counts as a Corpus |
|
|
3 | (3) |
|
1.2.2 What Linguists Do with the Corpus |
|
|
6 | (2) |
|
1.2.3 How Central the Corpus Is to a Linguist's Work |
|
|
8 | (2) |
|
|
10 | (5) |
|
Part I Methods in Corpus Linguistics |
|
|
|
|
15 | (36) |
|
|
15 | (1) |
|
2.2 Downloads and Installs |
|
|
15 | (2) |
|
2.2.1 Downloading and Installing R |
|
|
16 | (1) |
|
2.2.2 Downloading and Installing RStudio |
|
|
16 | (1) |
|
2.2.3 Downloading the Book Materials |
|
|
17 | (1) |
|
2.3 Setting the Working Directory |
|
|
17 | (1) |
|
|
17 | (1) |
|
|
18 | (1) |
|
2.5.1 Downloading Packages |
|
|
18 | (1) |
|
|
19 | (1) |
|
|
19 | (1) |
|
2.7 Variables and Assignment |
|
|
20 | (1) |
|
2.8 Functions and Arguments |
|
|
21 | (3) |
|
2.8.1 Ready-Made Functions |
|
|
21 | (1) |
|
2.8.2 User-Defined Functions |
|
|
22 | (2) |
|
|
24 | (17) |
|
|
24 | (9) |
|
|
33 | (1) |
|
|
34 | (2) |
|
2.9.4 Data Frames (and Factors) |
|
|
36 | (5) |
|
|
41 | (2) |
|
2.11 If and if ... else Statements |
|
|
43 | (2) |
|
|
43 | (1) |
|
2.11.2 If ... else Statements |
|
|
44 | (1) |
|
|
45 | (1) |
|
2.13 Common Mistakes and How to Avoid Them |
|
|
46 | (1) |
|
|
47 | (4) |
|
|
47 | (2) |
|
|
49 | (2) |
|
|
51 | (18) |
|
|
51 | (1) |
|
3.2 Corpus Compilation: Kennedy's Five Steps |
|
|
52 | (2) |
|
|
54 | (4) |
|
3.3.1 Collecting Textual Data |
|
|
54 | (1) |
|
3.3.2 Character Encoding Issues |
|
|
55 | (2) |
|
3.3.3 Creating an Unannotated Corpus |
|
|
57 | (1) |
|
|
58 | (7) |
|
|
58 | (1) |
|
|
58 | (1) |
|
|
59 | (4) |
|
|
63 | (2) |
|
|
65 | (4) |
|
|
65 | (1) |
|
|
66 | (3) |
|
4 Processing and Manipulating Character Strings |
|
|
69 | (18) |
|
|
69 | (1) |
|
|
69 | (2) |
|
|
70 | (1) |
|
4.2.2 Loading Several Text Files |
|
|
70 | (1) |
|
4.3 First Forays into Character String Processing |
|
|
71 | (2) |
|
|
71 | (1) |
|
|
72 | (1) |
|
4.3.3 Replacing and Deleting |
|
|
72 | (1) |
|
|
73 | (1) |
|
|
73 | (14) |
|
|
73 | (1) |
|
4.4.2 Literals vs. Metacharacters |
|
|
74 | (1) |
|
|
74 | (1) |
|
|
75 | (1) |
|
4.4.5 Alternations and Groupings |
|
|
76 | (1) |
|
|
77 | (2) |
|
4.4.7 Lazy vs. Greedy Matching |
|
|
79 | (1) |
|
|
80 | (1) |
|
4.4.9 Exact Matching with strapply() |
|
|
81 | (1) |
|
|
82 | (3) |
|
|
85 | (2) |
|
5 Applied Character String Processing |
|
|
87 | (28) |
|
|
87 | (1) |
|
|
87 | (17) |
|
5.2.1 A Concordance Based on an Unannotated Corpus |
|
|
87 | (8) |
|
5.2.2 A Concordance Based on an Annotated Corpus |
|
|
95 | (9) |
|
5.3 Making a Data Frame from an Annotated Corpus |
|
|
104 | (4) |
|
5.3.1 Planning the Data Frame |
|
|
104 | (1) |
|
5.3.2 Compiling the Data Frame |
|
|
104 | (2) |
|
|
106 | (2) |
|
|
108 | (7) |
|
5.4.1 A Frequency List of a Raw Text File |
|
|
108 | (2) |
|
5.4.2 A Frequency List of an Annotated File |
|
|
110 | (3) |
|
|
113 | (1) |
|
|
114 | (1) |
|
6 Summary Graphics for Frequency Data |
|
|
115 | (24) |
|
|
115 | (1) |
|
6.2 Plots, Barplots, and Histograms |
|
|
115 | (3) |
|
|
118 | (4) |
|
|
122 | (3) |
|
|
125 | (2) |
|
6.6 Reshaping Tabulated Data |
|
|
127 | (5) |
|
|
132 | (7) |
|
|
133 | (2) |
|
|
135 | (4) |
|
Part II Statistics for Corpus Linguistics |
|
|
|
|
139 | (12) |
|
|
139 | (1) |
|
|
140 | (5) |
|
|
140 | (2) |
|
|
142 | (1) |
|
|
143 | (2) |
|
|
145 | (6) |
|
|
145 | (1) |
|
|
146 | (1) |
|
7.3.3 Variance and Standard Deviation |
|
|
147 | (1) |
|
|
148 | (3) |
|
8 Notions of Statistical Testing |
|
|
151 | (46) |
|
|
151 | (1) |
|
|
151 | (6) |
|
|
151 | (1) |
|
8.2.2 Simple Probabilities |
|
|
152 | (1) |
|
8.2.3 Joint and Marginal Probabilities |
|
|
153 | (2) |
|
8.2.4 Union vs. Intersection |
|
|
155 | (1) |
|
8.2.5 Conditional Probabilities |
|
|
155 | (1) |
|
|
156 | (1) |
|
8.3 Populations, Samples, and Individuals |
|
|
157 | (1) |
|
|
158 | (1) |
|
8.5 Response/Dependent vs. Explanatory/Descriptive/Independent Variables |
|
|
159 | (1) |
|
|
160 | (2) |
|
|
162 | (1) |
|
8.8 Probability Distributions |
|
|
163 | (15) |
|
8.8.1 Discrete Distributions |
|
|
165 | (4) |
|
8.8.2 Continuous Distributions |
|
|
169 | (9) |
|
|
178 | (7) |
|
8.9.1 A Case Study: The Quotative System in British and Canadian Youth |
|
|
178 | (7) |
|
8.10 The Fisher Exact Test of Independence |
|
|
185 | (1) |
|
|
186 | (11) |
|
|
186 | (3) |
|
|
189 | (3) |
|
|
192 | (1) |
|
8.11.4 Correlation Is Not Causation |
|
|
193 | (1) |
|
|
193 | (1) |
|
|
194 | (3) |
|
9 Association and Productivity |
|
|
197 | (42) |
|
|
197 | (1) |
|
9.2 Cooccurrence Phenomena |
|
|
198 | (5) |
|
|
198 | (2) |
|
|
200 | (2) |
|
|
202 | (1) |
|
|
203 | (23) |
|
9.3.1 Measuring Significant Co-occurrences |
|
|
203 | (1) |
|
9.3.2 The Logic of Association Measures |
|
|
204 | (1) |
|
9.3.3 A Quick Inventory of Association Measures |
|
|
205 | (5) |
|
9.3.4 A Loop for Association Measures |
|
|
210 | (3) |
|
9.3.5 There Is No Perfect Association Measure |
|
|
213 | (1) |
|
|
213 | (9) |
|
9.3.7 Asymmetric Association Measures |
|
|
222 | (4) |
|
9.4 Lexical Richness and Productivity |
|
|
226 | (13) |
|
9.4.1 Hapax-Based Measures |
|
|
226 | (1) |
|
9.4.2 Types, Tokens, and Type-Token Ratio |
|
|
227 | (1) |
|
9.4.3 Vocabulary Growth Curves |
|
|
228 | (7) |
|
|
235 | (1) |
|
|
235 | (4) |
|
|
239 | (56) |
|
|
239 | (3) |
|
10.1.1 Multidimensional Data |
|
|
239 | (1) |
|
|
240 | (2) |
|
10.2 Principal Component Analysis |
|
|
242 | (10) |
|
10.2.1 Principles of Principal Component Analysis |
|
|
243 | (1) |
|
10.2.2 A Case Study: Characterizing Genres with Prosody in Spoken French |
|
|
243 | (2) |
|
|
245 | (7) |
|
10.3 An Alternative to PCA: t-SNE |
|
|
252 | (5) |
|
10.4 Correspondence Analysis |
|
|
257 | (11) |
|
10.4.1 Principles of Correspondence Analysis |
|
|
257 | (1) |
|
10.4.2 Case Study: General Extenders in the Speech of English Teenagers |
|
|
257 | (4) |
|
|
261 | (5) |
|
10.4.4 Supplementary Variables |
|
|
266 | (2) |
|
10.5 Multiple Correspondence Analysis |
|
|
268 | (8) |
|
10.5.1 Principles of Multiple Correspondence Analysis |
|
|
269 | (1) |
|
10.5.2 Case Study: Predeterminer vs. Preadjectival Uses of Quite and Rather |
|
|
270 | (5) |
|
10.5.3 Confidence Ellipses |
|
|
275 | (1) |
|
|
276 | (1) |
|
10.6 Hierarchical Cluster Analysis |
|
|
276 | (7) |
|
10.6.1 The Principles of Hierarchical Cluster Analysis |
|
|
277 | (1) |
|
10.6.2 Case Study: Clustering English Intensifiers |
|
|
278 | (1) |
|
|
279 | (2) |
|
10.6.4 Standardizing Variables |
|
|
281 | (2) |
|
|
283 | (12) |
|
|
283 | (2) |
|
10.7.2 The Linguistic Relevance of Graphs |
|
|
285 | (5) |
|
|
290 | (2) |
|
|
292 | (3) |
|
|
295 | (6) |
|
|
295 | (2) |
|
|
295 | (2) |
|
|
297 | (4) |
|
|
297 | (1) |
|
A.2.2 Discrete Probability Distributions |
|
|
298 | (2) |
|
A.2.3 A χ² Distribution Table |
|
|
300 | (1) |
|
|
301 | (8) |
Solutions |
|
309 | (42) |
Index |
|
351 | |