| | viii | |
| | x | |
| Acknowledgments | xi | |
| | 1 | (6) |
| 1.1 Why Another Introduction to Corpus Linguistics? | 1 | (3) |
| | 4 | (3) |
| 2 The Four Central Corpus-Linguistic Methods | 7 | (14) |
| | 7 | (5) |
| | 7 | (2) |
| 2.1.2 What Kinds of Corpora Are There? | 9 | (3) |
| | 12 | (2) |
| 2.3 Dispersion Information | 14 | (1) |
| 2.4 Lexical Co-occurrence: Collocations | 15 | (2) |
| 2.5 (Lexico-)Grammatical Co-occurrence: Concordances | 17 | (4) |
| | 21 | (120) |
| 3.1 Data Structures, Functions, Arguments | 25 | (6) |
| | 31 | (18) |
| | 31 | (5) |
| | 36 | (4) |
| 3.2.3 Accessing and Processing (Parts of) Vectors | 40 | (8) |
| | 48 | (1) |
| | 49 | (2) |
| | 51 | (9) |
| 3.4.1 Generating Data Frames in R | 51 | (2) |
| 3.4.2 Loading and Saving Data Frames in R | 53 | (2) |
| 3.4.3 Accessing and Processing (Parts of) Data Frames in R | 55 | (5) |
| | 60 | (5) |
| 3.6 Elementary Programming Issues | 65 | (11) |
| 3.6.1 Conditional Expressions | 65 | (2) |
| | 67 | (2) |
| 3.6.3 Rules of Programming | 69 | (7) |
| 3.7 Character/String Processing | 76 | (35) |
| 3.7.1 Getting Information From and Accessing Character Vectors | 76 | (1) |
| 3.7.2 Elementary Ways to Change Character Vectors | 77 | (1) |
| 3.7.3 Merging/Splitting Character Vectors Without Regular Expressions | 78 | (2) |
| 3.7.4 Searching and Replacing Without Regular Expressions | 80 | (9) |
| 3.7.5 Searching and Replacing With Regular Expressions | 89 | (18) |
| 3.7.6 Merging/Splitting Character Vectors With Regular Expressions | 107 | (4) |
| 3.8 Two Particularly Relevant Areas: Unicode and XML | 111 | (18) |
| 3.8.1 Some Notes on Handling Unicode | 111 | (6) |
| 3.8.2 Some Notes on Handling XML Data | 117 | (12) |
| 3.9 File and Directory Operations | 129 | (4) |
| 3.10 Writing Your Own Functions and Some Final Recommendations | 133 | (8) |
| 4 Some Basic Statistical Notions and Tests | 141 | (36) |
| 4.1 Introduction to Statistical Thinking | 141 | (10) |
| 4.1.1 Variables and Their Roles in an Analysis | 142 | (1) |
| 4.1.2 Variables and Their Information Value | 142 | (1) |
| 4.1.3 Hypotheses: Formulation and Operationalization | 142 | (6) |
| | 148 | (2) |
| 4.1.5 Hypothesis (and Significance) Testing | 150 | (1) |
| 4.2 Categorical Dependent Variables | 151 | (9) |
| 4.2.1 No Independent Variables | 151 | (3) |
| 4.2.2 One Independent Categorical Variable | 154 | (6) |
| 4.3 Numeric Dependent Variables | 160 | (14) |
| 4.3.1 No Independent Variables | 161 | (6) |
| 4.3.2 One Independent Categorical Variable | 167 | (3) |
| 4.3.3 One Independent Numeric Variable | 170 | (4) |
| | 174 | (3) |
| 5 Using R in Corpus Linguistics: Case Studies | 177 | (92) |
| | 179 | (5) |
| 5.1.1 Dispersion 1: HIV, Keeper, and Lively in the BNC | 179 | (3) |
| 5.1.2 Dispersion 2: Perl in a Wikipedia Entry | 182 | (2) |
| 5.2 Frequencies, Frequency Lists, and Key Words | 184 | (24) |
| | 184 | (3) |
| | 187 | (2) |
| 5.2.3 Zero-Derivation of Run and Walk in the BNC | 189 | (3) |
| 5.2.4 Word and Sentence Lengths in the BNC | 192 | (2) |
| 5.2.5 Approximating Syntactic Complexity: Fichtner's C | 194 | (3) |
| | 197 | (3) |
| 5.2.7 Frequencies of -ic and -ical Adjectives | 200 | (3) |
| 5.2.8 Frequencies of All Word-Tag Combinations in the BNC | 203 | (5) |
| 5.3 Co-Occurrence Data: Collocation/Colligation/Collostruction | 208 | (20) |
| 5.3.1 The Collocation Alphabetical Order in the BNC | 208 | (2) |
| 5.3.2 Frequencies of Collocates of -ic and -ical Adjectives | 210 | (2) |
| 5.3.3 The Reduction of to BE Before Verbs | 212 | (3) |
| 5.3.4 Verb Collexemes After Must | 215 | (3) |
| 5.3.5 Noun Collocates After Speed Adjectives in COCA (Fiction) | 218 | (3) |
| 5.3.6 Collocates of Will and Shall in COHA (1810–1890) | 221 | (4) |
| | 225 | (3) |
| | 228 | (41) |
| 5.4.1 Corpus Conversion: the ICE-GB | 228 | (3) |
| 5.4.2 Three Indexing Applications | 231 | (4) |
| | 235 | (2) |
| | 237 | (1) |
| 5.4.5 Retrieving Adjective Sequences From Untagged Corpora | 237 | (5) |
| 5.4.6 Type-Token Ratios/Vocabulary Growth: Hamlet vs. Macbeth | 242 | (6) |
| 5.4.7 Hyphenated Forms and Their Alternative Spellings | 248 | (3) |
| 5.4.8 Lexical Frequency Profiles | 251 | (6) |
| 5.4.9 CHAT Files 1: Eve's MLUs and ttrs | 257 | (6) |
| 5.4.10 CHAT Files 2: Merging Multiple Files | 263 | (6) |
| | 269 | (2) |
| Appendix | 271 | (1) |
| Index | 272 | |