|
|
1 | (22) |
|
|
1 | (8) |
|
1.1.1 What Are Multiword Expressions? |
|
|
1 | (3) |
|
1.1.2 Why Do They Matter? |
|
|
4 | (3) |
|
1.1.3 What Happens If We Ignore Them? |
|
|
7 | (2) |
|
1.2 A New Framework for MWE Treatment |
|
|
9 | (5) |
|
|
9 | (1) |
|
|
10 | (1) |
|
|
11 | (3) |
|
|
14 | (2) |
|
|
16 | (7) |
|
|
17 | (6) |
|
Part I Multiword Expressions: A Tough Nut to Crack |
|
|
|
2 Definitions and Characteristics |
|
|
23 | (30) |
|
|
23 | (5) |
|
2.1.1 Theoretical Linguistics |
|
|
24 | (2) |
|
2.1.2 Computational Linguistics |
|
|
26 | (2) |
|
|
28 | (6) |
|
|
28 | (1) |
|
|
29 | (2) |
|
2.2.3 A Note on Terminology |
|
|
31 | (3) |
|
2.3 Characteristics and Characterisations |
|
|
34 | (11) |
|
2.3.1 The Compositionality Continuum |
|
|
34 | (2) |
|
2.3.2 Derived MWE Properties |
|
|
36 | (3) |
|
2.3.3 Existing MWE Typologies |
|
|
39 | (2) |
|
2.3.4 A Simplified Typology |
|
|
41 | (4) |
|
2.4 A Snapshot of the Research Field |
|
|
45 | (1) |
|
|
46 | (7) |
|
|
47 | (6) |
|
3 State of the Art in MWE Processing |
|
|
53 | (52) |
|
|
53 | (17) |
|
3.1.1 Linguistic Processing: Analysis |
|
|
54 | (3) |
|
3.1.2 Word Frequency Distributions |
|
|
57 | (3) |
|
3.1.3 N-Grams, Language Models and Suffix Arrays |
|
|
60 | (3) |
|
3.1.4 Lexical Association Measures |
|
|
63 | (7) |
|
3.2 Methods for Automatic MWE Acquisition |
|
|
70 | (10) |
|
3.2.1 Monolingual Methods |
|
|
71 | (3) |
|
3.2.2 Bi- and Multilingual Methods |
|
|
74 | (2) |
|
|
76 | (4) |
|
3.3 Other Tasks Related to MWE Processing |
|
|
80 | (11) |
|
|
80 | (3) |
|
|
83 | (1) |
|
|
84 | (2) |
|
|
86 | (5) |
|
|
91 | (14) |
|
|
93 | (12) |
|
|
|
4 Evaluation of MWE Acquisition |
|
|
105 | (22) |
|
|
106 | (8) |
|
|
106 | (3) |
|
4.1.2 Evaluation Measures |
|
|
109 | (2) |
|
|
111 | (3) |
|
|
114 | (5) |
|
4.2.1 Characteristics of Target Constructions |
|
|
115 | (1) |
|
4.2.2 Characteristics of Corpora |
|
|
116 | (3) |
|
|
119 | (1) |
|
|
119 | (2) |
|
|
121 | (6) |
|
|
122 | (5) |
|
5 A New Framework for MWE Acquisition |
|
|
127 | (32) |
|
5.1 The mwetoolkit Framework |
|
|
127 | (14) |
|
5.1.1 General Architecture |
|
|
128 | (2) |
|
|
130 | (8) |
|
|
138 | (3) |
|
|
141 | (4) |
|
5.2.1 Candidate Extraction |
|
|
141 | (1) |
|
5.2.2 Candidate Filtering |
|
|
142 | (2) |
|
|
144 | (1) |
|
5.3 Comparison with Related Approaches |
|
|
145 | (7) |
|
|
145 | (1) |
|
|
146 | (1) |
|
|
147 | (5) |
|
|
152 | (7) |
|
|
154 | (5) |
|
|
|
6 Application 1: Lexicography |
|
|
159 | (22) |
|
6.1 A Dictionary of Nominal Compounds in Greek |
|
|
159 | (7) |
|
6.1.1 Greek Nominal Compounds |
|
|
160 | (2) |
|
6.1.2 Automatic Acquisition Setup |
|
|
162 | (1) |
|
|
163 | (3) |
|
6.2 A Dictionary of Complex Predicates in Portuguese |
|
|
166 | (10) |
|
6.2.1 Portuguese Complex Predicates |
|
|
167 | (2) |
|
6.2.2 Automatic Acquisition Setup |
|
|
169 | (2) |
|
|
171 | (5) |
|
|
176 | (5) |
|
|
178 | (3) |
|
7 Application 2: Machine Translation |
|
|
181 | (20) |
|
7.1 A Brief Introduction to SMT |
|
|
183 | (3) |
|
7.2 Evaluation of Phrasal Verb Translation |
|
|
186 | (11) |
|
7.2.1 English Phrasal Verbs |
|
|
187 | (2) |
|
|
189 | (3) |
|
|
192 | (5) |
|
|
197 | (4) |
|
|
197 | (4) |
|
|
201 | (6) |
|
|
204 | (3) |
|
A Extended List of Translation Examples |
|
|
207 | (2) |
|
B Resources Used in the Experiments |
|
|
209 | (2) |
|
|
209 | (1) |
|
B.1.1 Monolingual Corpora |
|
|
209 | (1) |
|
B.1.2 Multilingual Corpora |
|
|
209 | (1) |
|
|
210 | (1) |
|
|
210 | (1) |
|
C The mwetoolkit: Documentation |
|
|
211 | (12) |
|
|
211 | (1) |
|
C.2 Installing the mwetoolkit |
|
|
212 | (1) |
|
|
212 | (1) |
|
|
212 | (1) |
|
C.2.3 Mac OS Dependencies |
|
|
213 | (1) |
|
C.2.4 Testing Your Installation |
|
|
213 | (1) |
|
|
213 | (3) |
|
|
214 | (2) |
|
C.4 Defining Patterns for Extraction |
|
|
216 | (3) |
|
|
216 | (1) |
|
C.4.2 Repetitions and Optional Elements |
|
|
216 | (1) |
|
C.4.3 Ignoring Parts of the Match |
|
|
217 | (1) |
|
|
218 | (1) |
|
|
218 | (1) |
|
C.5 Preprocessing a Corpus Using TreeTagger |
|
|
219 | (1) |
|
C.5.1 Installing TreeTagger |
|
|
219 | (1) |
|
C.5.2 Converting TreeTagger's Output to XML |
|
|
219 | (1) |
|
C.6 Preprocessing a Corpus Using RASP |
|
|
220 | (1) |
|
|
220 | (1) |
|
C.6.2 Converting RASP's Output to XML |
|
|
220 | (1) |
|
C.7 Examples of XML Files |
|
|
220 | (1) |
|
|
221 | (2) |
|
D Tagsets for POS and Syntax |
|
|
223 | (6) |
|
|
223 | (1) |
|
D.2 RASP English POS Tagset |
|
|
223 | (3) |
|
D.3 RASP English Grammatical Relations |
|
|
226 | (1) |
|
D.4 TreeTagger English POS Tagset |
|
|
227 | (2) |
|
E Detailed Lexicon Descriptions |
|
|
229 | |
|
E.1 Sentiment Verbs Extracted from Brazilian WordNet |
|
|
229 | (1) |
|
|
230 | |