Preface |
|
ix | |
Notation |
|
xiii | |
|
|
1 | (10) |
|
1.1 Natural Language Processing and Its Neighbors |
|
|
1 | (4) |
|
1.2 Three Themes in Natural Language Processing |
|
|
5 | (6) |
|
|
11 | (106) |
|
2 Linear Text Classification |
|
|
13 | (34) |
|
|
13 | (4) |
|
|
17 | (7) |
|
2.3 Discriminative Learning |
|
|
24 | (4) |
|
2.4 Loss Functions and Large-Margin Classification |
|
|
28 | (6) |
|
|
34 | (3) |
|
|
37 | (3) |
|
2.7 *Additional Topics in Classification |
|
|
40 | (2) |
|
2.8 Summary of Learning Algorithms |
|
|
42 | (5) |
|
3 Nonlinear Classification |
|
|
47 | (20) |
|
3.1 Feedforward Neural Networks |
|
|
48 | (2) |
|
3.2 Designing Neural Networks |
|
|
50 | (3) |
|
3.3 Learning Neural Networks |
|
|
53 | (8) |
|
3.4 Convolutional Neural Networks |
|
|
61 | (6) |
|
4 Linguistic Applications of Classification |
|
|
67 | (24) |
|
4.1 Sentiment and Opinion Analysis |
|
|
67 | (4) |
|
4.2 Word Sense Disambiguation |
|
|
71 | (3) |
|
4.3 Design Decisions for Text Classification |
|
|
74 | (4) |
|
4.4 Evaluating Classifiers |
|
|
78 | (7) |
|
|
85 | (6) |
|
5 Learning without Supervision |
|
|
91 | (26) |
|
5.1 Unsupervised Learning |
|
|
91 | (8) |
|
5.2 Applications of Expectation-Maximization |
|
|
99 | (3) |
|
5.3 Semi-Supervised Learning |
|
|
102 | (3) |
|
|
105 | (4) |
|
5.5 *Other Approaches to Learning with Latent Variables |
|
|
109 | (8) |
|
|
117 | (150) |
|
|
119 | (18) |
|
6.1 N-Gram Language Models |
|
|
120 | (2) |
|
6.2 Smoothing and Discounting |
|
|
122 | (5) |
|
6.3 Recurrent Neural Network Language Models |
|
|
127 | (5) |
|
6.4 Evaluating Language Models |
|
|
132 | (2) |
|
6.5 Out-of-Vocabulary Words |
|
|
134 | (3) |
|
|
137 | (30) |
|
7.1 Sequence Labeling as Classification |
|
|
137 | (2) |
|
7.2 Sequence Labeling as Structure Prediction |
|
|
139 | (1) |
|
7.3 The Viterbi Algorithm |
|
|
140 | (5) |
|
|
145 | (4) |
|
7.5 Discriminative Sequence Labeling with Features |
|
|
149 | (9) |
|
7.6 Neural Sequence Labeling |
|
|
158 | (3) |
|
7.7 *Unsupervised Sequence Labeling |
|
|
161 | (6) |
|
8 Applications of Sequence Labeling |
|
|
167 | (16) |
|
8.1 Part-of-Speech Tagging |
|
|
167 | (6) |
|
8.2 Morphosyntactic Attributes |
|
|
173 | (2) |
|
8.3 Named Entity Recognition |
|
|
175 | (1) |
|
|
176 | (1) |
|
|
177 | (1) |
|
|
178 | (5) |
|
|
183 | (32) |
|
|
184 | (14) |
|
9.2 Context-Free Languages |
|
|
198 | (11) |
|
9.3 *Mildly Context-Sensitive Languages |
|
|
209 | (6) |
|
|
215 | (28) |
|
10.1 Deterministic Bottom-Up Parsing |
|
|
216 | (3) |
|
|
219 | (3) |
|
10.3 Weighted Context-Free Grammars |
|
|
222 | (5) |
|
10.4 Learning Weighted Context-Free Grammars |
|
|
227 | (4) |
|
|
231 | (7) |
|
10.6 Beyond Context-Free Parsing |
|
|
238 | (5) |
|
|
243 | (24) |
|
|
243 | (5) |
|
11.2 Graph-Based Dependency Parsing |
|
|
248 | (5) |
|
11.3 Transition-Based Dependency Parsing |
|
|
253 | (8) |
|
|
261 | (6) |
|
|
267 | (110) |
|
|
269 | (20) |
|
12.1 Meaning and Denotation |
|
|
270 | (1) |
|
12.2 Logical Representations of Meaning |
|
|
270 | (4) |
|
12.3 Semantic Parsing and the Lambda Calculus |
|
|
274 | (6) |
|
12.4 Learning Semantic Parsers |
|
|
280 | (9) |
|
13 Predicate-Argument Semantics |
|
|
289 | (20) |
|
|
291 | (4) |
|
13.2 Semantic Role Labeling |
|
|
295 | (7) |
|
13.3 Abstract Meaning Representation |
|
|
302 | (7) |
|
14 Distributional and Distributed Semantics |
|
|
309 | (24) |
|
14.1 The Distributional Hypothesis |
|
|
309 | (2) |
|
14.2 Design Decisions for Word Representations |
|
|
311 | (2) |
|
14.3 Latent Semantic Analysis |
|
|
313 | (2) |
|
|
315 | (2) |
|
14.5 Neural Word Embeddings |
|
|
317 | (5) |
|
14.6 Evaluating Word Embeddings |
|
|
322 | (2) |
|
14.7 Distributed Representations beyond Distributional Statistics |
|
|
324 | (3) |
|
14.8 Distributed Representations of Multiword Units |
|
|
327 | (6) |
|
|
333 | (24) |
|
15.1 Forms of Referring Expressions |
|
|
334 | (5) |
|
15.2 Algorithms for Coreference Resolution |
|
|
339 | (9) |
|
15.3 Representations for Coreference Resolution |
|
|
348 | (5) |
|
15.4 Evaluating Coreference Resolution |
|
|
353 | (4) |
|
|
357 | (20) |
|
|
357 | (2) |
|
16.2 Entities and Reference |
|
|
359 | (3) |
|
|
362 | (15) |
|
|
377 | |
|
17 Information Extraction |
|
|
379 | |
|
|
381 | (6) |
|
|
387 | |