
E-book: Getting Started with Natural Language Processing

  • Length: 456 pages
  • Publication date: 15-Nov-2022
  • Publisher: Manning Publications
  • Language: English
  • ISBN-13: 9781638350927
  • Format: EPUB+DRM
  • Price: €43.42*
  • * the price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that to read it you must install special software and create an Adobe ID (more information here). The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, install Adobe Digital Editions (this is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Getting Started with Natural Language Processing is a friendly, understandable tutorial filled with everything you need to get started with NLP. Full of Python code and hands-on projects, each chapter provides a concrete example with practical techniques that you can put into practice right away.

 

By following the numerous Python-based examples and real-world case studies, you'll apply NLP to search applications, extracting meaning from text, sentiment analysis, user profiling, and more. When you're done, you'll have a solid grounding in NLP that will serve as a foundation for further learning.
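As a flavor of what those examples look like (a minimal sketch for illustration only, not code from the book), chapter 2's five-step spam-filtering pipeline — define the data and classes, split the text into words, extract features, train, evaluate — can be compressed into a few lines of plain Python:

```python
from collections import Counter

def tokenize(text):
    # Step 2: split the text into lowercase words
    return text.lower().split()

def train(spam_texts, ham_texts):
    # Steps 1 and 3-4: word counts per class serve as a trivial "model"
    return {"spam": Counter(w for t in spam_texts for w in tokenize(t)),
            "ham": Counter(w for t in ham_texts for w in tokenize(t))}

def classify(model, text):
    # Score a message against each class by summed word counts;
    # the class with the higher score wins (Counter returns 0 for unseen words)
    scores = {cls: sum(counts[w] for w in tokenize(text))
              for cls, counts in model.items()}
    return max(scores, key=scores.get)

model = train(["win a free prize now", "free money offer"],
              ["meeting agenda attached", "lunch tomorrow?"])
print(classify(model, "claim your free prize"))  # -> spam
```

The book replaces each of these toy steps with a proper technique: real tokenization, normalized features, a trained classifier, and a principled evaluation (step 5).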

 

Key Features

  • Extracting information from raw text
  • Named entity recognition
  • Automating summarization of key facts
  • Topic labeling

 

For beginners to NLP with basic Python skills.

 

About the technology

Natural Language Processing is a set of data science techniques that enable machines to make sense of human text and speech. Advances in machine learning and deep learning have made NLP more efficient and reliable than ever, leading to a huge number of new tools and resources. From improving search applications to sentiment analysis, the possible applications of NLP are vast and growing.
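For instance, the simplest form of sentiment analysis (covered in chapter 7) scores a text against a sentiment lexicon. The sketch below is purely illustrative — the tiny hand-made lexicon is an assumption, whereas the book works with real resources such as SentiWordNet:

```python
# Hypothetical miniature sentiment lexicon (illustrative only);
# real lexicons assign scores to thousands of words
LEXICON = {"great": 1, "love": 1, "good": 1,
           "bad": -1, "awful": -1, "boring": -1}

def sentiment(text):
    # Sum the lexicon scores of the words; a positive total
    # marks the text as positive, anything else as negative
    score = sum(LEXICON.get(w, 0) for w in text.lower().split())
    return "pos" if score > 0 else "neg"

print(sentiment("a great movie I love it"))  # -> pos
```

Data-driven approaches (chapter 8) go further, learning sentiment from labeled examples instead of a fixed word list.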

 

Ekaterina Kochmar is an Affiliated Lecturer and a Senior Research Associate at the Natural Language and Information Processing group of the Department of Computer Science and Technology, University of Cambridge. She holds an MA degree in Computational Linguistics, an MPhil in Advanced Computer Science, and a PhD in Natural Language Processing.
Preface xiii
Acknowledgments xv
About this book xvii
About the author xxii
About the cover illustration xxiii
1 Introduction
1(30)
1.1 A brief history of NLP
2(3)
1.2 Typical tasks
5(26)
Information search
5(11)
Advanced information search: Asking the machine precise questions
16(2)
Conversational agents and intelligent virtual assistants
18(2)
Text prediction and language generation
20(5)
Spam filtering
25(1)
Machine translation
26(2)
Spell- and grammar checking
28(3)
2 Your first NLP example
31(40)
2.1 Introducing NLP in practice: Spam filtering
31(5)
2.2 Understanding the task
36(10)
Step 1 Define the data and classes
37(1)
Step 2 Split the text into words
37(5)
Step 3 Extract and normalize the features
42(1)
Step 4 Train a classifier
43(2)
Step 5 Evaluate the classifier
45(1)
2.3 Implementing your own spam filter
46(19)
Step 1 Define the data and classes
46(3)
Step 2 Split the text into words
49(1)
Step 3 Extract and normalize the features
50(3)
Step 4 Train the classifier
53(9)
Step 5 Evaluate your classifier
62(3)
2.4 Deploying your spam filter in practice
65(6)
3 Introduction to information search
71(43)
3.1 Understanding the task
72(15)
Data and data structures
75(8)
Boolean search algorithm
83(4)
3.2 Processing the data further
87(9)
Preselecting the words that matter: Stopwords removal
87(3)
Matching forms of the same word: Morphological processing
90(6)
3.3 Information weighing
96(7)
Weighing words with term frequency
97(3)
Weighing words with inverse document frequency
100(3)
3.4 Practical use of the search algorithm
103(11)
Retrieval of the most similar documents
104(2)
Evaluation of the results
106(5)
Deploying search algorithm in practice
111(3)
4 Information extraction
114(37)
4.1 Use cases
116(4)
Case 1
116(1)
Case 2
117(2)
Case 3
119(1)
4.2 Understanding the task
120(4)
4.3 Detecting word types with part-of-speech tagging
124(13)
Understanding word types
124(4)
Part-of-speech tagging with spaCy
128(9)
4.4 Understanding sentence structure with syntactic parsing
137(7)
Why sentence structure is important
137(2)
Dependency parsing with spaCy
139(5)
4.5 Building your own information extraction algorithm
144(7)
5 Author profiling as a machine-learning task
151(43)
5.1 Understanding the task
153(4)
Case 1 Authorship attribution
154(1)
Case 2 User profiling
155(2)
5.2 Machine-learning pipeline at first glance
157(18)
Original data
157(6)
Testing generalization behavior
163(6)
Setting up the benchmark
169(6)
5.3 A closer look at the machine-learning pipeline
175(19)
Decision Trees classifier basics
175(3)
Evaluating which tree is better using node impurity
178(6)
Selection of the best split in Decision Trees
184(1)
Decision Trees on language data
185(9)
6 Linguistic feature engineering for author profiling
194(35)
6.1 Another close look at the machine-learning pipeline
196(4)
Evaluating the performance of your classifier
196(1)
Further evaluation measures
197(3)
6.2 Feature engineering for authorship attribution
200(26)
Word and sentence length statistics as features
201(6)
Counts of stopwords and proportion of stopwords as features
207(5)
Distributions of parts of speech as features
212(7)
Distribution of word suffixes as features
219(4)
Unique words as features
223(3)
6.3 Practical use of authorship attribution and user profiling
226(3)
7 Your first sentiment analyzer using sentiment lexicons
229(34)
7.1 Use cases
231(3)
7.2 Understanding your task
234(5)
Aggregating sentiment score with the help of a lexicon
235(2)
Learning to detect sentiment in a data-driven way
237(2)
7.3 Setting up the pipeline: Data loading and analysis
239(12)
Data loading and preprocessing
240(3)
A closer look into the data
243(8)
7.4 Aggregating sentiment scores with a sentiment lexicon
251(12)
Collecting sentiment scores from a lexicon
252(3)
Applying sentiment scores to detect review polarity
255(8)
8 Sentiment analysis with a data-driven approach
263(41)
8.1 Addressing multiple senses of a word with SentiWordNet
266(11)
8.2 Addressing dependence on context with machine learning
277(18)
Data preparation
278(6)
Extracting features from text
284(5)
Scikit-learn's machine-learning pipeline
289(3)
Full-scale evaluation with cross-validation
292(3)
8.3 Varying the length of the sentiment-bearing features
295(3)
8.4 Negation handling for sentiment analysis
298(3)
8.5 Further practice
301(3)
9 Topic analysis
304(42)
9.1 Topic classification as a supervised machine-learning task
307(18)
Data
308(4)
Topic classification with Naive Bayes
312(8)
Evaluation of the results
320(5)
9.2 Topic discovery as an unsupervised machine-learning task
325(24)
Unsupervised ML approaches
325(5)
Clustering for topic discovery
330(8)
Evaluation of the topic clustering algorithm
338(8)
10 Topic modeling
346(38)
10.1 Topic modeling with latent Dirichlet allocation
349(11)
Exercise 10.1 Question 1 solution
349(2)
Exercise 10.1 Question 2 solution
351(1)
Estimating parameters for the LDA
352(4)
LDA as a generative model
356(4)
10.2 Implementation of the topic modeling algorithm
360(28)
Loading the data
361(2)
Preprocessing the data
363(8)
Applying the LDA model
371(4)
Exploring the results
375(9)
11 Named-entity recognition
384(38)
11.1 Named entity recognition: Definitions and challenges
388(4)
Named entity types
388(2)
Challenges in named entity recognition
390(2)
11.2 Named-entity recognition as a sequence labeling task
392(11)
The basics: BIO scheme
393(2)
What does it mean for a task to be sequential?
395(2)
Sequential solution for NER
397(6)
11.3 Practical applications of NER
403(19)
Data loading and exploration
403(3)
Named entity types exploration with spaCy
406(10)
Information extraction revisited
410(6)
Named entities visualization
416(6)
Appendix Installation instructions 422(1)
Index 423