Klienditugi: 7440010 (E-R 10-18)

Abi | Registreeri | Logi sisse

E-raamat: Getting Started with Natural Language Processing

3.89/5 (18 hinnangut Goodreads-ist)

Ekaterina Kochmar

Formaat: 456 pages
Ilmumisaeg: 15-Nov-2022
Kirjastus: Manning Publications
Keel: eng
ISBN-13: 9781638350927

Teised raamatud teemal:

Database programming

Formaat - EPUB+DRM
Hind: 43,42 €*
* hind on lõplik, st. muud allahindlused enam ei rakendu
Lisa ostukorvi
Lisa soovinimekirja
See e-raamat on mõeldud ainult isiklikuks kasutamiseks. E-raamatuid ei saa tagastada.

Formaat: 456 pages
Ilmumisaeg: 15-Nov-2022
Kirjastus: Manning Publications
Keel: eng
ISBN-13: 9781638350927

Teised raamatud teemal:

Database programming

DRM piirangud

Kopeerimine (copy/paste):

ei ole lubatud
Printimine:

ei ole lubatud
Kasutamine:

Digitaalõiguste kaitse (DRM)
Kirjastus on väljastanud selle e-raamatu krüpteeritud kujul, mis tähendab, et selle lugemiseks peate installeerima spetsiaalse tarkvara. Samuti peate looma endale Adobe ID Rohkem infot siin. E-raamatut saab lugeda 1 kasutaja ning alla laadida kuni 6'de seadmesse (kõik autoriseeritud sama Adobe ID-ga).

Vajalik tarkvara
Mobiilsetes seadmetes (telefon või tahvelarvuti) lugemiseks peate installeerima selle tasuta rakenduse: PocketBook Reader (iOS / Android)

PC või Mac seadmes lugemiseks peate installima Adobe Digital Editionsi (Seeon tasuta rakendus spetsiaalselt e-raamatute lugemiseks. Seda ei tohi segamini ajada Adober Reader'iga, mis tõenäoliselt on juba teie arvutisse installeeritud )

Seda e-raamatut ei saa lugeda Amazon Kindle's.

Getting Started with Natural Language Processing is a hands-on guide filled with everything you need to get started with NLP in a friendly, understandable tutorial. Full of Python code and hands-on projects, each chapter provides a concrete example with practical techniques that you can put into practice right away.

By following the numerous Python-based examples and real-world case studies, youll apply NLP to search applications, extracting meaning from text, sentiment analysis, user profiling, and more. When youre done, youll have a solid grounding in NLP that will serve as a foundation for further learning.

Key Features

· Extracting information from raw text

· Named entity recognition

· Automating summarization of key facts

· Topic labeling

For beginners to NLP with basic Python skills.

About the technology

Natural Language Processing is a set of data science techniques that enable machines to make sense of human text and speech. Advances in machine learning and deep learning have made NLP more efficient and reliable than ever, leading to a huge number of new tools and resources. From improving search applications to sentiment analysis, the possible applications of NLP are vast and growing.

Ekaterina Kochmar is an Affiliated Lecturer and a Senior Research Associate at the Natural Language and Information Processing group of the Department of Computer Science and Technology, University of Cambridge. She holds an MA degree in Computational Linguistics, an MPhil in Advanced Computer Science, and a PhD in Natural Language Processing.

Preface

xiii

Acknowledgments

About This book

xvii

About the author

xxii

About the cover illustration

xxiii

1 Introduction

(30)

1.1 A brief history of NLP

(3)

1.2 Typical tasks

(26)

Information search

(11)

Advanced information search: Asking the machine precise questions

(2)

Conversational agents and intelligent virtual assistants

(2)

Text prediction and language generation

(5)

Spam filtering

(1)

Machine translation

(2)

Spell- and grammar checking

(3)

2 Your first NLP example

(40)

2.1 Introducing NLP in practice: Spam filtering

(5)

2.2 Understanding the task

(10)

Step 1 Define the data and classes

(1)

Step 2 Split the text into words

(5)

Step 3 Extract and normalize the features

(1)

Step 4 Train a classifier

(2)

Step 5 Evaluate the classifier

(1)

2.3 Implementing your own spam filter

(19)

Step 1 Define the data and classes

(3)

Step 2 Split the text into words

(1)

Step 3 Extract and normalize the features

(3)

Step 4 Train the classifier

(9)

Step 5 Evaluate your classifier

(3)

2.4 Deploying your spam filter in practice

(6)

3 Introduction to information search

(43)

3.1 Understanding the task

(15)

Data and data structures

(8)

Boolean search algorithm

(4)

3.2 Processing the data further

(9)

Preselecting the words that matter: Stopwords removal

(3)

Matching forms of the same word: Morphological processing

(6)

3.3 Information weighing

(7)

Weighing words with term frequency

(3)

Weighing words with inverse document frequency

100

(3)

3.4 Practical use of the search algorithm

103

(11)

Retrieval of the most similar documents

104

(2)

Evaluation of the results

106

(5)

Deploying search algorithm in practice

111

(3)

4 Information extraction

114

(37)

4.1 Use cases

116

(4)

Case 1

116

(1)

Case 2

117

(2)

Case 3

119

(1)

4.2 Understanding the task

120

(4)

4.3 Detecting word types with part-of-speech tagging

124

(13)

Understanding word types

124

(4)

Part-of-speech tagging with spaCy

128

(9)

4.4 Understanding sentence structure with syntactic parsing

137

(7)

Why sentence structure is important

137

(2)

Dependency parsing with spaCy

139

(5)

4.5 Building your own information extraction algorithm

144

(7)

5 Author profiling as a machine-learning task

151

(43)

5.1 Understanding the task

153

(4)

Case 1 Authorship attribution

154

(1)

Case 2 User profiling

155

(2)

5.2 Machine-learning pipeline at first glance

157

(18)

Original data

157

(6)

Testing generalization behavior

163

(6)

Setting up the benchmark

169

(6)

5.3 A closer look at the machine-learning pipeline

175

(19)

Decision Trees classifier basics

175

(3)

Evaluating which tree is better using node impurity

178

(6)

Selection of the best split in Decision Trees

184

(1)

Decision Trees on language data

185

(9)

6 Linguistic feature engineering for author profiling

194

(35)

6.1 Another close look at the machine-learning pipeline

196

(4)

Evaluating the performance of your classifier

196

(1)

Further evaluation measures

197

(3)

6.2 Feature engineering for authorship attribution

200

(26)

Word and sentence length statistics as features

201

(6)

Counts of stopwords and proportion of stopwords as features

207

(5)

Distributions of parts of speech as features

212

(7)

Distribution of word suffixes as features

219

(4)

Unique words as features

223

(3)

6.3 Practical use of authorship attribution and user profiling

226

(3)

7 Your first sentiment analyzer using sentiment lexicons

229

(34)

7.1 Use cases

231

(3)

7.2 Understanding your task

234

(5)

Aggregating sentiment score with the help of a lexicon

235

(2)

Learning to detect sentiment in a data-driven way

237

(2)

7.3 Setting up the pipeline: Data loading and analysis

239

(12)

Data loading and preprocessing

240

(3)

A closer look into the data

243

(8)

7.4 Aggregating sentiment scores with a sentiment lexicon

251

(12)

Collecting sentiment scores from a lexicon

252

(3)

Applying sentiment scores to detect review polarity

255

(8)

8 Sentiment analysis with a data-driven approach

263

(41)

8.1 Addressing multiple senses of a word with SentiWordNet

266

(11)

8.2 Addressing dependence on context with machine learning

277

(18)

Data preparation

278

(6)

Extracting features from text

284

(5)

Scikit-learn's machine-learning pipeline

289

(3)

Full-scale evaluation with cross-validation

292

(3)

8.3 Varying the length of the sentiment-bearing features

295

(3)

8.4 Negation handling for sentiment analysis

298

(3)

8.5 Further practice

301

(3)

9 Topic analysis

304

(118)

9.1 Topic classification as a supervised machine-learning task

307

(18)

Data

308

(4)

Topic classification with Naive Bayes

312

(8)

Evaluation of the results

320

(5)

9.2 Topic discovery as an unsupervised machine-learning task

325

(24)

Unsupervised ML approaches

325

(5)

Clustering for topic discovery

330

(8)

Evaluation of the topic clustering algorithm

338

(8)

Topic modeling

346

(3)

10.1 Topic modeling with latent Dirichlet allocation

349

(11)

Exercise 10.1 Question 1 solution

349

(2)

Exercise 10.1 Question 2 solution

351

(1)

Estimating parameters for the LDA

352

(4)

IDA as a generative model

356

(4)

10.2 Implementation of the topic modeling algorithm

360

(28)

Loading the data

361

(2)

Preprocessing the data

363

(8)

Applying the LDA model

371

(4)

Exploring the results

375

(9)

Named-entity recognition

384

(4)

11.1 Named entity recognition: Definitions and challenges

388

(4)

Named entity types

388

(2)

Challenges in named entity recognition

390

(2)

11.2 Named-entity recognition as a sequence labeling task

392

(11)

The basics: BIO scheme

393

(2)

What does it mean for a task to be sequential?

395

(2)

Sequential solution for NER

397

(6)

11.3 Practical applications of NER

403

(19)

Data loading and exploration

403

(3)

Named entity types exploration with spaCy

406

(10)

Information extraction revisited

410

(6)

Named entities visualization

416

(6)

Appendix Installation instructions

422

(1)

Index

423

Ekaterina Kochmar is an Affiliated Lecturer and a Senior Research Associate at the Natural Language and Information Processing group of the Department of Computer Science and Technology, University of Cambridge. She holds an MA degree in Computational Linguistics, an MPhil in Advanced Computer Science, and a PhD in Natural Language Processing.

Lisainfo e-raamatute kohta

Püsilink: https://www.kriso.ee/db/97816383509276e.html

Märksõnad:

E-raamat: Getting Started with Natural Language Processing

DRM piirangud

Kopeerimine (copy/paste):

Printimine:

Kasutamine:

Konto & seaded

Otsing

Otsingu andmebaas

Filtreeri tulemusi

Teemad E-raamatute teemad

Vali ostukorv