NLTK Essentials [Paperback]

  • Format: Paperback / softback, 194 pages, height x width: 235x191 mm
  • Publication date: 27-Jul-2015
  • Publisher: Packt Publishing Limited
  • ISBN-10: 1784396907
  • ISBN-13: 9781784396909
  • Paperback
  • Price: 36,99 €*
  • * We will send you an offer for a used copy, whose price may differ from the price shown on the website.
  • This book is out of print, but we will send you an offer for a used copy.
If you are an NLP or machine learning enthusiast with some or no experience in text processing, then this book is for you. This book is also ideal for expert Python programmers who want to learn NLTK quickly.
Preface v
Chapter 1 Introduction to Natural Language Processing 1(18)
Why learn NLP? 2(3)
Let's start playing with Python! 5(6)
Lists 5(1)
Helping yourself 6(2)
Regular expression 8(1)
Dictionaries 9(1)
Writing functions 10(1)
Diving into NLTK 11(6)
Your turn 17(1)
Summary 17(2)
Chapter 2 Text Wrangling and Cleansing 19(12)
What is text wrangling? 19(3)
Text cleansing 22(1)
Sentence splitter 22(1)
Tokenization 23(1)
Stemming 24(2)
Lemmatization 26(1)
Stop word removal 26(1)
Rare word removal 27(1)
Spell correction 28(1)
Your turn 28(1)
Summary 29(2)
Chapter 3 Part of Speech Tagging 31(14)
What is Part of speech tagging 31(9)
Stanford tagger 34(1)
Diving deep into a tagger 35(1)
Sequential tagger 36(1)
N-gram tagger 37(1)
Regex tagger 38(1)
Brill tagger 39(1)
Machine learning based tagger 39(1)
Named Entity Recognition (NER) 40(2)
NER tagger 40(2)
Your Turn 42(1)
Summary 43(2)
Chapter 4 Parsing Structure in Text 45(14)
Shallow versus deep parsing 46(1)
The two approaches in parsing 46(1)
Why we need parsing 46(2)
Different types of parsers 48(2)
A recursive descent parser 48(1)
A shift-reduce parser 48(1)
A chart parser 49(1)
A regex parser 49(1)
Dependency parsing 50(2)
Chunking 52(3)
Information extraction 55(3)
Named-entity recognition (NER) 56(1)
Relation extraction 57(1)
Summary 58(1)
Chapter 5 NLP Applications 59(14)
Building your first NLP application 60(3)
Other NLP applications 63(9)
Machine translation 63(2)
Statistical machine translation 65(1)
Information retrieval 65(1)
Boolean retrieval 66(1)
Vector space model 66(1)
The probabilistic model 67(1)
Speech recognition 68(1)
Text classification 68(2)
Information extraction 70(1)
Question answering systems 70(1)
Dialog systems 71(1)
Word sense disambiguation 71(1)
Topic modeling 71(1)
Language detection 72(1)
Optical character recognition 72(1)
Summary 72(1)
Chapter 6 Text Classification 73(20)
Machine learning 74(1)
Text classification 75(2)
Sampling 77(10)
Naive Bayes 80(3)
Decision trees 83(1)
Stochastic gradient descent 84(1)
Logistic regression 85(1)
Support vector machines 85(2)
The Random forest algorithm 87(1)
Text clustering 87(2)
K-means 88(1)
Topic modeling in text 89(2)
Installing gensim 89(2)
References 91(1)
Summary 92(1)
Chapter 7 Web Crawling 93(16)
Web crawlers 93(1)
Writing your first crawler 94(3)
Data flow in Scrapy 97(8)
The Scrapy shell 98(5)
Items 103(2)
The Sitemap spider 105(1)
The item pipeline 106(2)
External references 108(1)
Summary 108(1)
Chapter 8 Using NLTK with Other Python Libraries 109(28)
NumPy 110(8)
ndarray 110(1)
Indexing 111(1)
Basic operations 111(2)
Extracting data from an array 113(1)
Complex matrix operations 114(2)
Reshaping and stacking 116(2)
Random numbers 118(1)
SciPy 118(6)
Linear algebra 119(1)
Eigenvalues and eigenvectors 120(1)
The sparse matrix 121(1)
Optimization 122(2)
pandas 124(6)
Reading data 124(3)
Series data 127(1)
Column transformation 128(1)
Noisy data 128(2)
matplotlib 130(5)
Subplot 131(2)
Adding an axis 133(1)
A scatter plot 134(1)
A bar plot 134(1)
3D plots 134(1)
External references 135(1)
Summary 135(2)
Chapter 9 Social Media Mining in Python 137(18)
Data collection 138(4)
Twitter 138(4)
Data extraction 142(2)
Trending topics 143(1)
Geovisualization 144(9)
Influencers detection 145(1)
Facebook 146(5)
Influencer friends 151(2)
Summary 153(2)
Chapter 10 Text Mining at Scale 155(14)
Different ways of using Python on Hadoop 156(1)
Python streaming 156(1)
Hive/Pig UDF 156(1)
Streaming wrappers 157(1)
NLTK on Hadoop 157(4)
A UDF 157(3)
Python streaming 160(1)
Scikit-learn on Hadoop 161(4)
PySpark 165(2)
Summary 167(2)
Index 169
Nitin Hardeniya is a data scientist with more than 4 years of experience working with companies such as Fidelity, Groupon, and [24]7-inc. He has worked on a variety of business problems across different domains. He holds a master's degree in computational linguistics from IIIT-H. He is the author of 5 patents in the field of customer experience. He is passionate about language processing and large unstructured data. He has been using Python for almost 5 years in his day-to-day work. He believes that Python could be a single-point solution to most of the problems related to data science. He has put on his hacker's hat to write this book and has tried to give you an introduction to all the sophisticated tools related to NLP and machine learning in a very simplified form. In this book, he has also provided a workaround using some of the amazing capabilities of Python libraries, such as NLTK, scikit-learn, pandas, and NumPy.