E-book: Fundamentals of Predictive Text Mining

3.17/5 (12 ratings from Goodreads)

Nitin Indurkhya, Tong Zhang, Sholom M. Weiss

Format: PDF+DRM
Series: Texts in Computer Science
Publication date: 07-Sep-2015
Publisher: Springer London Ltd
Language: English
ISBN-13: 9781447167501

Price: 55,56 €*
* The price is final; no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

This successful textbook on predictive text mining offers a unified perspective on a rapidly evolving field, integrating topics spanning the varied disciplines of data science, machine learning, databases, and computational linguistics. Serving also as a practical guide, this unique book provides helpful advice illustrated by examples and case studies. This highly anticipated second edition has been thoroughly revised and expanded with new material on deep learning, graph models, mining social media, errors and pitfalls in big data evaluation, Twitter sentiment analysis, and dependency parsing. The fully updated content also features in-depth discussions on issues of document classification, information retrieval, clustering and organizing documents, information extraction, web-based data-sourcing, and prediction and evaluation. Features: includes chapter summaries and exercises; explores the application of each method; provides several case studies; contains links to free text-mining software.

Reviews

Fundamentals of predictive text mining is a second edition that is designed as a textbook, with questions and exercises in each chapter. The book can be used with data mining software for hands-on experience for students. The book will be very useful for people planning to go into this field or to learn techniques that could be used in a big data environment. (S. Srinivasan, Computing Reviews, February, 2016)

1 Overview of Text Mining
1.1 What's Special About Text Mining?
1.1.1 Structured or Unstructured Data?
1.1.2 Is Text Different from Numbers?
1.2 What Types of Problems Can Be Solved?
1.3 Document Classification
1.4 Information Retrieval
1.5 Clustering and Organizing Documents
1.6 Information Extraction
1.7 Prediction and Evaluation
1.8 The Next Chapters
1.9 Summary
1.10 Historical and Bibliographical Remarks
1.11 Questions and Exercises

2 From Textual Information to Numerical Vectors
2.1 Collecting Documents
2.2 Document Standardization
2.3 Tokenization
2.4 Lemmatization
2.4.1 Inflectional Stemming
2.4.2 Stemming to a Root
2.5 Vector Generation for Prediction
2.5.1 Multiword Features
2.5.2 Labels for the Right Answers
2.5.3 Feature Selection by Attribute Ranking
2.6 Sentence Boundary Determination
2.7 Part-of-Speech Tagging
2.8 Word Sense Disambiguation
2.9 Phrase Recognition
2.10 Named Entity Recognition
2.11 Parsing
2.12 Feature Generation
2.13 Summary
2.14 Historical and Bibliographical Remarks
2.15 Questions and Exercises

3 Using Text for Prediction
3.1 Recognizing that Documents Fit a Pattern
3.2 How Many Documents Are Enough?
3.3 Document Classification
3.4 Learning to Predict from Text
3.4.1 Similarity and Nearest-Neighbor Methods
3.4.2 Document Similarity
3.4.3 Decision Rules
3.4.4 Decision Trees
3.4.5 Scoring by Probabilities
3.4.6 Linear Scoring Methods
3.5 Evaluation of Performance
3.5.1 Estimating Current and Future Performance
3.5.2 Getting the Most from a Learning Method
3.5.3 Errors and Pitfalls in Big Data Evaluation
3.6 Applications
3.7 Graph Models for Social Networks
3.8 Summary
3.9 Historical and Bibliographical Remarks
3.10 Questions and Exercises

4 Information Retrieval and Text Mining
4.1 Is Information Retrieval a Form of Text Mining?
4.2 Key Word Search
4.3 Nearest-Neighbor Methods
4.4 Measuring Similarity
4.4.1 Shared Word Count
4.4.2 Word Count and Bonus
4.4.3 Cosine Similarity
4.5 Web-Based Document Search
4.5.1 Link Analysis
4.6 Document Matching
4.7 Inverted Lists
4.8 Evaluation of Performance
4.9 Summary
4.10 Historical and Bibliographical Remarks
4.11 Questions and Exercises

5 Finding Structure in a Document Collection
5.1 Clustering Documents by Similarity
5.2 Similarity of Composite Documents
5.2.1 k-Means Clustering
5.2.2 Hierarchical Clustering
5.2.3 The EM Algorithm
5.3 What Do a Cluster's Labels Mean?
5.4 Applications
5.5 Evaluation of Performance
5.6 Summary
5.7 Historical and Bibliographical Remarks
5.8 Questions and Exercises

6 Looking for Information in Documents
6.1 Goals of Information Extraction
6.2 Finding Patterns and Entities from Text
6.2.1 Entity Extraction as Sequential Tagging
6.2.2 Tag Prediction as Classification
6.2.3 The Maximum Entropy Method
6.2.4 Linguistic Features and Encoding
6.2.5 Local Sequence Prediction Models
6.2.6 Global Sequence Prediction Models
6.3 Coreference and Relationship Extraction
6.3.1 Coreference Resolution
6.3.2 Relationship Extraction
6.4 Template Filling and Database Construction
6.5 Applications
6.5.1 Information Retrieval
6.5.2 Commercial Extraction Systems
6.5.3 Criminal Justice
6.5.4 Intelligence
6.6 Summary
6.7 Historical and Bibliographical Remarks
6.8 Questions and Exercises

7 Data Sources for Prediction: Databases, Hybrid Data and the Web
7.1 Ideal Models of Data
7.1.1 Ideal Data for Prediction
7.1.2 Ideal Data for Text and Unstructured Data
7.1.3 Hybrid and Mixed Data
7.2 Practical Data Sourcing
7.3 Prototypical Examples
7.3.1 Web-Based Spreadsheet Data
7.3.2 Web-Based XML Data
7.3.3 Opinion Data and Sentiment Analysis
7.4 Hybrid Example: Independent Sources of Numerical and Text Data
7.5 Mixed Data in Standard Table Format
7.6 Summary
7.7 Historical and Bibliographical Remarks
7.8 Questions and Exercises

8 Case Studies
8.1 Market Intelligence from the Web
8.1.1 The Problem
8.1.2 Solution Overview
8.1.3 Methods and Procedures
8.1.4 System Deployment
8.2 Lightweight Document Matching for Digital Libraries
8.2.1 The Problem
8.2.2 Solution Overview
8.2.3 Methods and Procedures
8.2.4 System Deployment
8.3 Generating Model Cases for Help Desk Applications
8.3.1 The Problem
8.3.2 Solution Overview
8.3.3 Methods and Procedures
8.3.4 System Deployment
8.4 Assigning Topics to News Articles
8.4.1 The Problem
8.4.2 Solution Overview
8.4.3 Methods and Procedures
8.4.4 System Deployment
8.5 E-mail Filtering
8.5.1 The Problem
8.5.2 Solution Overview
8.5.3 Methods and Procedures
8.5.4 System Deployment
8.6 Search Engines
8.6.1 The Problem
8.6.2 Solution Overview
8.6.3 Methods and Procedures
8.6.4 System Deployment
8.7 Extracting Named Entities from Documents
8.7.1 The Problem
8.7.2 Solution Overview
8.7.3 Methods and Procedures
8.7.4 System Deployment
8.8 Mining Social Media
8.8.1 The Problem
8.8.2 Solution Overview
8.8.3 Methods and Procedures
8.8.4 System Deployment
8.9 Customized Newspapers
8.9.1 The Problem
8.9.2 Solution Overview
8.9.3 Methods and Procedures
8.9.4 System Deployment
8.10 Summary
8.11 Historical and Bibliographical Remarks
8.12 Questions and Exercises

9 Emerging Directions
9.1 Summarization
9.2 Active Learning
9.3 Learning with Unlabeled Data
9.4 Different Ways of Collecting Samples
9.4.1 Ensembles and Voting Methods
9.4.2 Online Learning
9.4.3 Deep Learning
9.4.4 Cost-Sensitive Learning
9.4.5 Unbalanced Samples and Rare Events
9.5 Distributed Text Mining
9.6 Learning to Rank
9.7 Question Answering
9.8 Summary
9.9 Historical and Bibliographical Remarks
9.10 Questions and Exercises

References
Author Index
Subject Index

Dr. Sholom M. Weiss is a Professor Emeritus of Computer Science at Rutgers University, a Fellow of the Association for the Advancement of Artificial Intelligence, and co-founder of AI Data-Miner LLC, New York.

Dr. Nitin Indurkhya is a faculty member at the School of Computer Science and Engineering, University of New South Wales, Australia, and the Institute of Statistical Education, Arlington, VA, USA. He is also a co-founder of AI Data-Miner LLC, New York.

Dr. Tong Zhang is a Professor of Statistics and Biostatistics at Rutgers University.

Permalink: https://www.kriso.ee/db/97814471675012e.html
