E-book: Text Mining for Information Professionals: An Uncharted Territory

Manika Lamba, Margam Madhusudhan

Format: PDF+DRM
Publication date: 21-Apr-2022
Publisher: Springer Nature Switzerland AG
Language: eng
ISBN-13: 9783030850852

Price: 86,44 €*
* The price is final, i.e. no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

This book focuses on a basic theoretical framework dealing with the problems, solutions, and applications of text mining and its various facets in a very practical form of case studies, use cases, and stories.

The book contains 11 chapters with 14 case studies showing 8 different text mining and visualization approaches, and 17 stories. A website and a GitHub account are also maintained for the book. They contain the code, data, and notebooks for the case studies; a summary of all the stories shared by the librarians/faculty; and hyperlinks to open an interactive virtual RStudio/Jupyter Notebook environment. The interactive virtual environment runs case studies based on the R programming language for hands-on practice in the cloud without installing any software.

From understanding different types and forms of data to case studies showing the application of each text mining approach to data retrieved from various resources, this book is a must-read for all library professionals interested in text mining and its application in libraries. Additionally, this book will be helpful to archivists, digital curators, or any other humanities and social science professionals who want to understand the basic theory behind text data, text mining, and various tools and techniques available to solve and visualize their research problems.

1 The Computational Library
1.1 Computational Thinking
1.2 Genealogy of Text Mining in Libraries
1.3 What Is Text Mining?
1.3.1 Text Characteristics
1.3.2 Different Text Mining Tasks
1.3.3 Supervised vs. Unsupervised Learning Methods
1.3.4 Cost, Benefits, and Barriers
1.3.5 Limitations
1.4 Case Study: Clustering of Documents Using Two Different Tools
References

2 Text Data and Where to Find Them?
2.1 Data
2.1.1 Digital Trace Data
2.2 Different Types of Data
2.3 Data File Types
2.3.1 Plain Text
2.3.2 CSV
2.3.3 JSON
2.3.4 XML
2.3.5 Binary Files
2.4 Metadata
2.4.1 What Is a Metadata Standard?
2.4.2 Steps to Create Quality Metadata
2.5 Digital Data Creation
2.6 Different Ways of Getting Data
2.6.1 Downloading Digital Data
2.6.2 Downloading Data from Online Repositories
2.6.3 Downloading Data from Relational Databases
2.6.4 Web APIs
2.6.5 Web Scraping/Screen Scraping
References

3 Text Pre-Processing
3.1 Introduction
3.1.1 Level of Text Representation
3.2 Text Transformation
3.2.1 Corpus Creation
3.2.2 Dictionary Creation
3.3 Text Pre-Processing
3.3.1 Case Normalization
3.3.2 Morphological Normalization
3.3.3 Tokenization
3.3.4 Stemming
3.3.5 Lemmatization
3.3.6 Stopwords
3.3.7 Object Standardization
3.4 Feature Engineering
3.4.1 Semantic Parsing
3.4.2 Bag of Words (BOW)
3.4.3 N-Grams
3.4.4 Creation of Matrix
3.4.5 Term Frequency-Inverse Document Frequency (TF-IDF)
3.4.6 Syntactical Parsing
3.4.7 Parts-of-Speech Tagging (POS)
3.4.8 Named Entity Recognition (NER)
3.4.9 Similarity Computation Using Distances
3.4.10 Word Embedding
3.5 Case Study: An Analysis of Tolkien's Books
References

4 Topic Modeling
4.1 What Is Topic Modeling?
4.1.1 Topic Evolution
4.1.2 Application and Visualization
4.1.3 Available Tools and Packages
4.1.4 When to Use Topic Modeling
4.1.5 When Not to Use Topic Modeling
4.2 Methods and Algorithms
4.3 Topic Modeling and Libraries
4.3.1 Use Cases
4.4 Case Study: Topic Modeling of Documents Using Three Different Tools
References

5 Network Text Analysis
5.1 What Is Network Text Analysis?
5.1.1 Two-Mode Networks
5.1.2 Centrality Measures
5.1.3 Graph Algorithms
5.1.4 Comparison of Network Text Analysis with Others
5.1.5 How to Perform Network Text Analysis?
5.1.6 Available Tools and Packages
5.1.7 Applications
5.1.8 Advantages
5.1.9 Limitations
5.2 Topic Maps
5.2.1 Constructs of Topic Maps
5.2.2 Topic Map Software Architecture
5.2.3 Typical Uses
5.2.4 Advantages of Topic Maps
5.2.5 Disadvantages of Topic Maps
5.3 Network Text Analysis and Libraries
5.3.1 Use Cases
5.4 Case Study: Network Text Analysis of Documents Using Two Different R Packages
References

6 Burst Detection
6.1 What Is Burst Detection?
6.1.1 How to Detect a Burst?
6.1.2 Comparison of Burst Detection with Others
6.1.3 How to Perform Burst Detection?
6.1.4 Available Tools and Packages
6.1.5 Applications
6.1.6 Advantages
6.1.7 Limitations
6.2 Burst Detection and Libraries
6.2.1 Use Cases
6.2.2 Marketing
6.2.3 Reference Desk Service
6.3 Case Study: Burst Detection of Documents Using Two Different Tools
References

7 Sentiment Analysis
7.1 What Is Sentiment Analysis?
7.1.1 Levels of Granularity
7.1.2 Approaches for Sentiment Analysis
7.1.3 How to Perform Sentiment Analysis?
7.1.4 Available Tools and Packages
7.1.5 Applications
7.1.6 Advantages
7.1.7 Limitations
7.2 Sentiment Analysis and Libraries
7.2.1 Use Cases
7.3 Case Study: Sentiment Analysis of Documents Using Two Different Tools
References

8 Predictive Modeling
8.1 What Is Predictive Modeling?
8.1.1 Why Use Machine Learning?
8.1.2 Machine Learning Methods
8.1.3 Feature Selection and Representation
8.1.4 Machine Learning Algorithms
8.1.5 Classification Task
8.1.6 How to Perform Predictive Modeling on Text Documents?
8.1.7 Available Tools and Packages
8.1.8 Advantages
8.1.9 Limitations
8.2 Machine Learning and Libraries
8.2.1 Challenges
8.2.2 Use Cases
8.3 Case Study: Predictive Modeling of Documents Using RapidMiner
References

9 Information Visualization
9.1 What Is Information Visualization?
9.1.1 Information Visualization Framework
9.1.2 Data Scale Types
9.1.3 Graphic Variable Types
9.1.4 Types of Datasets
9.1.5 Attribute Semantics
9.1.6 What Is an Appropriate Visual Representation for a Given Dataset?
9.1.7 Graphical Decoding
9.1.8 How Does One Know How Good a Visual Encoding Is?
9.1.9 Main Purpose of Visualization
9.1.10 Modes of Visualization
9.1.11 Methods of Graphic Visualization
9.2 Fundamental Graphs
9.3 Networks and Trees
9.4 Advanced Graphs
9.5 Rules on Visual Design
9.6 Text Visualization
9.7 Document Visualization
9.8 Information Visualization and Libraries
9.8.1 Use Cases
9.8.2 Information Visualization Skills for Librarians
9.8.3 Conclusion
9.9 Case Study: To Build a Dashboard Using R
References

10 Tools and Techniques for Text Mining and Visualization
10.1 Introduction
10.2 Text Mining Tools
10.2.1 R
10.2.2 Topic-Modeling-Tool
10.2.3 RapidMiner
10.2.4 Waikato Environment for Knowledge Analysis (WEKA)
10.2.5 Orange
10.2.6 Voyant Tools
10.2.7 Science of Science (Sci2) Tool
10.2.8 LancsBox
10.2.9 ConText
10.2.10 Overview Docs
10.3 Visualization Tools
10.3.1 Gephi
10.3.2 Tableau Public
10.3.3 Infogram
10.3.4 Microsoft Power BI
10.3.5 Datawrapper
10.3.6 RAWGraphs
10.3.7 WORDij
10.3.8 Palladio
10.3.9 Chart Studio
References

11 Text Data and Mining Ethics
11.1 Text Data Management
11.1.1 Plan
11.1.2 Lifecycle
11.1.3 Citation
11.1.4 Sharing
11.1.5 Need of Data Management for Text Mining
11.1.6 Benefits of Data Management for Text Mining
11.1.7 Ethical and Legal Rules Related to Text Data
11.2 Social Media Ethics
11.2.1 Framework for Ethical Research with Social Media Data
11.3 Ethical and Legal Issues Related to Text Mining
11.3.1 Copyright
11.3.2 License Conditions
11.3.3 Algorithmic Confounding/Biasness
References

Index

Manika Lamba is a Ph.D. candidate at the Department of Library and Information Science, University of Delhi, India. She is currently serving as the Editor-in-Chief of the International Journal of Library and Information Services (IJLIS), the Elected Standing Committee Member for IFLA Science and Technology Libraries Section, and Newsletter Officer for ASIS&T South Asia Chapter. She was Editor-at-large for dh+lib (an ACRL Digital Humanities Interest Group project) and was featured in the Information Professionals Share their Top Tips for 2019 blog by the Copyright Clearance Center (CCC). She is an active reviewer for more than 17 international journals, including IEEE Access, Scientometrics, Library Hi-Tech, and the Journal of Information Science. Her scholarship focuses on the intersections of computational social science, social informatics, information retrieval, services, and management.

Margam Madhusudhan is currently working as a Professor in the Department of Library and Information Science, University of Delhi, India. He has worked as Deputy Dean Academics and Member of Academic Council at the University of Delhi. He is a member of many academic bodies and of the editorial boards of national and international LIS journals. He is the recipient of the "Award for Excellence" (Highly Commended) in 2019, Excellence in Research in 2017, and the P.V. Verghese Award in 2013. He has 22 years of teaching, administration, and research experience at the university level.

Permalink: https://www.kriso.ee/db/97830308508522e.html
