This book introduces digital humanists at all levels of education to Python. It provides background and guidance on learning the Python programming language. Because it presumes no prior knowledge of computers or coding concepts, it allows the reader to progress gradually to the more complex tasks that are currently popular in the digital humanities. The book is aimed at undergraduates, graduate students, and faculty who are interested in learning how to use Python as a tool within their workflow. An Introduction to Python for Digital Humanists acts as a primer for students who wish to use Python, preparing them to engage with more advanced textbooks. It fills a real need as the first Python introduction aimed squarely at humanities students; other books currently available do not approach Python from a humanities perspective. It is designed so that those experienced in Python can teach from it and those who prefer to teach themselves can use it for self-study.
Key Features:
- Data analysis
- Data science
- Computational humanities
- Digital humanities
- Python
- Natural language processing
- Social network analysis
- App development
This book will introduce digital humanists at all levels of education to Python. It provides background and guidance on learning the Python computer programming language. It is designed so that those experienced in Python can teach from it, and those interested in being self-taught can use it.
Part I. The Basics of Python.
Chapter 1. Introduction to Python.
Chapter 2. Data and Data Structures.
Chapter 3. Loops and Logic.
Chapter 4. Formal Coding: Functions, Classes, and Libraries.
Chapter 5. Working with External Data.
Chapter 6. Working with Data on the Web.
Part II. Data Analysis with Pandas.
Chapter 7. Introduction to Pandas.
Chapter 8. Working with Data in Pandas.
Chapter 9. Searching for Data.
Chapter 10. Advanced Pandas.
Part III. Natural Language Processing with spaCy.
Chapter 11. Introduction to spaCy.
Chapter 12. Rules-Based spaCy.
Chapter 13. Solving a Domain-Specific Problem: A Case Study with Holocaust NER.
Chapter 14. Topic Modeling: Concepts and Theory.
Chapter 15. Text Analysis with BookNLP.
Chapter 16. Social Network Analysis.
Part IV. Designing an Application with Streamlit.
Chapter 17. Introduction to Streamlit.
Chapter 18. Advanced Streamlit Features.
Chapter 19. Building a Database Query Application.
Part V. Conclusion.
Chapter 21. Conclusion.
William Mattingly is a 2022 Harry Frank Guggenheim Distinguished Scholar and a 2022-2023 ACLS Grantee for his work as co-principal investigator and lead developer for the Bitter Aloe Project, which examines testimonies of violence from South Africa's Truth and Reconciliation Commission. He is currently the Postdoctoral Fellow for the Analysis of Historical Documents at the Smithsonian Institution's Data Science Lab. Mattingly currently works on two projects at the Smithsonian. The first is based at the United States Holocaust Memorial Museum (USHMM), where he is developing a robust pipeline of machine learning image classification and natural language processing (NLP) models to automate the cataloging of millions of images. The second is connected to the Smithsonian's American Women's History Initiative. Here, he is developing machine learning and heuristic pipelines with spaCy, a Python NLP library. These pipelines will identify women in Smithsonian documents and automatically extract knowledge about them so that we can better understand the influential role women played at the Smithsonian.