Foreword |
|
vii | |
|
Foreword |
|
ix | |
|
Preface |
|
xiii | |
|
1 Introduction to Weak Supervision |
|
|
1 | (16) |
|
What Is Weak Supervision? |
|
|
1 | (1) |
|
Real-World Weak Supervision with Snorkel |
|
|
2 | (4) |
|
Approaches to Weak Supervision |
|
|
6 | (5) |
|
|
6 | (3) |
|
|
9 | (1) |
|
|
10 | (1) |
|
|
11 | (2) |
|
|
13 | (3) |
|
How Data Programming Is Helping Accelerate Software 2.0 |
|
|
14 | (2) |
|
|
16 | (1) |
|
2 Diving into Data Programming with Snorkel |
|
|
17 | (32) |
|
Snorkel, a Data Programming Framework |
|
|
18 | (1) |
|
Getting Started with Labeling Functions |
|
|
19 | (10) |
|
Applying the Labels to the Datasets |
|
|
21 | (1) |
|
Analyzing the Labeling Performance |
|
|
22 | (5) |
|
|
27 | (2) |
|
Reaching Labeling Consensus with LabelModel |
|
|
29 | (3) |
|
Intuition Behind LabelModel |
|
|
30 | (1) |
|
LabelModel Parameter Estimation |
|
|
30 | (2) |
|
Strategies to Improve the Labeling Functions |
|
|
32 | (1) |
|
Data Augmentation with Snorkel Transformers |
|
|
33 | (14) |
|
Data Augmentation Through Word Removal |
|
|
36 | (2) |
|
|
38 | (1) |
|
Data Augmentation Through GPT-2 Prediction |
|
|
39 | (3) |
|
Data Augmentation Through Translation |
|
|
42 | (3) |
|
Applying the Transformation Functions to the Dataset |
|
|
45 | (2) |
|
|
47 | (2) |
|
|
49 | (38) |
|
Labeling a Text Dataset: Identifying Fake News |
|
|
50 | (17) |
|
Exploring the Fake News Detection (FakeNewsNet) Dataset |
|
|
51 | (1) |
|
Importing Snorkel and Setting Up Representative Constants |
|
|
52 | (1) |
|
|
52 | (9) |
|
|
61 | (2) |
|
Twitter Profile and Botometer Score |
|
|
63 | (1) |
|
Generating Agreements Between Weak Classifiers |
|
|
64 | (3) |
|
Labeling an Images Dataset: Determining Indoor Versus Outdoor Images |
|
|
67 | (18) |
|
Creating a Dataset of Images from Bing |
|
|
71 | (1) |
|
Defining and Training Weak Classifiers in TensorFlow |
|
|
71 | (3) |
|
Training the Various Classifiers |
|
|
74 | (2) |
|
Weak Classifiers out of Image Tags |
|
|
76 | (1) |
|
Deploying the Computer Vision Service |
|
|
77 | (1) |
|
Interacting with the Computer Vision Service |
|
|
78 | (2) |
|
|
80 | (1) |
|
|
81 | (4) |
|
|
85 | (2) |
|
4 Using the Snorkel-Labeled Dataset for Text Classification |
|
|
87 | (24) |
|
Getting Started with Natural Language Processing (NLP) |
|
|
88 | (3) |
|
|
89 | (2) |
|
Hard Versus Probabilistic Labels |
|
|
91 | (1) |
|
Using ktrain for Performing Text Classification |
|
|
91 | (9) |
|
|
92 | (1) |
|
Dealing with an Imbalanced Dataset |
|
|
93 | (2) |
|
|
95 | (2) |
|
Using the Text Classification Model for Prediction |
|
|
97 | (2) |
|
Finding a Good Learning Rate |
|
|
99 | (1) |
|
Using Hugging Face and Transformers |
|
|
100 | (9) |
|
Loading the Relevant Python Packages |
|
|
101 | (1) |
|
|
101 | (1) |
|
Checking Whether GPU Hardware Is Available |
|
|
102 | (1) |
|
|
102 | (2) |
|
|
104 | (4) |
|
Testing the Fine-Tuned Model |
|
|
108 | (1) |
|
|
109 | (2) |
|
5 Using the Snorkel-Labeled Dataset for Image Classification |
|
|
111 | (20) |
|
Visual Object Recognition Overview |
|
|
111 | (3) |
|
Representing Image Features |
|
|
112 | (1) |
|
Transfer Learning for Computer Vision |
|
|
113 | (1) |
|
Using PyTorch for Image Classification |
|
|
114 | (16) |
|
Loading the Indoor/Outdoor Dataset |
|
|
115 | (3) |
|
|
118 | (1) |
|
Visualizing the Training Data |
|
|
119 | (1) |
|
Fine-Tuning the Pretrained Model |
|
|
120 | (10) |
|
|
130 | (1) |
|
6 Scalability and Distributed Training |
|
|
131 | (30) |
|
|
132 | (1) |
|
|
133 | (2) |
|
Apache Spark: An Introduction |
|
|
135 | (3) |
|
|
137 | (1) |
|
Using Azure Databricks to Scale |
|
|
138 | (5) |
|
Cluster Setup for Weak Supervision |
|
|
141 | (2) |
|
Fake News Detection Dataset on Databricks |
|
|
143 | (16) |
|
Labeling Functions for Snorkel |
|
|
145 | (2) |
|
|
147 | (2) |
|
|
149 | (2) |
|
|
151 | (2) |
|
Transfer Learning Using the LIAR Dataset |
|
|
153 | (1) |
|
Weak Classifiers: Generating Agreement |
|
|
154 | (2) |
|
Type Conversions Needed for Spark Runtime |
|
|
156 | (3) |
|
|
159 | (2) |
Index |
|
161 | |