E-book: Practical Weak Supervision

  • Length: 192 pages
  • Publication date: 30-Sep-2021
  • Publisher: O'Reilly Media
  • Language: English
  • ISBN-13: 9781492077039
  • Format: PDF+DRM
  • Price: 47,96 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Most data scientists and engineers today rely on quality labeled data to train machine learning models. But building a training set manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Wee Hyong Tok, Amit Bahree, and Senja Filipi show you how to create products using weakly supervised learning models.

You'll learn how to build natural language processing and computer vision projects using weakly labeled datasets from Snorkel, a spin-off from the Stanford AI Lab. Because so many companies have pursued ML projects that never go beyond their labs, this book also provides a guide on how to ship the deep learning models you build.

  • Get up to speed on the field of weak supervision, including ways to use it as part of the data science process
  • Use Snorkel AI for weak supervision and data programming
  • Get code examples for using Snorkel to label text and image datasets
  • Use a weakly labeled dataset for text and image classification
  • Learn practical considerations for using Snorkel with large datasets and using Spark clusters to scale labeling

Table of contents

Foreword, Xuedong Huang
Foreword, Alex Ratner
Preface

1 Introduction to Weak Supervision
  • What Is Weak Supervision?
  • Real-World Weak Supervision with Snorkel
  • Approaches to Weak Supervision
      • Incomplete Supervision
      • Inexact Supervision
      • Inaccurate Supervision
  • Data Programming
  • Getting Training Data
      • How Data Programming Is Helping Accelerate Software 2.0
  • Summary

2 Diving into Data Programming with Snorkel
  • Snorkel, a Data Programming Framework
  • Getting Started with Labeling Functions
      • Applying the Labels to the Datasets
      • Analyzing the Labeling Performance
      • Using a Validation Set
  • Reaching Labeling Consensus with LabelModel
      • Intuition Behind LabelModel
      • LabelModel Parameter Estimation
  • Strategies to Improve the Labeling Functions
  • Data Augmentation with Snorkel Transformers
      • Data Augmentation Through Word Removal
      • Snorkel Preprocessors
      • Data Augmentation Through GPT-2 Prediction
      • Data Augmentation Through Translation
      • Applying the Transformation Functions to the Dataset
  • Summary

3 Labeling in Action
  • Labeling a Text Dataset: Identifying Fake News
      • Exploring the Fake News Detection (FakeNewsNet) Dataset
      • Importing Snorkel and Setting Up Representative Constants
      • Fact-Checking Sites
      • Is the Speaker a "Liar"?
      • Twitter Profile and Botometer Score
      • Generating Agreements Between Weak Classifiers
  • Labeling an Images Dataset: Determining Indoor Versus Outdoor Images
      • Creating a Dataset of Images from Bing
      • Defining and Training Weak Classifiers in TensorFlow
      • Training the Various Classifiers
      • Weak Classifiers out of Image Tags
      • Deploying the Computer Vision Service
      • Interacting with the Computer Vision Service
      • Preparing the DataFrame
      • Learning a LabelModel
  • Summary

4 Using the Snorkel-Labeled Dataset for Text Classification
  • Getting Started with Natural Language Processing (NLP)
      • Transformers
  • Hard Versus Probabilistic Labels
  • Using ktrain for Performing Text Classification
      • Data Preparation
      • Dealing with an Imbalanced Dataset
      • Training the Model
      • Using the Text Classification Model for Prediction
      • Finding a Good Learning Rate
  • Using Hugging Face and Transformers
      • Loading the Relevant Python Packages
      • Dataset Preparation
      • Checking Whether GPU Hardware Is Available
      • Performing Tokenization
      • Model Training
      • Testing the Fine-Tuned Model
  • Summary

5 Using the Snorkel-Labeled Dataset for Image Classification
  • Visual Object Recognition Overview
      • Representing Image Features
      • Transfer Learning for Computer Vision
  • Using PyTorch for Image Classification
      • Loading the Indoor/Outdoor Dataset
      • Utility Functions
      • Visualizing the Training Data
      • Fine-Tuning the Pretrained Model
  • Summary

6 Scalability and Distributed Training
  • The Need for Scalability
  • Distributed Training
  • Apache Spark: An Introduction
      • Spark Application Design
  • Using Azure Databricks to Scale
      • Cluster Setup for Weak Supervision
  • Fake News Detection Dataset on Databricks
      • Labeling Functions for Snorkel
      • Setting Up Dependencies
      • Loading the Data
      • Fact-Checking Sites
      • Transfer Learning Using the LIAR Dataset
      • Weak Classifiers: Generating Agreement
      • Type Conversions Needed for Spark Runtime
  • Summary

Index

Wee Hyong Tok is a product and AI leader with a background in product management, machine learning and deep learning, research, and complex technical engagements with customers. Over the years, the tech trends he described in his early thought-leadership whitepapers have become reality and are now deeply integrated into many products. Wee Hyong has worn many hats in his career: developer, program/product manager, data scientist, researcher, and strategist. This range of experience has given him unique superpowers to lead and define the strategy for high-performing data and AI innovation teams.