This book provides an overview of crowdsourced data management, covering the complete workflow, the underlying algorithms, and open research problems, with a particular focus on the latest techniques and recent advances. The authors identify three key factors that determine the performance of crowdsourced data management: quality control, cost control, and latency control. By surveying and synthesizing a wide spectrum of studies on crowdsourced data management, the book outlines the important factors that must be considered to improve crowdsourced data management. It also introduces the design of a practical crowdsourced database system and presents a number of crowdsourced operators. Self-contained and covering theory, algorithms, techniques, and applications, it is a valuable reference for researchers and students who are new to crowdsourced data management and have a basic knowledge of data structures and databases.
Contents

1 Introduction | 1 (1)
… | 1 (1)
1.2 Crowdsourcing Overview | 2 (2)
1.3 Crowdsourced Data Management | 4 (4)
… | 8 (3)
2 Crowdsourcing Background | 11 (1)
2.1 Crowdsourcing Overview | 11 (1)
2.2 Crowdsourcing Workflow | 12 (4)
2.2.1 Workflow from Requester Side | 12 (3)
2.2.2 Workflow from Worker Side | 15 (1)
2.2.3 Workflow from Platform Side | 16 (1)
2.3 Crowdsourcing Platforms | 16 (2)
2.3.1 Amazon Mechanical Turk (AMT) | 16 (1)
… | 17 (1)
… | 17 (1)
2.4 Existing Surveys, Tutorials, and Books | 18 (1)
2.5 Optimization Goal of Crowdsourced Data Management | 18 (3)
References | 19 (2)
3 Quality Control | 21 (24)
3.1 Overview of Quality Control | 21 (2)
3.2 Truth Inference | 23 (13)
3.2.1 Truth Inference Problem | 23 (2)
3.2.2 Unified Solution Framework | 25 (3)
3.2.3 Comparisons of Existing Works | 28 (7)
3.2.4 Extensions of Truth Inference | 35 (1)
3.3 Task Assignment | 36 (6)
3.3.1 Task Assignment Setting | 36 (4)
3.3.2 Worker Selection Setting | 40 (2)
3.4 Summary of Quality Control | 42 (3)
References | 42 (3)
4 Cost Control | 45 (18)
4.1 Overview of Cost Control | 45 (1)
… | 46 (3)
4.2.1 Difficulty Measurement | 47 (1)
4.2.2 Threshold Selection | 48 (1)
… | 49 (1)
… | 49 (2)
… | 49 (1)
… | 50 (1)
… | 51 (1)
… | 51 (3)
… | 52 (1)
… | 53 (1)
… | 54 (1)
… | 54 (3)
4.5.1 Crowdsourced Aggregation | 54 (1)
… | 55 (2)
… | 57 (1)
… | 57 (3)
4.6.1 User Interface Design | 58 (1)
4.6.2 Non-monetary Incentives | 59 (1)
… | 60 (1)
4.7 Summary of Cost Control | 60 (3)
References | 61 (2)
5 Latency Control | 63 (8)
5.1 Overview of Latency Control | 63 (1)
5.2 Single-Task Latency Control | 64 (2)
… | 64 (1)
5.2.2 Qualification Test Time | 65 (1)
… | 65 (1)
5.3 Single-Batch Latency Control | 66 (2)
… | 66 (1)
5.3.2 Straggler Mitigation | 66 (2)
5.4 Multi-batch Latency Control | 68 (1)
5.4.1 Motivation of Multiple Batches | 68 (1)
… | 68 (1)
5.5 Summary of Latency Control | 69 (2)
References | 70 (1)
6 Crowdsourcing Database Systems and Optimization | 71 (26)
6.1 Overview of Crowdsourcing Database Systems | 71 (4)
6.2 Crowdsourcing Query Language | 75 (7)
… | 75 (1)
… | 76 (1)
… | 77 (1)
… | 78 (2)
… | 80 (2)
6.3 Crowdsourcing Query Optimization | 82 (11)
… | 82 (2)
… | 84 (1)
… | 85 (2)
… | 87 (4)
… | 91 (2)
6.4 Summary of Crowdsourcing Database Systems | 93 (4)
References | 94 (3)
7 Crowdsourced Operators | 97
7.1 Crowdsourced Selection | 97 (4)
7.1.1 Crowdsourced Filtering | 98 (1)
… | 99 (2)
7.1.3 Crowdsourced Search | 101 (1)
7.2 Crowdsourced Collection | 101 (3)
7.2.1 Crowdsourced Enumeration | 101 (3)
… | 104 (1)
7.3 Crowdsourced Join (Crowdsourced Entity Resolution) | 104 (9)
… | 104 (1)
7.3.2 Candidate Set Generation | 105 (1)
7.3.3 Candidate Set Verification | 106 (2)
7.3.4 Human Interface for Join | 108 (1)
… | 109 (4)
7.4 Crowdsourced Sort, Top-k, and Max/Min | 113 (8)
… | 113 (1)
7.4.2 Pairwise Comparisons | 113 (1)
… | 114 (5)
… | 119 (1)
… | 120 (1)
7.5 Crowdsourced Aggregation | 121 (2)
… | 121 (1)
7.5.2 Crowdsourced Median | 122 (1)
7.5.3 Crowdsourced Group By | 123 (1)
7.6 Crowdsourced Categorization | 123 (1)
7.7 Crowdsourced Skyline | 124 (2)
7.7.1 Crowdsourced Skyline on Incomplete Data | 125 (1)
7.7.2 Crowdsourced Skyline with Comparisons | 126 (1)
7.8 Crowdsourced Planning | 126 (6)
7.8.1 General Crowdsourced Planning Query | 127 (2)
7.8.2 An Application: Route Planning | 129 (3)
7.9 Crowdsourced Schema Matching | 132
About the Authors

Guoliang Li is an associate professor in the Department of Computer Science, Tsinghua University, Beijing, China. His research interests include crowdsourced data management, big spatio-temporal data analytics, and large-scale data cleaning and integration. He has published more than 100 papers in leading conferences and journals such as SIGMOD, VLDB, ICDE, SIGKDD, SIGIR, TODS, the VLDB Journal, and TKDE. He was a PC co-chair of WAIM 2014, WebDB 2014, and NDBC 2016, and serves as an associate editor for IEEE Transactions on Knowledge and Data Engineering, the VLDB Journal, Big Data Research, and the IEEE Data Engineering Bulletin. He regularly serves as a PC member of conferences such as SIGMOD, VLDB, KDD, ICDE, WWW, IJCAI, and AAAI. His papers have been cited more than 4500 times. He received the VLDB 2017 Early Research Contribution Award, the IEEE TCDE Early Career Award (2014), the NSFC Excellent Young Scholars Award (2014), and the CCF Young Scientist Award (2014), and was selected for the National Youth Talent Support Program (2016) and as a Young Changjiang Scholar (2016).
Prof. Michael J. Franklin is the inaugural holder of the Liew Family Chair of Computer Science at the University of Chicago. An authority on databases, data analytics, data management, and distributed systems, he also serves as senior advisor to the provost on computation and data science. Most recently he was the Thomas M. Siebel Professor of Computer Science and chair of the Computer Science Division of the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley, where he currently is an adjunct professor. He co-founded and directed Berkeley's Algorithms, Machines and People Laboratory (AMPLab), a leading academic big data analytics research center. The AMPLab won a National Science Foundation CISE "Expeditions in Computing" award, which was announced as part of the White House Big Data Research initiative in March 2012, and has received support from over 30 industrial sponsors. AMPLab has created industry-changing open-source big data software, including Apache Spark and BDAS, the Berkeley Data Analytics Stack. At Berkeley, Professor Franklin also served as an executive committee member of the Berkeley Institute for Data Science, a campus-wide initiative to advance data science environments. He is a fellow of the Association for Computing Machinery and a two-time recipient of the ACM SIGMOD Test of Time Award.