E-book: Web Information Retrieval

3.38/5 (26 ratings from Goodreads)

Stefano Ceri, Piero Fraternali, Silvia Quarteroni, Emanuele Della Valle, Marco Brambilla, Alessandro Bozzon

Format: PDF+DRM
Series: Data-Centric Systems and Applications
Publication date: 30-Aug-2013
Publisher: Springer-Verlag Berlin and Heidelberg GmbH & Co. K
Language: English
ISBN-13: 9783642393143

Price: 55,56 €*
* The price is final, i.e. no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed

Usage:

Digital rights management (DRM)
The publisher has released this e-book in encrypted form, which means that you need to install special software to read it. You also need to create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

With the proliferation of huge amounts of (heterogeneous) data on the Web, the importance of information retrieval (IR) has grown considerably over the last few years. Big players in the computer industry, such as Google, Microsoft and Yahoo!, are the primary contributors of technology for fast access to Web-based information; and searching capabilities are now integrated into most information systems, ranging from business management software and customer relationship systems to social networks and mobile phone applications.

Ceri and his co-authors aim to take their readers from the foundations of modern information retrieval to the most advanced challenges of Web IR. To this end, their book is divided into three parts. The first part addresses the principles of IR and provides a systematic and compact description of basic information retrieval techniques (including binary, vector space and probabilistic models as well as natural language search processing) before focusing on its application to the Web. Part two addresses the foundational aspects of Web IR by discussing the general architecture of search engines (with a focus on the crawling and indexing processes), describing link analysis methods (specifically PageRank and HITS), addressing recommendation and diversification, and finally presenting advertising in search (the main source of revenue for search engines). The third and final part describes advanced aspects of Web search, each chapter providing a self-contained, up-to-date survey on current Web research directions. Topics in this part include meta-search and multi-domain search, semantic search, search in the context of multimedia data, and crowd search.

The book is ideally suited to courses on information retrieval, as it covers all Web-independent foundational aspects. Its presentation is self-contained and does not require prior background knowledge. It can also be used in the context of classic courses on data management, allowing the instructor to cover both structured and unstructured data in various formats. Its classroom use is facilitated by a set of slides, which can be downloaded from www.search-computing.org.

Reviews

From the reviews:

The book covers not only a wide range, but everything that is essential to the topic of Web information retrieval. ... this book is an excellent starting point into the field of Web information retrieval, and can be recommended for classroom use. (Gottfried Vossen, zbMATH, Vol. 1283, 2014)

... this book is a valuable resource for students and instructors in web IR, primarily as a reference to supplement course teaching. Researchers and practitioners should find the book a useful quick reference guide for key concepts, techniques, and recent trends in web IR. (Wingyan Chung, ACM Computing Reviews, July 2014)

Part I Principles of Information Retrieval

1 An Introduction to Information Retrieval (10)
1.1 What Is Information Retrieval? (3)
1.1.1 Defining Relevance (1)
1.1.2 Dealing with Large, Unstructured Data Collections (1)
1.1.3 Formal Characterization (1)
1.1.4 Typical Information Retrieval Tasks (1)
1.2 Evaluating an Information Retrieval System (5)
1.2.1 Aspects of Information Retrieval Evaluation (1)
1.2.2 Precision, Recall, and Their Trade-Offs (2)
1.2.3 Ranked Retrieval (1)
1.2.4 Standard Test Collections (1)
1.3 Exercises (2)

2 The Information Retrieval Process (14)
2.1 A Bird's Eye View (2)
2.1.1 Logical View of Documents (1)
2.1.2 Indexing Process (1)
2.2 A Closer Look at Text (4)
2.2.1 Textual Operations (2)
2.2.2 Empirical Laws About Text (1)
2.3 Data Structures for Indexing (6)
2.3.1 Inverted Indexes (1)
2.3.2 Dictionary Compression (2)
2.3.3 B and B+ Trees (2)
2.3.4 Evaluation of B and B+ Trees (1)
2.4 Exercises (2)

3 Information Retrieval Models (12)
3.1 Similarity and Matching Strategies (1)
3.2 Boolean Model (2)
3.2.1 Evaluating Boolean Similarity (1)
3.2.2 Extensions and Limitations of the Boolean Model (1)
3.3 Vector Space Model (2)
3.3.1 Evaluating Vector Similarity (1)
3.3.2 Weighting Schemes and tf x idf (1)
3.3.3 Evaluation of the Vector Space Model (1)
3.4 Probabilistic Model (4)
3.4.1 Binary Independence Model (1)
3.4.2 Bootstrapping Relevance Estimation (1)
3.4.3 Iterative Refinement and Relevance Feedback (1)
3.4.4 Evaluation of the Probabilistic Model (1)
3.5 Exercises (3)

4 Classification and Clustering (18)
4.1 Addressing Information Overload with Machine Learning (1)
4.2 Classification (5)
4.2.1 Naive Bayes Classifiers (1)
4.2.2 Regression Classifiers (1)
4.2.3 Decision Trees (1)
4.2.4 Support Vector Machines (1)
4.3 Clustering (8)
4.3.1 Data Processing (1)
4.3.2 Similarity Function Selection (2)
4.3.3 Cluster Analysis (3)
4.3.4 Cluster Validation (1)
4.3.5 Labeling (1)
4.4 Application Scenarios for Clustering (3)
4.4.1 Search Results Clustering (2)
4.4.2 Database Clustering (1)
4.5 Exercises (1)

5 Natural Language Processing for Search (14)
5.1 Challenges of Natural Language Processing (2)
5.1.1 Dealing with Ambiguity (1)
5.1.2 Leveraging Probability (1)
5.2 Modeling Natural Language Tasks with Machine Learning (2)
5.2.1 Language Models (1)
5.2.2 Hidden Markov Models (1)
5.2.3 Conditional Random Fields (1)
5.3 Question Answering Systems (7)
5.3.1 What Is Question Answering? (1)
5.3.2 Question Answering Phases (2)
5.3.3 Deep Question Answering (2)
5.3.4 Shallow Semantic Structures for Text Representation (1)
5.3.5 Answer Reranking (1)
5.4 Exercises (3)

Part II Information Retrieval for the Web

6 Search Engines (20)
6.1 The Search Challenge (1)
6.2 A Brief History of Search Engines (2)
6.3 Architecture and Components (1)
6.4 Crawling (10)
6.4.1 Crawling Process (2)
6.4.2 Architecture of Web Crawlers (2)
6.4.3 DNS Resolution and URL Filtering (1)
6.4.4 Duplicate Elimination (1)
6.4.5 Distribution and Parallelization (1)
6.4.6 Maintenance of the URL Frontier (2)
6.4.7 Crawling Directives (1)
6.5 Indexing (5)
6.5.1 Distributed Indexing (1)
6.5.2 Dynamic Indexing (1)
6.5.3 Caching (1)
6.6 Exercises (1)

7 Link Analysis (20)
7.1 The Web Graph (2)
7.2 Link-Based Ranking (1)
7.3 PageRank (7)
7.3.1 Random Surfer Interpretation (1)
7.3.2 Managing Dangling Nodes (2)
7.3.3 Managing Disconnected Graphs (1)
7.3.4 Efficient Computation of the PageRank Vector (1)
7.3.5 Use of PageRank in Google (1)
7.4 Hypertext-Induced Topic Search (HITS) (8)
7.4.1 Building the Query-Induced Neighborhood Graph (1)
7.4.2 Computing the Hub and Authority Scores (4)
7.4.3 Uniqueness of Hub and Authority Scores (1)
7.4.4 Issues in HITS Application (1)
7.5 On the Value of Link-Based Analysis (1)
7.6 Exercises (1)

8 Recommendation and Diversification for the Web (10)
8.1 Pruning Information (1)
8.2 Recommendation Systems (4)
8.2.1 User Profiling (1)
8.2.2 Types of Recommender Systems (1)
8.2.3 Content-Based Recommendation Techniques (1)
8.2.4 Collaborative Filtering Techniques (2)
8.3 Result Diversification (4)
8.3.1 Scope (1)
8.3.2 Diversification Definition (1)
8.3.3 Diversity Criteria (1)
8.3.4 Balancing Relevance and Diversity (1)
8.3.5 Diversification Approaches (1)
8.3.6 Multi-domain Diversification (1)
8.4 Exercises (1)

9 Advertising in Search (16)
9.1 Web Monetization (1)
9.2 Advertising on the Web (3)
9.3 Terminology of Online Advertising (1)
9.4 Auctions (4)
9.4.1 First-Price Auctions (1)
9.4.2 Second-Price Auctions (2)
9.5 Pragmatic Details of Auction Implementation (1)
9.6 Federated Advertising (2)
9.7 Exercises (5)

Part III Advanced Aspects of Web Search

10 Publishing Data on the Web (24)
10.1 Options for Publishing Data on the Web (2)
10.2 The Deep Web (3)
10.3 Web APIs (3)
10.4 Microformats (3)
10.5 RDFa (4)
10.6 Linked Data (4)
10.7 Conclusion and Outlook (2)
10.8 Exercises (3)

11 Meta-search and Multi-domain Search (20)
11.1 Introduction and Motivation (1)
11.2 Top-k Query Processing over Data Sources (6)
11.2.1 OID-Based Problem (3)
11.2.2 Attribute-Based Problem (2)
11.3 Meta-search (3)
11.4 Multi-domain Search (7)
11.4.1 Service Registration (2)
11.4.2 Processing Multi-domain Queries (2)
11.4.3 Exploratory Search (2)
11.4.4 Data Visualization (1)
11.5 Exercises (3)

12 Semantic Search (26)
12.1 Understanding Semantic Search (3)
12.2 Semantic Model (4)
12.3 Resources (2)
12.3.1 System Perspective (2)
12.3.2 User Perspective (1)
12.4 Queries (5)
12.4.1 User Perspective (1)
12.4.2 System Perspective (2)
12.4.3 Query Translation and Presentation (1)
12.5 Semantic Matching (3)
12.6 Constructing the Semantic Model (4)
12.7 Semantic Resources Annotation (2)
12.8 Conclusions and Outlook (1)
12.9 Exercises (2)

13 Multimedia Search (16)
13.1 Motivations and Challenges of Multimedia Search (4)
13.1.1 Requirements and Applications (2)
13.1.2 Challenges (2)
13.2 MIR Architecture (5)
13.2.1 Content Process (1)
13.2.2 Query Process (2)
13.3 MIR Metadata (1)
13.4 MIR Content Processing (1)
13.5 Research Projects and Commercial Systems (3)
13.5.1 Research Projects (2)
13.5.2 Commercial Systems (1)
13.6 Exercises (2)

14 Search Process and Interfaces (12)
14.1 Search Process (2)
14.2 Information Seeking Paradigms (3)
14.3 User Interfaces for Search (6)
14.3.1 Query Specification (2)
14.3.2 Result Presentation (3)
14.3.3 Faceted Search (1)
14.4 Exercises (1)

15 Human Computation and Crowdsearching (24)
15.1 Introduction (3)
15.1.1 Background (2)
15.2 Applications (6)
15.2.1 Games with a Purpose (2)
15.2.2 Crowdsourcing (2)
15.2.3 Human Sensing and Mobilization (2)
15.3 The Human Computation Framework (6)
15.3.1 Phases of Human Computation (2)
15.3.2 Human Performers (1)
15.3.3 Examples of Human Computation (3)
15.3.4 Dimensions of Human Computation Applications (1)
15.4 Research Challenges and Projects (6)
15.4.1 The CrowdSearcher Project (2)
15.4.2 The CUbRIK Project (4)
15.5 Open Issues (1)
15.6 Exercises (2)

References (18)
Index

Stefano Ceri is a professor of Database Systems at the Politecnico di Milano and the director of Alta Scuola Politecnica. He is the recipient of the 2013 SIGMOD Edgar F. Codd Innovation Award for a series of influential contributions to several areas of database management, including distributed databases, rule-based systems, web-based application design, and search computing.

Alessandro Bozzon is an assistant professor of Information Retrieval at the Delft University of Technology. His research is on information management on the Web, with specific focus on Information Retrieval and human- and social-computation.

Marco Brambilla is an assistant professor of Software Engineering at Politecnico di Milano and a shareholder at WebRatio. His research is on Web modeling tools and methods, spanning crowdsourcing, social networks, search engines, BPM, SOA and enterprise architectures.

Emanuele Della Valle is an assistant professor of Software Project Management at Politecnico di Milano. His research is on Intelligent Web Information Systems and includes Semantic Web, Search Engines, Data Stream Processing, Rank-aware Databases and Crowdsourcing.

Piero Fraternali is a professor of Web Technologies at Politecnico di Milano, co-inventor of the Web Modeling Language, the basis of the WebRatio tool company and of the recent OMG Interaction Flow Modeling Language (IFML). His research focuses on Web development tools and on social-human computation.

Silvia Quarteroni is a senior consultant at Elca Informatique, Switzerland. She holds a Computer Science PhD on Question Answering systems and her main research interests concern statistical approaches to natural language processing.

Permalink: https://www.kriso.ee/db/97836423931432e.html
