Part I Principles of Information Retrieval
1 An Introduction to Information Retrieval | 3 | (10)
1.1 What Is Information Retrieval? | 3 | (3)
| 4 | (1)
1.1.2 Dealing with Large, Unstructured Data Collections | 4 | (1)
1.1.3 Formal Characterization | 5 | (1)
1.1.4 Typical Information Retrieval Tasks | 5 | (1)
1.2 Evaluating an Information Retrieval System | 6 | (5)
1.2.1 Aspects of Information Retrieval Evaluation | 6 | (1)
1.2.2 Precision, Recall, and Their Trade-Offs | 7 | (2)
| 9 | (1)
1.2.4 Standard Test Collections | 10 | (1)
| 11 | (2)
2 The Information Retrieval Process | 13 | (14)
| 13 | (2)
2.1.1 Logical View of Documents | 14 | (1)
| 15 | (1)
2.2 A Closer Look at Text | 15 | (4)
| 16 | (2)
2.2.2 Empirical Laws About Text | 18 | (1)
2.3 Data Structures for Indexing | 19 | (6)
| 20 | (1)
2.3.2 Dictionary Compression | 21 | (2)
| 23 | (2)
2.3.4 Evaluation of B and B+ Trees | 25 | (1)
| 25 | (2)
3 Information Retrieval Models | 27 | (12)
3.1 Similarity and Matching Strategies | 27 | (1)
| 28 | (2)
3.2.1 Evaluating Boolean Similarity | 28 | (1)
3.2.2 Extensions and Limitations of the Boolean Model | 29 | (1)
| 30 | (2)
3.3.1 Evaluating Vector Similarity | 30 | (1)
3.3.2 Weighting Schemes and tf × idf | 31 | (1)
3.3.3 Evaluation of the Vector Space Model | 32 | (1)
| 32 | (4)
3.4.1 Binary Independence Model | 33 | (1)
3.4.2 Bootstrapping Relevance Estimation | 34 | (1)
3.4.3 Iterative Refinement and Relevance Feedback | 35 | (1)
3.4.4 Evaluation of the Probabilistic Model | 36 | (1)
| 36 | (3)
4 Classification and Clustering | 39 | (18)
4.1 Addressing Information Overload with Machine Learning | 39 | (1)
| 40 | (5)
4.2.1 Naive Bayes Classifiers | 41 | (1)
4.2.2 Regression Classifiers | 42 | (1)
| 43 | (1)
4.2.4 Support Vector Machines | 44 | (1)
| 45 | (8)
| 46 | (1)
4.3.2 Similarity Function Selection | 46 | (2)
| 48 | (3)
| 51 | (1)
| 52 | (1)
4.4 Application Scenarios for Clustering | 53 | (3)
4.4.1 Search Results Clustering | 53 | (2)
4.4.2 Database Clustering | 55 | (1)
| 56 | (1)
5 Natural Language Processing for Search | 57 | (14)
5.1 Challenges of Natural Language Processing | 57 | (2)
5.1.1 Dealing with Ambiguity | 58 | (1)
5.1.2 Leveraging Probability | 58 | (1)
5.2 Modeling Natural Language Tasks with Machine Learning | 59 | (2)
| 59 | (1)
5.2.2 Hidden Markov Models | 60 | (1)
5.2.3 Conditional Random Fields | 60 | (1)
5.3 Question Answering Systems | 61 | (7)
5.3.1 What Is Question Answering? | 61 | (1)
5.3.2 Question Answering Phases | 62 | (2)
5.3.3 Deep Question Answering | 64 | (2)
5.3.4 Shallow Semantic Structures for Text Representation | 66 | (1)
| 67 | (1)
| 68 | (3)
Part II Information Retrieval for the Web
| 71 | (20)
| 71 | (1)
6.2 A Brief History of Search Engines | 72 | (2)
6.3 Architecture and Components | 74 | (1)
| 75 | (10)
| 76 | (2)
6.4.2 Architecture of Web Crawlers | 78 | (2)
6.4.3 DNS Resolution and URL Filtering | 80 | (1)
6.4.4 Duplicate Elimination | 80 | (1)
6.4.5 Distribution and Parallelization | 81 | (1)
6.4.6 Maintenance of the URL Frontier | 82 | (2)
6.4.7 Crawling Directives | 84 | (1)
| 85 | (5)
6.5.1 Distributed Indexing | 87 | (1)
| 88 | (1)
| 89 | (1)
| 90 | (1)
| 91 | (20)
| 91 | (2)
| 93 | (1)
| 94 | (7)
7.3.1 Random Surfer Interpretation | 96 | (1)
7.3.2 Managing Dangling Nodes | 97 | (2)
7.3.3 Managing Disconnected Graphs | 99 | (1)
7.3.4 Efficient Computation of the PageRank Vector | 100 | (1)
7.3.5 Use of PageRank in Google | 101 | (1)
7.4 Hypertext-Induced Topic Search (HITS) | 101 | (8)
7.4.1 Building the Query-Induced Neighborhood Graph | 102 | (1)
7.4.2 Computing the Hub and Authority Scores | 103 | (4)
7.4.3 Uniqueness of Hub and Authority Scores | 107 | (1)
7.4.4 Issues in HITS Application | 108 | (1)
7.5 On the Value of Link-Based Analysis | 109 | (1)
| 110 | (1)
8 Recommendation and Diversification for the Web | 111 | (10)
| 111 | (1)
8.2 Recommendation Systems | 112 | (4)
| 112 | (1)
8.2.2 Types of Recommender Systems | 113 | (1)
8.2.3 Content-Based Recommendation Techniques | 113 | (1)
8.2.4 Collaborative Filtering Techniques | 114 | (2)
8.3 Result Diversification | 116 | (4)
| 116 | (1)
8.3.2 Diversification Definition | 116 | (1)
| 117 | (1)
8.3.4 Balancing Relevance and Diversity | 117 | (1)
8.3.5 Diversification Approaches | 118 | (1)
8.3.6 Multi-domain Diversification | 119 | (1)
| 120 | (1)
| 121 | (16)
| 121 | (1)
9.2 Advertising on the Web | 121 | (3)
9.3 Terminology of Online Advertising | 124 | (1)
| 125 | (4)
9.4.1 First-Price Auctions | 126 | (1)
9.4.2 Second-Price Auctions | 127 | (2)
9.5 Pragmatic Details of Auction Implementation | 129 | (1)
9.6 Federated Advertising | 130 | (2)
| 132 | (5)
Part III Advanced Aspects of Web Search
10 Publishing Data on the Web | 137 | (24)
10.1 Options for Publishing Data on the Web | 137 | (2)
| 139 | (3)
| 142 | (3)
| 145 | (3)
| 148 | (4)
| 152 | (4)
10.7 Conclusion and Outlook | 156 | (2)
| 158 | (3)
11 Meta-search and Multi-domain Search | 161 | (20)
11.1 Introduction and Motivation | 161 | (1)
11.2 Top-k Query Processing over Data Sources | 162 | (6)
| 163 | (3)
11.2.2 Attribute-Based Problem | 166 | (2)
| 168 | (3)
| 171 | (7)
11.4.1 Service Registration | 171 | (2)
11.4.2 Processing Multi-domain Queries | 173 | (2)
11.4.3 Exploratory Search | 175 | (2)
11.4.4 Data Visualization | 177 | (1)
| 178 | (3)
| 181 | (26)
12.1 Understanding Semantic Search | 181 | (3)
| 184 | (4)
| 188 | (2)
12.3.1 System Perspective | 188 | (2)
| 190 | (1)
| 190 | (5)
| 192 | (1)
12.4.2 System Perspective | 192 | (2)
12.4.3 Query Translation and Presentation | 194 | (1)
| 195 | (3)
12.6 Constructing the Semantic Model | 198 | (4)
12.7 Semantic Resources Annotation | 202 | (2)
12.8 Conclusions and Outlook | 204 | (1)
| 205 | (2)
| 207 | (16)
13.1 Motivations and Challenges of Multimedia Search | 207 | (4)
13.1.1 Requirements and Applications | 207 | (2)
| 209 | (2)
| 211 | (5)
| 213 | (1)
| 214 | (2)
| 216 | (1)
13.4 MIR Content Processing | 217 | (1)
13.5 Research Projects and Commercial Systems | 218 | (3)
| 218 | (2)
13.5.2 Commercial Systems | 220 | (1)
| 221 | (2)
14 Search Process and Interfaces | 223 | (12)
| 223 | (2)
14.2 Information Seeking Paradigms | 225 | (3)
14.3 User Interfaces for Search | 228 | (6)
14.3.1 Query Specification | 228 | (2)
14.3.2 Result Presentation | 230 | (3)
| 233 | (1)
| 234 | (1)
15 Human Computation and Crowdsearching | 235 | (24)
| 235 | (3)
| 236 | (2)
| 238 | (6)
15.2.1 Games with a Purpose | 238 | (2)
| 240 | (2)
15.2.3 Human Sensing and Mobilization | 242 | (2)
15.3 The Human Computation Framework | 244 | (6)
15.3.1 Phases of Human Computation | 244 | (2)
| 246 | (1)
15.3.3 Examples of Human Computation | 246 | (3)
15.3.4 Dimensions of Human Computation Applications | 249 | (1)
15.4 Research Challenges and Projects | 250 | (6)
15.4.1 The CrowdSearcher Project | 250 | (2)
15.4.2 The CUbRIK Project | 252 | (4)
| 256 | (1)
| 257 | (2)
References | 259 | (18)
Index | 277