In this book Christian Jacquemin shows how the power of natural language processing (NLP) can be used to advance text indexing and information retrieval (IR). Jacquemin's novel tool is FASTR, a parser that normalizes terms and recognizes term variants. Since there are more meanings in a language than there are words, FASTR uses a metagrammar composed of shallow linguistic transformations that describe the morphological, syntactic, semantic, and pragmatic variations of words and terms. The acquired parsed terms can then be applied for precise retrieval and assembly of information.

The use of a corpus-based unification grammar to define, recognize, and combine term variants from their base forms allows for intelligent information access to, or "linguistic data tuning" of, heterogeneous texts. FASTR can be used to do automatic controlled indexing, to carry out content-based Web searches through conceptually related alternative query formulations, to abstract scientific and technical extracts, and even to translate and collect terms from multilingual material. Jacquemin provides a comprehensive account of the method and implementation of this innovative retrieval technique for text processing.