Foreword by Michael W. Berry    xi
Preface    xiii

1  Introduction and motivation    1
   1.1  A way to embed ASCII documents into a finite dimensional Euclidean space    3
   1.2  Clustering and this book    5
   …    6

2  Quadratic k-means algorithm    9
   2.1  Classical batch k-means algorithm    10
      2.1.1  Quadratic distance and centroids    12
      2.1.2  Batch k-means clustering algorithm    13
      2.1.3  Batch k-means: advantages and deficiencies    14
   2.2  Incremental algorithm    21
      2.2.1  Quadratic functions    21
      2.2.2  Incremental k-means algorithm    25
   2.3  Quadratic k-means: summary    29
      2.3.1  Numerical experiments with quadratic k-means    29
   …    31
   …    35
   …    37
   …    38

3  BIRCH    41
   3.1  Balanced iterative reducing and clustering algorithm    41
   …    44
   …    49

4  Spherical k-means algorithm    51
   4.1  Spherical batch k-means algorithm    51
      4.1.1  Spherical batch k-means: advantages and deficiencies    53
      4.1.2  Computational considerations    55
   4.2  Spherical two-cluster partition of one-dimensional data    57
      4.2.1  One-dimensional line vs. the unit circle    57
      4.2.2  Optimal two-cluster partition on the unit circle    60
   4.3  Spherical batch and incremental clustering algorithms    64
      4.3.1  First variation for spherical k-means    65
      4.3.2  Spherical incremental iterations: computational complexity    68
      4.3.3  The "ping-pong" algorithm    69
      4.3.4  Quadratic and spherical k-means    71
   …    72

5  Linear algebra techniques    73
   5.1  Two approximation problems    73
   …    74
   5.3  Principal directions divisive partitioning    77
      5.3.1  Principal direction divisive partitioning (PDDP)    77
      5.3.2  Spherical principal directions divisive partitioning (sPDDP)    80
      5.3.3  Clustering with PDDP and sPDDP    82
   …    87
   …    88
      5.4.2  An application: hubs and authorities    88
   …    89

6  Information theoretic clustering    91
   6.1  Kullback–Leibler divergence    91
   6.2  k-means with Kullback–Leibler divergence    94
   6.3  Numerical experiments    96
   6.4  Distance between partitions    98
   …    99

7  Clustering with optimization techniques    101
   7.1  Optimization framework    102
   7.2  Smoothing k-means algorithm    103
   …    109
   7.4  Numerical experiments    114
   …    122

8  k-means clustering with divergences    125
   …    125
   …    128
   8.3  Clustering with entropy-like distances    132
   8.4  BIRCH-type clustering with entropy-like distances    135
   8.5  Numerical experiments with (υ, μ) k-means    140
   8.6  Smoothing with entropy-like distances    144
   8.7  Numerical experiments with (υ, μ) smoka    146
   …    152

9  Assessment of clustering results    155
   …    155
   …    156
   …    160

10  Appendix: Optimization and linear algebra background    161
   10.1  Eigenvalues of a symmetric matrix    161
   10.2  Lagrange multipliers    163
   10.3  Elements of convex analysis    164
      10.3.1  Conjugate functions    166
   …    169
      10.3.3  Asymptotic functions    173
   …    176
   …    178

11  Solutions to selected problems    179

Bibliography    189
Index    203