E-book: Data Mining Techniques in Grid Computing Environments [Wiley Online]

Edited by Werner Dubitzky (University of Ulster)

Format: 288 pages
Publication date: 14-Nov-2008
Publisher: John Wiley & Sons Inc
ISBN-10: 0470699906
ISBN-13: 9780470699904

Other books on the subject:

Data mining

Wiley Online
Price: 163,83 €*
* price that grants access for an unlimited number of concurrent users for an unlimited time

Book homepage: https://onlinelibrary.wiley.com/doi/book/10.1002/9780470699904

Based on eleven international real-life case studies, with contributions from leading experts in the field, this groundbreaking book explores the need for grid-enabling data mining applications and provides a comprehensive study of the technology, techniques and management skills necessary to create them. It serves at once as a design blueprint, user guide, and research agenda for current and future developments, and will appeal to a broad audience, from developers and users of data mining and grid technology to advanced undergraduate and postgraduate students interested in this field.

Preface

List of Contributors

Data mining meets grid computing: Time to dance?
Alberto Sanchez, Jesus Montes, Werner Dubitzky, Julio J. Valdes, Maria S. Perez, Pedro de Miguel
  Introduction
  Data mining
  Complex data mining problems
  Data mining challenges
  Grid computing
  Grid computing challenges
  Data mining grid - mining grid data
  Data mining grid: a grid facilitating large-scale data mining
  Mining grid data: analyzing grid systems with data mining techniques
  Conclusions
  Summary of chapters in this volume

Data analysis services in the knowledge grid
Eugenio Cesario, Antonio Congiusta, Domenico Talia, Paolo Trunfio
  Introduction
  Approach
  Knowledge Grid services
  The Knowledge Grid architecture
  Implementation
  Data analysis services
  Design of Knowledge Grid applications
  The VEGA visual language
  UML application modelling
  Applications and experiments
  Conclusions

GridMiner: An advanced support for e-science analytics
Peter Brezany, Ivan Janciak, A. Min Tjoa
  Introduction
  Rationale behind the design and development of GridMiner
  Use case
  Knowledge discovery process and its support by the GridMiner
  Phases of knowledge discovery
  Workflow management
  Data management
  Data mining services and OLAP
  Security
  Graphical user interface
  Future developments
  High-level data mining model
  Data mining query language
  Distributed mining of data streams
  Conclusions

ADaM services: Scientific data mining in the service-oriented architecture paradigm
Rahul Ramachandran, Sara Graves, John Rushing, Ken Keyzer, Manil Maskey, Hong Lin, Helen Conover
  Introduction
  ADaM system overview
  ADaM toolkit overview
  Mining in a service-oriented architecture
  Mining web services
  Implementation architecture
  Workflow example
  Implementation issues
  Mining grid services
  Architecture components
  Workflow example
  Summary

Mining for misconfigured machines in grid systems
Noam Palatin, Arie Leizarowitz, Assaf Schuster, Ran Wolff
  Introduction
  Preliminaries and related work
  System misconfiguration detection
  Outlier detection
  Acquiring, pre-processing and storing data
  Data sources and acquisition
  Pre-processing
  Data organization
  Data analysis
  General approach
  Notation
  Algorithm
  Correctness and termination
  The GMS
  Evaluation
  Qualitative results
  Quantitative results
  Interoperability
  Conclusions and future work

FAEHIM: Federated Analysis Environment for Heterogeneous Intelligent Mining
Ali Shaikh Ali, Omer F. Rana
  Introduction
  Requirements of a distributed knowledge discovery framework
  Knowledge discovery specific requirements
  Distributed framework specific requirements
  Workflow-based knowledge discovery
  Data mining toolkit
  Data mining service framework
  Distributed data mining services
  Data manipulation tools
  Availability
  Empirical experiments
  Evaluating the framework accuracy
  Evaluating the running time of the framework
  Conclusions

Scalable and privacy preserving distributed data analysis over a service-oriented platform
William K. Cheung
  Introduction
  A service-oriented solution
  Background
  Types of distributed data analysis
  A brief review of distributed data analysis
  Data mining services and data analysis management systems
  Model-based scalable, privacy preserving, distributed data analysis
  Hierarchical local data abstractions
  Learning global models from local abstractions
  Modelling distributed data mining and workflow processes
  DDM processes in BPEL4WS
  Implementation details
  Lessons learned
  Performance of running distributed data analysis on BPEL
  Issues specific to service-oriented distributed data analysis
  Compatibility of Web services development tools
  Further research directions
  Optimizing BPEL4WS process execution
  Improved support of data analysis process management
  Improved support of data privacy preservation
  Conclusions

Building and using analytical workflows in Discovery Net
Moustafa Ghanem, Vasa Curin, Patrick Wendel, Yike Guo
  Introduction
  Workflows on the grid
  Discovery Net system
  System overview
  Workflow representation in DPML
  Multiple data models
  Workflow-based services
  Multiple execution models
  Data flow pull model
  Streaming and batch transfer of data elements
  Control flow push model
  Embedding
  Architecture for Discovery Net
  Motivation for a new server architecture
  Management of hosting environments
  Activity management
  Collaborative workflow platform
  Architecture overview
  Activity service definition layer
  Activity services bus
  Collaboration and execution services
  Workflow services bus
  Prototyping and production clients
  Data management
  Example of a workflow study
  ADR studies
  Analysis overview
  Service for transforming event data into patient annotations
  Service for defining exclusions
  Service for defining exposures
  Service for building the classification model
  Validation service
  Summary
  Future directions

Building workflows that traverse the bioinformatics data landscape
Robert Stevens, Paul Fisher, Jun Zhao, Carole Goble, Andy Brass
  Introduction
  The bioinformatics data landscape
  The bioinformatics experiment landscape
  Taverna for bioinformatics experiments
  Three-tiered enactment in Taverna
  The open-typing data models
  Building workflows in Taverna
  Designing a SCUFL workflow
  Workflow case study
  The bioinformatics task
  Current approaches and issues
  Constructing workflows
  Candidate genes involved in trypanosomiasis resistance
  Workflows and the systematic approach
  Discussion

Specification of distributed data mining workflows with DataMiningGrid
Dennis Wegener, Michael May
  Introduction
  DataMiningGrid environment
  General architecture
  Grid environment
  Scalability
  Workflow environment
  Operations for workflow construction
  Chaining
  Looping
  Branching
  Shipping algorithms
  Shipping data
  Parameter variation
  Parallelization
  Extensibility
  Case studies
  Evaluation criteria and experimental methodology
  Partitioning data
  Classifier comparison scenario
  Parameter optimization
  Discussion and related work
  Open issues
  Conclusions

Anteater: Service-oriented data mining
Renato A. Ferreira, Dorgival O. Guedes, Wagner Meira Jr.
  Introduction
  The architecture
  Runtime framework
  Labelled stream
  Global persistent storage
  Termination detection
  Application of the model
  Parallel algorithms for data mining
  Decision trees
  Clustering
  Visual metaphors
  Case studies
  Future developments
  Conclusions and future work

DMGA: A generic brokering-based Data Mining Grid Architecture
Alberto Sanchez, Maria S. Perez, Pierre Gueant, Jose M. Pena, Pilar Herrero
  Introduction
  DMGA overview
  Horizontal composition
  Vertical composition
  The need for brokering
  Brokering-based data mining grid architecture
  Use cases: Apriori, ID3 and J4.8 algorithms
  Horizontal composition use case: Apriori
  Vertical composition use cases: ID3 and J4.8
  Related work
  Conclusions

Grid-based data mining with the Environmental Scenario Search Engine (ESSE)
Mikhail Zhizhin, Alexey Poyda, Dmitry Mishin, Dmitry Medvedev, Eric Kihn, Vassily Lyutsarev
  Environmental data source: NCEP/NCAR reanalysis data set
  Fuzzy search engine
  Operators of fuzzy logic
  Fuzzy logic predicates
  Fuzzy states in time
  Relative importance of parameters
  Fuzzy search optimization
  Software architecture
  Database schema optimization
  Data grid layer
  ESSE data resource
  ESSE data processor
  Applications
  Global air temperature trends
  Statistics of extreme weather events
  Atmospheric fronts
  Conclusions

Data pre-processing using OGSA-DAI
Martin Swain, Neil P. Chue Hong
  Introduction
  Data pre-processing for grid-enabled data mining
  Using OGSA-DAI to support data mining applications
  OGSA-DAI's activity framework
  OGSA-DAI workflows for data management and pre-processing
  Data pre-processing scenarios in data mining applications
  Calculating a data summary
  Discovering association rules in protein unfolding simulations
  Mining distributed medical databases
  State-of-the-art solutions for grid data management
  Discussion
  Open issues
  Conclusions

Index

Werner Dubitzky, PhD, is Chair of Bioinformatics at the Biomedical Sciences Research Institute in the Faculty of Life and Health Sciences at the University of Ulster. His research investigates systems biology, knowledge management in biology, grid computing, and data mining.

Krzysztof Kurowski, PhD, leads the Applications Department at Poznan Supercomputing and Networking Center in Poland. His research is focused on the modeling of advanced applications, scheduling, and resource management in networked environments.

Permalink: https://www.kriso.ee/db/9780470699904_pe.html
