E-book: Applied Data Mining for Business and Industry, 2nd Edition [Wiley Online]

Paolo Giudici (Faculty of Economics, University of Pavia, Italy), Silvia Figini (University of Pavia, Italy)
  • Format: 272 pages
  • Publication date: 17-Apr-2009
  • Publisher: John Wiley & Sons Inc
  • ISBN-10: 0470745835
  • ISBN-13: 9780470745830
  • Wiley Online
  • Price: 85.64 €*
  • * Price that guarantees access for an unlimited number of simultaneous users for an unlimited time
The increasing availability of data in our current, information-overloaded society has led to the need for valid tools for its modelling and analysis. Data mining and applied statistical methods are the appropriate tools to extract knowledge from such data. This book provides an accessible introduction to data mining methods in a consistent and application-oriented statistical framework, using case studies drawn from real industry projects and highlighting the use of data mining methods in a variety of business applications.

  • Introduces data mining methods and applications.
  • Covers classical and Bayesian multivariate statistical methodology as well as machine learning and computational data mining methods.
  • Includes many recent developments such as association and sequence rules, graphical Markov models, lifetime value modelling, credit risk, operational risk and web mining.
  • Features detailed case studies based on applied projects within industry.
  • Incorporates discussion of data mining software, with case studies analysed using R.
  • Is accessible to anyone with a basic knowledge of statistics or data analysis.
  • Includes an extensive bibliography and pointers to further reading within the text.

Applied Data Mining for Business and Industry, 2nd edition, is aimed at advanced undergraduate and graduate students of data mining, applied statistics, database management, computer science and economics. The case studies will provide guidance to professionals working in industry on projects involving large volumes of data, such as customer relationship management, web design, risk management, marketing, economics and finance.

Introduction
Part I Methodology
Organisation of the data
Statistical units and statistical variables
Data matrices and their transformations
Complex data structures
Summary
Summary statistics
Univariate exploratory analysis
Measures of location
Measures of variability
Measures of heterogeneity
Measures of concentration
Measures of asymmetry
Measures of kurtosis
Bivariate exploratory analysis of quantitative data
Multivariate exploratory analysis of quantitative data
Multivariate exploratory analysis of qualitative data
Independence and association
Distance measures
Dependency measures
Model-based measures
Reduction of dimensionality
Interpretation of the principal components
Further reading
Model specification
Measures of distance
Euclidean distance
Similarity measures
Multidimensional scaling
Cluster analysis
Hierarchical methods
Evaluation of hierarchical methods
Non-hierarchical methods
Linear regression
Bivariate linear regression
Properties of the residuals
Goodness of fit
Multiple linear regression
Logistic regression
Interpretation of logistic regression
Discriminant analysis
Tree models
Division criteria
Pruning
Neural networks
Architecture of a neural network
The multilayer perceptron
Kohonen networks
Nearest-neighbour models
Local models
Association rules
Retrieval by content
Uncertainty measures and inference
Probability
Statistical models
Statistical inference
Non-parametric modelling
The normal linear model
Main inferential results
Generalised linear models
The exponential family
Definition of generalised linear models
The logistic regression model
Log-linear models
Construction of a log-linear model
Interpretation of a log-linear model
Graphical log-linear models
Log-linear model comparison
Graphical models
Symmetric graphical models
Recursive graphical models
Graphical models and neural networks
Survival analysis models
Further reading
Model evaluation
Criteria based on statistical tests
Distance between statistical models
Discrepancy of a statistical model
Kullback-Leibler discrepancy
Criteria based on scoring functions
Bayesian criteria
Computational criteria
Criteria based on loss functions
Further reading
Part II Business case studies
Describing website visitors
Objectives of the analysis
Description of the data
Exploratory analysis
Model building
Cluster analysis
Kohonen networks
Model comparison
Summary report
Market basket analysis
Objectives of the analysis
Description of the data
Exploratory data analysis
Model building
Log-linear models
Association rules
Model comparison
Summary report
Describing customer satisfaction
Objectives of the analysis
Description of the data
Exploratory data analysis
Model building
Summary
Predicting credit risk of small businesses
Objectives of the analysis
Description of the data
Exploratory data analysis
Model building
Model comparison
Summary report
Predicting e-learning student performance
Objectives of the analysis
Description of the data
Exploratory data analysis
Model specification
Model comparison
Summary report
Predicting customer lifetime value
Objectives of the analysis
Description of the data
Exploratory data analysis
Model specification
Model comparison
Summary report
Operational risk management
Context and objectives of the analysis
Exploratory data analysis
Model building
Model comparison
Summary conclusions
References
Index
Paolo Giudici, Department of Economics and Quantitative Methods, University of Pavia. A lecturer in data mining, business statistics, data analysis and risk management, Professor Giudici is also the director of the data mining laboratory. He is the author of around 80 publications, the coordinator of two national research grants on data mining, and local coordinator of a European integrated project on the topic. He was the sole author of the first edition of this book, which has been translated into both Italian and Chinese. He is also one of the Editors of Wiley's Series in Computational Statistics.

Silvia Figini. Ms Figini has worked for two years for the Competence centre for data mining analysis and business intelligence at SAS Milan. She is currently completing a PhD in statistics and already has a collection of publications to her name.