E-book: Practical Guide to Data Mining for Business and Industry

Andrea Ahlemeyer-Stubbe (antz21 GmbH, Gengenbach), Shirley Coleman (University of Newcastle, UK)
  • Format: PDF+DRM
  • Publication date: 21-Mar-2014
  • Publisher: John Wiley & Sons Inc
  • Language: English
  • ISBN-13: 9781118763728
  • Price: 72,80 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you need to install special software to read it. You also need to create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

    Required software
    To read on mobile devices (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

This text is grounded in sound statistical practice but emphasizes the practicalities, offering step-by-step guidance and adaptable blueprints for solving typical business problems. Coverage encompasses the concept of data mining, types of business information systems, data warehouses, data marts, data preparation, and analytics. "Recipes" address various aspects of market prediction, intra-customer analysis, and learning from a small testing sample. The text concludes with a quick guide to software and tools, and overviews of data mining in different industries, making use of official statistics, and differences between statistical analysis and data mining. Annotation ©2014 Ringgold, Inc., Portland, OR (protoview.com)

Data mining is well on its way to becoming a recognized discipline in the overlapping areas of IT, statistics, machine learning, and AI. Practical Data Mining for Business presents a user-friendly approach to data mining methods, covering the typical uses to which it is applied. The methodology is complemented by case studies to create a versatile reference book, allowing readers to look for specific methods as well as for specific applications. The book is formatted to allow statisticians, computer scientists, and economists to cross-reference from a particular application or method to sectors of interest.

Reviews

A Practical Guide to Data Mining for Business and Industry gives practical tools on how information can be extracted from masses of data. The book is very well written, in a conversational tone that makes it enjoyable to read. The authors are excellent communicators. If you are interested in learning about data mining, learning to do a particular task in data mining, looking for a textbook to use in a data mining or analytics course, or have a problem or data analytic task you are working on, this book would be an excellent place to start. (Mathematical Association of America, 23 August 2014)

Glossary of terms
Part I Data Mining Concept
1 Introduction
1.1 Aims of the Book
1.2 Data Mining Context
1.2.1 Domain Knowledge
1.2.2 Words to Remember
1.2.3 Associated Concepts
1.3 Global Appeal
1.4 Example Datasets Used in This Book
1.5 Recipe Structure
1.6 Further Reading and Resources
2 Data Mining Definition
2.1 Types of Data Mining Questions
2.1.1 Population and Sample
2.1.2 Data Preparation
2.1.3 Supervised and Unsupervised Methods
2.1.4 Knowledge-Discovery Techniques
2.2 Data Mining Process
2.3 Business Task: Clarification of the Business Question behind the Problem
2.4 Data: Provision and Processing of the Required Data
2.4.1 Fixing the Analysis Period
2.4.2 Basic Unit of Interest
2.4.3 Target Variables
2.4.4 Input Variables/Explanatory Variables
2.5 Modelling: Analysis of the Data
2.6 Evaluation and Validation during the Analysis Stage
2.7 Application of Data Mining Results and Learning from the Experience
Part II Data Mining Practicalities
3 All about Data
3.1 Some Basics
3.1.1 Data, Information, Knowledge and Wisdom
3.1.2 Sources and Quality of Data
3.1.3 Measurement Level and Types of Data
3.1.4 Measures of Magnitude and Dispersion
3.1.5 Data Distributions
3.2 Data Partition: Random Samples for Training, Testing and Validation
3.3 Types of Business Information Systems
3.3.1 Operational Systems Supporting Business Processes
3.3.2 Analysis-Based Information Systems
3.3.3 Importance of Information
3.4 Data Warehouses
3.4.1 Topic Orientation
3.4.2 Logical Integration and Homogenisation
3.4.3 Reference Period
3.4.4 Low Volatility
3.4.5 Using the Data Warehouse
3.5 Three Components of a Data Warehouse: DBMS, DB and DBCS
3.5.1 Database Management System (DBMS)
3.5.2 Database (DB)
3.5.3 Database Communication Systems (DBCS)
3.6 Data Marts
3.6.1 Regularly Filled Data Marts
3.6.2 Comparison between Data Marts and Data Warehouses
3.7 A Typical Example from the Online Marketing Area
3.8 Unique Data Marts
3.8.1 Permanent Data Marts
3.8.2 Data Marts Resulting from Complex Analysis
3.9 Data Mart: Do's and Don'ts
3.9.1 Do's and Don'ts for Processes
3.9.2 Do's and Don'ts for Handling
3.9.3 Do's and Don'ts for Coding/Programming
4 Data Preparation
4.1 Necessity of Data Preparation
4.2 From Small and Long to Short and Wide
4.3 Transformation of Variables
4.4 Missing Data and Imputation Strategies
4.5 Outliers
4.6 Dealing with the Vagaries of Data
4.6.1 Distributions
4.6.2 Tests for Normality
4.6.3 Data with Totally Different Scales
4.7 Adjusting the Data Distributions
4.7.1 Standardisation and Normalisation
4.7.2 Ranking
4.7.3 Box-Cox Transformation
4.8 Binning
4.8.1 Bucket Method
4.8.2 Analytical Binning for Nominal Variables
4.8.3 Quantiles
4.8.4 Binning in Practice
4.9 Timing Considerations
4.10 Operational Issues
5 Analytics
5.1 Introduction
5.2 Basis of Statistical Tests
5.2.1 Hypothesis Tests and P Values
5.2.2 Tolerance Intervals
5.2.3 Standard Errors and Confidence Intervals
5.3 Sampling
5.3.1 Methods
5.3.2 Sample Sizes
5.3.3 Sample Quality and Stability
5.4 Basic Statistics for Pre-analytics
5.4.1 Frequencies
5.4.2 Comparative Tests
5.4.3 Cross Tabulation and Contingency Tables
5.4.4 Correlations
5.4.5 Association Measures for Nominal Variables
5.4.6 Examples of Output from Comparative and Cross Tabulation Tests
5.5 Feature Selection/Reduction of Variables
5.5.1 Feature Reduction Using Domain Knowledge
5.5.2 Feature Selection Using Chi-Square
5.5.3 Principal Components Analysis and Factor Analysis
5.5.4 Canonical Correlation, PLS and SEM
5.5.5 Decision Trees
5.5.6 Random Forests
5.6 Time Series Analysis
6 Methods
6.1 Methods Overview
6.2 Supervised Learning
6.2.1 Introduction and Process Steps
6.2.2 Business Task
6.2.3 Provision and Processing of the Required Data
6.2.4 Analysis of the Data
6.2.5 Evaluation and Validation of the Results (during the Analysis)
6.2.6 Application of the Results
6.3 Multiple Linear Regression for Use When Target is Continuous
6.3.1 Rationale of Multiple Linear Regression Modelling
6.3.2 Regression Coefficients
6.3.3 Assessment of the Quality of the Model
6.3.4 Example of Linear Regression in Practice
6.4 Regression When the Target is Not Continuous
6.4.1 Logistic Regression
6.4.2 Example of Logistic Regression in Practice
6.4.3 Discriminant Analysis
6.4.4 Log-Linear Models and Poisson Regression
6.5 Decision Trees
6.5.1 Overview
6.5.2 Selection Procedures of the Relevant Input Variables
6.5.3 Splitting Criteria
6.5.4 Number of Splits (Branches of the Tree)
6.5.5 Symmetry/Asymmetry
6.5.6 Pruning
6.6 Neural Networks
6.7 Which Method Produces the Best Model? A Comparison of Regression, Decision Trees and Neural Networks
6.8 Unsupervised Learning
6.8.1 Introduction and Process Steps
6.8.2 Business Task
6.8.3 Provision and Processing of the Required Data
6.8.4 Analysis of the Data
6.8.5 Evaluation and Validation of the Results (during the Analysis)
6.8.6 Application of the Results
6.9 Cluster Analysis
6.9.1 Introduction
6.9.2 Hierarchical Cluster Analysis
6.9.3 K-Means Method of Cluster Analysis
6.9.4 Example of Cluster Analysis in Practice
6.10 Kohonen Networks and Self-Organising Maps
6.10.1 Description
6.10.2 Example of SOMs in Practice
6.11 Group Purchase Methods: Association and Sequence Analysis
6.11.1 Introduction
6.11.2 Analysis of the Data
6.11.3 Group Purchase Methods
6.11.4 Examples of Group Purchase Methods in Practice
7 Validation and Application
7.1 Introduction to Methods for Validation
7.2 Lift and Gain Charts
7.3 Model Stability
7.4 Sensitivity Analysis
7.5 Threshold Analytics and Confusion Matrix
7.6 ROC Curves
7.7 Cross-Validation and Robustness
7.8 Model Complexity
Part III Data Mining in Action
8 Marketing: Prediction
8.1 Recipe 1: Response Optimisation: To Find and Address the Right Number of Customers
8.2 Recipe 2: To Find the x% of Customers with the Highest Affinity to an Offer
8.3 Recipe 3: To Find the Right Number of Customers to Ignore
8.4 Recipe 4: To Find the x% of Customers with the Lowest Affinity to an Offer
8.5 Recipe 5: To Find the x% of Customers with the Highest Affinity to Buy
8.6 Recipe 6: To Find the x% of Customers with the Lowest Affinity to Buy
8.7 Recipe 7: To Find the x% of Customers with the Highest Affinity to a Single Purchase
8.8 Recipe 8: To Find the x% of Customers with the Highest Affinity to Sign a Long-Term Contract in Communication Areas
8.9 Recipe 9: To Find the x% of Customers with the Highest Affinity to Sign a Long-Term Contract in Insurance Areas
9 Intra-Customer Analysis
9.1 Recipe 10: To Find the Optimal Amount of Single Communication to Activate One Customer
9.2 Recipe 11: To Find the Optimal Communication Mix to Activate One Customer
9.3 Recipe 12: To Find and Describe Homogeneous Groups of Products
9.4 Recipe 13: To Find and Describe Groups of Customers with Homogeneous Usage
9.5 Recipe 14: To Predict the Order Size of Single Products or Product Groups
9.6 Recipe 15: Product Set Combination
9.7 Recipe 16: To Predict the Future Customer Lifetime Value of a Customer
10 Learning from a Small Testing Sample and Prediction
10.1 Recipe 17: To Predict Demographic Signs (Like Sex, Age, Education and Income)
10.2 Recipe 18: To Predict the Potential Customers of a Brand New Product or Service in Your Databases
10.3 Recipe 19: To Understand Operational Features and General Business Forecasting
11 Miscellaneous
11.1 Recipe 20: To Find Customers Who Will Potentially Churn
11.2 Recipe 21: Indirect Churn Based on a Discontinued Contract
11.3 Recipe 22: Social Media Target Group Descriptions
11.4 Recipe 23: Web Monitoring
11.5 Recipe 24: To Predict Who is Likely to Click on a Special Banner
12 Software and Tools: A Quick Guide
12.1 List of Requirements When Choosing a Data Mining Tool
12.2 Introduction to the Idea of Fully Automated Modelling (FAM)
12.2.1 Predictive Behavioural Targeting
12.2.2 Fully Automatic Predictive Targeting and Modelling Real-Time Online Behaviour
12.3 FAM Function
12.4 FAM Architecture
12.5 FAM Data Flows and Databases
12.6 FAM Modelling Aspects
12.7 FAM Challenges and Critical Success Factors
12.8 FAM Summary
13 Overviews
13.1 To Make Use of Official Statistics
13.2 How to Use Simple Maths to Make an Impression
13.2.1 Approximations
13.2.2 Absolute and Relative Values
13.2.3 % Change
13.2.4 Values in Context
13.2.5 Confidence Intervals
13.2.6 Rounding
13.2.7 Tables
13.2.8 Figures
13.3 Differences between Statistical Analysis and Data Mining
13.3.1 Assumptions
13.3.2 Values Missing Because 'Nothing Happened'
13.3.3 Sample Sizes
13.3.4 Goodness-of-Fit Tests
13.3.5 Model Complexity
13.4 How to Use Data Mining in Different Industries
13.5 Future Views
Bibliography
Index
Andrea Ahlemeyer-Stubbe, Director Strategic Analytics, DRAFTFCB München GmbH, Germany

Shirley Coleman, Principal Statistician, Industrial Statistics Research Unit, School of Maths and Statistics, Newcastle University, UK