E-book: Data Mining with SPSS Modeler: Theory, Exercises and Solutions

  • Format: PDF+DRM
  • Publication date: 06-Jun-2016
  • Publisher: Springer International Publishing AG
  • Language: eng
  • ISBN-13: 9783319287096
  • Price: 122,88 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on mobile devices (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Introducing the IBM SPSS Modeler, this book guides readers through data mining processes and presents relevant statistical methods. There is a special focus on step-by-step tutorials and well-documented examples that help demystify complex mathematical algorithms and computer programs. The variety of exercises and solutions as well as an accompanying website with data sets and SPSS Modeler streams are particularly valuable. While intended for students, the simplicity of the Modeler makes the book useful for anyone wishing to learn about basic and more advanced data mining, and put this knowledge into practice.

Preface.- Introduction.- Basic Functions of the SPSS Modeler.- Univariate Statistics.- Multivariate Statistics.- Regression Models.- Factor Analysis.- Cluster Analysis.- Classification Models.- Using R with the Modeler.- Data Sets Used in This Book.
1 Introduction
1.1 The Concept of the SPSS Modeler
1.2 Structure and Features of This Book
1.2.1 Prerequisites for Using This Book
1.2.2 Structure of the Book and the Exercise/Solution Concept
1.2.3 Using the Data and Streams Provided with the Book
1.2.4 Datasets Provided with This Book
1.2.5 Template Concept of This Book
1.3 Introducing the Modeling Process
1.3.1 Exercises
1.3.2 Solutions
Literature
2 Basic Functions of the SPSS Modeler
2.1 Defining Streams and Scrolling Through a Dataset
2.2 Switching Between Different Streams
2.3 Defining or Modifying Value Labels
2.4 Adding Comments to a Stream
2.5 Exercises
2.6 Solutions
2.7 Data Handling and Sampling Methods
2.7.1 Theory
2.7.2 Calculations
2.7.3 String Functions
2.7.4 Extracting/Selecting Records
2.7.5 Filtering Data
2.7.6 Data Standardization: Z-Transformation
2.7.7 Partitioning Datasets
2.7.8 Sampling Methods
2.7.9 Merge Datasets
2.7.10 Append Datasets
2.7.11 Exercises
2.7.12 Solutions
Literature
3 Univariate Statistics
3.1 Theory
3.1.1 Discrete Versus Continuous Variables
3.1.2 Scales of Measurement
3.1.3 Exercises
3.1.4 Solutions
3.2 Simple Data Examination Tasks
3.2.1 Theory
3.2.2 Frequency Distribution of Discrete Variables
3.2.3 Frequency Distribution of Continuous Variables
3.2.4 Distribution Analysis with the Data Audit Node
3.2.5 Concept of "SuperNodes" and Transforming a Variable to Normality
3.2.6 Reclassifying Values
3.2.7 Binning Continuous Data
3.2.8 Exercises
3.2.9 Solutions
Literature
4 Multivariate Statistics
4.1 Theory
4.2 Scatterplot
4.3 Scatterplot Matrix
4.4 Correlation
4.5 Correlation Matrix
4.6 Exclusion of Spurious Correlations
4.7 Contingency Tables
4.8 Exercises
4.9 Solutions
Literature
5 Regression Models
5.1 Introduction to Regression Models
5.1.1 Motivating Examples
5.1.2 Concept of the Modeling Process and Cross-Validation
5.2 Simple Linear Regression
5.2.1 Theory
5.2.2 Building the Stream in SPSS Modeler
5.2.3 Identification and Interpretation of the Model Parameters
5.2.4 Assessment of the Goodness of Fit
5.2.5 Predicting Unknown Values
5.2.6 Exercises
5.2.7 Solutions
5.3 Multiple Linear Regression
5.3.1 Theory
5.3.2 Building the Model in SPSS Modeler
5.3.3 Final MLR Model and Its Goodness of Fit
5.3.4 Prediction of Unknown Values
5.3.5 Cross-Validation of the Model
5.3.6 Boosting and Bagging (for Regression Models)
5.3.7 Exercises
5.3.8 Solutions
5.4 Generalized Linear (Mixed) Model
5.4.1 Theory
5.4.2 Building a Model with the GLMM Node
5.4.3 The Model Nugget
5.4.4 Cross-Validation and Fitting a Quadric Regression Model
5.4.5 Exercises
5.4.6 Solutions
5.5 The Auto Numeric Node
5.5.1 Building a Stream with the Auto Numeric Node
5.5.2 The Auto Numeric Model Nugget
5.5.3 Exercises
5.5.4 Solutions
Literature
6 Factor Analysis
6.1 Motivating Example
6.2 General Theory of Factor Analysis
6.3 Principal Component Analysis
6.3.1 Theory
6.3.2 Building a Model in SPSS Modeler
6.3.3 Exercises
6.3.4 Solutions
6.4 Principal Factor Analysis
6.4.1 Theory
6.4.2 Building a Model
6.4.3 Exercises
6.4.4 Solutions
Literature
7 Cluster Analysis
7.1 Motivating Examples
7.2 General Theory of Cluster Analysis
7.2.1 Exercises
7.2.2 Solutions
7.3 TwoStep Hierarchical Agglomerative Clustering
7.3.1 Theory of Hierarchical Clustering
7.3.2 Characteristics of the TwoStep Algorithm
7.3.3 Building a Model in SPSS Modeler
7.3.4 Exercises
7.3.5 Solutions
7.4 K-Means Partitioning Clustering
7.4.1 Theory
7.4.2 Building a Model in SPSS Modeler
7.4.3 Exercises
7.4.4 Solutions
7.5 Auto Clustering
7.5.1 Motivation and Implementation of the Auto Cluster Node
7.5.2 Building a Model in SPSS Modeler
7.5.3 Exercises
7.5.4 Solutions
7.6 Summary
Literature
8 Classification Models
8.1 Motivating Examples
8.2 General Theory of Classification Models
8.2.1 Process of Training and Using a Classification Model
8.2.2 Classification Algorithms
8.2.3 Classification vs. Clustering
8.2.4 Making a Decision and the Decision Boundary
8.2.5 Performance Measures of Classification Models
8.2.6 The Analysis Node
8.2.7 Exercises
8.2.8 Solutions
8.3 Logistic Regression
8.3.1 Theory
8.3.2 Building the Model in SPSS Modeler
8.3.3 Optional: Model Types and Variable Interactions
8.3.4 Final Model and Its Goodness of Fit
8.3.5 Classification of Unknown Values
8.3.6 Cross-Validation of the Model
8.3.7 Exercises
8.3.8 Solutions
8.4 Linear Discriminate Classification
8.4.1 Theory
8.4.2 Building the Model with SPSS Modeler
8.4.3 The Model Nugget and the Estimated Model Parameters
8.4.4 Exercises
8.4.5 Solutions
8.5 Support Vector Machine
8.5.1 Theory
8.5.2 Building the Model with SPSS Modeler
8.5.3 The Model Nugget
8.5.4 Exercises
8.5.5 Solutions
8.6 Neuronal Networks
8.6.1 Theory
8.6.2 Building a Network with SPSS Modeler
8.6.3 The Model Nugget
8.6.4 Exercises
8.6.5 Solutions
8.7 k-Nearest Neighbor
8.7.1 Theory
8.7.2 Building the Model with SPSS Modeler
8.7.3 The Model Nugget
8.7.4 Dimensional Reduction with PCA for Data Preprocessing
8.7.5 Exercises
8.7.6 Solutions
8.8 Decision Trees
8.8.1 Theory
8.8.2 Building a Decision Tree with the C5.0 Node
8.8.3 The Model Nugget
8.8.4 Building a decision tree with the CHAID node
8.8.5 Exercises
8.8.6 Solutions
8.9 The Auto Classifier Node
8.9.1 Building a Stream with the Auto Classifier Node
8.9.2 The Auto Classifier Model Nugget
8.9.3 Exercises
8.9.4 Solutions
Literature
9 Using R with the Modeler
9.1 Advantages of R with the Modeler
9.2 Connecting with R
9.3 Test the SPSS Modeler Connection to R
9.4 Calculating New Variables in R
9.5 Model Building in R
9.6 Exercises
9.7 Solutions
Literature
10 Appendix
10.1 Data Sets Used in This Book
10.1.1 adult_income_data.txt
10.1.2 beer.sav
10.1.3 benchmark.xlsx
10.1.4 car_simple.sav
10.1.5 car_sales_modified.sav
10.1.6 chess_endgame_data.txt
10.1.7 customer_bank_data.csv
10.1.8 diabetes_data_reduced.sav
10.1.9 DRUG1n.sav
10.1.10 EEG_Sleep_Signals.csv
10.1.11 employee_dataset_001 and employee_dataset_002
10.1.12 England Payment Datasets
10.1.13 Features_eeg_signals.csv
10.1.14 gene_expression_leukemia.csv
10.1.15 gene_expression_leukemia_short.csv
10.1.16 gravity_constant_data.csv
10.1.17 Housing.data.txt
10.1.18 Iris.csv
10.1.19 IT-projects.txt
10.1.20 IT user satisfaction.sav
10.1.21 longley.csv
10.1.22 LPGA2009.csv
10.1.23 Mtcars.csv
10.1.24 nutrition_habites.sav
10.1.25 optdigits_training.txt, optdigits_test.txt
10.1.26 Orthodont.csv
10.1.27 Ozone.csv
10.1.28 pisa2012_math_q45.sav
10.1.29 sales_list.sav
10.1.30 ships.csv
10.1.31 test_scores.sav
10.1.32 Titanic.xlsx
10.1.33 tree_credit.sav
10.1.34 wine_data.txt
10.1.35 WisconsinBreastCancerData.csv
10.1.36 z_pm_customer1.sav
Literature
Prof. Dr. Tilo Wendler studied mathematics, physics and business information technology. In his doctoral thesis he examined determinants of user expectations in using information technology. He has applied complex statistical methods in the banking sector with great interest, especially in the field of rating methods. He has been teaching business statistics and data mining for ten years.

Dr. Sören Gröttrup studied mathematics and computer science with a focus on probability theory and statistics, and received his Ph.D. for his research on biological models. In parallel with his doctoral studies, he worked at a research institute as a data analyst on genomic data sets. Today, he works as a data analyst and statistician in the industrial and marketing sector.