
Web Mining: A Synergic Approach Resorting to Classifications and Clustering [Hardcover]

(Government Polytechnic, Nagpur, INDIA)
  • Format: Hardback, 230 pages, height x width: 234x156 mm, weight: 590 g
  • Publication date: 11-Nov-2016
  • Publisher: River Publishers
  • ISBN-10: 8793379838
  • ISBN-13: 9788793379831
Web mining is the application of data mining techniques to extract knowledge from web data, i.e. web content, web structure, and web usage data. With the emergence of the web as the predominant, converging platform for communication, business, and scholarly information dissemination, especially in the last five years, a growing number of research groups are working on web mining in three main directions: mining of web content, web structure, and web usage. In this context there are a good number of frameworks and benchmarks for website metrics, which matter for B2B, B2C, and e-commerce in general. Owing to the popularity of the topic, a few books on the market already deal with such performance metrics and related issues. This book, however, omits these routine topics and puts more emphasis on the classification and clustering of websites, in order to arrive at a true picture of each website in light of its usability.
In a nutshell, Web Mining: A Synergic Approach Resorting to Classifications and Clustering showcases an effective methodology for classifying and clustering websites from a usability point of view. Clustering and classification are carried out with the open source tool WEKA, while the base dataset for the selected websites is generated with the free tool site-analyzer. As a case study, several commercial websites are analyzed. Dataset preparation with site-analyzer and classification in WEKA using different algorithms is one of the unique selling points of the book. The text covers the full spectrum of web mining, from its inception in data mining up to the application level.
Salient features of the book include:
  • Literature review of research work in the area of web mining
  • Business websites domain researched, and data collected using the site-analyzer tool
  • Accessibility, design, text, multimedia, and networking are assessed
  • Datasets are filtered further by selecting vital attributes which are Search Engine Optimized for processing using the Weka attributed tool
  • Datasets with labels have been classified using J48, RBFNetwork, NaïveBayes, and SMO techniques in Weka
  • A comparative analysis of all classifiers is reported
  • Commercial applications for improving website performance based on SEO are given
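The workflow outlined above (cluster the websites, treat the cluster assignments as class labels, then train and evaluate a classifier) can be illustrated with a small sketch. This is a minimal Python illustration, not the book's WEKA procedure. It assumes hand-written k-means and Gaussian naive Bayes in place of WEKA's clusterers and classifiers, and the site scores, the two feature names, and the seeding choice are hypothetical values made up for the example rather than site-analyzer output.

```python
import math

# Hypothetical (accessibility, design) scores for 12 websites.
# Made-up values for illustration only, not site-analyzer output.
sites = [
    (22, 35), (30, 41), (28, 38), (35, 45), (25, 30), (32, 36),
    (78, 70), (85, 80), (80, 76), (90, 82), (75, 72), (88, 79),
]

def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_two(points, iters=10):
    """Plain k-means with k=2, seeded (an assumption) with the first and last point."""
    centers = [points[0], points[-1]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min((0, 1), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def fit_gaussian_nb(X, y):
    """Per-class prior, feature means, and feature variances."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        # A small constant keeps variances strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for feature tuple x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for v, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Step 1: cluster the sites; the cluster index becomes the class label.
labels = kmeans_two(sites)

# Step 2: simple hold-out split (even positions train, odd positions test).
train_X, train_y = sites[0::2], labels[0::2]
test_X, test_y = sites[1::2], labels[1::2]

# Step 3: train a classifier on the cluster labels and score it on the hold-out set.
model = fit_gaussian_nb(train_X, train_y)
predictions = [predict(model, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)

print(f"cluster labels: {labels}")
print(f"hold-out accuracy: {accuracy:.2f}")
```

Comparing several classifiers, as the book reports for J48, RBFNetwork, NaïveBayes, and SMO, would amount to repeating step 3 with a different fit/predict pair on the same cluster-derived labels.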
Preface
Acknowledgment
List of Figures
List of Tables
List of Graphs
List of Abbreviations
1 Introduction
1.1 Basic Notion of Data Mining
1.2 Knowledge Discovery: The Very Rationale Behind Data Mining
1.3 Challenges in the Development of Data Mining
1.3.1 Scalability
1.3.2 High Dimensionality
1.3.3 Heterogeneous and Complex Data
1.3.4 Data Ownership and Distribution
1.3.5 Non-Traditional Analysis
1.4 Importance of Data Mining
1.5 Classification of Data Mining Systems
1.5.1 The Databases Mined
1.5.2 The Knowledge Mined
1.5.3 The Techniques Utilized
1.5.4 The Application Adopted
1.6 Generic Architecture of Data Mining System
1.7 Major Issues in Data Mining
1.7.1 Mining Methodology and User Interaction Issues
1.7.2 Performance Issues
1.7.3 Issues Relating to the Diversity of Database Types
1.8 Data Mining Strategies
1.8.1 Classification
1.8.2 Association
1.8.3 Clustering
1.8.3.1 k-Means algorithm
1.8.4 Estimation
1.9 Data Mining: Ever Increasing Range of Applications
1.9.1 Games
1.9.2 Business
1.9.3 Science and Engineering
1.9.4 Human Rights
1.9.5 Medical Data Mining
1.9.6 Spatial Data Mining
1.9.7 Challenges in Spatial Mining
1.9.8 Temporal Data Mining
1.9.9 Sensor Data Mining
1.9.10 Visual Data Mining
1.9.11 Music Data Mining
1.9.12 Pattern Mining
1.9.13 Subject-based Data Mining
1.9.14 Knowledge Grid
1.10 Trends in Data Mining
1.10.1 Application Exploration
1.10.2 Scalable and Interactive Data Mining Methods
1.10.3 Integration of Data Mining with Database Systems, Data Warehouse Systems, and Web Database Systems
1.10.4 Standardization of Data Mining Query Language
1.10.5 Visual Data Mining
1.10.6 New Methods for Mining Complex Types of Data
1.10.7 Biological Data Mining
1.10.8 Data Mining and Software Engineering
1.10.9 Web Mining
1.10.10 Distributed Data Mining
1.10.11 Real-Time Data Mining
1.10.12 Multi-Database Data Mining
1.10.13 Privacy Protection and Information Security in Data Mining
1.11 Classification Techniques in Data Mining
1.11.1 Definition of the Classification
1.11.2 Issues Regarding Classification
1.11.3 Evaluation Methods for Classification
1.11.4 Classifications Techniques
1.11.4.1 Tree structure
1.11.4.2 Rule-based algorithm
1.11.4.3 Distance-based algorithms
1.11.4.4 Neural networks-based algorithms
1.11.4.5 Statistical-based algorithms
1.12 Applications of Classifications
1.12.1 Target Marketing
1.12.2 Disease Diagnosis
1.12.3 Supervised Event Detection
1.12.4 Multimedia Data Analysis
1.12.5 Biological Data Analysis
1.12.6 Document Categorization and Filtering
1.12.7 Social Network Analysis
1.13 WEKA: An Effective Tool for Data Mining
1.13.1 Main Features of the Weka
1.13.2 Weka Interface
1.13.3 Weka for Classification
1.13.3.1 Selecting a classifier
1.13.3.2 Test options
1.14 What We Aim to Cover Through the Present Book
2 Current Literature Assessment in Data and Web Mining
2.1 Big Data and Its Mining
2.2 Data-Processing Basics
2.3 Data Mining
2.4 Pioneering Work
2.5 Algorithms Used in Data Mining
2.6 Classification and Mining
2.7 Performance Metrics of Classification/Mining
2.8 Data Mining for Web
2.9 Categories of Web Data Mining
2.10 Radial Basis Function Networks
2.11 J48 Decision Tree
2.12 Naive Bayes
2.13 Support Vector Machine (SVM)
2.14 Conclusion and Way Forward
3 DataSet Creation for Web Mining
3.1 Introduction
3.2 Web Mining - Emerging Model of Business
3.2.1 Introduction to Web Mining
3.3 Tools Used for Acquisition of Parameters
3.3.1 Accessibility
3.3.2 Design
3.3.3 Texts
3.3.4 Multimedia
3.3.5 Networking
3.4 Difficulties Encountered
3.4.1 Internet Problem
3.4.2 Preparation and Selection of Websites
3.4.3 Difficulty in Selecting Analysis Tool
3.4.4 Unavailability of Data
3.5 Flowchart
3.6 Freezing Parameters
3.6.1 Data Preprocessing
3.6.1.1 Data Preprocessing Techniques
3.6.2 Preprocessing and Filtering
3.6.2.1 Preprocessed and Filtered Overall Data
3.6.2.2 Preprocessed and Filtered Web Accessibility Data
3.6.2.3 Preprocessed and Filtered Design Data
3.6.2.4 Preprocessed and Filtered Texts Data
3.6.2.5 Preprocessed and Filtered Multimedia Data
3.6.2.6 Preprocessed and Filtered Networking Data
3.7 Way Forward
4 Classification of Websites
4.1 Introduction
4.1.1 Accessibility
4.1.2 Design
4.1.3 Texts
4.1.4 Multimedia
4.1.5 Networking
4.2 Classification of Websites on Accessibility
4.2.1 Dataset
4.2.2 Clustering
4.2.3 Clustered Instances
4.2.4 Classification Via Clustering
4.2.4.1 Classification via clustering using J48 algorithm
4.2.4.2 Classification via clustering using RBFNetwork algorithm
4.2.4.3 Classification via clustering using NaiveBayes algorithm
4.2.4.4 Classification via clustering using SMO algorithm
4.2.4.5 Comparison of above classification algorithms
4.3 Classification Based on Website Design
4.3.1 Attribute Selection
4.3.2 Clustering
4.3.3 Cluster Analysis
4.3.4 Classification Through Clustering
4.3.4.1 Classification via clustering using J48 algorithm
4.3.4.2 Classification via clustering using RBFNetwork algorithm
4.3.4.3 Classification via clustering using NaiveBayes algorithm
4.3.4.4 Classification via clustering using SMO algorithm
4.3.4.5 Comparison of above classification algorithms
4.4 Classification Based on Text
4.4.1 Feature Selection
4.4.2 Clustering
4.4.3 Cluster Analysis
4.4.4 Classification Through Clustering
4.4.4.1 Classification via clustering using J48 algorithm
4.4.4.2 Classification via clustering using RBFNetwork algorithm
4.4.4.3 Classification via clustering using NaiveBayes algorithm
4.4.4.4 Classification via clustering using SMO algorithm
4.4.4.5 Comparison of above classification algorithms
4.5 Classification Based on Multimedia Content of Websites
4.5.1 Feature Selection
4.5.2 Clustering
4.5.3 Cluster Analysis
4.5.4 Classification Through Clustering
4.5.4.1 Classification via clustering using J48 algorithm
4.5.4.2 Classification via clustering using RBFNetwork algorithm
4.5.4.3 Classification via clustering using NaiveBayes algorithm
4.5.4.4 Classification via clustering using SMO algorithm
4.5.4.5 Comparison of above classification algorithm
4.6 Classification Based on Network Analysis of Webpage
4.6.1 Feature Selection
4.6.2 Clustering
4.6.3 Observations
4.6.4 Classification Through Clustering
4.6.4.1 Classification via clustering using J48 algorithm
4.6.4.2 Classification via clustering using RBFNetwork algorithm
4.6.4.3 Classification via clustering using NaiveBayes algorithm
4.6.4.4 Classification via clustering using SMO algorithm
4.6.4.5 Comparison of the above classification algorithm
4.7 Classification of Websites Using Overall Performance
4.7.1 Clustering
4.7.2 Cluster Analysis
4.7.3 Classification Via Clustering
4.7.3.1 Classification via clustering using J48 algorithm
4.7.3.2 Classification via clustering using RBFNetwork algorithm
4.7.3.3 Classification via clustering using NaiveBayes algorithm
4.7.3.4 Classification via clustering using SMO algorithm
4.7.3.5 Comparison of the above classification algorithms
4.8 Results at a Glance and Conclusion
4.9 Summary and Future Directions
Index
About the Authors
V.S. Kumbhar, K. S. Oza, R.K. Kamat