
E-book: Expert Systems in Chemistry Research [Taylor & Francis e-book]

(Fresenius University of Applied Sciences, Cologne, Germany)
  • Format: 416 pages, 5 Tables, black and white; 91 Illustrations, black and white
  • Publication date: 13-Dec-2007
  • Publisher: CRC Press Inc
  • ISBN-13: 9780429146350
  • Taylor & Francis e-book
  • Price: 281,59 €*
  • * Price that provides access for an unlimited number of concurrent users for an unlimited period
  • Regular price: 402,26 €
  • You save 30%
Expert systems allow scientists to access, manage, and apply data and specialized knowledge from various disciplines to their own research. Expert Systems in Chemistry Research explains the general scientific basis and computational principles behind expert systems and demonstrates how they can improve the efficiency of scientific workflows and support decision-making processes.

Focused initially on clarifying the fundamental concepts, limits, and drawbacks of using computer software to approach human decision making, the author also underscores the importance of putting theory into practice. The book highlights current capabilities for planning and monitoring experiments, scientific data management and interpretation, chemical characterization, problem solving, and methods for encoding chemical data. It also examines the challenges as well as requirements, strategies, and considerations for implementing expert systems effectively in an existing laboratory software environment.

Expert Systems in Chemistry Research covers various artificial intelligence technologies used to support expert systems, including nonlinear statistics, wavelet transforms, artificial neural networks, genetic algorithms, and fuzzy logic. This definitive text provides researchers, scientists, and engineers with a cornerstone resource for developing new applications in chemoinformatics, systems design, and other emerging fields.
Preface
Acknowledgments
Trademark Information
Chapter 1 Introduction
1.1 Introduction
1.2 What We Are Talking About
1.3 The Concise Summary
1.4 Some Initial Thoughts
References
Chapter 2 Basic Concepts of Expert Systems
2.1 What Are Expert Systems?
2.2 The Conceptual Design of an Expert System
2.3 Knowledge and Knowledge Representation
2.3.1 Rules
2.3.2 Semantic Networks
2.3.3 Frames
2.3.4 Advantages of Rules
2.3.4.1 Declarative Language
2.3.4.2 Separation of Business Logic and Data
2.3.4.3 Centralized Knowledge Base
2.3.4.4 Performance and Scalability
2.3.5 When to Use Rules
2.4 Reasoning
2.4.1 The Inference Engine
2.4.2 Forward and Backward Chaining
2.4.3 Case-Based Reasoning
2.5 The Fuzzy World
2.5.1 Certainty Factors
2.5.2 Fuzzy Logic
2.5.3 Hidden Markov Models
2.5.4 Working with Probabilities: Bayesian Networks
2.5.5 Dempster-Shafer Theory of Evidence
2.6 Gathering Knowledge: Knowledge Engineering
2.7 Concise Summary
References
Chapter 3 Development Tools for Expert Systems
3.1 Introduction
3.2 The Technical Design of Expert Systems
3.2.1 Knowledge Base
3.2.2 Working Memory
3.2.3 Inference Engine
3.2.4 User Interface
3.3 Imperative versus Declarative Programming
3.4 List Processing (LISP)
3.5 Programming Logic (PROLOG)
3.5.1 PROLOG Facts
3.5.2 PROLOG Rules
3.6 National Aeronautics and Space Administration's (NASA's) Alternative: C Language Integrated Production System (CLIPS)
3.6.1 CLIPS Facts
3.6.2 CLIPS Rules
3.7 Java-Based Expert Systems: JESS
3.8 Rule Engines: JBoss Rules
3.9 Languages for Knowledge Representation
3.9.1 Classification of Individuals and Concepts (CLASSIC)
3.9.2 Knowledge Machine
3.10 Advanced Development Tools
3.10.1 XpertRule
3.10.2 Rule Interpreter (RI)
3.11 Concise Summary
References
Chapter 4 Dealing with Chemical Information
4.1 Introduction
4.2 Structure Representation
4.2.1 Connection Tables (CTs)
4.2.2 Connectivity Matrices
4.2.3 Linear Notations
4.2.4 Simplified Molecular Input Line Entry Specification (SMILES)
4.2.5 SMILES Arbitrary Target Specification (SMARTS)
4.3 Searching for Chemical Structures
4.3.1 Identity Search versus Substructure Search
4.3.2 Isomorphism Algorithms
4.3.3 Prescreening
4.3.4 Hash Coding
4.3.5 Stereospecific Search
4.3.6 Tautomer Search
4.3.7 Specifying a Query Structure
4.4 Describing Molecules
4.4.1 Basic Requirements for Molecular Descriptors
4.4.1.1 Independency of Atom Labeling
4.4.1.2 Rotational/Translational Invariance
4.4.1.3 Unambiguous Algorithmically Computable Definition
4.4.1.4 Range of Values
4.4.2 Desired Properties of Molecular Descriptors
4.4.2.1 Reversible Encoding
4.4.3 Approaches for Molecular Descriptors
4.4.4 Constitutional Descriptors
4.4.5 Topological Descriptors
4.4.6 Topological Autocorrelation Vectors
4.4.7 Fragment-Based Coding
4.4.8 3D Molecular Descriptors
4.4.9 3D Molecular Representation Based on Electron Diffraction
4.4.10 Radial Distribution Functions
4.4.11 Finding the Appropriate Descriptor
4.5 Descriptive Statistics
4.5.1 Basic Terms
4.5.1.1 Standard Deviation (SD)
4.5.1.2 Variance
4.5.1.3 Covariance
4.5.1.4 Covariance Matrix
4.5.1.5 Eigenvalues and Eigenvectors
4.5.2 Measures of Similarity
4.5.3 Skewness and Kurtosis
4.5.4 Limitations of Regression
4.5.5 Conclusions for Investigations of Descriptors
4.6 Capturing Relationships: Principal Components
4.6.1 Principal Component Analysis (PCA)
4.6.1.1 Centering the Data
4.6.1.2 Calculating the Covariance Matrix
4.6.2 Singular Value Decomposition (SVD)
4.6.3 Factor Analysis
4.7 Transforming Descriptors
4.7.1 Fourier Transform
4.7.2 Hadamard Transform
4.7.3 Wavelet Transform
4.7.4 Discrete Wavelet Transform
4.7.5 Daubechies Wavelets
4.7.6 The Fast Wavelet Transform
4.8 Learning from Nature: Artificial Neural Networks
4.8.1 Artificial Neural Networks in a Nutshell
4.8.2 Kohonen Neural Networks: The Classifiers
4.8.3 Counterpropagation (CPG) Neural Networks: The Predictors
4.8.4 The Tasks: Classification and Modeling
4.9 Genetic Algorithms (GAs)
4.10 Concise Summary
References
Chapter 5 Applying Molecular Descriptors
5.1 Introduction
5.2 Radial Distribution Functions (RDFs)
5.2.1 Radial Distribution Function
5.2.2 Smoothing and Resolution
5.2.3 Resolution and Probability
5.3 Making Things Comparable: Postprocessing of RDF Descriptors
5.3.1 Weighting
5.3.2 Normalization
5.3.3 Remark on Linear Scaling
5.4 Adding Properties: Property-Weighted Functions
5.4.1 Static Atomic Properties
5.4.2 Dynamic Atomic Properties
5.4.3 Property Products versus Averaged Properties
5.5 Describing Patterns
5.5.1 Distance Patterns
5.5.2 Frequency Patterns
5.5.3 Binary Patterns
5.5.4 Aromatic Patterns
5.5.5 Pattern Repetition
5.5.6 Symmetry Effects
5.5.7 Pattern Matching with Binary Patterns
5.6 From the View of an Atom: Local and Restricted RDF Descriptors
5.6.1 Local RDF Descriptors
5.6.2 Atom-Specific RDF Descriptors
5.7 Straight or Detour: Distance Function Types
5.7.1 Cartesian RDF
5.7.2 Bond-Path RDF
5.7.3 Topological Path RDF
5.8 Constitution and Conformation
5.9 Constitution and Molecular Descriptors
5.10 Constitution and Local Descriptors
5.11 Constitution and Conformation in Statistical Evaluations
5.12 Extending the Dimension: Multidimensional Function Types
5.13 Emphasizing the Essential: Wavelet Transforms
5.13.1 Single-Level Transforms
5.13.2 Wavelet-Compressed Descriptors
5.14 A Tool for Generation and Evaluation of RDF Descriptors: ARC
5.14.1 Loading Structure Information
5.14.2 The Default Code Settings
5.14.3 Calculation and Investigation of a Single Descriptor
5.14.4 Calculation and Investigation of Multiple Descriptor Sets
5.14.5 Binary Comparison
5.14.6 Correlation Matrices
5.14.7 Training a Neural Network
5.14.8 Investigation of Trained Network
5.14.9 Prediction and Classification for a Test Set
5.15 Synopsis
5.15.1 Similarity and Diversity of Molecules
5.15.2 Structure and Substructure Search
5.15.3 Structure-Property Relationships
5.15.4 Structure-Activity Relationships
5.15.5 Structure-Spectrum Relationships
5.16 Concise Summary
References
Chapter 6 Expert Systems in Fundamental Chemistry
6.1 Introduction
6.2 How It Began: The DENDRAL Project
6.2.1 The Generator: CONGEN
6.2.2 The Constructor: PLANNER
6.2.3 The Testing: PREDICTOR
6.2.4 Other DENDRAL Programs
6.3 A Forerunner in Medical Diagnostics
6.4 Early Approaches in Spectroscopy
6.4.1 Early Approaches in Vibrational Spectroscopy
6.4.2 Artificial Neural Networks for Spectrum Interpretation
6.5 Creating Missing Information: Infrared Spectrum Simulation
6.5.1 Spectrum Representation
6.5.2 Compression with Fast Fourier Transform
6.5.3 Compression with Fast Hadamard Transform
6.6 From the Spectrum to the Structure: Structure Prediction
6.6.1 The Database Approach
6.6.2 Selection of Training Data
6.6.3 Outline of the Method
6.6.3.1 Preprocessing of Spectrum Information
6.6.3.2 Preprocessing of Structure Information
6.6.3.3 Generation of a Descriptor Database
6.6.3.4 Training
6.6.3.5 Prediction of the Radial Distribution Function (RDF) Descriptor
6.6.3.6 Conversion of the RDF Descriptor
6.6.4 Examples for Structure Derivation
6.6.5 The Modeling Approach
6.6.6 Improvement of the Descriptor
6.6.7 Database Approach versus Modeling Approach
6.7 From Structures to Properties
6.7.1 Searching for Similar Molecules in a Data Set
6.7.2 Molecular Diversity of Data Sets
6.7.2.1 Average Descriptor Approach
6.7.2.2 Correlation Approach
6.7.3 Prediction of Molecular Polarizability
6.8 Dealing with Localized Information: Nuclear Magnetic Resonance (NMR) Spectroscopy
6.8.1 Commercially Available Products
6.8.2 Local Descriptors for Nuclear Magnetic Resonance Spectroscopy
6.8.3 Selecting Descriptors by Evolution
6.8.4 Learning Chemical Shifts
6.8.5 Predicting Chemical Shifts
6.9 Applications in Analytical Chemistry
6.9.1 Gamma Spectrum Analysis
6.9.2 Developing Analytical Methods: Thermal Dissociation of Compounds
6.9.3 Eliminating the Unnecessary: Supporting Calibration
6.10 Simulating Biology
6.10.1 Estimation of Biological Activity
6.10.2 Radioligand Binding Experiments
6.10.3 Effective and Inhibitory Concentrations
6.10.4 Prediction of Effective Concentrations
6.10.5 Progestagen Derivatives
6.10.6 Calcium Agonists
6.10.7 Corticosteroid-Binding Globulin (CBG) Steroids
6.10.8 Mapping a Molecular Surface
6.11 Supporting Organic Synthesis
6.11.1 Overview of Existing Systems
6.11.2 Elaboration of Reactions for Organic Synthesis
6.11.3 Kinetic Modeling in EROS
6.11.4 Rules in EROS
6.11.5 Synthesis Planning: Workbench for the Organization of Data for Chemical Applications (WODCA)
6.12 Concise Summary
References
Chapter 7 Expert Systems in Other Areas of Chemistry
7.1 Introduction
7.2 Bioinformatics
7.2.1 Molecular Genetics (MOLGEN)
7.2.2 Predicting Toxicology: Deductive Estimation of Risk from Existing Knowledge (DEREK) for Windows
7.2.3 Predicting Metabolism: Meteor
7.2.4 Estimating Biological Activity: APEX-3D
7.2.5 Identifying Protein Structures
7.3 Environmental Chemistry
7.3.1 Environmental Assessment: Green Chemistry Expert System (GCES)
7.3.2 Synthetic Methodology Assessment for Reduction Techniques
7.3.3 Green Synthetic Reactions
7.3.4 Designing Safer Chemicals
7.3.5 Green Solvents/Reaction Conditions
7.3.6 Green Chemistry References
7.3.7 Dynamic Emergency Management: Real-Time Expert System (RTXPS)
7.3.8 Representing Facts: Descriptors
7.3.9 Changing Facts: Backward-Chaining Rules
7.3.10 Triggering Actions: Forward-Chaining Rules
7.3.11 Reasoning: The Inference Engine
7.3.12 A Combined Approach for Environmental Management
7.3.13 Assessing Environmental Impact: EIAxpert
7.4 Geochemistry and Exploration
7.4.1 Exploration
7.4.2 Geochemistry
7.4.3 X-Ray Phase Analysis
7.5 Engineering
7.5.1 Monitoring of Space-Based Systems: Thermal Expert System (TEXSYS)
7.5.2 Chemical Equilibrium of Complex Mixtures: CEA
7.6 Concise Summary
References
Chapter 8 Expert Systems in the Laboratory Environment
8.1 Introduction
8.2 Regulations
8.2.1 Good Laboratory Practices
8.2.1.1 Resources, Organization, and Personnel
8.2.1.2 Rules, Protocols, and Written Procedures
8.2.1.3 Characterization
8.2.1.4 Documentation
8.2.1.5 Quality Assurance
8.2.2 Good Automated Laboratory Practice (GALP)
8.2.3 Electronic Records and Electronic Signatures (21 CFR Part 11)
8.3 The Software Development Process
8.3.1 From the Requirements to the Implementation
8.3.1.1 Analyzing the Requirements
8.3.1.2 Specifying What Has to Be Done
8.3.1.3 Defining the Software Architecture
8.3.1.4 Programming
8.3.1.5 Testing the Outcome
8.3.1.6 Documenting the Software
8.3.1.7 Supporting the User
8.3.1.8 Maintaining the Software
8.3.2 The Life Cycle of Software
8.4 Knowledge Management
8.4.1 General Considerations
8.4.2 The Role of a Knowledge Management System (KMS)
8.4.3 Architecture
8.4.4 The Knowledge Quality Management Team
8.5 Data Warehousing
8.6 The Basis: Scientific Data Management Systems
8.7 Managing Samples: Laboratory Information Management Systems (LIMS)
8.7.1 LIMS Characteristics
8.7.2 Why Use a LIMS?
8.7.3 Compliance and Quality Assurance (QA)
8.7.4 The Basic LIMS
8.7.5 A Functional Model
8.7.5.1 Sample Tracking
8.7.5.2 Sample Analysis
8.7.5.3 Sample Organization
8.7.6 Planning System
8.7.7 The Controlling System
8.7.8 The Assurance System
8.7.9 What Else Can We Find in a LIMS?
8.7.9.1 Automatic Test Programs
8.7.9.2 Off-Line Client
8.7.9.3 Stability Management
8.7.9.4 Reference Substance Module
8.7.9.5 Recipe Administration
8.8 Tracking Workflows: Workflow Management Systems
8.8.1 Requirements
8.8.2 The Lord of the Runs
8.8.3 Links and Logistics
8.8.4 Supervisor and Auditor
8.8.5 Interfacing
8.9 Scientific Documentation: Electronic Laboratory Notebooks (ELNs)
8.9.1 The Electronic Scientific Document
8.9.2 Scientific Document Templates
8.9.3 Reporting with ELNs
8.9.4 Optional Tools in ELNs
8.10 Scientific Workspaces
8.10.1 Scientific Workspace Managers
8.10.2 Navigation and Organization in a Scientific Workspace
8.10.3 Using Metadata Effectively
8.10.4 Working in Personal Mode
8.10.5 Differences of Electronic Scientific Documents
8.11 Interoperability and Interfacing
8.11.1 eXtensible Markup Language (XML)-Based Technologies
8.11.1.1 Simple Object Access Protocol (SOAP)
8.11.1.2 Universal Description, Discovery, and Integration (UDDI)
8.11.1.3 Web Services Description Language (WSDL)
8.11.2 Component Object Model (COM) Technologies
8.11.3 Connecting Instruments: Interface Port Solutions
8.11.4 Connecting Serial Devices
8.11.5 Developing Your Own Connectivity: Software Development Kits (SDKs)
8.11.6 Capturing Data: Intelligent Agents
8.11.7 The Inbox Concept
8.12 Access Rights and Administration
8.13 Electronic Signatures, Audit Trails, and IP Protection
8.13.1 Signature Workflow
8.13.2 Event Messaging
8.13.3 Audit Trails and IP Protection
8.13.4 Hashing Data
8.13.5 Public Key Cryptography
8.13.5.1 Secret Key Cryptography
8.13.5.2 Public Key Cryptography
8.14 Approaches for Search and Reuse of Data and Information
8.14.1 Searching for Standard Data
8.14.2 Searching with Data Cartridges
8.14.3 Mining for Data
8.14.4 The Outline of a Data Mining Service for Chemistry
8.14.4.1 Search and Processing of Raw Data
8.14.4.2 Calculation of Descriptors
8.14.4.3 Analysis by Statistical Methods
8.14.4.4 Analysis by Artificial Neural Networks
8.14.4.5 Optimization by Genetic Algorithms
8.14.4.6 Data Storage
8.14.4.7 Expert Systems
8.15 A Bioinformatics LIMS Approach
8.15.1 Managing Biotransformation Data
8.15.2 Describing Pathways
8.15.3 Comparing Pathways
8.15.4 Visualizing Biotransformation Studies
8.15.5 Storage of Biotransformation Data
8.16 Handling Process Deviations
8.16.1 Covered Business Processes
8.16.2 Exception Recording
8.16.2.1 Basic Information Entry
8.16.2.2 Risk Assessment
8.16.2.3 Cause Analysis
8.16.2.4 Corrective Actions
8.16.2.5 Efficiency Checks
8.16.3 Complaints Management
8.16.4 Approaches for Expert Systems
8.17 Rule-Based Verification of User Input
8.17.1 Creating User Dialogues
8.17.2 User Interface Designer (UID)
8.17.3 The Final Step: Rule Generation
8.18 Concise Summary
References
Chapter 9 Outlook
9.1 Introduction
9.2 Attempting a Definition
9.3 Some Critical Considerations
9.3.1 The Comprehension Factor
9.3.2 The Resistance Factor
9.3.3 The Educational Factor
9.3.4 The Usability Factor
9.3.5 The Commercial Factor
9.4 Looking Forward
Reference
Index

