E-book: Gene Expression Data Analysis: A Statistical and Machine Learning Perspective [Taylor & Francis e-book]

Pankaj Barah (Tezpur Univ.), Jugal Kumar Kalita (University of Colorado), Dhruba Kumar Bhattacharyya (Tezpur Univ.)

Format: 360 pages, 42 Tables, black and white; 68 Line drawings, black and white; 2 Halftones, black and white; 70 Illustrations, black and white
Publication date: 22-Nov-2021
Publisher: Chapman & Hall/CRC
ISBN-13: 9780429322655

Taylor & Francis e-book
Price: 207,73 €*
* price that provides access for an unlimited number of simultaneous users for an unlimited time
Regular price: 296,75 €
You save 30%

Book homepage: https://www.taylorfrancis.com/books/9780429322655

"The book introduces phenomenal growth of data generated by increasing numbers of genome sequencing projects and other throughput technology-led experimental efforts. It provides information about various sources of gene expression data, and pre-processing, analysis, and validation of such data"--

Development of high throughput technologies in molecular biology during the last two decades has contributed to the production of tremendous amounts of data. Microarray and RNA-sequencing are two such widely used high throughput technologies for monitoring the expression patterns of thousands of genes simultaneously. Data produced from such experiments are voluminous (both in dimensionality and numbers of instances) and evolving in nature. Analysis of huge amounts of data towards the identification of interesting patterns that are relevant for a given biological question requires high performance computational infrastructure as well as efficient machine learning algorithms. Cross-communication of ideas between biologists and computer scientists remains a big challenge.

Gene Expression Data Analysis: A Statistical and Machine Learning Perspective has been written keeping a multi-disciplinary audience in mind. The book discusses gene expression data analysis from molecular biology, machine learning and statistical perspectives. Readers will be able to acquire both theoretical and practical knowledge of methods for identification of novel patterns of high biological significance. To measure the effectiveness of such algorithms, we discuss statistical and biological performance metrics that can be used in real life or in a simulated environment. This book discusses a large number of benchmark algorithms, tools, systems and repositories that are commonly used in analyzing gene expression data and validating results. This book will benefit students, researchers and practitioners in biology, medicine, and computer science by enabling them to acquire in-depth knowledge in statistical and machine learning based methods for analyzing gene expression data.

Key features:

An introduction to the Central Dogma of molecular biology and information flow in biological systems.

A systematic overview of the methods for generating gene expression data.

Background knowledge on statistical modeling and machine learning techniques.

Detailed methodology of analyzing gene expression data with an example case study.

Clustering methods for finding co-expression patterns from microarray, bulk RNA and scRNA data.

A large number of practical tools, systems and repositories that are useful for computational biologists to create, analyze and validate biologically relevant gene expression patterns.

Suitable for multi-disciplinary researchers and practitioners in computer science and biological sciences.

Acknowledgements

Authors

Preface

1 Introduction
1.1 Introduction
1.2 Central Dogma
1.3 Measuring Gene Expression
1.4 Representation of Gene Expression Data
1.5 Gene Expression Data Analysis: Applications
1.6 Machine Learning
1.7 Statistical and Biological Evaluation
1.8 Gene Expression Analysis Approaches
1.8.1 Preprocessing in Microarray and RNAseq Data
1.8.2 Co-Expressed Pattern-Finding Using Machine Learning
1.8.3 Co-Expressed Pattern-Finding Using Network-Based Approaches
1.9 Differential Co-Expression Analysis
1.10 Differential Expression Analysis
1.11 Tools and Systems for Gene Expression Data Analysis
1.11.1 (Diff) Co-Expression Analysis Tools and Systems
1.11.2 Differential Expression Analysis Tools and Systems
1.12 Contribution of This Book
1.13 Organization of This Book

2 Information Flow in Biological Systems
2.1 Concept of Systems Theory
2.1.1 A Brief History of Systems Thinking
2.1.2 Areas of Application of Systems Theory in Biology
2.2 Complexity in Biological Systems
2.2.1 Hierarchical Organization of Biological Systems from Macroscopic Levels to Microscopic Levels
2.2.2 Information Flow in Biological Systems
2.2.3 Top-Down and Bottom-Up Flow
2.3 Central Dogma of Molecular Biology
2.3.1 DNA Replication
2.3.2 Transcription
2.3.3 Translation
2.4 Ambiguity in Central Dogma
2.4.1 Reverse Transcription
2.4.2 RNA Replication
2.5 Discussion
2.5.1 Biological Information Flow from a Computer Science Perspective
2.5.2 Future Perspective

3 Gene Expression Data Generation
3.1 History of Gene Expression Data Generation
3.2 Low-Throughput Methods
3.2.1 Northern Blotting
3.2.2 Ribonuclease Protection Assay
3.2.3 qRT-PCR
3.2.4 SAGE
3.3 High-Throughput Methods
3.3.1 Microarray
3.3.2 RNA-Seq
3.3.3 Types of RNA-Seq
3.3.4 Gene Expression Data Repositories
3.3.5 Standards in Gene Expression Data
3.4 Chapter Summary

4 Statistical Foundations and Machine Learning
4.1 Introduction
4.2 Statistical Background
4.2.1 Statistical Modeling
4.2.2 Probability Distributions
4.2.3 Hypothesis Testing
4.2.4 Exact Tests
4.2.5 Common Data Distributions
4.2.6 Multiple Testing
4.2.7 False Discovery Rate
4.2.8 Maximum Likelihood Estimation
4.3 Machine Learning Background
4.3.1 Significance of Machine Learning
4.3.2 Machine Learning and Its Types
4.3.3 Supervised Learning Methods
4.3.4 Unsupervised Learning Methods
4.3.5 Outlier Mining
4.3.6 Association Rule Mining
4.4 Chapter Summary
4.4.1 Statistical Modeling
4.4.2 Supervised Learning: Classification and Regression Analysis
4.4.3 Proximity Measures
4.4.4 Unsupervised Learning: Clustering
4.4.5 Unsupervised Learning: Biclustering
4.4.6 Unsupervised Learning: Triclustering
4.4.7 Outlier Mining
4.4.8 Unsupervised Learning: Association Mining

5 Co-Expression Analysis
5.1 Introduction
5.2 Gene Co-Expression Analysis
5.2.1 Types of Gene Co-Expression
5.2.2 An Example
5.3 Measures to Identify Co-Expressed Patterns
5.4 Co-Expression Analysis Using Clustering
5.4.1 CEA Using Clustering: A Generic Architecture
5.4.2 Co-Expressed Pattern Finding Using 1-Way Clustering
5.4.3 Subspace or 2-way Clustering in Co-Expression Mining
5.4.4 Co-Expressed Pattern-Finding Using 3-Way Clustering
5.5 Network Analysis for Co-Expressed Pattern-Finding
5.5.1 Definition of CEN
5.5.2 Analyzing CENs: A Generic Architecture
5.6 Chapter Summary and Recommendations

6 Differential Expression Analysis
6.1 Introduction
6.1.1 Importance of DE Analysis
6.2 Differential Expression (DE) of a Gene
6.2.1 Differential Expression of a Gene: An Example
6.3 Differential Expression Analysis (DEA)
6.3.1 A Generic Framework
6.3.2 Preprocessing
6.3.3 DE Genes Identification
6.3.4 DE Gene Analysis
6.3.5 Statistical Validation
6.3.6 Discussion
6.4 Biomarker Identification Using DEA: A Case Study
6.4.1 Problem Definition
6.4.2 Dataset Used
6.4.3 Preprocessing
6.4.4 Framework of Analysis Used
6.4.5 Results
6.4.6 Discussion
6.5 Summary and Recommendations

7 Tools and Systems
7.1 Introduction
7.1.1 Generic Characteristics of a Systems Biology Tool
7.1.2 Target Systems Biology Activities
7.2 Systems Biology Tools
7.2.1 A Taxonomy
7.2.2 Pre-Processing Tools
7.3 Gene Expression Data Analysis Tools
7.3.1 Co-Expression Analysis
7.3.2 Differential Co-Expression Analysis
7.3.3 Differential Expression Analysis
7.4 Visualization
7.5 Validation
7.5.1 Statistical Validation
7.6 Biological Validation
7.7 Chapter Summary and Concluding Remarks

8 Concluding Remarks and Research Challenges
8.1 Concluding Remarks
8.2 Some Issues and Research Challenges

Bibliography

Glossary

Index

Pankaj Barah is an Assistant Professor in Molecular Biology and Biotechnology at Tezpur University. He received his M.Sc. degree in Bioinformatics (2006) from the University of Madras in India and his PhD in Computational Systems Biology (2013) from the Norwegian University of Science and Technology (NTNU), Trondheim, Norway. He worked as a bioinformatics scientist in the Division of Theoretical Bioinformatics at the German Cancer Research Center (DKFZ) in Heidelberg, Germany during 2015-2017. His research areas include computational systems biology, bioinformatics, evolutionary systems biology, Next Generation Sequencing (NGS), big data analytics and biological networks. He has authored 20 research articles, edited two books and written 5 book chapters. He is a recipient of the Ramalingaswami Re-entry Fellowship from the Department of Biotechnology, Government of India. Dr. Barah is currently a member of the Indian National Young Academy of Sciences.

Dhruba Kumar Bhattacharyya is a professor in Computer Science and Engineering at Tezpur University. He teaches machine learning, network security, cryptography and computational biology in UG, PG and PhD classes at Tezpur University. Professor Bhattacharyya's research areas include machine learning, network security, and bioinformatics. He has published more than 280 research articles in leading international journals and peer-reviewed conference proceedings. Dr. Bhattacharyya has authored 5 technical reference books and edited 9 technical volumes. Under his guidance, twenty students have successfully completed their Ph.D. degrees in the areas of machine learning, bioinformatics and network security. He is PI of several major research grants, including the Centre of Excellence of Ministry of HRD of Government of India under FAST instituted at Tezpur University. Professor Bhattacharyya is a Fellow of IETE and IE, India. He is also a Senior Member of IEEE. More details about Dr. Bhattacharyya can be found at http://agnigarh.tezu.ernet.in/_dkb/index.html.

Jugal Kumar Kalita teaches computer science at the University of Colorado, Colorado Springs. He received M.S. and Ph.D. degrees in computer and information science from the University of Pennsylvania in Philadelphia in 1988 and 1990, respectively. Prior to that he had received an M.Sc. from the University of Saskatchewan in Saskatoon, Canada in 1984 and a B.Tech. from the Indian Institute of Technology, Kharagpur in 1982. His expertise is in the areas of artificial intelligence and machine learning, and the application of techniques in machine learning to network security, natural language processing and bioinformatics. He has published 130 papers in journals and refereed conferences. He is the author of a book on Perl titled "On Perl: Perl for Students and Professionals". He is also a coauthor of a book titled "Network Anomaly Detection: A Machine Learning Perspective" with Dr Dhruba K Bhattacharyya. He received the Chancellor's Award at the University of Colorado, Colorado Springs, in 2011, in recognition of lifelong excellence in teaching, research and service. More details about Dr. Kalita can be found at http://www.cs.uccs.edu/_kalita.

Permalink: https://www.kriso.ee/db/9780429322655_pe.html
