RNA-seq Data Analysis: A Practical Approach [Hardback]

(CSC - IT Center for Science, Espoo, Finland), (SciLifeLab, Stockholm University, Sweden), (University of Helsinki, Finland), (RS-koulutus, Helsinki, Finland), (University of Eastern Finland, Kuopio)
  • Format: Hardback, 324 pages, height x width: 234x156 mm, weight: 640 g, 12 Tables, black and white; 55 Illustrations, black and white
  • Series: Chapman & Hall/CRC Computational Biology Series
  • Publication date: 19-Sep-2014
  • Publisher: CRC Press Inc
  • ISBN-10: 1466595000
  • ISBN-13: 9781466595002
"RNA-seq offers unprecedented information about transcriptome, but harnessing this information with bioinformatics tools is typically a bottleneck. This self-contained guide enables researchers to examine differential expression at gene, exon, and transcript level and to discover novel genes, transcripts, and whole transcriptomes. Each chapter starts with theoretical background, followed by descriptions of relevant analysis tools. The book also provides examples using command line tools and the R statistical environment. For non-programming scientists, the same examples are covered using open source software with a graphical user interface"--



The State of the Art in Transcriptome Analysis
RNA sequencing (RNA-seq) data offers unprecedented information about the transcriptome, but harnessing this information with bioinformatics tools is typically a bottleneck. RNA-seq Data Analysis: A Practical Approach enables researchers to examine differential expression at gene, exon, and transcript levels and to discover novel genes, transcripts, and whole transcriptomes.

Balanced Coverage of Theory and Practice
Each chapter starts with theoretical background, followed by descriptions of relevant analysis tools and practical examples. Accessible to both bioinformaticians and nonprogramming wet lab scientists, the examples illustrate the use of command-line tools, R, and other open source tools, such as the graphical Chipster software.

The Tools and Methods to Get Started in Your Lab
Taking readers through the whole data analysis workflow, this self-contained guide provides a detailed overview of the main RNA-seq data analysis methods and explains how to use them in practice. It is suitable for researchers from a wide variety of backgrounds, including biology, medicine, genetics, and computer science. The book can also be used in a graduate or advanced undergraduate course.

Reviews

"Next-generation sequencing (NGS) is without doubt among the last decade's most important technological advance in molecular biology, and RNA sequencing is its most common application, rapidly becoming an indispensable tool in drug discovery and biomarker identification. Given the complexity and fast-paced evolution of the NGS methodology, it may seem overwhelming to a novice to figure out where to get started. RNA-seq Data Analysis: A Practical Approach solves this problem: the single volume provides the reader with a wealth of details extending from the very fundamentals of NGS technology to comprehensive hands-on instructions on how to interpret your freshly baked sequencing reads. After reading this book, you will have all the necessary information to start putting RNA-seq to work answering your research questions." Dr. Satu Nahkuri, Pharma Research and Early Development, F. Hoffmann-La Roche Ltd.

"This is a fantastic book and a real resource for anyone embarking or already working in RNA-seq data analysis. It is a practical guide that provides layers of information to the reader to comprehend the different steps and options when analysing RNA-seq data. The content and style of the book are great and the authors clearly explain and provide well-rounded examples. This book stands out among others since it is very easy to follow and does not require a strong programming or statistical background. It is obvious the authors have experience with explaining and probably teaching others on how to perform RNA-seq analysis. I highly recommend this book to students, researchers, as well as trainers in RNA-seq data analysis." Dr. Maria Victoria Schneider (Vicky), The Genome Analysis Centre, UK

"It is really a very practical book for both wet lab biologists and computer scientists working on RNA-seq projects. The book is clearly written with a general introduction to RNA-seq in Chapter 1 and a brief description to RNA-seq data analysis in Chapter 2. Detailed information of computational methods, analysis pipelines, and software tools are presented in the remaining chapters with some real examples. I believe that this book will serve not only as a textbook for an introductory course of omics data analysis but also as a guideline for researchers working on RNA-seq projects." Jingchu Luo, Professor, Peking University

"RNA-seq is currently the best method for genome-wide transcriptional profiling of cells in about any organism. This book includes all the key steps, in generally the same organization, that we've found to be effective when training biologists and bioinformaticians in RNA-seq analysis. It's a good guide and reference for RNA-seq that can get analysts started (or keep them going) while avoiding much of the time-consuming literature, documentation, and Web searches at each step of the pipeline. It covers a broad (but not overwhelming) selection of popular methods without the typical bias of an author's research emphasis." George W. Bell, Ph.D., Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, USA

"In modern life sciences, it is increasingly the bioinformatics aspect that holds the essential key to successful research projects and discoveries, albeit often poorly understood, or even regarded as a black box. This is precisely the point at which RNA-seq Data Analysis: A Practical Approach comes in. This book, a brilliant compilation of all different aspects of RNA-sequencing analyses, opens up this black box and reveals all of its inner workings. It covers all the basic principles while maintaining a tight focus on the practical aspects of successful analyses, including discussing caveats and possible pitfalls. Following this spirit throughout, it is intuitively structured and easy to read, making it attractive both for researchers who want to quickly deepen their understanding of RNA-seq processing and for use as teaching material in the classroom for the scientists of the future." Manfred Grabherr, Department of Medical Biochemistry and Microbiology, Uppsala University

"I feel that this is a marvelous book and will be of invaluable use to support bioinformaticians, graduate students, and the occasional user of the RNA-seq technology. It has great depth of content and addresses the key areas of designing your study and making sense of the data. It covers the typical scenarios that a services bioinformatician is likely to encounter and I can see this book having a place on desks of scientists across academia and industry." Stephen Rudd, Head of Computational Biology, University of Queensland

"I strongly recommend RNA-seq Data Analysis: A Practical Approach to any scientist who plans to do sequencing experiments, even if he will not analyze the data by himself. The book gives in the first parts very important outlines of the sequencing technology and how it is working. Going further, the book covers all state-of-the-art techniques of RNA-seq analysis in a very profound yet clear way and will be of great value even for the advanced bioinformatician as a reference work. A must-have recommendation to everyone working in the molecular biology field. Even as a bioinformatician with over 15 years of experience in this field, I found many valuable details and some yet not known facts about sequencing data analysis." Oliver Heil, Bioinformatician, German Cancer Research Center (DKFZ)

"Many people are interested in applying RNA-seq to examine the transcriptome of their organism of interest but they are finding it difficult to apply it in their own laboratories. In this book, RNA-seq is introduced by describing the different platforms and then the reader is taken in a systematic way through the process of analyzing RNA-seq data using several free open source tools. This book will be of interest to people starting with RNA-seq." Dr. Etienne de Villiers, Bioinformatics Group Leader, KEMRI-Wellcome Trust Research Programme (KWTRP), and Centre for Tropical Medicine, Nuffield Department of Medicine, University of Oxford

Preface
Acknowledgments
Authors
Chapter 1 Introduction to RNA-seq
1.1 Introduction
1.2 Isolation of RNAs
1.3 Quality Control of RNA
1.4 Library Preparation
1.5 Major RNA-seq Platforms
1.5.1 Illumina
1.5.2 SOLiD
1.5.3 Roche 454
1.5.4 Ion Torrent
1.5.5 Pacific Biosciences
1.5.6 Nanopore Technologies
1.6 RNA-seq Applications
1.6.1 Protein Coding Gene Structure
1.6.2 Novel Protein-Coding Genes
1.6.3 Quantifying and Comparing Gene Expression
1.6.4 Expression Quantitative Trait Loci (eQTL)
1.6.5 Single-Cell RNA-seq
1.6.6 Fusion Genes
1.6.7 Gene Variations
1.6.8 Long Noncoding RNAs
1.6.9 Small Noncoding RNAs (miRNA-seq)
1.6.10 Amplification Product Sequencing (Ampli-seq)
1.7 Choosing an RNA-seq Platform
1.7.1 Eight General Principles for Choosing an RNA-seq Platform and Mode of Sequencing
1.7.1.1 Accuracy: How Accurate Must the Sequencing Be?
1.7.1.2 Reads: How Many Do I Need?
1.7.1.3 Length: How Long Must the Reads Be?
1.7.1.4 SR or PE: Single Read or Paired End?
1.7.1.5 RNA or DNA: Am I Sequencing RNA or DNA?
1.7.1.6 Material: How Much Sample Material Do I Have?
1.7.1.7 Costs: How Much Can I Spend?
1.7.1.8 Time: When Does the Work Need to Be Completed?
1.7.2 Summary
References
Chapter 2 Introduction to RNA-seq Data Analysis
2.1 Introduction
2.2 Differential Expression Analysis Workflow
2.2.1 Step 1: Quality Control of Reads
2.2.2 Step 2: Preprocessing of Reads
2.2.3 Step 3: Aligning Reads to a Reference Genome
2.2.4 Step 4: Genome-Guided Transcriptome Assembly
2.2.5 Step 5: Calculating Expression Levels
2.2.6 Step 6: Comparing Gene Expression between Conditions
2.2.7 Step 7: Visualization of Data in Genomic Context
2.3 Downstream Analysis
2.3.1 Gene Annotation
2.3.2 Gene Set Enrichment Analysis
2.4 Automated Workflows and Pipelines
2.5 Hardware Requirements
2.6 Following the Examples in the Book
2.6.1 Using Command Line Tools and R
2.6.2 Using the Chipster Software
2.6.3 Example Data Sets
2.7 Summary
References
Chapter 3 Quality Control and Preprocessing
3.1 Introduction
3.2 Software for Quality Control and Preprocessing
3.2.1 FastQC
3.2.2 PRINSEQ
3.2.3 Trimmomatic
3.3 Read Quality Issues
3.3.1 Base Quality
3.3.1.1 Filtering
3.3.1.2 Trimming
3.3.2 Ambiguous Bases
3.3.3 Adapters
3.3.4 Read Length
3.3.5 Sequence-Specific Bias and Mismatches Caused by Random Hexamer Priming
3.3.6 GC Content
3.3.7 Duplicates
3.3.8 Sequence Contamination
3.3.9 Low-Complexity Sequences and PolyA Tails
3.4 Summary
References
Chapter 4 Aligning Reads to Reference
4.1 Introduction
4.2 Alignment Programs
4.2.1 Bowtie
4.2.2 TopHat
4.2.3 STAR
4.3 Alignment Statistics and Utilities for Manipulating Alignment Files
4.4 Visualizing Reads in Genomic Context
4.5 Summary
References
Chapter 5 Transcriptome Assembly
5.1 Introduction
5.2 Methods
5.2.1 Transcriptome Assembly Is Different from Genome Assembly
5.2.2 Complexity of Transcript Reconstruction
5.2.3 Assembly Process
5.2.4 de Bruijn Graph
5.2.5 Use of Abundance Information
5.3 Data Preprocessing
5.3.1 Read Error Correction
5.3.2 SEECER
5.4 Mapping-Based Assembly
5.4.1 Cufflinks
5.4.2 Scripture
5.5 De Novo Assembly
5.5.1 Velvet + Oases
5.5.2 Trinity
5.6 Summary
References
Chapter 6 Quantitation and Annotation-Based Quality Control
6.1 Introduction
6.2 Annotation-Based Quality Metrics
6.2.1 Tools for Annotation-Based Quality Control
6.3 Quantitation of Gene Expression
6.3.1 Counting Reads per Genes
6.3.1.1 HTSeq
6.3.2 Counting Reads per Transcripts
6.3.2.1 Cufflinks
6.3.2.2 eXpress
6.3.3 Counting Reads per Exons
6.4 Summary
References
Chapter 7 RNA-seq Analysis Framework in R and Bioconductor
7.1 Introduction
7.1.1 Installing R and Add-on Packages
7.1.2 Using R
7.2 Overview of the Bioconductor Packages
7.2.1 Software Packages
7.2.2 Annotation Packages
7.2.3 Experiment Packages
7.3 Descriptive Features of the Bioconductor Packages
7.3.1 OOP Features in R
7.4 Representing Genes and Transcripts in R
7.5 Representing Genomes in R
7.6 Representing SNPs in R
7.7 Forging New Annotation Packages
7.8 Summary
References
Chapter 8 Differential Expression Analysis
8.1 Introduction
8.2 Technical vs. Biological Replicates
8.3 Statistical Distributions in RNA-seq Data
8.3.1 Biological Replication, Count Distributions, and Choice of Software
8.4 Normalization
8.5 Software Usage Examples
8.5.1 Using Cuffdiff
8.5.2 Using Bioconductor Packages: DESeq, edgeR, limma
8.5.3 Linear Models, the Design Matrix, and the Contrast Matrix
8.5.3.1 Design Matrix
8.5.3.2 Contrast Matrix
8.5.4 Preparations Ahead of Differential Expression Analysis
8.5.4.1 Starting from BAM Files
8.5.4.2 Starting from Individual Count Files
8.5.4.3 Starting from an Existing Count Table
8.5.4.4 Independent Filtering
8.5.5 Code Example for DESeq(2)
8.5.6 Visualization
8.5.7 For Reference: Code Examples for Other Bioconductor Packages
8.5.8 Limma
8.5.9 SAMSeq (samr package)
8.5.10 edgeR
8.5.11 DESeq2 Code Example for a Multifactorial Experiment
8.5.12 For Reference: edgeR Code Example
8.5.13 Limma Code Example
8.6 Summary
References
Chapter 9 Analysis of Differential Exon Usage
9.1 Introduction
9.2 Preparing the Input Files for DEXSeq
9.3 Reading Data into R
9.4 Accessing the ExonCountSet Object
9.5 Normalization and Estimation of the Variance
9.6 Test for Differential Exon Usage
9.7 Visualization
9.8 Summary
References
Chapter 10 Annotating the Results
10.1 Introduction
10.2 Retrieving Additional Annotations
10.2.1 Using an Organism-Specific Annotation Package to Retrieve Annotations for Genes
10.2.2 Using BioMart to Retrieve Annotations for Genes
10.3 Using Annotations for Ontological Analysis of Gene Sets
10.4 Gene Set Analysis in More Detail
10.4.1 Competitive Method Using GOstats Package
10.4.2 Self-Contained Method Using Globaltest Package
10.4.3 Length Bias Corrected Method
10.5 Summary
References
Chapter 11 Visualization
11.1 Introduction
11.1.1 Image File Types
11.1.2 Image Resolution
11.1.3 Color Models
11.2 Graphics in R
11.2.1 Heatmap
11.2.2 Volcano Plot
11.2.3 MA Plot
11.2.4 Idiogram
11.2.5 Visualizing Gene and Transcript Structures
11.3 Finalizing the Plots
11.4 Summary
References
Chapter 12 Small Noncoding RNAs
12.1 Introduction
12.2 MicroRNAs (miRNAs)
12.3 MicroRNA Off-Set RNAs (moRNAs)
12.4 Piwi-Associated RNAs (piRNAs)
12.5 Endogenous Silencing RNAs (endo-siRNAs)
12.6 Exogenous Silencing RNAs (exo-siRNAs)
12.7 Transfer RNAs (tRNAs)
12.8 Small Nucleolar RNAs (snoRNAs)
12.9 Small Nuclear RNAs (snRNAs)
12.10 Enhancer-Derived RNAs (eRNA)
12.11 Other Small Noncoding RNAs
12.12 Sequencing Methods for Discovery of Small Noncoding RNAs
12.12.1 microRNA-seq
12.12.2 CLIP-seq
12.12.3 Degradome-seq
12.12.4 Global Run-On Sequencing (GRO-seq)
12.13 Summary
References
Chapter 13 Computational Analysis of Small Noncoding RNA Sequencing Data
13.1 Introduction
13.2 Discovery of Small RNAs: miRDeep2
13.2.1 GFF Files
13.2.2 FASTA Files of Known miRNAs
13.2.3 Setting up the Run Environment
13.2.4 Running miRDeep2
13.2.4.1 miRDeep2 Output
13.3 miRanalyzer
13.3.1 Running miRanalyzer
13.4 miRNA Target Analysis
13.4.1 Computational Prediction Methods
13.4.2 Artificial Intelligence Methods
13.4.3 Experimental Support-Based Methods
13.5 miRNA-seq and mRNA-seq Data Integration
13.6 Small RNA Databases and Resources
13.6.1 RNA-seq Reads of miRNAs in miRBase
13.6.2 Expression Atlas of miRNAs
13.6.3 Database for CLIP-seq and Degradome-seq Data
13.6.4 Databases for miRNAs and Disease
13.6.5 General Databases for the Research Community and Resources
13.6.6 miRNAblog
13.7 Summary
References
Index
Eija Korpelainen, Jarno Tuimala, Panu Somervuo, Mikael Huss, Garry Wong