Beginners Guide To Bioinformatics For High Throughput Sequencing [Paperback]

(Nus, S'pore), (Nus, S'pore)
  • Format: Paperback / softback, 276 pages
  • Publication date: 29-Nov-2018
  • Publisher: World Scientific Publishing Co Pte Ltd
  • ISBN-10: 9813231661
  • ISBN-13: 9789813231665
Biologists find computing bewildering, yet they are expected to process the voluminous data produced by the machines they buy and the datasets that have accumulated in genomic databanks worldwide. It is increasingly difficult for them to avoid dealing with large volumes of data, a task that goes beyond just doing manual programming.

Most books in this realm are full of equations and complex code, but this book offers a much gentler entry point, particularly for biologists, with code snippets that readers can cut and paste and run on their Linux or Mac OS X operating system or on a cloud instance. It also provides step-by-step installation instructions that are easy to follow. Those who already work in genome sequencing and are familiar with the analysis procedures may also find this book useful for closing some knowledge gaps.

High throughput sequencing requires high throughput and high performance computing. This book provides a gentle entry to high throughput sequencing by dealing with the simple skills that the average biologist is increasingly required to master. You will find this book a breeze to read, and some of its suggestions may be new to you, something you might want to try out.
Preface
Chapter 1 Preparing Your Computing Environment
1.1 Buying Your Own Computer
1.2 Setting up a Computing Server
1.3 Establishing a Remote Connection to a Server
Chapter 2 Learning Basic Linux Commands
2.1 No Need to be a Linux Guru to use Linux Effectively
2.2 Folder (Directory) Operations
Controlling your command prompt
2.3 File Operations
2.4 Assignment of Permissions
The path
2.5 Understanding System Status
UNIX redirection and pipes
2.6 Other Useful Commands
Chapter 3 Checking Sequence Quality
3.1 Basic High-throughput Sequencing
3.2 Challenges of High Throughput Genome Sequencing
3.3 Standards of Quality Score
3.4 Quality Check
FastQC
FASTX-Toolkit
Chapter 4 Sequence Alignment
4.1 The Purpose of Sequence Alignment
Sequence assembly
4.2 Selection of the Sequence Alignment Tools
Burrows Wheeler
The BWT encoding-decoding algorithm
4.3 Actual Operation of the Sequence Alignment
Download and installation of Bowtie
Executing sequence alignment
4.4 Sequence Alignment Results File Conversion
Downloading SAMtools
4.5 Using the Genome Browser
Chapter 5 Speeding-up with GPUs
5.1 Computational Advantages of the Graphics Card
5.2 Industry Standards and Usage of GPU Computing
5.3 Practical CUDA Applications in Bioinformatics
Preparing the reference sequence
Alignment with CUSHAW2-GPU
5.4 The Reason for the Limited Success of GPUs
Chapter 6 Establishing a Research Workflow Pipeline
6.1 Automating Your Computational Workflow
6.2 Scripting Language
Script command
6.3 Testing and Debugging
Keeping track of the current project
Complementing tests of code blocks
Calculating the execution time
6.4 Implementation Case Studies
6.5 Case Study of Common Mistakes
Mistake 1: Confusing mess of relative paths
Mistake 2: Failure to change the necessary permissions
Mistake 3: The disk becomes full during execution
Mistake 4: Ignoring cross-platform shell portability considerations
Chapter 7 Using a Bioinformatics Cloud Computing Platform
7.1 Simple Introduction to the Cloud Computing Platform
7.2 Amazon Web Service
7.3 Bioinformatics Cloud Computing Platforms
Logging in to use Galaxy services
Uploading sequence data
Sequence quality testing
Execution of sequence alignment
Selecting other Galaxy servers
Design and use of research workflows
Establishing new research workflows
Sharing and publishing process
Execution of research workflows
Downloading or exporting research workflows
Importing research workflows
7.4 Installing and Setting up your Own Galaxy Server
Downloading the latest version of the Galaxy
Starting your Galaxy server
Allowing external execution
Installation of bioinformatics tools
Adding new reference sequences
Appendix Learning Regular Expressions through Practising Simple Data Processing
Regular Expressions
One character pattern match
Numbering a file and printing line number of a hit
Counting number of grepped hits
UNIX redirection using pipes
Grep and output several lines of context around the hit
Grepping for non-matching lines
Grepping for unwanted characters
Mistake of logic
Egrep or grep -E extended regular expression grep
Egrep and the character class
Egrep character class negation
Regular expression: Beginning of line anchor ^
Case-sensitive and case-insensitive grep
Regular expression: End of line anchor
More about regular expressions
Even more regular expression
Substitution with SED Awk and Perl
Using Excel to do data processing
Index