Next-generation sequencing (NGS)


DNA sequencing is the process of determining the precise order of nucleotides in a DNA molecule. Sequences reported by a DNA sequencing machine are typically very short, usually between 100 and 250 bp but sometimes as short as 25 to 35 bp. These sequences have a variety of useful research applications, most of which are based on either reconstructing complete molecules (such as chromosomes or mRNAs) from smaller fragments or quantifying the relative abundance of a large number of molecules simultaneously.

The following paper provides an excellent review of DNA sequencing, highlighting the advent of NGS technologies in the last decade and their impact on genomics. Please read the paper in preparation for class on Wednesday, January 21st.

Ten years of next-generation sequencing technology

Working with NGS data

Data formats

Although DNA has a beautiful and intricate chemical structure, when working with sequence data we ignore its atomic structure and instead work at the resolution of entire nucleotides. Since there are only 4 nucleotides, we can use a tiny alphabet of 4 symbols to represent any DNA sequence: A for adenine, C for cytosine, G for guanine, and T for thymine.
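To make the 4-symbol alphabet concrete, here is a minimal Python sketch that treats a DNA sequence as an ordinary string. The function names (is_valid_dna, nucleotide_counts) are made up for this illustration and do not come from any particular library; the sketch also assumes that lowercase letters should be treated the same as uppercase.

```python
from collections import Counter

# The four symbols used to represent DNA (assumption: case-insensitive input).
DNA_ALPHABET = frozenset("ACGT")


def is_valid_dna(sequence: str) -> bool:
    """Return True if the sequence is non-empty and uses only A, C, G, and T."""
    return bool(sequence) and set(sequence.upper()) <= DNA_ALPHABET


def nucleotide_counts(sequence: str) -> dict:
    """Count how often each of the four nucleotides occurs in the sequence."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


if __name__ == "__main__":
    print(is_valid_dna("GATTACA"))       # True
    print(is_valid_dna("GATTXCA"))       # False
    print(nucleotide_counts("GATTACA"))  # {'A': 3, 'C': 1, 'G': 1, 'T': 2}
```

Because a sequence is just a string, all of the usual string operations (slicing, searching, counting) apply directly.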

Data files produced by sequencing machines contain many reads. Each read is the instrument's readout of the nucleotide sequence of a single DNA fragment, encoded in a string of As, Cs, Gs, and Ts. See this page for more information about common sequence data formats.
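As a rough illustration of what "a file containing many reads" looks like in practice, the sketch below parses reads from FASTQ, one widely used format. It assumes the simple layout in which every record occupies exactly four lines (a header starting with @, the sequence, a + separator line, and a quality string). The function name read_fastq and the example records are hypothetical, and real projects would normally rely on an established parsing library instead.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) for each 4-line FASTQ record.

    Assumption: sequences and qualities are not wrapped across lines.
    """
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        header = header.rstrip("\n")
        if not header:  # tolerate blank lines between records
            continue
        if not header.startswith("@"):
            raise ValueError(f"expected a header starting with '@', got: {header!r}")
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line carries no data we need
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality


if __name__ == "__main__":
    # Two made-up reads held in memory instead of a file on disk.
    example = io.StringIO("@read1\nACGTAC\n+\nIIIIII\n@read2\nGGCA\n+\nIIII\n")
    for name, sequence, quality in read_fastq(example):
        print(name, sequence, len(sequence))
```

To read from a real file, the same function can be given an open file handle, for example `with open(path) as handle: ...`, where path points to an uncompressed FASTQ file.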

Quality control

How can you be sure the reads reported by the sequencing instrument are from your sample of interest and not from, for example, primers or barcode sequences? Can you identify and correct any errors in the reads? Quality control is always an important first step when you are working with NGS data. See this page for more information about the types of quality control you should consider.
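One simple quality control check is to ask how many reads begin with a known primer or barcode sequence, and to trim that sequence off before further analysis. The sketch below shows the idea under strong simplifying assumptions: the primer is known in advance, it appears only at the very start of a read, and it must match exactly (no sequencing errors). The names trim_prefix and fraction_with_prefix and the primer ACGTAC are invented for this example; dedicated trimming tools handle the many cases this ignores.

```python
from typing import Iterable


def trim_prefix(read: str, primer: str) -> str:
    """Remove the primer from the start of the read if it matches exactly."""
    if primer and read.startswith(primer):
        return read[len(primer):]
    return read


def fraction_with_prefix(reads: Iterable[str], primer: str) -> float:
    """Return the fraction of reads that start with the primer (0.0 if no reads)."""
    reads = list(reads)
    if not reads:
        return 0.0
    matches = sum(1 for read in reads if read.startswith(primer))
    return matches / len(reads)


if __name__ == "__main__":
    primer = "ACGTAC"  # hypothetical primer sequence
    reads = ["ACGTACGGGTTT", "TTTGGGCCCAAA", "ACGTACAAAA", "CCCC"]
    print(fraction_with_prefix(reads, primer))        # 0.5
    print([trim_prefix(read, primer) for read in reads])
```

Reporting the fraction before trimming is a useful habit: an unexpectedly high or low value is often the first hint that something went wrong during library preparation or sequencing.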

cgss15/ngs/start.txt · Last modified: 2015/01/14 15:44 by standage
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International