Assignment: NGS and quality control

The exercise for this unit exposed you to some common quality analysis and quality control tools. This assignment is a follow-up to the exercise and is intended to give you additional experience evaluating NGS data. An assignment description is provided below.

  • Select a data set you want to study. If your research lab has done some sequencing, that is a great place to start. Or, if you've recently read a compelling paper that has an NGS component, check whether the authors provided an accession number for downloading the data from the NCBI SRA. You can also search the NCBI SRA for data sets from an organism of particular interest. If you still need some inspiration, consider SRR587238 (from the plant symbiotic bacterium Sinorhizobium meliloti) or SRR587238 (from the halictid bee Lasioglossum albipes).
  • Assess the quality of the data, and identify any potential issues.
  • Use one or more quality control tools to correct any issues you identify in the data. The readings for this unit include a list of software tools you should consider, but this list is by no means comprehensive.
  • Assess the quality of the cleaned-up data and indicate whether the tool completed the intended task successfully.
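
To make the assess, clean, and re-assess cycle above concrete, here is a minimal Python sketch. It is only an illustration of the idea, not a substitute for the dedicated tools in the readings. It assumes Phred+33 quality encoding and unwrapped four-line FASTQ records; the function names, the threshold of 20, and the two demo reads are all made up for this example.

```python
import io
from statistics import mean

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


def read_fastq(handle):
    """Yield (name, sequence, qualities) from an unwrapped 4-line FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, [ord(c) - PHRED_OFFSET for c in qual]


def mean_quality_by_position(records):
    """Return the mean Phred score at each read position."""
    columns = []
    for _, _, quals in records:
        for i, q in enumerate(quals):
            if i == len(columns):
                columns.append([])
            columns[i].append(q)
    return [mean(col) for col in columns]


def trim_3prime(seq, quals, threshold=20):
    """Drop bases from the 3' end until the last base meets the threshold."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return seq[:end], quals[:end]


# Two hypothetical reads whose quality drops off at the 3' end.
demo = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIII##\n"
    "@read2\nTTGCAAGC\n+\nIIIII5#!\n"
)

records = list(read_fastq(demo))
before = mean_quality_by_position(records)

trimmed = [(name, *trim_3prime(seq, quals)) for name, seq, quals in records]
after = mean_quality_by_position(trimmed)

print("mean quality by position, before trimming:", before)
print("mean quality by position, after trimming: ", after)
```

Comparing the two printed profiles is the same check you would make on before-and-after reports from a real quality analysis tool: the low-quality tail should be gone, and the remaining positions should be unchanged. To run this on real data, replace the in-memory `demo` handle with `open(...)` on your own FASTQ file.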

Use your personal wiki namespace to keep a record of this assignment: links to data, commands you entered, any relevant graphics or screenshots, and clear explanations of what you did and why. A good rule of thumb is that you should be able to give these notes to an arbitrary scientist and, assuming they have a basic understanding of genome biology and basic UNIX skills, they should be able to understand what you're trying to do and why, and to reproduce your work.

Give as much care and detail to this electronic record keeping as you would if you were keeping a physical lab notebook in an experimental lab. One added benefit of the wiki is that you can iteratively refine your records, so you can use the wiki initially to dump notes in raw form, and then go back to clean up and organize (the cleaning up and organizing is crucial).

