Digital Biology: A survey of topics in bioinformatics and functional genomics

BIOL-L/MLS-M 388 Spring 2022

Course description


Class times and locations

Tues, Thur 11:30a - 12:45p (Credits: 3.0); Biology Building (JH) A106

Tentative schedule (class notes can be viewed with Adobe Acrobat Reader)


Tues Jan 11 0.1


What is Digital Biology?
Topics in bioinformatics, scope of class, and resources

New virus outbreak: BBC Jan. 19, 2020
New virus outbreak: BBC Jan. 20, 2020
Precedent: The Fastest Outbreak - connection to bioinformatics work

Bioinformatics Databases and Computing Resources
National Center for Biotechnology Information (NCBI), a great starting point for "anything" bioinformatics

Mapping biological questions onto computational problems:
The modeling spiral
Sequence gazing: the famous TATA-box
Volker Brendel
Thur Jan 13 1.1

Module 1: Bioinformatics resources and workspaces

Ubuntu on Windows: ... a quick way to get to a command line terminal

Virtual Machines: VirtualBox

Linux Basics

Basic UNIX shell tutorial
Command-line bootcamp

The UNIX Shell
The UNIX Shell: Summary of Basic Commands
vi(m) editor tutorial

Volker Brendel
Tues Jan 18 1.2 Customizing your Linux work space
Getting code with wget
Volker Brendel
Thur Jan 20 1.3 Basic Linux system maintenance
Working with NCBI data
Volker Brendel
Tues Jan 25 1.4 Getting code with git Volker Brendel
Thur Jan 27 1.5 Command line access to NCBI data: EDirect Volker Brendel
Tues Feb 1 2.1

Module 2: Pairwise Sequence Alignment


How Do We Compare Biological Sequences?
from Bioinformatics Algorithms: An Active Learning Approach
Volker Brendel
Thur Feb 3 2.2 PWSA: Definition and representation of "alignments" Volker Brendel
Tues Feb 8 2.3 Global alignment (Needleman-Wunsch)
How to calculate the number of NW alignments
Volker Brendel
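
As a companion to the topic "How to calculate the number of NW alignments", here is a minimal sketch of one common way to count global alignments. It assumes an alignment is any sequence of columns of three kinds (symbol over symbol, symbol over gap, gap over symbol) and that every distinct column sequence counts separately; the class notes may use a different counting convention. The function name count_alignments is hypothetical.

```python
def count_alignments(m: int, n: int) -> int:
    """Count global alignments of two sequences of lengths m and n.

    Assumption: every distinct sequence of columns counts as a separate
    alignment, and columns pairing a gap with a gap are not allowed.
    """
    # table[i][j] = number of alignments of the first i symbols of one
    # sequence with the first j symbols of the other.
    # Row 0 and column 0 hold 1: the only option is an all-gap alignment.
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = (
                table[i - 1][j - 1]  # last column pairs two symbols
                + table[i - 1][j]    # last column: symbol over gap
                + table[i][j - 1]    # last column: gap over symbol
            )
    return table[m][n]


if __name__ == "__main__":
    for length in (1, 2, 3, 10):
        print(length, count_alignments(length, length))
```

Under this convention the count grows very quickly with sequence length, which is the usual motivation for dynamic programming instead of enumerating alignments.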
Thur Feb 10 2.4 Scoring alignments
How to calculate the optimal alignment score
(and find an optimal alignment)
Volker Brendel
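
The topic "How to calculate the optimal alignment score (and find an optimal alignment)" can be illustrated with a short dynamic-programming sketch of Needleman-Wunsch global alignment. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name needleman_wunsch are assumptions chosen for illustration, not values taken from the class notes.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (optimal score, aligned a, aligned b) for a global alignment.

    Assumption: a linear gap penalty and a simple match/mismatch score.
    """
    m, n = len(a), len(b)

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair two symbols
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[m][n], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    best, row_a, row_b = needleman_wunsch("ACGT", "AGT")
    print(best)
    print(row_a)
    print(row_b)
```

Several alignments can share the optimal score; the traceback above returns only one of them, preferring diagonal moves when there is a tie.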
Tues Feb 15 2.5 Computers within computers: VMs and Singularity/Apptainer containers

PWSA: NW algorithm with no end-gap penalties
PWSA: allowing "double-gaps"
PWSA: local alignment (Smith-Waterman)

Homework Assignment 1 posted. Due: Feb. 22
Volker Brendel
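
For the local alignment (Smith-Waterman) topic in session 2.5, the following sketch shows how the global recurrence changes: every cell is floored at zero and the answer is the maximum over all cells. The scoring values (match +2, mismatch -1, gap -1) and the function name smith_waterman_score are illustrative assumptions, and only the score is computed, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of a and b.

    Assumption: linear gap penalty; only two table rows are kept in
    memory because no traceback is performed.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets a local alignment start fresh at any cell.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```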
Thur Feb 17 2.6 Sequence analysis with scores: Concepts and statistical foundations

Practice: BLAST and Substitution scoring matrices
Biological Sequence Analysis I (Lecturer: Dr. Andy Baxevanis)

Slides for Sequence Analysis I presentation
Handout for Sequence Analysis I presentation
Volker Brendel
Tues Feb 22 3.1

Module 3: Basic Concepts in Molecular Phylogenetics

Molecular Phylogeny: Models

The powers and pitfalls of parsimony
Volker Brendel
Thur Feb 24 3.2 Lectures on molecular phylogeny
(from Bioinformatics Algorithms: An Active Learning Approach)
Volker Brendel
Tues March 1 3.3 Volker Brendel
Thur March 3 3.4

Volker Brendel
Tues March 8 4.1

Module 4: Hidden Markov Models

Hidden Markov Models: Concepts and Algorithms

Hidden Markov Models
(from Bioinformatics Algorithms: An Active Learning Approach)

Homework Assignment 2 posted. Due: March 11
Volker Brendel
Thur March 10 4.2 Hidden Markov Models: Algorithms

Review: Conditional Probability

GeneMark.hmm prokaryotic
GeneMark article
Volker Brendel
Tues March 15 Spring Break
Thur March 17 Spring Break
Tues March 22 5.1

Module 5: Genome Assembly and Annotation

Homework Assignment 3 posted. Due: March 29
Volker Brendel
Thur March 24 5.2 Genome Annotation: Evaluation

Sensitivity, specificity, and all that
sample paper

How to evaluate gene structure prediction accuracy
Volker Brendel
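
To make "Sensitivity, specificity, and all that" concrete, here is a small sketch that computes the measures from counts of true/false positives and negatives. One caveat, stated as an assumption about the linked material: gene prediction papers often report TP/(TP+FP) under the name "specificity", while the standard statistical definition is TN/(TN+FP). The sketch returns both under separate names; the function name accuracy_measures and the example counts are hypothetical.

```python
def accuracy_measures(tp, fp, tn, fn):
    """Compute basic accuracy measures from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        # Fraction of true features that were predicted.
        "sensitivity": ratio(tp, tp + fn),
        # Standard definition: fraction of true negatives recognized.
        "specificity": ratio(tn, tn + fp),
        # Fraction of predictions that are correct; often called
        # "specificity" in the gene prediction literature.
        "precision": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical nucleotide-level counts for a predicted gene structure.
    for name, value in accuracy_measures(tp=80, fp=20, tn=880, fn=20).items():
        print(f"{name}: {value:.3f}")
```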
Tues March 29 5.3 Eukaryotic gene finding:

GENSCAN; see the paper here

Genome Assembly

Illumina sequencing
Nanopore sequencing

Assembly basics
NCBI Assembly Help
How do we assemble genomes?
from Bioinformatics Algorithms: An Active Learning Approach
Volker Brendel
Thur March 31 5.4

Homework Assignment 4 posted. Due: April 5
Volker Brendel
Tues April 5 6.1

Module 6: Genetic Variation

Volker Brendel
Thur April 7 6.2

Homework Assignment 5 posted. Due: April 12
Volker Brendel
Tues April 12 6.3 Relevant file format specifications:

Variant Call Format (VCF)
Sequence Alignment Map format (SAM)
SAM flags explained
Pileup format (used by samtools)

Relevant code:

NCBI SRA Toolkit
Volker Brendel
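
As an illustration for "SAM flags explained": the FLAG field of a SAM record is an integer whose bits each encode one property of the read. The sketch below decodes a flag into its set bits, using the bit values from the SAM format specification. The description strings are paraphrased and the function name decode_sam_flag is hypothetical.

```python
# Bit values from the SAM specification; descriptions are paraphrased.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_sam_flag(flag):
    """Return descriptions of all bits set in a SAM FLAG value."""
    return [desc for bit, desc in SAM_FLAG_BITS.items() if flag & bit]


if __name__ == "__main__":
    for description in decode_sam_flag(99):
        print(description)
```

For example, 99 = 1 + 2 + 32 + 64, so the read is paired, mapped in a proper pair, has its mate on the reverse strand, and is the first read of the pair.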
Thur April 14 6.4 Gene Expression Analyses
from Bioinformatics Algorithms: An Active Learning Approach

MIT Lecture: Gene Regulatory Networks
MIT Data Science: Clustering
Volker Brendel
Tues April 19 7.1

Module 7: Protein Structure

PDB: What is a protein?
PDB: How enzymes work
Guide to PDB

PDB Molecule of the Month
NCBI Protein

Volker Brendel
Thur April 21 7.2 Peptide bond
Ramachandran plot ... very nice visualization thereof
(thanks to Prof. Eric Martz)

Secondary Structure

2StruCompare server

Jpred - secondary structure prediction



Homework Assignment 6 posted. Due: April 27
Volker Brendel
Tues April 26 8.1 Review Volker Brendel
Thur April 28 8.2 Review Volker Brendel
Thur May 5 12:40pm - 2:40pm Final Examination Students