Fundamental Models and Algorithms in Bioinformatics

INFO I519 (= I617) Fall 2024


Course description

INFO I519 FMAB

Class times and locations

Mon, Wed 3:00p - 4:15p (Credits: 3.0); Lindley Hall (LH) 023
Computer Laboratory, Fri 9:45a - 11:00a; Lindley Hall (LH) 101

Tentative schedule

DAY  DATE  LECTURE  TOPIC  LECTURER
Mon Aug 26 1.1 Motivation and Orientation: Topics in bioinformatics, prerequisites, scope of class, and resources

National Center for Biotechnology Information (NCBI), a great starting point for "anything" bioinformatics

Prerequisites - a few resources for review:
Molecular Biology: ... basic concepts
Genomics: NHGRI Tutorial

Statistics: Event probabilities, Union of events, Bayes' Theorem, Probability distributions, Expected value, Conditional probability

Data structures, algorithms, and programming:
Python for Everybody, QTPP class

Sequence Gazing: Case study: The "TATA-box" motif.

Mapping biological questions onto computational problems: The Modeling Spiral

Aim high: AlphaFold

Module 1: Bioinformatics workspaces



Ubuntu on Windows: ... a quick way to get to a command line terminal

Virtual Machines: VirtualBox, VMware

Research Desktop (RED) at IU: RED Overview, access via ThinLinc Client, access via browser
Volker Brendel
Wed Aug 28 1.2 Basic bioinformatics toolkit acquisition (Part I)

Getting code: git, GitHub, Brendel Group on GitHub

GitHub HowTo
git: working with branches
CodeFMAB

Entrez Direct
Insider's Guide: NCBI E-Utilities ... slides
EDirect Sample Code Explained

NCBI Sequence search fields
MEDLINE/PubMed search fields
Volker Brendel
Fri Aug 30 L1.1 Computer Laboratory: Linux Basics

Basic UNIX shell tutorial
The UNIX Shell
The UNIX Shell: Summary of Basic Commands
vi editor tutorial
vim editor tutorial
AI
Mon Sept 2 Labor Day: no class
Wed Sept 4 1.3 Basic bioinformatics toolkit acquisition (Part II)

Apptainer
Apptainer User Guide

Applications: Tools and Workflows
SeqKit
SeqKit Tutorial

Homework I posted
Volker Brendel
Fri Sept 6 L1.2 Practice session: Simple bioinformatics workflows on the command line. AI
Mon Sept 9 2.1

Module 2: Sequence Models and Spaces



Simple Sequence Models

Homework I due 7pm
Volker Brendel
Wed Sept 11 2.2
Markov Models for Sequences
Volker Brendel
Fri Sept 13 L2.1 Computer Laboratory: Python Basics

Python for Everybody PY4E Lessons
Python Scripting for Computational Molecular Science

J. Sundnes: Introduction to Scientific Programming with Python

Style matters ...
AI
Mon Sept 16 2.3 Applications of Markov Models I.
Sequence classification: GENMARK
Sensitivity, specificity, and all that
Volker Brendel
Wed Sept 18 2.4 Applications of Markov Models II.
Waiting time and pattern probability calculations

Homework II posted
Volker Brendel
Fri Sept 20 L2.2 Computer Laboratory: Simple coding, complex insights.

Coding random sequence generation and pattern probability calculations
AI
Mon Sept 23 3.1

Module 3: Pairwise Sequence Alignment



Models for Pairwise Sequence Alignment
Representations of alignments
NW alignments
Number of alignments (nNW algorithm)

Volker Brendel
Wed Sept 25 3.2 PWSA: gNW algorithm.

Homework II due 9:30pm
Volker Brendel
Fri Sept 27 L3.1 Computer Laboratory: Coding the nNW and gNW algorithms AI
Mon Sept 30 3.3 Algorithms for Pairwise Sequence Alignment: gSW, lSW, and other algorithms

How Do We Compare Biological Sequences?
(from Bioinformatics Algorithms: An Active Learning Approach)
Volker Brendel
Wed Oct 2 3.4 PWSA: Review and extensions.

Homework III posted
Volker Brendel
Fri Oct 4 L3.2 Computer Laboratory: Coding PWSA algorithms. AI
Mon Oct 7 4.1

Module 4: Sequence Analysis with Scores





Sequence Analysis with Scores: Theory
Volker Brendel
Wed Oct 9 4.2 Notes4.2

Homework III due 9pm, Oct. 10
Volker Brendel
Fri Oct 11 Fall Break: no class
Mon Oct 14 4.3 Sequence Analysis with Scores: Applications

Volker Brendel
Wed Oct 16 4.4 Brief review and outlook

Homework IV posted
Volker Brendel
Fri Oct 18 L4.2 Computer Lab AI
Mon Oct 21 5.1

Module 5: Hidden Markov Models





Hidden Markov Models: Motivation

Rabiner's Tutorial

Homework IV due 7pm
Volker Brendel
Wed Oct 23 5.2 Hidden Markov Models: Algorithms

Hidden Markov Models
(from Bioinformatics Algorithms: An Active Learning Approach)

Homework Va posted
Volker Brendel
Fri Oct 25 L5.1 Computer Laboratory: Coding and applications of HMM algorithms AI
Mon Oct 28 5.3 Hidden Markov Models: Applications

Application examples
GENSCAN
Profile Hidden Markov Models
TagDust, TagDust2 on GitHub

Sequence motifs: models
Biological Sequence Analysis II (Lecturer: Dr. Andy Baxevanis)
InterPro

Homework Va due 2:30pm
Wed Oct 30 5.4 Gene Finding:

GeneMark.hmm prokaryotic
GeneMark.hmm paper
GeneMark training

Sequence motifs: algorithms

HMMER
The MEME Suite
HOMER

Homework Vb to be posted
Volker Brendel
Fri Nov 1 L5.2 Computer Laboratory: Implementation of HMM algorithms AI
Mon Nov 4 6.1

Module 6: Basic Concepts of Molecular Phylogenetics





The Molecular Clock
Linus Pauling

Molecular Phylogeny: Models

The powers and pitfalls of parsimony
Volker Brendel
Wed Nov 6 6.2 Parsimony and Distance Matrix Methods

Lectures on molecular phylogeny
(from Bioinformatics Algorithms: An Active Learning Approach)

Homework Vb due 6pm
Volker Brendel
Fri Nov 8 L6.1 Computer Laboratory: Molecular Phylogeny, applications AI
Mon Nov 11 6.3 Molecular Phylogeny: Applications

MEGA
Volker Brendel
Wed Nov 13 6.4 Molecular Phylogeny: Design of a workflow for a sample research project

Homework VI posted
Volker Brendel
Fri Nov 15 L6.2 Molecular Phylogeny, practice AI
Mon Nov 18 7.1

Module 7: Genome Assembly and Annotation





DNA Sequencing
Sanger sequencing
Illumina sequencing
nanopore sequencing

Genome Resources
The Genomic Landscape circa 2016 (Lecturer: Dr. Eric Green)

Assembly basics
NCBI Assembly Help
NCBI Genome

Genome Assembly
Introduction to genome sequencing
How do we assemble genomes?
(from Bioinformatics Algorithms: An Active Learning Approach)

Volker Brendel
Wed Nov 20 7.2

Homework VI due Thur, Nov 21, 11:30am
Volker Brendel
Fri Nov 22 L7.1 Computer Laboratory: Exploration of genome sequencing by simulation and genome assembly

wgsim - read generator
SOAPdenovo2 - assembler

Homework VII posted
AI
Mon Nov 25 THANKSGIVING BREAK n/a
Wed Nov 27 THANKSGIVING BREAK n/a
Fri Nov 29 THANKSGIVING BREAK n/a
Mon Dec 2 8.1

Module 8: Genetic Variation





Genetic variation
Interpreting an individual genome

NCBI dbSNP    How To
NCBI dbVar    How To
1000 Genomes Project    Nature 491:56
Example: rs1113769

file format specifications:

Variant Call Format (VCF)
Sequence Alignment Map format (SAM)
SAM flags explained
Pileup format (used by samtools)


Relevant code:

NCBI SRA Toolkit
samtools
bwa
freebayes
Volker Brendel
Wed Dec 4 9.1

Module 9: Protein Structure





peptide bond
Ramachandran plot ... very nice visualization thereof
(thanks to Prof. Eric Martz)

PDB: What is a protein?
PDB: How enzymes work
HIV I
HIV II

Guide to PDB

PDB Molecule of the Month
NCBI Protein

Secondary Structure

Jpred - secondary structure prediction
SPIDER3

foldit
AlphaFold

Homework VII due 6pm
Volker Brendel
Fri Dec 6 L7.2 Lab7.2 AI
Mon Dec 9

Review Sessions


Volker Brendel
Wed Dec 11 Volker Brendel
Fri Dec 13 Office hour AI
Mon Dec 16 3:00-5:00pm Final Examination Students