    INFO I519 Fall 2020

    Fundamental Models and Algorithms in Bioinformatics

    INFO I519 (= I617) Fall 2020


    Time & Location:    Mon, Wed 3:15p - 4:30p (Credits: 3.0); online
                                   Computer Laboratory, Fri 9:25a - 10:40a; online
    Instructor:    Volker Brendel (205C Simon Hall); Assistant Instructor: Samer Al-Saffar (205A Simon Hall)
    Email:     VB, vbrendel@indiana.edu; SA, sialsaff@iu.edu
    WWW:     https://brendelgroup.org/
    Virtual Office Hours:     Mon, Wed after class and by appointment.
    Grades:    will be determined as described below.
    Schedule:     https://brendelgroup.org/teaching/2020/I519F20schedule.php
    Computing Resources:     You will have access to networked computer terminals during laboratory classes and will need such basic access outside of the classroom for assignments. An initial goal of the course will be to instruct you in setting up a laptop for class work and bioinformatics work in general.


    Synopsis

    Biology has become one of the primary application domains of computer science and informatics approaches. The term "Bioinformatics" covers a wide spectrum of data management and processing associated with large-scale, high-throughput biological data generation. This class will focus on biomolecular sequence data (DNA and protein) that underpin much of modern biology, including, for example, genetics; ecology, evolution, and population biology; and structural biology. Applications in medicine and biotechnology are changing our societies and world. Many of the data analysis problems in the field have been mapped to tractable mathematical models amenable to algorithmic solutions. The course will cover fundamental models and algorithms in bioinformatics, with emphasis on the general principles involved in the modeling and algorithmic approaches. The course should be of interest to you if one or more of the following apply to you: (1) You are curious and would like to learn about a "hot topic"; (2) You want to expand your range of options after graduate school; (3) You want to become or stay relevant in life science research in academia or industry; (4) You are considering a high-paying job in the biotechnology sector.

    Prerequisites

    This class is directed primarily at first- and second-year graduate students in Biology, Informatics, Computer Science, or Data Science; students of Mathematics, Statistics, and other fields may also find the course accessible and of interest. Although there are no formal prerequisites for the course, some basic knowledge of calculus and statistics will be necessary and will be reviewed as students' backgrounds require. Relevant biological concepts will be introduced as needed. Some classes will be taught as a computer lab. Students will need to be or become familiar with basic computer operational skills, including some knowledge of a programming (scripting) language. Class messages and materials, including assignments, will be shared through our Canvas site in addition to these web pages, and students are required to check both channels regularly. IU is committed to Creating a Positive Environment for teaching and learning. If you have any concerns or suggestions, please let the instructor know.

    Learning Goals

    The course seeks to provide students with a solid foundation for understanding models and algorithms in bioinformatics and to impart the basic practical skills to work on bioinformatics projects. Specific learning goals cover the following topics: (1) Basic bioinformatics data skills: Linux, scripting, R, virtual machines, containers. (2) Modeling biomolecular sequences: sequence probability spaces, principles of feature significance evaluation. (3) Pairwise sequence alignment: alignment types, representations, scoring, algorithms for determining optimal alignments. (4) Multiple sequence alignment and database searches: algorithms, index structures, statistical evaluation. (5) Basic models and approaches to molecular phylogeny: molecular clock, bifurcating trees, parsimony and distance matrix methods. (6) Hidden Markov Models for gene finding, spliced alignment, and protein motif identification: basic algorithms and applications. (7) Genome assembly, genome variation, and gene expression: introduction to problems, algorithms, and data analysis.

    Assignments and Grading

    Grades will be based on a 100-point scale: the total number of points earned on assignments (maximum 80 points) plus the final project score (maximum 20 points). A rough translation into letter grades is: >=95, A+; >=90, A; >=85, A-; >=80, B+; >=75, B; >=70, B-; and so forth. Two assignments will be given in the first week of each of the 6 basic course Modules: one will go along with the lecture material, and the other will be a computational task. Each assignment is worth a maximum of 10 points towards your course total. However, only the best 4 scores from each group of 6 assignments (lecture and computational, counted separately) will count, so your total assignment score is at most 2 x 4 x 10 = 80 points. Instead of a final examination, you will be asked to submit a final project, which will be described in Module 7: Topics in Bioinformatics Research. The project will be due at the end of the last week of classes and counts for at most 20 points.
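
    As an illustration only, the scoring rule above can be sketched in a few lines of Python. This is not an official grade calculator: the function names (best_scores_total, course_total, letter_grade) and the example scores are made up for this sketch, and because the syllabus lists letter cutoffs only down to B-, the sketch returns None for totals below 70 rather than guessing at lower grades.

```python
from typing import Optional, Sequence

MAX_ASSIGNMENT_POINTS = 10   # each assignment counts at most 10 points
COUNTED_PER_GROUP = 4        # best 4 of the 6 scores in each group are counted
MAX_PROJECT_POINTS = 20      # final project counts at most 20 points

# Rough cutoffs as listed in the syllabus; grades below B- are not spelled out.
LETTER_CUTOFFS = [(95, "A+"), (90, "A"), (85, "A-"), (80, "B+"), (75, "B"), (70, "B-")]


def best_scores_total(scores: Sequence[float], keep: int = COUNTED_PER_GROUP) -> float:
    """Sum the `keep` highest scores within one assignment group."""
    for score in scores:
        if not 0 <= score <= MAX_ASSIGNMENT_POINTS:
            raise ValueError(f"assignment score out of range: {score}")
    return sum(sorted(scores, reverse=True)[:keep])


def course_total(lecture: Sequence[float],
                 computational: Sequence[float],
                 project: float) -> float:
    """Best 4 lecture scores + best 4 computational scores + project score."""
    if not 0 <= project <= MAX_PROJECT_POINTS:
        raise ValueError(f"project score out of range: {project}")
    return best_scores_total(lecture) + best_scores_total(computational) + project


def letter_grade(total: float) -> Optional[str]:
    """Map a course total to a letter grade; None if below the listed cutoffs."""
    for cutoff, letter in LETTER_CUTOFFS:
        if total >= cutoff:
            return letter
    return None


if __name__ == "__main__":
    # Hypothetical scores for one student (six assignments per group).
    lecture = [10, 8, 0, 9, 7, 6]        # best four: 10 + 9 + 8 + 7 = 34
    computational = [9, 9, 5, 10, 0, 8]  # best four: 10 + 9 + 9 + 8 = 36
    total = course_total(lecture, computational, project=17)
    print(total, letter_grade(total))    # prints: 87 A-
```

    Note that the two lowest scores in each group, including any missed assignment entered as 0, do not affect the total.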

    Textbook

    The class is based on a draft textbook "Fundamental Models and Algorithms in Bioinformatics", V. Brendel (Indiana University) & K. Dorman (Iowa State University). Excerpts of the draft will be made available to the students as PDFs. We will make use of engaging, beautifully produced videos accompanying "Bioinformatics Algorithms - An Active Learning Approach" (Active Learning Publishers LLC) by Phillip Compeau & Pavel Pevzner. Other materials will be posted on the course web pages or our Canvas site. Students wishing to explore the biological background of class topics will find "Genomics and Personalized Medicine - What Everyone Needs to Know" (Oxford University Press) by Michael Snyder a concise, stimulating guide. For practical bioinformatics skills we strongly recommend "Bioinformatics Data Skills" by Vince Buffalo (O'Reilly); topics and examples from this book will be explored in the Computer Laboratory part of the course.