Fundamental Models and Algorithms in Bioinformatics
INFO I519 (= I617) Fall 2023
Time & Location: Mon, Wed 3:00p - 4:15p (Credits: 3.0); Swain West (SW) 217
Computer Laboratory, Fri 9:45a - 11:00a; Radio-TV (TV) 186
Instructor: Volker Brendel (205C Simon Hall);
Assistant Instructor: Rakesh Santha Kumaran (TBA)
Email:
VB, vbrendel@indiana.edu;
RK, rsantha@iu.edu
WWW:
https://brendelgroup.org/
Virtual Office Hours:
Mon, Wed after class and by appointment.
Grades: will be determined as described below.
Schedule:
https://brendelgroup.org/teaching/2023/I519F23schedule.php
Computing Resources:
You will need to bring a laptop to class to participate in exercises, group activities, and scheduled quizzes. An initial goal of the course will be to instruct you in setting up a Linux environment on your laptop for class work and bioinformatics work in general. Access to IU Linux and HPC resources will be reviewed as needed.
Synopsis
Biology has become one of the primary application domains of computer science and informatics approaches. The term "Bioinformatics" covers a wide spectrum of data management and processing associated with large-scale, high-throughput biological data generation. This class will focus on biomolecular sequence data (DNA and protein) that underpin much of modern biology, including genetics and genomics; ecology, evolution, and population biology; and structural biology. Applications in medicine and biotechnology are changing our societies and world. Many of the data analysis problems in the field have been mapped to tractable mathematical models amenable to algorithmic solutions. The course will cover fundamental models and algorithms in bioinformatics, with emphasis on the general principles involved in the modeling and algorithmic approaches.
The course should be of interest to you if one or more of the following apply to you: (1) You are curious and would like to learn about a "hot topic"; (2) You want to expand your range of options for post-graduate school; (3) You want to become or stay relevant in life science research in academia or industry; (4) You are considering a high-paying job in the biotechnology sector.
Prerequisites
This class is directed primarily at first- and second-year graduate students in Biology, Informatics, Computer Science, or Data Science; students of Mathematics, Statistics, and other fields may also find the course accessible and of interest. Although there are no formal prerequisites for the course, basic concepts of calculus and statistics are foundational and will be reviewed as required by students' background. Relevant biological concepts will be introduced as needed. Some classes will be taught as a computer lab. Students will need to be or become familiar with basic computer operational skills, including the use of a programming (scripting) language. Students may find Quantitative Thinking and Python Programming to be a helpful companion class; otherwise, they are encouraged to self-learn Python.
Class messages and materials, including assignments, will be shared through our Canvas site in addition to these web pages, and students are required to regularly check these relevant communication channels. IU is committed to a positive environment for teaching and learning. If you have any concerns or suggestions, please let the instructor know. Please pay keen attention to the following:
Sexual Misconduct & Title IX: Indiana University policy prohibits sexual misconduct in any form, including sexual harassment, sexual assault, stalking, sexual exploitation, and dating and domestic violence. If you have experienced sexual misconduct, or know someone who has, the University can help. If you are seeking help and would like to speak to someone confidentially, you can make an appointment with the IU Sexual Assault Crisis Services at (812) 855-5711, or contact a Confidential Victim Advocate at (812) 856-2469 or cva@indiana.edu. It is also important that you know that University policy requires instructors to share certain information brought to their attention about potential sexual misconduct, with the campus Deputy Sexual Misconduct & Title IX Coordinator or the University Sexual Misconduct & Title IX Coordinator. In that event, those individuals will work to ensure that appropriate measures are taken and resources are made available. Protecting student privacy is of utmost concern, and information will only be shared with those that need to know to ensure the University can respond and assist. Please visit http://stopsexualviolence.iu.edu/ to learn more.
Learning Goals
The course seeks to provide students with a solid foundation for understanding models and algorithms in bioinformatics and to impart the basic practical skills to work on bioinformatics projects. Specific learning goals cover the following topics: (1) Basic bioinformatics data skills: Linux, scripting, virtual machines, containers. (2) Modeling biomolecular sequences: sequence probability spaces, principles of feature significance evaluation. (3) Pairwise sequence alignment: alignment types, representations, scoring, algorithms for determining optimal alignments. (4) Multiple sequence alignment and database searches: algorithms, index structures, statistical evaluation. (5) Basic models and approaches to molecular phylogeny: molecular clock, bifurcating trees, parsimony and distance matrix methods. (6) Hidden Markov Models for gene finding, spliced alignment, and protein motif identification: basic algorithms and applications. (7) Genome assembly, genome variation, and gene expression: introduction to problems, algorithms, and data analysis.
In order to achieve our learning goals, students are expected to attend and participate regularly in class meetings and all assignments.
Assignments and Grading
Grades will be based on a 100-point scale, derived as the total number of points gained from homework assignments (for a maximum of 60 points), a project assignment (30 points maximum), and class participation (10 points maximum). A rough translation into letter grades is: >=95, A+; >=90, A; >=85, A-; >=80, B+; >=75, B; >=70, B-; and so forth. Homework assignments will be given at the end of each of the 6 basic course modules and will be administered via Canvas. Each assignment will count a maximum of 15 points towards your course total. However, we will only count the best 4 scores from the 6 assignments. Thus, your total assignment score will be at most 60 points. This arrangement allows students to miss 2 of the 6 assignments for any reason. Other accommodations for absences will only be made in exceptional circumstances. A project assignment will be posted in the week before Thanksgiving Break and will be due before Reflection Week. The assignment is meant to give you an opportunity to develop a small project that will tie together concepts and tools learned in the class.
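The point arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator: the function names are hypothetical, a missed assignment is assumed to be entered as 0, and letter grades below B- are left undefined because the scale above only says "and so forth".

```python
# Illustrative sketch of the grading arithmetic; all names are hypothetical.

def homework_total(scores, best_n=4, max_per_assignment=15):
    """Sum the best `best_n` homework scores, each capped at 15 points."""
    capped = [min(max(s, 0), max_per_assignment) for s in scores]
    return sum(sorted(capped, reverse=True)[:best_n])


def course_total(homework_scores, project, participation):
    """Homework (max 60) + project (max 30) + participation (max 10)."""
    return (homework_total(homework_scores)
            + min(project, 30)
            + min(participation, 10))


# Cutoffs as listed in the rough letter-grade translation.
LETTER_CUTOFFS = [(95, "A+"), (90, "A"), (85, "A-"),
                  (80, "B+"), (75, "B"), (70, "B-")]


def letter_grade(total):
    """Return the letter for a total, or None below 70 (not specified)."""
    for cutoff, letter in LETTER_CUTOFFS:
        if total >= cutoff:
            return letter
    return None


# Example: one assignment missed (entered as 0), one weak score dropped.
hw = [15, 12, 0, 14, 9, 13]
total = course_total(hw, project=27, participation=9)
print(homework_total(hw), total, letter_grade(total))
```

In this example the two lowest homework scores (0 and 9) are dropped, so the homework total is 15 + 14 + 13 + 12 = 54 and the course total is 54 + 27 + 9 = 90.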
Textbook
The class is based on a draft textbook "Fundamental Models and Algorithms in Bioinformatics", V. Brendel (Indiana University) & K. Dorman (Iowa State University). Excerpts of the draft will be made available to the students as PDFs. We will also make use of engaging, beautifully produced videos accompanying "Bioinformatics Algorithms - An Active Learning Approach" (Active Learning Publishers LLC) by Phillip Compeau & Pavel Pevzner. Other materials will be posted on the course web pages or our Canvas site. Students wishing to explore the biological background of class topics will find "Genomics and Personalized Medicine - What Everyone Needs to Know" (Oxford University Press) by Michael Snyder a concise, stimulating guide. For practical bioinformatics skills we strongly recommend "Bioinformatics Data Skills" by Vince Buffalo (O'Reilly); topics and examples from this book will be explored in the Computer Laboratory part of the course.
Important Footnotes
Academic Integrity: As a student at IU, you are expected to adhere to the standards contained in the Code of Student Rights, Responsibilities, and Conduct (the Code). Academic misconduct is defined as any activity that tends to undermine the academic integrity of the institution. Academic integrity violations include: cheating, fabrication, plagiarism, interference, violation of course rules, and facilitating academic dishonesty. When you submit an assignment with your name on it, you are signifying that the work contained therein is yours, unless otherwise cited or referenced. Any ideas or materials taken from another source must be fully acknowledged. Students should not share their work with any other students. If plagiarism or other cheating occurs, both students involved will be considered responsible even if the student sharing their work was unaware that academic misconduct would occur or had occurred. Ignorance of what constitutes academic misconduct or plagiarism is not a valid excuse. In addition, posting questions from quizzes/exams or assignments or downloading answers from online sources is considered academic misconduct. All suspected violations of the Code will be reported to the Dean of Students (Office of Student Conduct) and handled according to University policies. Sanctions for academic misconduct in this course may include a failing grade on the assignment, a reduction in your final course grade, or a failing grade in the course, among other possibilities. If you are unsure about the expectations for completing an assignment or taking a test or exam, be sure to seek clarification from your instructor in advance.
Note Selling: Various commercial services have approached students regarding selling class notes/study guides to their classmates. Selling the instructor’s notes/study guides or uploading course assignments to these sites in exchange for access to materials for other courses is not permitted. Violations of this policy will be reported to the Dean of Students (Office of Student Conduct) as academic misconduct (violation of course rules). Sanctions for academic misconduct for this action may include a failing grade on the assignment for which the notes/study guides or assignments are being uploaded, a reduction in your final course grade, or a failing grade in the course, among other possibilities. Additionally, you should know that selling a faculty member’s notes/study guides individually or on behalf of one of these services, using IU email or via Canvas, may also constitute a violation of IU information technology and IU intellectual property policies; additional consequences may result.
Online Course Materials: The instructor teaching this course holds the exclusive right to distribute, modify, post, and reproduce course materials, including all written materials, study guides, lectures, assignments, exercises, and exams. Some of the course content may be downloadable, but you should not distribute, post, or alter the instructor’s intellectual property. While you are permitted to take notes on the online materials and lectures posted for this course for your personal use, you are not permitted to re-post in another forum, distribute, or reproduce content from this course without the express written permission of the instructor.
GroupMe: Please note that you may receive emails from other students about joining GroupMe for individual classes via Canvas. Even though invitations to join the group may be issued through Canvas, they do not imply the endorsement of the course instructor. While GroupMe can be an effective tool for keeping in touch with classmates and clarifying information related to the course, it can also be a source of unauthorized information sharing or collaboration among students. Collaborative efforts on assignments, quizzes, and exams, including sharing or discussing answers when the instructor has not expressly authorized collaboration, are considered cheating. If academic dishonesty occurs via GroupMe, everyone involved in the thread may be found responsible for academic misconduct, since membership in the group suggests that they have been able to view the information shared.