First Page

NCBI Tutorials

Interesting Proteins

Links to FASTA pages with amino acid sequences

β-lactamase SHV-1

β-lactamase SHV-2

Phylogeny

Using Mobyle @Pasteur and the BLAST Search Page

Step 1: Compile the FASTA files into a single file

Step 2: Under Program - alignment - multiple, run clustalw-multialign

Step 3: Under phylogeny - distance, run protdist

Step 4: Under phylogeny - distance, run neighbor

Step 5: Under phylogeny - display, run newicktops
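Step 1 can also be done with a short script instead of copying and pasting by hand. This is only a minimal sketch: the function name combine_fasta and the file names in the usage note are made up for illustration, and it assumes each input file is plain-text FASTA.

```python
from pathlib import Path


def combine_fasta(paths, out_path):
    """Concatenate FASTA files into one file and return the number of records."""
    records = 0
    with open(out_path, "w") as out:
        for path in paths:
            text = Path(path).read_text().strip()
            if not text:
                continue  # skip empty files
            # Every FASTA record starts with a '>' header line.
            records += sum(1 for line in text.splitlines() if line.startswith(">"))
            # Always end with a newline so the next header starts on its own line.
            out.write(text + "\n")
    return records
```

For example, combine_fasta(["shv1.fasta", "shv2.fasta"], "all_shv.fasta") would write both records to all_shv.fasta, which could then be uploaded for Step 2.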

The sequences used were:

seq-1 beta-lactamase SHV-1 [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-2 beta-lactamase SHV-2 [Escherichia coli]

MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGASERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-3 SHV-5 (plasmid) [Providencia stuartii]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGASKRGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-4 beta-lactamase SHV-141 [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGSVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGASERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-5 extended spectrum beta-lactamase SHV-120 (plasmid) [Escherichia coli]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGAGKRGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-6 SHV-154 beta-lactamase, partial [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGSVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGASKRGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-7 SHV-161 beta-lactamase [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGSVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-8 extended spectrum beta-lactamase SHV-173 [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGRVGMIEMDQASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-9 beta-lactamase SHV-29 [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGSVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGAAERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

seq-10 beta-lactamase [Klebsiella pneumoniae]

MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKV VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGG PAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMTATLRKLLTSQRLSARSQRQLLQWMVDD RVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAAL IEHWQR

Results from Step 5:

10 Populations

Neighbor-Joining/UPGMA method version 3.67

Neighbor-joining method

Negative branch lengths allowed

+seq-10    
! 
!     +seq-3     
!   +-1 
! +-2 +seq-6     
! ! ! 
! ! +seq-5     
! ! 
4-6   +seq-4     
! ! +-7 
! ! ! +seq-7     
! +-8 
!   !   +seq-2     
!   ! +-3 
!   +-5 +seq-8     
!     ! 
!     +seq-9     
! 
+seq-1     

remember: this is an unrooted tree!

Between        And            Length
-------        ---            ------

 4          seq-10          0.00344
 4             6            0.00158
 6             2            0.00187
 2             1            0.00147
 1          seq-3           0.00128
 1          seq-6           0.00215
 2          seq-5           0.00197
 6             8            0.00108
 8             7            0.00064
 7          seq-4           0.00214
 7          seq-7           0.00129
 8             5            0.00194
 5             3            0.00108
 3          seq-2           0.00227
 3          seq-8           0.00458
 5          seq-9           0.00407
 4          seq-1          -0.00001
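The branch table can be read as a list of edges, so the distance between any two sequences along the tree is the sum of the branch lengths on the path that joins them. The sketch below copies the table from the neighbor output above; the function names build_tree and path_length are my own and are not part of PHYLIP.

```python
from collections import defaultdict

# Branch table copied from the neighbor output: node, node, branch length.
BRANCHES = """
4 seq-10 0.00344
4 6 0.00158
6 2 0.00187
2 1 0.00147
1 seq-3 0.00128
1 seq-6 0.00215
2 seq-5 0.00197
6 8 0.00108
8 7 0.00064
7 seq-4 0.00214
7 seq-7 0.00129
8 5 0.00194
5 3 0.00108
3 seq-2 0.00227
3 seq-8 0.00458
5 seq-9 0.00407
4 seq-1 -0.00001
"""


def build_tree(table):
    """Turn the branch table into an undirected graph {node: {neighbor: length}}."""
    graph = defaultdict(dict)
    for line in table.strip().splitlines():
        a, b, length = line.split()
        graph[a][b] = float(length)
        graph[b][a] = float(length)
    return graph


def path_length(graph, start, end):
    """Sum the branch lengths along the single path between two nodes of a tree."""
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == end:
            return dist
        for neighbor, length in graph[node].items():
            if neighbor != parent:  # never walk back along the branch we came from
                stack.append((neighbor, node, dist + length))
    raise ValueError(f"no path from {start} to {end}")


tree = build_tree(BRANCHES)
print(round(path_length(tree, "seq-1", "seq-10"), 5))
```

For seq-1 and seq-10 the path goes through node 4 only, so the distance is -0.00001 + 0.00344 = 0.00343.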

CLUSTAL 2.0.12 multiple sequence alignment

seq-1   MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADE
seq-10  MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADE
seq-3   MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADE
seq-6   MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGSVGMIEMDLASGRTLTAWRADE
seq-5   MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADE
seq-4   MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGSVGMIEMDLASGRTLTAWRADE
seq-7   MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGSVGMIEMDLASGRTLTAWRADE
seq-2   MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGRVGMIEMDLASGRTLTAWRADE
seq-8   MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGRVGMIEMDQASGRTLTAWRADE
seq-9   MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGSVGMIEMDLASGRTLTAWRADE

seq-1   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-10  RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-3   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-6   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-5   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-4   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-7   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-2   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-8   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA
seq-9   RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA

seq-1   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-10  AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-3   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-6   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-5   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-4   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-7   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-2   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-8   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA
seq-9   AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA

seq-1   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARG
seq-10  SMTATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARG
seq-3   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGASKRGARG
seq-6   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGASKRGARG
seq-5   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGKRGARG
seq-4   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGASERGARG
seq-7   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARG
seq-2   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGASERGARG
seq-8   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARG
seq-9   SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAAERGARG

seq-1   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-10  IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-3   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-6   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-5   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-4   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-7   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-2   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-8   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
seq-9   IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR
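In CLUSTAL output, a "*" under a column means every sequence has the same residue at that position. A rough sketch of that one check is below. The function names are hypothetical, and it does not reproduce the ":" and "." marks that CLUSTAL uses for similar (but not identical) residues. The example fragment is residues 29-40 of seq-1, seq-2, and seq-4 from the alignment.

```python
def conservation_line(aligned):
    """Return '*' for each column where all sequences agree, ' ' otherwise."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*aligned) walks the alignment one column at a time.
    return "".join("*" if len(set(col)) == 1 else " " for col in zip(*aligned))


def variable_columns(aligned):
    """1-based positions of columns that are not fully conserved."""
    marks = conservation_line(aligned)
    return [i for i, mark in enumerate(marks, start=1) if mark == " "]


fragment = [
    "IKLSESQLSGRV",  # seq-1, residues 29-40
    "IKQSESQLSGRV",  # seq-2, residues 29-40
    "IKLSESQLSGSV",  # seq-4, residues 29-40
]
print(variable_columns(fragment))  # [3, 11]
```

Columns 3 and 11 of the fragment correspond to residues 31 (L or Q) and 39 (R or S) of the full sequences.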

Screenshot of the Mobyle Pipeline:

Screenshot of the Mobyle pipeline that worked:

Screenshot of the Tree:

SNP and Variant Databases

Here are the databases we were to search:

SNP database

Variant databases

Also: 1000 Genomes Project

I searched for Klebsiella pneumoniae variants

Protein Data Bank

A link to the NCBI website

A link to the RCSB PDB

PDB tutorials

I am about to do a project on SHV mutations; it is a protein of interest to me.

Here are some links to SHV enzyme in complex:

SHV-1 in complex

SHV, inhibitor resistant variant

Using the four-character IDs from the PDB, here are three proteins together:

2CJQ, 2WHO, and 4WTG
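The IDs can also be turned into download links with a few lines of code. This sketch assumes the RCSB file server pattern https://files.rcsb.org/download/<ID>.pdb, which should be checked against the RCSB site before relying on it. The function name pdb_download_url is made up, and nothing is downloaded here.

```python
import re

# Classic PDB IDs are four characters: one digit followed by three letters or digits.
PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def pdb_download_url(pdb_id):
    """Build a download URL for one PDB entry (assumed URL pattern, not verified here)."""
    pdb_id = pdb_id.strip().upper()
    if not PDB_ID.fullmatch(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


for entry in ["2CJQ", "2WHO", "4WTG"]:
    print(pdb_download_url(entry))
```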

Comments from Kasun:

So you are working on a mutation of a protein. The links work fine, and I think it's good if you can include a small note about your protein of interest (what it does, etc.).

Similarly if you know the specific mutation, you can model the mutated protein using Swiss Model as I have done. You can model any protein with Swiss Model!

Great job Adam!

Kasun (10/13/15)

Predicting Gene Structure

In this class we explored websites designed to find genes within DNA sequences

I started by searching the NCBI website for a DNA sequence

I picked human TP53 (tumor protein p53)

I ran the FASTA sequence through GENSCAN

The GeneMark program

I also tried out the Augustus program

Here is what that output looked like:

Did you search only the gene sequence, or did you include the surrounding region? Did the searches find the human TP53 gene with all of its exons?

Genome Browsers and Gene Annotation

There are many great genome browsing resources that we explored this week: FlyBase, WormBase for C. elegans, the yeast genome, and GBrowse

We also explored the use of a gene annotation tool: yrGATE

Polistes dominula xGDB instance

In Polistes, I used it anonymously

There is a useful tutorial page

I looked under gene models

The first locus ID that I picked was selected for its high coverage and low integrity score

It was: PdomGENEr1.2-07887

Then I looked under aligned proteins

Then I used a BLAST search of the amino acid sequence

Further use of the program:

I registered and submitted two annotations

They were both rejected

I tried again, and was rejected again

Comparative Genomics

The CoGe website is a great resource

First, I used OrganismView for Escherichia coli

Then I looked at the genome viewer for that organism

You can slide the view to anywhere on the genome

Then I generated a map using the SynMap function

I used Escherichia albertii and Escherichia coli as my two organisms

There are not many regions with similarity

Once you have selected the region you want to examine, the synteny of the genes can be compared side by side

This illustrates conservation and shared genes between the two species

Gene Expression Analysis

This is a link to Gene Expression Omnibus

A link to XXMotif

You must have two selections for each group, for comparison

A link to the how to page

Here is an example:

Here are the results from another search:

I also tried the dataset browser and the taxonomy browser

iPlant Collaborative

iPlant Collaborative

I selected an image

I started an xGDBvm instance

“New Machine Adam”

b2gof15/students/cab/start.txt · Last modified: 2015/11/03 23:45 by cab
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International