Molecular Phylogeny

similar sequences

These sequences were found with a BLAST search against the Swiss-Prot database.

DDC_HUMAN

MNASEFRRRGKEMVDYMANYMEGIEGRQVYPDVEPGYLRPLIPAAAPQEPDTFEDIINDVEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPKAFLNEKAGEGGGVIQGSASEATLVALLAAR TKVIHRLQAASPELTQAAIMEKLVAYSSDQAHSSVERAGLIGGVKLKAIPSDGNFAMRASALQEALERDKAAGLIPFFMV ATLGTTTCCSFDNLLEVGPICNKEDIWLHVDAAYAGSAFICPEFRHLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKKRT DLTGAFRLDPTYLKHSHQDSGLITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVQLSHEFESLVRQDPRFEI CVEVILGLVCFRLKGSNKVNEALLQRINSAKKIHLVPCHLRDKFVLRFAICSRTVESAHVQRAWEHIKELAADVLRAERE

DDC_RAT

MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRALIPTTAPQEPETYEDIIRDIEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPEAFLAGRAGEGGGVIQGSASEATLVALLAAR TKMIRQLQAASPELTQAALMEKLVAYTSDQAHSSVERAGLIGGVKIKAIPSDGNYSMRAAALREALERDKAAGLIPFFVV VTLGTTSCCSFDNLLEVGPICNQEGVWLHIDAAYAGSAFICPEFRYLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKKRT DLTEAFNMDPVYLRHSHQDSGLITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVKLSHEFESLVRQDPRFEI CTEVILGLVCFRLKGSNQLNETLLQRINSAKKIHLVPCRLRDKFVLRFAVCSRTVESAHVQLAWEHIRDLASSVLRAEKE

DDC_COW

MNASEFRRRGKEMVDYVADYLEGIEGRQVFPDVDPGYLRPLIPTTAPQEPETFEAIIEDIEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLQLPEAFLAGEAGEGGGVIQGTASEATLVALLAAR TKVTRHLQAASPELMQAAIMEKLVAYASDQAHSSVEKAGLIGGVRLKAIPSDGKFAMRASALQEALERDKAAGLIPFFVV ATLGTTSCCSFDNLLEVGPICHEEGLWLHVDAAYAGSAFICPEFRHLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKKRT DLTGAFRLDPVYLRHSHQDSGLITDYRHWQLPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVQLSHAFEALVRQDTRFEI CAEVILGLVCFRLKGSNKLNEALLESINSAKKIHLVPCSLRDRFVLRFAICSRTVELAHVQLAWEHIQEMAATVLRAQGE EKAEIKN

DDC_MOUSE

MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRPLIPATAPQEPETYEDIIKDIEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPEAFLAGRAGEGGGVIQGSASEATLVALLAAR TKVIRQLQAASPEFTQAAIMEKLVAYTSDQAHSSVERAGLIGGIKLKAVPSDGNFSMRASALREALERDKAAGLIPFFVV ATLGTTSCCSFDNLLEVGPICNQEGVWLHIDAAYAGSAFICPEFRYLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKRRT DLTGAFNMDPVYLKHSHQDSGFITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVELSHEFESLVRQDPRFEI CTEVILGLVCFRLKGSNELNETLLQRINSAKKIHLVPCRLRDKFVLRFAVCARTVESAHVQLAWEHISDLASSVLRAEKE

DDC_CAVIA

MNASEFRRRGKEMVDYVANYLEGIESRLVYPDVEPGYLRPLIPSSAPEEPETYEDIIGDIERIIMPGVTHWNSPYFFAYF PTANSYPSMLADMLCGAISCIGFSWAASPACTELETVMLDWLGKMLRLPDAFLAGNAGMGGGVIQGSASEATLVALLAAR TKVIRRLQAASPELTQAAIMEKLVAYASDQAHSSVERAGLIGGVRMKLIPSDSNFAMRASALREALERDKAAGLIPFFVV ATLGTTNCCSFDSLLEVGPICNQEEMWLHIDAAYAGSAFICPEFRHLLDGVEFADSFNFNPHKWLLVNFDCSAMWVKQRT DLIGAFKLDPVYLKHGHQDSGLITDYRHWQIPLGRRFRSLKMWFVFRMYGIKGLQAHIRKHVQLAHEFESLVRQDPRFEI CMEVTLGLVCFRLKGSNQLNETLLKRINSARKIHLVPCHLRDKFVLRFRICSRQVESDHVQQAWQHIRQLASSVLRLERA

DDC_DROSOPHILA

MSHIPISNTIPTKQTDGNGKANISPDKLDPKVSIDMEAPEFKDFAKTMVDFIAEYLENIRERRVLPEVKPGYLKPLIPDA APEKPEKWQDVMQDIERVIMPGVTHWHSPKFHAYFPTANSYPAIVADMLSGAIACIGFTWIASPACTELEVVMMDWLGKM LELPAEFLACSGGKGGGVIQGTASESTLVALLGAKAKKLKEVKELHPEWDEHTILGKLVGYCSDQAHSSVERAGLLGGVK LRSVQSENHRMRGAALEKAIEQDVAEGLIPFYAVVTLGTTNSCAFDYLDECGPVGNKHNLWIHVDAAYAGSAFICPEYRH LMKGIESADSFNFNPHKWMLVNFDCSAMWLKDPSWVVNAFNVDPLYLKHDMQGSAPDYRHWQIPLGRRFRALKLWFVLRL YGVENLQAHIRRHCNFAKQFGDLCVADSRFELAAEINMGLVCFRLKGSNERNEALLKRINGRGHIHLVPAKIKDVYFLRM AICSRFTQSEDMEYSWKEVSAAADEMEQEQ

workflow

1) The similar sequences, in FASTA format, were run through MUSCLE, a program that aligns multiple sequences.

Note: The alignment shown is in the ClustalW format
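As a minimal sketch of the input preparation for step 1, the Python below strips the spaces that break up the pasted sequences and builds FASTA text. The helper name `to_fasta`, the 60-column line width, and the file names in the final comment are assumptions for illustration, and only the first two blocks of DDC_HUMAN from above are used to keep the example short. The MUSCLE command in the comment assumes the version 3 command line; other versions use different options.

```python
def to_fasta(records, width=60):
    """Return FASTA text for a {name: sequence} mapping (hypothetical helper)."""
    lines = []
    for name, seq in records.items():
        clean = "".join(seq.split())  # drop spaces and line breaks inside the sequence
        lines.append(">" + name)
        lines.extend(clean[i:i + width] for i in range(0, len(clean), width))
    return "\n".join(lines) + "\n"


# First two blocks of DDC_HUMAN as pasted above (separating space included).
records = {
    "DDC_HUMAN": (
        "MNASEFRRRGKEMVDYMANYMEGIEGRQVYPDVEPGYLRPLIPAAAPQEPDTFEDIINDVEKIIMPGVTHWHSPYFFAYF "
        "PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPKAFLNEKAGEGGGVIQGSASEATLVALLAAR"
    ),
}

fasta_text = to_fasta(records)
print(fasta_text)

# Assumed MUSCLE v3 invocation once the text is saved as ddc.fasta
# (both file names are hypothetical):
#   muscle -in ddc.fasta -out ddc.aln -clw
```

The `-clw` option in the commented command is what would request ClustalW-style output, matching the note above.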

2) The alignment output was then used as the input for the program protdist, which estimates the evolutionary distance between each pair of sequences. The generated matrix is shown below.
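To make the idea of a pairwise distance concrete, here is a rough sketch that computes the simplest possible measure, the p-distance (the fraction of aligned positions that differ). This is not what protdist does: protdist estimates distances under an amino acid substitution model, so its values will differ. The function name `p_distance` is hypothetical, and the example treats the first 80 residues of DDC_RAT and DDC_MOUSE as a gap-free alignment, which is an assumption rather than the MUSCLE output.

```python
def p_distance(seq_a, seq_b):
    """Fraction of gap-free aligned columns at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


# First 80 residues of DDC_RAT and DDC_MOUSE, copied from the sequences above
# and assumed (for illustration only) to align without gaps.
rat = "MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRALIPTTAPQEPETYEDIIRDIEKIIMPGVTHWHSPYFFAYF"
mouse = "MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRPLIPATAPQEPETYEDIIKDIEKIIMPGVTHWHSPYFFAYF"

print(p_distance(rat, mouse))
```

A model-based distance corrects for multiple substitutions at the same site, which is why it is preferred over the raw p-distance for more divergent pairs such as the Drosophila sequence.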

3) The protdist output then serves as the input for the program neighbor. Neighbor generates an evolutionary tree. The neighbor output is shown below.

Neighbor-Joining/UPGMA method version 3.67

Neighbor-joining method

Negative branch lengths allowed

+----DDC_CAVIA 
! 
!   +DDC_RAT   
! +-1 
! ! +DDC_MOUSE 
2-3 
! ! +--DDC_HUMAN 
! +-4 
!   +---DDC_COW   
! 
+-----------------------------DDC_DROSOP

remember: this is an unrooted tree!

Between        And            Length
-------        ---            ------

 2          DDC_CAVIA       0.07969
 2             3            0.01273
 3             1            0.03631
 1          DDC_RAT         0.02403
 1          DDC_MOUSE       0.01893
 3             4            0.01217
 4          DDC_HUMAN       0.04802
 4          DDC_COW         0.06446
 2          DDC_DROSOP      0.50365
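The branch table above can be read as a list of edges, and the distance the tree implies between any two sequences is the sum of the branch lengths along the path joining them. The sketch below, with hypothetical names `BRANCHES`, `build_graph`, and `path_length`, copies the branch lengths from the table and walks the tree to add them up.

```python
from collections import defaultdict

# (node, node, branch length) copied from the neighbor output above.
BRANCHES = [
    ("2", "DDC_CAVIA", 0.07969),
    ("2", "3", 0.01273),
    ("3", "1", 0.03631),
    ("1", "DDC_RAT", 0.02403),
    ("1", "DDC_MOUSE", 0.01893),
    ("3", "4", 0.01217),
    ("4", "DDC_HUMAN", 0.04802),
    ("4", "DDC_COW", 0.06446),
    ("2", "DDC_DROSOP", 0.50365),
]


def build_graph(branches):
    """Store the unrooted tree as an undirected adjacency mapping."""
    graph = defaultdict(dict)
    for a, b, length in branches:
        graph[a][b] = length
        graph[b][a] = length
    return graph


def path_length(graph, start, end):
    """Sum branch lengths along the unique path between two nodes of a tree."""
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == end:
            return dist
        for neighbour, length in graph[node].items():
            if neighbour != parent:  # never walk back along the branch just used
                stack.append((neighbour, node, dist + length))
    raise ValueError("no path between " + start + " and " + end)


tree = build_graph(BRANCHES)
for a, b in [("DDC_RAT", "DDC_MOUSE"), ("DDC_HUMAN", "DDC_COW"), ("DDC_HUMAN", "DDC_DROSOP")]:
    print(a, b, round(path_length(tree, a, b), 5))
```

For example, rat and mouse are joined through node 1, so their tree distance is 0.02403 + 0.01893 = 0.04296, while any path to the Drosophila sequence is dominated by its 0.50365 branch.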

4) The neighbor output was then used as the input for the program drawtree, which creates an image file of the tree. The image is shown below.

A picture of the workflow is shown below.

single nucleotide polymorphisms

DDC is on chromosome 7.

NCBI's dbSNP database is a collection of polymorphisms across species. Searching for DDC in this database returned 8535 hits. Narrowing the search to human (active) records left 2763 hits, and selecting Clinical/LSDB Submissions left only 6 hits, all of which are pathogenic. NCBI SNPs

I also searched for DDC on the 1000 Genomes site, which shows variation between individuals throughout 14 populations worldwide. There were six SNPs, all of them pathogenic. This is consistent with the clinical hits from the SNP database. NCBI 1000 genomes


b2gof14/students/taynick/7oct2014.txt · Last modified: 2014/10/14 09:44 by taynick
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International