These sequences were found with a BLAST search against Swiss-Prot.
DDC_HUMAN
MNASEFRRRGKEMVDYMANYMEGIEGRQVYPDVEPGYLRPLIPAAAPQEPDTFEDIINDVEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPKAFLNEKAGEGGGVIQGSASEATLVALLAAR TKVIHRLQAASPELTQAAIMEKLVAYSSDQAHSSVERAGLIGGVKLKAIPSDGNFAMRASALQEALERDKAAGLIPFFMV ATLGTTTCCSFDNLLEVGPICNKEDIWLHVDAAYAGSAFICPEFRHLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKKRT DLTGAFRLDPTYLKHSHQDSGLITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVQLSHEFESLVRQDPRFEI CVEVILGLVCFRLKGSNKVNEALLQRINSAKKIHLVPCHLRDKFVLRFAICSRTVESAHVQRAWEHIKELAADVLRAERE
DDC_RAT
MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRALIPTTAPQEPETYEDIIRDIEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPEAFLAGRAGEGGGVIQGSASEATLVALLAAR TKMIRQLQAASPELTQAALMEKLVAYTSDQAHSSVERAGLIGGVKIKAIPSDGNYSMRAAALREALERDKAAGLIPFFVV VTLGTTSCCSFDNLLEVGPICNQEGVWLHIDAAYAGSAFICPEFRYLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKKRT DLTEAFNMDPVYLRHSHQDSGLITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVKLSHEFESLVRQDPRFEI CTEVILGLVCFRLKGSNQLNETLLQRINSAKKIHLVPCRLRDKFVLRFAVCSRTVESAHVQLAWEHIRDLASSVLRAEKE
DDC_COW
MNASEFRRRGKEMVDYVADYLEGIEGRQVFPDVDPGYLRPLIPTTAPQEPETFEAIIEDIEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLQLPEAFLAGEAGEGGGVIQGTASEATLVALLAAR TKVTRHLQAASPELMQAAIMEKLVAYASDQAHSSVEKAGLIGGVRLKAIPSDGKFAMRASALQEALERDKAAGLIPFFVV ATLGTTSCCSFDNLLEVGPICHEEGLWLHVDAAYAGSAFICPEFRHLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKKRT DLTGAFRLDPVYLRHSHQDSGLITDYRHWQLPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVQLSHAFEALVRQDTRFEI CAEVILGLVCFRLKGSNKLNEALLESINSAKKIHLVPCSLRDRFVLRFAICSRTVELAHVQLAWEHIQEMAATVLRAQGE EKAEIKN
DDC_MOUSE
MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRPLIPATAPQEPETYEDIIKDIEKIIMPGVTHWHSPYFFAYF PTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPEAFLAGRAGEGGGVIQGSASEATLVALLAAR TKVIRQLQAASPEFTQAAIMEKLVAYTSDQAHSSVERAGLIGGIKLKAVPSDGNFSMRASALREALERDKAAGLIPFFVV ATLGTTSCCSFDNLLEVGPICNQEGVWLHIDAAYAGSAFICPEFRYLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKRRT DLTGAFNMDPVYLKHSHQDSGFITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVELSHEFESLVRQDPRFEI CTEVILGLVCFRLKGSNELNETLLQRINSAKKIHLVPCRLRDKFVLRFAVCARTVESAHVQLAWEHISDLASSVLRAEKE
DDC_CAVIA
MNASEFRRRGKEMVDYVANYLEGIESRLVYPDVEPGYLRPLIPSSAPEEPETYEDIIGDIERIIMPGVTHWNSPYFFAYF PTANSYPSMLADMLCGAISCIGFSWAASPACTELETVMLDWLGKMLRLPDAFLAGNAGMGGGVIQGSASEATLVALLAAR TKVIRRLQAASPELTQAAIMEKLVAYASDQAHSSVERAGLIGGVRMKLIPSDSNFAMRASALREALERDKAAGLIPFFVV ATLGTTNCCSFDSLLEVGPICNQEEMWLHIDAAYAGSAFICPEFRHLLDGVEFADSFNFNPHKWLLVNFDCSAMWVKQRT DLIGAFKLDPVYLKHGHQDSGLITDYRHWQIPLGRRFRSLKMWFVFRMYGIKGLQAHIRKHVQLAHEFESLVRQDPRFEI CMEVTLGLVCFRLKGSNQLNETLLKRINSARKIHLVPCHLRDKFVLRFRICSRQVESDHVQQAWQHIRQLASSVLRLERA
DDC_DROSOPHILA
MSHIPISNTIPTKQTDGNGKANISPDKLDPKVSIDMEAPEFKDFAKTMVDFIAEYLENIRERRVLPEVKPGYLKPLIPDA APEKPEKWQDVMQDIERVIMPGVTHWHSPKFHAYFPTANSYPAIVADMLSGAIACIGFTWIASPACTELEVVMMDWLGKM LELPAEFLACSGGKGGGVIQGTASESTLVALLGAKAKKLKEVKELHPEWDEHTILGKLVGYCSDQAHSSVERAGLLGGVK LRSVQSENHRMRGAALEKAIEQDVAEGLIPFYAVVTLGTTNSCAFDYLDECGPVGNKHNLWIHVDAAYAGSAFICPEYRH LMKGIESADSFNFNPHKWMLVNFDCSAMWLKDPSWVVNAFNVDPLYLKHDMQGSAPDYRHWQIPLGRRFRALKLWFVLRL YGVENLQAHIRRHCNFAKQFGDLCVADSRFELAAEINMGLVCFRLKGSNERNEALLKRINGRGHIHLVPAKIKDVYFLRM AICSRFTQSEDMEYSWKEVSAAADEMEQEQ
1) The similar sequences, in FASTA format, were run through MUSCLE, a program that aligns multiple sequences.
Note: The alignment shown is in the ClustalW format
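The sequences listed above carry a space after every 80 residues and have no ">" header lines, so they need light cleanup before an aligner will accept them as FASTA. The following is a minimal sketch of that cleanup, not the procedure actually used here. The function name to_fasta and the 10-residue line width are illustrative assumptions, and the two example records are only the first 20 residues of DDC_HUMAN and DDC_RAT, with a space added to mimic the listing.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text.

    Any whitespace inside a sequence (such as the spaces in the listing
    above) is removed before the sequence is wrapped to `width` columns.
    """
    lines = []
    for name, seq in records:
        clean = "".join(seq.split())  # drop spaces and line breaks
        lines.append(f">{name}")
        # Slice the cleaned sequence into fixed-width chunks.
        lines.extend(clean[i:i + width] for i in range(0, len(clean), width))
    return "\n".join(lines) + "\n"


# First 20 residues of two of the sequences above (illustration only).
example = [
    ("DDC_HUMAN", "MNASEFRRRG KEMVDYMANY"),
    ("DDC_RAT", "MDSREFRRRG KEMVDYIADY"),
]
fasta_text = to_fasta(example, width=10)
print(fasta_text)
```

The resulting text can be saved to a file and passed to the aligner; the exact command-line options depend on the MUSCLE version installed.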
2) The alignment output was then used as the input for the program protdist. Protdist computes the distance between each pair of sequences. The generated matrix is shown below.
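To make the idea of a pairwise distance concrete, here is a small sketch that computes the uncorrected p-distance (the fraction of aligned positions that differ) for every pair of sequences. This is a simpler stand-in, not what protdist does: protdist applies an amino-acid substitution model to correct for multiple substitutions, so its values will differ. The function names and the 20-residue fragments, which are treated as already aligned, are assumptions made for illustration.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Fraction of differing positions between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(aligned):
    """Return a symmetric dict-of-dicts of p-distances."""
    names = list(aligned)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(aligned[a], aligned[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# First 20 residues of three sequences from the listing (illustration only).
aligned = {
    "DDC_HUMAN": "MNASEFRRRGKEMVDYMANY",
    "DDC_RAT": "MDSREFRRRGKEMVDYIADY",
    "DDC_MOUSE": "MDSREFRRRGKEMVDYIADY",
}
matrix = distance_matrix(aligned)
for name, row in matrix.items():
    print(name, [round(row[other], 3) for other in aligned])
```

On these fragments the human sequence differs from rat and mouse at 5 of 20 positions, while the rat and mouse fragments are identical.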
3) The protdist output then served as the input for the program neighbor. Neighbor generates an evolutionary tree. The neighbor output is shown below.
Neighbor-Joining/UPGMA method version 3.67
Neighbor-joining method
Negative branch lengths allowed
  +----DDC_CAVIA
  !
  !   +DDC_RAT
  ! +-1
  ! ! +DDC_MOUSE
  2-3
  ! ! +--DDC_HUMAN
  ! +-4
  !   +---DDC_COW
  !
  +-----------------------------DDC_DROSOP
remember: this is an unrooted tree!
Between        And            Length
-------        ---            ------
   2          DDC_CAVIA       0.07969
   2             3            0.01273
   3             1            0.03631
   1          DDC_RAT         0.02403
   1          DDC_MOUSE       0.01893
   3             4            0.01217
   4          DDC_HUMAN       0.04802
   4          DDC_COW         0.06446
   2          DDC_DROSOP      0.50365
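The branch table can be read as a small graph: the distance the tree implies between two sequences is the sum of the branch lengths on the path that connects them. The sketch below, with helper names chosen for illustration, stores the nine branches reported by neighbor and walks the tree to add up those lengths.

```python
from collections import defaultdict

# Branches exactly as listed in the neighbor output (internal nodes are 1-4).
BRANCHES = [
    ("2", "DDC_CAVIA", 0.07969),
    ("2", "3", 0.01273),
    ("3", "1", 0.03631),
    ("1", "DDC_RAT", 0.02403),
    ("1", "DDC_MOUSE", 0.01893),
    ("3", "4", 0.01217),
    ("4", "DDC_HUMAN", 0.04802),
    ("4", "DDC_COW", 0.06446),
    ("2", "DDC_DROSOP", 0.50365),
]


def build_graph(branches):
    """Store the unrooted tree as an undirected adjacency map."""
    graph = defaultdict(dict)
    for a, b, length in branches:
        graph[a][b] = length
        graph[b][a] = length
    return graph


def path_length(graph, start, end):
    """Sum branch lengths along the unique path from start to end."""
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == end:
            return dist
        for neighbour, length in graph[node].items():
            if neighbour != parent:  # never walk back along the same branch
                stack.append((neighbour, node, dist + length))
    raise KeyError(f"no path from {start} to {end}")


graph = build_graph(BRANCHES)
for a, b in [("DDC_RAT", "DDC_MOUSE"), ("DDC_HUMAN", "DDC_COW"),
             ("DDC_HUMAN", "DDC_DROSOP")]:
    print(a, b, round(path_length(graph, a, b), 5))
```

For example, rat and mouse are separated by 0.02403 + 0.01893 = 0.04296, while human and Drosophila are separated by 0.04802 + 0.01217 + 0.01273 + 0.50365 = 0.57657, which reflects the long branch leading to DDC_DROSOP.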
4) This data was then used as the input for the program drawtree. Drawtree creates an image file of the tree, which is shown below.
A picture of the workflow is shown below.
DDC is on chromosome 7.
NCBI's dbSNP database is a collection of polymorphisms across species. Searching for DDC in this database returned 8535 hits. Narrowing the search to humans (active) left 2763 hits, and selecting Clinical/LSDB Submissions left only 6 hits, all of which were pathogenic. NCBI SNPs
I also searched for DDC on the 1000 Genomes site, which shows variation between individuals across 14 populations worldwide. There were six SNPs, all of which were pathogenic. This is consistent with the clinical hits from the dbSNP database. NCBI 1000 genomes