
Superfamily database

SUPERFAMILY is a database and search platform of structural and functional annotation for all proteins and genomes.[1][2][3][4][5][6][7] It classifies amino acid sequences into known structural domains, especially into SCOP superfamilies.[8][9] Domains are functional, structural, and evolutionary units that form proteins. Domains of common ancestry are grouped into superfamilies. The domains and domain superfamilies are defined and described in SCOP.[8][10] Superfamilies are groups of proteins that have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology.[11]

SUPERFAMILY
Content
Description: The SUPERFAMILY database provides structural and functional annotation for all proteins and genomes.
Data types captured: Protein families, genome annotation, alignments, hidden Markov models (HMMs)
Organisms: all
Contact
Research center: University of Bristol
Laboratory: Julian Gough, Cyrus Chothia
Primary citation: PMID 19036790
Access
Data format: FASTA format
Website: supfam.org
Download URL: supfam.org/SUPERFAMILY/downloads.html
Miscellaneous
License: GNU General Public License
Version: 1.75

Annotations

The SUPERFAMILY annotation is based on a collection of hidden Markov models (HMMs), which represent structural protein domains at the SCOP superfamily level.[12][13] A superfamily groups together domains that have an evolutionary relationship. The annotation is produced by scanning protein sequences from completely sequenced genomes against the hidden Markov models.

For each protein you can:

  • Submit sequences for SCOP classification
  • View domain organisation, sequence alignments and protein sequence details

For each genome you can:

  • Examine superfamily assignments, phylogenetic trees, domain organisation lists and networks
  • Check for over- and under-represented superfamilies within a genome

For each superfamily you can:

  • Inspect SCOP classification, functional annotation, Gene Ontology annotation,[6][14] InterPro abstract and genome assignments
  • Explore taxonomic distribution of a superfamily across the tree of life

All annotation, models and the database dump are freely available for download to everyone.

Features

Sequence Search

Submit a protein or DNA sequence for SCOP superfamily and family level classification using the SUPERFAMILY HMMs. Sequences can be submitted either by raw input or by uploading a file, but all must be in FASTA format. Sequences can be amino acids, a fixed frame nucleotide sequence, or all frames of a submitted nucleotide sequence. Up to 1000 sequences can be run at a time.
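The input constraints described above (FASTA format, a choice of reading frames, at most 1000 sequences per run) can be illustrated with a minimal Python sketch. It is not SUPERFAMILY's own code: the function names parse_fasta, batches, translate and six_frames are hypothetical, and the translation assumes the standard genetic code.

```python
from itertools import product

MAX_BATCH = 1000  # per-run limit stated in the text above

# Standard genetic code, with codons enumerated in TCAG order (assumption:
# the standard code applies to the submitted nucleotide sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records, size=MAX_BATCH):
    """Split records into consecutive lists of at most `size` entries."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def translate(dna):
    """Translate one reading frame; unrecognised codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return the three forward and three reverse-complement translations."""
    reverse = dna.upper().translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, reverse) for offset in range(3)]
```

A fixed-frame submission would correspond to a single translate call, while the all-frames option corresponds to six_frames. A file with more than 1000 records would be split by batches into several runs.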

Keyword Search

Search the database using a superfamily, family, or species name, or a sequence, SCOP, PDB, or HMM ID. A successful search yields the classes, folds, superfamilies, families, and individual proteins matching the query.

Domain Assignments

The database has domain assignments, alignments, and architectures for completely sequenced eukaryotic and prokaryotic organisms, plus sequence collections.

Comparative Genomics Tools

Browse unusual (over- and under-represented) superfamilies and families, adjacent domain pair lists and graphs, unique domain pairs, domain combinations, domain architecture co-occurrence networks, and domain distribution across taxonomic kingdoms for each organism.
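The text does not say how over- or under-representation is measured. As one possible illustration only, the sketch below flags superfamilies whose domain count in a chosen genome lies far from the mean count in the other genomes, using a simple z-score. The function name unusual_superfamilies, the input layout and the threshold are all assumptions, not the method used by SUPERFAMILY.

```python
from statistics import mean, pstdev


def unusual_superfamilies(counts, genome, threshold=1.5):
    """Flag superfamilies with unusual domain counts in one genome.

    counts: {genome: {superfamily: domain count}} (hypothetical layout).
    Returns {superfamily: z-score}; positive means over-represented,
    negative means under-represented, relative to the other genomes.
    """
    others = [g for g in counts if g != genome]
    all_superfamilies = sorted({sf for g in counts for sf in counts[g]})
    result = {}
    for sf in all_superfamilies:
        background = [counts[g].get(sf, 0) for g in others]
        mu, sigma = mean(background), pstdev(background)
        if sigma == 0:
            continue  # no variation in the background, z-score undefined
        z = (counts[genome].get(sf, 0) - mu) / sigma
        if abs(z) >= threshold:
            result[sf] = z
    return result
```

Raw counts are used here for brevity. A real comparison would probably need to normalise for genome size first.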

Genome Statistics

For each genome: number of sequences, number of sequences with assignment, percentage of sequences with assignment, percentage total sequence coverage, number of domains assigned, number of superfamilies assigned, number of families assigned, average superfamily size, percentage produced by duplication, average sequence length, average length matched, number of domain pairs, and number of unique domain architectures.
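Several of these statistics follow directly from a table of domain assignments. The sketch below shows the arithmetic for a subset of them, assuming a hypothetical input layout: a mapping of sequence IDs to lengths, and a list of (sequence ID, superfamily, start, end) tuples with 1-based inclusive coordinates. It is an illustration, not the database's own calculation.

```python
def genome_statistics(lengths, assignments):
    """Compute a few per-genome summary statistics.

    lengths: {seq_id: sequence length}
    assignments: list of (seq_id, superfamily, start, end), 1-based inclusive
    """
    # Track covered positions per sequence so overlapping domains
    # are not counted twice in the coverage figure.
    covered = {}
    for seq_id, _superfamily, start, end in assignments:
        covered.setdefault(seq_id, set()).update(range(start, end + 1))

    n_seq = len(lengths)
    n_assigned = len(covered)
    total_residues = sum(lengths.values())
    covered_residues = sum(len(positions) for positions in covered.values())

    return {
        "sequences": n_seq,
        "sequences_with_assignment": n_assigned,
        "percent_sequences_with_assignment": 100.0 * n_assigned / n_seq,
        "percent_sequence_coverage": 100.0 * covered_residues / total_residues,
        "domains_assigned": len(assignments),
        "superfamilies_assigned": len({sf for _id, sf, _s, _e in assignments}),
        "average_sequence_length": total_residues / n_seq,
    }
```

For example, a genome of four proteins in which two carry domains has 50% of its sequences assigned, regardless of how many domains each assigned protein contains.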

Gene Ontology

Domain-centric Gene Ontology (GO) annotation, generated automatically.

Due to the growing gap between sequenced proteins and known functions of proteins, it is becoming increasingly important to develop more automated methods for functionally annotating proteins, especially proteins with known domains. SUPERFAMILY uses protein-level GO annotations taken from the Gene Ontology Annotation (GOA) project, which offers high-quality GO annotations directly associated with proteins in UniProtKB over a wide spectrum of species.[15] SUPERFAMILY has generated GO annotations for evolutionarily close domains (at the SCOP family level) and distant domains (at the SCOP superfamily level).

Phenotype Ontology

Domain-centric phenotype/anatomy ontology including Disease Ontology, Human Phenotype, Mouse Phenotype, Worm Phenotype, Yeast Phenotype, Fly Phenotype, Fly Anatomy, Zebrafish Anatomy, Xenopus Anatomy, and Arabidopsis Plant.

Superfamily Annotation

InterPro abstracts for over 1,000 superfamilies, and Gene Ontology (GO) annotation for over 700 superfamilies. This feature allows for the direct annotation of key features, functions, and structures of a superfamily.

Functional Annotation

Functional annotation of SCOP 1.73 superfamilies.

The SUPERFAMILY database uses a scheme of 50 detailed function categories which map to 7 general function categories, similar to the scheme used in the COG database.[16] A general function assigned to a superfamily was used to reflect the major function for that superfamily. The general categories of function are:

  1. Information: storage, maintenance of genetic code; DNA replication and repair; general transcription and translation.
  2. Regulation: Regulation of gene expression and protein activity; information processing in response to environmental input; signal transduction; general regulatory or receptor activity.
  3. Metabolism: Anabolic and catabolic processes; cell maintenance and homeostasis; secondary metabolism.
  4. Intra-cellular processes: cell motility and division; cell death; intra-cellular transport; secretion.
  5. Extra-cellular processes: inter- and extra-cellular processes like cell adhesion; organismal processes like blood clotting or the immune system.
  6. General: General and multiple functions; interactions with proteins, lipids, small molecules, and ions.
  7. Other/Unknown: an unknown function, viral proteins, or toxins.

Each domain superfamily in SCOP classes a to g was manually annotated using this scheme,[17][18][19] and the information used was provided by SCOP,[10] InterPro,[20][21] Pfam,[22] Swiss-Prot,[23] and various literature sources.

Phylogenetic Trees

Create custom phylogenetic trees by selecting 3 or more available genomes on the SUPERFAMILY site. Trees are generated using heuristic parsimony methods, and are based on protein domain architecture data for all genomes in SUPERFAMILY. Genome combinations, or specific clades, can be displayed as individual trees.
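To make the idea of parsimony on domain architecture data concrete, the sketch below scores a given rooted binary tree with Fitch's small-parsimony algorithm, treating each domain architecture as a presence/absence character. This covers only the scoring of one candidate tree. The heuristic search over trees used by the site is not described in the text and is not reproduced here. The names fitch and parsimony_score and the nested-tuple tree format are assumptions made for illustration.

```python
def fitch(tree, states):
    """Return (possible ancestral states, number of changes) for one character.

    tree: a genome name (leaf) or a pair (left_subtree, right_subtree)
    states: {genome name: character state at that leaf}
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    common = left_states & right_states
    if common:
        return common, left_changes + right_changes
    # Disjoint child sets imply at least one state change at this node.
    return left_states | right_states, left_changes + right_changes + 1


def parsimony_score(tree, presence):
    """Sum Fitch changes over all architectures.

    presence: {genome name: set of domain architectures found in that genome}
    """
    if len(presence) < 3:
        raise ValueError("at least 3 genomes are needed to build a tree")
    architectures = set().union(*presence.values())
    total = 0
    for arch in architectures:
        states = {genome: arch in archs for genome, archs in presence.items()}
        total += fitch(tree, states)[1]
    return total
```

A tree search would compare this score across candidate topologies and prefer the tree that needs the fewest gains and losses of architectures.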

Similar Domain Architectures

This feature allows the user to find the 10 domain architectures which are most similar to the domain architecture of interest.
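The similarity measure behind this feature is not stated in the text. As a rough illustration, the sketch below ranks architectures, each written as an ordered list of superfamily identifiers, by the matching ratio from Python's difflib.SequenceMatcher and keeps the top 10. The choice of measure, the function name most_similar and the data layout are all assumptions.

```python
from difflib import SequenceMatcher
from heapq import nlargest


def most_similar(query, architectures, top=10):
    """Return up to `top` (name, score) pairs, most similar first.

    query: ordered list of superfamily identifiers
    architectures: {architecture name: ordered list of superfamily identifiers}
    """
    scored = [
        (name, SequenceMatcher(None, query, domains).ratio())
        for name, domains in architectures.items()
    ]
    return nlargest(top, scored, key=lambda pair: pair[1])
```

An architecture identical to the query scores 1.0, and one sharing no domains scores 0.0.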

Hidden Markov Models

Produce SCOP domain assignments for a sequence using the SUPERFAMILY hidden Markov models.

Profile Comparison

Find remote domain matches when the HMM search fails to find a significant match. Profile Comparer (PRC),[24] a program for aligning and scoring two profile HMMs, is used.

Web Services

Distributed Annotation Server and linking to SUPERFAMILY.

Downloads

Sequences, assignments, models, MySQL database, and scripts - updated weekly.

Use in Research

The SUPERFAMILY database has numerous research applications and has been used by many research groups. It can serve either as a source of proteins that the user wishes to examine with other methods, or as a way to assign a function and structure to a novel or uncharacterized protein. One study found SUPERFAMILY to be very adept at assigning an appropriate function and structure to a large number of domains of unknown function by comparing them to the database's hidden Markov models.[25] Another study used SUPERFAMILY to generate a data set of 1,733 fold superfamily (FSF) domains for a comparison of proteomes and functionomes aimed at identifying the origin of cellular diversification.[26]

References

  1. ^ Wilson, D; Pethica, R; Zhou, Y; Talbot, C; Vogel, C; Madera, M; Chothia, C; Gough, J (January 2009). "SUPERFAMILY--sophisticated comparative genomics, data mining, visualization and phylogeny". Nucleic Acids Research. 37 (Database issue): D380–D386. doi:10.1093/nar/gkn762. ISSN 0305-1048. PMC 2686452. PMID 19036790.
  2. ^ Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K.; Chothia, Cyrus; Gough, Julian (2004-01-01). "The SUPERFAMILY database in 2004: additions and improvements". Nucleic Acids Research. 32 (suppl 1): D235–D239. doi:10.1093/nar/gkh117. ISSN 0305-1048. PMC 308851. PMID 14681402.
  3. ^ Wilson, D.; Madera, M.; Vogel, C.; Chothia, C.; Gough, J. (2007). "The SUPERFAMILY database in 2007: Families and functions". Nucleic Acids Research. 35 (Database issue): D308–D313. doi:10.1093/nar/gkl910. PMC 1669749. PMID 17098927.
  4. ^ Gough, J. (2002). "The SUPERFAMILY database in structural genomics". Acta Crystallographica Section D. 58 (Pt 11): 1897–1900. doi:10.1107/s0907444902015160. PMID 12393919.
  5. ^ Gough, J.; Chothia, C. (2002). "SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments". Nucleic Acids Research. 30 (1): 268–272. doi:10.1093/nar/30.1.268. PMC 99153. PMID 11752312.
  6. ^ a b De Lima Morais, D. A.; Fang, H.; Rackham, O. J. L.; Wilson, D.; Pethica, R.; Chothia, C.; Gough, J. (2010). "SUPERFAMILY 1.75 including a domain-centric gene ontology method". Nucleic Acids Research. 39 (Database issue): D427–D434. doi:10.1093/nar/gkq1130. PMC 3013712. PMID 21062816.
  7. ^ Oates, M. E.; Stahlhacke, J; Vavoulis, D. V.; Smithers, B; Rackham, O. J.; Sardar, A. J.; Zaucha, J; Thurlby, N; Fang, H; Gough, J (2015). "The SUPERFAMILY 1.75 database in 2014: A doubling of data". Nucleic Acids Research. 43 (Database issue): D227–33. doi:10.1093/nar/gku1041. PMC 4383889. PMID 25414345.
  8. ^ a b Hubbard, T. J.; Ailey, B.; Brenner, S. E.; Murzin, A. G.; Chothia, C. (1999). "SCOP: A Structural Classification of Proteins database". Nucleic Acids Research. 27 (1): 254–256. doi:10.1093/nar/27.1.254. PMC 148149. PMID 9847194.
  9. ^ Lo Conte, L.; Ailey, B.; Hubbard, T. J.; Brenner, S. E.; Murzin, A. G.; Chothia, C. (2000). "SCOP: A Structural Classification of Proteins database". Nucleic Acids Research. 28 (1): 257–259. doi:10.1093/nar/28.1.257. PMC 102479. PMID 10592240.
  10. ^ a b Andreeva, Antonina; Howorth, Dave; Brenner, Steven E.; Hubbard, Tim J. P.; Chothia, Cyrus; Murzin, Alexey G. (2004-01-01). "SCOP database in 2004: refinements integrate structure and sequence family data". Nucleic Acids Research. 32 (Database issue): D226–D229. doi:10.1093/nar/gkh039. ISSN 0305-1048. PMC 308773. PMID 14681400.
  11. ^ Dayhoff, M. O.; McLaughlin, P. J.; Barker, W. C.; Hunt, L. T. (1975-04-01). "Evolution of sequences within protein superfamilies". Naturwissenschaften. 62 (4): 154–161. Bibcode:1975NW.....62..154D. doi:10.1007/BF00608697. ISSN 0028-1042. S2CID 40304076.
  12. ^ Gough, J.; Karplus, K.; Hughey, R.; Chothia, C. (2001). "Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure". Journal of Molecular Biology. 313 (4): 903–919. CiteSeerX 10.1.1.144.6577. doi:10.1006/jmbi.2001.5080. PMID 11697912.
  13. ^ Karplus, K.; Barrett, C.; Hughey, R. (1998-01-01). "Hidden Markov models for detecting remote protein homologies". Bioinformatics. 14 (10): 846–856. doi:10.1093/bioinformatics/14.10.846. ISSN 1367-4803. PMID 9927713.
  14. ^ Botstein, D.; Cherry, J. M.; Ashburner, M.; Ball, C. A.; Blake, J. A.; Butler, H.; Davis, A. P.; Dolinski, K.; Dwight, S. S.; Eppig, J. T.; Harris, M. A.; Hill, D. P.; Issel-Tarver, L.; Kasarskis, A.; Lewis, S.; Matese, J. C.; Richardson, J. E.; Ringwald, M.; Rubin, G. M.; Sherlock, G. (2000). "Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium". Nature Genetics. 25 (1): 25–29. doi:10.1038/75556. PMC 3037419. PMID 10802651.  
  15. ^ Barrell, Daniel; Dimmer, Emily; Huntley, Rachael P.; Binns, David; O’Donovan, Claire; Apweiler, Rolf (2009-01-01). "The GOA database in 2009—an integrated Gene Ontology Annotation resource". Nucleic Acids Research. 37 (suppl 1): D396–D403. doi:10.1093/nar/gkn803. ISSN 0305-1048. PMC 2686469. PMID 18957448.
  16. ^ Tatusov, Roman L; Fedorova, Natalie D; Jackson, John D; Jacobs, Aviva R; Kiryutin, Boris; Koonin, Eugene V; Krylov, Dmitri M; Mazumder, Raja; Mekhedov, Sergei L (2003-09-11). "The COG database: an updated version includes eukaryotes". BMC Bioinformatics. 4: 41. doi:10.1186/1471-2105-4-41. ISSN 1471-2105. PMC 222959. PMID 12969510.
  17. ^ Vogel, Christine; Berzuini, Carlo; Bashton, Matthew; Gough, Julian; Teichmann, Sarah A. (2004-02-20). "Supra-domains: evolutionary units larger than single protein domains". Journal of Molecular Biology. 336 (3): 809–823. CiteSeerX 10.1.1.116.6568. doi:10.1016/j.jmb.2003.12.026. ISSN 0022-2836. PMID 15095989.
  18. ^ Vogel, Christine; Teichmann, Sarah A.; Pereira-Leal, Jose (2005-02-11). "The relationship between domain duplication and recombination". Journal of Molecular Biology. 346 (1): 355–365. doi:10.1016/j.jmb.2004.11.050. ISSN 0022-2836. PMID 15663950.
  19. ^ Vogel, Christine; Chothia, Cyrus (2006-05-01). "Protein Family Expansions and Biological Complexity". PLOS Computational Biology. 2 (5): e48. Bibcode:2006PLSCB...2...48V. doi:10.1371/journal.pcbi.0020048. ISSN 1553-734X. PMC 1464810. PMID 16733546.
  20. ^ Mulder, Nicola J.; Apweiler, Rolf; Attwood, Teresa K.; Bairoch, Amos; Barrell, Daniel; Bateman, Alex; Binns, David; Biswas, Margaret; Bradley, Paul (2003-01-01). "The InterPro Database, 2003 brings increased coverage and new features". Nucleic Acids Research. 31 (1): 315–318. doi:10.1093/nar/gkg046. ISSN 0305-1048. PMC 165493. PMID 12520011.
  21. ^ Mulder, Nicola J.; Apweiler, Rolf; Attwood, Teresa K.; Bairoch, Amos; Bateman, Alex; Binns, David; Bradley, Paul; Bork, Peer; Bucher, Phillip (2005-01-01). "InterPro, progress and status in 2005". Nucleic Acids Research. 33 (Database Issue): D201–D205. doi:10.1093/nar/gki106. ISSN 0305-1048. PMC 540060. PMID 15608177.
  22. ^ Finn, Robert D.; Mistry, Jaina; Schuster-Böckler, Benjamin; Griffiths-Jones, Sam; Hollich, Volker; Lassmann, Timo; Moxon, Simon; Marshall, Mhairi; Khanna, Ajay (2006-01-01). "Pfam: clans, web tools and services". Nucleic Acids Research. 34 (Database issue): D247–D251. doi:10.1093/nar/gkj149. ISSN 0305-1048. PMC 1347511. PMID 16381856.
  23. ^ Boeckmann, Brigitte; Blatter, Marie-Claude; Famiglietti, Livia; Hinz, Ursula; Lane, Lydie; Roechert, Bernd; Bairoch, Amos (2005-11-01). "Protein variety and functional diversity: Swiss-Prot annotation in its biological context". Comptes Rendus Biologies. 328 (10–11): 882–899. doi:10.1016/j.crvi.2005.06.001. ISSN 1631-0691. PMID 16286078.
  24. ^ Madera, Martin (2008-11-15). "Profile Comparer: a program for scoring and aligning profile hidden Markov models". Bioinformatics. 24 (22): 2630–2631. doi:10.1093/bioinformatics/btn504. ISSN 1367-4803. PMC 2579712. PMID 18845584.
  25. ^ Mudgal, Richa; Sandhya, Sankaran; Chandra, Nagasuma; Srinivasan, Narayanaswamy (2015-07-31). "De-DUFing the DUFs: Deciphering distant evolutionary relationships of Domains of Unknown Function using sensitive homology detection methods". Biology Direct. 10 (1): 38. doi:10.1186/s13062-015-0069-2. PMC 4520260. PMID 26228684.
  26. ^ Nasir, Arshan; Caetano-Anollés, Gustavo (2013). "Comparative Analysis of Proteomes and Functionomes Provides Insights into Origins of Cellular Diversification". Archaea. 2013: 648746. doi:10.1155/2013/648746. PMC 3892558. PMID 24492748.

External links

  • SUPERFAMILY database
