
Protein Data Bank

The Protein Data Bank (PDB)[1] is a database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography, NMR spectroscopy, or, increasingly, cryo-electron microscopy, and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organisations (PDBe,[2] PDBj,[3] RCSB,[4] and BMRB[5]). The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB.

Protein Data Bank (summary)
Primary citation: PMID 30357364
Data format: mmCIF, PDB
Website:
  • www.wwpdb.org
  • ebi.ac.uk
  • www.rcsb.org
  • bmrb.io
  • pdbj.org

The PDB is a key resource in areas of structural biology such as structural genomics. Most major scientific journals and some funding agencies now require scientists to submit their structure data to the PDB. Many other databases use protein structures deposited in the PDB. For example, SCOP and CATH classify protein structures, while PDBsum provides a graphic overview of PDB entries using information from other sources, such as Gene Ontology.[6][7]

History

Two forces converged to initiate the PDB: a small but growing collection of sets of protein structure data determined by X-ray diffraction; and the newly available (1968) molecular graphics display, the Brookhaven RAster Display (BRAD), to visualize these protein structures in 3-D. In 1969, with the sponsorship of Walter Hamilton at the Brookhaven National Laboratory, Edgar Meyer (Texas A&M University) began to write software to store atomic coordinate files in a common format to make them available for geometric and graphical evaluation. By 1971, one of Meyer's programs, SEARCH, enabled researchers to remotely access information from the database to study protein structures offline.[8] SEARCH was instrumental in enabling networking, thus marking the functional beginning of the PDB.

The Protein Data Bank was announced in October 1971 in Nature New Biology[9] as a joint venture between the Cambridge Crystallographic Data Centre, UK, and Brookhaven National Laboratory, US.

Upon Hamilton's death in 1973, Tom Koetzle took over direction of the PDB for the subsequent 20 years. In January 1994, Joel Sussman of Israel's Weizmann Institute of Science was appointed head of the PDB. In October 1998,[10] the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB);[11] the transfer was completed in June 1999. The new director was Helen M. Berman of Rutgers University (one of the managing institutions of the RCSB, the other being the San Diego Supercomputer Center at UC San Diego).[12]

In 2003, with the formation of the wwPDB, the PDB became an international organization. The founding members are PDBe (Europe),[2] RCSB (US), and PDBj (Japan).[3] The BMRB[5] joined in 2006. Each of the four wwPDB members can act as a deposition, data processing, and distribution center for PDB data. Data processing here means that wwPDB staff review and annotate each submitted entry.[13] The data are then automatically checked for plausibility; the source code[14] for this validation software has been made available to the public at no charge.

Contents

Figure: Examples of protein structures from the PDB (created with UCSF Chimera).

Figure: Rate of protein structure determination by method and year. MX = macromolecular crystallography, 3DEM = 3D electron microscopy.[15]

The PDB is updated weekly (on Wednesdays, UTC), along with its holdings list.[16] As of 10 January 2023, the PDB comprised:

Experimental method | Proteins only | Proteins with oligosaccharides | Protein/nucleic acid complexes | Nucleic acids only | Other | Oligosaccharides only | Total
X-ray diffraction   | 152277 | 8969  | 8027  | 2566 | 163 | 11 | 172013
NMR                 | 12104  | 32    | 281   | 1433 | 31  | 6  | 13887
Electron microscopy | 9226   | 1633  | 2898  | 77   | 8   | 0  | 13842
Hybrid              | 189    | 7     | 6     | 12   | 0   | 1  | 215
Neutron             | 72     | 1     | 0     | 2    | 0   | 0  | 75
Other               | 32     | 0     | 0     | 1    | 0   | 4  | 37
Total               | 173900 | 10642 | 11212 | 4091 | 202 | 22 | 200069

  • 162,041 structures in the PDB have a structure factor file.
  • 11,242 structures have an NMR restraint file.
  • 5,774 structures in the PDB have a chemical shifts file.
  • 13,388 structures in the PDB have a 3DEM map file deposited in EM Data Bank.
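
The per-method shares quoted for these holdings can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the names `HOLDINGS`, `method_totals`, and `method_shares` are made up for this example, and row totals are recomputed from the six category counts in the table rather than copied from its Total column.

```python
# Category counts per experimental method, copied from the holdings table.
# Column order: proteins only, proteins with oligosaccharides,
# protein/nucleic acid complexes, nucleic acids only, other,
# oligosaccharides only.
HOLDINGS = {
    "X-ray diffraction": (152277, 8969, 8027, 2566, 163, 11),
    "NMR": (12104, 32, 281, 1433, 31, 6),
    "Electron microscopy": (9226, 1633, 2898, 77, 8, 0),
    "Hybrid": (189, 7, 6, 12, 0, 1),
    "Neutron": (72, 1, 0, 2, 0, 0),
    "Other": (32, 0, 0, 1, 0, 4),
}


def method_totals(holdings):
    """Sum the category counts of each method (the row totals)."""
    return {method: sum(counts) for method, counts in holdings.items()}


def method_shares(holdings):
    """Return each method's share of all entries, in percent."""
    totals = method_totals(holdings)
    grand_total = sum(totals.values())
    return {method: 100.0 * t / grand_total for method, t in totals.items()}


if __name__ == "__main__":
    totals = method_totals(HOLDINGS)
    for method, share in method_shares(HOLDINGS).items():
        print(f"{method:<20} {totals[method]:>7} {share:6.2f}%")
```

Recomputing the row sums this way also acts as a consistency check on the table: the six row totals should add up to the grand total of 200069.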

Most structures are determined by X-ray diffraction, but about 7% are determined by protein NMR. X-ray diffraction yields approximate coordinates for the atoms of the protein, whereas NMR yields estimates of the distances between pairs of atoms; the final conformation is then obtained from the NMR data by solving a distance geometry problem. Since 2013, a growing number of structures have been determined by cryo-electron microscopy.
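
To make the idea of a distance geometry problem concrete, here is a toy sketch that recovers 3D coordinates for four points from their exact pairwise distances. It is not how NMR structures are actually computed: real calculations work with many atoms and with noisy, incomplete distance bounds, whereas this example assumes a complete, exact distance matrix and places the points one by one in a canonical frame. The function names are hypothetical.

```python
import math


def embed_four_points(d):
    """Place four points in 3D so that their pairwise distances match d.

    d is a symmetric 4x4 matrix of exact distances. Point 0 goes to the
    origin, point 1 onto the x-axis, point 2 into the xy-plane, and
    point 3 above that plane (the mirror image is equally valid).
    Assumes points 0, 1, 2 are not collinear.
    """
    d01, d02, d03 = d[0][1], d[0][2], d[0][3]
    d12, d13, d23 = d[1][2], d[1][3], d[2][3]

    p0 = (0.0, 0.0, 0.0)
    p1 = (d01, 0.0, 0.0)

    # Point 2: intersect spheres around p0 and p1 within the xy-plane.
    x2 = (d02**2 - d12**2 + d01**2) / (2 * d01)
    y2 = math.sqrt(max(d02**2 - x2**2, 0.0))
    p2 = (x2, y2, 0.0)

    # Point 3: trilateration from p0, p1, and p2.
    x3 = (d03**2 - d13**2 + d01**2) / (2 * d01)
    y3 = (d03**2 - d23**2 + x2**2 + y2**2 - 2 * x2 * x3) / (2 * y2)
    z3 = math.sqrt(max(d03**2 - x3**2 - y3**2, 0.0))
    return [p0, p1, p2, (x3, y3, z3)]


def distance_matrix(points):
    """Return the matrix of Euclidean distances between all point pairs."""
    return [[math.dist(p, q) for q in points] for p in points]


if __name__ == "__main__":
    original = [(0, 0, 0), (1.5, 0, 0), (0.5, 1.2, 0), (0.7, 0.4, 1.1)]
    recovered = embed_four_points(distance_matrix(original))
    for point in recovered:
        print(tuple(round(c, 3) for c in point))
```

Distances determine a structure only up to rotation, translation, and reflection, which is why the sketch fixes a frame and arbitrarily picks the positive z solution for the last point.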

For PDB structures determined by X-ray diffraction that have a structure factor file, their electron density map may be viewed. The data of such structures may be viewed on the three PDB websites.

Historically, the number of structures in the PDB has grown at an approximately exponential rate, with 100 registered structures in 1982, 1,000 structures in 1993, 10,000 in 1999, 100,000 in 2014, and 200,000 in January 2023.[17][18]
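
As a rough check on the growth figures above, the average doubling time between consecutive milestones can be computed, assuming exponential growth within each interval. This is only a back-of-the-envelope sketch: the milestone years are taken from the paragraph at face value, and `MILESTONES`, `doubling_time`, and `doubling_times` are names invented for this example.

```python
import math

# (year, number of structures) milestones as quoted in the text.
MILESTONES = [
    (1982, 100),
    (1993, 1_000),
    (1999, 10_000),
    (2014, 100_000),
    (2023, 200_000),
]


def doubling_time(year_a, count_a, year_b, count_b):
    """Average doubling time in years, assuming exponential growth."""
    return (year_b - year_a) * math.log(2) / math.log(count_b / count_a)


def doubling_times(milestones):
    """Doubling time for each pair of consecutive milestones."""
    return [
        (ya, yb, doubling_time(ya, ca, yb, cb))
        for (ya, ca), (yb, cb) in zip(milestones, milestones[1:])
    ]


if __name__ == "__main__":
    for start, end, years in doubling_times(MILESTONES):
        print(f"{start}-{end}: doubling roughly every {years:.1f} years")
```

Because the milestone years are rounded, the per-interval values should be read as rough estimates rather than precise growth rates.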

File format

The file format initially used by the PDB was called the PDB file format. The original format was restricted to 80 characters per line, the width of a computer punch card. Around 1996, the "macromolecular Crystallographic Information File" format, mmCIF, which is an extension of the CIF format, was phased in. mmCIF became the standard format for the PDB archive in 2014.[19] In 2019, the wwPDB announced that depositions for crystallographic methods would only be accepted in mmCIF format.[20]
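
The punch-card heritage means that a legacy PDB record is read by column position rather than by splitting on whitespace. The sketch below illustrates this for a single ATOM record. The column ranges are an assumption based on the commonly documented legacy layout and are not stated in this article; the sample values are illustrative only, and `parse_atom_line` is a hypothetical helper, not part of any library.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record using fixed column positions.

    Column ranges (assumed legacy layout, 1-based): record name 1-6,
    serial 7-11, atom name 13-16, residue name 18-20, chain 22,
    residue number 23-26, x 31-38, y 39-46, z 47-54, element 77-78.
    """
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not a coordinate record: {record!r}")
    return {
        "record": record,
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "element": line[76:78].strip(),
    }


# Build an illustrative 80-character record with a format string so that
# every field lands in its assumed column.
SAMPLE = (
    "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
    "{:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}          {:>2}{:>2}"
).format("ATOM", 1, " N", "", "VAL", "A", 1, "",
         6.204, 16.869, 4.854, 1.00, 49.05, "N", "")

if __name__ == "__main__":
    print(repr(SAMPLE), len(SAMPLE))
    print(parse_atom_line(SAMPLE))
```

Fixed-width fields are one reason some large structures do not fit the legacy format: a field of a given width cannot hold values that need more characters than it has columns.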

An XML version of PDB, called PDBML, was described in 2005.[21] The structure files can be downloaded in any of these three formats, though an increasing number of structures do not fit the legacy PDB format. Individual files are easily downloaded into graphics packages from Internet URLs:

  • For PDB format files, use, e.g., http://www.pdb.org/pdb/files/4hhb.pdb.gz or http://pdbe.org/download/4hhb
  • For PDBML (XML) files, use, e.g., http://www.pdb.org/pdb/files/4hhb.xml.gz or http://pdbe.org/pdbml/4hhb

The "4hhb" is the PDB identifier. Each structure published in PDB receives a four-character alphanumeric identifier, its PDB ID. (This is not a unique identifier for biomolecules, because several structures for the same molecule—in different environments or conformations—may be contained in PDB with different PDB IDs.)
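
The URL patterns listed above can be turned into a small helper that validates a PDB ID and builds the matching download address. This is a sketch under stated assumptions: the templates are copied from the example URLs in this section, whether those hosts still serve these paths has not been checked here, and the ID check encodes only the "four-character alphanumeric" rule given in the text. The names `download_url`, `URL_TEMPLATES`, and `PDB_ID_PATTERN` are hypothetical.

```python
import re

# "Four-character alphanumeric identifier", as described in the text.
PDB_ID_PATTERN = re.compile(r"[0-9A-Za-z]{4}")

# Templates copied from the example URLs in this section (availability
# of these endpoints is an assumption, not verified here).
URL_TEMPLATES = {
    "pdb": "http://www.pdb.org/pdb/files/{pdb_id}.pdb.gz",
    "pdbml": "http://www.pdb.org/pdb/files/{pdb_id}.xml.gz",
}


def download_url(pdb_id, fmt="pdb"):
    """Return the download URL for a PDB ID in the requested format."""
    if not PDB_ID_PATTERN.fullmatch(pdb_id):
        raise ValueError(f"not a four-character alphanumeric ID: {pdb_id!r}")
    if fmt not in URL_TEMPLATES:
        raise ValueError(f"unknown format: {fmt!r}")
    return URL_TEMPLATES[fmt].format(pdb_id=pdb_id.lower())


if __name__ == "__main__":
    print(download_url("4hhb"))
    print(download_url("4HHB", fmt="pdbml"))
```

The helper only builds strings and performs no network access, so fetching and decompressing the file is left to the caller.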

Viewing the data

The structure files may be viewed using one of several free and open-source computer programs, including Jmol, PyMOL, VMD, Molstar, and RasMol. Other programs, non-free or shareware, include ICM-Browser,[22] MDL Chime, UCSF Chimera, Swiss-PDB Viewer,[23] StarBiochem[24] (a Java-based interactive molecular viewer with integrated search of the Protein Data Bank), Sirius, VisProt3DS[25] (a tool for protein visualization in 3D stereoscopic view, in anaglyph and other modes), and Discovery Studio. The RCSB PDB website contains an extensive list of both free and commercial molecule visualization programs and web browser plugins.

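See also

  • Crystallographic database
  • Protein structure
  • Protein structure prediction
  • Protein structure database
  • PDBREPORT: lists all anomalies (also errors) in PDB structures
  • PDBsum: extracts data from other databases about PDB structures
  • Proteopedia: a collaborative 3D encyclopedia of proteins and other molecules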

References

  1. ^ wwPDB Consortium (2019). "Protein Data Bank: the single global archive for 3D macromolecular structure data". Nucleic Acids Res. 47 (D1): D520–D528. doi:10.1093/nar/gky949. PMC 6324056. PMID 30357364.
  2. ^ a b "PDBe home < Node < EMBL-EBI". pdbe.org.
  3. ^ a b "Protein Data Bank Japan – PDB Japan – PDBj". pdbj.org.
  4. ^ Bank, RCSB Protein Data. "RCSB PDB: Homepage". rcsb.org.
  5. ^ a b "Biological Magnetic Resonance Bank". bmrb.wisc.edu.
  6. ^ Berman, H. M. (January 2008). "The Protein Data Bank: a historical perspective" (PDF). Acta Crystallographica Section A. A64 (1): 88–95. doi:10.1107/S0108767307035623. PMID 18156675.
  7. ^ Laskowski RA, Hutchinson EG, Michie AD, Wallace AC, Jones ML, Thornton JM (December 1997). "PDBsum: a Web-based database of summaries and analyses of all PDB structures". Trends Biochem. Sci. 22 (12): 488–90. doi:10.1016/S0968-0004(97)01140-7. PMID 9433130.
  8. ^ Meyer EF (1997). "The first years of the Protein Data Bank". Protein Science. Cambridge University Press. 6 (7): 1591–1597. doi:10.1002/pro.5560060724. PMC 2143743. PMID 9232661.
  9. ^ "Protein Data Bank". Nature New Biology. 233. 1971. doi:10.1038/newbio233223b0.
  10. ^ Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (January 2000). "The Protein Data Bank". Nucleic Acids Res. 28 (1): 235–242. doi:10.1093/nar/28.1.235. PMC 102472. PMID 10592235.
  11. ^ "Research Collaboratory for Structural Bioinformatics". RCSB.org. Research Collaboratory for Structural Bioinformatics. Archived from the original on 2007-02-05.
  12. ^ "RCSB PDB Newsletter Archive". RCSB Protein Data Bank.
  13. ^ Curry E, Freitas A, O'Riáin S (2010). "The Role of Community-Driven Data Curation for Enterprises". In D. Wood (ed.). Linking Enterprise Data. Boston: Springer US. pp. 25–47. ISBN 978-1-441-97664-2.
  14. ^ "PDB Validation Suite". sw-tools.pdb.org.
  15. ^ Burley SK, Berman HM, Bhikadiya C, Bi C, Chen L, Costanzo LD, et al. (wwPDB consortium) (January 2019). "Protein Data Bank: the single global archive for 3D macromolecular structure data". Nucleic Acids Research. 47 (D1): D520–D528. doi:10.1093/nar/gky949. PMC 6324056. PMID 30357364.
  16. ^ "PDB Current Holdings Breakdown". RCSB. Archived from the original on 2007-07-04. Retrieved 2007-07-02.
  17. ^ Anon (2014). "Hard data: It has been no small feat for the Protein Data Bank to stay relevant for 100,000 structures". Nature. 509 (7500): 260. doi:10.1038/509260a. PMID 24834514.
  18. ^ Protein Data Bank. "PDB Statistics: Overall Growth of Released Structures Per Year". www.rcsb.org. Retrieved 12 January 2023.
  19. ^ "wwPDB: File Formats and the PDB". wwpdb.org. Retrieved April 1, 2020.
  20. ^ wwPDB.org. "wwPDB: 2019 News". wwpdb.org.
  21. ^ Westbrook J, Ito N, Nakamura H, Henrick K, Berman HM (April 2005). "PDBML: the representation of archival macromolecular structure data in XML". Bioinformatics. 21 (7): 988–992. doi:10.1093/bioinformatics/bti082. PMID 15509603.
  22. ^ "ICM-Browser". Molsoft L.L.C. Retrieved 2013-04-06.
  23. ^ "Swiss PDB Viewer". Swiss Institute of Bioinformatics. Retrieved 2013-04-06.
  24. ^ "STAR: Biochem - Home". web.mit.edu.
  25. ^ "VisProt3DS". Molecular Systems Ltd. Retrieved 2013-04-06.

External links

  • The Worldwide Protein Data Bank (wwPDB): parent site to the regional hosts below
    • RCSB Protein Data Bank (US)
    • PDBe (Europe)
    • PDBj (Japan)
    • BMRB, Biological Magnetic Resonance Data Bank (US)
  • wwPDB Documentation: documentation on both the PDB and PDBML file formats
  • Looking at Structures (archived 2011-03-24 at the Wayback Machine): the RCSB's introduction to crystallography
  • PDBsum Home Page: extracts data from other databases about PDB structures
  • Nucleic Acid Database, NDB: a PDB mirror especially for searching for nucleic acids
  • PDBe: Quick Tour on EBI Train OnLine
