PubChem

PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem database.[2]

PubChem
Content
Description: Chemicals and their bioassays
Organisms: Humans and other animals
Contact
Research center: NCBI
Primary citation: PMID 15879180
Access
Website: https://pubchem.ncbi.nlm.nih.gov/
Download URL: FTP
Web service URL: PUG-View[1]
Miscellaneous
License: Public domain

History

PubChem was released in 2004 as a component of the Molecular Libraries Program (MLP) of the NIH. As of November 2015, PubChem contained more than 150 million depositor-provided substance descriptions, 60 million unique chemical structures, and 225 million biological activity test results (from over 1 million assay experiments performed on more than 2 million small molecules, covering almost 10,000 unique protein target sequences that correspond to more than 5,000 genes). It also contained RNA interference (RNAi) screening assays that target over 15,000 genes.[3]

As of August 2018, PubChem contained 247.3 million substance descriptions and 96.5 million unique chemical structures, contributed by 629 data sources from 40 countries. It also contained 237 million bioactivity test results from 1.25 million biological assays, covering more than 10,000 target protein sequences.[4]

As of 2020, with data integration from over 100 new sources, PubChem contained more than 293 million depositor-provided substance descriptions, 111 million unique chemical structures, and 271 million bioactivity data points from 1.2 million biological assay experiments.[5]

Databases

PubChem consists of three dynamically growing primary databases. As of 5 November 2020 (number of BioAssays is unchanged):

  • Compounds, 111 million entries[5] (up from 94 million entries in 2017[4]), contains pure and characterized chemical compounds.[6]
  • Substances, 293 million entries[5] (up from 236 million entries in 2017[7] and 163 million in Sept. 2014[8]), also contains mixtures, extracts, complexes, and uncharacterized substances.
  • BioAssay, bioactivity results from 1.25 million[9] (up from 6,000 in Sept. 2014[10]) high-throughput screening programs with several million values.

Searching

The databases can be searched by a broad range of properties, including chemical structure, name fragments, chemical formula, molecular weight, XLogP, and hydrogen bond donor and acceptor count.

PubChem contains its own online molecule editor with SMILES/SMARTS and InChI support that allows the import and export of all common chemical file formats to search for structures and fragments.

Each hit provides information about synonyms, chemical properties, chemical structure including SMILES and InChI strings, bioactivity, and links to structurally related compounds and other NCBI databases like PubMed.

In the text search form, database fields can be searched by appending the field name in square brackets to the search term. A numeric range is represented by two numbers separated by a colon. The search terms and field names are case-insensitive. Parentheses and the logical operators AND, OR, and NOT can be used. AND is assumed if no operator is used.

Example (Lipinski's Rule of Five):

0:500[mw] 0:5[hbdc] 0:10[hbac] -5:5[logp] 
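
The query syntax described above can be assembled programmatically. The following is a minimal Python sketch, not part of any PubChem library: the helper names range_term and build_query are hypothetical and were chosen for illustration only. It only builds the text of the query string (range terms joined by spaces, so that AND is implied) and does not contact PubChem.

```python
def range_term(low, high, field):
    """Format one numeric range term, e.g. 0:500[mw] (hypothetical helper)."""
    return f"{low}:{high}[{field}]"


def build_query(terms, operator=None):
    """Join search terms into one query string (hypothetical helper).

    With no operator the terms are separated by a space, which the
    text search form interprets as AND.
    """
    separator = f" {operator} " if operator else " "
    return separator.join(terms)


# Lipinski's Rule of Five, using the field names from the example above.
lipinski_terms = [
    range_term(0, 500, "mw"),    # molecular weight
    range_term(0, 5, "hbdc"),    # hydrogen bond donor count
    range_term(0, 10, "hbac"),   # hydrogen bond acceptor count
    range_term(-5, 5, "logp"),   # XLogP
]
lipinski_query = build_query(lipinski_terms)
print(lipinski_query)
```

Passing an explicit operator, for example build_query(terms, "OR"), writes the operator between terms instead of relying on the implied AND.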

Database fields


Identification numbers
Identification number in current database [UID]
Substance identification number [SID]
Compound identification number [CID]
BioAssay identification number [BAID], [AID]

General
Any database field [ALL]
Comment [CMT]
Deposition date [DDAT], [DEPDAT]
Depositor's external ID [SRID], [SRCID]
Source name [SRC], [SRCNAM], [SRCNAME]
Source release date [SRD], [SRDAT], [RLSDAT]
Medical Subject Heading (MeSH) term [MSHT], [MESHT]
MeSH tree node [MSHN], [MESHTN]
MeSH pharmacological actions [PHMA], [PHARMA]

Substance properties
Substance synonyms [SYNO]
IUPAC name [UPAC], [IUPAC]
International Chemical Identifier (InChI) [INCHI]
Molecular weight [MW], [MWT], [MOLWT]
Chemical elements [ELMT], [EL]
Non-Hydrogen atoms [HAC], [HACNT]
Isotope count [IAC], [IACNT]
Total formal charge [TFC], [CHG], [CHRG]
Chiral atom count [ACC], [ACCNT]
Defined chiral atom count [ACDC], [ACDCNT]
Undefined chiral atom count [ACUC], [ACUCNT]
Hydrogen bond acceptor count [HBAC], [HBACNT]
Hydrogen bond donor count [HBDC], [HBDCNT]
Tautomer count [TC], [TCNT], [TTMC]
Rotatable bond count [RBC], [RBCNT]
XLogP[11] [XLGP], [LOGP]

Compound properties
Compound synonyms [CSYN], [CSYNO]
Component count [CC], [CCNT]
Covalent unit (molecule) count [CUC], [CUCNT]
Total bioactivity count [TAC]
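
Because several fields accept more than one tag and field names are case-insensitive, a small lookup table can help when reading or validating queries. The sketch below is an illustration in Python that covers only a handful of the fields listed above; the names FIELD_ALIASES, ALIAS_TO_FIELD, and describe_field are hypothetical and not part of PubChem.

```python
# A few fields from the list above, each with its accepted tags.
FIELD_ALIASES = {
    "Molecular weight": ["MW", "MWT", "MOLWT"],
    "Hydrogen bond acceptor count": ["HBAC", "HBACNT"],
    "Hydrogen bond donor count": ["HBDC", "HBDCNT"],
    "Rotatable bond count": ["RBC", "RBCNT"],
    "XLogP": ["XLGP", "LOGP"],
}

# Invert the table so that every tag points to its field description.
ALIAS_TO_FIELD = {
    alias: name
    for name, aliases in FIELD_ALIASES.items()
    for alias in aliases
}


def describe_field(tag):
    """Return the field description for a tag such as '[logp]', or None.

    Surrounding brackets are ignored and matching is case-insensitive.
    """
    key = tag.strip().strip("[]").upper()
    return ALIAS_TO_FIELD.get(key)


print(describe_field("[logp]"))
print(describe_field("MolWt"))
```

Tags that are not in the table return None, so a caller can flag them instead of silently passing them into a query.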

See also

  • Chemical database
    • CAS Common Chemistry - run by the American Chemical Society
    • Comparative Toxicogenomics Database - run by North Carolina State University
    • ChEMBL - run by the European Bioinformatics Institute
    • ChemSpider - run by the UK's Royal Society of Chemistry
    • DrugBank - run by the University of Alberta
    • IUPAC - run by the Swiss-based International Union of Pure and Applied Chemistry (IUPAC)
    • Moltable - run by India's National Chemical Laboratory
    • PubChem - run by the National Institutes of Health, USA
    • BindingDB - run by the University of California, San Diego
    • SCRIPDB - run by the University of Toronto, Canada
    • National Center for Biotechnology Information (NCBI) - run by the National Institutes of Health, USA
    • Entrez - run by the National Institutes of Health, USA
    • GenBank - run by the National Institutes of Health, USA

References

  1. Kim, Sunghwan; Thiessen, Paul A.; Cheng, Tiejun; Zhang, Jian; Gindulyte, Asta; Bolton, Evan E. (9 August 2019). "PUG-View: programmatic access to chemical annotations integrated in PubChem". Journal of Cheminformatics. 11 (1): 56. doi:10.1186/s13321-019-0375-2. PMC 6688265. PMID 31399858.
  2. "PubChem Source Information". The PubChem Project. USA: National Center for Biotechnology Information.
  3. Kim, Sunghwan; Thiessen, Paul A.; Cheng, Tiejun; Yu, Bo; Shoemaker, Benjamin A.; Wang, Jiyao; Bolton, Evan E.; Wang, Yanli; Bryant, Stephen H. (2016). "Literature information in PubChem: associations between PubChem records and scientific articles". Journal of Cheminformatics. 8: Article 32. doi:10.1186/s13321-016-0142-6. PMC 4901473. PMID 27293485.
  4. "Search Results for all compounds". Retrieved 28 January 2016.
  5. Kim, Sunghwan; Chen, Jie; Cheng, Tiejun; Gindulyte, Asta; He, Jia; He, Siqian; Li, Qingliang; Shoemaker, Benjamin A.; Thiessen, Paul A.; Yu, Bo; Zaslavsky, Leonid; Zhang, Jian; Bolton, Evan E. (8 January 2021). "PubChem in 2021: new data content and improved web interfaces". Nucleic Acids Research. 49 (D1): D1388–D1395. doi:10.1093/nar/gkaa971. PMC 7778930. PMID 33151290.
  6. "all[filt] - PubChem Compound Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  7. "all[filt] - PubChem Substance Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 28 January 2016.
  8. "all[filt] - PubChem Substance Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  9. "all[filt] - PubChem BioAssay Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 28 January 2016.
  10. "all[filt] - PubChem BioAssay Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  11. Cheng T (Nov 2007). "Computation of octanol-water partition coefficients by guiding an additive model with knowledge". Journal of Chemical Information and Modeling. 47 (6): 2140–2148. doi:10.1021/ci700257y. PMID 17985865.

External links

  • Official website
