KEGG

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is used for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.

KEGG
Content
Description: Bioinformatics resource for deciphering the genome
Organisms: All
Contact
Research center: Kyoto University
Laboratory: Kanehisa Laboratories
Primary citation: PMID 10592173
Release date: 1995
Access
Website: www.kegg.jp
Web service URL: REST (see KEGG API)
Tools
Web: KEGG Mapper

The KEGG database project was initiated in 1995 by Minoru Kanehisa, professor at the Institute for Chemical Research, Kyoto University, under the then-ongoing Japanese Human Genome Program.[1][2] Foreseeing the need for a computerized resource that could be used for biological interpretation of genome sequence data, he started developing the KEGG PATHWAY database. It is a collection of manually drawn KEGG pathway maps representing experimental knowledge on metabolism and various other functions of the cell and the organism. Each pathway map contains a network of molecular interactions and reactions and is designed to link genes in the genome to gene products (mostly proteins) in the pathway. This has enabled an analysis called KEGG pathway mapping, in which the gene content of a genome is compared with the KEGG PATHWAY database to examine which pathways and associated functions are likely to be encoded in the genome.

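According to the developers, KEGG is a "computer representation" of the biological system.[3] It integrates building blocks and wiring diagrams of the system: more specifically, genetic building blocks of genes and proteins, chemical building blocks of small molecules and reactions, and wiring diagrams of molecular interaction and reaction networks. This concept is realized in the following databases of KEGG, which are categorized into systems, genomic, chemical, and health information.[4]

  • Systems information: PATHWAY (pathway maps for cellular and organismal functions); MODULE (modules or functional units of genes); BRITE (hierarchical classifications of biological entities)
  • Genomic information: GENOME (complete genomes); GENES (genes and proteins in the complete genomes); ORTHOLOGY (ortholog groups of genes in the complete genomes)
  • Chemical information: COMPOUND, GLYCAN (chemical compounds and glycans); REACTION, RPAIR, RCLASS (chemical reactions); ENZYME (enzyme nomenclature)
  • Health information: DISEASE (human diseases); DRUG (approved drugs); ENVIRON (crude drugs and health-related substances)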

Databases

Systems information

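The KEGG PATHWAY database, the wiring diagram database, is the core of the KEGG resource. It is a collection of pathway maps integrating many entities including genes, proteins, RNAs, chemical compounds, glycans, and chemical reactions, as well as disease genes and drug targets, which are stored as individual entries in the other databases of KEGG. The pathway maps are classified into the following sections:

  • Metabolism
  • Genetic information processing (transcription, translation, replication and repair, etc.)
  • Environmental information processing (membrane transport, signal transduction, etc.)
  • Cellular processes (cell growth, cell death, cell membrane functions, etc.)
  • Organismal systems (immune system, endocrine system, nervous system, etc.)
  • Human diseases
  • Drug development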

The metabolism section contains aesthetically drawn global maps showing an overall picture of metabolism, in addition to regular metabolic pathway maps. The low-resolution global maps can be used, for example, to compare metabolic capacities of different organisms in genomics studies and different environmental samples in metagenomics studies. In contrast, KEGG modules in the KEGG MODULE database are higher-resolution, localized wiring diagrams, representing tighter functional units within a pathway map, such as subpathways conserved among specific organism groups and molecular complexes. KEGG modules are defined as characteristic gene sets that can be linked to specific metabolic capacities and other phenotypic features, so that they can be used for automatic interpretation of genome and metagenome data.
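The idea of a module as a characteristic gene set can be illustrated with a small sketch. It assumes a simplified representation in which a module is a list of steps and each step is a set of alternative ortholog identifiers; this is not KEGG's own definition notation, and the identifiers and function names (`EXAMPLE_MODULE`, `missing_steps`, `is_complete`) are hypothetical placeholders chosen for illustration.

```python
from typing import Iterable

# Hypothetical module: an ordered list of steps, where each step is the set
# of alternative ortholog identifiers that can carry it out.
# These identifiers are illustrative placeholders, not real KEGG entries.
EXAMPLE_MODULE = [
    {"KO_A1", "KO_A2"},  # step 1: either ortholog suffices
    {"KO_B1"},           # step 2: a single required ortholog
    {"KO_C1", "KO_C2"},  # step 3: either ortholog suffices
]


def missing_steps(module: list[set[str]], genome_kos: Iterable[str]) -> list[int]:
    """Return the 1-based indices of steps with no matching ortholog."""
    present = set(genome_kos)
    return [
        index
        for index, alternatives in enumerate(module, start=1)
        if not (alternatives & present)
    ]


def is_complete(module: list[set[str]], genome_kos: Iterable[str]) -> bool:
    """A module counts as complete when every step has at least one ortholog."""
    return not missing_steps(module, genome_kos)
```

Calling `missing_steps(EXAMPLE_MODULE, ["KO_A2", "KO_C1"])` reports that step 2 is unfilled, which is the kind of signal that could support automatic interpretation of a genome's metabolic capacity.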

Another database that supplements KEGG PATHWAY is the KEGG BRITE database. It is an ontology database containing hierarchical classifications of various entities including genes, proteins, organisms, diseases, drugs, and chemical compounds. While KEGG PATHWAY is limited to molecular interactions and reactions of these entities, KEGG BRITE incorporates many different types of relationships.
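A hierarchical classification of this kind can be pictured as a tree in which an entry may appear under more than one category. The sketch below assumes a nested-dictionary representation; it does not reflect the file format KEGG actually uses, and the category names, entry names, and the `find_paths` helper are hypothetical.

```python
# Hypothetical hierarchy: each key is a category or entry, each value holds
# its children. An empty dict marks a leaf entry.
EXAMPLE_HIERARCHY = {
    "Enzymes": {
        "Class 1": {"entry_1": {}},
        "Class 2": {"entry_2": {}, "entry_1": {}},
    },
}


def find_paths(tree, target, prefix=()):
    """Yield every path of names that leads from the root to `target`."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == target:
            yield path
        if children:
            # Descend into the subtree, carrying the path walked so far.
            yield from find_paths(children, target, path)


paths = list(find_paths(EXAMPLE_HIERARCHY, "entry_1"))
```

Here `paths` holds two classification paths for `entry_1`, one under each class, showing how a single entity can be located in several branches of a hierarchy.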

Genomic information

Several months after the KEGG project was initiated in 1995, the first report of a completely sequenced bacterial genome was published.[5] Since then, all published complete genomes have been accumulated in KEGG for both eukaryotes and prokaryotes. The KEGG GENES database contains gene/protein-level information and the KEGG GENOME database contains organism-level information for these genomes. The KEGG GENES database consists of gene sets for the complete genomes, and the genes in each set are annotated by establishing correspondences to the wiring diagrams of KEGG pathway maps, KEGG modules, and BRITE hierarchies.

These correspondences are made using the concept of orthologs. The KEGG pathway maps are drawn based on experimental evidence in specific organisms, but they are designed to be applicable to other organisms as well, because different organisms, such as human and mouse, often share identical pathways consisting of functionally identical genes, called orthologous genes or orthologs. All the genes in the KEGG GENES database are grouped into such ortholog groups in the KEGG ORTHOLOGY (KO) database. Because the nodes (gene products) of KEGG pathway maps, as well as KEGG modules and BRITE hierarchies, are given KO identifiers, the correspondences are established once genes in the genome are annotated with KO identifiers by the genome annotation procedure in KEGG.[4]
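The role of KO identifiers as the link between genes and pathway nodes can be sketched as a simple lookup. This is a minimal illustration under stated assumptions, not KEGG's annotation procedure: the gene names, KO labels, pathway names, and the function `map_genes_to_pathways` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical inputs: genes already annotated with ortholog identifiers,
# and pathways described by the ortholog identifiers of their nodes.
EXAMPLE_GENE_TO_KO = {
    "geneA": "KO_1",
    "geneB": "KO_2",
    "geneC": "KO_2",
    "geneD": "KO_9",
}
EXAMPLE_PATHWAY_NODES = {
    "pathway_X": {"KO_1", "KO_2", "KO_3"},
    "pathway_Y": {"KO_4"},
}


def map_genes_to_pathways(gene_to_ko, pathway_nodes):
    """For each pathway, list the genome's genes that match a pathway node."""
    # Invert the annotation so each ortholog identifier points to its genes.
    ko_to_genes = defaultdict(set)
    for gene, ko in gene_to_ko.items():
        ko_to_genes[ko].add(gene)

    mapped = {}
    for pathway, kos in pathway_nodes.items():
        hits = {
            ko: sorted(ko_to_genes[ko])
            for ko in sorted(kos)
            if ko in ko_to_genes
        }
        if hits:  # keep only pathways with at least one matched node
            mapped[pathway] = hits
    return mapped


mapping = map_genes_to_pathways(EXAMPLE_GENE_TO_KO, EXAMPLE_PATHWAY_NODES)
```

In this toy input, `pathway_X` is reached through two of its three nodes, while `pathway_Y` has no matching gene and is omitted. Because the pathway is described only in terms of ortholog identifiers, the same pathway description can be reused for any annotated genome.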

Chemical information

The KEGG metabolic pathway maps are drawn to represent the dual aspects of the metabolic network: the genomic network of how genome-encoded enzymes are connected to catalyze consecutive reactions and the chemical network of how chemical structures of substrates and products are transformed by these reactions.[6] A set of enzyme genes in the genome will identify enzyme relation networks when superimposed on the KEGG pathway maps, which in turn characterize chemical structure transformation networks allowing interpretation of biosynthetic and biodegradation potentials of the organism. Alternatively, a set of metabolites identified in the metabolome will lead to the understanding of enzymatic pathways and enzyme genes involved.
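The second direction described above, from observed metabolites to candidate enzyme genes, can be sketched as follows. The `Reaction` record, the compound and gene labels, and both helper functions are hypothetical and stand in for whatever reaction data an analysis would actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Hypothetical reaction record linking compounds to enzyme genes."""
    name: str
    substrates: frozenset
    products: frozenset
    enzyme_genes: frozenset


# Illustrative placeholder data, not real KEGG entries.
EXAMPLE_REACTIONS = [
    Reaction("reaction_1", frozenset({"cpd_A"}), frozenset({"cpd_B"}),
             frozenset({"gene_1"})),
    Reaction("reaction_2", frozenset({"cpd_B"}), frozenset({"cpd_C"}),
             frozenset({"gene_2", "gene_3"})),
    Reaction("reaction_3", frozenset({"cpd_D"}), frozenset({"cpd_E"}),
             frozenset({"gene_4"})),
]


def reactions_touching(metabolites, reactions):
    """Return reactions that consume or produce any observed metabolite."""
    observed = set(metabolites)
    return [r for r in reactions if observed & (r.substrates | r.products)]


def candidate_enzyme_genes(metabolites, reactions):
    """Collect the enzyme genes of every reaction touching the metabolites."""
    genes = set()
    for reaction in reactions_touching(metabolites, reactions):
        genes |= reaction.enzyme_genes
    return genes
```

With this toy data, observing `cpd_B` alone points to the genes of both `reaction_1`, which produces it, and `reaction_2`, which consumes it.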

The databases in the chemical information category, which are collectively called KEGG LIGAND, are organized by capturing knowledge of the chemical network. At the beginning of the KEGG project, KEGG LIGAND consisted of three databases: KEGG COMPOUND for chemical compounds, KEGG REACTION for chemical reactions, and KEGG ENZYME for reactions in the enzyme nomenclature.[7] Currently, there are additional databases: KEGG GLYCAN for glycans[8] and two auxiliary reaction databases called RPAIR (reactant pair alignments) and RCLASS (reaction class).[9] KEGG COMPOUND has also been expanded to contain various compounds such as xenobiotics, in addition to metabolites.

Health information

In KEGG, diseases are viewed as perturbed states of the biological system caused by perturbants, namely genetic and environmental factors, and drugs are viewed as a different type of perturbant.[10] The KEGG PATHWAY database includes not only the normal states but also the perturbed states of biological systems. However, disease pathway maps cannot be drawn for most diseases because their molecular mechanisms are not well understood. An alternative approach is taken in the KEGG DISEASE database, which simply catalogs the known genetic and environmental factors of diseases. These catalogs may eventually lead to more complete wiring diagrams of diseases.

The KEGG DRUG database contains the active ingredients of drugs approved in Japan, the US, and Europe. They are distinguished by chemical structures and/or chemical components and are associated with target molecules, metabolizing enzymes, and other molecular interaction network information in the KEGG pathway maps and the BRITE hierarchies. This enables an integrated analysis of drug interactions with genomic information. Crude drugs and other health-related substances, which are outside the category of approved drugs, are stored in the KEGG ENVIRON database. The databases in the health information category are collectively called KEGG MEDICUS, which also includes package inserts of all marketed drugs in Japan.

Subscription model

In July 2011, KEGG introduced a subscription model for FTP download due to a significant cutback in government funding. KEGG continues to be freely available through its website, but the subscription model has prompted discussion about the sustainability of bioinformatics databases.[11][12]

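See also

  • Comparative Toxicogenomics Database (CTD), which integrates KEGG pathways with toxicogenomic and disease data
  • ConsensusPathDB, a molecular functional interaction database integrating information from KEGG
  • Gene ontology
  • PubMed
  • UniProt
  • Gene Disease Database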

References

  1. Kanehisa M, Goto S (2000). "KEGG: Kyoto Encyclopedia of Genes and Genomes". Nucleic Acids Res. 28 (1): 27–30. doi:10.1093/nar/28.1.27. PMC 102409. PMID 10592173.
  2. Kanehisa M (1997). "A database for post-genome analysis". Trends Genet. 13 (9): 375–6. doi:10.1016/S0168-9525(97)01223-7. PMID 9287494.
  3. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M (2006). "From genomics to chemical genomics: new developments in KEGG". Nucleic Acids Res. 34 (Database issue): D354–7. doi:10.1093/nar/gkj102. PMC 1347464. PMID 16381885.
  4. Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M (2014). "Data, information, knowledge and principle: back to metabolism in KEGG". Nucleic Acids Res. 42 (Database issue): D199–205. doi:10.1093/nar/gkt1076. PMC 3965122. PMID 24214961.
  5. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, et al. (1995). "Whole-genome random sequencing and assembly of Haemophilus influenzae Rd". Science. 269 (5223): 496–512. Bibcode:1995Sci...269..496F. doi:10.1126/science.7542800. PMID 7542800. S2CID 10423613.
  6. Kanehisa M (2013). "Chemical and genomic evolution of enzyme-catalyzed reaction networks". FEBS Lett. 587 (17): 2731–7. doi:10.1016/j.febslet.2013.06.026. hdl:2433/178762. PMID 23816707. S2CID 40074657.
  7. Goto S, Nishioka T, Kanehisa M (1999). "LIGAND database for enzymes, compounds and reactions". Nucleic Acids Res. 27 (1): 377–9. doi:10.1093/nar/27.1.377. PMC 148189. PMID 9847234.
  8. Hashimoto K, Goto S, Kawano S, Aoki-Kinoshita KF, Ueda N, Hamajima M, Kawasaki T, Kanehisa M (2006). "KEGG as a glycome informatics resource". Glycobiology. 16 (5): 63R–70R. doi:10.1093/glycob/cwj010. PMID 16014746.
  9. Muto A, Kotera M, Tokimatsu T, Nakagawa Z, Goto S, Kanehisa M (2013). "Modular architecture of metabolic pathways revealed by conserved sequences of reactions". J Chem Inf Model. 53 (3): 613–22. doi:10.1021/ci3005379. PMC 3632090. PMID 23384306.
  10. Kanehisa M, Goto S, Furumichi M, Tanabe M, Hirakawa M (2010). "KEGG for representation and analysis of molecular networks involving diseases and drugs". Nucleic Acids Res. 38 (Database issue): D355–60. doi:10.1093/nar/gkp896. PMC 2808910. PMID 19880382.
  11. Galperin MY, Fernández-Suárez XM (2012). "The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection". Nucleic Acids Res. 40 (Database issue): D1–8. doi:10.1093/nar/gkr1196. PMC 3245068. PMID 22144685.
  12. Hayden EC (2013). "Popular plant database set to charge users". Nature. doi:10.1038/nature.2013.13642. S2CID 211729309.

External links

  • KEGG website
  • GenomeNet mirror site
  • The entry for KEGG in MetaBase
