COG database update: focus on microbial diversity, model organisms, and widespread pathogens
- PMID: 33167031
- PMCID: PMC7778934
- DOI: 10.1093/nar/gkaa1018
Abstract
The Clusters of Orthologous Genes (COG) database, also referred to as the Clusters of Orthologous Groups of proteins, was created in 1997 and has gone through several rounds of updates, most recently in 2014. The current update, available at https://www.ncbi.nlm.nih.gov/research/COG, substantially expands the scope of the database to include complete genomes of 1187 bacteria and 122 archaea, typically with a single genome per genus. In addition, the current version of the COGs includes the following new features: (i) NCBI's recently deprecated gene index (gi) numbers for the encoded proteins are replaced with stable RefSeq or GenBank/ENA/DDBJ coding sequence (CDS) accession numbers; (ii) COG annotations are updated for >200 newly characterized protein families, with corresponding references and PDB links where available; (iii) lists of COGs grouped by pathways and functional systems are added; (iv) 266 new COGs for proteins involved in CRISPR-Cas immunity, sporulation in Firmicutes and photosynthesis in cyanobacteria are included; and (v) the database is made available as a web page, in addition to FTP. The current release includes 4877 COGs. Future plans include further expansion of the COG collection by adding archaeal COGs (arCOGs), splitting the COGs that contain multiple paralogs, and continued refinement of COG annotations.
Published by Oxford University Press on behalf of Nucleic Acids Research 2020.