Beta Now Live! New & Improved dbGaP Homepage Design

NCBI is excited to introduce a fresh look and feel to the Database of Genotypes and Phenotypes (dbGaP). Our beta homepage is now available and will become the default later in 2025. We encourage you to try it out and let us know what you think! This represents the first step of our ongoing modernization effort. The updated homepage will allow us to make continuous improvements to dbGaP based on your feedback.

Now Available: Updated Bacterial and Archaeal Reference Genome Collection

Download the updated bacterial and archaeal reference genome collection! We built this collection of 21,794 genomes, 536 more than in the January release, by selecting the “best” genome assembly for each species among the 400,000+ prokaryotic genomes in RefSeq.

Faster, Better Results for Protein BLAST Searches

Effective August 2025, ClusteredNR will become the protein BLAST default database 

We are excited to announce that the default database for protein BLAST searches will soon be the NCBI ClusteredNR database! Introduced in 2022, ClusteredNR is a collection of protein sequence clusters built from the current default database, nr. The representative sequence for each cluster is chosen based on its title and reflects the function of the proteins in the cluster, helping you focus on meaningful biological insights and decreasing redundant results.  

What’s better about ClusteredNR?
  • Faster searches 
  • Decreased redundancy in results 
  • Broader taxonomic coverage in results 

NCBI Taxonomy Updates to Virus Classification

Starting April 28, 2025

In December 2024, we announced several key changes to virus classification in the NCBI Taxonomy database. These updates are part of our ongoing efforts to ensure viral taxonomy reflects the latest scientific understanding and aligns with international standards set by the International Committee on Taxonomy of Viruses (ICTV). We will begin implementing these updates the week of April 28, 2025. 

What to expect from these updates 
  • Improvements in taxonomic groupings and names: Some taxa will be renamed, reclassified, or reorganized based on evolving research and genomic data.  
  • Addition of new binomial species names: We will add more than 7,000 new binomial virus species names. The former species names will be moved below the new names in the taxonomy hierarchy.  
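
The second point, moving a former species name below its new binomial name, can be pictured as inserting a node into a child-to-parent map. The sketch below is only an illustration: the taxon names are placeholders, and `add_binomial` and `lineage` are hypothetical helpers, not part of any NCBI tool.

```python
# Child -> parent map for a tiny, made-up slice of a taxonomy.
# All names are placeholders, not real taxa.
parent = {"Former virus name": "Examplevirus"}  # species currently under its genus


def add_binomial(parent, former_name, binomial):
    """Insert `binomial` between `former_name` and its current parent."""
    parent[binomial] = parent[former_name]
    parent[former_name] = binomial


def lineage(parent, name):
    """Return the path from the topmost known ancestor down to `name`."""
    path = [name]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


add_binomial(parent, "Former virus name", "Examplevirus exemplum")
print(" > ".join(lineage(parent, "Former virus name")))
```

After the call, the former name sits one level lower, directly under the new binomial name, while the rest of its lineage is unchanged.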

PubMed Central’s Updated Full-Text Search Preview Now Available

As previously announced, NLM’s NCBI is modernizing the PubMed Central (PMC) website. The next step is to update the PMC search functionality and user experience. Before we transition to an updated search later this year, we have a beta version available for you to preview and test!

Try PMC Beta Search and share your feedback

We invite you to try out the updated PMC Beta Search by clicking the link under the search bar on the PMC website (Figure 1). You can submit your feedback by using the “Provide Feedback” button (Figure 2). Feedback will be used to improve the PMC Beta Search before it becomes the default in PMC. Visit the PMC Beta Search user guide for general guidance on the updated search.

Coming Soon! Enhancements to ClinVar Homepage

Many people visit NCBI’s ClinVar site several times a day. As the field of clinical genetics advances, more and more new visitors also come to ClinVar to research the clinical significance of genetic variants. Based on feedback from new and existing users, we are improving the homepage to serve as a better introduction to data and services within the resource.

What’s new? 

Improved ClinVar homepage: 

The new ClinVar homepage will allow users to become familiar with the ClinVar data model – how we store and aggregate the data provided by submitters – as well as the organizations that submit data to ClinVar. Visitors can also browse the various methods used to access data in ClinVar.

dbSNP Build 157 Release

RefSNP (rs) records total approximately 1.2 billion

We are pleased to announce the release of the Database of Single Nucleotide Polymorphisms (dbSNP) Build 157, which has approximately 1.2 billion Reference SNP (rs) records across the human genome. This build includes updated datasets from 1000Genomes, TOPMed, gnomAD, NCBI ALFA release 3, and other studies that now incorporate expanded allele frequency data.

Build 157 includes: 

  • Total RS Count: 1,172,689,405 (live)
  • Total SS Count: 4,849,775,973

All data remain open-access and available for web search, FTP download, and API access.

RefSeq Release 229 is Now Available!

Check out RefSeq release 229, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

What’s included in this release?

As of March 3, 2025, this full release incorporates genomic, transcript, and protein data containing:

  • 522,879,448 records
  • 399,577,538 proteins
  • 68,985,910 RNAs
  • Sequences from 164,117 organisms 

GenBank Release 265.0 Now Available!

GenBank release 265.0 (3/8/2025) is now available on the NCBI FTP site. This release has 41.96 trillion bases and 5.56 billion records.

The current release has: 

  • 255,669,865 traditional records containing 5,415,448,651,743 base pairs of sequence data
  • 4,152,691,448 WGS records containing 35,643,977,584,264 base pairs of sequence data
  • 961,491,801 bulk-oriented TSA records containing 824,439,978,941 base pairs of sequence data
  • 189,703,939 bulk-oriented TLS records containing 78,062,322,564 base pairs of sequence data
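
As a quick consistency check, the four divisions listed above should add up to the headline totals of 41.96 trillion bases and 5.56 billion records. The sketch below assumes the TLS base count reads 78,062,322,564; the dictionary layout and variable names are illustrative only.

```python
# Per-division figures copied from the release summary: (records, base pairs).
# Assumption: the TLS base count is 78,062,322,564.
divisions = {
    "traditional": (255_669_865, 5_415_448_651_743),
    "WGS": (4_152_691_448, 35_643_977_584_264),
    "TSA": (961_491_801, 824_439_978_941),
    "TLS": (189_703_939, 78_062_322_564),
}

total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"records: {total_records:,} (~{total_records / 1e9:.2f} billion)")
print(f"bases:   {total_bases:,} (~{total_bases / 1e12:.2f} trillion)")
```

The sums come to 5,559,557,053 records and 41,961,928,537,512 bases, which round to the stated 5.56 billion and 41.96 trillion.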

New Ranks in NCBI Taxonomy: Domain & Realm

Update (March 26, 2025): The rank for Viruses is now “acellular root.” Correspondingly, the rank for cellular organisms is “cellular root.”

As previously announced, NCBI continues to make improvements to our Taxonomy resource. There have been recent updates to the International Code of Nomenclature of Prokaryotes (ICNP) and proposals by the International Committee on Taxonomy of Viruses (ICTV). As a result, NCBI Taxonomy has discontinued use of the “superkingdom” rank to classify organisms into Archaea, Bacteria, Eukaryota, and Viruses.

What’s changing? 

New rank: Domain 
  • “Domain” replaces “superkingdom” for Archaea, Bacteria, and Eukaryota  
  • “Acellular root” replaces “superkingdom” for Viruses
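
Scripts that filter or group taxa by the literal rank string “superkingdom” may need adjusting. The following is a minimal sketch of how older records could be translated to the new ranks described above; `current_rank` and `DOMAIN_TAXA` are hypothetical names, not part of any NCBI library.

```python
# Hypothetical helper: translate the retired "superkingdom" rank to its
# replacement, based on the taxon name. Other ranks pass through unchanged.
DOMAIN_TAXA = {"Archaea", "Bacteria", "Eukaryota"}


def current_rank(name: str, rank: str) -> str:
    if rank != "superkingdom":
        return rank
    if name in DOMAIN_TAXA:
        return "domain"
    if name == "Viruses":
        return "acellular root"
    raise ValueError(f"no replacement rank known for superkingdom {name!r}")


for taxon in ("Bacteria", "Viruses"):
    print(taxon, "->", current_rank(taxon, "superkingdom"))
```

Raising an error for an unrecognized name, rather than guessing, keeps any unexpected legacy record visible instead of silently mislabeling it.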
