About proGenomes

proGenomes (Nucleic Acids Res, doi: 10.1093/nar/gkw989) provides 25038 consistently annotated bacterial and archaeal genomes. Taxonomic annotations are provided both as species clusters (Mende et al., Nature Methods, 2013) and as NCBI taxonomy. Functional annotations of 88 million genes are provided as eggNOG orthologous groups (Huerta-Cepas et al., NAR, 2016) and as carbohydrate-active enzymes via CAZy (Lombard et al., NAR, 2013; Yin et al., NAR, 2012), along with the genes' roles in antibiotic resistance and virulence.

We further provide a set of 40 universal, single-copy genes for each genome (Ciccarelli et al., Science, 2006; Sorek et al., Science, 2007) to support phylogenetic studies. Additionally, 5306 representative genomes covering all species clusters are available for direct download; these can be used for annotating metagenomic datasets, large-scale phylogenetics, and other comparative approaches. Among the representative genomes, we also provide habitat-specific sets.

Start exploring proGenomes by searching for a taxonomic group, a species cluster, or an individual genome.


Figure: Workflow used to generate the underlying data of the database.


proGenomes is free for academic and non-commercial use. For commercial use or customized versions, please contact biobyte solutions GmbH.

We hope you find the database easy to use. If you encounter any problems or have questions, please