Bioinformatics Databases & Tools

BIOINFORMATICS DATABASES & WEB-SERVERS

Best Biological Databases and Web Servers

This list covers databases and web servers widely used in bioinformatics and biology research.

Sequence Databases

Nucleotide Sequence Databases

Nucleotide@NCBI – Database of sequences from several sources, including GenBank, RefSeq, TPA and PDB.
ENA@EBI – European Nucleotide Archive, a comprehensive record of the world’s nucleotide sequencing information.
DDBJ – The nucleotide sequence database of Japan.
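
Records in these nucleotide archives are commonly retrieved as FASTA text. The sketch below is one way to do that programmatically, assuming the NCBI E-utilities EFetch endpoint with the `nuccore` database; the helper names `build_efetch_url` and `parse_fasta` are hypothetical and chosen only for illustration. It builds the request URL without calling the network, and parses FASTA text that has already been downloaded.

```python
from urllib.parse import urlencode

# NCBI E-utilities EFetch endpoint (assumed here; the URL is only built, not called).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, db: str = "nuccore") -> str:
    """Return an EFetch URL requesting one record as plain-text FASTA."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {header: sequence}; headers lose the leading '>'."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # join wrapped sequence lines
    return records
```

The URL from `build_efetch_url("NM_000518")` can be passed to any HTTP client, and the response body handed to `parse_fasta`. For bulk retrieval, check NCBI's usage guidelines before scripting requests.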

Protein Sequence Databases

PIR – Protein Information Resource is an integrated public bioinformatics resource to support genomic, proteomic and systems biology research.
Protein@NCBI – Database of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.
UniProt – Database of protein sequence and functional information.
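
Protein records from UniProt carry their accession and entry name in the FASTA header line. The sketch below assumes the UniProt REST address pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` and the usual `>db|ACCESSION|ENTRY_NAME description` header layout; the function names are hypothetical, and no network request is made.

```python
def uniprot_fasta_url(accession: str) -> str:
    """Build the (assumed) UniProt REST URL for one entry in FASTA format."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_uniprot_header(header: str) -> dict[str, str]:
    """Split '>db|ACCESSION|ENTRY_NAME description' into its fields."""
    # Everything before the first space is the pipe-delimited identifier.
    ident, _, description = header.lstrip(">").partition(" ")
    db, accession, entry_name = ident.split("|")
    return {
        "db": db,  # 'sp' for Swiss-Prot (reviewed), 'tr' for TrEMBL (unreviewed)
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }
```

Calling `parse_uniprot_header` on the first line of a downloaded entry returns the accession as a join key against other resources in this list. Headers that do not follow the three-field layout raise `ValueError`.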

Gene Databases

Entrez Gene – Integrates information from a wide range of species.
GeneCards – An integrative database of all annotated and predicted human genes.

Gene Prediction Servers

Genscan – Identification of complete gene structures in genomic DNA.
GeneMark – Gene Prediction in Bacteria, Archaea, Metagenomes and Metatranscriptomes.
GENEID – For predicting genes, exons, splice sites and other signals along a DNA sequence.
AUGUSTUS – For predicting genes in eukaryotic genomic sequences.
EuGene – Integrative gene finder for eukaryotic and prokaryotic genomes.

Genome Databases and Browsers

ENSEMBL – Genome browser for vertebrate genomes.
UCSC Genome Browser – Integrates reference sequence and working draft assemblies for a large collection of genomes at the University of California at Santa Cruz.
Phytozome – Portal for plant comparative genomics.
Gramene – Resource for comparative functional genomics in crops and model plant species.
NCBI Genome Data Viewer – A genome browser for exploration and analysis of eukaryotic RefSeq genome assemblies.
NCBI Genome – Organizes information on genomes including sequences, maps, chromosomes, assemblies, and annotations.
VISTA – A comprehensive suite of programs and databases for comparative analysis of genomic sequences.
GOLD – Genomes Online Database, a web resource for comprehensive access to information on genome and metagenome sequencing projects and their associated metadata.
MITOMAP – A human mitochondrial genome database.

Genome Analysis

GeneCensus – Genome comparisons in terms of metabolic pathway activity and protein family sharing.
GWAS Catalog – The NHGRI-EBI Catalog of human genome-wide association studies.
UCSC Xena – An online exploration tool for public and private, multi-omic and clinical/phenotype data.

Gene Expression and Regulation Databases

Gene Expression Databases

GENT2 – Gene expression database for normal and tumor tissues.
GEO@NCBI – The Gene Expression Omnibus repository contains individual gene expression profiles from curated DataSets.
Allen Brain Atlas – Gene expression and neuroanatomical data.
TCGA – The Cancer Genome Atlas provides tools for visualizing, querying and downloading the data released quarterly by the consortium’s member projects.
CellMiner – A database and query tool designed for cancer research.
Expression Atlas – Provides information about gene and protein expression.

Gene Regulation Databases

miRBase – The microRNA database is a searchable database of published miRNA sequences and annotation.
TRANSFAC – Provides data on eukaryotic transcription factors, their experimentally-proven binding sites, consensus binding sequences (positional weight matrices) and regulated genes.
DBTSS – Database of Transcriptional Start Sites.
ENCODE – A public research consortium aimed at identifying all functional elements in the human and mouse genomes.

Protein Structure Databases

Protein 3D Structure Databases

PDB – Protein Data Bank archive of information about the 3D shapes of proteins, nucleic acids, and complex assemblies.
Structure@NCBI – Protein 3D structure repository at NCBI.
PDBe@EBI – The EBI macromolecular structure database.
PDBSum@EBI – The PDB summary database at EBI.
MMDB@NCBI – The macromolecular database maintained at NCBI.
BMRB – The biological magnetic resonance data bank.
SCOP – Structural Classification of Proteins aims to provide a comprehensive description of the structural and evolutionary relationships between all known protein structures.
CATH – The database of Class, Architecture, Topology and Homologous superfamily.

Databases of protein domain, function, expression and family

Protein Domain Databases

InterPro – A resource that provides functional analysis of protein sequences.
CDD – A database of conserved protein domains.
ProDom – A comprehensive set of protein domain families automatically generated from the UniProt Knowledgebase.
SMART – Simple Modular Architecture Research Tool. It allows the identification and annotation of genetically mobile domains and the analysis of domain architectures.
HPA – The human protein atlas shows expression and localization of proteins in a large variety of normal human tissues, cancer cells and cell lines with the aid of immunohistochemistry.

Protein Family Databases

PFam – A large collection of protein families.
PROSITE – A database of protein families and domains.
RFam – Database of RNA families, represented by multiple sequence alignments, consensus secondary structures and covariance models.
DFam – Database of Transposable Element DNA sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations.
TreeFam – Database composed of phylogenetic trees inferred from animal genomes.

Interaction and Pathway databases

Protein Interaction Databases

STRING@EMBL – A web server for protein-protein interaction.
BioGRID – Database of Protein, Genetic and Chemical Interactions
STITCH@EMBL – A web server for chemical-protein interaction.
REACTOME – An open-source, open access, manually curated and peer-reviewed pathway database.
DAVID – Database for Annotation, Visualization and Integrated Discovery

Pathway Databases

KEGG – A collection of manually drawn pathway maps.
PathGuide – A meta-database that provides an overview of more than 190 web-accessible biological pathway and network databases.
Pathway Commons – A collection of publicly available pathway information from multiple organisms.
PhosphoSitePlus – Comprehensive information and tools for the study of protein post-translational modifications.
METscout – A database that brings together metabolism and gene expression landscapes.

Metabolite Databases

HMDB – Human Metabolome Database.
KEGG LIGAND Database – Database for universe of chemical substances and reactions that are relevant to life.
KNApSAcK – A Comprehensive Species-Metabolite Relationship Database.
LIPID MAPS – LIPID Metabolites And Pathways Strategy. Provides access to lipid nomenclature, databases, tools, protocols, standards, tutorials, meetings, publications, and other resources.
MassBank – High Quality Mass Spectral Database.
MetaCyc – A curated database of experimentally elucidated metabolic pathways from all domains of life.
METLIN – A repository of metabolite information and tandem mass spectrometry data designed to facilitate metabolite identification in metabolomics.

Specialized Databases

Bacterial Genome Databases

PATRIC – The Pathosystems Resource Integration Center provides integrated data and analysis tools to support biomedical research on bacterial infectious diseases.
BacDive – The Bacterial Diversity Metadatabase is the world’s largest database for standardized bacterial information.

Virus Genome Databases

Viral Genomes – Viral genome information resource at NCBI.
GISAID – Global Initiative on Sharing All Influenza Data.
NCBI Flu – Influenza Virus Resource with influenza genomic data and analysis tools.
Plant Viruses – This site provides a central source of information about viruses, viroids and satellites of plants, fungi and protozoa.

Microbial Databases

ECMDB – E. coli Metabolome Database of small molecular metabolites found or produced by Escherichia coli (strain K12, MG1655).
IMG – Integrated Microbial Genomes system serves as a resource for analysis and annotation of genome and metagenome datasets in a comprehensive comparative context.
LoQAtE – The localization and quantitation atlas of the yeast proteome.

Plant Databases

PlantTFDB – The database of plant transcription factors.
TAIR – The Arabidopsis Information Resource (TAIR) is a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana.
Araport – A web server for Arabidopsis thaliana genomics.
IC4R – A curated database providing rice genome sequences, updating rice gene annotations and integrating multiple omics data through community-contributed modules.
Oryzabase – A comprehensive rice science database.
MaizeGDB – Maize Genetics and Genomics Database
SoyBase – Integrating Genetics and Molecular Biology for Soybean Researchers.
SGN – Solanaceae Genomics Network is a data resource for Solanaceae species including tomato, potato, pepper, eggplant, petunia and Nicotiana.
CuGenDB – The web resource for the International Cucurbit Genomics Initiative, covering melon, cucumber, watermelon, pumpkin, etc.
GDR – Genome Database for Rosaceae which provides data mining tools and publicly available genomics, genetics and breeding data for Rosaceae.
GoMapMan – Resource for gene functional annotations in the plant sciences.
NPACT – A curated database of plant derived natural compounds that exhibit anti-cancerous activity.
PGDD – A database used to identify and catalog plant genes in terms of intragenome or cross-genome syntenic relationships.
PIECE – A plant gene structure comparison and evolution database of 25 species.
PlantRNA – Database for tRNA sequences of plants and algae.
PlnTFDB – Plant Transcription Factor Database provides putatively complete sets of transcription factors (TFs) and other transcriptional regulators in completely sequenced plants.
PMRD – Plant microRNA Database integrates publicly available plant miRNA data.
SALAD – Motif-based database of protein annotations for plant comparative genomics.

Model Organism Databases

MGI – International database resource for the laboratory mouse.
RGD – Rat Genome Database. Integrates genetic, genomic, phenotype, and disease-related data generated from rat research.
XenBase – Integrates all the diverse biological, genomic, genotype and phenotype data available from Xenopus research.
ZFIN – The zebrafish model organism database.
FlyBase – Primary repository of Drosophila genes and genomes.
OnTheFly – A database of Drosophila melanogaster transcription factor DNA binding specificities.
FlyAtlas – The Drosophila gene expression atlas.
WormBase – Integrates information concerning the genetics, genomics and biology of C. elegans and related nematodes.
SGD – The Saccharomyces Genome Database
BDGP – Berkeley Drosophila Genome Project
BeeBase – Comprehensive sequence data source for the bee research community.
PomBase – A comprehensive database of Schizosaccharomyces pombe.
AtMAD – Arabidopsis thaliana Multi-omics Association Database.
ZInc – Database on zebrafish mutations.
OikoBase – A curated genome expression database of Oikopleura dioica.

Invertebrate Vectors of Human Pathogens Database

VectorBase – Database of Invertebrate Vectors of Human Pathogens. Includes reference and variant genome sequence, structural and functional annotations, and phenotypic and population data for traits such as insecticide resistance.

Disease-Specific Databases

AudGenDB – The Audiological and Genetic Database
EDKB – Endocrine Disruptor Knowledge Base
HGMD – The Human Gene Mutation Database
NIAID – National Institute of Allergy and Infectious Diseases
OMIM – Online Mendelian Inheritance in Man. An Online Catalog of Human Genes and Genetic Disorders
PC-GDB – The pancreatic cancer gene database. Latest information on genes causing pancreatic cancer.
Pancreatic Cancer Database – Resource of experimentally demonstrated molecular alterations associated with pancreatic cancer in cancer tissues or cancer cell lines.
