Bioinformatic databases

Because of the volume and structure of the data produced by biological and medical experiments, results are commonly published as online databases. New bioinformatic databases containing the results of scientific research appear on the Internet every year, and most of them are regularly updated and curated.

Currently the largest resource of this type is the collection of databases maintained by the NCBI (National Center for Biotechnology Information), which contains nucleotide and amino acid sequences, gene annotations, raw sequencing results, and other data. It has been maintained and developed for decades under the National Institutes of Health of the United States.
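As an illustration of how such a resource can be accessed programmatically, the following is a minimal sketch that builds a request for NCBI's public E-utilities `efetch` endpoint and parses a FASTA response. The helper names (`build_efetch_url`, `parse_fasta`, `fetch_fasta`) are hypothetical and introduced only for this example, and the accession used in the usage note is merely illustrative; the sketch assumes network access and does not handle rate limits or API keys.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public E-utilities endpoint for retrieving records by identifier.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nucleotide", rettype="fasta"):
    """Build an efetch URL for one accession (hypothetical helper)."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EUTILS_EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not look like FASTA")
    header = lines[0][1:]
    seq_lines = []
    for ln in lines[1:]:
        if ln.startswith(">"):  # start of the next record: stop here
            break
        seq_lines.append(ln)
    return header, "".join(seq_lines)


def fetch_fasta(accession):
    """Download one record and parse it (requires network access)."""
    with urlopen(build_efetch_url(accession), timeout=30) as resp:
        return parse_fasta(resp.read().decode("utf-8"))
```

Calling `fetch_fasta("NM_000546")`, for example, would request a single nucleotide record and return its header line and concatenated sequence; for anything beyond occasional queries, NCBI's usage guidelines for E-utilities should be consulted.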

Another prominent bioinformatic resource is the UCSC database (named after the University of California, Santa Cruz), where the genome sequences of many organisms are regularly updated, including the human reference genome assemblies hg18, hg19, and hg38. It provides up-to-date information on the reference sequence of the human genome and on the genomes of other organisms.

Additionally, there are numerous smaller, yet highly specialized databases, such as:

  • HGMD and PGMD, where identified genetic variants are linked to disease entities and to additional information about available pharmacotherapies.
  • GreenGenes and SILVA, widely used in the microbiological sciences, including bioinformatic analyses of bacteria (16S metagenomics). They contain, among other things, sequence sets assigned to all taxonomic levels, as well as information about the phylogenetic origin of organisms and their relatedness.
  • In the field of chemical sciences, we can distinguish the EMBL databases containing nucleic acid sequences, PDB containing three-dimensional structures of proteins and other biological macromolecules, and the KEGG database with metabolic pathways.
  • And many others.

