List of biological databases

Biological databases are stores of biological information.[1] The journal Nucleic Acids Research regularly publishes special issues on biological databases and maintains a list of such databases. The 2018 issue lists about 180 such databases and updates to previously described databases.[2]

Meta databases

Meta databases are databases of databases: they collect data about data in order to generate new data. They can merge information from different sources and make it available in a new and more convenient form, or with an emphasis on a particular disease or organism.
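As a rough illustration of the merging step described above, the following Python sketch combines records from two sources that share an identifier. The source names, field names and records are hypothetical placeholders and do not reflect the schema of any real database.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge per-source records into one combined record per identifier.

    `sources` maps a source name to {identifier: {field: value}}.
    Each merged value is stored together with the source it came from,
    so conflicting values stay visible instead of being overwritten.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for identifier, fields in records.items():
            for field, value in fields.items():
                merged[identifier].setdefault(field, {})[source_name] = value
    return dict(merged)


# Hypothetical example data; not taken from any real database.
sources = {
    "sequence_db": {"GENE1": {"length": 1200, "organism": "Homo sapiens"}},
    "disease_db": {
        "GENE1": {"disease": "example disorder"},
        "GENE2": {"disease": "another disorder"},
    },
}

combined = merge_sources(sources)
print(combined["GENE1"])
```

Keeping the source name next to every value is one possible design choice; a real meta database may instead reconcile conflicting values or rank sources by reliability.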

Model organism databases

Model organism databases provide in-depth biological data for intensively studied model organisms.

Nucleic acid databases

DNA databases

Primary databases
The International Nucleotide Sequence Database (INSD) consists of the following databases.

DDBJ (Japan), GenBank (USA) and the European Nucleotide Archive (Europe) are repositories for nucleotide sequence data from all organisms. All three accept nucleotide sequence submissions and then exchange new and updated data daily to stay synchronised with one another. These three databases are primary databases, as they house original sequence data. They collaborate with the Sequence Read Archive (SRA), which archives raw reads from high-throughput sequencing instruments.
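The daily exchange can be pictured with a toy model in which each archive holds versioned records and, after one round, all archives hold the newest version of every record. This Python sketch rests on assumptions: the accession numbers, versions and sequences are invented, and the actual exchange mechanism used by the three archives is not described here.

```python
def synchronise(archives):
    """Return every archive's holdings after one exchange round.

    `archives` maps an archive name to {accession: (version, sequence)}.
    For each accession, the highest version seen in any archive wins.
    """
    shared = {}
    for records in archives.values():
        for accession, (version, sequence) in records.items():
            if accession not in shared or version > shared[accession][0]:
                shared[accession] = (version, sequence)
    # After the exchange, every archive holds the same set of records.
    return {name: dict(shared) for name in archives}


# Hypothetical holdings before the exchange (made-up accessions).
archives = {
    "DDBJ": {"X00001": (1, "ACGT")},
    "GenBank": {"X00001": (2, "ACGTT"), "X00002": (1, "GGCC")},
    "ENA": {},
}

synced = synchronise(archives)
print(synced["ENA"])
```

In this model a submission made to any one archive becomes visible in the other two after the next round, which is the property the paragraph above describes.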

Secondary databases

  • 23andMe's database
  • HapMap
  • OMIM (Online Mendelian Inheritance in Man): inherited diseases
  • RefSeq
  • 1000 Genomes Project: launched in January 2008. The genomes of more than a thousand anonymous participants from a number of different ethnic groups were analyzed and made publicly available.
  • SNP / Disease Databases

Gene expression databases (mostly microarray data)

Genome databases

These databases collect genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. These databases may hold the genomes of many species or the genome of a single model organism.

Phenotype databases

  • PHI-base: pathogen-host interaction database. It links gene information to phenotypic information from microbial pathogens on their hosts. Information is manually curated from peer-reviewed literature.
  • Rat Genome Database (RGD): genomic and phenotype data for Rattus norvegicus

RNA databases

Amino acid / protein databases

Protein sequence databases

Protein structure databases

  • Protein Data Bank (PDB), comprising:
    • Protein Data Bank in Europe (PDBe)[10]
    • Protein Data Bank Japan (PDBj)[11]
    • Research Collaboratory for Structural Bioinformatics (RCSB)[12]

For more protein structure databases, see also Protein structure database.

Protein model databases

  • ModBase: database of comparative protein structure models (Sali Lab, UCSF)
  • Protein Model Portal (PMP): meta database that combines several databases of protein structure models (Biozentrum, Basel, Switzerland)
  • Similarity Matrix of Proteins (SIMAP): database of protein similarities computed using FASTA
  • SWISS-MODEL: server and repository for protein structure models

Protein-protein and other molecular interactions

Signal transduction pathway databases

Metabolic pathway and protein function databases

Additional databases

Exosomal databases

  • ExoCarta
  • exoRBase: circular RNA (circRNA), long non-coding RNA (lncRNA) and messenger RNA (mRNA) derived from RNA-seq data analyses of human blood exosomes[14]

Mathematical model databases

Taxonomic databases

  • BacDive: bacterial metadatabase that provides strain-linked information about bacterial and archaeal biodiversity, including taxonomy information
  • EzTaxon-e: database for the identification of prokaryotes based on 16S ribosomal RNA gene sequences

Radiologic databases

Wiki-style databases

Specialized databases

References

  1. Wren JD, Bateman A (October 2008). "Databases, data tombs and dust in the wind". Bioinformatics. 24 (19): 2127–8. doi:10.1093/bioinformatics/btn464. PMID 18819940.
  2. "Volume 46 Issue D1 | Nucleic Acids Research | Oxford Academic". academic.oup.com. Retrieved 2018-09-04.
  3. Blake, J. A. (2002-01-01). "The Mouse Genome Database (MGD): the model organism database for the laboratory mouse". Nucleic Acids Research. 30 (1): 113–115. doi:10.1093/nar/30.1.113. ISSN 1362-4962.
  4. ArrayExpress
  5. GEO
  6. GTEx
  7. Dash S, Campbell JD, Cannon EK, Cleary AM, Huang W, Kalberer SR, Karingula V, Rice AG, Singh J, Umale PE, Weeks NT, Wilkey AP, Farmer AD, Cannon SB (January 2016). "Legume information system (LegumeInfo.org): a key component of a set of federated data resources for the legume family". Nucleic Acids Research. 44 (D1): D1181–8. doi:10.1093/nar/gkv1159. PMC 4702835. PMID 26546515.
  8. "Saccharomyces Genome Database | SGD". www.yeastgenome.org. Retrieved 2018-09-04.
  9. "RNAcentral: a comprehensive database of non-coding RNA sequences". Nucleic Acids Research. 45 (D1): D128–D134. January 2017. doi:10.1093/nar/gkw1008. PMID 27794554.
  10. Mir S, Alhroub Y, Anyango S, Armstrong DR, Berrisford JM, Clark AR, Conroy MJ, Dana JM, Deshpande M, Gupta D, Gutmanas A, Haslam P, Mak L, Mukhopadhyay A, Nadzirin N, Paysan-Lafosse T, Sehnal D, Sen S, Smart OS, Varadi M, Kleywegt GJ, Velankar S (January 2018). "PDBe: towards reusable data delivery infrastructure at protein data bank in Europe". Nucleic Acids Research. 46 (D1): D486–D492. doi:10.1093/nar/gkx1070. PMID 29126160.
  11. Kinjo AR, Bekker GJ, Suzuki H, Tsuchiya Y, Kawabata T, Ikegawa Y, Nakamura H (January 2017). "Protein Data Bank Japan (PDBj): updated user interfaces, resource description framework, analysis tools for large structures". Nucleic Acids Research. 45 (D1): D282–D288. doi:10.1093/nar/gkw962. PMID 27789697.
  12. Rose PW, Prlić A, Altunkaya A, Bi C, Bradley AR, Christie CH, et al. (January 2017). "The RCSB protein data bank: integrative view of protein, gene and 3D structural information". Nucleic Acids Research. 45 (D1): D271–D281. doi:10.1093/nar/gkw1000. PMC 5210513. PMID 27794042.
  13. Basse MJ, Betzi S, Bourgeas R, Bouzidi S, Chetrit B, Hamon V, Morelli X, Roche P (January 2013). "2P2Idb: a structural database dedicated to orthosteric modulation of protein-protein interactions". Nucleic Acids Research. 41 (Database issue): D824–7. doi:10.1093/nar/gks1002. PMID 23203891.
  14. Li S, Li Y, Chen B, Zhao J, Yu S, Tang Y, Zheng Q, Li Y, Wang P, He X, Huang S (January 2018). "exoRBase: a database of circRNA, lncRNA and mRNA in human blood exosomes". Nucleic Acids Research. 46 (D1): D106–D112. doi:10.1093/nar/gkx891. PMID 30053265.
  15. (IHEC) data portal
  16. CEEHRC
  17. Blueprint
  18. EGA
  19. DEEP
  20. CREST
  21. "Sharing epigenomes globally". Nature Methods. 15 (3): 151. 2018. doi:10.1038/nmeth.4630. ISSN 1548-7105.
  22. dbGaP
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.