4. Biological Databases
Biological databases are digital repositories of biological data,
such as DNA sequences, protein sequences, protein structures,
and gene expression data.
They are used by scientists to store, organize, and analyze
biological data to advance our understanding of life.
Biological databases are essential for biological research.
They provide scientists with a wealth of data to mine for
insights into the molecular basis of life.
6. Nucleic Acid Databases
A nucleotide database is a comprehensive repository of
genetic information that is designed to store and organize
nucleotide sequences that are derived from both DNA and RNA
molecules.
The NCBI Nucleotide database is a collection of sequences from
several sources, including GenBank, RefSeq, TPA and PDB.
The Nucleic Acid Database (NDB) (Berman et al., 1992)
was established in 1991 as a resource for specialists in the
field of nucleic acid structure.
It is a centralized platform for storing and accessing
structural information and annotations related to nucleic
acids.
9. Primary Databases
Primary databases are populated with experimentally
derived data such as nucleotide sequence, protein sequence or
macromolecular structure.
Experimental results are submitted directly into the
database by researchers, and the data are essentially archival
in nature.
Once given a database accession number, the data in
primary databases are never changed: they form part of the
scientific record.
10. GenBank
The GenBank sequence database is an open access, annotated collection of all
publicly available nucleotide sequences and their protein translations.
This database is produced at the National Center for Biotechnology
Information (NCBI) as part of the International Nucleotide Sequence Database
Collaboration (INSDC).
The NCBI is located in Bethesda, Maryland, and was founded in 1988
through legislation sponsored by US Congressman Claude Pepper.
GenBank and its collaborators receive sequences produced in laboratories
throughout the world from more than 100,000 distinct organisms.
GenBank continues to grow at an exponential rate, doubling every 18 months.
GenBank is built by direct submissions from individual laboratories, as well as
from bulk submissions from large-scale sequencing centers, and submissions
from private individuals.
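As an illustration of how a public GenBank record can be retrieved programmatically, the sketch below builds a request URL for NCBI's E-utilities EFetch service. It is a minimal example, not a full client: the helper name genbank_fetch_url is hypothetical, the accession U49845 is only a sample value, and no network request is made.

```python
from urllib.parse import urlencode

# Base URL of NCBI's E-utilities EFetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_fetch_url(accession: str, rettype: str = "gb") -> str:
    """Build an EFetch URL for one nucleotide record (hypothetical helper).

    rettype="gb" asks for the GenBank flat file, rettype="fasta" for FASTA.
    """
    params = {
        "db": "nuccore",    # the NCBI Nucleotide database
        "id": accession,    # accession number of the record
        "rettype": rettype,
        "retmode": "text",  # plain-text response
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# U49845 is used here only as an example accession.
url = genbank_fetch_url("U49845")
print(url)
```

Opening the printed URL in a browser, or passing it to an HTTP client, should return the record as text. Heavy or automated use is expected to follow NCBI's usage guidelines.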
12. EMBL
EMBL’s European Bioinformatics Institute (EMBL-EBI),
founded in 1994, provides freely available data,
bioinformatics services and training to the life science
community worldwide.
EMBL-EBI is part of the European Molecular Biology Laboratory (EMBL),
an intergovernmental research organisation funded by over 20 member
states, prospect and associate member states.
EMBL-EBI’s co-location with the Wellcome Sanger Institute offers vital
synergies. EMBL-EBI maintains the world’s most comprehensive range of
molecular data resources, developed in collaboration with colleagues worldwide,
and open to all.
EMBL is also actively engaged in developing its discoveries to benefit
society. EMBL’s technology transfer partner, EMBLEM, identifies,
protects, and commercialises the intellectual property developed at EMBL,
as well as by EMBL alumni and third parties.
14. DDBJ
The DNA Data Bank of Japan is a public database
of nucleotide sequences established at the
National Institute of Genetics (NIG).
Since 1987, the DDBJ has been collecting annotated nucleotide
sequences as its traditional database service.
The principal purpose of DDBJ operations is to improve the quality of the
International Nucleotide Sequence Database (INSD) as a public-domain
resource. When researchers release their data through INSD for worldwide
sharing, the DDBJ Center works to describe those data as richly as
possible, following the unified rules of INSD, while keeping submission
through DDBJ as easy as possible for researchers.
DDBJ is the only INSDC member nucleotide sequence archive in Asia.
16. INSDC
The International Nucleotide Sequence Database Collaboration
(INSDC) is a long-standing foundational initiative that operates
between DDBJ, EMBL-EBI and NCBI.
INSDC covers the full spectrum of data, from raw reads, through alignments
and assemblies, to functional annotation, enriched with contextual
information relating to samples and experimental configurations.
The INSDC members work together to ensure that all public
domain nucleotide sequence data deposited in the archives is
preserved as part of the scientific record and is accessible in
standardized formats across the three sites through daily data
exchange.
18. Applications of Nucleic Acid Databases
Nucleotide databases are used to identify the gene or the function of a
particular nucleotide sequence by comparing an unknown sequence with the
known sequences in the database.
Nucleotide databases can be used to study and examine gene expression by
using the sequence information stored in the databases.
Nucleotide databases are also used to identify potential drug targets and
develop new therapies for genetic diseases.
Nucleotide databases also help in identifying genetic variations that may be
linked to diseases, which ultimately helps in the development of diagnostic
tools and treatments.
Nucleotide databases can be used in phylogenetic analysis to analyze the
evolutionary relationships between organisms, by comparing and examining
their DNA or RNA sequences.
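Two of the applications above, identifying an unknown sequence and comparing sequences for phylogenetic analysis, both rest on measuring how similar two sequences are. The sketch below illustrates the idea with the p-distance, the fraction of positions that differ. It assumes the sequences are already aligned to the same length, and the three "database" entries are made-up toy sequences, not real records. Real database searches use dedicated tools such as BLAST rather than this brute-force comparison.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


# Toy "database": invented sequences used only for illustration.
known = {
    "gene_A": "ATGGCCATTG",
    "gene_B": "ATGGCGATAC",
    "gene_C": "TTGACCGTAA",
}

unknown = "ATGGCCATTC"

# Distance from the unknown sequence to every known sequence.
distances = {name: p_distance(unknown, seq) for name, seq in known.items()}

# The closest known sequence is the best candidate identification.
best_match = min(distances, key=distances.get)
print(best_match, distances)
```

Here the unknown sequence differs from gene_A at a single position, so gene_A is reported as the closest match. Computing the same distance for every pair of sequences would give a distance matrix, the usual input to distance-based tree-building methods.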