Module 3: DNA Databases and Sequence Queries
There are three major DNA databases spread across the globe, as part of the
International Nucleotide Sequence Database Collaboration. They hold the published
(and sometimes unpublished) sequence data arising from sequencing efforts around the world. All three are interconnected
via the World Wide Web and exchange data daily. Most journals, in any field of biological research, require
authors of articles containing sequences to submit their data to one of these databases. The publications then refer to the relevant
sequence by a reference number (the accession number) generated by the
database. This makes it simple and globally uniform for any researcher in the world to access the submitted sequence(s).
It is worth mentioning, at this point, that all these databases have corresponding protein data banks related to the stored DNA sequences.
- GenBank (USA): This database can be accessed from the
National Center for Biotechnology Information (NCBI), which is a division of the National
Library of Medicine, funded by the National Institutes of Health (NIH). As of August 2000, it held approximately 9.546 billion
bases of sequence in 8.214 million sequence records. Once every two months, the NCBI compiles all the
available sequence entries and publishes them as a new release.
- EMBL (Europe): The European site for storage of DNA sequences is at the
European Bioinformatics Institute (EBI) of the European Molecular Biology Laboratory
(EMBL) at Hinxton, United Kingdom.
- DDBJ (Japan): The DNA Data Bank of Japan (DDBJ) is the third major location for storing
DNA data in the world.
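
As a rough illustration of how an accession number gives uniform access to a submitted sequence, the sketch below builds a retrieval URL for a single record and, optionally, downloads it. It assumes the NCBI E-utilities EFetch service and its "nuccore" nucleotide database, neither of which is described in this module; the function names and the sample accession U49845 are chosen purely for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: NCBI's E-utilities EFetch endpoint (not described in this module).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="fasta"):
    """Return a URL that requests one sequence record by accession number."""
    accession = accession.strip()
    if not accession:
        raise ValueError("accession number must not be empty")
    params = {
        "db": db,            # assumed database name for nucleotide records
        "id": accession,     # the accession number cited in the publication
        "rettype": rettype,  # "fasta" for sequence only, "gb" for a full flat file
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_record(accession, rettype="fasta", timeout=30):
    """Download the record as text. Requires network access."""
    with urlopen(build_efetch_url(accession, rettype=rettype), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # U49845 is used here only as an illustrative accession number.
    print(build_efetch_url("U49845"))
```

Because the three databases exchange data daily, the same accession number should identify the same record regardless of which site it was originally submitted to; only the retrieval interface differs from site to site.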