Module 3: DNA Databases and Sequence Queries


Human Genome Project (HGP)


Humans carry ~3,000 megabases (i.e., 3 x 10^9 bases or base pairs) of DNA in each of their cells. In 1990, the US Department of Energy and the National Institutes of Health established the Human Genome Project (HGP) as a 15-year program to determine the complete nucleotide sequence of the human genome. As part of the HGP, 16 institutions known as "Genome Centers," along with other sites around the world, are sequencing almost all of the 80,000 - 100,000 human genes (this number is debatable). The rate of sequencing has increased almost exponentially since the beginning of the project, and the latest estimate for its completion has been moved up from 2005 to 2003. Advances in sequencing software and hardware, together with the increased participation of industrial collaborators, have accelerated the project timeline.

On June 26, 2000, the publicly funded organizations of the HGP (the International Human Genome Sequencing Consortium) and Celera Genomics, a private company, jointly announced the completion of a "working draft" of the human genome (see the press release of the White House announcement). This working draft consists of overlapping fragments covering 97% of the human genome; sequence has already been assembled for approximately 85% of the genome. The data set contains redundant sequences, and gaps remain between some of the overlapping fragments. As of August 20, 2000, 23.6% of the non-redundant human DNA sequence has been determined and cataloged in the major DNA data banks (HGP weekly updates). These data include the complete sequences of chromosomes 21 and 22, two of the shortest in the human genome. Analysis of the current sequence has shown 38,000 predicted genes confirmed by experimental evidence.
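To give a feel for the scale behind these percentages, the following minimal Python sketch converts them into approximate base counts. It assumes the genome is exactly 3,000 megabases (the text gives this only as an approximation), and the names `GENOME_SIZE_BASES`, `SHARES_IN_TENTHS_OF_PERCENT`, and `approx_bases` are hypothetical, chosen only for this illustration.

```python
# Assumption: genome size taken as exactly 3,000 megabases (text says "~3,000").
GENOME_SIZE_BASES = 3_000 * 10**6

# Percentages quoted in the text, stored as tenths of a percent so the
# arithmetic stays in integers (for example, 23.6% is stored as 236).
SHARES_IN_TENTHS_OF_PERCENT = {
    "covered by overlapping fragments": 970,
    "assembled sequence": 850,
    "non-redundant sequence in the data banks": 236,
}


def approx_bases(tenths_of_percent, genome_size=GENOME_SIZE_BASES):
    """Return the approximate number of bases in the given share of the genome."""
    return genome_size * tenths_of_percent // 1000


if __name__ == "__main__":
    for label, share in SHARES_IN_TENTHS_OF_PERCENT.items():
        megabases = approx_bases(share) // 10**6
        print(f"{label}: ~{megabases:,} Mb")
```

Under these assumptions, 97% coverage corresponds to roughly 2,910 Mb, 85% to roughly 2,550 Mb, and 23.6% to roughly 708 Mb. These are order-of-magnitude illustrations only, not figures reported by the HGP.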



The Southwest Biotechnology and Informatics Center WWW server is located at "".
Please send comments and suggestions to: [email protected]
© SWBIC 2001