What is shotgun sequencing?

Shotgun sequencing is a DNA sequencing method in which a long stretch of DNA is physically broken into small fragments (approximately 2,000 base pairs) that are cloned, sequenced, and then assembled by computer analysis. The whole-genome version of the method was developed and made famous by Craig Venter of Celera Genomics. Venter applied the technique to whole genomes in the mid-1990s while working at The Institute for Genomic Research (TIGR).

Venter founded Celera in 1998 with the goal of sequencing the human genome in three years. This put Celera in direct competition with the Human Genome Project, an already operational public consortium of universities and research centers that was sequencing the human genome with an older strategy called map-based, or BAC-to-BAC, sequencing. That method first divides the genome into pieces of about 150,000 base pairs, each cloned in a bacterial artificial chromosome (BAC), then maps the BACs to determine their order along the genome, and finally sequences each BAC in detail.

Whole-genome shotgun sequencing bypasses the creation and mapping of BACs and starts directly with DNA sequencing. The process begins with a high-molecular-weight DNA sample from the organism of interest, which is physically broken into small pieces either by forcing it through a narrow-bore syringe or by sonication, which shears the DNA with sound waves. Because the sample contains many copies of the genome and each copy breaks at different random positions, fragments from different copies overlap one another. Shearing does not specifically produce the 2,000-base-pair fragments needed for sequencing; instead, fragments of the desired size must be purified from the mixture.
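The shearing and size-selection step can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names (shear, size_select), the exponential model of fragment lengths, the 1,500 to 2,500 base-pair window, and the toy 50,000-base genome are all invented for illustration and do not describe any real laboratory protocol.

```python
import random

def shear(genome, n_copies, mean_len, rng):
    """Break n_copies of the genome at random positions (illustrative model)."""
    fragments = []
    for _ in range(n_copies):
        pos = 0
        while pos < len(genome):
            # Assumption: fragment lengths are exponentially distributed.
            length = max(1, int(rng.expovariate(1.0 / mean_len)))
            fragments.append(genome[pos:pos + length])
            pos += length
    return fragments

def size_select(fragments, lo, hi):
    """Keep only fragments whose length falls in the desired window."""
    return [f for f in fragments if lo <= len(f) <= hi]

rng = random.Random(0)  # fixed seed so the run is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(50_000))  # toy genome
fragments = shear(genome, n_copies=20, mean_len=2000, rng=rng)
selected = size_select(fragments, lo=1500, hi=2500)
print(len(fragments), "fragments produced,", len(selected), "kept after size selection")
```

Because every copy of the genome is broken at different positions, the selected fragments come from overlapping regions, which is what later makes assembly possible.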

The next step is to join the DNA fragments to carrier DNA called a vector. This process is known as cloning, and it creates a sequencing library from which the complete genome sequence will be reconstructed. The sequence of each clone in the library is determined, and computer analysis is used to find sequences that overlap between fragments. Merging the overlapping sequences creates a "contig", a long contiguous stretch of DNA sequence.
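The idea of merging overlapping sequences into a contig can be sketched with a simple greedy procedure: repeatedly find the pair of reads with the longest suffix-to-prefix overlap and merge them. This is a teaching sketch, not the algorithm used by Celera or by any particular assembler; the function names, the min_len threshold, and the toy reads are assumptions, and real assemblers must also handle sequencing errors, reverse-complement strands, and repeats.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b, or 0 if below min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by best overlap; returns the list of resulting contigs."""
    reads = list(reads)
    while True:
        # Drop exact duplicates and reads wholly contained in another read.
        reads = list(dict.fromkeys(reads))
        reads = [r for r in reads if not any(r != s and r in s for s in reads)]
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return reads  # nothing left to merge; each entry is a contig
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)

# Toy reads (hypothetical) taken from the sequence ATGCGTACCTGATTAG.
contigs = greedy_assemble(["ATGCGTAC", "GTACCTGA", "CTGATTAG"])
print(contigs)
```

When two groups of reads share no overlap, the procedure returns more than one contig, which corresponds to a gap in the assembly.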

Shotgun cloning will usually leave some gaps between contigs because some sequences are missing from the library by chance. Gaps can be filled by creating a new library or by using known sequences at the ends of a contig to extend outward from it. Because shotgun sequencing samples DNA fragments at random, most regions of the genome are sequenced several times over, which gives greater certainty that the sequence is correct than if each region had been sequenced only once or twice.
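The relationship between random sampling, redundancy, and gaps can be estimated with a standard idealized model (often associated with Lander and Waterman). Assuming reads land uniformly at random, the average coverage is c = N × L / G for N reads of length L on a genome of length G, the chance that a given base is missed is about e^(-c), and the expected number of contigs is roughly N × e^(-c) when the minimum overlap needed to join reads is ignored. The sketch below uses hypothetical numbers and function names chosen for illustration.

```python
import math

def coverage(n_reads, read_len, genome_len):
    """Average number of times each base is sequenced (c = N * L / G)."""
    return n_reads * read_len / genome_len

def expected_fraction_covered(c):
    """Expected fraction of the genome hit by at least one read (Poisson approximation)."""
    return 1.0 - math.exp(-c)

def expected_contigs(n_reads, c):
    """Approximate expected number of contigs, ignoring minimum overlap length."""
    return n_reads * math.exp(-c)

# Hypothetical project: 1,000,000-base genome, 500-base reads.
genome_len, read_len = 1_000_000, 500
for n_reads in (2_000, 10_000, 20_000):
    c = coverage(n_reads, read_len, genome_len)
    print(f"coverage {c:.0f}x: "
          f"{expected_fraction_covered(c):.4%} covered, "
          f"about {expected_contigs(n_reads, c):.1f} contigs expected")
```

Under this model, raising coverage shrinks the number of gaps quickly but never guarantees zero, which is consistent with the need for a separate gap-filling step.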

The human genome was sequenced both by the Human Genome Project, using map-based sequencing, and by Celera, using shotgun sequencing. Shotgun sequencing is now the method of choice for sequencing other genomes. The complete genomes of many organisms, such as the plant Arabidopsis thaliana, rice, cow, dog, chicken, chimpanzee, rat, mouse, puffer fish, and many microorganisms, have been sequenced in this way.
