10. DNA SEQUENCING

References

Maxam, A. & Gilbert, W. (1977).
A new method of sequencing DNA.
Proceedings of the National Academy of Sciences, USA, 74, 560-4.

Sanger, F., Nicklen, S. & Coulson, A.R. (1977).
DNA sequencing with chain-terminating inhibitors.
Proceedings of the National Academy of Sciences, USA, 74, 5463-7.

Textbooks covering this topic

Brown. Gene Cloning & DNA Analysis.

Primrose et al. Principles of Gene Manipulation.

Stryer. Biochemistry.


DNA or gene sequencing is the determination of the order of bases (nucleotides) in a sample of DNA. It is the reading of the genetic code. However, not all DNA sequences are genes (i.e. coding regions) as there may, depending on the organism and the source of the DNA sample, also be promoters, tandem repeats, introns, etc.

Gene sequencing = DNA sequencing = nucleotide sequencing = base sequencing.

Two methods for the large-scale sequencing of DNA became available in the late 1970s. Remarkably, the two methods were published in the same year (1977) in the same journal. The two methods are:

1.    Maxam-Gilbert sequencing (chemical cleavage method using double-stranded (ds) DNA).

2.    Sanger-Coulson sequencing (chain termination method using single-stranded (ss) DNA).

Nowadays Sanger-Coulson is the more popular method. Various modifications have been developed and it has been automated for very large-scale sequencing, e.g. the Human Genome Project.


MAXAM-GILBERT SEQUENCING

This chemical cleavage method uses double-stranded DNA samples and so does not require cloning of DNA into an M13 phage vector to produce single-stranded DNA as is the case with the Sanger-Coulson method. It involves modification of the bases in DNA followed by chemical base-specific cleavage.

Stages:

  1. Double-stranded DNA to be sequenced is labelled by attaching a radioactive phosphorus (32P) group to the 5' end. Polynucleotide kinase and gamma-32P-labelled ATP are used here; the enzyme transfers the labelled terminal phosphate of ATP to the 5' end of the DNA.

  2. Using dimethyl sulphoxide and heating to 90°C, the two strands of the DNA are separated. They are then purified, e.g. by gel electrophoresis, on the principle that one strand is likely to be heavier than the other because it contains more purine nucleotides (A and G), which are heavier than the pyrimidines (C and T).

  3. Single-stranded sample is split into separate samples and each is treated with one of the cleavage reagents. This part of the process involves alteration of bases (e.g. dimethylsulphate methylates guanine) followed by removal of altered bases. Lastly, piperidine is used for cleavage of the strand at the points where bases are missing.


    Base          Chemical used for     Chemical used for       Chemical used for
    specificity   base alteration       altered base removal    strand cleavage

    G             Dimethylsulphate      Piperidine              Piperidine
    A+G           Acid                  Acid                    Piperidine
    C+T           Hydrazine             Piperidine              Piperidine
    C             Hydrazine + alkali    Piperidine              Piperidine
    A>C           Alkali                Piperidine              Piperidine

     

  4. If reactions have been arranged to give only one, or a few, cleavages per DNA molecule, a nested set of end-labelled DNA fragments of different lengths is produced. 


    [Figure: maxam-01.gif]

  5. The samples are run together on a sequencing gel which separates the fragments by electrophoresis depending on their size. DNA bands in the gel are visualized by autoradiography (32P-labelled 5' end fogs photographic film).


    [Figure: maxam-02.gif (sequencing gel autoradiograph)]

  6. The DNA sequence is read directly from the gel. Try it yourself!

Questions:

A.    At which end of the above gel are the shortest DNA fragments?

B.    What is the sequence of the sample DNA?

ANSWERS (given at the end of this section)
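
The stages above can be sketched in a few lines of Python. This is a simplified model, not a description of the chemistry: the example sequence and the names REACTIONS, cleave and read_gel are illustrative assumptions, only four of the five reactions in the table are used, and every molecule is assumed to be cut exactly once. Cleavage destroys the target base, so a base at position i leaves a labelled fragment i - 1 nucleotides long; the first base leaves no fragment and is therefore not read.

```python
# Which bases each reaction cleaves at (four of the reactions in the table).
REACTIONS = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}

def cleave(seq):
    """Return, per lane, the lengths of the 5'-labelled fragments."""
    lanes = {}
    for lane, targets in REACTIONS.items():
        # Cutting at 0-based index i leaves the i bases on its 5' side.
        lanes[lane] = [i for i, base in enumerate(seq)
                       if base in targets and i > 0]
    return lanes

def read_gel(lanes):
    """Read bases from the shortest fragment (bottom of gel) upwards."""
    calls = []
    for n in sorted(set().union(*lanes.values())):
        if n in lanes["G"]:
            calls.append("G")      # band in both the G and A+G lanes
        elif n in lanes["A+G"]:
            calls.append("A")      # band in the A+G lane only
        elif n in lanes["C"]:
            calls.append("C")      # band in both the C and C+T lanes
        else:
            calls.append("T")      # band in the C+T lane only
    return "".join(calls)

sample = "TGCATGGACT"              # hypothetical sequence, 5' to 3'
gel = cleave(sample)
print(gel["G"])                    # fragment lengths in the G lane
print(read_gel(gel))               # sequence from the second base onwards
```

Comparing lanes in pairs (G against A+G, C against C+T) is what allows A and T to be identified even though no reaction in this sketch cleaves at A or T alone.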


SANGER-COULSON SEQUENCING

This chain termination method uses single-stranded (ss) DNA, which is usually cloned in an M13 phage vector. The method is based on the interruption by nucleotide analogues of enzymatic synthesis of a second strand of DNA complementary to the sample. A mixture of fragments of different lengths is produced, depending on where the interruptions occurred. As with the Maxam-Gilbert method, the mixture of fragments is run on a gel and the sequence read off.

Stages:

  1. Sample DNA to be sequenced is spliced into M13 vector DNA. Infected E. coli host cells release phage particles containing single-stranded (ss) recombinant DNA including the sample sequence. DNA is extracted from phage for sequencing.

  2. A short oligonucleotide primer (usually chemically synthesized and sometimes labelled) is added to the ss recombinant DNA. The primer anneals at a position which will act as the starting point for synthesis of the complementary strand. (The DNA polymerase to be used requires a primer to initiate complementary strand synthesis.)

  3. DNA polymerase (e.g. Klenow fragment of DNA polymerase I or something similar) is then added in the presence of:

    a)    The 4 normal nucleotides: d-ATP, d-CTP, d-GTP and d-TTP (one or more of which are labelled with 32P).


    b)    A low concentration of one of 4 analogues of the normal nucleotides, a different analogue in each of 4 separate incubation mixes. The analogues are dideoxynucleotides (ddNTPs), which are identical to the normal nucleotides except that the hydroxyl group (OH) at the 3' position of the sugar ring is replaced with a hydrogen (H). Just like the normal substrates, the analogues (dd-ATP, dd-CTP, dd-GTP and dd-TTP) have A, C, G or T bases attached. The DNA polymerase can use the analogues as substrates in place of the normal nucleotides.

  4. Complementary strand synthesis proceeds from the 3' end of the primer, so the new strand grows in the 5' to 3' direction. However, when an analogue is incorporated into the new strand, chain termination occurs: unlike a normal nucleotide, the analogue lacks the 3' hydroxyl group needed for further chain extension. Therefore, each of the 4 separate incubation mixes will contain an assortment of partially synthesized, radioactively labelled double-stranded DNA molecules. The fragments vary in length depending on the point at which the analogue was incorporated. Since incorporation is random, the population of molecules in a mix should represent every position of a particular base.


    [Figure: sanger-1.gif]


  5. The 4 mixes are run side by side on a sequencing gel which separates the fragments by electrophoresis depending on their size. The gel contains urea which causes denaturation of the double-stranded DNA. The process is carried out at high voltage to generate heat which prevents the strands re-associating. DNA bands in the gel are visualized by autoradiography (32P-label fogs photographic film).

  6. The DNA sequence is read directly from the gel in a similar way to a Maxam-Gilbert sequencing gel.


    [Figure: sanger-2.gif]
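
The chain termination stages above can be modelled in the same spirit. The sketch below is an idealized illustration under stated assumptions: the template sequence and the function names are hypothetical, the primer is ignored (lengths are counted from the first nucleotide added after it), and instead of simulating random incorporation it simply lists every position at which each analogue could terminate the chain.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def synthesize(template):
    """New strand (5' to 3') complementary to a template given 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))

def chain_termination(template):
    """Return, per ddNTP mix, the lengths of the terminated fragments."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesize(template), start=1):
        # A fragment of this length ends in the dideoxy analogue of `base`.
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read the new strand from the shortest fragment to the longest."""
    by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))

template = "GATTACA"                      # hypothetical template, 5' to 3'
gel = chain_termination(template)
print(gel["A"])                           # fragment lengths in the ddA mix
print(read_gel(gel))                      # sequence of the newly made strand
```

Note that the gel gives the sequence of the newly synthesized strand; the sequence of the sample (template) strand is its reverse complement.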

SEQUENCING LARGE MOLECULES OF DNA

Both the Maxam-Gilbert and Sanger-Coulson methods can only produce about 400 bases of sequence at a time. Most genes are larger than this. To sequence a large DNA molecule it is cut up (using two or more different restriction enzymes) into different fragments and each fragment is sequenced in turn, including overlaps (which are often identified by computer). The full sequence can then be determined. (See Brown and Primrose et al. for more details.)
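
How a computer might identify overlaps can be sketched as follows. This is a minimal illustration, assuming the reads are error-free, already placed in the correct order and on the same strand; the reads themselves, the function names and the min_len parameter are hypothetical, and real assembly software is considerably more elaborate.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def merge_in_order(reads, min_len=3):
    """Join consecutive reads through their overlapping ends."""
    contig = reads[0]
    for read in reads[1:]:
        n = overlap(contig, read, min_len)
        if n == 0:
            raise ValueError("no overlap found for read " + read)
        contig += read[n:]          # append only the new, non-overlapping part
    return contig

reads = ["AACTAGGC", "AGGCTTTA", "TTTAGC"]   # hypothetical overlapping reads
print(merge_in_order(reads))                 # AACTAGGCTTTAGC
```

Using two or more restriction enzymes matters here: fragments from different digests end at different sites, so their sequences overlap and can be joined in this way.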

 

AUTOMATED DNA SEQUENCING

This can be carried out using capillary array electrophoresis.

This method was developed for the Human Genome Project and greatly speeded up its completion.

It is based on the Sanger-Coulson chain termination method, but the 4 dideoxynucleotides (ddA, ddC, ddG and ddT) are each labelled with a different fluorophore rather than with a radioactive label.

Since 4 different fluorophores are used, all 4 reactions can be run in the same tube, greatly increasing the speed and ease of sequencing.

After restriction, DNA fragments are separated by capillary electrophoresis using small (approx. 100 microns in diameter), gel-filled capillary tubes, clustered together and read with a laser scanning system. The system is not only more accurate than reading a gel, but longer molecules of DNA can be sequenced.

Electropherogram: As each capillary tube is moved into the path of the laser beam, fluorescently labelled nucleotides are detected one at a time, producing a coloured electropherogram. The information is then analyzed by a computer to generate the final DNA sequence data.
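
The final computer analysis can be pictured with a very small sketch. It assumes the electropherogram has already been reduced to one set of four fluorescence intensities per peak; the intensity values and the function name call_bases are made up for illustration, and real base-calling software also has to find the peaks and judge their quality.

```python
def call_bases(peaks):
    """For each peak, report the base whose dye gives the strongest signal."""
    return "".join(max(peak, key=peak.get) for peak in peaks)

# Hypothetical fluorescence intensities (arbitrary units), one dict per peak,
# listed in the order the fragments pass the laser (shortest first).
peaks = [
    {"A": 0.91, "C": 0.05, "G": 0.02, "T": 0.04},
    {"A": 0.03, "C": 0.07, "G": 0.88, "T": 0.06},
    {"A": 0.04, "C": 0.02, "G": 0.05, "T": 0.95},
]
print(call_bases(peaks))   # AGT
```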



END OF SECTION 10.
NOW GO TO SECTION 11
(GENE LIBRARIES).


The Human Genome Project (HGP)

The Human Genome Project (HGP) began in 1990 with the aim of completely sequencing the human genome, whose size is 2.8 x 10^6 kilobases, i.e. nearly 3 billion bases. The HGP was a collaboration involving laboratories in the UK, USA, Japan, France, Germany, etc. and co-ordinated by the Human Genome Organization (HUGO). Much of the work was done at the Sanger Centre (now the Wellcome Sanger Institute) near Cambridge.

The original intention was to achieve complete sequencing by the end of the 20th century/millennium but the deadline was postponed several times as the actual size of the task became apparent. In November 1999 the complete sequence of human chromosome 22 was announced - a significant milestone. This is one of the smaller chromosomes but, once this had been completely sequenced, it seemed only a matter of time before all the other chromosomes were sequenced also.

A draft of the human genome was released in 2000 and a complete one in 2003. This was well ahead of the anticipated schedule due to improved and more automated sequencing methods and the replacement of lambda phage vectors by YAC vectors. The maximum size of lambda vector inserts is 20 kb but YAC vectors can accept much larger inserts (2000+ kb) and this speeded up the project.
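
The effect of insert size can be seen with some rough arithmetic. The sketch below uses the figures quoted in this section and assumes, unrealistically, that the inserts tile the genome end to end with no overlap; in practice many more clones are needed to obtain overlapping coverage. The names are illustrative.

```python
GENOME_KB = 2_800_000        # 2.8 x 10^6 kb, the genome size quoted above
LAMBDA_INSERT_KB = 20        # maximum lambda vector insert
YAC_INSERT_KB = 2_000        # YAC insert size quoted above

def min_clones(genome_kb, insert_kb):
    """Smallest number of non-overlapping inserts covering the genome."""
    return -(-genome_kb // insert_kb)      # ceiling division

print(min_clones(GENOME_KB, LAMBDA_INSERT_KB))   # 140000
print(min_clones(GENOME_KB, YAC_INSERT_KB))      # 1400
```

On these assumptions the number of clones to be handled falls a hundredfold when YAC vectors are used.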

However, although the complete human sequence is now known, much further work has to be done on analysis. 

Some argue that the HGP was an unnecessary waste of money and resources because much of the DNA in the human genome appears to be redundant or have no function, but time will tell!


ANSWERS TO MAXAM-GILBERT GEL QUESTIONS

A.    The shortest fragments are at the bottom of the gel. Remember that the original DNA before cleavage was labelled at the 5' end using 32P. Therefore the shortest fragments carrying label are at the bottom of the gel, i.e. at the 5' end. Unlabelled fragments (i.e. those not carrying the original 5' labelled end and those due to more than one cleavage) will not be detected on the autoradiograph (autoradiogram).

B.    The sequence of this sample of DNA is:

5'- A A C T A G G C T T T A G C - 3'
