What does Dave do?

Dave, Anja and Raghu (Cold Spring Harbor Labs, September 1999)

Computational Molecular Biology:

I am working with the Fangman/Brewer genetics lab at the University of Washington. Here is a link to a list of all the lab members; we need to get a much better lab home page!

Before a cell can undergo viable division, it must duplicate its genetic material. In non-bacterial organisms, the genetic material (the DNA) is organized into chromosomes and the duplication of each follows a reproducible temporal pattern, beginning at specific "origins" of replication along the chromosome. One underlying problem is the determination of these origin locations along the chromosome and the timing of activation of each origin. In the yeast Saccharomyces cerevisiae, conventional techniques allow one to analyze whether a specifically chosen "small" fragment of a chromosome contains an origin of replication. However, an exhaustive use of this approach across the entire genome would require roughly 1500 separate experiments. The advent of DNA microarray technology and the complete sequencing of the yeast genome have allowed a genome-wide approach to studying replication of yeast DNA.

Observations revealing that different parts of a chromosome replicate at different times during the cell cycle were made almost 40 years ago (Taylor, 1960). It is now clear that the human chromosome set is composed of over 1,000 discrete temporal domains that correspond to bands seen in prophase chromosomes, and that replication in these bands begins at different times (e.g., Drouin et al., 1991). This kind of temporal organization is highly conserved among different species. The temporal program of chromosome replication in the yeast Saccharomyces cerevisiae was discovered using molecular techniques: it was shown that centromeres (interior regions of the chromosome) replicate during the first part of cell cycle S phase and telomeres (regions at the ends of chromosomes) near the end of S phase (McCarroll and Fangman, 1988). The classical method of determining the kinetics of origin activation in yeast uses a two-pronged approach. One part of the strategy is to identify a chromosomal origin of interest through the use of two-dimensional agarose gel electrophoresis (Brewer and Fangman, 1987). This technique detects origins in chromosomal DNA by the characteristic "bubble" structure of replication intermediates produced by the process of initiation. It is important to emphasize that this method requires making an initial guess as to the location of such an origin. The second part of the strategy is the use of density transfer experimental techniques, modeled on the heavy isotope Meselson-Stahl experiments, to establish the time of replication of a given restriction fragment (usually 3000-15000 base pairs) of the genome (McCarroll and Fangman, 1988). From these two types of experiments, therefore, one can determine whether a particular DNA sequence contains an origin of replication, and when in S phase that sequence is replicated.
Note that this combination of methods is a directed approach: one tests a particular fragment of the chromosome for the location and efficiency of an origin and then assays for its time of replication. To assess the dynamics of replication over extended stretches of the genome under any given experimental condition (rather than the activation of a specific origin), a small set of sequences scattered through the region of interest is chosen for assay, on the assumption that this set will constitute a fair representation of the genome as a whole. The recent invention of genomic DNA microarrays has suggested a new approach to replication analysis. In contrast to the directed approach described above, the new microarray technology allows one to simultaneously investigate origin activation and replication kinetics in a genome-wide fashion.

In a pilot study collaboration between the Fangman/Brewer lab (University of Washington), Elizabeth Winzeler of the Ronald Davis lab (Stanford) and Lisa Wodicka and David Lockhart (Affymetrix Corporation), we have demonstrated the utility of these microarrays in determining the genome-wide dynamics of chromosome replication in the yeast Saccharomyces cerevisiae.

The key experiment behind this proposal involves a so-called density transfer experiment. Cells are grown under heavy isotope conditions for several generations, which will label all DNA as HH (Heavy-Heavy, meaning each strand of the helical DNA duplex is heavy isotope labeled). Cells are synchronized, transferred to light growth medium and released into the replication S phase of the cell cycle, then samples are collected at various times in S phase. For each sample, the DNA is separated into unreplicated (fully dense, HH) and replicated (hybrid density, HL) fractions. These DNA pools are labeled and individually hybridized to an ordered microarray that collectively represents the whole nuclear genome of yeast. At the start of S phase, there should be no DNA in the HL pool, so hybridization will be seen only with the HH pool. As S phase progresses, sequences that are replicated will no longer be present in the HH pool, but will instead appear in the HL pool. Therefore, the earliest sequences that appear in the HL pool (seen as hybridization to the microarray) represent the earliest activated replication origins. In general, a replication origin can be detected as a sequence whose appearance in the HL pool precedes that of its flanking sequences. A comparison of the amount of hybridization with HH vs. HL DNA for each time point will give the percent replication of a given sequence at that time in S phase, from which one can deduce the kinetics of replication of any sequence of interest. Unlike the conventional method, however, a single experiment yields information on the replication kinetics of the whole genome. One tremendous advantage of this approach (besides the greatly increased information yield) is that one no longer has to make assumptions about which portions of the genome are going to be interesting for any given experiment. 
A consequence of this approach is that the amount of data to be analyzed is voluminous, requiring creative computational analysis to extract useful biological information.
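
As a rough illustration of the comparison described above, the sketch below turns HH and HL hybridization signals into a percent replication value and then flags probes that appear in the HL pool before their neighbors. The formula HL / (HH + HL), the 50% threshold, and all function names are assumptions made for this example; they are not taken from the lab's actual analysis.

```python
def percent_replicated(hh, hl):
    """Percent replication of one sequence at one time point.

    Assumed formula: the HL signal as a share of the total (HH + HL) signal.
    """
    total = hh + hl
    if total <= 0:
        raise ValueError("no hybridization signal for this sequence")
    return 100.0 * hl / total


def first_time_replicated(times, hh_series, hl_series, threshold=50.0):
    """First sampled time at which a sequence reaches the threshold.

    The 50% default is an illustrative choice. Returns None if the
    threshold is never reached in the sampled time course.
    """
    for t, hh, hl in zip(times, hh_series, hl_series):
        if percent_replicated(hh, hl) >= threshold:
            return t
    return None


def origin_candidates(rep_times):
    """Indices of probes replicated strictly earlier than both neighbors.

    rep_times lists one replication time per probe, in chromosome order.
    A neighbor with no recorded time (None) counts as later.
    """
    found = []
    for i in range(1, len(rep_times) - 1):
        here = rep_times[i]
        if here is None:
            continue
        neighbors = (rep_times[i - 1], rep_times[i + 1])
        if all(n is None or here < n for n in neighbors):
            found.append(i)
    return found


# Hypothetical signals for three adjacent probes sampled at four times.
times = [0, 10, 20, 30]
probes = [
    ([100, 90, 40, 5], [0, 10, 60, 95]),    # flanking sequence
    ([100, 40, 10, 0], [0, 60, 90, 100]),   # replicates first
    ([100, 90, 40, 5], [0, 10, 60, 95]),    # flanking sequence
]
rep_times = [first_time_replicated(times, hh, hl) for hh, hl in probes]
candidates = origin_candidates(rep_times)
```

With real microarray data the signals would need normalization between the two pools before any such ratio is meaningful; that step is omitted here.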

The raw data, which give a "percent replication profile", can be plotted and look like this for yeast chromosome 6:

Finally, in order to study origin location and timing, a surface is constructed in three-dimensional space using Fourier-smoothed data. If you imagine slicing the surface with a plane parallel to the right-hand end of the pictured box (left below), a curve is traced in that plane. We calculate the area under this curve, then plot these areas as a function of position along the chromosome. Origin locations are then detected as peaks, and the relative heights indicate the timing of the origins; the highest peak corresponds to the earliest-firing origin. The figures below give the surface plot (left) and area plot (right) for chromosome 6.
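
The smoothing, slicing, and peak-finding steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code used for the figures: the smoothing is taken to be a simple low-pass filter along chromosome position (a plain discrete Fourier transform that treats the data as periodic), the area is computed with the trapezoid rule, and every function name and the `keep` parameter are hypothetical.

```python
import cmath


def fourier_smooth(values, keep):
    """Low-pass filter: keep DFT frequencies up to `keep`, zero the rest.

    Uses a direct O(n^2) transform, which is fine for short examples.
    The transform treats the input as periodic, so the ends wrap around.
    """
    n = len(values)
    spectrum = [
        sum(values[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if min(k, n - k) > keep:   # frequency index, counting mirrored terms
            spectrum[k] = 0
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)).real / n
        for j in range(n)
    ]


def area_under_curve(times, values):
    """Trapezoid-rule area under percent replication versus time."""
    return sum(
        (t1 - t0) * (v0 + v1) / 2.0
        for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:])
    )


def area_profile(times, surface):
    """One area per chromosome position.

    surface[i][j] is the percent replication at times[i] and position j,
    so each column is the curve traced by slicing at one position.
    """
    n_positions = len(surface[0])
    return [
        area_under_curve(times, [row[j] for row in surface])
        for j in range(n_positions)
    ]


def peaks(profile):
    """Indices that are strict local maxima of the area profile."""
    return [
        i for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] > profile[i + 1]
    ]


# Hypothetical surface: three time points, five chromosome positions.
times = [0, 10, 20]
surface = [
    [0, 0, 0, 0, 0],
    [10, 60, 20, 40, 10],
    [100, 100, 100, 100, 100],
]
profile = area_profile(times, surface)
origins = peaks(profile)
earliest = max(origins, key=lambda i: profile[i])
```

To smooth before slicing, each time row could first be passed through `fourier_smooth(row, keep)`. A sequence that replicates early spends more of S phase at a high percent replication, which is why a larger area corresponds to an earlier origin.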

The above work on chromosome 6 was carried out because this is the one chromosome for which extensive results were previously known. The satisfying end result is that this new technique recaptured these previously known results. This work, together with results on chromosome 10, is currently being written up for publication. We have carried out the analysis for all the chromosomes. We have a good estimate of the number and locations of the so-called active origins of DNA replication (i.e., all the places DNA replication begins). Our current count is 411. Recently, Steve Bell (MIT) gave a visiting lecture and announced "...our (Steve's) lab estimates approximately 800 PRC's in a yeast cell..."; this was exciting because there are exactly two PRC's (big protein complexes) for each origin of replication, so our work would predict 822 such complexes.