Multiscale and multivariate methodologies for genomic data analysis
(ANR REFOPOL, O. Hyrien, ENS Paris, C. Thermes, CGM, Gif/Yvette, A. Goldar, CEA/Saclay) Multiscale and multivariate concepts and methodologies are necessary to account for the complexity of genome organization, which must reconcile DNA compaction with gene accessibility. We developed multiscale wavelet-based algorithms that provided original clues about the mammalian DNA replication program. These signal-processing tools have now been accepted as bona fide molecular biology protocols.
Using a wavelet-based multiscale pattern recognition framework, we described megabase-sized replication domains, covering about 1/3 of the human genome, as N-shaped regions in DNA strand compositional asymmetry (skew) profiles. Genome-wide replication timing profiles then provided experimental confirmation that the borders of these skew N-domains are active replication origins.
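As an illustration of what a skew profile is, the following minimal sketch computes a compositional skew over consecutive windows of a sequence. It assumes the commonly used definition S = (T-A)/(T+A) + (G-C)/(G+C); the function name `skew_profile`, the window size and the non-overlapping windowing are illustrative choices, not the protocol used in this work, and the wavelet-based detection of N-shaped domains is not shown.

```python
def skew_profile(seq, window=1000):
    """Total skew S = S_TA + S_GC in consecutive non-overlapping windows.

    Assumed definition (not taken from the text above):
        S_TA = (T - A) / (T + A),  S_GC = (G - C) / (G + C)
    A trailing partial window is ignored.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        a, c, g, t = (chunk.count(base) for base in "ACGT")
        # Guard against windows that contain no A/T or no G/C at all.
        s_ta = (t - a) / (t + a) if (t + a) else 0.0
        s_gc = (g - c) / (g + c) if (g + c) else 0.0
        profile.append(s_ta + s_gc)
    return profile


if __name__ == "__main__":
    # Toy sequence: first window rich in A and C, second rich in G and T.
    print(skew_profile("AACC" + "GGTT", window=4))
```

In such a profile, an N-shaped domain would appear as an abrupt upward jump in S at each border, with a roughly linear decrease in between.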
Further multiscale analysis of replication timing profiles led us (i) to describe replication U-domains, which display a characteristic U-shaped replication timing profile, as the counterpart to skew N-domains and (ii) to compute space-scale maps of effective DNA replication speed. These latter measurements are central to our modeling of DNA replication kinetics in mammalian genomes. Using PCA, the apparent complexity of a dataset of 13 epigenetic marks was reduced to 4 epigenetic states. Each state corresponds to a well-defined replication timing window, so that the progression of replication along U-domains corresponds to a directional path across the four chromatin states. These results shed new light on the epigenetic regulation of the spatio-temporal replication program in human and provide a framework for further studies in different cell types, in both health and disease.
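The dimensionality-reduction step can be sketched as follows. This is a generic PCA via singular value decomposition using NumPy, applied to a synthetic matrix with 13 columns standing in for the epigenetic marks. The function name `pca`, the standardization of columns and the random data are assumptions for illustration; how the four chromatin states were derived from the principal components is not specified here and is not reproduced.

```python
import numpy as np


def pca(X, n_components):
    """Project the rows of X onto the first principal components.

    X: array of shape (n_windows, n_marks).
    Returns (scores, explained_variance_ratio) for the kept components.
    """
    # Standardize each mark so that no single mark dominates (an assumption).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(Xs, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    scores = Xs @ vt[:n_components].T
    return scores, ratio[:n_components]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    marks = rng.normal(size=(200, 13))  # synthetic stand-in, not real data
    scores, ratio = pca(marks, n_components=3)
    print(scores.shape, ratio)
```

The number of components to keep is a free parameter in this sketch; in practice it would be chosen from the explained-variance ratios.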
Finally, in preliminary work using a graph representation of high-throughput chromatin conformation capture data, we showed that replication domain borders are hubs of the chromatin conformation interaction network.
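To make the notion of a hub concrete, the sketch below treats genomic bins as nodes and contact counts as edge weights, then ranks nodes by weighted degree. The choice of weighted degree as the hub measure, the `(bin_i, bin_j, count)` input format and the function names are assumptions for illustration; the centrality measure actually used in this work is not stated above.

```python
from collections import defaultdict


def weighted_degree(contacts):
    """Sum of contact counts per bin; self-contacts are ignored.

    contacts: iterable of (bin_i, bin_j, count) tuples (hypothetical format).
    """
    degree = defaultdict(float)
    for i, j, count in contacts:
        if i == j:
            continue
        degree[i] += count
        degree[j] += count
    return dict(degree)


def top_hubs(contacts, k=1):
    """Return the k bins with the largest weighted degree (ties by bin index)."""
    degree = weighted_degree(contacts)
    return sorted(degree, key=lambda node: (-degree[node], node))[:k]


if __name__ == "__main__":
    # Toy contact list: bin 1 interacts strongly with bins 0, 2 and 3.
    toy = [(0, 1, 5), (1, 2, 4), (1, 3, 3), (2, 3, 1), (2, 2, 10)]
    print(top_hubs(toy, k=2))
```

With real data, the bins returned by `top_hubs` could then be compared against the positions of replication domain borders.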