Computational inference of H3K4me3 and H3K27ac domain length

Julian Zubek, Michael L. Stitzel, Duygu Ucar, Dariusz Plewczyński


Background. Recent epigenomic studies have shown that the length of a DNA region covered by an epigenetic mark is not just a byproduct of the assaying technologies and has functional implications for that locus. For example, expanded DNA regions marked by enhancer-specific histone modifications, such as acetylation of histone H3 lysine 27 (H3K27ac), coincide with cell-specific enhancers known as super or stretch enhancers. Similarly, promoters of genes critical for cell-specific functions are marked by expanded H3K4me3 domains in the cognate cell type, and these can span DNA regions from 4-5 kb up to 40-50 kb in length. These expanded H3K4me3 domains are known as buffer domains or super promoters.

Methods. To ask what correlates with, and potentially regulates, the length of loci marked with these two important histone marks, H3K4me3 and H3K27ac, we built Random Forest regression models. With these models, we computationally identified genomic and epigenomic patterns that are predictive of the length of these marks in seven ENCODE cell lines.

Results. We found that certain epigenetic marks and transcription factors explain the variability of the length of H3K4me3 and H3K27ac marks across different cell types, which implies that the lengths of these two epigenetic marks are tightly regulated in a given cell type. Our source code for the regression models and data can be found on our GitHub page.

Discussion. Our Random Forest based regression models enabled us to estimate the individual contribution of different epigenetic marks and protein binding patterns to the length of H3K4me3 and H3K27ac deposition patterns, therefore potentially revealing genomic signatures at cell-specific regulatory elements.
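The modelling idea in the Methods and Discussion paragraphs (regress domain length on epigenomic features, then ask how much each feature contributes) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the data are synthetic, the feature names (`mark_A_signal`, `tf_B_signal`, `unrelated_noise`) are hypothetical, the forest is a deliberately small toy, and feature contribution is measured here by permutation importance on the training data, which is an assumption about one reasonable way to do it rather than the measure used in the paper.

```python
import random

def avg(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = avg(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, targets, rng, depth=0, max_depth=3, min_leaf=5, n_candidates=2):
    """Grow one regression tree; every split considers a random feature subset."""
    best = None
    if depth < max_depth:
        for f in rng.sample(range(len(rows[0])), n_candidates):
            for t in sorted({r[f] for r in rows}):
                left = [y for r, y in zip(rows, targets) if r[f] <= t]
                right = [y for r, y in zip(rows, targets) if r[f] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, f, t)
    if best is None:
        return avg(targets)  # leaf: mean length of the samples that reach it
    _, f, t = best
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > t]

    def grow(part):
        return build_tree([r for r, _ in part], [y for _, y in part],
                          rng, depth + 1, max_depth, min_leaf, n_candidates)

    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are floats
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, targets, rng, n_trees=15):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], rng))
    return forest

def predict(forest, row):
    return avg([predict_tree(tree, row) for tree in forest])

def mse(forest, rows, targets):
    return avg([(predict(forest, r) - y) ** 2 for r, y in zip(rows, targets)])

def permutation_importance(forest, rows, targets, feature, rng):
    """Increase in mean squared error after shuffling one feature column."""
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    permuted = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
    return mse(forest, permuted, targets) - mse(forest, rows, targets)

# Synthetic data (assumption): domain length in bp depends strongly on the
# first feature, weakly on the second, and not at all on the third.
rng = random.Random(0)
feature_names = ["mark_A_signal", "tf_B_signal", "unrelated_noise"]  # hypothetical
rows = [[rng.random() for _ in feature_names] for _ in range(120)]
lengths = [2000 + 3000 * r[0] + 500 * r[1] + rng.gauss(0, 200) for r in rows]

forest = fit_forest(rows, lengths, rng)
importances = {name: permutation_importance(forest, rows, lengths, i, rng)
               for i, name in enumerate(feature_names)}
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: increase in MSE when shuffled = {value:.0f}")
```

In practice one would use an established Random Forest library and evaluate importance on held-out or out-of-bag samples; the toy version above only makes the mechanics visible.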

Authors:
Julian Zubek - [University of Warsaw, Centre of New Technologies]
Michael L. Stitzel - [University of Connecticut]
Duygu Ucar - [University of Connecticut]
Dariusz Plewczyński (FMIS / DIPS) - Department of Information Processing Systems
Journal series: PeerJ, ISSN 2167-8359
Issue year: 2016
ASJC Classification: 1100 General Agricultural and Biological Sciences; 1300 General Biochemistry, Genetics and Molecular Biology; 2700 General Medicine; 2800 General Neuroscience
Language: en (English)
Score (nominal): 35
Score source: journalList
Score: Ministerial score = 35.0, 04-06-2020, ArticleFromJournal; Ministerial score (2013-2016) = 35.0, 04-06-2020, ArticleFromJournal
Publication indicators: Scopus Citations = 3; WoS Citations = 3; Scopus SNIP (Source Normalised Impact per Paper): 2016 = 0.865; WoS Impact Factor: 2016 = 2.177 (2) - 2016 = 2.354 (5)