The Interactive Fly
Zygotically transcribed genes
Chromatin consists of proteins that serve as the structural organizer of DNA, binding DNA into higher order structures and ultimately forming the chromosome itself. Chromatin restricts the access of transcription factors to DNA. Both Polycomb and trithorax group proteins act to remodel chromatin, altering the accessibility of DNA to factors required for gene transcription. Polycomb group genes are involved in chromatin-based gene silencing, while trithorax group genes counteract the silencing effects of chromatin to maintain gene activity.
There is an evolving understanding of the enzymes that function to remodel chromatin. At least two systems related to yeast SWI/SNF proteins function to open up chromatin, permitting access to transcription factors. Information on SWI/SNF homologs can be found at the ISWI and Brahma sites. Information about the potential role of the origin recognition complex in chromatin remodeling can be found at the Origin recognition complex 2 (ORC2) site.
Nucleosome assembly protein-1 (NAP1) and Chromatin assembly factor 1 subunit (CAF1) act as histone chaperones in establishing an ordered nucleosome structure on newly synthesized DNA. Drosophila CAF-1 appears to comprise four subunits of 180, 105, 75 and 55 kDa. The smallest subunit of Drosophila CAF-1, p55, is homologous to the mammalian RbAp-48 protein, which is associated with the HD1 histone deacetylase. A model for the role of core histone chaperones in chromatin assembly is as follows: CAF-1 binds to newly synthesized H3 and (acetylated) H4 and mediates the deposition of the H3-H4 tetramer onto newly replicated DNA; histones H2A-H2B are subsequently incorporated with the assistance of other histone chaperones, such as nucleoplasmin or NAP1, to give the complete histone octamer. The initial acetylation of the histones may be required to neutralize their high positive charge, allowing them to be assembled into chromatin. Deacetylation of histones, carried out by histone deacetylase, could be a prerequisite for maturation of chromatin. In any case, it is now clear that chromatin assembly and maturation involve histone acetylation, and that this process begins in the cytoplasm; the histones are subsequently transferred to the nucleus and then deacetylated (Tyler, 1996).
Regulatory elements called enhancers, or locus control regions, are capable of exerting their influence over long distances and in an orientation-independent manner to orchestrate the complex gene expression patterns required for embryonic development. How are the effects of enhancers confined to the genes they regulate? In recent years the concept of chromatin-based domain boundaries, or insulator elements, has developed, based on the genetic properties of several eukaryotic genes. One example of an insulator element is the Drosophila gypsy insulator. For discussion of the gypsy insulator, and the role of two proteins, Suppressor of Hairy wing and MOD(MDG4), in its regulation, see the su(Hw) site.
One other aspect of gene silencing has been established for mammalian and yeast systems. Whereas histone acetylation is known to be involved in gene activation in Drosophila dosage compensation (See Male-specific lethal 2), a role for deacetylation in gene silencing has not yet been established in Drosophila. Two examples of the role of histone deacetylation in gene silencing in mammals will be described briefly here. Histone deacetylation plays a role in mammalian Myc mediated silencing (see Drosophila Myc Evolutionary Homologs section for more information) and in mammalian nuclear receptor mediated silencing (see Ecdysone receptor Evolutionary Homologs section for more information).
Myc family proteins function through heterodimerization with the stable, constitutively expressed bHLH-Zip protein, Max. Human Mad protein homodimerizes poorly but binds Max in vitro, forming a sequence-specific DNA binding complex with properties very similar to those of Myc-Max. Both Myc-Max and Mad-Max heterocomplexes are favored over Max homodimers. Mad does not associate with Myc or with representative bHLH, bZip, or bHLH-Zip proteins. On the other hand, Myc-Max and Mad-Max complexes carry out opposing functions in transcription and Max plays a central role in this network of transcription factors (Ayer, 1993).
Members of the Mad family of bHLH-Zip proteins heterodimerize with Max to repress transcription in a sequence-specific manner. Transcriptional repression by Mad:Max heterodimers is mediated by ternary complex formation with either of the corepressors mSin3A or mSin3B. mSin3A is an in vivo component of large, heterogeneous multiprotein complexes and is tightly and specifically associated with at least seven polypeptides. Two of the mSin3A-associated proteins, p50 and p55, are highly related to the histone deacetylase HDAC1. The mSin3A immunocomplexes possess histone deacetylase activity that is sensitive to the specific deacetylase inhibitor trapoxin. mSin3A-targeted repression is reduced by trapoxin treatment, suggesting that histone deacetylation mediates transcriptional repression through Mad-Max-mSin3A multimeric complexes (Hassig, 1997).
The same proteins that mediate transcriptional silencing by Mad-Max also mediate transcriptional silencing by nuclear hormone receptors that are bound to DNA but free of ligand. Whereas liganded nuclear receptors serve as transcriptional activators, unliganded nuclear receptors serve as repressors. How does the unliganded nuclear receptor transmit a repressive signal to the transcriptional apparatus, and what is the nature of this signal? In fact, the target of the unliganded nuclear receptor is not RNA polymerase but chromatin, and repression is mediated by corepressors: proteins that associate with unliganded nuclear receptors and assemble a macromolecular complex that modifies chromatin so as to silence gene activity. The macromolecular complex acts to deacetylate histones. The transcriptional corepressors SMRT and N-CoR function as silencing mediators for retinoid and thyroid hormone receptors. SMRT and N-CoR directly interact with unliganded nuclear receptors, and these corepressors in turn directly interact with mSin3A, a corepressor for the Mad-Max heterodimer and a homolog of the yeast global transcriptional repressor Sin3p. The recently characterized histone deacetylase 1 (HDAC1) interacts with Sin3A and SMRT to form a multisubunit, ternary repressor complex. Histone deacetylase in turn targets chromatin, converting it into a form that is inaccessible to the transcriptional apparatus. Consistent with this model, it is found that HDAC inhibitors synergize with retinoic acid to stimulate hormone-responsive genes and the differentiation of myeloid leukemia (HL-60) cells. Addition of a deacetylase inhibitor such as Trichostatin A relieves transcriptional repression, resulting in a promoter that is sensitive to the addition of activating hormone. This work establishes a convergence of repression pathways for bHLH-Zip proteins and nuclear receptors and suggests that this type of regulation may be more widely conserved than previously suspected (Nagy, 1997).
Polycomb/Trithorax response elements (PRE/TREs) maintain transcriptional decisions to ensure correct cell identity during development and differentiation. There are thought to be over 100 PRE/TREs in the Drosophila genome, but only very few have been identified due to the lack of a defining consensus sequence. The definition of sequence criteria that distinguish PRE/TREs from non-PRE/TREs is reported in this study. Using this approach for genome-wide PRE/TRE prediction, 167 candidate PRE/TREs are reported, that map to genes involved in development and cell proliferation. Candidate PRE/TREs are shown to be bound and regulated by Polycomb proteins in vivo, thus demonstrating the validity of PRE/TRE prediction. Using the larger data set thus generated, three sequence motifs that are conserved in PRE/TRE sequences have been identified (Ringrose, 2003).
The detection of PRE/TREs by prediction generates a large data set that can be used to search for further common sequence features. To this end, the 30 highest scoring PRE/TRE hits were scanned for motifs that occur significantly more often in PRE/TREs than in randomly generated sequence. Five significant motifs were found. Not surprisingly, but reassuringly, two known motifs, the GAF and PHO binding sites were found. The Zeste binding motif was not found by this analysis, although it occurs as frequently as GAGA factor in the 30 sequences analyzed. This is probably due to the shortness and degeneracy of the Zeste motif, and suggests that other such short motifs will also be missed by this approach (Ringrose, 2003).
Nevertheless, three additional motifs were found. The first, called GTGT, is found several times in 14 of the sequences. The second motif, poly T, is found several times in almost all 30 PRE/TRE sequences analyzed. Some variants of this site match the binding consensus for the Hunchback protein, which has been shown to be an early regulator at some PRE/TREs. The third motif, TGC triplets, occurs several times in 13 of the PRE/TRE sequences. No binding factor for this sequence has yet been identified (Ringrose, 2003).
To further examine these three motifs, motif occurrence was evaluated in all 167 predicted PRE/TREs and in the promoter peaks described above. In contrast to the known GAF, Z, and PHO motifs, the three motifs each occur in only a subset of predicted and known PRE/TREs, and do not occur significantly together. These motifs may thus each define a subclass of PRE/TREs. Consistent with this idea, some of the lowest scoring known PRE/TRE sequences indeed contain one or more of the three motifs (Ringrose, 2003).
Although no correlation between particular sites and high scores was found, a negative correlation was found between numbers of GAF/Z and PHO sites (a correlation coefficient of -0.78, indicating that when many GAF/Z sites are present, there are few PHO sites, and vice versa). This suggests that each PRE/TRE may have a preferred ground state, in which it is either predisposed to silencing (many PHO sites) or to activation (many GAF/Z sites) (Ringrose, 2003).
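The negative correlation described above is a standard Pearson coefficient computed over per-element site counts. The following is a minimal sketch of that calculation, not the analysis code used in the study; the function name and the example counts are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical site counts per PRE/TRE (illustrative only, not data from the study).
gaf_z_sites = [12, 9, 7, 4, 2, 1]
pho_sites = [1, 2, 2, 5, 6, 8]

r = pearson(gaf_z_sites, pho_sites)
print(f"r = {r:.2f}")  # a value near -1 means many GAF/Z sites go with few PHO sites
```

A coefficient of -0.78, as reported, indicates a strong but not perfect inverse relationship between the two kinds of site.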
In summary, this analysis identifies three motifs that occur significantly in association with known PRE/TRE motifs. Further functional characterization of these motifs and the proteins that bind them may contribute to a more complete definition of the sequence requirement for PRE/TRE function, and of subclasses of PRE/TREs (Ringrose, 2003).
This study offers four main contributions to the understanding of PRE/TRE function. First, a larger set of sequences has been defined that will facilitate the more complete definition of PRE/TRE sequence requirements. Three motifs have been identified that may contribute to this goal. The definition of the minimal requirement for PRE/TRE function will not be a trivial task. Analysis of motif composition and order in the 167 predicted PRE/TREs reveals that there is a great diversity of patterns, with no preferred linear order. It is possible that each different pattern of motifs reflects a subtly different function. However, the concept of a linear order of motifs may well be irrelevant, because these elements operate in the three-dimensional context of chromatin. The fact that such a diversity of PRE/TRE designs exists indicates that the vast majority of them would defy detection by conventional pattern-finding algorithms, and underlines the advantages of the approach described in this study (Ringrose, 2003).
Although no linear constraints on motif order were found, the fact that only motif pairs, and not single motifs, are able to identify PRE/TREs strongly suggests that this close spacing of sites has functional significance. Multiple sites may work in concert, to promote cooperative binding of similar proteins (e.g., repeated PHO sites) or to provoke competition between dissimilar proteins (e.g., closely spaced GAGA factor and PHO sites). In addition, in chromatin, only a subset of sites will be exposed and optimally available for binding at any one time, while others will be occluded by nucleosomes. The trxG includes nucleosome remodeling machines, raising the intriguing possibility that remodeling of PRE/TREs in chromatin may contribute to epigenetic switching by exposing different sets of protein binding sites (Ringrose, 2003).
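The idea that closely spaced motif pairs, rather than single motifs, carry the signal can be illustrated with a small counting routine. This is a sketch under stated assumptions and not the published prediction algorithm: the motif strings are simplified exact-match placeholders for the GAGA factor and PHO sites, and the spacing limit `max_gap` is an arbitrary illustrative parameter.

```python
def find_sites(seq, motif):
    """Return the start position of every exact (possibly overlapping) match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def count_close_pairs(seq, motif_a, motif_b, max_gap):
    """Count pairs of motif occurrences whose start positions lie within max_gap bp."""
    sites_a = find_sites(seq, motif_a)
    if motif_a == motif_b:
        # Same motif: count each unordered pair of distinct occurrences once.
        return sum(
            1
            for i, p in enumerate(sites_a)
            for q in sites_a[i + 1:]
            if q - p <= max_gap
        )
    sites_b = find_sites(seq, motif_b)
    return sum(1 for p in sites_a for q in sites_b if abs(p - q) <= max_gap)


# Placeholder motifs (assumptions for illustration, not the study's motif definitions).
GAF = "GAGAG"
PHO = "GCCAT"

# Toy sequence: one GAF site, a nearby PHO site, and a distant PHO site.
toy = GAF + "A" * 10 + PHO + "A" * 100 + PHO
print(count_close_pairs(toy, GAF, PHO, max_gap=50))
print(count_close_pairs(toy, PHO, PHO, max_gap=50))
```

In the toy sequence only the GAF site and the first PHO site fall within 50 bp of each other, so a pair-based score registers that cluster while ignoring the isolated distal site.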
Second, a PRE/TRE peak is observed at the promoter of all the genes examined. This strongly suggests that promoter binding is a general principle of PRE/TRE function. It has been reported that PcG proteins can interact with general transcription factors. It has hitherto been unclear whether the observed PcG/trxG binding at promoters of the genes they regulate is mediated indirectly via such an interaction, or whether the PcG and trxG bind directly to PRE/TREs at the promoters. The high scores observed at promoters favor the latter interpretation (Ringrose, 2003).
Third, it has been shown that in most cases, PRE/TREs do not occur in isolation, but are accompanied by one or more other peaks nearby. These grouped PRE/TREs may create multiple attachment sites for PcG and trxG proteins, which come together to build a fully operational complex at the promoter. Alternatively, grouped PRE/TREs may be individually regulated by tissue-specific enhancers as in the BX-C. Thus, each of the many PRE/TREs of the homothorax gene may interact with the promoter PRE/TRE in different tissues. This idea is consistent with the fact that Homothorax has specific roles in diverse developmental processes (Ringrose, 2003).
Finally, the current list of about ten PcG/trxG target genes has been expanded to over 150 genes, identifying candidates for epigenetic regulation. The genes thus identified encompass every stage of development, suggesting that the PcG/trxG are global regulators of cellular memory. Experiments to further investigate and compare this regulation for individual genes are currently underway (Ringrose, 2003).
Polycomb group (PcG) complexes are multiprotein assemblages that bind to chromatin and establish chromatin states leading to epigenetic silencing. PcG proteins regulate homeotic genes in flies and vertebrates, but little is known about other PcG targets and the role of the PcG in development, differentiation and disease. This study determined the distribution of the PcG proteins PC, E(Z) and PSC and of trimethylation of histone H3 Lys27 (me3K27) in the Drosophila genome using chromatin immunoprecipitation (ChIP) coupled with analysis of immunoprecipitated DNA with a high-density genomic tiling microarray. At more than 200 PcG target genes, binding sites for the three PcG proteins colocalize to presumptive Polycomb response elements (PREs). In contrast, H3 me3K27 forms broad domains including the entire transcription unit and regulatory regions. PcG targets are highly enriched in genes encoding transcription factors, but they also include genes coding for receptors, signaling proteins, morphogens and regulators representing all major developmental pathways (Schwartz, 2006).
The components of PcG complexes are products of PcG genes, first discovered as crucial regulators of homeotic genes in Drosophila. Immunostaining of Drosophila polytene chromosomes, however, showed PcG proteins at about 100 cytological loci, implying a much larger number of target genes. Functional analysis has identified PREs as DNA sequences able to recruit PcG proteins and establish PcG silencing of neighboring genes. Two types of PcG complexes bind to PREs. PRC1-type complexes include a core quartet of proteins: PC, PSC, PH and dRing. PRC2-type complexes include E(Z), which methylates histone H3 Lys27. Mono- and dimethylated Lys27 is widely distributed in the genome, but PcG sites characteristically contain trimethylated Lys27 (me3K27). The activity of the E(Z) complex is essential for stable silencing, and it has been proposed that H3 me3K27 recruits the PRC1 complex through the specific affinity of the PC chromodomain for me3K27. But the relationships between PRC1 and PRC2 complexes, between their binding sites and histone methylation, and between binding, methylation and gene expression are not well understood and remain the subject of debate. The genomic distribution of three PcG proteins [PC, PSC and E(Z)] and of histone H3 me3K27 was examined using chromatin immunoprecipitation (ChIP). Since PcG target genes may be repressed in some tissues and active in others, a cultured cell line was used to minimize heterogeneity (Schwartz, 2006).
Viewed at the scale of a chromosome arm, the distributions of PC, PSC, E(Z) and me3K27 coincide at a number of distinct binding peaks (which are referred to as 'PcG sites') that correspond to 70% of the bands reported in salivary gland polytene chromosomes stained with the corresponding antibodies. To minimize false positives, the analysis focused on the PcG sites that showed simultaneous binding of two or more proteins, each above twofold enrichment. Of the 149 PcG sites detected (see the supplemental figure), 95 showed strong binding of all four proteins ('strong' PcG sites), whereas in 54 sites the binding was lower and below threshold for one of the proteins ('weak' PcG sites). At higher resolution, most PcG sites involve two or more genes, often sharing structural or functional similarities. Thus, PcG sites involve the following: engrailed (en) and invected (inv); the PcG genes ph-p and ph-d; the Dorsocross T-box gene cluster; the muscle NK homeobox gene cluster; the wingless cluster; and the two homeotic complexes ANT-C and BX-C (Schwartz, 2006).
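The site-calling rule described above (two or more factors, each above twofold enrichment) can be sketched as a simple classifier. This is an illustration under assumptions, not the study's pipeline: the function name, the dictionary layout and the example values are hypothetical, and 'weak' is treated here as any site that passes the two-factor rule without every factor exceeding the threshold.

```python
def classify_site(enrichment, threshold=2.0, min_factors=2):
    """Classify a candidate region from its fold-enrichment values.

    enrichment maps a factor name (e.g. 'PC') to its ChIP fold enrichment.
    Returns 'strong' if every factor exceeds the threshold, 'weak' if at
    least min_factors do, and 'none' otherwise.
    """
    if not enrichment:
        raise ValueError("no enrichment values supplied")
    above = [name for name, fold in enrichment.items() if fold > threshold]
    if len(above) == len(enrichment):
        return "strong"
    if len(above) >= min_factors:
        return "weak"
    return "none"


# Hypothetical fold-enrichment values (illustrative only).
print(classify_site({"PC": 5.1, "PSC": 4.0, "E(Z)": 3.2, "me3K27": 6.3}))
print(classify_site({"PC": 3.0, "PSC": 2.4, "E(Z)": 1.5, "me3K27": 2.8}))
print(classify_site({"PC": 2.6, "PSC": 1.1, "E(Z)": 0.9, "me3K27": 1.4}))
```

Requiring agreement between independent factors is what suppresses false positives: a region enriched for only one factor is not called a site.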
The Bithorax complex (BX-C) is a cluster of three homeotic genes (Ubx, abd-A and Abd-B) responsible for segmental identity in the abdomen and posterior thorax. The most prominent features are two sharp binding peaks for all three PcG proteins at the sites of the bx and bxd PREs that control Ubx. No peak was detected over the Ubx proximal promoter, although the entire gene shows a low but significant level of PC. A series of lower peaks emerged in the abd-A region and part of the Abd-B gene. Some of these correspond to the known PREs iab-2. In contrast, the distribution of H3 me3K27 oscillated rapidly above a high plateau that covers Ubx and abd-A but not Abd-B. RT-PCR was used to determine the mRNA levels corresponding to these three genes. Transcription of Ubx and abd-A in these cells was very low but distinctly above background. Abd-B was highly transcribed, at levels 300 times higher than Ubx. This pattern of activity was reflected by the distribution of both PcG proteins and me3K27. It is noted that in the Abd-B regulatory region, the previously characterized Fab-7 and Fab-8 PREs neither bound PcG proteins nor were methylated in these cells. The Abd-B gene has five distinct promoters. A sharp resurgence of both methylation and PcG protein binding in the region of the most upstream Abd-B promoter suggests that, in contrast to the other four promoters, this one might be repressed in the cultured cells. RT-PCR analysis using primers specific for mRNAs initiating from each promoter confirmed that the most upstream promoter is silent and that the other four are active. These results support the view that binding of PcG proteins to PREs is associated with transcriptional quiescence, whereas robust transcriptional activity is accompanied by lack of binding to the PREs and lack of Lys27 methylation over the transcription unit (Schwartz, 2006).
Strong genomic sites bind all three PcG proteins. The PSC and E(Z) peaks generally rise sharply and are contained within less than 2 kb, whereas PC frequently forms a broader peak that may include shoulders or subsidiary peaks absent for E(Z) and PSC and subsides to background more gradually. These peak binding regions are thought of as corresponding to PREs, which they in fact do in the cases where these are known. Additional binding peaks may be found within or downstream of the transcription unit. In contrast, distribution of H3 me3K27 at each site is very broad, forming a domain of tens or even hundreds of kilobases encompassing the transcription unit and regulatory regions of one or more genes but, rather than a level plateau, it consists of a series of deep oscillations (Schwartz, 2006).
The strong binding peaks or putative PREs are often associated with low values or troughs in the methylation profile and at secondary peaks the PC distribution frequently echoes methylation peaks. Overall, their relationship does not support the idea that methylation of Lys27 suffices to recruit binding of PC. It is proposed instead that PC bound to the strong binding peaks, the presumptive PREs, is recruited by proteins that bind specifically to those sequences. The weaker PC binding peaks and tails that mirror the methylation profile near PREs may represent a second mode of PC binding mediated by the interaction of the chromodomain with H3 me3K27 (Schwartz, 2006).
It is supposed that methylation domains initiated by a PRE might spread bidirectionally until they encounter 'active' chromatin, characterized by histone acetylation or methylation of H3 Lys4, marks typical of transcriptionally active genes. Alternatively, specific features might shape the methylation domain either positively, by attracting the methyltransferase complex, or negatively, by blocking productive interactions with the PRE. As in the case of the Abd-B gene or of CG7922 and CG7956 genes, sudden drops in levels of me3K27 are generally associated with transcriptional activity. Are insulators involved in protecting CG7922 and CG7956 from silencing, or is the activity of these two genes simply epigenetically maintained from the time the cell line was originally established? Further work is required to answer this question (Schwartz, 2006).
In many cases, the presumptive PRE lies between divergently transcribed genes such as dco and Sox100B. Which of the two is the PRE target? As PREs can act at distances of 20-30 kb, the proximity of PcG peaks to a promoter is not a reliable guide. It is proposed that the methylation domain is the clue to the target of PcG regulation. A PcG peak is not considered to regulate a promoter if the gene is not included in the methylation domain. When multiple genes are included in the methylation domain, it is likely that they are all affected by PcG regulation. However, this study distinguishes between genes that contain methylation as well as one or more PcG proteins and genes that contain only methylation (Schwartz, 2006).
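The target-assignment logic in this paragraph reduces to interval overlap tests. The sketch below is a hypothetical illustration, not code from the study: genes, methylation domains and PcG peaks are represented as half-open (start, end) coordinate pairs on one chromosome, and all names and coordinates are invented.

```python
def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_gene(gene, methylation_domains, pcg_peaks):
    """Assign a gene to one of the classes distinguished in the text."""
    if not any(overlaps(gene, d) for d in methylation_domains):
        # Outside every methylation domain: not considered a PcG target,
        # however close a PcG peak may lie to its promoter.
        return "not a target"
    if any(overlaps(gene, p) for p in pcg_peaks):
        return "methylation + PcG"
    return "methylation only"


# Invented coordinates: one methylation domain with a PcG peak at its right edge.
domains = [(10_000, 60_000)]
peaks = [(58_000, 60_000)]

print(classify_gene((40_000, 59_000), domains, peaks))  # gene reaching the peak
print(classify_gene((12_000, 20_000), domains, peaks))  # gene in the domain tail
print(classify_gene((61_000, 70_000), domains, peaks))  # neighbour outside the domain
```

The third gene lies only 1 kb from the peak yet is excluded, which mirrors the argument that proximity to a PcG peak is not a reliable guide to the regulated promoter.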
The 95 'strong' binding sites in the genome encompass a total of 392 genes. Of these 392 genes, 186 contain both PcG binding and methylation; the remainder are found within broad methylation domains associated with PcG protein binding but do not bind PcG proteins over their own promoter or transcription unit. They may represent genes not directly targeted but affected by the spread of methylation. An analysis of their ontology indicates that these two classes are in fact very different. Transcription regulators constitute 64.5% of the first set, compared to 4.3% for the full annotation set. In contrast, they constitute only 4.0% of those genes that contain only me3K27. These comparisons strongly suggest that (1) genes that regulate transcription are preferred PcG targets, and (2) genes that only include the tails of a methylation domain are probably not primary targets of PcG regulation. A similar preference is also seen among the 'weak' binding sites. These include a total of 74 genes containing both PcG proteins and methylation, 28.4% of which encode transcription regulators. Flanking genes containing only methylation include only 5.7% transcription regulators. Although transcription regulators are preferred PcG targets, secreted proteins, growth factors or their receptors, and signaling proteins are also targeted. PcG target genes include components of all the major differentiation and morphogenetic pathways in Drosophila (Schwartz, 2006).
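The comparison above can be made concrete by expressing each percentage as a fold enrichment over the genome-wide background. The short calculation below uses only figures quoted in the text; the variable and function names are introduced here for illustration.

```python
# Gene counts quoted in the text for the 95 'strong' sites.
GENES_AT_STRONG_SITES = 392
GENES_WITH_PCG_AND_METHYLATION = 186
genes_with_methylation_only = GENES_AT_STRONG_SITES - GENES_WITH_PCG_AND_METHYLATION

# Share of transcription regulators in the full annotation set (percent).
BACKGROUND_PCT = 4.3


def fold_enrichment(observed_pct, background_pct=BACKGROUND_PCT):
    """Ratio of an observed percentage to the background percentage."""
    return observed_pct / background_pct


print(genes_with_methylation_only)                         # genes lacking PcG binding
print(f"strong, PcG + me3K27: {fold_enrichment(64.5):.1f}x")
print(f"strong, me3K27 only:  {fold_enrichment(4.0):.1f}x")
print(f"weak, PcG + me3K27:   {fold_enrichment(28.4):.1f}x")
print(f"weak, me3K27 only:    {fold_enrichment(5.7):.1f}x")
```

Genes that bind PcG proteins are enriched for transcription regulators roughly 15-fold at strong sites and between 6- and 7-fold at weak sites, whereas genes carrying methylation alone sit close to the background rate.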
The major features of PcG binding shown by this work are that, although the proteins themselves are highly localized at presumptive PREs, the domain of histone methylation they produce is much broader. If the E(Z) methyltransferase is localized at the PRE, how is the extensive methylation domain produced? A looping mechanism is proposed in which interaction of PRE-bound complexes with flanking chromatin is mediated by the PC chromodomain. The observed broader distribution of PC might result from crosslinking of the chromodomain to methylated H3, reflecting this mechanism (Schwartz, 2006).
Are PREs defined by characteristic sequence motifs? Although the analysis of the sequences underlying the binding peaks will be presented elsewhere, it is noted that Ringrose (2003) devised an algorithm based on GAGA factor, PHO and Zeste binding motifs to identify sequences likely to represent PREs. This algorithm correctly predicts a number of the strong PcG binding sites (27%) and a few of the weaker sites (7%), overall 20%; however, it does not predict the majority of the PcG sites. The reverse is also true: only 22% of the PREs predicted by Ringrose bind PcG proteins in these experiments. Together, these data suggest that additional criteria are necessary to predict most PREs reliably (Schwartz, 2006).
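The overall figure follows from the two class-specific figures once they are weighted by the number of sites in each class (95 strong and 54 weak, as reported above). The check below is a small worked calculation; the function name is introduced here for illustration.

```python
def pooled_percentage(groups):
    """Combine per-group percentages, weighting each by its group size.

    groups is a list of (number_of_sites, percent_predicted) tuples.
    """
    total_sites = sum(n for n, _ in groups)
    predicted_sites = sum(n * pct / 100 for n, pct in groups)
    return 100 * predicted_sites / total_sites


# 27% of the 95 strong sites and 7% of the 54 weak sites are predicted.
overall = pooled_percentage([(95, 27.0), (54, 7.0)])
print(f"{overall:.1f}% of all 149 PcG sites")
```

The weighted result is just under 20%, in agreement with the overall value quoted in the text.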
As expected, PcG proteins and me3K27 are associated with transcriptional quiescence, but the data suggest that this is not an absolute condition. Low but significant transcription levels are detected even for the repressed Ubx and abd-A genes. Two target sites, polyhomeotic and the Psc-Su(z)2 site, contain PcG genes, which must be active to ensure the functioning of the PcG mechanism. The polyhomeotic locus is one of two sites in the entire genome that bind PC but lack appreciable levels of E(Z) and of Lys27 methylation. In contrast, the Psc-Su(z)2 region is well methylated and binds both PC and E(Z) at multiple peaks. It is concluded that PcG mechanisms do not invariably lead to transcriptional silencing and are compatible with moderate levels of transcription (Schwartz, 2006).
Another point of interest is the number and kind of genes that are PcG targets. Considering the developmental difference between salivary gland cells and the embryo-derived tissue culture cells, the substantial number of shared PcG sites suggests that a majority of target sites are occupied in a large percentage of cells. Target genes are in fact predominantly regulatory genes that control major differentiation and morphogenetic pathways. These pathways and their genes are highly conserved, and recent work shows that they are also regulated by PcG in mammals. It might be expected that in a given cell type most alternative genomic programs would be repressed, save the subset required in that cell type. The emerging picture from these studies is that PcG regulation is a key mechanism in genomic programming (Schwartz, 2006).
Histone H3 variants specify modes of chromatin assembly
Histone variants have been known for 30 years, but their functions and the mechanism of their deposition are still largely unknown. Drosophila has three versions of histone H3. H3 packages the bulk genome, H3.3 marks active chromatin and may be essential for gene regulation, and Cid is the characteristic structural component of centromeric chromatin. The properties of these histones have been characterized by using a Drosophila cell-line system that allows precise analysis of both DNA replication and histone deposition. The deposition of H3 is restricted to replicating DNA. In striking contrast, H3.3 and Cid deposit throughout the cell cycle. Deposition of H3.3 occurs without any corresponding DNA replication. To confirm that the deposition of Cid is also replication-independent (RI), centromere replication was examined in cultured cells and neuroblasts. It was found that centromeres replicate out of phase with heterochromatin and display replication patterns that may limit H3 deposition. This confirms that both variants undergo RI deposition, but at different locations in the nucleus. How variant histones accomplish RI deposition is unknown, and raises basic questions about the stability of nucleosomes, the machinery that accomplishes nucleosome assembly, and the functional organization of the nucleus. The different in vivo properties of H3, H3.3, and Cid set the stage for identifying the mechanisms by which they are differentially targeted. It is suggested that local effects of 'open' chromatin and broader effects of nuclear organization help to guide the two different H3 variants to their target sites (Ahmad, 2002).
Nucleosomes are the fundamental units of chromatin, consisting of 146 bp of DNA wrapped around an octamer of four core histones. Histone deposition occurs primarily as DNA replicates to complete chromatin doubling. During S phase of the cell cycle, new histones are produced in abundance for immediate replication-coupled deposition. In most metazoans, this abundant S-phase synthesis results from the tight regulation of tens to hundreds of intronless histone genes that have special 3' untranslated regions instead of poly(A) tails. However, some histones are produced from orphan genes outside of S phase. In Drosophila, orphan genes encode two H3 variants: one encodes Cid, the centromeric histone, and two encode H3.3, the replacement variant. These variants have equivalents in many other eukaryotes. The H3.3 histone is nearly identical to H3, differing at only four amino acid positions. Cid differs profoundly from H3 in sequence, showing some significant identity only within the histone fold domain. Surprisingly, these three histones have different deposition properties. H3 and H3.3 are deposited as DNA replicates, but both H3.3 and Cid can be deposited at sites that are not undergoing DNA replication. Whereas only a minor fraction of the bulk genome is packaged into Cid- and H3.3-containing nucleosomes, each variant is targeted to different specialized sites, with Cid localizing to centromeres and H3.3 to transcriptionally active genes. Specific localization of centromeric H3-like histones (CenH3s) has been observed in various animals, fungi, and plants. Also, an H3.3-like histone targets the transcriptionally active macronucleus in ciliates. Thus, the targeting of H3 variants is likely a feature of every eukaryotic cell, where centromeres and transcribed regions are the major loci of activity in metaphase and interphase, respectively. Both kinds of loci use a distinct pathway for nucleosome assembly, and this study explores the properties of this process (Ahmad, 2002).
Studies of histone deposition have generally been done using crude extracts, purified components or pools of cells from which bulk chromatin is extracted. These methods reveal the average properties of chromatin, and have shown that the bulk of chromatin doubles as DNA replicates. Extensive in vitro work has demonstrated that the assembly of nucleosomes is a stepwise process in which deposition of an (H3:H4)2 tetramer is followed by addition of two H2A:H2B dimers. The new histones are brought to the replication fork in a complex with chromatin assembly factor 1 (CAF1). CAF1 appears to be recruited to the replication fork by binding to the ring-shaped proliferating cell nuclear antigen (PCNA) that encircles the DNA template at each replication fork. Histones from the parent DNA are distributively segregated to the two sister chromatids behind the replication fork, and the gaps in their nucleosomal arrays are rapidly filled by stepwise assembly of new nucleosomes. These nucleosomes are then matured by addition of linker histones and covalent modification of histone tails to complete chromatin (Ahmad, 2002).
Nucleosomes containing H3 variants comprise only a small
proportion of bulk chromatin, and thus their properties have
been generally undetectable. However, replacement H3 variants
can become enriched in the chromatin of nonreplicating cells. This means that other ways of depositing histones must
exist; but because such variant enrichment has been detectable only
in unusual cell types (such as long-lived neurons or spermatocytes), studies of the phenomenon have been limited. The ability
to tag histones and examine their deposition properties in single
cells has provided insight into chromatin assembly
processes (Ahmad, 2002).
A cytological assay system was developed for studying replication
and chromatin assembly by using Drosophila Kc cells, a cell
line that displays a regular cell division schedule and
a consistent tetraploid karyotype. Organization of the Drosophila
nucleus is visually simple, because the late-replicating heterochromatin
typically coalesces into a compartment in the
nucleus, termed the chromocenter. This provides both
a temporal and spatial distinction between the early replicating,
gene-rich euchromatin, and the late-replicating heterochromatin (Ahmad, 2002).
DNA replication can be tracked either by pulse-labeling with
nucleotide analogs or by using anti-PCNA antibody. Furthermore,
by introducing histone-GFP fusion constructs and producing
a pulse of the tagged protein, histone deposition can be tracked during the cell cycle. Using this system, it has been possible to quantitatively examine DNA replication and histone
deposition in unsynchronized populations of cells (Ahmad, 2002).
GFP-tagged H3 shows exclusively replication-coupled deposition,
displaying co-localization with replication markers and
showing no detectable deposition in cells in which replication has
been blocked. The N-terminal tail of H3 is required, suggesting
that the H3 tails of tetramer particles interact with
accessory factors at some early step in nucleosome assembly in
vivo (Ahmad, 2002).
In contrast to the properties of GFP-tagged H3 in cells, tagged
H3.3 deposits in a replication-independent manner at actively
transcribing loci. Deposition can occur in any stage of the cell
cycle, and it is not accompanied by
unscheduled DNA synthesis. Incorporation of H4 also occurs at
these target sites, as expected for deposition of (H3.3:H4)2
tetramers; but how replication-independent (RI) histone deposition
occurs is virtually unknown. Tagged Cid can also deposit throughout the cell cycle, suggesting that its deposition is also replication-independent (Ahmad, 2002).
However, this conclusion depends on knowing the timing of
centromere replication. Centromeres replicate
within a defined portion of S phase and Drosophila centromeres replicate as isolated domains within
later-replicating heterochromatin (Ahmad, 2002).
Historically, centromeres have been thought to replicate very
late in the cell cycle. This is because they are embedded within
pericentric heterochromatin, which replicates late. Analysis has
usually relied on visualization at mitosis; but mitotic chromosomes
have inherently low resolution because they are highly
condensed. Indeed, a recent study showed that Drosophila
centromeres cannot be resolved from heterochromatin in 44% of
spread mitotic chromosomes. Despite this limitation, it has been concluded that Cid-containing chromatin replicates
on the same late schedule as pericentric heterochromatin. However,
this could be late replication in pericentric heterochromatin
that was mis-scored as replication of centromeres (Ahmad, 2002).
This uncertainty has been addressed by analyzing mitotic
chromosome replication patterns, providing brief 15-min pulses
to Kc cells and examining mitotic figures after a chase. This
provides a 'snapshot' of replication at single points in the cell
cycle. Heterochromatin replication patterns similar to those
previously reported are observed, where labeling
overlaps Cid spots. However, unambiguous
examples of chromosomes that were intensely labeled
throughout the euchromatic arms, with foci directly coinciding
with centromeres, are also observed. These centromeric foci are surrounded by heterochromatin that did not replicate during the
labeling pulse (Ahmad, 2002).
Experiments using interphase Kc cells revealed
that ~90% of centromere replication occurs when euchromatin
is replicating. The remaining 10% may be late replication
in centromeric regions, but is more likely the result of nearby
heterochromatic replication foci that cannot be resolved from
sites with Cid. Such early replication of centromeres is not
limited to tetraploid Kc cells --
similar replication patterns are observed in diploid larval neuroblasts -- although the much shorter cell cycle time and the more
irregular chromocenter limit quantitative analysis. Therefore,
this early timing of centromere replication appears to be general
for Drosophila cells (Ahmad, 2002).
A series of progressively more direct experiments have provided
insight into the fine structure in the centromere region. A model
for the centromeric constriction has suggested that loops of DNA
coil through the constriction, with centromeric nucleosomes
lying in the outward parts of these coils, and conventional
nucleosomes in the interior portions. This would
account for the polar structure of the entire centromere if
centromeric nucleosomes nucleate kinetochore formation (and
thus microtubule capture) and conventional nucleosomes recruit
cohesins (and thus centromeric cohesion). The linear arrangement
of nucleosomes along centromeric DNA would then be
alternating blocks of centromeric and conventional nucleosomes
within the centromeric domain. A study using
stretched chromatin fibers has demonstrated that Cid and H3 are
interspersed in Drosophila, although these are not included in the
same nucleosome. Apparently, blocks packaged in one kind
of nucleosome alternate with blocks packaged in the other (Ahmad, 2002).
How could the duplication of such regular but discontinuous arrays of
nucleosomes occur? The alternating pattern of nucleosomes on stretched chromatin fibers is reminiscent of replication patterns on fibers from
normal chromatin. Replication origins within a chromatin
domain often appear to be regularly spaced with an interval of
50-100 kb, and these origins fire synchronously. Perhaps the
nucleosome blocks in the centromeric regions correspond to an
underlying regular arrangement of replication origins throughout
the entire centromeric domain. If Cid-containing blocks
include the origins for these domains, and if replication initiates
at a time when H3 is not available, ultimately only the RI
deposition of Cid will package these blocks. The later replicating
stretches would incorporate H3 as it becomes available. In this
way, the fine pattern of replication would maintain the discontinuous
Cid arrays over an extended region (Ahmad, 2002).
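The logic of this model can be illustrated with a short calculation. The sketch below is not taken from the study; it assumes that parental nucleosomes segregate evenly to the two sister chromatids, that the resulting gaps are filled completely, and that a block replicating early (before H3 is available) is filled only with Cid whereas a block replicating later is filled only with H3. The function name, the five-block domain and the "simultaneous" comparison case are hypothetical.

```python
def replicate_domain(cid_fractions, replicates_early):
    """Return the Cid fraction of each block after one round of replication.

    cid_fractions    -- fraction of Cid nucleosomes in each block (0.0 to 1.0)
    replicates_early -- True if the block replicates before H3 is available
    """
    if len(cid_fractions) != len(replicates_early):
        raise ValueError("one timing flag is needed per block")
    result = []
    for fraction, early in zip(cid_fractions, replicates_early):
        # Assumption: new nucleosomes are all Cid in early blocks (RI
        # deposition) and all H3 in later blocks (replication-coupled).
        new_histone_cid = 1.0 if early else 0.0
        # Parental nucleosomes make up half of each daughter block.
        result.append(0.5 * fraction + 0.5 * new_histone_cid)
    return result


# Alternating blocks: Cid, H3, Cid, H3, Cid.
domain = [1.0, 0.0, 1.0, 0.0, 1.0]
proposed = [f > 0.5 for f in domain]   # Cid blocks fire early
simultaneous = [False] * len(domain)   # everything replicates with H3 present

state_a = state_b = domain
for _ in range(3):
    state_a = replicate_domain(state_a, proposed)
    state_b = replicate_domain(state_b, simultaneous)

print("proposed model:", state_a)
print("simultaneous:  ", state_b)
```

Under these assumptions the alternating pattern is unchanged after three cycles when Cid blocks replicate early, whereas the Cid blocks are diluted to one eighth of their starting content when every block replicates while H3 is available.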
The model for maintaining the higher-order chromatin structure
of the entire centromere has precise requirements for
replication patterns in this region: a discontinuously spaced
arrangement of origins must correspond to the blocks of Cid-containing
chromatin. At least two other
patterns of replication in this region can be imagined: (1) all
Cid- and H3-containing blocks might replicate simultaneously
(pattern 2); (2) a single origin might replicate the entire
domain (pattern 3) (Ahmad, 2002).
The possibility that discontinuous replication tracks correspond to blocks of centromeric chromatin was investigated by pulse-labeling cells for only
15 min. To prepare stretched chromatin fibers,
nuclei spread on a glass slide were disrupted in a high-salt buffer. As the buffer
runs off the slide, it pulls chromatin fibers behind it. Stretched centromeres were identified and those fibers were examined in
which nucleotide incorporation was unambiguous. In each of
these cases it was clear that replication was occurring in discrete
patches scattered throughout the centromeric domain. These replication tracks must arise from multiple origins, and
thus the possibilities that the entire domain
replicates from a single origin, or that the whole domain
replicates simultaneously can be ruled out (Ahmad, 2002).
These patches correspond significantly with the segments
between Cid-containing chromatin. Thus, from published experiments
and the experiments described here it appears that
replication occurs in two discrete phases: all CenH3-containing
chromatin within a domain replicates, and at a different time all
H3-containing chromatin replicates. Therefore, replication
within this domain is discontinuous and initiates from multiple
origins (Ahmad, 2002).
Given that deposition of any H3 must occur in the form of
(H3:H4)2 tetramers, there must be discrimination of H3-
containing tetramers from tetramers containing variants. Thus
analysis of RI assembly was used to initiate the mapping of discriminating sites within the histone variants. It was found that one type of discrimination is a cluster of three residues within the histone
fold domain (HFD) of H3 that limits it to replication-coupled
deposition. Furthermore, because both Cid and H3.3 undergo
RI deposition but have mutually exclusive targets, there must be
additional discrimination between these variants (Ahmad, 2002).
Replication-coupled nucleosome assembly is aided by accessory
factors that are recruited to the replication fork by binding
to PCNA. However, the process of RI deposition must be
different, because RI deposition of H3.3 does not require
portions of the histone that are required for replication-coupled
deposition. Furthermore, the lack of PCNA during gap phase
deposition raises the question of what is recruiting histones to the
sites. The phenomenon of CenH3 targeting has raised expectations
that a specific, localized chromatin assembly factor or
histone modification will be involved in the targeting of CenH3s. Indeed, a chromatin remodeler of the RSC family,
PBAF, localizes to kinetochores during mitosis of mammalian
cells. Furthermore, RSC mutations in budding yeast alter chromatin structure
specifically around centromeres, and perhaps RSC activity
is involved in assembly of centromeric nucleosomes. Mutations
in CAF and Hir genes also give centromere defects, and it has
been suggested that these factors are involved in loading the
yeast CenH3 Cse4p. However, a role for any of these factors
does little to explain the specific targeting of CenH3s, because
these factors are all widely distributed in the nucleus (Ahmad, 2002).
The best candidate for a uniquely centromere-localized chromatin
assembly factor is the Mis6 protein in fission yeast.
This protein is required for centromeric localization of the
CenH3 SpCENP-A, but Mis6 homologs in budding yeast (Ctf3) and in mammals (CENP-I) localize to centromeres
but are not required for targeting CenH3s. Thus, Mis6
proteins appear to be structural components of centromeres, not
histone assembly factors (Ahmad, 2002).
An alternative model is that some feature of centromeric
chromatin facilitates the targeting of its specialized histones. An
obvious candidate for this feature is that centromeric nucleosomes
themselves bind to and thereby recruit new CenH3
tetramers for future deposition. Such an interaction is a possible
molecular mechanism for direct templating of centromere duplication. Regardless of whether CenH3 targeting involves
specialized co-factors, templating, or both, the question remains
as to why it should use an RI pathway (Ahmad, 2002).
The targeted deposition of H3.3 to active genes is likewise
replication-independent, although transcription-coupled assembly
may facilitate (H3.3:H4)2 deposition. Perhaps H3.3 targeting
is mediated by a component of RNA polymerase complexes (Ahmad, 2002).
Because RNA polymerases move processively along the DNA
during transcription, a contiguous transcribed segment of DNA
might incorporate the H3.3 variant. Alternatively, RI deposition
of H3.3 may be facilitated by any of a number of ATP-dependent
chromatin remodeling complexes to target specific sites near
transcription units. Any candidate factor might be expected to
preferentially use H3.3 instead of H3, but whether there is any
such discriminating factor is unknown, because all in vitro studies
of higher eukaryotic chromatin assembly have been performed
with H3. It is anticipated that this will soon be addressed. However,
the prospects for identifying a unique remodeler that is
required for RI deposition are uncertain, because budding yeast
mutants that eliminate any known chromatin assembly factors do
not eliminate chromatin assembly. Thus
the possibility has to be considered that RI deposition at active genes and at centromeres uses generic remodeling activities, and that components or structural aspects common to both centromeres and actively
transcribed genes may result in RI histone deposition at both
kinds of sites (Ahmad, 2002).
The deposition of histones throughout the cell cycle by a
replication-independent process implies that previously existing
nucleosomes are unraveled, and their histones released. It is
known that the process of transcription results in a local unfolding
of the chromatin fiber and an 'open' chromatin configuration. Although transcription of nucleosomal templates with bacterial polymerases can occur in vitro without displacing histone octamers from DNA, in vivo assays demonstrated that a measurable amount of transcription-dependent
histone displacement does occur in eukaryotic nuclei. In fact, even in vitro, RNA polymerase II is virtually unable to transcribe nucleosomal
DNA under physiological conditions. Transcription requires
that histone-DNA contacts be broken for polymerase to
transit the nucleosomal DNA. Although transcription can occur
without histone displacement if the histone octamer releases
some contacts with DNA and maintains others, at some
frequency all contacts might be released. The histone octamer
would then simply fall off. Additionally, localized remodeling
factors will disrupt nucleosome structure as they act. The in vitro
and in vivo observations can be reconciled if histone displacement
occurs occasionally as nucleosomes are disrupted. Constraints on nucleosomes in a compacted chromatin fiber (i.e., 'closed' chromatin) would limit histone displacement (Ahmad, 2002).
Although internucleosome forces within inactive chromatin are
uncharacterized, they have been inferred from numerous experiments,
including the tendency of nucleosomes within heterochromatin
to form extremely regular and fixed arrays. A
likely constraint in heterochromatin arises from the multimeric
associations that occur between heterochromatin-specific non-histone
chromatin proteins. Attention has focused on the heterochromatin
protein-1 (HP1). HP1 is recruited to heterochromatic
DNA by binding, through its chromodomain, to the H3 tail
when it is methylated at lysine-9 (H3-K9me). The chromo
shadow domain of HP1 mediates associations between HP1
molecules, and multimers of HP1 bound to methylated histone
tails provide one basis for constraining arrays of nucleosomes (Ahmad, 2002).
Although the state of chromatin in heterochromatin and in
actively transcribed regions is well known, less is known about
the chromatin fiber packaged by centromeric nucleosomes. However, these regions appear to be open. Centromeric DNA is
sensitive to micrococcal nuclease digestion both in budding yeast and in the central core region of fission yeast centromeres
where SpCENP-A-containing nucleosomes reside, and
plant meiotic centromeres appear decondensed. In addition,
early replication is a feature of open chromatin, and
centromeric chromatin replicates before surrounding heterochromatin (Ahmad, 2002).
An open configuration may arise from at least three
sources. (1) All CenH3s lack a canonical H3 tail. Because
methyl-modification of lysine-9 appears to be the key epitope to
maintain heterochromatin, the lack of this site in centromeric
nucleosomes means that such regions cannot become heterochromatic. Indeed, the heterochromatin protein HP1 is not associated with chromatin packaged by CenH3s. (2) A recent study of Cid homologs in drosophilids has uncovered DNA minor-groove binding motifs in the Cid tail outside of the nucleosome core. Extension of the Cid tail along linker DNA between nucleosomes may inhibit compaction of the
nucleosome strand, thus maintaining these regions in an open
configuration. (3) Chromatin remodeling factors that
destabilize nucleosomes are found both at active genes and centromeres, and their activity will promote histone replacement. It is suggested that an open chromatin configuration is the common basis for RI deposition at centromeres and at actively transcribed genes (Ahmad, 2002).
If open chromatin were the sole basis for RI deposition, then we
would expect that active genes and centromeres would incorporate
both H3.3 and CenH3s. However, their deposition is
mutually exclusive. This exclusivity is likely to rely on multiple
mechanisms that act on all steps in nucleosome assembly. Factors that discriminate between H3.3 and Cid would be the
best candidates for directing these variants to their targets. However, the organization of the nucleus provides a clue as to
another way in which exclusive targeting may be accomplished. Centromeric DNA in Drosophila is flanked by repeated sequences
that are packaged into heterochromatin, and this forms
a compartment at interphase in which centromeres are embedded
in heterochromatin. The active rDNA genes are the
primary sites of H3.3 deposition and they are also found in a
distinct nuclear compartment, the nucleolus, next to the chromocenter (Ahmad, 2002).
This functional nuclear organization is very simple to
see in Drosophila, where all heterochromatin typically associates
into one large chromocenter, and the active rDNA arrays also
often associate to present one large nucleolus. In fact, this
general compartmentalization is almost invariant in eukaryotes,
and has led to the idea that heterochromatin somehow protects
centromeres and NORs. Although both Cid and H3.3
undergo RI deposition, their exclusive targeting could in part be
accomplished by restricting one or both variants within the
nucleus. For example, unincorporated (Cid:H4)2 tetramers
might be sequestered within the heterochromatic chromocenter. Cid deposition would then appear targeted to the centromere,
because this is the only site within the chromocenter with open
chromatin (Ahmad, 2002).
Whether (Cid:H4)2 tetramers are actually sequestered in this
way is unknown. Indeed, whether sequestering substrates can
have any effect on reactions within the nucleus has become a
pressing issue. Many nuclear components remain mobile,
but functional experiments argue that certain effects in the
nucleus actually only occur when components are sequestered. It is likely that some reactions in the nucleus are relatively
independent of localization because they associate efficiently
with their partners and their reactions proceed quickly. Conversely,
reactions that involve weak interactions or multiple steps
may require raising the effective concentration of their substrates
by nuclear sequestration (Ahmad, 2002).
It has been suggested that the heterochromatic compartment
is involved in histone traffic within the nucleus. The
basis of this hypothesis was the realization that Cid-containing
chromatin behaves unusually during S phase. Generally, the
deposition of H3 quickly follows DNA replication. However, the
replication of Cid-containing centromeric DNA occurs without
H3 deposition, implying that the normal coupling between
replication components and nucleosome assembly components
must be broken. Because this coupling is thought to result from
an interaction between chromatin assembly factor 1 histone complexes and PCNA, the simplest explanation for
uncoupling the two processes would be to sequester replicative
nucleosome assembly factors away from centromeres. It is imagined
that unincorporated H3-containing tetramers might be
sequestered in euchromatin in the first half of S phase, and would
thus never (productively) see the replication forks at centromeres
within the heterochromatic compartment. This uncoupling
might be necessary to prevent dilution of centromeric
nucleosomes by conventional nucleosomes that would assemble
after replication-coupled deposition. Genetic experiments in
budding yeast and Drosophila suggest that CenH3s and H3 do
compete for assembly (Ahmad, 2002).
One way that a competition between CenH3 and H3 histones
can be probed is to change their relative concentration. A tagged Cid protein exclusively deposits at centromeres when it is ectopically expressed at low levels
from a heat-shock-inducible promoter. However, it is
apparent that expression from this construct remains low. Re-engineering the transcriptional start region of the construct
to include a translational initiation consensus site now allows
overproduction of Cid in cells (Ahmad, 2002).
To analyze the behavior of excess quantities of Cid protein, an overexpression construct was introduced into Drosophila Kc cells. Cells receive varying amounts of transfected DNA, and
thus express Cid over a wide range of levels. In cells that express
low amounts of the ectopic protein, Cid localizes to centromeres,
as expected. However, a new localization pattern for Cid is seen
at high expression levels: the tagged protein localizes to centromeres
and throughout euchromatin. The incorporation pattern
of ectopic Cid is especially clear on mitotic chromosomes from
these transfections, where the tagged protein is incorporated
throughout the euchromatic arms as well as at centromeres. It is concluded from this result that excess Cid can be
deposited at sites other than centromeres. Normal cells must
have mechanisms to prevent euchromatic deposition, but over-expression
is sufficient, by itself, to overcome this restriction (Ahmad, 2002).
The mis-incorporation pattern of Cid shows an interesting
specificity: Cid can deposit at centromeres and euchromatin but
not in heterochromatin. Therefore, heterochromatin
must either lack the feature that tolerates mis-incorporation, or must
actively exclude Cid. It is argued that centromeres and
euchromatin share the feature of open chromatin, which is
proposed to be the first prerequisite for RI deposition of histone
variants. Indeed, the mis-incorporation of Cid into euchromatin
is replication-independent, because it occurs both when euchromatin
is replicating in early S phase, and in late S phase
when euchromatic replication is complete. It is suggested
that Cid is contaminating open chromatin in the euchromatic
compartment when it is overexpressed (Ahmad, 2002).
What normally prevents the deposition of Cid into euchromatin?
Endogenous Cid is present only at low levels, and
mis-incorporation could be avoided if Cid were sequestered
away from euchromatin in the nucleus. If unincorporated Cid
were sequestered in the heterochromatic chromocenter, it would
be unable to deposit in the closed chromatin of this compartment. Thus, sequestration might serve two purposes: deposition
in euchromatin would be prevented and deposition at centromeres
would be promoted. Overexpressed CenpA in
mammalian cells is also mis-incorporated into euchromatin (Ahmad, 2002).
Although it has not been examined whether CenpA mis-incorporation
is replication-independent, this is expected to be the
case, because this is how CenpA deposits at centromeres (Ahmad, 2002).
The idea that histone variants may respect nuclear compartments
was first raised by experiments expressing heterologous
CenH3s in Drosophila Kc and human HeLa cells. These extremely diverged heterologous histones do not localize to centromeres in these cells, implying that there is some kind of specificity for depositing the correct CenH3 at centromeres (Ahmad, 2002).
Surprisingly, heterologous histones are preferentially enriched
in the heterochromatic blocks. It has been suggested that there is a default ability of cells to enrich diverged H3 variants in the heterochromatic compartment. Perhaps heterochromatic enrichment is a
normal first step in the deposition of the endogenous CenH3s (Ahmad, 2002).
Those experiments and overexpression results encourage the
view that nuclear compartments may guide histone variants to
the correct subset of their potential deposition sites. Compartment
effects may also affect the RI deposition of H3.3 in an
inverse way to Cid: i.e., sequestering to promote H3.3 deposition
at active genes, and preventing its deposition at centromeres (Ahmad, 2002).
Because H3.3 is largely identical to H3, the hypothetical element
that is recognized in H3 and results in its exclusion from
chromocenters during centromere replication may also be
present in H3.3. Perhaps this discrimination against canonical
H3 histones also serves to prevent the RI deposition of H3.3 at
centromeres (Ahmad, 2002).
RI assembly permits immediate chromatin repair. The unfolding
of chromatin during transcription may be damaging, in that the
forces RNA polymerases apply to their template DNA should at
least occasionally displace histone octamers from DNA. Additionally, histone octamers may sometimes be displaced by
chromatin remodeling factors associated with transcriptional
activity. In either case, these regions must be repackaged into
nucleosomes. Similarly, replacement of CenH3s may be required
to maintain the nucleosomal configuration of centromeres after
mitosis. Bundles of microtubules drag a chromosome to the pole
during anaphase, and the forces they apply may be sufficient
to occasionally pull off histone octamers. Chromatin would then
be stripped of some CenH3 histone octamers. RI deposition
allows repair of this damage. In fact, the RI deposition of CenpA
in mammalian cells seems to occur around the time of mitosis. The deposition of Cid in Drosophila cells occurs throughout
the cell cycle, but may only be required at two points: as
centromeric DNA replicates to double its chromatin, and after
mitosis to repair stripped chromatin (Ahmad, 2002).
The process of RI assembly at active genes provides a novel
level of control over histone modifications. Replacement of
nucleosomes in one modification state by new histones could
switch chromatin to an active state. Initiation of transcription
would start this process, and successive transits of RNA polymerases
would promote RI assembly. The replacement H3
histone in alfalfa is hyperacetylated, and RI assembly with
acetylated histones could enrich such modifications in active
chromatin. However, histone modification by methylation has
appeared more problematic. A number of histone methyl-transferases
(HMTs) have been characterized, but no
histone demethylase is known. Methylated lysine-9 in the H3 tail
(H3K9me) is a critical epitope for recruiting heterochromatic
chromatin proteins, because this is the binding site for HP1. HP1
recruits additional heterochromatic proteins including the Su-var3-9 HMT. Therefore, it is straightforward to imagine how
these recruited proteins could perpetuate a heterochromatic
state through replication-coupled nucleosome assembly and cell
division (Ahmad, 2002).
Because an irreversible methyl modification appears to specify
the heterochromatic state, it has been unknown how a heterochromatic
site could switch to an active state. One route for
switching might be to prevent the methylation of nucleosomes
assembled during replication. Successive cell cycles could then
dilute methylated nucleosomes, allowing eventual activation (Ahmad, 2002).
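The dilution argument can be made concrete with a small calculation. The sketch below rests on illustrative assumptions rather than measurements: parental nucleosomes segregate evenly to the two sister chromatids, no newly assembled nucleosome becomes methylated, and no methylated nucleosome is removed by any other route. The function name is hypothetical.

```python
def methylated_fraction(initial_fraction, cycles):
    """Fraction of nucleosomes at a site still carrying H3K9me after
    `cycles` rounds of replication, under the assumptions stated above."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must lie between 0 and 1")
    # Each round of replication doubles the chromatin, so the parental
    # (methylated) nucleosomes make up half of each daughter array.
    return initial_fraction * 0.5 ** cycles


for n in range(6):
    print(n, methylated_fraction(1.0, n))
```

Starting from a fully methylated site, the marked fraction is 12.5% after three divisions and 6.25% after four, so under these assumptions passive dilution needs several cell cycles to approach an unmarked state.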
However, more rapid mechanisms for activating silenced chromatin
must exist. Induction of silenced genes can occur within a
single cell cycle; for example, X chromosomes become reactivated
and lose H3K9me during diplotene in the Caenorhabditis
ovary. Work using a reporter for heterochromatic
gene silencing suggests that switching to an active
state can occur in somatic cells without cell division. Thus,
H3K9me can be removed without replication-coupled nucleosome
assembly (Ahmad, 2002).
RI deposition implies that the entire heterochromatic nucleosome
may be unraveled and replaced. The process of
transcriptional activation may force the disassembly of H3K9me-containing
nucleosomes, followed by RI assembly of an unmarked
nucleosome. Although the fate of the
displaced methylated H3 is not known, it is known that RI deposition can
occur at any time in the cell cycle, and thus should be able to
rapidly derepress silencing. Conversely, an active gene
could be silenced by methylating the tail of H3.3, which presents
the same lysine-9 epitope. The stability of histone methylation
gives it a distinct advantage over other histone modifications for
heritable effects on chromatin. The possibility of RI deposition
circumvents the irreversible nature of methylation, thus retaining
the potential to switch the heritable chromatin state at a later
time (Ahmad, 2002).
It is concluded that H3 variants are used to package functionally specialized chromatin,
where they play vital functional roles. Localizing these
variants to centromeres and to transcriptionally active regions
utilizes an RI process that is distinct from the nonspecific,
replication-coupled method of packaging the bulk genome. It is
argued that RI deposition is the consequence of the
activities that impinge on these sites in the genome and create an
open chromatin structure. This flexibility in histone deposition
may be necessary to maintain the nucleosomal structure of these
regions. In higher eukaryotes, the RI deposition process allows
specialized chromatin to be distinguished at the most basic level,
where histone variants are incorporated into chromatin. The
differences between the generic H3 -- which packages the bulk of
the genome -- and the H3 variants may contribute to the physical
properties of specialized regions and recruit particular non-histone
chromatin proteins. Because histones remain associated
with DNA through mitosis, these variants establish heritable
distinctions in chromatin (Ahmad, 2002).
Centromeres are a defining feature of eukaryotes, and all are
likely to have a CenH3. However, the utilization of two conserved
versions like H3 and H3.3 is not universal. For example,
budding yeast has only one canonical H3 histone, which undergoes
both replication-coupled and RI deposition. Surprisingly,
this is H3.3: phylogenetic analysis reveals that ascomycetes
have lost H3, whereas their sister clade basidiomycetes have both
H3 and H3.3, as do animals. Therefore, an H3.3
gene performs all general functions in some organisms. The
extraordinary conservation of H3.3, which is identical from
mollusks to mammals, speaks to its fundamental role in the
eukaryotic nucleus (Ahmad, 2002).
The conserved 3'-5' RNA exonuclease ERI1 is implicated in RNA interference inhibition, 5.8S rRNA maturation and histone mRNA maturation and turnover. The single ERI1 homologue in Drosophila melanogaster, Snipper (Snp), is a 3'-5' exonuclease, but its in vivo function remains elusive. This study reports a Snp requirement for normal Drosophila development, since its perturbation leads to larval arrest and tissue-specific downregulation results in abnormal tissue development. Additionally, Snp directly interacts with histone mRNA, and its depletion results in a drastic reduction in histone transcript levels. It is proposed that Snp protects the 3'-ends of histone mRNAs and that, in its absence, histone transcripts are readily degraded. This in turn may lead to cell cycle delay or arrest, causing growth arrest and developmental perturbations (Alexiadis, 2017).
Acetylation and methylation: Covalent modifications of chromatin and DNA that establish and maintain the heterochromatin-induced silenced state
A self-reinforcing network of interactions among the three best-characterized covalent modifications that mark heterochromatin (histone hypoacetylation, histone H3-Lys9 methylation, and cytosine methylation) suggests a mechanistic basis for spreading of heterochromatin over large domains and for stable epigenetic inheritance of the silent state. Early cytological studies have distinguished two types of chromatin: euchromatin and heterochromatin. Heterochromatin was originally defined as that portion of the genome that remains condensed and deeply staining (heteropycnotic) as the cell makes the transition from metaphase to interphase; such material is generally associated with the telomeres and pericentric regions of chromosomes. Subsequent work has identified a cluster of structural features that characterizes heterochromatin. While heterochromatic regions are rich in repetitive sequences and have a low gene density, they are not devoid of genes; it is estimated that there are ~40-50 genes within the pericentric heterochromatin of Drosophila. An altered packaging of heterochromatin, to a less-accessible form, has been demonstrated by probing with nucleases and other reagents such as prokaryotic DNA methyltransferases. The data suggest that while nucleosome arrays in euchromatin are irregular, punctuated by the nucleosome-free hypersensitive sites (HS sites) characteristic of active genes, the nucleosomes in heterochromatin have a regular spacing over large arrays, with a higher proportion of the DNA associated with the histone core rather than in the linker. Euchromatic regions silenced by nucleosome packaging are referred to as 'silent chromatin,' reserving the term 'heterochromatin' for the classically defined heterochromatin (Richards, 2002).
It is an interesting paradox that while the histones are among the most conserved proteins known in evolution, they are also among the most variable in posttranslational modification. The pattern of modifications has been suggested to act as an information code (the histone code), dictating both nucleosomal interactions and the association of nonhistone chromosomal proteins that collectively influence packaging and gene regulation. Modifications include acetylation, methylation, phosphorylation, ubiquitination, and ADP-ribosylation. Given the number of sites of posttranslational modification for each of the four core histones, an imposing number of differently modified nucleosomes is possible. The modification states of the N-terminal tails of histones H3 and H4 appear to play a major role in heterochromatin formation (Richards, 2002).
One modification of histones, hypoacetylation of lysine residues, is associated with both formation of heterochromatin and gene silencing. Early attempts to fractionate chromatin and characterize the components led to the suggestion that heterochromatic domains were associated with hypoacetylated histones, while euchromatic domains were associated with hyperacetylated histones. This distinction is observed not only between constitutive heterochromatin and euchromatin, but also in mapping studies comparing an active or inducible gene to flanking regions (Richards, 2002).
Histone H3 methylated at lysine 9 (H3-mLys9), a second modification of histones, has been identified as characteristic of the heterochromatic state. Immunofluorescent staining of Drosophila polytene chromosomes shows that the bulk of the H3-mLys9 is present in the pericentric heterochromatin and in a banded pattern on the fourth chromosome, known sites of repetitive DNA (Jacobs, 2001). Similarly, chromatin immunoprecipitation (ChIP) experiments demonstrate that H3-mLys9 is a prominent component of the silent mating type locus in fission yeast (Schizosaccharomyces pombe), while essentially absent from flanking regions containing inducible genes. Methylation of histone H3-Lys9 has also been associated with the silencing of euchromatic genes (Richards, 2002).
A third biochemical marker of heterochromatin is the most common form of DNA modification in eukaryotes, namely cytosine methylation. Although absent in some eukaryotes, this DNA modification is widely distributed in the eukaryotic kingdom. It is particularly prevalent in plants and mammals where it is an important epigenetic mark that contributes to the stability of pericentromeric heterochromatin and plays a central role in cementing and maintaining epigenetic expression states, not only in heterochromatin but in silenced euchromatic domains (Richards, 2002).
Hypoacetylation, particularly of histones H3 and H4, associated with heterochromatic domains from a range of organisms, has been studied in greatest detail in Saccharomyces cerevisiae. Many of the cis- and trans-acting factors necessary to establish and maintain the silent state at the telomeres and HML/HMR loci have been identified. These studies have demonstrated the need for hypoacetylated histones. Silencing is mediated by the multiprotein, nucleosome binding SIR(1-4) complex, recruited by interaction with specific DNA binding proteins. Sir3 and Sir4 interact specifically with the N-terminal tails of histones H3 and H4 in the hypoacetylated state. While the N-terminal tails of the histones are not required individually for growth in yeast, they do play an essential role in silencing, amino acids 4-20 of H3 and 16-29 of H4 being required. Certain sir3 alleles can suppress the silencing defect of histone H4 tail mutations, and Sir3 and Sir4 can bind to the amino termini of histones H3 and H4 in vitro, suggesting direct interaction. Recent studies using antibodies against different histone acetylated isoforms indicate that histones in the telomeric and HML/HMR heterochromatin are hypoacetylated at all modification sites (Richards, 2002 and references therein).
What is the mechanism for histone hypoacetylation specifically at the heterochromatic domains? This function is apparently provided, at least in part, by Sir2 (Drosophila homolog: Sir2), shown to have a NAD-dependent protein deacetylase activity. Sir2 can efficiently deacetylate histones in vitro, preferentially deacetylating histone H4 at Lys16, although direct action in vivo has not yet been reported. Enzymatic activity of Sir2 is required for silencing in the heterochromatic domains. The acetylation status of H4-Lys16 may be of particular importance. Lys16 is the preferred site of acetylation in monoacetylated H4 of euchromatin in yeast, and this is the only acetylatable H4 site whose mutation strongly affects Sir3 binding in heterochromatin. Deletion of sir3 results in increased histone acetylation in heterochromatic domains, as well as a loss in silencing. The results suggest an assembly model in which interaction of the Sir2-Sir4 complex with specific DNA binding proteins leads to local histone deacetylation, permitting binding of Sir3. It appears that binding of Sir3 to the hypoacetylated histone blocks reacetylation. Given the interactions between Sir2, Sir4, and Sir3, once initiated, such a complex could spread along the nucleosome array, generating and maintaining the altered modification state (Richards, 2002 and references therein).
In addition to the above, studies in S. pombe, Drosophila, and other organisms suggest that the histone acetylation level is used as a heritable mark of the chromatin state. Mutations in HDACs or treatment with trichostatin A (TSA), an inhibitor of some HDACs, frequently result in a loss of function in heterochromatic domains and a relaxation of silencing. For example, treatment with TSA results in functionally deficient centromeres and chromosome loss in S. pombe, concomitant with a loss of silencing for test genes within the centromeric heterochromatin. The hyperacetylated state is heritable following removal of TSA; it is linked in cis to the treated centromere locus, and correlates with inheritance of functionally defective centromeres, demonstrating an epigenetic phenomenon based on the chromatin structure. Conversely, acetylation can be used as an inherited mark of activity. Histone H4-aLys16 is prominently associated with the dosage-compensated, 2-fold more active X chromosome in males of Drosophila. This specific modification is due to MOF, an essential acetyltransferase of the dosage compensation complex that coats the male X chromosome. The complex remains associated with its target DNA throughout the cell cycle, providing the means to replicate the modification state. Recruitment of HATs to chromosome regions showing histone acetylation patterns corresponding to their own catalytic specificity has been observed, e.g., the histone acetyltransferase P/CAF binds preferentially to acetylated H4 and H3 peptides via a bromo domain. These observations provide evidence for use of the histone acetylation state as an epigenetic mark (Richards, 2002 and references therein).
How does hypoacetylation impact chromatin structure? In the case described above, the hypoacetylated histone tails interact specifically with the SIR complex. While Sir2 orthologs have been identified, few proteins with similarity to the other Sir proteins have been found in multicellular eukaryotes. Nonetheless, there may be an equivalent of the SIR complex that makes similar use of the histone hypoacetylation signal. However, a significant effect might be realized through the interaction of the histone H3/H4 tails with the DNA and/or other nucleosomes in the chromatin fiber. The regions of the histone H3 and H4 tails that contribute to DNA binding, as observed in the crystal structure, are necessary for silencing of basal transcription in vivo. However, these regions are distinct from those critical for repression at the HM loci and telomeres. It appears unlikely that simply weakening intranucleosomal histone-DNA interactions by histone acetylation could alleviate the inhibitory effect of heterochromatin structure on transcription. An alternative possibility was suggested by the original crystals of the nucleosome (using histones from Xenopus), where histone H4 amino acids 16-24 were observed to interact with the acidic region formed by histones H2A and H2B on the surface of the adjacent histone octamer. The eight H2A/H2B amino acids involved in forming this negatively charged patch are highly conserved. Acetylation of the H4 tail might disrupt this interaction, leading to a loss of compaction along the chromatin fiber. However, this disposition of the histone H4 tail is not seen in crystals of the nucleosome made using yeast histones, and additional studies are needed to resolve this interesting question (Richards, 2002 and references therein).
A key role for a second histone modification in the specification of heterochromatin is shown by the recent demonstration that mammalian homologs of Drosophila Su(var)3-9, including human SUV39H1 and murine Suv39h1, encode enzymes that specifically methylate histone H3 on lysine 9 (Rea, 2000). Su(var)3-9 was originally identified as a suppressor of PEV in Drosophila, indicating that the wild-type gene product is involved in heterochromatin formation (Tschiersch, 1994). A homolog in S. pombe, Clr4, is also a specific histone H3-Lys9 methyltransferase, suggesting that this activity is widely distributed and well conserved. clr4 mutants exhibit reduced heterochromatin formation at centromeres, with elevated mitotic chromosome loss and reduced silencing within both pericentromeric heterochromatin and the silent mating type locus. Similarly, mammalian Su(var)3-9-like proteins have been implicated in both centromere activity and gene silencing. Disruption of the murine Suv39h1 and Suv39h2 paralogs causes genome instability, chromosome mis-segregation, and male meiotic defects (Richards, 2002 and references therein).
Further, the Suv39h1/SUV39H1 proteins are found in association with M31, a mouse Heterochromatin protein 1 (HP1) homolog. HP1, perhaps the best-characterized protein found in heterochromatin, was identified in Drosophila melanogaster in a screen of monoclonal antibodies prepared against proteins tightly bound in the nucleus. Immunofluorescent staining of the polytene chromosomes shows HP1 concentrated in the pericentric heterochromatin, the telomeres, and a banded pattern across the small fourth chromosome, known sites of repetitive DNA with characteristics of heterochromatin. A few prominent HP1 sites are observed within the euchromatic arms (e.g., region 31). Homologs of HP1 are associated with pericentric heterochromatin in organisms from S. pombe to humans. The protein (206 amino acids in Drosophila) has a conserved N-terminal chromo domain (CD) followed by a variable hinge region and a conserved C-terminal chromo shadow domain (CSD). The chromo domain was first recognized by similarity with a domain in Polycomb, a protein associated with silencing of the homeotic genes during development; this domain has now been identified in many other chromosomal proteins. Both point mutations in the chromo domain and presumed null mutations (early truncation of the translation product) in the gene encoding HP1 [Su(var)2-5] result in a loss of silencing, while an additional dose will increase silencing of a variegating euchromatic gene, i.e., one placed in a heterochromatic environment. Interestingly, the converse is true for those few genes normally resident within the pericentric heterochromatin (e.g., light), which appear to be dependent on HP1 for normal activity. The conserved structure of HP1 suggests that it might serve as a bifunctional reagent, helping to organize and maintain heterochromatin structure. HP1 interacts with a number of other chromosomal proteins, including several involved in nuclear assembly, replication, and gene regulation. These interactions have generally been mapped to the chromo shadow domain. The chromo shadow domain can homodimerize, and the dimer has been suggested to be the interactive species (Richards, 2002 and references therein).
The HP1 chromo domain specifically binds histone H3 N-terminal tails methylated on lysine 9, and a variety of data suggest that this interaction is essential for maintenance of heterochromatin. The interaction appears quite specific; neither the chromo domain of Polycomb nor the chromo shadow domain of HP1 shows this interaction. The H3 tail fits within a groove established by conserved chromo domain residues; Su(var) mutation V26M results in an alteration of the structure and loss of H3-mLys9 binding. Studies in mammalian cells suggest that localization of HP1 in heterochromatin is dependent on the presence of histone H3-mLys9. However, HP1 association with heterochromatin in Drosophila can be driven either by the N-terminal portion (with the chromo domain) or the C-terminal portion (with the shadow domain), emphasizing the bifunctional nature of the protein. The above results argue that an interaction between the specifically modified histone H3 and HP1 is essential for maintaining a stable heterochromatin structure (Richards, 2002 and references therein).
Histone H3-Lys9 methylation is influenced by preexisting modifications of histone H3 and affects other histone modifications, implying a set of functional interactions. The relationship between hypoacetylation of H3/H4 and methylation of H3 has been clarified by studies of heterochromatin formation in S. pombe. clr1-clr4, clr6, swi6, and rik1 mutations all identify trans-acting factors necessary for silencing at the S. pombe mating type locus. Swi6 is a homolog of HP1, while clr1 and rik1 code for putative DNA binding proteins. The products of clr3 and clr6 are homologs of HDACs. Clr4 is the H3-Lys9 methyltransferase. These genes work together, acting on the entire silent mating type domain to maintain it in the repressed state. Clr3, an H3-specific deacetylase, and Rik1 are required for histone H3-Lys9 methylation by Clr4, and Swi6 localization is dependent on Clr4 and Rik1 (Richards, 2002 and references therein).
These observations suggest a progression of events leading to establishment of a distinctive heterochromatic structure based on the histone modification pattern. Deacetylation of histone H3 by Clr6 and/or Clr3 creates conditions favoring methylation at H3 Lys9 by the Clr4/Rik1 complex; methylation leads to binding of Swi6, establishing a chromatin configuration that is refractory to transcription and stably maintained. Mapping studies using chromatin immunoprecipitation show H3-mLys9 and Swi6 found throughout, and limited to, the 20 kb silent mating type domain. This 20 kb region is flanked by inverted repeats IR-L and IR-R, which appear to serve as barriers to the spread of silencing; removal of these repeats results in the appearance of H3-mLys9 and Swi6 on neighboring sequences. Silencing is dependent on the dosage of Swi6, which remains bound to the mating type region throughout the cell cycle and may itself be a marker for heterochromatin formation (Richards, 2002 and references therein).
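The spreading-and-barrier behavior described above can be illustrated with a deliberately simple sketch. It is a toy model, not a description of the actual biochemistry: the function name, the one-nucleosome-per-step spreading rule, and the array size and barrier positions are all assumptions chosen for illustration, with the two barriers standing in for IR-L and IR-R.

```python
def spread_silencing(n_nucleosomes, nucleation_site, barriers=frozenset()):
    """Return the set of nucleosome indices that end up carrying the silent marks.

    Hypothetical toy model: a marked nucleosome passes the mark to its
    immediate neighbours, one step at a time. A barrier position can
    never be marked, so spreading cannot pass through it.
    """
    if nucleation_site in barriers:
        raise ValueError("nucleation site must not be a barrier position")
    marked = {nucleation_site}
    frontier = {nucleation_site}
    while frontier:
        next_frontier = set()
        for i in frontier:
            for j in (i - 1, i + 1):
                inside = 0 <= j < n_nucleosomes
                if inside and j not in barriers and j not in marked:
                    next_frontier.add(j)
        marked |= next_frontier
        frontier = next_frontier
    return marked


# Assumed layout: 20 nucleosomes, nucleation at index 10,
# barriers (standing in for IR-L and IR-R) at indices 5 and 15.
with_barriers = spread_silencing(20, 10, barriers={5, 15})
without_barriers = spread_silencing(20, 10)
print(sorted(with_barriers))
print(sorted(without_barriers))
```

With the barriers in place the marks stay confined to the interval between them; with the barriers removed the marks reach the neighboring positions, in the same spirit as the appearance of H3-mLys9 and Swi6 on flanking sequences when the inverted repeats are deleted.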
The findings suggest a mechanism for maintaining heterochromatin structure following replication and for driving the spread of heterochromatin. During replication, the DNA must be 'unpackaged' and the daughter DNA molecules repackaged into nucleosomes. Parental histones are efficiently reutilized, distributed randomly to the two daughter DNA molecules; an equal amount of newly synthesized histone is required to complete assembly. Assuming that the histone H3-mLys9 in a heterochromatic domain is stable (no histone demethylases have been identified as yet), it will associate with HP1 through the chromo domain. The presence of HP1 will result in assembly of a modifying complex, presumably through the chromo shadow domain, that will deacetylate and specifically methylate the newly arrived histone, perpetuating the pattern of modification and HP1 binding to establish a heterochromatic structure. Recovery of a SUV39H1-HDAC1 complex from Drosophila embryo extracts that can methylate preacetylated histones supports such a model (Czermin, 2001). Formation of complexes that both recognize a particular pattern of histone modification and have the ability to achieve that pattern provides a mechanism for epigenetic inheritance of chromatin structure. The same machinery could account for spreading of heterochromatin, requiring that boundaries to such spread be established (Richards, 2002 and references therein).
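The inheritance scheme proposed above can be sketched as a small simulation. This is only an illustrative model under stated assumptions: each parental nucleosome is retained on a given daughter with probability 1/2, new histones arrive unmarked, and a marked nucleosome (via HP1 and the recruited modifying complex) causes an adjacent unmarked nucleosome to be marked. The function names and the nearest-neighbor rule are hypothetical simplifications.

```python
import random


def replicate(domain, rng):
    """Build one daughter domain: each parental (marked) nucleosome is kept
    with probability 1/2; otherwise the slot is filled by new, unmarked histone."""
    return [marked and rng.random() < 0.5 for marked in domain]


def restore(domain):
    """Assumed re-marking rule: an unmarked nucleosome next to a marked one
    becomes marked; repeat until nothing changes."""
    domain = list(domain)
    changed = True
    while changed:
        changed = False
        for i, marked in enumerate(domain):
            if marked:
                continue
            left = i > 0 and domain[i - 1]
            right = i + 1 < len(domain) and domain[i + 1]
            if left or right:
                domain[i] = True
                changed = True
    return domain


def simulate(n_nucleosomes, generations, with_restoration, seed=0):
    """Follow one lineage and return the marked fraction after each replication."""
    rng = random.Random(seed)
    domain = [True] * n_nucleosomes
    history = []
    for _ in range(generations):
        domain = replicate(domain, rng)
        if with_restoration:
            domain = restore(domain)
        history.append(sum(domain) / n_nucleosomes)
    return history


print(simulate(50, 10, with_restoration=True))
print(simulate(50, 10, with_restoration=False))
```

In this sketch the marked fraction can only fall from one generation to the next when restoration is switched off, whereas with restoration a single surviving parental nucleosome is enough to re-mark the whole domain, which is the sense in which the complex both recognizes and re-creates its own mark.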
Genetic analyses in S. pombe and Drosophila indicate that while the H3-mLys9/HP1 system is critical for heterochromatin formation and silencing in pericentric heterochromatin, it is of less importance at the telomeres, suggesting that an additional mechanism is used in those domains. Association of HP1 and a dependence on Su(var)3-9 activity have also been identified as critical in silencing particular euchromatic genes, both in mammalian systems and in Drosophila (Hwang, 2001). Interestingly, it appears that the histone H3-mLys9 modification at Rb-associated genes is quite limited; one nucleosome at the promoter is so modified, while an immediately upstream nucleosome is not, suggesting a difference in the capacity of the modified structure to spread. Histone H3-mLys9 is also associated with the inactive X chromosome in human cells, but no HP1 homologs have been identified preferentially associated with this domain. Whether differences in the degree of histone methylation or other modifications of histone H3 are important in determining any partner of H3-mLys9 in this case remains to be seen (Richards, 2002 and references therein).
A third silent chromatin mark, 5-methylcytosine (5mC), affects the DNA itself. Postreplicative methylation of cytosine is carried out by a diverse group of cytosine DNA methyltransferases (Dnmt's). Beyond this, little is known about the mechanisms that establish, maintain, and modify cytosine methylation patterns. At the whole genome level, it is clear that cytosine methylation patterns can be quite dynamic. The best example is the erasure and resetting of cytosine methylation in early mammalian development. However, large swings in cytosine methylation levels have not been detected during zebrafish development, and the evidence in plants is contradictory. Regardless of whether de novo methylation occurs every generation or in rare initiating events, certain DNA sequences must be targeted for cytosine methylation. At present, little is understood about the primary DNA sequence determinants for targeting, if any. Analysis of a Neurospora sequence prone to de novo methylation indicates the presence of redundant elements promoting methylation and suggests that TpA-rich sequences may be important. Unfortunately, similar detailed studies are not available in other organisms. Certain cytosine methyltransferases, such as mouse Dnmt3a and Dnmt3b, are specialized to carry out de novo methylation. However, these enzymes do not appear to have the intrinsic capacity for discrimination among primary nucleotide sequences, nor among higher-order structures. These considerations suggest that de novo cytosine methyltransferases might be taking cues from another epigenetic mark (Richards, 2002 and references therein).
Communication between the histone code and cytosine methylation may provide at least a partial answer to the long-standing question of how cytosine methylation patterns are established. The most direct evidence for a connection with histone methylation comes from genetic screens for cytosine hypomethylation mutants in Neurospora. The genome of this filamentous fungus contains 5-methylcytosine (~1.5 % of total C) concentrated in repetitive DNA (e.g., rRNA genes) and remnants from RIP activity (repeat induced point mutation, a hypermutation surveillance system that detects sequence duplications). Two Neurospora mutations completely abolish cytosine methylation in vegetative cells. One of these, dim-2, disrupts a gene encoding a cytosine methyltransferase. The other, dim-5, maps to a gene encoding a histone H3 methyltransferase. The predicted DIM-5 gene product contains a SET domain flanked by cysteine-rich elements and has sequence similarity to the histone methyltransferases Clr4 and Su(var)3-9, although it lacks a chromo domain. Recombinant DIM-5 protein exhibits histone methyltransferase activity in vitro. Strikingly, transformation of Neurospora with modified histone H3 genes with a substituted amino acid at Lys9 (the probable site of methylation by DIM-5) reduces cytosine methylation and relieves 5mC mediated gene silencing. Given that the dim-5 mutation appears to abolish all cytosine methylation, the results suggest that all DNA methylation in Neurospora takes its cue from histone H3-mLys9. It will be important to determine whether the histone methylation-DNA methylation connection is also found in other organisms, and if so, whether all cytosine methylation lies downstream of histone methylation. The dim-5 mutation causes more phenotypic defects than the dim-2 cytosine methyltransferase mutation, suggesting that a histone methylation deficiency has effects beyond those that result from loss of cytosine methylation (Richards, 2002 and references therein).
A connection between the histone code and the 5mC code is supported by other findings. The presence in flowering plants of cytosine methyltransferases that contain a chromo domain is particularly intriguing. Such 'chromo methyltransferases' (CMTs) might be recruited to a genomic region by nucleosomes containing histone H3-mLys9; thus, histone modification would provide a foundation for establishing DNA methylation patterns. However, the CMTs have not yet been demonstrated to bind methylated histone H3, nor is it clear that these methyltransferases possess de novo methyltransferase activity. Moreover, chromo methyltransferases have not been documented outside of plant species. Consequently, chromo methyltransferases are unlikely to be solely responsible for translating the histone methylation code into the 5mC epigenetic mark (Richards, 2002 and references therein).
Indirect models for the flow of information from histone H3-mLys9 to 5mC also need to be considered. The H3-mLys9 mark creates a foundation for HP1 interaction and subsequent heterochromatin formation. Cytosine methylation may be targeted to heterochromatin due to any number of characteristics, including nonhistone chromosomal protein content, subnuclear localization, or DNA replication timing. Disruption of heterochromatin by loss of the H3-mLys9 mark may lead to loss of 5mC through a number of intermediary steps. A 'chromatin first/cytosine methylation second' model is consistent with the demonstration that loss or alteration of cytosine methylation can be caused by mutations in SWI2/SNF2-like proteins in Arabidopsis, mice, and humans (Richards, 2002 and references therein).
Once 5mC patterns have been established, they must be maintained in order to serve as an inherited epigenetic code. The potential of cytosine methylation as a mitotic memory device was first described in the 'maintenance methylation' model. The essential feature of the model is clonal inheritance of the 5mC patterns through mitotic, and possibly meiotic, divisions based on the symmetrical nature of the sequences modified (e.g., CpG) and the specificity of 'maintenance' DNA methyltransferases for hemimethylated DNA. The basic tenets of the maintenance methylation model have been supported by a wealth of evidence. The bulk of cytosine methylation occurs very shortly after DNA replication, catalyzed by methyltransferases that have hemimethylated substrate preferences, recruited to the vicinity of the replication fork by interaction with PCNA. However, the classic maintenance methylation model is inadequate to explain the variability of 5mC patterns within individuals and omits some of the known components of the cytosine methylation system. Not all cytosine methylation occurs at short symmetrical sequences, so a simple maintenance methyltransferase, making reference solely to cytosine methylation on the template strand, cannot perpetuate methylation patterns. Maintenance of 5mC patterns at nonsymmetrical sites might involve reiterated de novo methylation and may represent an additional tier of DNA methylation superimposed on the pattern of 5mC at symmetrical sites. The machinery necessary to maintain 5mC at asymmetric sites has not been firmly established, but clues are emerging. Dnmt3a has been implicated in the synthesis of 5mC at asymmetric sites in mice. In Neurospora, a single cytosine methyltransferase, DIM-2, is responsible for all vegetative 5mC, including both symmetrical and asymmetrical sites. In plants, 5mC in asymmetrical sequences has been associated with chromo methyltransferases and RNA-dependent DNA methylation (Richards, 2002 and references therein).
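The logic of the maintenance methylation model, and its blind spot for asymmetric sites, can be made concrete with a short sketch. The representation is an assumption made for illustration: the duplex is described by its top-strand sequence, top-strand methylation is recorded at indices of C, and bottom-strand methylation is recorded at the top-strand index of the G that pairs with the methylated C. The function names and the example sequence are hypothetical.

```python
def cpg_sites(seq):
    """Indices i where the top strand reads CG (C at i, G at i + 1)."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate_with_maintenance(seq, top_meth, bottom_meth):
    """Replicate a duplex and apply an idealized maintenance methyltransferase.

    The enzyme looks only at the template strand: if the template C of a
    CpG is methylated, it methylates the C of the same CpG on the new
    strand. Marks outside CpG sites are never copied.
    Returns two daughters, each as (top_meth, bottom_meth).
    """
    sites = cpg_sites(seq)
    # Daughter A keeps the old top strand and receives a new bottom strand.
    new_bottom = {i + 1 for i in sites if i in top_meth}
    daughter_a = (set(top_meth), new_bottom)
    # Daughter B keeps the old bottom strand and receives a new top strand.
    new_top = {i for i in sites if i + 1 in bottom_meth}
    daughter_b = (new_top, set(bottom_meth))
    return daughter_a, daughter_b


seq = "ACGTCATCGA"   # CpG at indices 1 and 7; index 4 is a C in a CpA context
top_meth = {1, 4}    # one methylated CpG plus one asymmetric mark
bottom_meth = {2}    # the partner C of the methylated CpG

daughter_a, daughter_b = replicate_with_maintenance(seq, top_meth, bottom_meth)
print(daughter_a)
print(daughter_b)
```

In this example the symmetric CpG mark is fully restored in both daughters and the unmethylated CpG stays unmethylated, but the asymmetric mark survives only on the parental strand that carried it and is absent from the other daughter. Perpetuating such a mark would require something beyond template-guided maintenance, for instance the reiterated de novo methylation suggested in the text.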
The classical methylation maintenance model accounts for loss of 5mC through a passive mechanism: DNA replication in the absence of maintenance methylation. Cytological data using immuno-detection of 5mC argue that passive demethylation causes the dramatic erasure of DNA methylation patterns in early mammalian development. However, observation of 5mC loss in the absence of DNA replication has suggested an active demethylation mechanism as well. A 5mC-DNA glycosylase might also contribute to the dramatic swings in cytosine methylation seen in mammalian development (Richards, 2002 and references therein).
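The arithmetic of passive demethylation is simple enough to state as a short sketch. It assumes an idealized case, one fully methylated duplex and a complete absence of maintenance methylation, so that every newly synthesized strand is unmethylated; the function name is hypothetical.

```python
from fractions import Fraction


def passive_dilution(rounds):
    """Fraction of DNA strands still methylated after `rounds` of replication
    without maintenance methylation, starting from one fully methylated duplex."""
    methylated_strands = 2  # both parental strands carry the mark
    total_strands = 2
    for _ in range(rounds):
        total_strands *= 2  # every strand templates one new, unmethylated strand
    return Fraction(methylated_strands, total_strands)


for n in range(4):
    print(n, passive_dilution(n))
```

Under these assumptions the methylated fraction halves with each round of replication, so passive loss is tied to cell division, whereas the replication-independent loss mentioned in the text would need an active mechanism.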
The execution of gene silencing from the 5mC mark involves modulation of another epigenetic mark: hypoacetylation of histones. Two independent pathways have been discovered in vertebrates connecting 5mC to histone deacetylation. The first uses methyl cytosine binding proteins, MeCP, or MBD (methyl binding domain) proteins as adaptors connecting 5mC to histone deacetylase complexes. Several MBD/MeCP protein-HDAC complexes have been identified in mammalian cells. These complexes act to reduce local histone acetylation levels using the 5mC marks on the DNA as a guide (Richards, 2002 and references therein).
A second pathway, also uncovered in mammals, operates through a physical interaction between the maintenance cytosine methyltransferase DNMT1 and HDACs. The catalytic domain of DNMT1 is not necessary for this interaction, suggesting that this cytosine methyltransferase is actually a transcriptional corepressor independent of its ability to methylate DNA. This interaction could act to reinforce inheritance of silent chromatin by facilitating histone deacetylation at the replication forks, where DNMT1 acts to maintain the 5mC epigenetic mark on methylated DNA sequences (Richards, 2002 and references therein).
Epigenetic information may also flow from the histone acetylation state back to cytosine methylation. The HDAC inhibitor TSA leads to cytosine hypomethylation at specific sequences in Neurospora, and a similar effect has been noted in mammalian cells. The loss of DNA methylation may be related to transcriptional activation, but other mechanisms have been proposed, including activation of cytosine demethylases. Inhibition of histone deacetylation does not lead to global loss of DNA methylation, however. For example, disruption of a histone deacetylase gene in plants did not lead to a generalized loss of 5mC despite a 10-fold elevation in histone H4 acetylation. Regardless of the significance of the retrograde signaling, the well-established flow of information from 5mC to histone deacetylation closes the loop of a self-reinforcing cycle for those organisms that utilize cytosine methylation (Richards, 2002).
The cycle of epigenetic marks discussed here suggests that initiation of heterochromatin formation, or similar silencing of euchromatic domains, requires acquisition of at least one epigenetic mark. What is known about entry into the cycle? In S. cerevisiae, protein interactions with specific cis-acting DNA sequences, such as E and I at the HM loci, or telomeric repeats, provide the foundation to recruit the SIR silencing complexes. The E2F-Rb-SUV39H1-HP1 interaction in mammals also implicates specific DNA sequences (binding sites for E2F) as initiation sites for silencing. Silencing within the mating type locus of S. pombe appears to be controlled both by local elements (REII and mat3 silencer) operating similarly to E and I in S. cerevisiae and by packaging of the domain as a whole, dependent on a block of repetitive DNA. In other organisms, the repetitive nature of the locus, rather than the primary DNA sequence, may be a trigger. The mechanisms at work are not clear, but hints can be derived from the repeat sensing/silencing phenomena in filamentous fungi, MIP (methylation induced premeiotically) in Ascobolus and RIP in Neurospora. In these systems, repeats appear to be recognized by a DNA-DNA pairing mechanism. In Ascobolus, cytosine methylation can be transferred between alleles, accompanying meiotic pairing and recombination events. RNA signals may provide another entrée into the cycle of epigenetic silencing. Two noncoding RNA species, Xist and Tsix, are pivotal for initiation and choice in X chromosome inactivation in mice, where H3-Lys9 methylation is an early event. RNA may also have a role in initiating silent chromatin formation by directing the acquisition of cytosine methylation marks. Resolution of this question will be one of the major goals of future research (Richards, 2002 and references therein).
Once a genomic region has been targeted for silencing by acquisition of one or more covalent epigenetic marks, a silent chromatin identity can be propagated. The general features of the system include (1) positive signaling between the different covalent epigenetic marks and (2) enzymatic complexes/pathways that recognize each mark and catalyze the formation of the same mark. For example, in yeast, the histone H3/H4 deacetylation mark is recognized by Sir3, leading to recruitment of the Sir2 histone deacetylase. The histone H3-mLys9 mark is recognized by HP1, which can apparently recruit the histone methyltransferase activity of Su(var)3-9 homologs. The third self-reinforcing loop is carried out by maintenance cytosine methyltransferases, which have a substrate preference for hemimethylated DNA. The modification pathways operating on each covalent mark also interact and reinforce each other (Richards, 2002 and references therein).
In organisms lacking 5mC, a histone modification code appears to be sufficient to mark and perpetuate silent chromatin domains. The feedback loop between histone methylation and histone deacetylation, coupled with mechanisms to maintain these modifications, apparently provides stable silencing. In fact, S. cerevisiae appears to utilize neither DNA modification nor the HP1/histone H3-mLys9 complex, relying solely on deacetylation of histones H3/H4 as an epigenetic mark to maintain silencing. The transmission of chromatin states requires that at least one of the covalent marks be inherited through mitotic, and possibly meiotic, cell divisions. All three of these marks meet the criteria of persistence through mitosis (Richards, 2002).
While self-reinforcing mechanisms may be advantageous to ensure maintenance of silencing for genomic sequences to be archived for the long term in a nonexpressed state (e.g., transposons, pericentromeric repeats), there may be a need to reconfigure silenced chromatin as a prerequisite to expression of specific genes (e.g., mating type switching). In this case, what general mechanisms can be used to break the heterochromatin reinforcing cycle? Removal of the histone H3-mLys9 mark may require turnover of the entire protein, since no histone demethylase has yet been identified. Histones, however, are generally very stable. In comparison, the 5mC mark is more easily erased by passive or active demethylation mechanisms. The most malleable mark is the deacetylation of histones, the levels of which are set by the competing activities of histone acetylases and histone deacetylases (Richards, 2002).
Heterochromatin-mediated repression is essential for controlling the expression of transposons and for coordinated cell type-specific gene regulation. The small ovary (sov) locus was identified in a screen for female-sterile mutations in Drosophila melanogaster, and mutants show dramatic ovarian morphogenesis defects. The null sov phenotype is lethal and maps to the uncharacterized gene CG14438, which encodes a nuclear zinc-finger protein that colocalizes with the essential Heterochromatin Protein 1 (HP1a). Sov functions to repress inappropriate gene expression in the ovary, silence transposons, and suppress position-effect variegation in the eye, suggesting a central role in heterochromatin stabilization (Benner, 2019).
Polycomb response elements (PREs) are cis-regulatory sequences required for Polycomb repression of Hox genes in Drosophila. PREs function as potent silencers in the context of Hox reporter genes and they have been shown to partially repress a linked miniwhite reporter gene. The silencing capacity of PREs has not been systematically tested and, therefore, it has remained unclear whether only specific enhancers and promoters can respond to Polycomb silencing. Using a reporter gene assay in imaginal discs, it has been shown that a PRE from the Drosophila Hox gene Ultrabithorax potently silences different heterologous enhancers and promoters that are normally not subject to Polycomb repression. Silencing of these reporter genes is abolished in PcG mutants and excision of the PRE from the reporter gene during development results in loss of silencing within one cell generation. Together, these results suggest that PREs function as general silencer elements through which PcG proteins mediate transcriptional repression (Sengupta, 2004).
A 1.6 kb fragment encompassing the PRE from the Ubx upstream control region was tested for its capacity to prevent transcriptional activation by enhancers from genes that are normally not under PcG control. For this purpose, three different enhancers were tested in a lacZ reporter gene assay in imaginal discs: dppWE, the imaginal disc enhancer from the decapentaplegic (dpp) gene; vgQE, the quadrant enhancer from the vestigial (vg) gene; and vgBE, the vg D/V boundary enhancer. If linked to a reporter gene, each of these enhancers directs a distinct pattern of expression in the wing imaginal disc and activation by each enhancer is regulated by transcription factors that are controlled by a different signaling pathway. Specifically, the dpp enhancer contains binding sites for the Ci protein and is activated in response to hedgehog signaling, the vg quadrant enhancer contains binding sites for the Mad transcriptional regulator and is activated in response to dpp signaling, and the vg boundary enhancer contains binding sites for the Su(H) transcription factor and is regulated by Notch signaling.
The dppWE, vgQE and vgBE enhancers were individually inserted into a lacZ reporter gene construct that contained the PRE fragment and either a TATA box minimal promoter from the hsp70 gene (here referred to as TATA), or a 4.1 kb fragment of the proximal Ubx promoter (here referred to as UbxP), fused to lacZ. In each construct, the PRE fragment was flanked by FRT sites that permit excision of the PRE fragment by flp recombinase. Several independent transgenic lines for each of the six PRE transgenes were generated. From individual transgene insertions, derivative transgenic lines were then generated by flp-mediated excision of the PRE in the germline. Thus, expression of individual transgene insertions could be compared in the presence and absence of the PRE by staining wing imaginal discs for ß-galactosidase (ß-gal) activity. In the absence of the PRE, each of the three enhancers tested directs ß-gal expression in its characteristic, previously described pattern. Each enhancer activated expression in the same pattern from either the TATA box minimal promoter or the Ubx promoter, with some minor, promoter-specific differences in expression levels. By contrast, in most of the parental transformant lines, i.e., those carrying the corresponding reporter gene with the PRE, ß-gal expression is completely suppressed. These observations suggest that the PRE fragment very potently silences each of the six reporter genes. It is noted, however, that, at some transgene insertion sites, efficiency of silencing by the PRE fragment appeared to be impeded by flanking chromosomal sequences; in these cases, it was found that ß-gal expression is activated even in the presence of the PRE (Sengupta, 2004).
To test whether silencing of the reporter genes by the PRE depends on PcG gene function, the PRE-containing transgenes >PRE>dppWE-TATA-lacZ and >PRE>vgQE-Ubx-lacZ were introduced into larvae that carried mutations in the PcG gene Suppressor of zeste 12 [Su(z)12]. Su(z)12 encodes a core component of the Esc-E(z) histone methyltransferase. Silencing of both transgenes is lost in Su(z)12^2/Su(z)12^3 mutant larvae, and the transgenes express ß-gal at levels comparable with the transgene derivatives that lack the PRE fragment. Taken together, these observations suggest that the 1.6 kb PRE fragment from Ubx is a very potent general transcriptional silencer element that represses transcription in a PcG protein-dependent manner. Thus, it appears that this PRE acts indiscriminately to block transcriptional activation by a variety of different activator proteins (Sengupta, 2004).
To test the long-term requirement for the PRE for silencing of these reporter genes, the PRE was excised during larval development and ß-gal expression was then monitored at different time points after excision. Forty-eight hours after induction of flp expression, all six reporter genes showed robust derepression of ß-gal, suggesting that, in each case, removal of the PRE results in the loss of PcG silencing. Among the different enhancer-promoter combinations used in this study, the dppW enhancer fused to the TATA box minimal promoter appears to direct the highest levels of lacZ expression; >PRE>dppWTZ transformant lines consistently show the strongest ß-gal staining after excision of the PRE. Therefore, >PRE>dppW-TZ transformants were analyzed at 4, 8, 12 and 24 hours after induction of flp expression to study the kinetics of this derepression. No ß-gal signal was detected at 4 hours or even at 8 hours after flp induction, but 12 hours after flp induction, all discs showed robust ß-gal expression. Thus, even in the case of the most potent enhancer-promoter combination used (i.e., dppW enhancer and TATA box minimal promoter), a delay of 12 hours between flp induction and ß-gal expression was observed. Since the average cell cycle length of imaginal disc cells in third instar larvae is 12 hours, this implies that most disc cells have undergone a full division cycle within this period. Derepression of the reporter gene in this experiment requires several steps: (1) excision of the PRE by the flp recombinase; (2) dissociation of the PRE and PcG proteins attached to it, possibly by disrupting PcG protein complexes formed between the PRE and factors bound at the promoter; and (3) transcriptional activation by factors binding to the enhancer in the construct. It is possible that one or several steps in this process require a specific cell cycle event (e.g., passage through S phase) (Sengupta, 2004).
These experiments show that three reporter genes, each containing a different enhancer linked to a canonical TATA box promoter, are completely silenced by a PRE placed upstream of the enhancer. The data suggest that PcG proteins that act through this PRE indiscriminately prevent activation by a variety of different transcription factors. The PcG machinery thus does not seem to require any specific enhancer and/or promoter sequences for repression (Sengupta, 2004).
Two points deserve to be discussed in more detail. The first concerns the stability of silencing imposed by a PRE. Previous studies have suggested that transcriptional activation in the early embryo could prevent the establishment of PcG silencing by PREs. More specifically, early transcriptional activation of Hox genes by blastoderm enhancers may play an important role in preventing the establishment of permanent PcG silencing in segment primordia in which Hox genes need to be expressed at later developmental stages. Importantly, none of the three enhancers used in this study is active in the early embryo. Moreover, these enhancers probably do not contain binding sites for specific transcriptional repressors, such as the gap repressors, which are required for establishment of PcG silencing at some PREs in the early embryo. It is therefore imagined that, in these constructs, PcG silencing complexes assemble by default on the 1.6 kb Ubx PRE in the early embryo and that PcG silencing is thus firmly established by the stage when the imaginal disc enhancers would become active. Silencing by the PRE during larval stages therefore appears to be dominant over activation and cannot be overcome by any of the enhancers used in this study. There is other evidence in support of the idea that PcG silencing during larval development is more stable than in embryos. In particular, a PRE reporter gene that contains a Gal4-inducible promoter is only transiently activated if a pulse of the transcriptional activator Gal4 is supplied during larval development; by contrast, a pulse of Gal4 during embryogenesis switches the PRE into an 'active mode' that supports transcriptional activation throughout development. Furthermore, recent studies in imaginal discs suggest that there is a distinction between transcriptional repression and the inheritance of the silenced state; the silenced state can be propagated for some period even if repression is lost. Specifically, loss of Hox gene silencing after removal of PcG proteins in proliferating cells can be reversed if the depleted PcG protein is resupplied within a few cell generations. Taken together, it thus appears that PcG silencing during postembryonic development is a remarkably stable process. Finally, the results reported in this study also imply that, once PcG silencing is established, Hox genes can make use of virtually any type of transcriptional activator to maintain their expression; PcG silencing will ensure that activation by these factors only occurs in cells in which the Hox gene should be active. The analysis of Ubx control sequences supports this view; if individually linked to a reporter gene, most late-acting enhancers direct expression both within as well as outside of the normal Ubx expression domain (Sengupta, 2004).
The second point to discuss concerns the repression mechanism used by PcG proteins. Biochemical purification of PRC1 has revealed that several TFIID components co-purify with the PcG proteins that constitute the core of PRC1. Moreover, formaldehyde crosslinking experiments in tissue culture cells showed that TFIID components are associated with promoters, even if these are repressed by PcG proteins. This suggests that PcG protein complexes anchored at the PRE interact with general transcription factors bound at the promoter. One possibility would be that PcG repressors directly target components of the general transcription machinery to prevent transcriptional activation by enhancer-binding factors. Three distinct activators act through the three enhancers used in this study and, according to these results, none of them is able to overcome the block imposed by the PcG machinery. But how do the known activities of PcG protein complexes [i.e., histone methylation by the Esc-E(z) complex and inhibition of chromatin remodeling by PRC1] fit into this scenario? Both these activities may be required for the repression process by altering the structure of chromatin around the transcription start site, thus preventing the formation of productive RNA Pol II complexes. Other scenarios are possible. For example, histone methylation may primarily serve to mark the chromatin for binding of PRC1 through Pc, and PRC1 components such as Psc then perform the actual repression process.
Whatever the exact repression mechanism may be, the PRE-excision experiment shows that this repression is lost within one cell generation after removal of the PRE. This implies that changes in the chromatin generated by the action of PcG proteins cannot be propagated by the flanking chromatin (Sengupta, 2004).
Chromatin is important for the regulation of transcription and other functions, yet the diversity of chromatin composition and the distribution along chromosomes are still poorly characterized. By integrative analysis of genome-wide binding maps of 53 broadly selected chromatin components in Drosophila cells, this study shows that the genome is segmented into five principal chromatin types (see Chromatin types are characterized by distinctive protein combinations and histone modifications) that are defined by unique yet overlapping combinations of proteins and form domains that can extend over > 100 kb. A repressive chromatin type was identified that covers about half of the genome and lacks classic heterochromatin markers. Furthermore, transcriptionally active euchromatin consists of two types that differ in molecular organization and H3K36 methylation and regulate distinct classes of genes. Finally, evidence is provided that the different chromatin types help to target DNA-binding factors to specific genomic regions. These results provide a global view of chromatin diversity and domain organization in a metazoan cell (Filion, 2010).
By systematic integration of 53 protein location maps this study found that the Drosophila genome is packaged into a mosaic of five principal chromatin types, each defined by a unique combination of proteins. Extensive evidence demonstrates that the five types differ in a wide range of characteristics besides protein composition, such as biochemical properties, transcriptional activity, histone modifications, replication timing, DNA binding factor (DBF) targeting, as well as sequence properties and functions of the embedded genes. This validates the classification by independent means and provides important insights into the functional properties of the five chromatin types (Filion, 2010).
Identifying only five chromatin states from the binding profiles of 53 proteins is a surprisingly low number (one can form approximately 10^16 subsets of 53 elements). It is emphasized that the five chromatin types should be regarded as the major types. Some may be further divided into sub-types, depending on how fine-grained one wishes the classification to be. For example, within each of the transcriptionally active chromatin types, promoters and 3' ends of genes exhibit (mostly quantitative) differences in their protein composition and thus could be regarded as distinct sub-types. However, these local differences are minor relative to the differences between the five principal types that are described in this study. It cannot be excluded that the accumulation of binding profiles of additional proteins would reveal other novel chromatin types. It is also anticipated that the pattern of chromatin types along the genome will vary between cell types. For example, many genes that are embedded in 'BLACK' chromatin (defined in Kc167 cells) are activated in some other cell types. Thus, the chromatin of these genes is likely to switch to an active type (Filion, 2010).
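The size of that combinatorial space can be checked directly: a set of 53 elements has 2^53 subsets, which is on the order of 10^16. The variable names below are illustrative only.

```python
# Number of possible protein combinations (subsets) for 53 mapped proteins.
n_proteins = 53
n_subsets = 2 ** n_proteins

print(n_subsets)            # 9007199254740992
print(f"{n_subsets:.1e}")   # 9.0e+15, i.e. roughly 10^16
```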
While the integration of data for 53 proteins provides substantial robustness to the classification of chromatin along the genome, a subset of only five marker proteins (histone H1, PC, HP1, MRG15 and BRM), which together occupy 97.6% of the genome, can recapitulate this classification with 85.5% agreement. Assuming that no unknown additional principal chromatin types exist in some cell types, DamID or ChIP of this small set of markers may thus provide an efficient means to examine the distribution of the five chromatin types in various cells and tissues, with acceptable accuracy (Filion, 2010).
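A figure such as the 85.5% agreement above is a comparison of two per-position labelings of the genome. The sketch below shows one simple way such a percentage could be computed, assuming equal-sized genomic bins and made-up labels; the function name `percent_agreement` and the example data are hypothetical and do not reproduce the procedure or data of Filion (2010).

```python
# Minimal sketch: percent agreement between two chromatin-type labelings.
# Assumes both lists label the same equal-sized genomic bins, in order.

def percent_agreement(reference, predicted):
    """Return the percentage of bins that carry the same label in both lists."""
    if len(reference) != len(predicted):
        raise ValueError("labelings must cover the same number of bins")
    if not reference:
        raise ValueError("labelings must not be empty")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return 100.0 * matches / len(reference)


# Made-up labels for eight bins: full 53-protein classification versus a
# classification from a reduced marker set (illustrative values only).
full_labels = ["BLACK", "BLACK", "RED", "YELLOW", "BLUE", "GREEN", "BLACK", "RED"]
marker_labels = ["BLACK", "BLACK", "RED", "RED", "BLUE", "GREEN", "BLACK", "YELLOW"]

agreement = percent_agreement(full_labels, marker_labels)
print(agreement)  # 75.0 (6 of 8 bins match)
```

If bins differ in length, weighting each match by bin length would be the natural extension.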
Previous work on the expression of integrated reporter genes had suggested that most of the fly genome is transcriptionally repressed, contrasting with the low coverage of PcG and HP1-marked chromatin. BLACK chromatin, which consists of a previously unknown combination of proteins and covers about half of the genome, may account for these observations. Essentially all genes in BLACK chromatin exhibit extremely low expression levels, and transgenes inserted in BLACK chromatin are frequently silenced, indicating that BLACK chromatin constitutes a strongly repressive environment. Importantly, BLACK chromatin is depleted of PcG proteins, HP1, SU(VAR)3-9 and associated proteins, and is also the latest to replicate, underscoring that it is different from previously characterized types of heterochromatin (identified as BLUE and GREEN chromatin in this study) (Filion, 2010).
The proteins that mark BLACK domains provide important clues to the molecular biology of this type of chromatin. Loss of Lamin (LAM), Effete (EFF) or histone H1 causes lethality during Drosophila development. Extensive in vitro and in vivo evidence has suggested a role for H1 in gene repression, most likely through stabilization of nucleosome positions. The enrichment of LAM points to a role of the nuclear lamina in gene regulation in BLACK chromatin, consistent with the long-standing notion that peripheral chromatin is silent. Depletion of LAM causes derepression of several LAM-associated genes (Shevelyov, 2009), while artificial targeting of genes to the nuclear lamina can reduce their expression, suggesting a direct repressive contribution of the nuclear lamina in BLACK chromatin. D1 is a little-studied protein with 11 AT-hook domains. Overexpression of D1 causes ectopic pairing of intercalary heterochromatin (Smith, 2010), suggesting a role in the regulation of higher-order chromatin structure. SUUR specifically regulates late replication on polytene chromosomes (Zhimulev, 2003), which is of interest because BLACK chromatin is particularly late-replicating. EFF is highly similar to the yeast and mammalian ubiquitin ligase Ubc4 that mediates ubiquitination of histone H3, raising the possibility that nucleosomes in BLACK chromatin may carry specific ubiquitin marks. These insights suggest that BLACK chromatin is important for chromosome architecture as well as gene repression and provide important leads for further study of this previously unknown yet prevalent type of chromatin (Filion, 2010).
In RED and YELLOW chromatin most genes are active, and the overall expression levels are similar between these two chromatin types. However, RED and YELLOW chromatin differ in many respects. One of the conspicuous distinctions is the disparate levels of H3K36me3 at active transcription units. This histone mark is thought to be laid down in the course of transcription elongation and may block the activity of cryptic promoters inside the transcription unit. Why active genes in RED chromatin lack H3K36me3 remains to be elucidated (Filion, 2010).
The remarkably high protein occupancy in RED chromatin suggests that RED domains are 'hubs' of regulatory activity. This may be related to the predominantly tissue-specific expression of genes in RED chromatin, which presumably requires many regulatory proteins. It is noted that the DamID assay integrates protein binding events over nearly 24 hours, so it is likely that not all proteins bind simultaneously; some proteins may bind only during a specific stage of the cell cycle. It is highly unlikely that the high protein occupancy in RED chromatin originates from an artifact of DamID, e.g. caused by a high accessibility of RED chromatin. First, all DamID data are corrected for accessibility using parallel Dam-only measurements. Second, several proteins, such as EFF, SU(VAR)3-9 and histone H1 exhibit lower occupancies in RED than in any other chromatin type. Third, ORC also shows a specific enrichment in RED chromatin, even though it was mapped by ChIP, by another laboratory and on another detection platform. Fourth, DamID of Gal4-DBD does not show any enrichment in RED chromatin (Filion, 2010).
RED chromatin resembles DBF binding hotspots that were previously discovered in a smaller-scale study in Drosophila cells. Discrete genomic regions targeted by many DBFs have recently also been found in mouse ES cells, hence it is tempting to speculate that an equivalent of RED chromatin may also exist in mammalian cells. Housekeeping and dynamically regulated genes in budding yeast also exhibit a dichotomy in chromatin organization which may be related to the distinction between YELLOW and RED chromatin. The observations that RED chromatin is generally the earliest to replicate and strongly enriched in ORC binding suggest that this chromatin type may be involved not only in transcriptional regulation but also in the control of DNA replication (Filion, 2010).
This analysis of DBF binding indicates that the five chromatin types together act as a guidance system to target DBFs to specific genomic regions. This system directs DBFs to certain genomic domains even though the DBF recognition motifs are more widely distributed. It is proposed that targeting specificity is at least in part achieved through interactions of DBFs with particular partner proteins that are present in some of the five chromatin types but not in others. The observation that yeast Gal4-DBD binds its motifs with nearly equal efficiency in all five chromatin types suggests that differences in compaction among the chromatin types represent overall a minor factor in the targeting of DBFs. Although additional studies will be needed to further investigate the molecular mechanisms of DBF guidance, the identification of five principal types of chromatin provides a firm basis for future dissection of the roles of chromatin organization in global gene regulation (Filion, 2010).
Chromosomes are the physical realization of genetic information and thus form the basis for its readout and propagation. This study presents a high-resolution chromosomal contact map derived from a modified genome-wide chromosome conformation capture approach applied to Drosophila embryonic nuclei. The data show that the entire genome is linearly partitioned into well-demarcated physical domains that overlap extensively with active and repressive epigenetic marks. Chromosomal contacts are hierarchically organized between domains. Global modeling of contact density and clustering of domains show that inactive domains are condensed and confined to their chromosomal territories, whereas active domains reach out of the territory to form remote intra- and interchromosomal contacts. Moreover, specific long-range intrachromosomal contacts between Polycomb-repressed domains were systematically identified. Together, these observations allow for quantitative prediction of the Drosophila chromosomal contact map, laying the foundation for detailed studies of chromosome structure and function in a genetically tractable system (Sexton, 2012).
A simplified Hi-C procedure was developed for minimally biased profiling of chromosomal contacts on a genomic scale. Using this technique, chromosomal architecture in Drosophila melanogaster embryonic nuclei was comprehensively and accurately characterized. The chromosomal contact map relaxes the classical trade-off between coverage and resolution in the study of chromosome structure. The data provide sufficient resolution to observe local contact profiles derived from 4C and consistently deliver such resolution for essentially any genomic locus. The effective resolution limitations of the map depend on the features being studied. Demarcation of physical domains can be achieved within a precision of one or a few DpnII fragments (i.e., of ~1 kb), as many fragments with high expected contact probability contribute to their identification. On the other hand, detection of long-range contacts with statistical confidence greatly depends on their absolute intensity compared to the background, which decays significantly with genomic separation. For example, based on the current sequencing depth, the decay in background contact probability with genomic distance, and the average DpnII restriction site density, it is estimated that a contact with 4-fold enrichment over the background could be confidently detected at a resolution of ~10 kb for genomic separations of 100 kb, a resolution of ~30 kb for genomic separations of 1 Mb, and ~125 kb for interchromosomal coassociations. Regardless of these considerations, and despite the fact that the experiment assayed a large and heterogeneous set of nuclei, the derived Hi-C map reveals a clear structure and allows for multiple chromosome folding principles to be explored systematically. 
The implications of the Drosophila map are therefore far reaching, and the analysis presented in this study can be viewed as a baseline on which further efforts directed to understand genomic and epigenomic patterns at particular cell states or genetic backgrounds can be developed (Sexton, 2012).
The Hi-C map is rich in local and global structure, describing contact frequencies that vary over five orders of magnitude. This study has tried to explain the distributions of contact frequencies in the map using quantitative models based on the simplest principles and has tried to justify any progressive increases in model complexity by proven discrepancies between the data and a simpler version of the model. One of the most remarkable patterns observed in the map was the partitioning of chromosomes into physical domains, which showed up in the matrix as diagonal submatrices with high contact intensities. A quantitative probabilistic model was used to show that contacts inside these domains are governed by a distinct regime that cannot be attributed to denser contacts or more compact chromosomal structure alone. Further analysis showed that physical domains form the backbones of a hierarchical chromosome structure, as the contact intensities between genomic elements are mostly determined by the identities of the domains containing them, rather than the elements' locations within the domains. Previous lower-resolution exploration of human chromosome architecture identified a global power law decay of contact frequency with genomic separation, and this was used to propose a fractal globule model of chromosome folding. Although a roughly similar global decay curve was found for Drosophila chromatin, higher-resolution analysis of contact decays within the context of physical domains challenges this model and suggests that, at scales of 10-100 kb, the predominant factor affecting chromosome folding is the modular organization. This promotes hierarchical chromosomal organization as an attractive paradigm to facilitate functional epigenetic organization but leaves open questions about the scales at which it may be observed in different genomes that vary significantly in size and gene density (Sexton, 2012).
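A global power-law decay of contact frequency with genomic separation, of the form frequency = prefactor × separation^exponent, appears as a straight line on log-log axes, so its exponent can be estimated by ordinary least squares on the logarithms. The sketch below illustrates this on synthetic data; the function name, the synthetic exponent of -1, and the prefactor are assumptions chosen for illustration and are not values reported by Sexton (2012).

```python
import math

# Minimal sketch: estimate a power-law decay exponent by linear regression
# in log-log space. All data below are synthetic.

def fit_power_law(separations, frequencies):
    """Fit frequency = prefactor * separation ** exponent.

    Returns (exponent, prefactor) from least squares on log-transformed data.
    """
    xs = [math.log(s) for s in separations]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic contact frequencies decaying as 1/separation (assumed exponent -1),
# sampled at separations from 10 kb upward, doubling each step.
separations = [10_000 * 2 ** k for k in range(8)]
frequencies = [5.0 / s for s in separations]

exponent, prefactor = fit_power_law(separations, frequencies)
```

Fitting the same model separately within and between physical domains would be one way to look for the domain-dependent departures from a single global curve that the text describes.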
Remarkably, the physical domains inferred from the Hi-C contact map were compatible with numerous linear epigenetic profiles describing enrichment for histone modification or DNA-binding factors. Thus the physical domains, which are key fundamental units of chromosome folding, are reflected in, and possibly caused by, their underlying epigenetic marks. Large silent chromosomal regions that are either enriched with repressive histone marks (H3K27me3 or HP1/H3K9me2) or void of any detectable epigenetic enrichment were shown to form modular chromosomal entities, which are interspersed with small domains associated with active chromosomal marks. By analyzing the epigenomic marks at the borders of physical domains, it was observed that a transition in transcriptional activity (as indicated by peaks of H3K4me3) is sometimes sufficient to disturb the compaction of flanking repressive chromatin domains. This may result in the formation of 'punctuated' repressed domains, with active genes forming 'passive' physical boundaries. However, in most cases, this study finds that insulator proteins, particularly CP190 and Chromator, sharply demarcate the borders of physical domains. This is especially apparent at borders marked by both CP190 and H3K4me3, as CP190 binds precisely at the physical domain boundary, with the H3K4me3 peak shifted ~500 bp toward the active domain. Interestingly, a recent study suggested that binding of the 'accessory' insulator protein CP190 is required for a functional insulator. In agreement with this, it was found that CP190 correlates most strongly with physical domain boundaries, whereas many regions bound by the DNA sequence-specific insulator proteins CTCF and Su(Hw) are not linked to physical domain boundaries. Chromator emerged from these analyses as another major factor organizing physical domains. Although little is known about the function of this mitotic spindle protein during interphase, Chromator has been shown to be necessary for the maintenance of polytene chromosome structure. The current findings appear to extend the structural function of Chromator to diploid embryonic nuclei. By providing an architectural context to epigenomic chromatin domains, the Hi-C map thus provides a reference epigenomic model, directing future efforts for analyses of the correlations between hundreds of measured linear epigenomic profiles (Sexton, 2012).
Chromosomes clearly fold in a complicated, heterogeneous regime. In order to make any reasoned claims about the significance of previously reported individual cases of long-range chromatin interactions, it is important to first understand the basic principles of what defines 'standard' folding of a chromosome fiber. This Hi-C dataset allows formulation and testing of hypotheses on chromatin folding with progressively more complex quantitative models. First, this study has accounted for heterogeneity in contact density, facilitating identification of physical chromatin modules and their hierarchical pattern of folding. Next, it was possible to group physical domains into two clusters (annotated post factum as active or inactive) based on their intrachromosomal contacts and to generally describe interdomain contacts as those within or between clusters. This supported and extended previous findings on the relationship between transcriptional activity and position within chromosome territories. Although the combined model explains much of the chromosome folding behavior, specific long-range chromatin interactions were still apparent. One group of functional long-range interactions that has already been investigated and is clearly visible in the Hi-C map associates PcG-regulated genes that co-occupy Polycomb bodies (Sexton, 2012).
In summary, this Hi-C study has provided a fundamental chromatin interaction map framework, providing the basis for mathematical models to assess the link between chromosome structure and function. The characterization of hierarchically folded discrete physical modules, which may be epigenetically defined, forms a hitherto unappreciated base from which more complicated chromosome topologies can arise. It is posited that this and future Hi-C datasets, combined with specific perturbation experiments, will inform more sophisticated mathematical models of chromosome folding, forming a foundation for new important insights into what shapes nuclear structure and how this in turn affects genome function (Sexton, 2012).
The mechanisms responsible for the establishment of physical domains in metazoan chromosomes are poorly understood. This study finds that physical domains in Drosophila chromosomes are demarcated at regions of active transcription and high gene density that are enriched for transcription factors and specific combinations of insulator proteins. Physical domains contain different types of chromatin defined by the presence of specific proteins and epigenetic marks, with active chromatin preferentially located at the borders and silenced chromatin in the interior. Domain boundaries participate in long-range interactions that may contribute to the clustering of regions of active or silenced chromatin in the nucleus. Analysis of transgenes suggests that chromatin is more accessible and permissive to transcription at the borders than inside domains, independent of the presence of active or silencing histone modifications. These results suggest that the higher-order physical organization of chromatin may impose an additional level of regulation over classical epigenetic marks (Hou, 2012).
The use of Hi-C to map intra- and inter-chromosomal interactions in metazoan genomes has given important insights into the organization of the chromatin fiber in eukaryotic nuclei. One important conclusion from these studies is that eukaryotic chromosomes are organized into a series of chromatin domains, perhaps formed by a series of local interactions among various regulatory sequences and the genes they control. Long-range interactions between chromatin domains may result in additional levels of folding to create larger domains. These results complement and converge with evidence suggesting that specific sequences come together in the nucleus in the process of, or with the purpose of, carrying out various nuclear processes. For example, actively transcribed genes and their regulatory sequences have been shown to colocalize at transcription factories, whereas genes silenced by PcG proteins converge at repressive factories termed Pc bodies. It is unclear whether these associations are a consequence of self-organizing principles with no functional outcomes, i.e., they result from interactions among multiprotein complexes present at active or silenced genes, or they play a functional role in gene expression and are mediated by structural proteins specifically involved in mediating inter- and intrachromosomal interactions (Hou, 2012).
A critical roadblock in understanding the principles governing the folding of metazoan genomes is the identification of proteins or forces responsible for the formation of chromosome domains and the boundaries that separate these structures. Results from the analysis of mixed-cell populations in Drosophila embryos indicate a correlation between the formation of domain boundaries and the presence of insulator proteins and the transcription factor Chromator. Similar results in mouse and human cells find a high degree of correlation between the presence of CTCF and housekeeping genes and the formation of domain boundaries (Hou, 2012).
To further explore the mechanisms of physical domain partition in metazoans, a Hi-C analysis was carried out using Drosophila Kc167 cells. Physical domains did not exactly correlate with functional domains defined by epigenetic marks. Furthermore, domain boundaries usually form at regions enriched for active histone modifications such as H3K4me3 but also form in regions enriched for silencing marks such as H3K27me3 and LAM. The common theme among domain boundaries, even those present in regions enriched for H3K27me3 and LAM, is a high density of actively transcribed genes. The likely causal role of transcription in the establishment of domain boundaries is underscored by the formation of multiple small physical domains in regions of the genome enriched for active genes. Regions of the genome enriched for silenced chromatin form large domains, with boundaries between these domains often forming when closely spaced and transcribed genes are present at the domain borders. The high correlation between gene density, transcription, and the formation of domain boundaries helps explain why these domains are conserved across different cell types of the same or different species (Hou, 2012).
In agreement with these observations, RNAPII, transcription factors, and insulator proteins are also found enriched at the borders of domains. Drosophila insulator proteins, with the exception of Su(Hw), are preferentially located adjacent to promoter regions of actively transcribed genes. It is then possible that insulators play an active role in the formation of domain boundaries and that the observed increase in actively transcribed genes in these regions is a consequence of their close association with insulator proteins. Alternatively, active transcription in regions of high gene density may be the driving force behind the formation of physical domains, and the enrichment of insulator proteins at the boundaries may be a result of their presence adjacent to these genes. Given the demonstrated role of insulators in mediating interactions between different sequences in the genome, it is possible that a combination of these two possibilities is actually responsible for domain formation. An interesting observation that may offer additional clues as to the role of insulators in the formation of physical domains is the specific enrichment of clusters of insulator proteins at the boundaries. Drosophila insulator proteins Su(Hw), BEAF, and CTCF bind specific DNA sequences and recruit CP190 and Mod(mdg4); these two proteins then interact with each other and/or themselves to bridge contacts between distant sites. The presence of multiple insulator DNA binding proteins would, presumably, make for a stronger insulator, able to mediate more frequent long-distance interactions. This hypothesis is supported by the observation that long-distance interactions involving domain boundaries are significantly higher than expected. These interactions can bring together highly transcribed regions, offering a mechanism to explain the formation of transcription factories (Hou, 2012).
An important question is whether this differential compaction of the chromatin between the inside and the borders of physical domains has an effect on gene expression. This issue was addressed by examining the insertion frequency and the expression levels of a large collection of P element transgenes. The frequency of transgene insertion is much higher at the borders of the domains than in the interior, independent of the type of chromatin, suggesting that the DNA inside physical domains is more compacted than at the borders. Furthermore, independent of the epigenetic marks present in the chromatin, transgenes inserted in the region surrounding the domain boundaries are less repressed than those inserted in the domain interior. Therefore, the physical compaction of DNA arising from the higher-order organization of the chromatin may add a different layer of regulatory information superimposed on that resulting from classical epigenetic marks (Hou, 2012).
Dynamic regulation of chromatin structure is required to modulate the transcription of genes in eukaryotes. However, the factors that contribute to the plasticity of heterochromatin structure are elusive. This study reports that cyclin-dependent kinase 12 (CDK12), a transcription elongation-associated RNA polymerase II (RNAPII) kinase, antagonizes heterochromatin enrichment in Drosophila chromosomes. Notably, loss of CDK12 induces the ectopic accumulation of heterochromatin protein 1 (HP1) on euchromatic arms, with a prominent enrichment on the X chromosome. Furthermore, ChIP and sequencing analysis reveals that the heterochromatin enrichment on the X chromosome mainly occurs within long genes involved in neuronal functions. Consequently, heterochromatin enrichment reduces the transcription of neuronal genes in the adult brain and results in a defect in Drosophila courtship learning. Taken together, these results define a previously unidentified role of CDK12 in controlling the epigenetic transition between euchromatin and heterochromatin and suggest a chromatin regulatory mechanism in neuronal behaviors (Pan, 2015).
ATP-dependent chromatin remodeling complexes are multi-protein machines highly conserved across eukaryotic genomes. They control sliding and displacing of the nucleosomes, modulating histone-DNA interactions and making nucleosomal DNA more accessible to specific binding proteins during replication, transcription, and DNA repair, which are processes involved in cell division. The SRCAP and p400/Tip60 chromatin remodeling complexes in humans and the related Drosophila Tip60 complex belong to the evolutionarily conserved INO80 family, whose main function is promoting the exchange of canonical histone H2A with the histone variant H2A.Z in different eukaryotic species. Some subunits of these complexes were additionally shown to relocate to the mitotic apparatus and proposed to play direct roles in cell division in human cells. However, whether this phenomenon reflects a more general function of remodeling complex components and its evolutionary conservation remains unexplored. This study has combined cell biology, reverse genetics, and biochemical approaches to study the subcellular distribution of a number of subunits belonging to the SRCAP and p400/Tip60 complexes and assess their involvement during cell division progression in HeLa cells. Interestingly, beyond their canonical chromatin localization, the subunits under investigation accumulate at different sites of the mitotic apparatus (centrosomes, spindle, and midbody), with their depletion yielding an array of aberrant outcomes of mitosis and cytokinesis, thus causing genomic instability. Importantly, this behavior was conserved in the Drosophila melanogaster orthologs tested, even though the evolutionary divergence between fly and humans has been estimated at approximately 780 million years ago.
Overall, these results support the existence of evolutionarily conserved diverse roles of chromatin remodeling complexes, whereby subunits of the SRCAP and p400/Tip60 complexes relocate from the interphase chromatin to the mitotic apparatus, playing moonlighting functions required for proper execution of cell division (Messina, 2022).
Coordinated spatio-temporal regulation of the determination and differentiation of neural stem cells is essential for brain development. Failure to integrate multiple factors leads to defective brain structures or tumour formation. Previous studies suggest changes of chromatin state are needed to direct neural stem cell differentiation, but the mechanisms are unclear. Analysis of Snf5-related 1 (Snr1), the Drosophila orthologue of SMARCB1, an ATP-dependent chromatin remodelling protein, identified a key role in regulating the transition of neuroepithelial cells into neural stem cells and subsequent differentiation of neural stem cells into the cells needed to build the brain. Loss of Snr1 in neuroepithelial cells leads to premature neural stem cell formation. Additionally, loss of Snr1 in neural stem cells results in inappropriate perdurance of neural stem cells into adulthood. Snr1 reduction in neuroepithelial or neural stem cells leads to the differential expression of target genes. Snr1 is associated with the actively transcribed chromatin region of these target genes. Thus, Snr1 likely regulates the chromatin state in neuroepithelial cells and maintains chromatin state in neural stem cells for proper brain development (Keegan, 2023).
Drosophila INterspersed Elements (DINEs) constitute an abundant but poorly understood group of Helitrons present in several Drosophila species. The general structure of DINEs includes two conserved blocks that may or may not contain a region with tandem repeats in between. These central tandem repeats (CTRs) are similar within species but highly divergent between species. This study identified a subset of DINEs, termed DINE-TR1, which contain homologous CTRs of approximately 150 bp. DINE-TR1 are found in the sequenced genomes of several Drosophila species. However, interspecific high sequence identity (~88%) is limited to the first approximately 30 bp of each tandem repeat. Sequence analysis suggests vertical transmission. CTRs found within DINE-TR1 have independently expanded into satellite DNA-like arrays at least twice within Drosophila. By analyzing the genome of Drosophila virilis and Drosophila americana, it was shown that DINE-TR1 is highly abundant in pericentromeric heterochromatin boundaries, some telomeric regions and in the Y chromosome. It is also present in the centromeric region of one autosome from D. virilis and dispersed throughout several euchromatic sites in both species. DINE-TR1 was found to be abundant at piRNA clusters, and small DINE-TR1-derived RNA transcripts (~25 nt) are predominantly expressed in the testes and the ovaries, suggesting active targeting by the piRNA machinery. These features suggest potential piRNA-mediated regulatory roles for DINEs at local and genome-wide scales in Drosophila (Dias, 2015).
Nuclear pore complexes have emerged in recent years as chromatin-binding nuclear scaffolds, able to influence target gene expression. However, how nucleoporins (Nups) exert this control remains poorly understood. This study shows that ectopically tethering Drosophila Nups, especially Sec13, to chromatin is sufficient to induce chromatin decondensation. This decondensation is mediated through chromatin-remodeling complex PBAP, as PBAP is both robustly recruited by Sec13 and required for Sec13-induced decondensation. This phenomenon is not correlated with localization of the target locus to the nuclear periphery, but is correlated with robust recruitment of Nup Elys. Furthermore, this study identified a biochemical interaction between endogenous Sec13 and Elys with PBAP, and a role for endogenous Elys in global as well as gene-specific chromatin decompaction. Together, these findings reveal a functional role and mechanism for specific nuclear pore components in promoting an open chromatin state (Kuhn, 2019).
Recent advances enabled by the Hi-C technique have unraveled many principles of chromosomal folding that were subsequently linked to disease and gene regulation. In particular, Hi-C revealed that chromosomes of animals are organized into Topologically Associating Domains (TADs), evolutionarily conserved compact chromatin domains that influence gene expression. Mechanisms that underlie partitioning of the genome into TADs remain poorly understood. To explore principles of TAD folding in Drosophila melanogaster, Hi-C and PolyA+ RNA-seq were performed in four cell lines of various origins (S2, Kc167, DmBG3-c2, and OSC). Contrary to previous studies, this study found that regions between TADs (i.e. the inter-TADs and TAD boundaries) in Drosophila are only weakly enriched with the insulator protein dCTCF, while another insulator protein Su(Hw) is preferentially present within TADs. However, Drosophila inter-TADs harbor active chromatin and constitutively transcribed (housekeeping) genes. Accordingly, it was found that binding of insulator proteins dCTCF and Su(Hw) predicts TAD boundaries much worse than active chromatin marks do. Interestingly, inter-TADs correspond to decompacted interbands of polytene chromosomes, whereas TADs mostly correspond to densely packed bands. Collectively, these results suggest that TADs are condensed chromatin domains depleted in active chromatin marks, separated by regions of active chromatin. The mechanism of TAD self-assembly is proposed based on the ability of nucleosomes from inactive chromatin to aggregate, and lack of this ability is found in acetylated nucleosomal arrays. Finally, this hypothesis was tested by polymer simulations, and it was found that TAD partitioning may be explained by different modes of inter-nucleosomal interactions for active and inactive chromatin (Ulianov, 2015).
Recently developed 3C-based methods coupled with high-throughput sequencing have enabled genome-wide investigation of chromatin organization. Studies performed in human, mouse, Drosophila, yeasts, Arabidopsis and several other species have unraveled general principles of genome folding. Chromosomes in mammals and Drosophila are organized hierarchically. At the megabase scale, mammalian chromosomes are partitioned into active and inactive compartments. At the sub-megabase scale, these compartments are subdivided into a set of self-interacting domains called Topologically Associating Domains (TADs); TADs themselves are often hierarchical and are split into smaller domains. Similar to mammals, Drosophila chromosomes are partitioned into TADs that are interspaced with short boundaries or longer inter-TAD regions (inter-TADs) (Ulianov, 2015).
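The idea of self-interacting domains separated by boundaries can be made concrete with a toy contact matrix. The sketch below computes a simple insulation-style score: for each position, the mean contact frequency between the bins just upstream and the bins just downstream. A dip in the score marks a candidate boundary. This is a generic illustration and not the domain-calling procedure used in the studies summarized here; the function names `insulation_scores` and `toy_matrix`, the window size, and the contact values are all assumptions made for the example.

```python
def insulation_scores(contacts, window):
    """Mean contact frequency across the junction just before each bin.

    contacts: square symmetric matrix (list of lists) of contact counts.
    window:   number of bins considered on each side of the junction.
    Returns a dict mapping bin index i to the mean of contacts[a][b]
    for a in [i - window, i) and b in [i, i + window).
    """
    n = len(contacts)
    scores = {}
    for i in range(window, n - window + 1):
        total = 0.0
        for a in range(i - window, i):
            for b in range(i, i + window):
                total += contacts[a][b]
        scores[i] = total / (window * window)
    return scores


def toy_matrix(block_sizes, inside=10.0, outside=1.0):
    """Hypothetical contact map: bins in the same block contact each other
    frequently (inside), bins in different blocks rarely (outside)."""
    labels = [k for k, size in enumerate(block_sizes) for _ in range(size)]
    return [[inside if la == lb else outside for lb in labels] for la in labels]


# Two made-up domains of four bins each.
contacts = toy_matrix([4, 4])
scores = insulation_scores(contacts, window=2)
boundary = min(scores, key=scores.get)  # position with the lowest score
print(scores)
print("candidate boundary before bin", boundary)
```

In this toy case the score is lowest at the junction between the two blocks, because no within-domain contacts span it. Real Hi-C maps are noisy and distance-dependent, so a score like this would need normalization before use.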
Partitioning of mammalian genomes into TADs appears to be largely cell-lineage independent and evolutionarily conserved. Disruption of certain TAD boundaries leads to developmental defects in humans and mice. TADs correlate with units of replication timing regulation in mammals and colocalize with epigenetic domains (either active or repressed) in Drosophila. The internal structure of TADs was reported to change in response to environmental stress, during cell differentiation, and embryonic development. In addition, comparative Hi-C analysis has demonstrated that genomic rearrangements between related mammalian species occur predominantly at TAD boundaries. Consequently, TADs appear to evolve primarily as constant and unsplit units. Previous studies in Drosophila embryonic nuclei and embryo-derived Kc167 cells detected TADs of various sizes roughly corresponding to epigenetic domains. Additionally, long-range genomic contacts and clustering of pericentromeric regions were revealed, and TAD boundaries were found to be enriched with active chromatin marks and insulator proteins. Both active and inactive TADs were identified, and their spatial segregation was observed (Ulianov, 2015).
Despite extensive studies, mechanisms underlying TAD formation remain obscure. Architectural proteins, including cohesin and CTCF, are often found at TAD boundaries; thus, they have been proposed to play a key role in the demarcation of TADs. However, several studies suggest that other mechanisms may be responsible for partitioning and formation of TADs. Firstly, depletion of various insulator proteins did not affect the profile of chromosome partitioning into TADs, but rather decreased intra-TAD interactions. Secondly, CTCF may mediate loops that occur between the start and the end of the so-called 'loop domains'. However, domains of similar sizes but without a loop were observed as well (so-called 'ordinary domains'). Thirdly, polymer simulations of a permanent chromatin loop yield a noticeable interaction between the loop bases on a simulated Hi-C map, but without a characteristic square shape of a TAD. Loops of this kind are thought to occur between insulator proteins such as Su(Hw) in the 'topological insulation' model. Finally, chromosomal domains similar to TADs in the bacterium Caulobacter crescentus are demarcated by actively transcribed genes, and are not affected by the knockout of SMC, a homolog of cohesin subunits (Ulianov, 2015).
This study presents evidence that questions the role of insulators in the organization of TAD boundaries in Drosophila. The results suggest that TADs are self-organized and potentially highly dynamic structures formed by numerous transient interactions between nucleosomes of inactive chromatin, while inter-TADs and TAD boundaries contain highly acetylated nucleosomes that are less prone to interactions. Finally, a polymer model of TAD formation is developed based on the two types of nucleosomes, and it was found that a polymer composed of active and inactive chromatin blocks forms TADs on a simulated Hi-C map (Ulianov, 2015).
This study and others (Hou, 2012; Sexton, 2012) revealed that boundaries and inter-TADs in Drosophila, as opposed to TADs, are strongly enriched with active chromatin and its individual marks, as well as with active transcription and with constitutively transcribed housekeeping genes. Consequently, active chromatin marks, in the simplest case only total transcription and H3K4me3 (a mark of active promoters), can relatively well predict a TAD/inter-TAD profile. The existence of long inter-TADs composed of active chromatin is per se an argument for the ability of this type of chromatin to separate TADs. Furthermore, the current observations demonstrate that the presence of active chromatin and transcribed regions within a TAD undermines TAD integrity, making the TAD less compact and generating weak boundaries inside it. Consequently, a bona fide TAD is inactive; TADs containing active chromatin become less dense, acquire weak internal boundaries and eventually split into smaller TADs that are composed of inactive chromatin. The observation that the majority of housekeeping genes are located within inter-TADs and TAD boundaries suggests that evolutionary conservation and cell-type independence of TAD/inter-TAD profiles may be explained by conservation of positions of housekeeping genes along the chromosomes (Ulianov, 2015).
It is noted that chromosomal interaction domains similar to TADs have been observed in the bacterium Caulobacter crescentus, where they are demarcated by sites of active transcription. Although the basic level of chromosomal folding is different in bacteria and eukaryotes, the model proposed for Caulobacter (Le, 2013) and the current model stem from common principles. In Caulobacter, active transcription is thought to disrupt the fiber of supercoils (plectonemes) by creating a stretch of non-packaged DNA, free of plectonemes, which spatially separates chromosomal regions flanking it. In the current model, transcription disrupts chromatin organization by introducing a 'non-sticky' region of chromatin, which is less compact and more unfolded in space, and thus spatially separates two flanking regions. Computer modeling shows that stickiness of non-acetylated (inactive) nucleosomes and the absence of stickiness for acetylated (active) nucleosomes are sufficient for chromatin partitioning into TADs and inter-TADs. Self-association of nucleosomes may be explained by the interaction of positively charged histone tails (in particular, the tail of histone H4) of one nucleosome with the acidic patch of histones H2A/H2B at an adjacent nucleosome. Acetylation of histone tails, which is typical of active chromatin, may interfere with inter-nucleosomal associations. In addition to a high level of histone acetylation, other features of active chromatin including lower nucleosome density in inter-TADs, manifested as the decreased histone H3 occupancy, might contribute to the generation of TAD profiles (Ulianov, 2015).
It should be mentioned that a significant difference between the polymer simulations and models previously suggested by the Cavalli and Vaillant groups (Jost, 2014) is the use of saturating interactions between inactive nucleosomes. In the case of volume interactions, all nucleosomes of the same type adjacent in 3D space will attract each other; in the case of saturating interactions, each molecule may attract only one neighbor. Using volume interactions leads to the formation of a single dense blob, and does not produce TADs in a simulated Hi-C map. It is noted that the saturating nature of interactions between nucleosomes is based on the known properties of nucleosomal particles. Previous studies considered a variety of mechanisms that may lead to the formation of TADs. In particular, Barbieri (2012) studied segregation of two TADs using cubic lattice simulations of a short 152-monomer chain consisting of two TADs, assuming that inter-monomer interactions could only form between monomers belonging to the same TAD. The current model shows that TADs emerge without requiring such specific interactions; any two regions of sticky monomers separated by a non-sticky linker would form TADs. Another study proposed that transcription-induced supercoiling may be responsible for the formation of TADs (Benedetti, 2014). Although this model is consistent with the current observation that sites of active transcription demarcate TAD boundaries, there is limited evidence that supercoiling of chromatinized DNA exists in Drosophila and other organisms. On the contrary, the current model is based on known biochemical properties of nucleosomes (Ulianov, 2015).
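The distinction between volume and saturating interactions can be illustrated with a few lines of code. The sketch below takes a static set of monomer coordinates and lists which sticky (inactive) monomers would attract each other under each rule. It is only a bookkeeping illustration, not the polymer simulation used in the study: the function names, the distance cutoff, and the greedy nearest-first pairing used to enforce "one partner per monomer" are assumptions made for the example.

```python
import math
from itertools import combinations


def close_sticky_pairs(positions, sticky, cutoff):
    """Index pairs of sticky monomers closer than cutoff, nearest first."""
    pairs = []
    for i, j in combinations(range(len(positions)), 2):
        if sticky[i] and sticky[j]:
            d = math.dist(positions[i], positions[j])
            if d < cutoff:
                pairs.append((d, i, j))
    pairs.sort()
    return [(i, j) for _, i, j in pairs]


def volume_bonds(positions, sticky, cutoff):
    """Volume interactions: every pair of nearby sticky monomers attracts."""
    return close_sticky_pairs(positions, sticky, cutoff)


def saturating_bonds(positions, sticky, cutoff):
    """Saturating interactions: each monomer may bind only one partner.

    Pairs are accepted greedily, nearest first (an assumed tie-breaking rule).
    """
    used = set()
    bonds = []
    for i, j in close_sticky_pairs(positions, sticky, cutoff):
        if i not in used and j not in used:
            bonds.append((i, j))
            used.update((i, j))
    return bonds


# Four sticky monomers on a unit square plus one non-sticky monomer in the middle.
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0.5, 0.5, 0)]
sticky = [True, True, True, True, False]

print(len(volume_bonds(positions, sticky, cutoff=1.5)))   # all six sticky pairs
print(saturating_bonds(positions, sticky, cutoff=1.5))    # two disjoint pairs
```

With volume interactions the number of attractive contacts grows with every additional sticky neighbor, which is consistent with the collapse into a single dense blob described above. With saturating interactions the number of bonds is capped at half the number of sticky monomers.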
The fact that a minor fraction of TADs is built mostly from active chromatin apparently contradicts the current model, suggesting that additional ways of chromatin self-organization could exist. One possibility is the establishment of long-range contacts between enhancers and their cognate promoters, as well as loops between pairs of insulators. Such loops formed inside active unstructured chromatin linkers (i.e., inter-TADs) could probably be sufficient to compact them and thus to fold into TADs (Ulianov, 2015).
TAD profiles of X chromosomes are almost identical in the male and female cell lines, which is in agreement with recently published observations (Ramírez, 2015). Thus, it seems that hyperacetylation of male X-chromosomes due to dosage compensation does not generate new TAD boundaries. However, it should be noted that MOF histone acetyltransferase of the MSL complex introduces only the H4K16ac mark. Although this modification is important to prevent inter-nucleosomal interactions, acetylation at other histone positions and H2B ubiquitylation contribute as well. Additionally, H4K16 acetylation generated by the dosage compensation system occurs preferentially at regions enriched with transcribed genes and hence within inter-TADs (Ulianov, 2015).
The current analysis does not support the previously reported (Hou, 2012; Sexton, 2012) strong enrichments of insulator proteins Su(Hw) and dCTCF at TAD boundaries in Drosophila. To assess the possible reasons for this discrepancy, the dCTCF distribution was re-analyzed with respect to TAD positions in the current dataset using the raw ChIP-seq data. No strong difference was observed in the dCTCF coverage in TADs and inter-TADs. Interestingly, this study obtained the same result while analyzing dCTCF and Su(Hw) binding within TAD boundaries identified by Hou (2012). However, a strong enrichment of dCTCF at TAD boundaries was observed when the peak distribution was analyzed instead of read coverage. Additionally, the effect was much weaker when modENCODE peaks were used. Hence, the discrepancy may be caused by a different peak calling procedure in modENCODE and in Hou (2012). The biological significance of these observations remains to be determined. It is noted that disruption of the cohesin/CTCF complex in mammals, as well as depletion of the Vtd (also known as Rad21) cohesin subunit in Drosophila, did not lead to disappearance of TAD boundaries, but rather only slightly decreased interactions inside TADs (in mammals) and reduced TAD boundary strength in the Drosophila genome. These observations favor a role for the cohesin/CTCF complex, which is known to form loops, in chromatin compaction inside the TADs (Ulianov, 2015).
Binding of insulator proteins might contribute to establishing TAD boundaries through introducing active chromatin marks. Indeed, when inserted into an ectopic position, a classical insulator triggers hyperacetylation of the local chromatin domain and recruits chromatin-remodeling complexes. However, absence of strong enrichment of dCTCF at TAD boundaries and preferential location of Su(Hw) inside TADs mean that at least dCTCF- and Su(Hw)-dependent insulators are not the major determinants of TAD boundaries and inter-TADs (Ulianov, 2015).
TADs are predicted based on the analysis of averaged data from a cell population. Although they are usually represented as large chromatin globules, direct experimental evidence for the existence of such globules in individual cells is controversial. Using confocal and 3D-SIM microscopy, ~1-Mb globular domains have been observed within chromosomal territories. However, using STORM microscopy, chromatin in individual mammalian cells has been found to be organized into 'clutches' composed of several nucleosomes, and increased histone acetylation has been found to dramatically reduce the size of these clutches. It is thus possible that sub-megabase TADs revealed by Hi-C represent a set of nucleosome clutches separated by relatively short spacers of various sizes. These short clutches may occupy various positions within TADs in different cells and stochastically assemble to form short-living aggregates. The stochastic nature of TADs is supported by computer simulations (Ulianov, 2015).
Topologically associating domains, or TADs, are functional units that organize chromosomes into 3D structures of interacting chromatin. While the mechanisms of TAD formation have been well-studied, current knowledge on the patterns of TAD evolution across species is limited. Due to the integral role TADs play in gene regulation, their structure and organization are expected to be conserved during evolution. However, more recent research suggests that TAD structures diverge relatively rapidly. This study used Hi-C chromosome conformation capture to measure evolutionary conservation of whole TADs and TAD boundary elements between D. melanogaster and D. triauraria, two early-branching species from the melanogaster species group which diverged ∼15 million years ago. The majority of TADs were found to have been reorganized since the common ancestor of D. melanogaster and D. triauraria, via a combination of chromosomal rearrangements and gain/loss of TAD boundaries. TAD reorganization between these two species is associated with a localized effect on gene expression, near the site of disruption. By separating TADs into subtypes based on their chromatin state, it was found that different subtypes are evolving under different evolutionary forces. TADs enriched for broadly expressed, transcriptionally active genes are evolving rapidly, potentially due to positive selection, whereas TADs enriched for developmentally-regulated genes remain conserved, presumably due to their importance in restricting gene-regulatory element interactions. These results provide novel insight into the evolutionary dynamics of TADs and help to reconcile contradictory reports related to the evolutionary conservation of TADs and whether changes in TAD structure affect gene expression (Torosin, 2020).
Topologically associating domains (TADs) were recently identified as fundamental units of three-dimensional eukaryotic genomic organization, although knowledge of the influence of TADs on genome evolution remains preliminary. To study the molecular evolution of TADs in Drosophila species, a new reference-grade genome assembly and accompanying high-resolution TAD map for D. pseudoobscura was constructed. Comparison of D. pseudoobscura and D. melanogaster, which are separated by ∼49 million years of divergence, showed that ∼30%-40% of their genomes retain conserved TADs. Comparative genomic analysis of 17 Drosophila species revealed that chromosomal rearrangement breakpoints are enriched at TAD boundaries but depleted within TADs. Additionally, genes within conserved TADs show lower expression divergence than those located in nonconserved TADs. Furthermore, it was found that a substantial proportion of long genes (>50 kbp) in D. melanogaster (42%) and D. pseudoobscura (26%) constitute their own TADs, implying transcript structure may be one of the deterministic factors for TAD formation. By using structural variants (SVs) identified from 14 D. melanogaster strains, its three closest sibling species from the D. simulans species complex, and two obscura clade species, evidence was uncovered of selection acting on SVs at TAD boundaries, but with the nature of selection differing between SV types. Deletions are depleted at TAD boundaries in both divergent and polymorphic SVs, suggesting purifying selection, whereas divergent tandem duplications are enriched at TAD boundaries relative to polymorphism, suggesting they are adaptive. These findings highlight how important TADs are in shaping the acquisition and retention of structural mutations that fundamentally alter genome organization (Liao, 2021).
Technological advances have led to the creation of large epigenetic datasets, including information about DNA binding proteins and DNA spatial structure. Hi-C experiments have revealed that chromosomes are subdivided into sets of self-interacting domains called Topologically Associating Domains (TADs). TADs are involved in the regulation of gene expression activity, but the mechanisms of their formation are not yet fully understood. This study focused on machine learning methods to characterize DNA folding patterns in Drosophila based on chromatin marks across three cell lines. This study presents linear regression models with four types of regularization, gradient boosting, and recurrent neural networks (RNN) as tools to study chromatin folding characteristics associated with TADs given epigenetic chromatin immunoprecipitation data. The bidirectional long short-term memory recurrent neural network architecture produced the best prediction scores and identified biologically relevant features. Distribution of protein Chriz (Chromator) and histone modification H3K4me3 were selected as the most informative features for the prediction of TAD characteristics. This approach may be adapted to any similar biological dataset of chromatin features across various cell lines and species. The code for the implemented pipeline, Hi-ChiP-ML, is publicly available (Rozenwald, 2020).
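Of the model families mentioned above, regularized linear regression is the simplest to write down. The sketch below shows how an L2 (ridge) penalty shrinks the fitted slope for a single feature. It is a didactic example only and is not taken from the Hi-ChiP-ML pipeline: the choice of ridge as the penalty, the function name `ridge_slope`, and the numbers standing in for a per-bin chromatin mark signal and a per-bin TAD characteristic are all assumptions.

```python
def ridge_slope(x, y, lam):
    """Closed-form ridge estimate of the slope for one feature.

    Both variables are centered, so the intercept is not penalized.
    The slope is Sxy / (Sxx + lam); lam = 0 gives ordinary least squares.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / (sxx + lam)


# Made-up values: a chromatin mark signal per genomic bin (feature)
# and a TAD-related characteristic per bin (target).
mark_signal = [0.0, 1.0, 2.0, 3.0]
tad_value = [0.0, 2.0, 4.0, 6.0]

print(ridge_slope(mark_signal, tad_value, lam=0.0))  # unpenalized fit
print(ridge_slope(mark_signal, tad_value, lam=5.0))  # slope shrunk toward zero
```

Increasing `lam` pulls the coefficient toward zero, which is the general reason regularization helps when many correlated chromatin features are used as predictors.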
Several distinct activities and functions have been described for chromatin insulators, which separate genes along chromosomes into functional units. This paper describes a novel mechanism of functional separation whereby an insulator prevents gene repression. When the homie insulator is deleted from the end of a Drosophila even skipped (eve) locus, a flanking P-element promoter is activated in a partial eve pattern, causing expression driven by enhancers in the 3' region to be repressed. The mechanism involves transcriptional read-through from the flanking promoter. This conclusion is based on the following. Read-through driven by a heterologous enhancer is sufficient to repress, even when homie is in place. Furthermore, when the flanking promoter is turned around, repression is minimal. Transcriptional read-through that does not produce anti-sense RNA can still repress expression, ruling out RNAi as the mechanism in this case. Thus, transcriptional interference, caused by enhancer capture and read-through when the insulator is removed, represses eve promoter-driven expression. We also show that enhancer-promoter specificity and processivity of transcription can have decisive effects on the consequences of insulator removal. First, a core heat shock 70 promoter that is not activated well by eve enhancers did not cause read-through sufficient to repress the eve promoter. Second, these transcripts are less processive than those initiated at the P-promoter, measured by how far they extend through the eve locus, and so are less disruptive. These results highlight the importance of considering transcriptional read-through when assessing the effects of insulators on gene expression (Fujioka, 2021).
The relationship between chromatin organization and gene regulation remains unclear. While disruption of chromatin domains and domain boundaries can lead to misexpression of developmental genes, acute depletion of regulators of genome organization has a relatively small effect on gene expression. It is therefore uncertain whether gene expression and chromatin state drive chromatin organization or whether changes in chromatin organization facilitate cell-type-specific activation of gene expression. Using the dorsoventral patterning of the Drosophila melanogaster embryo as a model system, this study provides evidence for the independence of chromatin organization and dorsoventral gene expression. Tissue-specific enhancers were defined and linked to expression patterns using single-cell RNA-seq. Surprisingly, despite tissue-specific chromatin states and gene expression, chromatin organization is largely maintained across tissues. These results indicate that tissue-specific chromatin conformation is not necessary for tissue-specific gene expression but rather acts as a scaffold facilitating gene expression when enhancers become active (Ing-Simmons, 2021).
Acquisition of cell fate is thought to rely on the specific interaction of remote cis-regulatory modules (CRMs), for example, enhancers and target promoters. However, the precise interplay between chromatin structure and gene expression is still unclear, particularly within multicellular developing organisms. This study employed Hi-M, a single-cell spatial genomics approach, to detect CRM-promoter looping interactions within topologically associating domains (TADs) during early Drosophila development. By comparing cis-regulatory loops in alternate cell types, it was shown that physical proximity does not necessarily instruct transcriptional states. Moreover, multi-way analyses reveal that multiple CRMs spatially coalesce to form hubs. Loops and CRM hubs are established early during development, before the emergence of TADs. Moreover, CRM hubs are formed, in part, via the action of the pioneer transcription factor Zelda and precede transcriptional activation. This approach provides insight into the role of CRM-promoter interactions in defining transcriptional states, as well as distinct cell types (Espinola 2021).
This study used a high-resolution, imaging-based, single-cell spatial genomics approach (Hi-M) to link chromosome topology and transcriptional regulation during early Drosophila development. This approach has notable advantages, such as the detection of multi-way interactions and transcriptional output with spatial resolution. Extensive interaction networks were revealed within developmental TADs primarily involving CRMs. Critically, these networks arise thanks to the spatial clustering of multiple enhancers (CRM hubs) and are mostly invariant during cell fate specification and gene activation. Networks of pairwise CRM contacts and CRM hubs arise during early development, before the onset of gene expression and before the emergence of TADs, and require the pioneering activity of the transcription factor Zld (Espinola 2021).
One of the important results of the present study is that physical proximity between multiple CRMs and promoters is observed with very similar frequencies in cells with distinct fates and appeared during early embryogenesis. These results are consistent with those obtained at later stages of Drosophila embryogenesis, showing that enhancers located at considerably larger distances (~100 kb) can also form binary loops that are present in cells from different tissues. Similarly, E-P interactions at the mouse Hoxd locus were detected in tissues where target genes were not expressed. The results are further supported by a companion paper that applied Hi-C and micro-C to study tissue-specific Drosophila chromosome organization at similar stages of development. From a developmental perspective, the formation of loops between promoters and distal regulatory elements in cells where genes need to be repressed can be seen as a 'dangerous liaison'. Indeed, once a loop has been established, transcriptional activation could rapidly occur in cells where that specific promoter should be kept inactive (Espinola 2021).
This apparent dichotomy can, however, be rationalized in terms of the spatiotemporal patterning of the cis-regulatory logic of TFs during embryogenesis. For instance, in M (mesodermal) cells, most doc CRMs are bound by the spatially localized transcriptional repressor Sna, which acts as a silencer in the M. In this case, communication between promoters and distal CRMs may reinforce transcriptional repression. This interpretation is in agreement with the finding that many enhancers can act as silencers in alternate cell types during Drosophila development; however, other silencing mechanisms may also be at play. Thus, it is hypothesized that the optimal mechanism to ensure rapid and efficient activation or repression during development may involve two steps: the rapid priming of key CRMs via ubiquitously maternally deposited pioneer factors (for example, Zld), followed by regulation of transcriptional output by spatially and temporally localized transcriptional activators and repressors. In this model, three-dimensional (3D) chromatin architecture plays a double role because 3D contacts could serve to reinforce both activation and repression at a particular developmental stage while allowing for flexibility at later stages. For example, a repressive CRM loop in a tissue at an early developmental stage may switch to a CRM loop with activation capacities at later stages by changing TF occupancy. Future experiments testing whether CRM loops and hubs display more differences in active and repressed tissues at later stages of development will be important to test these hypotheses (Espinola 2021).
Previous studies suggested that invariant enhancer-promoter (E-P) loops may be pre-established and stable (Allahyar, 2018; Dudelaar, Tsai, 2019). In agreement with these results, the data indicate that E-P loops can form early, well before the onset of gene expression. However, in all cases, low frequencies of looping interactions between functional elements were measured. These results are consistent with previous measurements of absolute contact frequencies within TADs and between E-Ps. Thus, these results indicate that different sets of multi-way E-E and E-P contacts are present in different cells, and that these contacts may be highly dynamic (Espinola 2021).
Recent studies reported the existence of enhancer hubs: spatially localized clusters containing multiple enhancers that may facilitate transcriptional activation by creating a local microenvironment whereby transcriptional resources are shared, akin to early models of 'transcription hubs'. Formation of enhancer hubs may require interactions between components of the transcriptional machinery, which could contribute to, or result from, the assembly of phase-separated condensates. In this model, enhancers need not directly touch their target promoters but merely come into close proximity (~200-300 nm). Overall, these findings and models are consistent with the observation that multiple endogenous CRMs within a TAD come together in space to form hubs in single, actively transcribing nuclei. The formation of similar hubs was also observed in inactive nuclei, suggesting that repressive elements may also form spatially localized clusters of transcriptional repressors to share resources and reinforce their silencing activities. CRM hubs are formed at early stages of development in pluripotent cells. Thus, a model is favored in which preferential CRM interaction networks are pre-formed at early stages and are subsequently specified (into activation or repression hubs) during nc14 or later (Espinola 2021).
In Drosophila, TADs emerge concomitant with the major wave of zygotic gene activation. Previous studies reported the existence of chromatin loops typically at considerably large genomic distances spanning two or more TADs or concerning Polycomb-binding sites. In the present study, it was observed that chromatin loops between CRMs within Drosophila TADs are widespread, mimicking the common CTCF-mediated chromatin loops present within mammalian TADs. In addition, this study found that multiple CRMs can cluster together to form cis-regulatory hubs located within TADs, suggesting a mechanism to sequester enhancers in space to reduce the activation of genes in neighboring TADs. Importantly, formation of CRM hubs precedes the emergence of TADs, consistent with the finding in mammalian cells that subsets of E-P contacts arise rapidly after mitosis before TADs are re-formed. Thus, the results suggest that CRM hubs and TADs probably form by different mechanisms. Overall, it is hypothesized that CRM hubs represent an additional functional level of genome organization, independent of TADs. This additional layer can also be regulated by priming of enhancers and promoters by paused polymerases or pioneer factors, as well as by chromatin marks. As interactions between Zld CRMs appear before TADs, it is unclear how specificity of CRM interactions may be regulated to favor intra-TAD contacts (Espinola 2021).
Interestingly, interactions between Zld-bound CRMs, as well as interactions between CRMs and cognate promoters, were observed to be established very early in pluripotent nuclei, before cell fate commitment. These long-range interactions occur between related CRMs (within doc- and sna-TADs) as well as between unrelated but Zld-bound CRMs, suggesting that a common link could be their regulation by broad factors such as Zld. Critically, preferential contacts involving Zld-bound CRMs were considerably attenuated upon Zld depletion. It has been recently shown that Zld forms nuclear hubs in early Drosophila embryos, and that Zld hubs are re-established by the end of mitosis, before transcriptional activation. Taken together, these results suggest that Zld fosters the formation of CRM hubs by rendering chromatin accessible during early development, as a first step of cell specification to ensure maximum plasticity. Future work involving the detection of a larger number of CRMs will be needed to elucidate the factors and mechanisms involved in spatial clustering of developmental CRMs into nuclear microenvironments (Espinola 2021).
While the biochemistry of gene transcription has been well studied, how this process is organised in 3D within the intact nucleus is less well understood. This study investigated the structure of actively transcribed chromatin and the architecture of its interaction with active RNA polymerase. For this analysis, super-resolution microscopy was used to image the Drosophila melanogaster Y loops, which represent huge single transcription units, each several megabases long. The Y loops provide a particularly amenable model system for transcriptionally active chromatin. It was found that, although these transcribed loops are decondensed, they are not organised as extended 10 nm fibres, but rather they largely consist of chains of nucleosome clusters. The average width of each cluster is around 50 nm. Foci of active RNA polymerase are generally located off the main fibre axis on the periphery of the nucleosome clusters. Foci of RNA polymerase and nascent transcripts are distributed around the Y loops rather than being clustered in individual transcription factories. However, as the RNA polymerase foci are considerably less prevalent than the nucleosome clusters, the organisation of this active chromatin into chains of nucleosome clusters is unlikely to be determined by the activity of the polymerases transcribing the Y loops. These results provide a foundation for understanding the topological relationship between chromatin and the process of gene transcription (Ball, 2023).
Genome organization is driven by forces affecting transcriptional state, but the relationship between transcription and genome architecture remains unclear. This study identified the Drosophila transcription factor Motif 1 Binding Protein (M1BP) in physical association with the gypsy chromatin insulator core complex, including the universal insulator protein CP190. M1BP is required for enhancer-blocking and barrier activities of the gypsy insulator as well as its proper nuclear localization. Genome-wide, M1BP specifically colocalizes with CP190 at Motif 1-containing promoters, which are enriched at topologically associating domain (TAD) borders. M1BP facilitates CP190 chromatin binding at many shared sites and vice versa. Both factors promote Motif 1-dependent gene expression and transcription near TAD borders genome-wide. Finally, loss of M1BP reduces chromatin accessibility and increases both inter- and intra-TAD local genome compaction. These results reveal physical and functional interaction between CP190 and M1BP to activate transcription at TAD borders and mediate chromatin insulator-dependent genome organization (Bag, 2021).
In eukaryotic cells, the three-dimensional organization of the genome plays a critical role in achieving proper spatial and temporal patterns of gene expression during development. Chromatin insulators are DNA-protein complexes involved in the establishment, maintenance, and regulation of nuclear organization to modulate gene expression. Insulators regulate interactions between cis-regulatory elements such as enhancers and promoters and demarcate silent and active chromatin regions to ensure their proper regulation. They can inhibit the interaction between an enhancer and a promoter when positioned between the two elements and can act as a barrier to stop repressive chromatin from spreading over active genes. Furthermore, chromatin insulators can promote intra- and inter-chromosomal looping to control topology of the genome. Certain insulator proteins are highly enriched at the self-interacting boundaries of topologically associating domains (TADs) throughout the genome. In mammals, only a single insulator protein, CCCTC-binding Factor (CTCF), has thus far been identified, and CTCF indeed is enriched at TAD borders and is required for TAD formation. In contrast, Drosophila melanogaster CTCF is not particularly enriched at TAD borders, and a recent study indicates that CTCF plays a limited role in TAD formation in flies. In fact, Drosophila harbors a variety of insulator protein complexes, all of which contain the protein Centrosomal protein 190 (CP190). CP190 is highly enriched at TAD borders, suggesting a possible role in TAD formation. Another notable feature of genome organization that has been explored in detail in Drosophila is the key role of transcription and the presence of constitutively active genes at TAD borders. General inhibition of transcription using chemical treatments or heat shock results in disruption of TADs and compartments, but the mechanistic details of how transcription contributes to genome organization are yet to be elucidated (Bag, 2021).
The Drosophila gypsy insulator, also known as the Suppressor of Hairy wing [Su(Hw)] insulator, was the first characterized CP190-containing insulator complex. The zinc-finger DNA-binding protein Su(Hw) provides binding specificity of the complex, and both CP190 and the Modifier of mdg4 [Mod(mdg4)] 67.2 kDa isoform [Mod(mdg4)67.2] contain an N-terminal Broad-Complex, Tramtrack, and Bric a brac (BTB) domain that can homodimerize or heterodimerize to facilitate insulator-insulator interactions and promote formation of long range insulator-mediated loops. Initially, the gypsy insulator complex was characterized as binding the 5'-untranslated region of the gypsy retroelement. However, the core complex also binds thousands of endogenous sites throughout the genome and can function similarly at least at a subset of those sites. Moreover, the three gypsy insulator core components do not colocalize absolutely at all binding sites throughout the genome, and each protein can interact with other insulator proteins. In diploid interphase nuclei, gypsy insulator proteins coalesce into large foci termed insulator bodies. These structures can be induced by stress, and insulator bodies have also been proposed to serve as storage depots for insulator proteins. Nevertheless, there is a high correlation between proper insulator function and insulator body localization. In summary, the gypsy insulator complex contributes to higher order nuclear organization on several levels (Bag, 2021).
CP190 also associates with a variety of additional DNA-binding proteins that likely impart specificity of the respective complex. The BED finger-containing proteins BEAF-32, Ibf1, and Ibf2 interact with CP190 and promote insulator function (Cuartero, 2014). Three additional zinc-finger proteins Pita, ZIPIC, and CTCF also interact with CP190 and contribute to insulator activity. Recently, the zinc-finger protein CLAMP was demonstrated to positively affect gypsy insulator activity and to colocalize particularly with CP190 at promoters throughout the genome. In addition, previous work showed that CP190 preferentially binds Motif 1-containing promoters, but the functional significance of this observation is currently unknown. The precise functions of CP190, its associated factors, as well as their relationship with transcription regulation have not yet been elucidated (Bag, 2021).
Motif 1 binding protein (M1BP) is a ubiquitously expressed transcriptional activator that is required for the expression of predominantly constitutive genes. A zinc-finger DNA-binding protein, M1BP, specifically binds to the core promoter element Motif 1 consensus sequence that is distinct from the canonical TATA box and mainly controls the expression of constitutively active genes that are transiently paused (Li, 2013). For example, M1BP interacts with the TATA-binding protein-related factor 2 (TRF2) to activate transcription of ribosomal protein genes in a Motif 1-dependent manner (Baumann, 2017a). Finally, recent studies found that Motif 1 and M1BP are highly enriched at TAD boundaries along with CP190 and BEAF-32 (Cubenas-Potts, 2017; Ramirez, 2018; Sexton, 2012; Hou, 2012). Depletion of M1BP led to increased inter-chromosomal Hi-C contacts; however, concomitant cell cycle disruption precluded interpretation of these results (Ramirez, 2018). The possible role of M1BP-dependent transcriptional regulation in genome organization has not yet been interrogated in detail (Bag, 2021).
This study identified M1BP as a physical interactor and positive regulator of the gypsy insulator complex. Depletion of M1BP decreases gypsy-dependent enhancer blocking and barrier activities and reduces the association of the core insulator complex with the gypsy insulator sequence. ChIP-seq analysis reveals extensive genome-wide overlap of M1BP particularly with promoter-bound CP190, and depletion of M1BP results in extensive loss of CP190 chromatin association genome-wide. Depletion of CP190 also disrupts M1BP binding at many of its binding sites. Nascent euRNA-seq (neuRNA-seq) analysis of M1BP- or CP190-depleted cells indicates that both factors co-regulate a similar set of genes genome-wide. In particular, loss of gene activation correlates with disrupted M1BP and CP190 binding, and these events are frequently observed at TAD borders. Depletion of M1BP disrupts gypsy insulator body localization within the nucleus and alters both inter- and intra-TAD local genome compaction. Finally, knockdown of M1BP decreases chromatin accessibility at its binding sites, including genes that it activates and regions in proximity of TAD borders. Taken together, these findings identify a mechanistic relationship between M1BP and CP190 to activate Motif 1-dependent transcription as well as to promote chromatin insulator activity and nuclear organization (Bag, 2021).
This study has shown that M1BP is required for proper gypsy insulator function and insulator body formation and that M1BP and CP190 together activate transcription at TAD borders. M1BP was shown to physically associate with CP190 as well as core gypsy components and to promote enhancer blocking and barrier activities. Genome-wide, M1BP colocalizes mainly with CP190 at Motif 1-containing promoters, which are enriched at TAD borders. M1BP is required for CP190 binding at many sites throughout the genome and vice versa, and loss of either factor reduces gene expression at TAD borders. M1BP is required for proper nuclear localization of insulator bodies, and loss of M1BP increases local genome compaction across TAD borders as well as within large TADs. Finally, M1BP promotes local chromatin accessibility at its binding sites, including transcriptionally activated genes and regions near TAD borders. Taken together, these findings suggest that M1BP may play a role in 3D genome organization through a CP190- and transcription-dependent mechanism (Bag, 2021).
As M1BP is ubiquitously expressed throughout development, effects on insulator activity and complex localization after M1BP depletion were observed in all tissues and stages of development tested. M1BP associates physically with chromatin at the Su(Hw)-binding sites of the gypsy insulator in conjunction with core insulator proteins, and M1BP is required for the binding of all three factors. These findings suggest that M1BP directly affects insulator activity by aiding the recruitment of gypsy core components to gypsy insulator sites, all three of which are required for proper insulator activity. Interestingly, Motif 1 is not present at this sequence, and M1BP binding at this site is also dependent on the presence of CP190. One scenario is that binding of the two factors could be cooperative, and M1BP recruitment may additionally help stabilize the multimerization and/or higher order organization of insulator complexes. Consistent with this hypothesis, depletion of M1BP results in increased numbers of smaller insulator bodies, similar to the effect of complete loss of the BTB-containing core insulator protein Mod(mdg4)67.2. Another possibility is that depletion of M1BP results in cellular stress that induces insulator body formation. Since CP190 is a universal insulator protein in Drosophila, mislocalization of CP190 may result in, or at least serve as an indicator of, disrupted genome organization when M1BP is depleted (Bag, 2021).
Although M1BP physically interacts with each of the core gypsy insulator components, it was observed that M1BP colocalizes mainly with just CP190 throughout the genome, particularly at Motif 1-containing promoters. Distinct binding of the transcriptional activator M1BP compared to Su(Hw) and Mod(mdg4)67.2 genome-wide is not entirely surprising considering sub-stoichiometric levels of co-immunoprecipitation that could also reflect interaction off of chromatin. Furthermore, it has been observed that Su(Hw) binding can correlate with transcriptional repression rather than insulator activity, which may depend on the presence, or absence, of particular interacting proteins. Importantly, recruitment of CP190 is dependent on M1BP and vice versa at many co-occupied sites genome-wide. Why M1BP is only partially dependent on CP190 for binding is unclear, but these results are consistent with the known ability of M1BP to bind DNA directly (Li, 2013), whereas CP190 is believed to require interaction with a specific DNA-binding protein in order to associate with chromatin. Aside from the insulator protein BEAF-32, which binds an AT-rich dual core sequence, a large extent of overlap between M1BP and other DNA-binding insulator proteins that have been shown to be involved in recruiting CP190 to DNA was not generally observed. Future studies may reveal a possible functional relationship between M1BP and BEAF-32 in insulator activity or regulation of gene expression (Bag, 2021).
The results suggest that M1BP promotes gypsy insulator function through interaction with CP190 in a manner distinct from CP190 interaction with the zinc-finger DNA-binding protein CLAMP. A large extent of genome-wide overlap between M1BP and CLAMP, a recently identified positive regulator of gypsy insulator activity, was not observed. The sequence binding specificity of CLAMP is similar to that of GAF, while M1BP and GAF bind to and regulate distinct sets of promoters (Li, 2013). Although depletion of either CLAMP or M1BP reduces gypsy enhancer blocking and barrier activities and alters insulator body localization, CLAMP depletion, unlike M1BP depletion, does not affect CP190 chromatin association throughout the genome. In fact, CP190 depletion had a substantial effect on CLAMP chromatin association, again suggesting that CP190 may affect the ability of certain DNA-binding proteins, including M1BP, to associate with chromatin through cooperative or higher order physical interactions (Bag, 2021).
The results suggest that CP190 may play a more direct role in transcriptional regulation than previously appreciated, in part through interaction with M1BP. Genome-wide profiling studies have shown that CP190 preferentially associates with promoters genome-wide. CP190 was found to be enriched particularly at active promoters and was shown to affect steady-state gene expression when depleted; however, direct and indirect effects as well as transcriptional and posttranscriptional effects could not be separated. In order to avoid the complication of interpreting steady-state gene expression profiles, neuRNA-seq was performed after either CP190 or M1BP depletion in order to measure newly synthesized transcripts. Intriguingly, nascent RNA expression profiles of CP190 or M1BP-depleted cells showed a remarkably high level of correlation. Since both M1BP and CP190 are particularly associated with promoters of genes that require either factor for adequate expression, it is likely that both factors mainly function in transcriptional activation rather than repression. M1BP was previously shown to activate transcriptionally paused genes, and the ATAC-seq analysis of M1BP-depleted cells supports this conclusion and further demonstrates that M1BP promotes chromatin accessibility surrounding the TSS. Interestingly, depletion of CP190 has no effect on promoter accessibility. Furthermore, this study found that CP190 is specifically required for Motif 1-dependent expression of two previously characterized ribosomal protein genes. Whether TRF2 recruitment to Motif 1-containing promoters is affected by CP190 as well as the precise mechanism by which CP190 contributes to M1BP-dependent transcription will be important topics of further study (Bag, 2021).
This study found that genes that require M1BP and CP190 for adequate expression are frequently located at TAD borders, as are both proteins. It has previously been proposed that constitutively active transcription, particularly at/near TAD borders (also referred to as 'compartmental domains'), may be a defining or at least key feature of overall genome organization in cells and throughout development. CP190 was previously shown to be associated particularly with Motif 1-containing promoters (Rach, 2011), and CP190 was observed to be specifically enriched at TAD borders. Recently, Motif 1 was also found to be an enriched sequence at TAD borders present in Kc cells (Ramirez, 2018) and is apparent when TAD borders first appear in embryonic development, findings consistent with these previous studies. A previous study of M1BP involvement in genome organization provided limited evidence using Hi-C to suggest that chromosome intermingling may be increased after M1BP depletion (Ramirez, 2018). However, the extent and duration of M1BP depletion in that study caused a major disruption of cell cycle and cellular growth, thus obscuring interpretation of those results. No changes were observed in CT intermingling in Oligopaint FISH experiments in M1BP-depleted G1 cells, nor was any difference observed in the distance between two distant regions on the same chromosome. Therefore, large-scale changes in genome organization were not observed after M1BP depletion (Bag, 2021).
Increased local genome compaction was observed after M1BP depletion. This finding led to a test of whether TAD borders may be specifically disrupted, perhaps leading to fusion of neighboring TADs. Because Oligopaint FISH probes are limited to a minimum of 30 kb, this study was restricted to intra-TAD analysis of larger TADs, which are typically lower in transcriptional activity (PcG, inactive, null). Increased compaction was found to occur both across TAD borders and within large TADs after M1BP depletion. These effects occur in the vicinity of altered local transcription and reduced chromatin accessibility particularly near TAD borders, suggesting that M1BP-dependent transcriptional changes might alter local chromatin structure that culminates in changes in genome compaction in surrounding regions not restricted to TAD borders. However, the extent of reduced chromatin accessibility observed in M1BP-depleted cells by ATAC-seq is modest relative to the genomic sequence space interrogated by FISH, suggesting that loss of accessibility likely does not directly explain increased local genome compaction. In contrast, CP190 depletion affected transcription to a similar degree, although not identically, yet did not result in changes in local compaction or extensive loss of chromatin accessibility genome-wide. These differences perhaps reflect the multifunctional nature of CP190 as a universal insulator protein contributing to opposing forces, or alternatively, effects on transcription in M1BP-depleted cells may be functionally unrelated to increased genome compaction. This work shows the requirement of M1BP for accurate CP190 binding throughout the genome as well as gypsy-dependent chromatin insulator activity and nuclear localization through interaction with CP190 and other core insulator proteins. 
Overall, the results provide evidence that M1BP and CP190 at TAD borders and perhaps their ability to activate transcription of constitutively expressed genes located at TAD borders, combined with the capacity of M1BP to promote chromatin accessibility, may play a role in genome organization (Bag, 2021).
M1BP has been shown to activate transcription of genes at which RNA Pol II is transiently paused in the promoter proximal region, and these promoter regions may themselves possess insulator activity. Intriguingly, a previous study showed that stalled Hox promoters, including Motif 1-containing Abd-B, possess intrinsic enhancer blocking insulator activity. However, the second paused promoter identified in that study, Ubx, does not harbor Motif 1; thus, M1BP may not necessarily be involved in the enhancer blocking activity of all stalled promoters. Recently, it was shown that M1BP promoter binding can prime the recruitment of the Hox protein Abd-A to the promoter in order to release paused Pol II and activate transcription (Chopra, 2009). This study found that M1BP is similarly required for CP190 recruitment at a large number of sites throughout the genome and thus proposes that M1BP and CP190 are together required to maintain active gene expression near TAD borders. Active transcription and increased accessibility at these sites may be needed for higher order chromatin organization such as TAD insulation, formation of active compartmental domains, and/or proper local genome structure. Future studies will elucidate the precise mechanisms by which M1BP, CP190, and transcription contribute to higher order chromatin organization (Bag, 2021).
The Drosophila bithorax complex (BX-C) is one of the best model systems for studying the role of boundaries (insulators) in gene regulation. Expression of three homeotic genes, Ubx, abd-A, and Abd-B, is orchestrated by nine parasegment-specific regulatory domains. These domains are flanked by boundary elements, which function to block crosstalk between adjacent domains, ensuring that they can act autonomously. Paradoxically, seven of the BX-C regulatory domains are separated from their gene target by at least one boundary, and must "jump over" the intervening boundaries. To understand the jumping mechanism, the Mcp boundary was replaced with Fab-7 and Fab-8. Mcp is located between the iab-4 and iab-5 domains, and defines the border between the set of regulatory domains controlling abd-A and Abd-B. When Mcp is replaced by Fab-7 or Fab-8, they direct the iab-4 domain (which regulates abd-A) to inappropriately activate Abd-B in abdominal segment A4. For the Fab-8 replacement, ectopic induction was only observed when it was inserted in the same orientation as the endogenous Fab-8 boundary. A similar orientation dependence for bypass activity was observed when Fab-7 was replaced by Fab-8. Thus, boundaries perform two opposite functions in the context of BX-C: they block crosstalk between neighboring regulatory domains, but at the same time actively facilitate long distance communication between the regulatory domains and their respective target genes (Postaka, 2018).
Boundaries flanking the Abd-B regulatory domains must block crosstalk between adjacent regulatory domains but at the same time allow more distal domains to jump over one or more intervening boundaries and activate Abd-B expression. While several models have been advanced to account for these two paradoxical activities, replacement experiments argued that both must be intrinsic properties of the Abd-B boundaries. Thus Fab-7 and Fab-8 have blocking and bypass activities in Fab-7 replacement experiments, while heterologous boundaries including multimerized dCTCF sites and Mcp from BX-C do not. One idea is that Fab-7 and Fab-8 are simply 'permissive' for bypass. They allow bypass to occur, while boundaries like multimerized dCTCF or Mcp are not permissive in the context of Fab-7. Another is that they actively facilitate bypass by directing the distal Abd-B regulatory domains to the Abd-B promoter. Potentially consistent with an 'active' mechanism that involves boundary pairing interactions, the bypass activity of Fab-8 and to a lesser extent Fab-7 is orientation dependent (Postaka, 2018).
The studies reported here tested these two models further. For this purpose the Mcp boundary was used for in situ replacement experiments. Mcp defines the border between the regulatory domains that control expression of abd-A and Abd-B. In this location, it is required to block crosstalk between the flanking domains iab-4 and iab-5, but it does not need to mediate bypass. In this respect, it differs from the boundaries that are located within the set of regulatory domains that control either abd-A or Abd-B, as these boundaries must have both activities. If bypass were simply passive, insertion of a 'permissive' Fab-7 or Fab-8 boundary in either orientation in place of Mcp would be no different from insertion of a generic 'non-permissive' boundary such as multimerized dCTCF sites. Assuming that Fab-7 and Fab-8 can block crosstalk out of context, they should fully substitute for Mcp. In contrast, if bypass in the normal context involves an active mechanism in which more distal regulatory domains are brought to the Abd-B promoter, then Fab-7 and Fab-8 replacements might also be able to bring iab-4 to the Abd-B promoter in a configuration that activates transcription. If they do so, then this process would be expected to show the same orientation dependence as is observed for bypass of the Abd-B regulatory domains in Fab-7 replacements (Postaka, 2018).
Consistent with the idea that a boundary located at the border between the domains that regulate abd-A and Abd-B need not have bypass activity, it was found that multimerized binding sites for the dCTCF protein fully substitute for Mcp. Like the multimerized dCTCF sites, Fab-7 and Fab-8 are also able to block crosstalk between iab-4 and iab-5. In the case of Fab-7, its blocking activity is incomplete and there are small clones of cells in which the mini-y reporter is activated in A4. In contrast, the blocking activity of Fab-8 is comparable to that of the multimerized dCTCF sites and the mini-y reporter is off throughout A4. One plausible reason for this difference is that Mcp and the boundaries flanking Mcp (Fab-4 and Fab-6) utilize dCTCF as does Fab-8, while this architectural protein does not bind to Fab-7 (Postaka, 2018).
Importantly, in spite of their normal (or near normal) ability to block crosstalk, both boundaries still perturb Abd-B regulation. In the case of Fab-8, the misregulation of Abd-B is orientation dependent just like the bypass activity of this boundary when it is used to replace Fab-7. When inserted in the reverse orientation, Fab-8 behaves like multimerized dCTCF sites and it fully rescues the Mcp deletion. In contrast, when inserted in the forward orientation, Fab-8 induces the expression of Abd-B in A4 (PS9), and the misspecification of this parasegment. Unlike classical Mcp deletions or the McpPRE replacement described in this study, expression of the Abd-B gene in PS9 is driven by iab-4, not iab-5. This conclusion is supported by two lines of evidence. First, the mini-y reporter inserted in iab-5 is off in PS9 cells indicating that iab-5 is silenced by PcG factors as it should be in this parasegment. Second, the ectopic expression of Abd-B is eliminated when the iab-4 regulatory domain is inactivated (Postaka, 2018).
These results, taken together with previous studies, support a model in which the chromatin loops formed by Fab-8 inserted at Mcp in the forward orientation bring the enhancers in the iab-4 regulatory domain in close proximity to the Abd-B promoter, leading to the activation of Abd-B in A4 (PS9). In contrast, when Fab-8 is inserted in the opposite orientation, the topology of the chromatin loops formed by the ectopic boundary is not compatible with productive interactions between iab-4 and the Abd-B promoter. Moreover, it would appear that boundary bypass for the regulatory domains that control Abd-B expression is not a passive process in which the boundaries are simply permissive for interactions between the regulatory domains and the Abd-B promoter. Instead, it seems to be an active process in which the boundaries are responsible for bringing the regulatory domains into contact with the Abd-B gene. It also seems likely that the bypass activity of Fab-8 (and also Fab-7) has a built-in preference, namely that it is targeted for interactions with the Abd-B gene. This idea would fit with transgene bypass experiments, which showed that both Fab-7 and Fab-8 interacted with an insulator-like element upstream of the Abd-B promoter, AB-I, while the Mcp boundary did not (Postaka, 2018).
Similar conclusions can be drawn from the induction of Abd-B expression in A4 (PS9) when Fab-7 is inserted in place of Mcp. Like Fab-8, this boundary inappropriately targets the iab-4 regulatory domain to Abd-B. Unlike Fab-8, Abd-B is ectopically activated when Fab-7 is inserted in both the forward and reverse orientations. While the effects are milder in the reverse orientation, the lack of pronounced orientation dependence is consistent with experiments in which Fab-7 was inserted at its endogenous location in the reverse orientation. Unlike Fab-8 only very minor iab-6 bypass defects were observed. In addition to the activation of Abd-B in A4 (PS9) the Fab-7 Mcp replacements also alter the pattern of Abd-B regulation in more posterior segments. In the forward orientation, A4 and A5 are transformed towards an A6 identity, while A6 is also misspecified. Similar though somewhat less severe effects are observed in these segments when Fab-7 is inserted in the reverse orientation. At this point the mechanisms responsible for these novel phenotypic effects are uncertain. One possibility is that pairing interactions between the Fab-7 insert and the endogenous Fab-7 boundary disrupt the normal topological organization of the regulatory domains in a manner similar to that seen in boundary competition transgene assays. An alternative possibility is that Fab-7 targets iab-4 to the Abd-B promoter not only in A4 (PS9) but also in cells in A5 (PS10) and A6 (PS11). In this model, Abd-B would be regulated not only by the domain that normally specifies the identity of the parasegment (e.g., iab-5 in PS10), but also by interactions with iab-4. This dual regulation would increase the levels of Abd-B, giving the weak GOF phenotypes. Potentially consistent with this second model, inactivating iab-4 in the McpF8 replacement not only rescues the A4 (PS9) GOF phenotypes but also suppresses the loss of anterior trichomes in the A6 tergite (Postaka, 2018).
Insulators play important roles in genome structure and function in eukaryotes. Interactions between a DNA binding insulator protein and its interacting partner proteins define the properties of each insulator site. The different roles of insulator protein partners in the Drosophila genome and how they confer functional specificity remain poorly understood. The Suppressor of Hairy wing [Su(Hw)] insulator is targeted to the nuclear lamina, preferentially localizes at euchromatin/heterochromatin boundaries, and is associated with the gypsy retrotransposon. Insulator activity relies on the ability of the Su(Hw) protein to bind the DNA at specific sites and interact with Mod(mdg4)67.2 and CP190 partner proteins. HP1 and insulator partner protein 1 (HIPP1) is a partner of Su(Hw), but how HIPP1 contributes to the function of Su(Hw) insulator complexes is unclear. This study demonstrates that HIPP1 colocalizes with the Su(Hw) insulator complex in polytene chromatin and in stress-induced insulator bodies. The overexpression of either HIPP1 or Su(Hw) or mutation of the HIPP1 crotonase-like domain (CLD) causes defects in cell proliferation by limiting the progression of DNA replication. This study also showed that HIPP1 overexpression suppresses the Su(Hw) insulator enhancer-blocking function, while mutation of the HIPP1 CLD does not affect Su(Hw) enhancer blocking. These findings demonstrate a functional relationship between HIPP1 and the Su(Hw) insulator complex and suggest that the CLD, while not involved in enhancer blocking, influences cell cycle progression (Stow, 2022).
Mounting evidence implicates liquid-liquid phase separation (LLPS), the condensation of biomolecules into liquid-like droplets, in the formation and dissolution of membraneless intracellular organelles (MLOs). Cells use MLOs or condensates for various biological processes, including emergency signaling and spatiotemporal control over steady-state biochemical reactions and heterochromatin formation. Insulator proteins are architectural elements involved in establishing independent domains of transcriptional activity within eukaryotic genomes. In Drosophila, insulator proteins form nuclear foci known as insulator bodies in response to osmotic stress. However, the mechanism through which insulator proteins assemble into bodies is yet to be investigated. This study identified signatures of LLPS by insulator bodies, including high disorder tendency in insulator proteins, scaffold-client-dependent assembly, extensive fusion behavior, sphericity, and sensitivity to 1,6-hexanediol. The cohesin subunit Rad21 is shown to be a component of insulator bodies, adding to the known insulator protein constituents and γH2Av. These data suggest a concerted role of cohesin and insulator proteins in insulator body formation and under physiological conditions. A mechanism is proposed whereby these architectural proteins modulate 3D genome organization through LLPS (Amankwaa, 2022).
The Drosophila Boundary Element-Associated Factor of 32 kDa (BEAF) binds in promoter regions of a few thousand mostly housekeeping genes. This study shows that BEAF physically interacts with the polybromo subunit (Pbro) of PBAP, a SWI/SNF-class chromatin remodeling complex. BEAF also shows genetic interactions with Pbro and other PBAP subunits. The effect of this interaction on gene expression and chromatin structure was examined using precision run-on sequencing and micrococcal nuclease sequencing after RNAi-mediated knockdown in cultured S2 cells. The results are consistent with the interaction playing a subtle role in gene activation. Fewer than 5% of BEAF-associated genes were significantly affected after BEAF knockdown. Most were downregulated, accompanied by fill-in of the promoter nucleosome-depleted region and a slight upstream shift of the +1 nucleosome. Pbro knockdown caused downregulation of several hundred genes and showed a correlation with BEAF knockdown but a better correlation with promoter-proximal GAGA factor binding. Micrococcal nuclease sequencing supports that BEAF binds near housekeeping gene promoters while Pbro is more important at regulated genes. Yet there is a similar general but slight reduction of promoter-proximal pausing by RNA polymerase II and increase in nucleosome-depleted region nucleosome occupancy after knockdown of either protein. The possibility is discussed of redundant factors keeping BEAF-associated promoters active and masking the role of interactions between BEAF and the Pbro subunit of PBAP in S2 cells. Facilitates Chromatin Transcription (FACT) and Nucleosome Remodeling Factor (NURF) were identified as candidate redundant factors (McKowen, 2022).
The spatial organization of chromatin at the scale of topologically associating domains (TADs) and below displays large cell-to-cell variations. Up until now, how this heterogeneity in chromatin conformation is shaped by chromatin condensation, TAD insulation, and transcription has remained mostly elusive. This study used Hi-M, a multiplexed DNA-FISH imaging technique providing developmental timing and transcriptional status, to show that the emergence of TADs at the ensemble level of the doc locus partially segregates the conformational space explored by single nuclei during the early development of Drosophila embryos. Surprisingly, a substantial fraction of nuclei display strong insulation even before TADs emerge. Moreover, active transcription within a TAD leads to minor changes to the local inter- and intra-TAD chromatin conformation in single nuclei and only weakly affects insulation to the neighboring TAD. Overall, these results indicate that multiple parameters contribute to shaping the chromatin architecture of single nuclei at the TAD scale (Gotz, 2022).
DNA within chromosomes in the nucleus is non-randomly organized into chromosome territories, compartments and topologically associated domains (TADs). Chromosomal rearrangements have the potential to alter chromatin organization and modify gene expression, leading to selection against these structural variants. Drosophila pseudoobscura has a wealth of naturally occurring gene arrangements that were generated by overlapping inversion mutations caused by two chromosomal breaks that rejoin the central region in reverse order. Unlike humans, Drosophila inversion heterozygotes do not have negative effects associated with crossing over during meiosis because males use achiasmate mechanisms for proper segregation, and aberrant recombinant meiotic products generated in females are lost in polar bodies. As a result, Drosophila populations are found to harbour extensive inversion polymorphisms. It is not clear, however, whether chromatin architecture constrains which inversion breakpoints persist in populations. This study mapped the breakpoints of seven inversions in D. pseudoobscura to the TAD map to determine if persisting inversion breakpoints are more likely to occur at boundaries between TADs. The results show that breakpoints occur at TAD boundaries more than expected by chance. Some breakpoints may alter gene expression within TADs, supporting the hypothesis that position effects contribute to inversion establishment (Wright, 2022).
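The comparison "more than expected by chance" can be illustrated with a simple permutation test. The sketch below is not the method used by Wright (2022), which is not described here; it assumes a single linear chromosome, uniformly random placement of breakpoints as the null model, and a hypothetical tolerance window around each TAD boundary. All function names, coordinates, and parameter values are invented for illustration.

```python
import random


def count_near_boundary(positions, boundaries, tolerance):
    """Count positions lying within `tolerance` bp of any boundary."""
    return sum(
        any(abs(p - b) <= tolerance for b in boundaries)
        for p in positions
    )


def boundary_enrichment_pvalue(breakpoints, boundaries, chrom_length,
                               tolerance=5_000, n_perm=2_000, seed=0):
    """One-sided permutation test for breakpoint enrichment at boundaries.

    Null model (an assumption): each breakpoint falls uniformly at random
    along a chromosome of `chrom_length` bp.
    """
    rng = random.Random(seed)
    observed = count_near_boundary(breakpoints, boundaries, tolerance)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        simulated = [rng.randrange(chrom_length) for _ in breakpoints]
        if count_near_boundary(simulated, boundaries, tolerance) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


# Synthetic coordinates, not data from the study.
toy_boundaries = [100_000, 300_000, 500_000, 700_000, 900_000]
toy_breakpoints = [101_000, 299_000, 502_000, 698_000, 450_000]
observed, p_value = boundary_enrichment_pvalue(
    toy_breakpoints, toy_boundaries, chrom_length=1_000_000
)
print(observed, p_value)
```

A real analysis would need a null model that respects chromosome arms, mappability, and the fact that each inversion contributes two linked breakpoints; the uniform null here is only the simplest possible choice.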
Past studies offer contradictory claims for the role of genome organization in the regulation of gene activity. This study shows through high-resolution chromosome conformation analysis that the Drosophila genome is organized by two independent classes of regulatory sequences, tethering elements and insulators. Quantitative live imaging and targeted genome editing demonstrate that this two-tiered organization is critical for the precise temporal dynamics of Hox gene transcription during development. Tethering elements mediate long-range enhancer-promoter interactions and foster fast activation kinetics. Conversely, the boundaries of topologically associating domains (TADs) prevent spurious interactions with enhancers and silencers located in neighboring TADs. These two levels of genome organization operate independently of one another to ensure precision of transcriptional dynamics and the reliability of complex patterning processes (Batut, 2022).
Genome organization is emerging as a potentially important facet of gene regulation. Because transcriptional enhancers often reside far from their target promoters, chromatin folding may guide the timely and specific establishment of regulatory interactions. Although long-range enhancer-promoter contacts are prevalent, it remains unclear whether they actually determine transcriptional activity. Boundary elements partition chromosomes into topologically associating domains (TADs), whose importance for gene regulation remains controversial. There is also an unresolved dichotomy between elements that promote and prevent enhancer-promoter interactions, because CTCF binding sites have been implicated in both. This study shows that distinct classes of regulatory elements mediate these opposing functions genome-wide: Dedicated tethering elements foster appropriate enhancer-promoter interactions and are key to fast activation kinetics, whereas insulators prevent spurious interactions and regulatory interference between neighboring TADs (Batut, 2022).
This study characterized genome organization at single-nucleosome resolution in developing Drosophila embryos using Micro-C. Focus was placed on the critical ~60-min period preceding gastrulation, when the fate map of the embryo is established by localized transcription of a cascade of patterning genes, culminating with the Hox genes that specify segment identity. Analysis of the Antennapedia gene complex (ANT-C), one of two Hox gene clusters and an archetype of regulatory precision, reveals an intricate hierarchical organization. Insulators partition the locus into a series of TADs, whereas tethering elements mediate specific intra-TAD focal contacts between promoters of Scr and Antp and their distal regulatory regions (Batut, 2022).
The Sex combs reduced (Scr) gene, contained within a 90-kb TAD, is regulated by an early embryonic enhancer (Scr EE) located 35 kb upstream of the promoter. This enhancer bypasses an intervening TAD that contains ftz, a highly expressed pair-rule gene, to selectively regulate Scr transcription. A distal tethering element (DTE) situated 6 kb upstream of the enhancer anchors a focal contact with a promoter-proximal tether. These tethering elements correspond to sequences previously shown by reporter assays to modulate enhancer-promoter selectivity. The DTE lacks any intrinsic enhancer activity, suggesting a specific role in fostering long-range enhancer-promoter interactions (Batut, 2022).
Similarly, the Antennapedia (Antp) P1 early enhancer is associated with a DTE directly adjacent to it, which forms a focal interaction with a tethering element near the P1 promoter, 38 kb away. Upon deletion of the DTE, the focal interaction is lost, and enhancer-promoter interactions are disrupted. Antp activation is substantially delayed but transcription levels in active nuclei are normal, and transcription appears to fully recover after this initial lag (Batut, 2022).
These observations show that DTEs specifically determine the dynamics of transcriptional activation in development. This temporal precision may be critical for the programming of cellular identities within stringent developmental windows. It is proposed that tethering elements foster physical interactions between promoters and remote enhancers to prime genes for rapid activation; they may also modulate other aspects of enhancer-promoter communication through interactions with core transcription complexes (Batut, 2022).
In addition to fostering preferential associations with target promoters, DTEs also suppress 'backward' interactions of associated enhancers with distal regions of their TADs. Both effects probably synergize to increase the specificity of enhancer-promoter communication. Although DTE deletions have a strong impact on local genome organization, they have little effect on the overall structure of TADs, suggesting that insulators and tethering elements operate largely independently of one another. To better understand the relationship between long-range enhancer-promoter interactions and TAD structures, this study systematically disrupted each of the TAD boundaries across the Dfd-Scr-Antp interval (Batut, 2022).
Deletion of the Dfd 3' insulator causes a wholesale fusion of the Dfd TAD with the adjacent miR-10 TAD and reduces transcription of the Dfd gene. Notably, it does not appear to weaken interactions between the Dfd promoter and enhancer, suggesting that TAD boundaries play no role in fostering appropriate regulatory interactions. Rather, the 3' insulator specifically prevents inappropriate contacts with the miR-10 regulatory region (Batut, 2022).
The disruption of Scr TAD boundaries is also consistent with this model. Deletion of the Scr 3' insulator is recessive lethal, probably because of the loss of essential 7SL genes, and could not be analyzed by Micro-C. But a targeted deletion of the Antp 3' intronic insulator is viable and causes a partial fusion of the Scr and Antp P2 TADs. The persistence of a residual boundary can be explained by the presence of a secondary insulator located ~4 kb away. Deletion of either Scr TAD boundary severely reduces Scr transcription. Notably, disruption of the Scr-Antp boundary does not weaken the interaction of the DTE with the Scr promoter, suggesting that reduced Scr expression is not due to diminished enhancer-promoter interactions. This partial fusion of the Scr and Antp P2 TADs has, at most, only a marginal impact on Antp transcription, revealing that boundary deletions can have sharply asymmetric regulatory effects on flanking TADs (Batut, 2022).
Because TAD boundary deletions do not alter appropriate enhancer-promoter interactions, an alternative explanation was sought for reduced Scr transcription arising from disruptions of the ftz TAD. SF1 removal exposes the Scr promoter to interactions with the ftz regulatory region, which may thus directly interfere with Scr transcription. By contrast, SF2 removal allows ftz regulatory sequences to interact with the EE enhancer, but not directly with the Scr promoter, which may explain its more subtle transcriptional impact. In the absence of SF1, the severely narrowed Scr domain and distinctive ectopic stripes suggest both activation and silencing by ftz enhancers. A prime suspect for this altered expression pattern is the AE1 enhancer, which binds both activators and the Hairy repressor. Indeed, the AE1 element functions as a potent silencer within the Scr expression domain, and Scr transcription faithfully mirrors AE1 activity upon SF1 removal. It is concluded that the primary function of insulators is to prevent regulatory interference between TADs, and this can explain even surprising quantitative differences in the transcriptional effects of boundary deletions (Batut, 2022).
To assess the functional importance of tethering elements and insulators, this study analyzed the number of teeth on the sex combs of adult males, a quantitative phenotype under sexual selection governed by Scr expression. All relevant deletions reduce the average number of teeth, and the magnitude of the transcriptional defects is highly predictive of the severity of the morphological phenotypes. These observations demonstrate the importance of genome structure for the control of transcriptional dynamics and the precision of developmental patterning (Batut, 2022).
Taken together, these observations support a general model in which genome organization canalizes regulatory interactions through two classes of organizing elements with diametrically opposing functions. A dedicated class of tethering elements, often physically distinct from enhancers, foster enhancer-promoter interactions and are key to fast transcriptional activation kinetics during development. It is anticipated that similar mechanisms will prove to be an important property of vertebrate genomes, where large distances often separate genes from their regulatory sequences. By contrast, TAD boundaries have a pervasive role in enforcing regulatory specificity by preventing interference between neighboring TADs (Batut, 2022).
Although prior studies have emphasized the spatial regulation of gene expression, temporal dynamics have proven far more elusive. Quantitative measurements in live embryos revealed clear delays in the onset of transcription upon deletion of tethering elements. The Trl protein, which binds most of these sequences, has been proposed to act as a DNA looping factor. It is suggested that tethering elements 'jump-start' expression by establishing enhancer-promoter loops before activation, though it is likely that they also serve a broader function. Indeed, it is intriguing that the Scr DTE coincides with a classical Polycomb response element. This is consistent with a possible role for Polycomb repressive complex 1 (PRC1) components in the establishment of enhancer-promoter loops and suggests that focal contacts constitute a versatile topological infrastructure used by a variety of regulatory mechanisms. This study shows that genome organization shapes transcription dynamics through two complementary mechanisms: Tethering elements foster appropriate enhancer-promoter interactions, whereas TAD boundaries prevent inappropriate associations (Batut, 2022).
The prevailing view of metazoan gene regulation is that individual genes are independently regulated by their own dedicated sets of transcriptional enhancers. Past studies have reported long-range gene-gene associations, but their functional importance in regulating transcription remains unclear. This study used quantitative single-cell live imaging methods to demonstrate co-dependent transcriptional dynamics of genes separated by large genomic distances in living Drosophila embryos. Extensive physical and functional associations of distant paralogous genes were found, including co-regulation by shared enhancers and co-transcriptional initiation over distances of nearly 250 kilobases. Regulatory interconnectivity depends on promoter-proximal tethering elements, and perturbations in these elements uncouple transcription and alter the bursting dynamics of distant genes, suggesting a role of genome topology in the formation and stability of co-transcriptional hubs. Transcriptional coupling is detected throughout the fly genome and encompasses a broad spectrum of conserved developmental processes, suggesting a general strategy for long-range integration of gene activity (Levo, 2022).
Gene regulation is thought to fundamentally differ in prokaryotes and eukaryotes. In the former, tightly clustered genes engaged in a common process are regulated by a shared switch located near the core promoter (e.g., bacterial operons). This type of organization facilitates coordinated transcriptional responses to different environmental stimuli. In higher eukaryotes, individual genes are regulated by multiple enhancers scattered across large genomic distances to produce complex profiles of expression. However, eukaryotic genomes abound with divergent duplicated genes (aka paralogs) that are engaged in common developmental and cellular processes and display overlapping patterns of expression in time and space. These genes are sometimes found in close linear proximity, but are more commonly separated by large distances (20 kb to 250 kb or more). This study explored the possibility that such genes are regulated by shared switches, despite their genomic separation (Levo, 2022).
A surprisingly large fraction of cell fate specification genes in the developing fly embryo are organized as pairs or triplets of distal genes that exhibit overlapping spatiotemporal patterns of expression. Micro-C chromosome conformation capture assays performed during the critical period of cell fate specification (2-3 hrs after fertilization) revealed extensive connectivity between the promoter regions of these genes. Automated analysis of whole genome Micro-C maps identified ~200 long-range focal contacts (i.e., high connectivity between noncontiguous DNA sequences), with nearly half corresponding to promoter-promoter associations (Levo, 2022).
Most of these promoter-promoter contacts correspond to paralogous genes, while a smaller number correspond to widely separated alternative promoters for individual genes. The former class of interconnected genes include a variety of segmentation genes, such as the gap genes knirps-related (knrl)/knirps (kni), the pair-rule genes sloppy-paired 1/2, and the segment polarity genes engrailed/invected. Many dorsal-ventral patterning genes also display this organization, including Dorsocross1/2/3, thisbe/pyramus and scylla (scyl)/charybde (chrb). Interconnected paralogs are also seen for regulatory genes controlling a variety of developmental processes at later stages of the life cycle including neurogenesis and the morphogenesis of adult appendages (e.g., Sox21/Dichaete and bric-a-brac1/2) (Levo, 2022).
This study was able to identify putative shared enhancers for over three-fourths of the inter-connected paralogs displaying overlapping patterns of expression. These enhancers reside in regions of open chromatin and map within 20kb of one of the gene pairs (or trios). In some cases multiple shared enhancers appear to function in an additive pattern to produce composite co-expression profiles, as seen for the segmentation genes slp1 and slp2. It is estimated that 30% of segmentation genes, and at least 11% of all genes showing localized expression in the early embryo, contain distant interconnected paralogs. This long-range coupling challenges the current view of eukaryotic gene regulation, whereby individual genes are controlled by their own dedicated sets of enhancers (Levo, 2022).
To explore the possibility that distant paralogs are coordinately regulated by shared enhancers, a comprehensive analysis was conducted of knrl/kni and scyl/chrb, which are regulated by two of the major patterning systems in early embryos, Bicoid (anterior-posterior) and BMP signaling (dorsoventral), respectively. They also possess both common and distinctive properties, such as similarities in overall organization but widely differing genomic distances, 74kb for knrl/kni and 235kb for scyl/chrb. To investigate co-transcriptional gene activity in time and space, this study employed live single cell transcription imaging. Stem loops were inserted into the respective endogenous transcription units using CRISPR-targeted genome editing. Importantly, homozygous fly lines containing these stem loops are viable, suggesting little impact on the normal activities of the host genes. Simultaneous live transcription imaging in 2-3 hr embryos reveals overlapping expression patterns and concordant activities within individual nuclei (Levo, 2022).
Quantitative analysis of individual nuclei identified physical proximity of co-expressed transcription foci. Consistent with previously documented distances of ~350nm for long range enhancer-promoter interactions, this study found that knrl and kni are separated by a mean distance of ~320nm, while the more distantly mapping scyl and chrb foci are separated by ~470nm. Nonetheless, these distances are significantly smaller than those seen for uncoupled control genes, both at the population level and for individual nuclei tracked over time (scyl/chrb vs chrb/CG11652). Strikingly, this study detected co-occurring transcriptional initiation events within a time scale of ~90 seconds for both knrl/kni (74kb) and scyl/chrb (235kb). A higher frequency of knrl and kni co-initiation events was observed when the two genes are linked in cis as compared with a trans-homolog arrangement. More generally, both gene pairs show higher frequencies of co-initiation as compared with randomized controls. These observations suggest interconnectivity in the transcriptional dynamics of distant genes (Levo, 2022).
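The comparison of co-initiation frequencies against randomized controls can be illustrated with a minimal sketch. It assumes that each nucleus contributes one initiation time per gene (in seconds). The function names, the default 90-second window, and the shuffling scheme used for the control are illustrative assumptions and do not reproduce the analysis pipeline of the study.

```python
import random

def co_initiation_fraction(times_a, times_b, window=90.0):
    """Fraction of nuclei in which both genes initiate within `window` seconds."""
    pairs = list(zip(times_a, times_b))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if abs(a - b) <= window)
    return hits / len(pairs)

def randomized_control(times_a, times_b, window=90.0, n_shuffles=1000, seed=0):
    """Mean co-initiation fraction after shuffling which nucleus each gene-B
    time is assigned to (hypothetical control, breaks the within-nucleus pairing)."""
    rng = random.Random(seed)
    shuffled = list(times_b)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += co_initiation_fraction(times_a, shuffled, window)
    return total / n_shuffles
```

With paired initiation times from the same nuclei, an observed fraction well above the shuffled value would be the kind of signal described above as a higher frequency of co-initiation than in randomized controls.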
A combination of genome editing, Micro-C contact maps and quantitative live imaging was used to explore the basis for transcriptional co-activation of knrl/kni and scyl/chrb. Shared enhancers driving localized patterns of expression common to each gene pair were first identified; focus was placed on a shared anterior stripe enhancer located upstream of knrl and a shared dorsal midline enhancer located upstream of scyl. For the newly identified anterior stripe enhancer, a targeted deletion provides direct evidence that it regulates the distal kni gene in addition to proximal knrl. Mutant embryos exhibit a loss of both expression patterns in the anterior stripe, and deficiency homozygotes are lethal (Levo, 2022).
The Micro-C maps provide sufficient resolution to distinguish the shared enhancers from the sequences directly underlying long-range focal contacts between gene pairs. The latter sequences contain a distinctive signature of transcription factors (TFs), including Trithorax-like/GAF, CLAMP, and Ph, seen across all interconnected genes. Based on the binding peaks of these TFs within distinct regions of open chromatin, it was possible to subdivide these sequences into a series of discrete elements that are hereafter designated 'tethering elements'. It is postulated that these elements contribute to physical and functional associations between the promoter regions of interconnected genes. Notably, they do not bind CTCF, although binding is detected in the vicinity of the tethering elements proximal to knrl and scyl. Additionally, tethering elements do not show enhancer activities when attached to reporter genes and tested in transgenic embryos. Targeted replacements of tethering elements (hereafter 'removal') resulted in severely diminished contacts with distal genes, yet did not significantly alter either of the corresponding TADs. Next, the transcriptional consequences of removing different tethering elements were considered, beginning with knrl/kni (Levo, 2022).
Removal of the knrl tethering elements resulted in a severe loss of knrl expression, likely due to local effects on promoter function, possibly involving previously established roles of GAF/Trl. More surprisingly, a significant reduction was also observed in kni transcription, 74kb away. A loss of kni activity in the anterior stripe is also seen upon a reciprocal removal of the kni tethering element, although expression in posterior regions governed by kni-proximal enhancers is retained. The targeted removal of the knrl tethering elements does not alter the enhancer sequence, but nonetheless causes a severe loss in viability, approaching the phenotype observed upon removing the enhancer. This phenotype is probably due to reduced kni transcription since deletion of the knrl transcription start site (TSS) produces milder effects. Moreover, diminished viability associated with a large deletion in knrl that removes the shared enhancer, tethering elements, TSS and 5' coding regions, is rescued by inserting the anterior stripe enhancer upstream of kni. This insertion also rescues the loss in transcription that occurs when the kni tethering element is removed. These observations point to a role of promoter-proximal tethering elements in tuning the co-activation of knrl/kni by the shared enhancer over large linear distances. This is supported by genetic complementation experiments, which indicate increased viability of the cis configuration of the shared enhancer and tethering elements as compared with the trans arrangement of regulatory elements (Levo, 2022).
To obtain a more detailed understanding of the nature of this long-range tuning, quantitative analysis of kni transcription was performed in individual nuclei of live embryos upon removal of knrl tethering elements. While there is only a minor diminishment in transcription levels within active nuclei, a significant reduction was observed in the number of instantaneously active nuclei. This loss appears to be stochastic within the normal limits of the anterior stripe, arising from both a pronounced delay in the onset of kni transcription and altered transcriptional bursting dynamics, with reduced durations of active (ON) periods of Pol II release. These observations suggest that enhancer-promoter communication is less stable upon removal of promoter-proximal tethering elements. This view is strengthened by the analysis of the scyl/chrb locus, where shared enhancers work over 'vertebrate-style' distances of nearly 250kb (Levo, 2022).
The organization of tethering elements in the 5' scyl regulatory region provided an opportunity to distinguish the activities of enhancer-proximal and promoter-proximal elements. As seen for knrl/kni, removal of both tethers results in a severe loss of scyl transcription, as well as a marked reduction in chrb transcription. There is only a modest effect on the levels of chrb transcription in active nuclei, but a massive diminishment in the number of instantaneously active nuclei. Only a third of the expected number of nuclei exhibit chrb transcription throughout the one-hour interval of analysis. Active nuclei display reduced ON periods, as seen for knrl/kni, but also extended OFF periods, possibly related to the significantly larger distance separating scyl and chrb. The removal of the enhancer-proximal tether results in a selective reduction of chrb transcription without significantly altering scyl transcription. This represents a significant decoupling in the co-transcriptional dynamics of scyl and chrb expression, with a reduced number of co-active nuclei at any given timepoint. These observations lend additional support to the proposal that tethering elements contribute to coordinated expression of distant paralogs (Levo, 2022).
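The quantities discussed above (the number of instantaneously active nuclei and the durations of ON and OFF periods) can be made concrete with a small sketch. It assumes that each nucleus has already been reduced to a binarized trace of 0 (OFF) and 1 (ON) values per imaging frame; the function names and the binarization itself are assumptions for illustration and are not taken from the study.

```python
from itertools import groupby

def active_fraction(traces, frame):
    """Fraction of nuclei that are ON at one frame (instantaneously active)."""
    return sum(trace[frame] for trace in traces) / len(traces)

def run_lengths(trace, state):
    """Lengths, in frames, of consecutive runs in which the trace equals `state`."""
    return [len(list(group)) for key, group in groupby(trace) if key == state]

def mean_run(traces, state):
    """Mean run length for `state` (1 = ON, 0 = OFF) pooled over all nuclei.
    Runs touching the start or end of a trace are truncated by the imaging
    window, so this simple estimate treats them as complete."""
    runs = [r for trace in traces for r in run_lengths(trace, state)]
    return sum(runs) / len(runs) if runs else 0.0
```

Under these assumptions, a genotype with shorter mean ON runs, longer mean OFF runs and a lower active fraction at each frame would match the pattern reported for chrb after tether removal.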
In summary, this study has presented evidence for coordinate regulation of distant genes by shared enhancers. Distant paralogs were shown to interact in 3D over large genomic distances through associations of discrete promoter-proximal tethering elements that underlie co-dependent transcriptional dynamics of the interconnected genes. The term 'topological operon' is proposed to highlight co-regulation by shared enhancers, evocative of the shared switches used by bacterial operons (Levo, 2022).
The co-transcriptional dynamics observed within topological operons are consistent with the occurrence of co-transcriptional hubs containing shared pools of transcriptional activators and Pol II. The large distances separating co-transcribing loci and the short timescales of co-initiation events could be manifestations of molecular crowding within shared transcriptional microenvironments. Further support stems from small deletions that impair transcription of the proximal gene and lead to an increase in the transcription of the distal gene (e.g., knrl TSS or scyl tether). These could reflect instances of promoter competition for shared but limiting transcriptional resources within a common hub (Levo, 2022).
While this study has emphasized co-activation, topological operons might also foster co-repression of interconnected genes in inactive tissues, since tethering elements often bind subunits of the PRC1 Polycomb complex. Furthermore, long-range connectivity within topological operons appears to afford a greater degree of regulatory flexibility than that permitted by polycistronic genes within bacterial operons. For example, kni is regulated in the presumptive abdomen by nearby enhancers that produce only weak and sporadic activation of knrl. Consistent with recent studies suggesting a general maintenance of long-range associations across tissues, this study found physical proximity of co-expressed transcription foci in the anterior stripe and abdominal domains. It is conceivable that even subtle changes in 3D organization are sufficient to mediate distinct modes of co-regulation in different tissues. This regulatory flexibility is also seen for other cases of long-range associations (e.g., globin and HoxD), and might reflect the greater demands imposed by complex cell types (Levo, 2022).
Topological operons account for a substantial fraction of gene activity in the early Drosophila embryo. They also account for a variety of developmental processes during later stages of the Drosophila life cycle. Many of these genes have known orthologs in vertebrates, including those regulating the patterning of the central nervous system (ac, D, en, ems), eye development (Vsx2), TOR signaling (scylla), cardiovascular development (H15) and morphogenesis of adult appendages (bab1/2) (Levo, 2022).
Several recent studies have uncovered widespread gene-gene associations in different human tissues, including distant paralogs. They share a strong correlation in chromatin modifications and are enriched for matching eQTLs, raising the possibility that they may be transcriptionally coupled as seen in this study. Identification of promoter-proximal tethering elements, distinct from enhancers, provides a new perspective for cross-regulatory influences of distant promoters. The contributions of tethering elements to long-range promoter coupling and enhancer-promoter interactions in Drosophila also provide a foundation for the characterization of comparable elements in vertebrates (Levo, 2022).
Topological operons might not be restricted to paralogous genes, and it remains to be seen whether they also interconnect unrelated genes encoding different components of common biological pathways, as seen for bacterial operons. It is anticipated that topological operons are likely to be a general feature of metazoan genomes, providing a strategy to integrate and coordinate the activities of distant regulatory genes engaged in complex cellular and developmental processes (Levo, 2022).
In higher eukaryotes, distant enhancer-promoter interactions are organized by topologically associated domains, tethering elements, and chromatin insulators/boundaries. While insulators/boundaries play a central role in chromosome organization, the mechanisms regulating their functions are largely unknown. This study has taken advantage of the well-characterized Drosophila bithorax complex (BX-C) to study one potential mechanism for controlling boundary function. The regulatory domains of BX-C are flanked by boundaries, which block crosstalk with their neighboring domains and also support long-distance interactions between the regulatory domains and their target gene. As many lncRNAs have been found in BX-C, this study asked whether readthrough transcription (RT) can impact boundary function. For this purpose, advantage was taken of two BX-C boundary replacement platforms, Fab-7(attP50) and F2(attP), in which the Fab-7 and Fub boundaries, respectively, are deleted and replaced with an attP site. Boundary elements, promoters, and polyadenylation signals arranged in different combinations were introduced and then assayed for boundary function. The results show that RT can interfere with boundary activity. Since lncRNAs represent a significant fraction of Pol II transcripts in multicellular eukaryotes, it is possible that RT may be a widely used mechanism to alter boundary function and regulation of gene expression (Kyrchanova, 2023).
Previous studies have identified topologically associating domains (TADs) as basic units of genome organization. This study presents evidence of a previously unreported level of genome folding, where distant TAD pairs, megabases apart, interact to form meta-domains. Within meta-domains, gene promoters and structural intergenic elements present in distant TADs are specifically paired. The associated genes encode neuronal determinants, including those engaged in axonal guidance and adhesion. These long-range associations occur in a large fraction of neurons but support transcription in only a subset of neurons. Meta-domains are formed by diverse transcription factors that are able to pair over long and flexible distances. Evidence is presented that two such factors, GAF and CTCF, play direct roles in this process. The relative simplicity of higher-order meta-domain interactions in Drosophila, compared with those previously described in mammals, allowed the demonstration that genomes can fold into highly specialized cell-type-specific scaffolds that enable megabase-scale regulatory associations (Mohana, 2023).
Insulators are architectural elements implicated in the organization of higher-order chromatin structures and transcriptional regulation. However, it is still unknown how insulators contribute to Drosophila telomere maintenance. Although the Drosophila telomeric retrotransposons HeT-A and TART occupy a common genomic niche, they are regulated independently. TART elements are believed to provide reverse transcriptase activity, whereas HeT-A transcripts serve as a template for telomere elongation. This study reports that insulator complexes associate with TART and contribute to its transcriptional regulation in the Drosophila germline. Chromatin immunoprecipitation revealed that the insulator complex containing BEAF32, Chriz, and DREF proteins occupies the TART promoter. BEAF32 depletion causes derepression and chromatin changes at TART in ovaries. Moreover, an expansion of TART copy number was observed in the genome of the BEAF32 mutant strain. BEAF32 localizes between the TART enhancer and promoter, suggesting that it blocks enhancer-promoter interactions. This study found that TART repression is released in the germ cysts as a result of the normal reduction of BEAF32 expression at this developmental stage. It is suggested that coordinated expression of telomeric repeats during development underlies telomere elongation control (Sokolova, 2023).
The boundaries of topologically associating domains (TADs) are delimited by insulators and/or active promoters; however, how they are initially established during embryogenesis remains unclear. This was examined during the first hours of Drosophila embryogenesis. DNA-FISH confirms that intra-TAD pairwise proximity is established during zygotic genome activation (ZGA) but with extensive cell-to-cell heterogeneity. Most newly formed boundaries are occupied by combinations of CTCF, BEAF-32, and/or CP190. Depleting each insulator individually from chromatin revealed that TADs can still establish, although with lower insulation, with a subset of boundaries (~10%) being more dependent on specific insulators. Some weakened boundaries have aberrant gene expression due to unconstrained enhancer activity. However, the majority of misexpressed genes have no obvious direct relationship to changes in domain-boundary insulation. Deletion of an active promoter (thereby blocking transcription) at one boundary had a greater impact than deleting the insulator-bound region itself. This suggests that cross-talk between insulators and active promoters and/or transcription might reinforce domain boundary insulation during embryogenesis (Cavalheiro, 2023).
Chromatin insulators are responsible for orchestrating long-range interactions between enhancers and promoters throughout the genome and align with the boundaries of Topologically Associating Domains (TADs). This study demonstrates an association between gypsy insulator proteins and the phosphorylated histone variant H2Av (γH2Av), normally a marker of DNA double strand breaks. Gypsy insulator components colocalize with γH2Av throughout the genome, in polytene chromosomes and in diploid cells in which Chromatin IP data show it is enriched at TAD boundaries. Mutation of insulator components su(Hw) and Cp190 results in a significant reduction in γH2Av levels in chromatin, and phosphatase inhibition strengthens the association between insulator components and γH2Av and rescues γH2Av localization in insulator mutants. It was also shown that γH2Av, but not H2Av, is a component of insulator bodies, which are protein condensates that form during osmotic stress. Phosphatase activity is required for insulator body dissolution after stress recovery. Together, these results implicate the H2A variant with a novel mechanism of insulator function and boundary formation (Simmons, 2022).
Topologically associating domains (TADs) are thought to play an important role in preventing gene misexpression by spatially constraining enhancer-promoter contacts. The deleterious nature of gene misexpression implies that TADs should, therefore, be conserved among related species. Several early studies comparing chromosome conformation between species reported high levels of TAD conservation; however, more recent studies have questioned these results. Furthermore, recent work suggests that TAD reorganization is not associated with extensive changes in gene expression. This study investigated the evolutionary conservation of TADs among 11 species of Drosophila. Hi-C data were used to identify TADs in each species, and a comparative phylogenetic approach was employed to derive empirical estimates of the rate of TAD evolution. Surprisingly, it was found that TADs evolve rapidly. However, it was also found that the rate of evolution depends on the chromatin state of the TAD, with TADs enriched for developmentally regulated chromatin evolving significantly slower than TADs enriched for broadly expressed, active chromatin. It was also found that, after controlling for differences in chromatin state, highly conserved TADs do not exhibit higher levels of gene expression constraint. These results suggest that, in general, most TADs evolve rapidly and their divergence is not associated with widespread changes in gene expression. However, higher levels of evolutionary conservation and gene expression constraints in TADs enriched for developmentally regulated chromatin suggest that these TAD subtypes may be more important for regulating gene expression, likely due to the larger number of long-distance enhancer-promoter contacts associated with developmental genes (Torosin, 2022).
The Drosophila bithorax complex (BX-C) is one of the best model systems for studying the role of boundaries (insulators) in gene regulation. Expression of three homeotic genes, Ubx, abd-A, and Abd-B, is orchestrated by nine parasegment-specific regulatory domains. These domains are flanked by boundary elements, which function to block crosstalk between adjacent domains, ensuring that they can act autonomously. Paradoxically, seven of the BX-C regulatory domains are separated from their gene target by at least one boundary, and must "jump over" the intervening boundaries. To understand the jumping mechanism, the Mcp boundary was replaced with Fab-7 and Fab-8. Mcp is located between the iab-4 and iab-5 domains, and defines the border between the set of regulatory domains controlling abd-A and Abd-B. When Mcp is replaced by Fab-7 or Fab-8, they direct the iab-4 domain (which regulates abd-A) to inappropriately activate Abd-B in abdominal segment A4. For the Fab-8 replacement, ectopic induction was only observed when it was inserted in the same orientation as the endogenous Fab-8 boundary. A similar orientation dependence for bypass activity was observed when Fab-7 was replaced by Fab-8. Thus, boundaries perform two opposite functions in the context of BX-C: they block crosstalk between neighboring regulatory domains, but at the same time actively facilitate long distance communication between the regulatory domains and their respective target genes (Postaka, 2018).
Boundaries flanking the Abd-B regulatory domains must block crosstalk between adjacent regulatory domains but at the same time allow more distal domains to jump over one or more intervening boundaries and activate Abd-B expression. While several models have been advanced to account for these two paradoxical activities, replacement experiments argued that both must be intrinsic properties of the Abd-B boundaries. Thus Fab-7 and Fab-8 have blocking and bypass activities in Fab-7 replacement experiments, while heterologous boundaries including multimerized dCTCF sites and Mcp from BX-C do not. One idea is that Fab-7 and Fab-8 are simply 'permissive' for bypass. They allow bypass to occur, while boundaries like multimerized dCTCF or Mcp are not permissive in the context of Fab-7. Another is that they actively facilitate bypass by directing the distal Abd-B regulatory domains to the Abd-B promoter. Potentially consistent with an 'active' mechanism that involves boundary pairing interactions, the bypass activity of Fab-8 and to a lesser extent Fab-7 is orientation dependent (Postaka, 2018).
The studies reported here tested these two models further. For this purpose the Mcp boundary was used for in situ replacement experiments. Mcp defines the border between the regulatory domains that control expression of abd-A and Abd-B. In this location, it is required to block crosstalk between the flanking domains iab-4 and iab-5, but it does not need to mediate bypass. In this respect, it differs from the boundaries that are located within the set of regulatory domains that control either abd-A or Abd-B, as these boundaries must have both activities. If bypass were simply passive, insertion of a 'permissive' Fab-7 or Fab-8 boundary in either orientation in place of Mcp would be no different from insertion of a generic 'non-permissive' boundary such as multimerized dCTCF sites. Assuming that Fab-7 and Fab-8 can block crosstalk out of context, they should fully substitute for Mcp. In contrast, if bypass in the normal context involves an active mechanism in which more distal regulatory domains are brought to the Abd-B promoter, then Fab-7 and Fab-8 replacements might also be able to bring iab-4 to the Abd-B promoter in a configuration that activates transcription. If they do so, then this process would be expected to show the same orientation dependence as is observed for bypass of the Abd-B regulatory domains in Fab-7 replacements (Postaka, 2018).
Consistent with the idea that a boundary located at the border between the domains that regulate abd-A and Abd-B need not have bypass activity, it was found that multimerized binding sites for the dCTCF protein fully substitute for Mcp. Like the multimerized dCTCF sites, Fab-7 and Fab-8 are also able to block crosstalk between iab-4 and iab-5. In the case of Fab-7, its blocking activity is incomplete and there are small clones of cells in which the mini-y reporter is activated in A4. In contrast, the blocking activity of Fab-8 is comparable to the multimerized dCTCF sites and the mini-y reporter is off throughout A4. One plausible reason for this difference is that Mcp and the boundaries flanking Mcp (Fab-4 and Fab-6) utilize dCTCF as does Fab-8, while this architectural protein does not bind to Fab-7 (Postaka, 2018).
Importantly, in spite of their normal (or near normal) ability to block crosstalk, both boundaries still perturb Abd-B regulation. In the case of Fab-8, the misregulation of Abd-B is orientation dependent just like the bypass activity of this boundary when it is used to replace Fab-7. When inserted in the reverse orientation, Fab-8 behaves like multimerized dCTCF sites and it fully rescues the Mcp deletion. In contrast, when inserted in the forward orientation, Fab-8 induces the expression of Abd-B in A4 (PS9), and the misspecification of this parasegment. Unlike classical Mcp deletions or the McpPRE replacement described in this study, expression of the Abd-B gene in PS9 is driven by iab-4, not iab-5. This conclusion is supported by two lines of evidence. First, the mini-y reporter inserted in iab-5 is off in PS9 cells indicating that iab-5 is silenced by PcG factors as it should be in this parasegment. Second, the ectopic expression of Abd-B is eliminated when the iab-4 regulatory domain is inactivated (Postaka, 2018).
These results, taken together with previous studies, support a model in which the chromatin loops formed by Fab-8 inserted at Mcp in the forward orientation bring the enhancers in the iab-4 regulatory domain into close proximity to the Abd-B promoter, leading to the activation of Abd-B in A4 (PS9). In contrast, when Fab-8 is inserted in the opposite orientation, the topology of the chromatin loops formed by the ectopic boundary is not compatible with productive interactions between iab-4 and the Abd-B promoter. Moreover, it would appear that boundary bypass for the regulatory domains that control Abd-B expression is not a passive process in which the boundaries are simply permissive for interactions between the regulatory domains and the Abd-B promoter. Instead, it seems to be an active process in which the boundaries are responsible for bringing the regulatory domains into contact with the Abd-B gene. It also seems likely that the bypass activity of Fab-8 (and also Fab-7) may have a predisposed preference, namely it is targeted for interactions with the Abd-B gene. This idea would fit with transgene bypass experiments, which showed that both Fab-7 and Fab-8 interacted with an insulator-like element upstream of the Abd-B promoter, AB-I, while the Mcp boundary did not (Postaka, 2018).
Similar conclusions can be drawn from the induction of Abd-B expression in A4 (PS9) when Fab-7 is inserted in place of Mcp. Like Fab-8, this boundary inappropriately targets the iab-4 regulatory domain to Abd-B. Unlike Fab-8, Abd-B is ectopically activated when Fab-7 is inserted in both the forward and reverse orientations. While the effects are milder in the reverse orientation, the lack of pronounced orientation dependence is consistent with experiments in which Fab-7 was inserted at its endogenous location in the reverse orientation. Unlike Fab-8 only very minor iab-6 bypass defects were observed. In addition to the activation of Abd-B in A4 (PS9) the Fab-7 Mcp replacements also alter the pattern of Abd-B regulation in more posterior segments. In the forward orientation, A4 and A5 are transformed towards an A6 identity, while A6 is also misspecified. Similar though somewhat less severe effects are observed in these segments when Fab-7 is inserted in the reverse orientation. At this point the mechanisms responsible for these novel phenotypic effects are uncertain. One possibility is that pairing interactions between the Fab-7 insert and the endogenous Fab-7 boundary disrupt the normal topological organization of the regulatory domains in a manner similar to that seen in boundary competition transgene assays. An alternative possibility is that Fab-7 targets iab-4 to the Abd-B promoter not only in A4 (PS9) but also in cells in A5 (PS10) and A6 (PS11). In this model, Abd-B would be regulated not only by the domain that normally specifies the identity of the parasegment (e.g., iab-5 in PS10), but also by interactions with iab-4. This dual regulation would increase the levels of Abd-B, giving the weak GOF phenotypes. Potentially consistent with this second model, inactivating iab-4 in the McpF8 replacement not only rescues the A4 (PS9) GOF phenotypes but also suppresses the loss of anterior trichomes in the A6 tergite (Postaka, 2018).
Chromatin topology is intricately linked to gene expression, yet its functional requirement remains unclear. This study comprehensively assessed the interplay between genome topology and gene expression using highly rearranged chromosomes (balancers) spanning ~75% of the Drosophila genome. Using transheterozygous (balancer/wild-type) embryos, allele-specific changes in topology and gene expression were measured in cis, while minimizing trans effects. Through genome sequencing, eight large nested inversions, smaller inversions, duplications and thousands of deletions were identified. These extensive rearrangements caused many changes to chromatin topology, disrupting long-range loops, topologically associating domains (TADs) and promoter interactions, yet these are not predictive of changes in expression. Gene expression is generally not altered around inversion breakpoints, indicating that mis-appropriate enhancer-promoter activation is a rare event. Similarly, shuffling or fusing TADs, changing intra-TAD connections and disrupting long-range inter-TAD loops does not alter expression for the majority of genes. These results suggest that properties other than chromatin topology ensure productive enhancer-promoter interactions (Ghavi-Helm, 2019).
Position effect variegation (PEV) in Drosophila results from new juxtapositions of euchromatic and heterochromatic chromosomal regions, and manifests as striking bimodal patterns of gene expression. The semirandom patterns of PEV, reflecting clonal relationships between cells, have been interpreted as gene-expression states that are set in development and thereafter maintained without change through subsequent cell divisions. Many properties of PEV are not predicted from currently accepted biochemical and theoretical models. This work investigated the time at which expressivity of silencing is set and finds that it is determined before heterochromatin exists. A mathematical simulation and a corroborating experimental approach are employed to monitor switching (i.e., gains and losses of silencing) through development. In contrast to current views, this study finds that gene silencing is incompletely set early in embryogenesis, but nevertheless is repeatedly lost and gained in individual cells throughout development. The data support an alternative to locus-specific 'epigenetic' silencing at variegating gene promoters that more fully accounts for the final patterns of PEV (Bughio, 2019).
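The idea of monitoring gains and losses of silencing through development can be illustrated with a toy two-state simulation along a dividing cell lineage. This is a sketch under stated assumptions and not the mathematical simulation used in the study: the function name, the three probabilities, and the rule that each daughter cell may switch state independently at every division are all hypothetical.

```python
import random

def simulate_lineage(n_divisions, p_initial, p_gain, p_loss, seed=0):
    """Return the silencing state (True = silenced) of every cell in a clone
    after `n_divisions` rounds of division from a single founder cell."""
    rng = random.Random(seed)
    # The founder cell is silenced with probability p_initial.
    cells = [rng.random() < p_initial]
    for _division in range(n_divisions):
        next_generation = []
        for silenced in cells:
            for _daughter in range(2):
                if silenced:
                    # A silenced cell loses silencing with probability p_loss.
                    state = rng.random() >= p_loss
                else:
                    # An active cell gains silencing with probability p_gain.
                    state = rng.random() < p_gain
                next_generation.append(state)
        cells = next_generation
    return cells
```

Setting p_gain and p_loss to zero gives the 'set once and maintained' picture, in which the whole clone shares the founder's state, while nonzero values produce mixed clones whose composition depends on how often switching occurs.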
Deciphering the rules of genome folding in the cell nucleus is essential to understand its functions. Recent chromosome conformation capture (Hi-C) studies have revealed that the genome is partitioned into topologically associating domains (TADs), which demarcate functional epigenetic domains defined by combinations of specific chromatin marks. However, whether TADs are true physical units in each cell nucleus or whether they reflect statistical frequencies of measured interactions within cell populations is unclear. Using a combination of Hi-C, three-dimensional (3D) fluorescent in situ hybridization, super-resolution microscopy, and polymer modeling, this study provides an integrative view of chromatin folding in Drosophila. Repressed TADs form a succession of discrete nanocompartments, interspersed by less condensed active regions. Single-cell analysis revealed a consistent TAD-based physical compartmentalization of the chromatin fiber, with some degree of heterogeneity in intra-TAD conformations and in cis and trans inter-TAD contact events. These results indicate that TADs are fundamental 3D genome units that engage in dynamic higher-order inter-TAD connections. This domain-based architecture is likely to play a major role in regulatory transactions during DNA-dependent processes (Szabo, 2018).
Eukaryotic chromatin is organized in contiguous domains that differ in protein binding, histone modifications, transcriptional activity, and in their degree of compaction. Genome-wide comparisons suggest that, overall, the chromatin organization is similar in different cells within an organism. This study compared the structure and activity of the 61C7-61C8 interval in polytene and diploid cells of Drosophila. By in situ hybridization on polytene chromosomes combined with high-resolution microscopy, the boundaries of the 61C7-8 interband and of the 61C7 and C8 band regions were mapped. The results demonstrate that the 61C7-8 interband is significantly larger than estimated previously. This interband extends over 20 kbp and is in the range of the flanking band domains. It contains several active genes and therefore can be considered as an open chromatin domain. Comparing the 61C7-8 structure of Drosophila S2 cells and polytene salivary gland cells by ChIP for chromatin protein binding and histone modifications, a highly consistent domain structure was observed for the proximal 13 kbp of the domain in both cell types. However, the distal 7 kbp of the open domain differs in protein binding and histone modification between the two tissues. The domain contains four protein-coding genes in the proximal part and two noncoding transcripts in the distal part. The differential transcriptional activity of one of the noncoding transcripts correlates with the observed differences in the chromatin structure between the two tissues (Zielke, 2015).
Drosophila melanogaster polytene chromosomes display a specific banding pattern; the underlying genetic organization of this pattern has remained elusive for many years. This paper analyzed 32 cytology-mapped polytene chromosome interbands. The molecular locations of these interbands were estimated, their molecular and genetic organization was described, and it was demonstrated that polytene chromosome interbands contain the 5' ends of housekeeping genes. As a rule, interbands display preferential 'head-to-head' orientation of genes. They are enriched for 'broad' class promoters characteristic of housekeeping genes and associate with open chromatin proteins and Origin Recognition Complex (ORC) components. In two regions, 10A and 100B, coding sequences of genes whose 5'-ends reside in interbands map to constantly loosely compacted, early-replicating, so-called 'grey' bands. Comparison of expression patterns of genes mapping to late-replicating dense bands vs genes whose promoter regions map to interbands shows that the former are generally tissue-specific, whereas the latter are represented by ubiquitously active genes. Analysis of RNA-seq data (modENCODE-FlyBase) indicates that transcripts from interband-mapping genes are present in most tissues and cell lines studied, across most developmental stages and upon various treatment conditions. A special algorithm was developed to computationally process protein localization data generated by the modENCODE project; it was shown that the Drosophila genome has about 5700 sites that demonstrate all the features shared by the interbands cytologically mapped to date (Zhimulev, 2014).
Drosophila polytene chromosomes have served as the best available model of eukaryotic interphase chromosome. They are prominent for their banding pattern formed by dark transverse stripes (called bands), which encompass large chunks of chromatin material. These bands alternate with fine, lighter-colored stripes that have less material and are more loosely packed. Such light-transparent structures between bands are known as interbands (Zhimulev, 2014).
The genetic organization of bands and interbands, defined as the pattern that sets the positioning of genes and genetic features relative to the structural elements of a chromosome, is still largely elusive. This is because, despite the availability of the Drosophila genome, methods to even approximately map band/interband borders on a physical map are still lacking (Zhimulev, 2014).
Yet, many interesting hypotheses regarding the genetic organization of bands and interbands in polytene chromosomes have been put forth. Some of the points of these hypotheses were experimentally validated, so it is important to consider them below (Zhimulev, 2014).
Genes were proposed to reside in interbands, or bands (1-2 genes/band). Also, bands were proposed to contain many structural genes transcribed coordinately and as polycistronic messages. In some models, band and interband were considered to form a single genetic unit, where one part of the gene was embedded in a band, and the other part mapped to an interband. The Paul model (Paul, 1972) is of special interest. The author considered interband regions as essentially polymerase-binding sites, so transcription would progress into band regions from initiation sites which were likely situated near the band-interband junctions (Zhimulev, 2014).
Very interesting conclusions were made regarding the general meaning of the banding pattern: bands were regarded as hosting inactivated genes, whereas interbands were represented by genes in a steady state of activity; in other models, interbands were believed to contain constantly active housekeeping genes. Further details on various banding pattern models are available in (Zhimulev, 1999; Zhimulev, 2014; and references therein).
Several recent technological advances have dramatically moved forward understanding of polytene chromosome organization. First, efforts of the modENCODE project have produced genome-wide profiling data for many proteins that specifically localized to bands or interbands in interphase chromosomes (Zhimulev, 2014).
Secondly, using genome-wide DamID mapping of 53 chromosomal proteins and histone modifications, Filion (2010) generated a map of the Drosophila chromatin landscape and demonstrated that the genome can be segmented into five main chromatin types. The conditionally named 'BLUE' (Pc-dependent repression) and 'BLACK' (repression mechanism not defined) chromatin types were associated with repressed chromatin, 'YELLOW' chromatin contained ubiquitously expressed genes, whereas 'RED' chromatin harbored active genes with more complex expression patterns. The 'GREEN' chromatin type was defined by enrichment of the heterochromatin-specific proteins HP1 and Su(var)3-9. More refined analysis of modENCODE data resulted in the description of many more chromatin states. A significant proportion of the genome sequence is known to map to a special class of polytene chromosome bands, called intercalary heterochromatin (IH). These chromosome regions associate with BLACK chromatin proteins (H1, SUUR, LAM, D1) and range from 100 to 700 kb in length. Here, DNA replicates late and, compared to the genome average, these regions have lower gene density. Genomic localization of proteins that constitute repressed chromatin can thus be used as a marker to establish the molecular position of IH (Zhimulev, 2014).
Third, recent studies developed an approach to simultaneously map the interband material on polytene chromosomes and in the genome using transposon insertion tags. This allows exact localization of insertion sites both on cytological and physical maps as well as precise identification of sequences around the transposon integration sites (Zhimulev, 2014).
Using this approach, protein composition and other chromatin parameters were described in 12 DNA sequences corresponding to polytene chromosome interbands. They display general features of open chromatin: low nucleosome density, histone H1 dips, association with TSS-specific proteins such as RNA polymerase II, various transcription factors, nucleosome remodeling proteins - NURF, ISWI, WDS, interband-specific proteins (CHRIZ/CHROMATOR, CHRIZ hereafter), proteins of origin recognition complexes (ORC). Moreover, they show clustering of DNaseI hypersensitive sites (DHS) (Zhimulev, 2014).
Based on these data, all polytene chromosome bands were subdivided into two contrasting groups: loosely compacted, early-replicating, so-called 'grey' bands and dense, late-replicating compact bands ('black' IH bands). They differ in many aspects of their protein and genetic make-up, as well as in DNA compaction (Zhimulev, 2014).
Previous work has shown that polytene chromosomes and interphase chromosomes from dividing cells display identical organization. Namely, interbands from polytene chromosomes and the corresponding DNA sequences from cell line chromosomes share similar features in terms of localization of open chromatin-type proteins. Consequently, the banding pattern appears as a fundamental organization principle of interphase chromosomes. In both types of chromosomes, homologous interbands and bands have identical physical borders and length; importantly, they also associate with identical sets of proteins. Hence, the notion of an interband defined as a decondensed region in the context of polytene chromosomes is also applicable to other types of interphase chromosomes. In other words, the term 'interband' should be viewed as an equivalent of a constantly decondensed region in the context of any interphase chromosome. Accordingly, hereafter this wider definition of an interband is used (Zhimulev, 2014).
In the present work, using various cytological approaches, a new set of precisely mapped interbands were characterized, and then the modENCODE data on localization of active chromatin proteins was processed using a custom-designed computation model. This analysis suggests that interphase chromosome interbands contain constantly active promoter regions of ubiquitously active genes. Coding sequences of these genes, at least in two regions studied, map to adjacent loosely compacted early-replicating 'grey' bands. In contrast, densely packed, late-replicating bands of polytene chromosomes appear to preferentially harbor tissue-specific genes (Zhimulev, 2014).
The present study aims at unraveling the genetic and functional organization of basic morphological features of interphase chromosomes. In the context of polytene chromosomes, these features display distinct degrees of chromatin packaging and comprise interbands, loosely compacted grey bands and dense IH bands. This study attempted to correlate positions of gene elements, gene expression and the epigenetic state of underlying chromatin for these structures. To do so, these morphological elements were first accurately located on the physical map of the genome. This allowed comparison of their positions with genetic and epigenetic maps, as well as with protein localization profiles, transcription profiles and other features of chromatin. In this way, functional domains could be related to the banding pattern of polytene chromosomes (Zhimulev, 2014).
Dense black bands are the most prominent structures in polytene chromosomes. They are readily noticeable due to their highly compacted state, large size, lack of transcription, late replication in the S phase, and a tendency to form ectopic pairing with other bands and pericentric heterochromatin. In fact, black bands are in many regards very similar to pericentric heterochromatin, hence they were called IH. In polytene chromosomes, IH bands frequently fail to complete replication during the S phase endocycles, and are therefore underreplicated. It has recently become clear that underreplication results from the absence of internal replication origins within IH and is dependent on SUUR protein, which maps to IH bands and modulates replication by decreasing the rate of replication fork progression (Zhimulev, 2014).
Underreplication regions showing lowered DNA copy number in polytene chromosomes were molecularly mapped. This analysis established that IH bands encompass clusters of widely-spaced unique genes (i.e., genes with large intergenic regions, with 6-40 genes per IH band), and that they are generally quite large (100-600 kb). Combined with the data on localization of chromatin proteins, IH borders were precisely mapped for 60 IH regions, which enabled a more refined analysis of these structures (Zhimulev, 2014).
The current data and those of Filion (2010) indicate that IH bands are composed of tissue-specific genes showing low expression levels. One of the prominent features of IH regions is their evolutionary conservation, i.e. they tend to display conserved gene content and order throughout evolution, as has been demonstrated by microsynteny analysis in nine Drosophila species (Zhimulev, 2014).
As compared to IH bands, it is far less trivial to provide accurate mapping for interbands and grey bands, because these regions are fully replicated and are much smaller. Yet, using a combination of EM, P-element tagging and FISH, it was possible to unambiguously map the positions of 32 interbands. Using the data on the features of interband chromatin, a mathematical model was developed that defines four basic chromatin states in the Drosophila genome. This model allows identification of interband regions chromosome-wide. Accordingly, the limits of the DNA sequences corresponding to interbands were defined as borders of cyan fragments (Zhimulev, 2014).
With these data in hand, the molecular and epigenetic organization of interbands was analyzed. Interbands clearly displayed features of transcriptionally active regions: H3K4me3 histone modification, lower nucleosome density and histone H1 dips, presence of DHS, localization of RNA polymerase II and components of nucleosome remodeling complexes such as NURF, ISWI, WDS. One characteristic feature of interbands is that they are specifically bound by the chromodomain-containing CHRIZ protein. CHRIZ associates with another interband-specific protein Z4, which directly binds DNA via its seven zinc fingers (Zhimulev, 2014).
According to different estimates, there are 3500-5000 bands and interbands in Drosophila melanogaster polytene chromosomes. Earlier, the existence of about 3500 interbands was predicted. This study used an advanced model that takes into account more factors and hence is more accurate. 5674 cyan fragments were discovered each spanning 2.7 kb on average. Notably, both previous and current estimates of interband numbers are very close to those obtained by cytology (Zhimulev, 2014).
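As a quick check on scale, the fragment count and mean fragment length quoted above imply a combined interband length of roughly 15 Mb. The sketch below simply multiplies the two numbers reported in the text; the variable names are illustrative, and because the mean length is a rounded value the result is approximate.

```python
# Values quoted in the text (Zhimulev, 2014).
n_fragments = 5674      # predicted interband ('cyan') fragments
mean_length_kb = 2.7    # average fragment length, in kb

total_kb = n_fragments * mean_length_kb
total_mb = total_kb / 1000

print(f"Combined interband length: about {total_mb:.1f} Mb")
```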
The major finding of this analysis of functional organization of interbands is that they typically encompass 5'-regions of multiply active genes (constitutively and actively transcribed). In a number of instances, it was observed that short genes can be entirely engulfed by interbands, however in most cases the body of the gene is found in the adjacent loosely compacted grey band. Thus, the interband+grey band duo appears as a single functional unit for many multiply active genes, so this unit is heterogeneous in terms of compaction; likewise it shows non-uniform localization of protein markers. Whereas interbands are specifically decorated with CHRIZ, grey bands lack CHRIZ and instead they are enriched with RNApolII. CHRIZ can be speculated to provide the permanently open chromatin state to interbands, where it serves as a pioneer-factor recruiting other transcription components. It is also possible that the observed wide-spread transcription activity of interband regions results from the static physical properties of interband DNA, such as sequence-dependent DNA flexibility, which may create nucleosome-free regions at promoters. Such regions may serve as 'entry points' to recruit proteins promoting further binding of transcription factors, chromatin remodelers, etc (Zhimulev, 2014).
The findings, therefore, resonate well with several early ideas regarding the interplay of structural and functional organization of banding pattern in polytene chromosomes. These include interband localization of multiply active genes, and the hypothesis of a single functional unit composed of band+interband (Zhimulev, 2014).
Recently, there has been an avalanche of publications describing various types of domain organization in the genomes of eukaryotes, so the current data can be conveniently compared with other genome-wide chromatin annotation projects. The domain organization was summarized for a 400 kb fragment of the X chromosome encompassing various types of bands and accurately mapped interbands. Two large domains, 189 and 170 kb long, correspond to polytene chromosome bands 10A1-2 and 10B1-2, and display features of intercalary heterochromatin (magenta-green chromatin states). In between these late-replicating domains, there is a region composed of alternating interbands and grey bands (cyan and blue fragments). When applied to this region, the 5-state chromatin classification model of Filion (2010) produces very similar domains; the important difference, however, is that regions of YELLOW chromatin (active gene transcription according to Filion) do not discriminate between small grey bands and interbands, nor between regulatory and gene body parts (Zhimulev, 2014).
Kharchenko (2011) performed genome-wide profiling of 18 histone modifications and constructed 9-state chromatin models. In two contrasting cell lines (S2 and BG3), transcriptionally silent chromatin corresponds to the IH bands 10A1-2 and 10B1-2, whereas state 1 chromatin (active promoters and TSS) maps to the active genes and perfectly matches the cyan state of interbands, as defined by the analysis (Zhimulev, 2014).
Using the modified Hi-C approach based on the ligation of chromatin fragments that are found in close proximity in cross-linked chromatin, high-resolution chromosomal contact maps were generated. As it follows from this analysis, the entire genome is partitioned into a series of physical domains containing active and repressive epigenetic marks. These domains are delimited by boundaries demonstrating insulator binding, high DNaseI sensitivity and a set of specific proteins: CHRIZ and the active histone mark H3K4me3. The regions of interbands match well with the boundary sites that delimit the contacting domains identified via Hi-C. Localization of cyan chromatin and physical domains were compared throughout the genome. Of 1100 boundary sites referenced in Sexton (2012), 760 (69%) map to interbands (cyan) and loosely compacted grey bands (blue chromatin) (Zhimulev, 2014).
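The boundary comparison above amounts to asking what fraction of point positions fall inside a set of genomic intervals. A minimal Python sketch follows, assuming sorted, non-overlapping intervals on a single chromosome arm. The function name and the toy coordinates are invented for illustration and are not data from Sexton (2012) or Zhimulev (2014); the last line only re-derives the reported percentage from the reported counts.

```python
from bisect import bisect_right

def fraction_in_intervals(positions, intervals):
    """Return the fraction of positions that fall inside any interval.

    intervals: sorted, non-overlapping (start, end) pairs, end inclusive.
    """
    starts = [start for start, _ in intervals]
    hits = 0
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and pos <= intervals[i][1]:
            hits += 1
    return hits / len(positions)

# Hypothetical toy coordinates (bp), for illustration only.
toy_intervals = [(1_000, 3_500), (10_000, 12_000), (20_000, 26_000)]
toy_boundaries = [1_200, 5_000, 11_000, 19_999, 25_000]

toy_fraction = fraction_in_intervals(toy_boundaries, toy_intervals)

# Counts reported in the text: 760 of 1100 boundary sites.
reported_percent = round(100 * 760 / 1100)
```

Binary search over interval starts keeps the lookup fast even for thousands of fragments, which matters when the comparison is run genome-wide.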
Positions of interbands and loosely compacted grey bands display co-localization with clusters of multiply active genes. According to the current analysis, there are 12 genes nested between the 10A1-2 and 10B1-2, with their 5'-ends mapping to cyan chromatin. Of these, 5 genes were classified as housekeeping genes (Zhimulev, 2014).
Using the data from the genes were classified as 'housekeeping' or 'differentially regulated'. Under this classification, the region 10A1-2 to 10B1-2 harbors 11 housekeeping genes, of which 9 genes correspond to the definition of a multiply active gene (Zhimulev, 2014).
NSL complexes are reportedly regulators of multiply active genes and bind promoters demonstrating broad transcriptional pattern and nucleosome-free regions. Positions of NSL-binding peaks match nicely the interband positions found in the study. All these comparisons further confirm the main conclusion of this work about interbands as sites of continuously active genes (Zhimulev, 2014).
This is the first study to investigate the molecular-genetic organization of polytene chromosome interbands located on both molecular and cytological maps of Drosophila genome. The majority of the studied interbands contained one gene with a single transcription initiation site; the remaining interbands contained one gene with several alternative promoters, two or more unidirectional genes, and "head-to-head" arranged genes. In addition, intricately arranged interbands containing three or more genes in both unidirectional and bidirectional orientation were found. Insulator proteins, ORC, P-insertions, DNase I hypersensitive sites, and other open chromatin structures were situated in the promoter region of the genes located in the interbands. This area is critical for the formation of the interband, an open chromatin region in which gene transcription and replication are combined (Zykova, 2019).
Developmental enhancers bind transcription factors and dictate patterns of gene expression during development. Their molecular evolution can underlie phenotypical evolution, but the contributions of the evolutionary pathways involved remain little understood. Using mutation libraries in Drosophila melanogaster embryos, this study observed that most point mutations in developmental enhancers led to changes in gene expression levels but rarely resulted in novel expression outside of the native pattern. In contrast, random sequences, often acting as developmental enhancers, drove expression across a range of cell types; random sequences including motifs for transcription factors with pioneer activity acted as enhancers even more frequently. These findings suggest that the phenotypic landscapes of developmental enhancers are constrained by enhancer architecture and chromatin accessibility. It is proposed that the evolution of existing enhancers is limited in its capacity to generate novel phenotypes, whereas the activity of de novo elements is a primary source of phenotypic novelty (Galupa, 2023).
This study used transgenesis-based mutagenesis and de novo gene synthesis during fly embryogenesis to investigate evolutionary pathways for enhancer activity. Fly development was used to explore how novel patterns of gene expression might appear from either molecular evolution of developmental enhancers or random sequences. Notably, while reporter gene assays and minimal enhancers may not reflect the full regulatory activities of native loci, such an approach allows evaluation of a broad range of 'possible' enhancer variation in a controlled experimental setup, without associated fitness costs and allowing a broader exploration of evolution and development without the complexities and historical contingencies found in nature. Furthermore, using such an assay in a developmental model system, which generates an embryo in 24 h, regulatory activities can be assayed across ~100,000 cells of different lineage origins (Galupa, 2023).
Using this approach, it was found that most mutations in enhancers led to changes in levels of reporter gene expression, but almost entirely within their native zones of expression, similar to previous studies using transgenic mutagenesis of the Shh enhancer in murine embryos, or the E3N enhancer and the wing spot196 enhancer in fly embryos. Consistent with current results, known phenotypic evolution through nucleotide mutations of standing regulatory elements seems to appear either through changes in the levels or timings of expression within native zones or the loss of regulatory activities. For example, the evolution of pigmentation spots in fly wings occurred via a specific spatial increase in the melanic protein Yellow, which is uniformly expressed at low levels throughout the developing wings of fruit flies. Evolutionary changes in other traits, such as thoracic ribs in vertebrates, limbs in snakes, pelvic structures in sticklebacks, and seed shattering in rice, are all associated with loss of enhancer activity due to internal enhancer mutations. Additionally, mutations have been found to occur less often in functionally constrained regions of the genome, suggesting that mutation bias may reduce the occurrence of deleterious mutations in regulatory regions (Galupa, 2023).
Consistent with these results, phenotypic novelties underlain by enhancer-associated ectopic gains of expression are reportedly due to transposon mobilization, rearrangements in chromosome topology, or de novo evolution of enhancers from DNA sequences with unrelated or nonregulatory activities (Galupa, 2023).
Previous studies have explored the potential of random DNA sequences to lead to reporter gene expression, either as enhancers or promoters, especially in cell lines of prokaryotic or eukaryotic origin. These have shown that there is a short (or sometimes null) mutational distance between random sequences and active cis-regulatory elements, which may improve evolvability. This study tested random sequences in a developmental context and found that most showed enhancer activity across several types of tissues and developmental stages. These results are consistent with a study that tested enhancer activity of all 6-mers in developing zebrafish embryos and found a diverse range of expression for ~38% of the sequences at two developmental stages.
Expression driven by random sequences was observed even in the absence of motifs within their sequence for TFs with pioneering activity. Yet, when such motifs were included, nearly all sequences acted as 'strong' enhancers (leading to high levels of expression), consistent with the 'evolutionary barrier' to the formation of a novel enhancer being lower in regions that already contain motifs for DNA-binding factors, which can 'act cooperatively with newly emerging sites' (Galupa, 2023).
It is interesting to note that, despite the high potential of random sequences to be expressed during development and across cell types, expression prior to gastrulation was never observed; this was not evaluated in the zebrafish study or in other studies. This may be due to the rapid rates of early fruit fly development, in which gene expression patterns are highly dynamic, and cell-fate specifications occur within minutes. As such, there may be extensive regulatory demands placed on transcriptional enhancers, reflected in the clusters of high-affinity binding sites common across early embryonic developmental enhancers as well as their extensive conservation in function and location (Galupa, 2023).
In the future, it will be interesting to explore how regulatory demands that change across development (such as nuclear differentiation, network cross-talk, and metabolic changes) are reflected in regulatory architectures and their evolvability. The observation that most random sequences led to expression suggests that the potential of any sequence within the genome to drive expression is enormous and thus 'an important playground for creating new regulatory variability and evolutionary innovation' (Galupa, 2023).
This was further supported by the regulatory potential of the genomic sequences that were tested, containing Ubx/Hth motifs; indeed, the results from this work imply that enhancers would more likely evolve from sequences that contain or are biased toward specific motifs (e.g., GATA and Zelda). Perhaps the challenge from an evolutionary perspective has not been what allows expression, but what prevents expression; thus, mechanisms that repress 'spurious' expression might have evolved across genomes. This is in line with propositions that nucleosomal DNA in eukaryotes has evolved to repress transcription, along with transcriptional repressors and other mechanisms such as DNA methylation, as a response (at least partially) to 'the unbearable ease of expression' present in prokaryotes (Galupa, 2023).
The action of such repressive mechanisms could also explain why mutagenesis of developmental enhancers, which are subject to evolutionary selection, does not easily lead to expression outside their native patterns of expression. In sum, the findings of this study raise exciting questions about the evolution of enhancers and the emergence of novel patterns of expression that may underlie new phenotypes, suggesting an underappreciated role for de novo evolution of enhancers by happenstance. Genetic theories of morphological evolution will benefit from comparing controlled, multi-dimensional laboratory experiments with standing variation; such an integrative approach could provide the frameworks that will facilitate the making of both transcriptional and evolutionary predictions (Galupa, 2023).
One limitation of this study lies in the numbers: a significant number of enhancer variants was tested, but it is still possible that ectopic expression would have been captured more frequently had a larger set of enhancer variants been tested. Also, in principle, a higher number of mutations per enhancer could have increased the likelihood of ectopic expression. Previous work from this lab with the E3N enhancer reported that the proportion of lines with ectopic expression indeed increased with the number of mutations (Galupa, 2023).
However, this increase plateaued around 20%-30% for lines with ~3+ mutations per enhancer and in this study, the number of mutations in the enhancer variants for twiMPE, rhoNEE, and tinB ranges from 1 to 7 mutations, so it would be expected to have captured a number of lines with ectopic expression. Importantly, the assay captures millions of years of variation in a controlled setting decoupled from fitness costs. It is also possible that ectopic expression might be present in developmental stages that were not analyzed. Finally, would the results be different if a different promoter had been used? This was not tested formally, but based on published literature, it is believed that using a different promoter would not have major implications in the results observed. Testing a total of enhancer-promoter combinations in human cells, efficiency of enhancers has been shown to be approximately the same irrespective of the type of promoter used, and a recent combinatorial analysis of 1,000 human promoters and 1,000 human enhancers confirmed that most enhancers activate all promoters by similar amounts (Galupa, 2023).
These studies, in cell lines, could only address levels of expression, not spatial patterns, but very recently published results from the lab show that developmental promoters in fly embryos can drive a range of outputs but do not affect spatial aspects of expression, only levels (Galupa, 2023).
In mammals, insulators contribute to the regulation of loop extrusion to organize chromatin into topologically associating domains. In Drosophila, however, the role of insulators in 3D genome organization is under current debate. This study addressed this question by combining bioinformatics analysis and multiplexed chromatin imaging. It describes a class of Drosophila insulators enriched at regions forming preferential chromatin interactions genome-wide. Notably, most of these 3D interactions do not involve TAD borders. Multiplexed imaging shows that these interactions occur infrequently, and only rarely involve multiple genomic regions coalescing together in space in single cells. Finally, it was shown that non-border preferential 3D interactions enriched in this class of insulators are present before TADs and transcription during Drosophila development. These results are inconsistent with insulators forming stable hubs in single cells, and instead suggest that they fine-tune existing 3D chromatin interactions, providing an additional regulatory layer for transcriptional regulation (Messina, 2023).
Drosophila insulators were the first DNA elements found to regulate gene expression by delimiting chromatin contacts. It is still not known how many of them exist and what impact they have on the Drosophila genome folding. Contrary to vertebrates, there is no evidence that fly insulators block cohesin-mediated chromatin loop extrusion. Therefore, their mechanism of action remains uncertain. To bridge these gaps, this study mapped chromatin contacts in Drosophila cells lacking the key insulator proteins CTCF and Cp190. With this approach, hundreds of insulator elements were found. Their study indicates that Drosophila insulators play a minor role in the overall genome folding but affect chromatin contacts locally at many loci. These observations argue that Cp190 promotes cobinding of other insulator proteins and that the model, where Drosophila insulators block chromatin contacts by forming loops, needs revision. This insulator catalog provides an important resource to study mechanisms of genome folding (Kahn, 2023).
Despite long-term studies of the genetic organization of polytene chromosome bands and interbands, little is known about where long genes are located on chromosomes. To analyze this, bioinformatic approaches were used to characterize the genome-wide distribution of introns in gene bodies and in different chromatin states, and fluorescent in situ hybridization was used to juxtapose them with chromosome structures. Short introns up to 2 kb in length are located in the bodies of housekeeping genes (grey bands or lazurite chromatin). In the group of the 70 longest genes in the Drosophila genome, introns account for 95% of total gene length. Mapping of 15 long genes showed that they can occupy extended sections of polytene chromosomes containing band and interband series, with promoters located in the interband fragments (aquamarine chromatin). Introns (malachite and ruby chromatin) in polytene chromosomes form independent bands, which can contain either both introns and exons or intron material only. Thus, a novel type of gene arrangement in polytene chromosomes was discovered; peculiarities of such genetic organization are discussed (Khoroshko, 2020).
Chemical cross-linking and DNA sequencing have revealed regions of intra-chromosomal interaction, referred to as topologically associating domains (TADs), interspersed with regions of little or no interaction, in interphase nuclei. TADs and the regions between them were found to correspond with the bands and interbands of polytene chromosomes of Drosophila. Further, the conservation of TADs between polytene and diploid cells of Drosophila was established. From direct measurements on light micrographs of polytene chromosomes, the states of chromatin folding in the diploid cell nucleus were deduced. Two states of folding, fully extended fibers containing regulatory regions and promoters, and fibers condensed up to 10-fold containing coding regions of active genes, constitute the euchromatin of the nuclear interior. Chromatin fibers condensed up to 30-fold, containing coding regions of inactive genes, represent the heterochromatin of the nuclear periphery. A convergence of molecular analysis with direct observation thus reveals the architecture of interphase chromosomes (Eagen, 2015).
At the intermediate scale of genomic spatial organization of kilobases to megabases, which encompasses the sizes of genes, gene clusters and regulatory domains, the three-dimensional (3D) organization of DNA is implicated in multiple gene regulatory mechanisms. At this scale, the genome is partitioned into domains of different epigenetic states that are essential for regulating gene expression. This study investigated the 3D organization of chromatin in different epigenetic states using super-resolution imaging. Genomic domains were classified in Drosophila cells into transcriptionally active, inactive or Polycomb-repressed states, and distinct chromatin organizations were observed for each state. All three types of chromatin domains exhibit power-law scaling between their physical sizes in 3D and their domain lengths, but each type has a distinct scaling exponent. Polycomb-repressed domains show the densest packing and most intriguing chromatin folding behaviour, in which chromatin packing density increases with domain length. Distinct from the self-similar organization displayed by transcriptionally active and inactive chromatin, the Polycomb-repressed domains are characterized by a high degree of chromatin intermixing within the domain. Moreover, compared to inactive domains, Polycomb-repressed domains spatially exclude neighbouring active chromatin to a much stronger degree. Computational modelling and knockdown experiments suggest that reversible chromatin interactions mediated by Polycomb-group proteins play an important role in these unique packaging properties of the repressed chromatin. Taken together, these super-resolution images reveal distinct chromatin packaging for different epigenetic states at the kilobase-to-megabase scale, a length scale that is directly relevant to genome regulation (Boettiger, 2016).
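The power-law scaling described above can be illustrated with a short sketch. It assumes that the physical size R of a domain scales with its genomic length L as R = a * L^c, so that the exponent c is the slope of a straight line on log-log axes. If domain volume is further assumed to scale as R^3, packing density scales as L^(1 - 3c), and density rises with domain length whenever c is below 1/3. The function name, units, and data below are hypothetical and synthetic; they are not measurements from Boettiger (2016).

```python
import math

def fit_power_law(lengths_kb, radii_nm):
    """Least-squares fit of log(R) = log(a) + c * log(L).

    Returns (a, c). Names and units are illustrative assumptions.
    """
    xs = [math.log(v) for v in lengths_kb]
    ys = [math.log(v) for v in radii_nm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx                      # slope on log-log axes
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent

# Synthetic data generated from R = 50 * L^0.3 (illustrative values only).
lengths = [10, 20, 50, 100, 200]
radii = [50.0 * length ** 0.3 for length in lengths]

prefactor, exponent = fit_power_law(lengths, radii)

# Assuming volume ~ R^3, density ~ L / R^3 ~ L^(1 - 3c).
density_exponent = 1 - 3 * exponent
density_increases_with_length = density_exponent > 0
```

Fitting each chromatin class separately with a function like this would yield one exponent per class, which is the kind of comparison the summary describes.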
Suppressor of Hairy wing [Su(Hw)] is an insulator protein that participates in regulating chromatin architecture and gene repression in Drosophila. Previous studies have shown that Su(Hw) is also required for pre-replication complex (pre-RC) recruitment on Su(Hw)-bound sites (SBSs) in Drosophila S2 cells and pupae. This study describes the effect of Su(Hw) on developmentally regulated amplification of 66D and 7F Drosophila amplicons in follicle cells (DAFCs), widely used as models in replication studies. Su(Hw) binding co-localizes with all known DAFCs in Drosophila ovaries, whereas disruption of Su(Hw) binding to 66D and 7F DAFCs causes a two-fold decrease in the amplification of these loci. The complete loss of Su(Hw) binding to chromatin impairs pre-RC recruitment to all amplification regulatory regions of 66D and 7F loci at early oogenesis (prior to DAFC amplification). These changes coincide with a considerable Su(Hw)-dependent condensation of chromatin at 66D and 7F loci. Although this study observed the Brm, ISWI, Mi-2, and CHD1 chromatin remodelers at SBSs genome-wide, their remodeler activity does not appear to be responsible for chromatin decondensation at the 66D and 7F amplification regulatory regions. This study has discovered that, in addition to the CBP/Nejire and Chameau histone acetyltransferases, the Gcn5 acetyltransferase binds to 66D and 7F DAFCs at SBSs and this binding is dependent on Su(Hw). It is proposed that the main function of Su(Hw) in developmental amplification of 66D and 7F DAFCs is to establish a chromatin structure that is permissive to pre-RC recruitment (Vorobyeva, 2021).
To elucidate the principles governing insulator architectural functions, this study used two insulators, Homie and Nhomie, that flank the Drosophila even-skipped locus. It was shown that homologous insulator interactions in trans, between Homie on one homolog and Homie on the other, or between Nhomie on one homolog and Nhomie on the other, mediate transvection. Critically, these homologous insulator:insulator interactions are orientation-dependent. Consistent with a role in the alignment and pairing of homologs, self-pairing in trans is head-to-head. Head-to-head self-interactions in cis have been reported for other fly insulators, suggesting that this is a general principle of self-pairing. Homie and Nhomie not only pair with themselves, but with each other. Heterologous Homie-Nhomie interactions occur in cis, and they serve to delimit a looped chromosomal domain that contains the even skipped transcription unit and its associated enhancers. The topology of this loop is defined by the heterologous pairing properties of Homie and Nhomie. Instead of being head-to-head, which would generate a circular loop, Homie-Nhomie pairing is head-to-tail. Head-to-tail pairing in cis generates a stem-loop, a configuration much like that observed in classical lampbrush chromosomes. These pairing principles provide a mechanistic underpinning for the observed topologies within and between chromosomes (Fujioka, 2016).
The highly regular and reproducible physical organization of chromosomes in multicellular eukaryotes was recognized a century ago in cytological studies on the lampbrush chromosomes that are found in oocytes arrested at the diplotene phase of meiosis I. At this stage, homologous chromosomes are paired. The two homologs display a similar and reproducible architecture. It consists of a series of loops emanating from the main axis, that are arranged in pairs, one from each homolog. In between the loops are regions of more compacted chromatin. A similar physical organization is evident in insect polytene chromosomes. As with lampbrush chromosomes, the paired homologs are aligned in precise register. However, instead of one copy of each homolog, there are hundreds. While loops are not readily visible, each polytene segment has a unique pattern of bands and interbands that depends upon the underlying DNA sequence and chromosome structure (Fujioka, 2016).
Subsequent studies have shown that the key features of chromosome architecture evident in lampbrush and polytene chromosomes are also found in diploid somatic cells. One of these is the subdivision of the chromatin fiber into a series of loop domains. There are now many lines of evidence indicating that looping is a characteristic architectural feature. Biochemical evidence comes from chromosome conformation capture (3C) experiments, which show that distant sites come into contact with each other in a consistent pattern of topologically associating domains (TADs). While the first studies in mammals suggested that TADs have an average length of 1 Mb, subsequent experiments showed that the average is only about 180 kb. In flies, TADs are smaller, between 10-100 kb (Sexton, 2012; Hou, 2012). Neighboring TADs are separated from each other by boundaries that constrain both physical and regulatory interactions. In mammals and also in flies, these boundaries typically correspond to sequences bound by insulator proteins like CTCF (Fujioka, 2016).
That TAD boundaries correspond to insulators is consistent with the known properties of these elements. Insulators subdivide the chromosome into functionally autonomous regulatory domains. When interposed between enhancers or silencers and target promoters, insulators block regulatory interactions. They also have an architectural function in that they can bring distant chromosomal sequences together, and in the proper configuration can promote rather than restrict regulatory interactions. Moreover, insulators are known to mediate contacts between distant sequences (loop formation), and these physical contacts depend upon specific interactions between proteins bound to the insulators (Fujioka, 2016).
The notion that insulators are responsible for subdividing eukaryotic chromosomes into a series of looped domains raises questions about the rules governing loop formation in cis. One of these is the basis for partner choice. Is choice based simply on proximity, or is there an intrinsic partner preference? A second concerns the topology of the loop formed by interacting partners in cis. Do the partners interact to form a stem-loop-like structure, or does the interaction generate a circular loop ('circle-loop')? The answer to this question will depend upon whether there is an orientation dependence to the interactions between two heterologous insulators. In flies, homologs are typically paired in somatic cells, not just in cells that are polyploid. This means that the loop domains in each homolog must be aligned in precise register along their entire length. A plausible hypothesis is that both alignment and homolog pairing are mediated by insulator interactions in trans. If this is the case, there are similar questions about the rules that govern trans interactions. Is there a partner preference in the interactions that mediate homolog pairing? Is there an orientation dependence, and if so, what is the topological outcome of the looped domains generated by insulator interactions in paired chromosomes in cis and in trans? (Fujioka, 2016).
This study has used insulators from the even skipped (eve) locus to address the questions posed above about the architecture of eukaryotic chromosomes. The eve domain spans 16 kb and is bordered upstream by the Nhomie (Neighbor of Homie) insulator and downstream by Homie (Homing insulator at eve). eve encodes a homeodomain transcription factor that is required initially for segmentation, and subsequently in the development of the CNS, muscles, and anal plate. It has a complex set of enhancers that activate expression at different stages and tissues, and a Polycomb response element (PRE) that silences the gene in cells where it isn't needed. In early embryos, the stripe enhancers upstream (3+7, 2, late stripes) and downstream (4+6, 1, and 5) of the eve gene activate transcription in a pair-rule pattern. Later in development, around the time that germband retraction commences, mesodermal (Me) and neuronal (CNS) enhancers turn on eve expression in a subset of cells in each of these tissues. These late enhancers continue to function once germband retraction is complete, while another enhancer (APR) induces transcription in the presumptive anal plate. Located just upstream of eve is CG12134, while the TER94 gene is downstream. Unlike eve, both of these genes are ubiquitously expressed throughout much of embryogenesis (Fujioka, 2016).
The importance of insulators in organizing eukaryotic chromosomes has been recognized since their discovery in the 1980's. However, the principles underlying their architectural and genetic functions have not been fully elucidated. With this goal in mind, this study asked how these elements shape two critical architectural features of chromosomes. The first is homolog pairing. Homologs pair in flies from the blastoderm stage onward, and the consequent trans-interactions are important for proper gene regulation. The phenomenon of homolog pairing is not unique to Drosophila. Homologs are paired in lampbrush chromosomes of invertebrate and vertebrate oocytes. The second is the looped domain organization. Although there is now compelling evidence that insulators subdivide chromosomes into topologically independent looped domains (and that these domains play a central role in gene regulation), the topology of the loops is unknown. Moreover, while the loops must emanate from the main axis of the chromosome, the relationships between the loops, the insulators that delimit them, and the main chromosomal axis are not understood. As homolog pairing is more straightforward and the likely mechanism better documented, it is considered first (Fujioka, 2016).
Homolog pairing requires mechanisms for aligning homologs in precise register, and maintaining their stable association. While many schemes are imaginable, the simplest utilizes elements distributed along each homolog that have self-interaction specificity. Such a mechanism would be consistent with the persistence of local pairing and transvection in chromosomal rearrangements. It would also fit with studies on the pairing process. Self-association of pairing elements would locally align sequences in register, and ultimately link homologs together along their entire length. In this mechanism, self-association must be specific and also directional, namely head-to-head. This avoids the introduction of unresolvable loops and maximizes pairing for transvection (Fujioka, 2016).
In Drosophila, the homing of P-element transgenes, in which normally random insertion becomes targeted, suggested the ability of genomic elements to self-interact. Such a homing activity was found in the engrailed locus for a region that includes two PREs, and later studies showed that some insulators and a promoter region also possess homing activity. The self-interaction implied by homing suggests that these elements may facilitate homolog pairing. However, in contrast to PREs and promoters, insulators have consistently been found to engage in specific self-interactions. Thus, among the known elements in the fly genome, insulators are the best candidates to align homologs in register and maintain pairing. Moreover, genome-wide chromatin immunoprecipitation experiments (ChIPs) show that insulators are distributed at appropriate intervals along each chromosome (Fujioka, 2016).
A role in homolog pairing was first suggested by the discovery that the su(Hw) and Mcp insulators each can mediate regulatory interactions between transgenes inserted at distant sites. The Fab-7 insulator can also mediate long-range regulatory effects. Further evidence that self-association is characteristic of fly insulators came from insulator bypass experiments. These experiments showed that bypass is observed when an insulator is paired with itself, while heterologous combinations are less effective or don't give bypass. Moreover, self-pairing is, with few exceptions, head-to-head (Fujioka, 2016).
That insulators mediate homolog pairing through specific self-interactions is further supported by the current studies. Using a classical transvection assay, this study found that Homie-Homie and Nhomie-Nhomie combinations stimulate trans-regulatory interactions between enhancers on one homolog and a reporter on the other. Moreover, the parameters that favor transvection dovetail with those expected for a pairing mechanism based on insulator self-interactions in trans. First, the two insulators must be in the same orientation. When they are in opposite orientations, transvection is not enhanced (or enhancement is much weaker). Second, the enhancers and reporter must be located on the same side (centromere proximal or distal) of the insulators. In addition to transvection, Homie and Nhomie also engage in highly specific and directional distant regulatory interactions (Fujioka, 2016).
While there is compelling evidence that insulator self-interactions are responsible for homolog pairing, many issues remain unresolved. Perhaps the most important is the nature of the code used for self-recognition and orientation. The best hint comes from bypass experiments using multimerized binding sites for Su(Hw), dCTCF, or Zw5. Homologous multimer combinations give bypass, while heterologous combinations do not. However, bypass is observed for composite multimers when they are inserted in opposite orientations (e.g., Su(Hw) dCTCF >-< dCTCF Su(Hw)), but not the same orientation (e.g., Su(Hw) dCTCF >> Su(Hw) dCTCF). These findings argue that the identity and order of proteins bound to the insulator determine its self-association properties (Fujioka, 2016).
The first direct evidence that insulators generate loops came from 3C experiments on the mouse β-globin and the fly 87A7 heat shock loci. These studies suggested that physical interactions between adjacent insulators in cis could subdivide chromosomes into looped domains. Subsequent work has confirmed this conclusion (Rao, 2014). However, while these experiments demonstrate that cis insulator interactions generate loops, they provided no information about the topology of these loops, or how they are arranged (Fujioka, 2016).
Cis interactions could, a priori, be either head-to-head like self-association in trans, or head-to-tail. The consequences are quite different. Head-to-head interactions generate a circle-loop, while head-to-tail interactions generate a stem-loop. If heterologous insulators interact with only one specific partner, the circle-loop or the stem-loop will be linked to neighboring circles or stem-loops by loops without anchors. These unanchored loops would correspond to the main axis of the chromosome, and the circle-loops or stem-loops would then protrude from the main axis in a random orientation and at distances determined by the length and compaction of the unanchored loops (Fujioka, 2016).
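The relationship between pairing orientation and loop topology stated above can be written down as a small lookup. This is only a sketch of the rule as described in the text: the function and constant names are hypothetical, and the code encodes the stated outcomes rather than deriving them from chromatin geometry.

```python
# Outcomes of insulator pairing in cis, as stated in the text above.
CIS_TOPOLOGY = {
    "head-to-head": "circle-loop",
    "head-to-tail": "stem-loop",
}

def loop_topology(pairing_orientation: str) -> str:
    """Return the loop type expected for a given cis pairing orientation."""
    try:
        return CIS_TOPOLOGY[pairing_orientation]
    except KeyError:
        raise ValueError(
            f"unknown orientation {pairing_orientation!r}; "
            "expected 'head-to-head' or 'head-to-tail'"
        ) from None

# Example: Homie-Nhomie pairing in cis is reported to be head-to-tail.
eve_topology = loop_topology("head-to-tail")
```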
On the other hand, if insulators in a chromosomal segment are able to interact with both of their neighbors, then the main axis of the chromosome in this region would be defined by the insulators. Quite different structures are predicted for head-to-head and head-to-tail interactions. Head-to-head would give a series of variably sized circle-loops linked together at their base by an array of interacting insulators. The base would correspond to the main axis of the chromosome, and each circle-loop would extend from one side of the main axis to the other. If the direction of coiling were always the same, this would give a structure resembling a helix anchored to a rod. If the direction of coiling were random, the structure would be more complicated and variable, since neighboring circle-loops could extend out from the main axis in either the same or the opposite direction (not illustrated). The loop-axis relationship would be more regular for head-to-tail insulator pairing in cis. Adjacent stem-loops would extend out from the main axis in opposite directions much like the lampbrush chromosomes formed when haploid sperm heads are injected into amphibian oocytes. This stem-loop organization would also fit with the radial loop model proposed by Laemmli and others for the first level of folding of metaphase chromosomes (Fujioka, 2016).
Since the current experiments show that Homie-Nhomie association is head-to-tail, the topology of the eve locus in vivo is a stem-loop, not a circle-loop. This finding raises a number of questions. Perhaps the most important is whether head-to-tail interactions are the rule rather than the exception. While the orientation dependence of homologous interactions has been extensively investigated, there have been no systematic studies on interactions between neighboring insulators. However, there are reasons to think that cis interactions are more likely head-to-tail than head-to-head. One is homolog pairing. As mentioned above, the circle-loops formed by head-to-head interactions can coil in either direction, either left-handed or right-handed. If coiling were random, then about half of the circle-loops on each homolog would be coiled in opposite directions. In this case, head-to-head pairing of homologous insulators in each homolog would generate a structure in which the circle-loops would point in opposite directions. This topology would not be compatible with transvection. Coiling of the circle-loops in the same direction on both homologs would permit interdigitation of one circle-loop inside the other; however, the chromatin fiber from the inside circle-loop would need to cross in on one side and out on the other. If the main axis of the chromosome in the paired region is defined by a series of interacting insulators in cis, then generating a topology permissive for transvection (not illustrated) would require coiling of successive homologous circle-loops on each homolog in the same direction, one inside the other (Fujioka, 2016).
These topological issues aren't encountered when heterologous insulator interactions in cis are head-to-tail. Head-to-head pairing of homologous insulators in trans would bring regulatory elements and genes in the two homologous stem-loops into close proximity. Alignment of the two homologs is straightforward whether or not the main axis of the chromosome is defined by a series of interacting insulators. Alternating loops extending upwards and downwards from the main axis of the chromosome would be directly aligned when homologous insulators pair head-to-head in trans (Fujioka, 2016).
While the requirements for aligning and pairing homologs would appear to favor stem-loops between heterologous insulators in cis in flies, homolog pairing does not occur in vertebrates except in specialized cell types. This could mean that circle-loops formed by cis interactions between heterologous insulators are permissible in vertebrate chromosomes. However, even in organisms in which homolog pairing doesn't occur in somatic cells, it seems possible that cis-pairing interactions more commonly generate stem-loops than circle-loops (see Chromosome architecture: pairing head-to-head and head-to-tail in cis). First, following DNA replication and before mitosis (during the S and G2 phases of the cell cycle), sister chromatids are aligned. Maintaining this alignment may facilitate epigenetic mechanisms that template chromatin structures from one cellular generation to the next, such as the copying of histone modifications onto both daughter chromosomes. The simpler topology of stem-loops could facilitate this sister chromatid pairing, as well as their separation during mitosis. Second, recent studies on the relationship between loop domains and CTCF insulators showed that in more than 90% of the cases, the CTCF binding sites on opposite ends of a loop are in opposite orientation. Thus, assuming that the orientation of pairing is such that the CTCF sites are aligned in parallel to form the loop, pairing between CTCF insulators at the ends of the loop would generate stem-loops rather than circle-loops. If insulators form the main axis of the chromosome, there is an additional explanation for such a bias. Head-to-head pairing in cis could generate a series of circular loops that extend out from the same side of the main axis. This configuration would be favorable for crosstalk between regulatory elements and genes in adjacent loops. 
By contrast, head-to-tail pairing, where adjacent stem-loops extend out in opposite directions, would disfavor crosstalk, helping to explain how insulators block enhancer-promoter communication between adjacent loops (Fujioka, 2016).
Genome organization involves cis and trans chromosomal interactions, both implicated in gene regulation, development, and disease. This study focused on trans interactions in Drosophila, where homologous chromosomes are paired in somatic cells from embryogenesis through adulthood. First, long-standing questions regarding the structure of embryonic homolog pairing were addressed; to this end, a haplotype-resolved Hi-C approach was developed to minimize homolog misassignment and thus robustly distinguish trans-homolog from cis contacts. This computational approach, called Ohm, reveals pairing to be surprisingly structured genome-wide, with trans-homolog domains, compartments, and interaction peaks, many coinciding with analogous cis features. A significant genome-wide correlation was found between pairing, transcription during zygotic genome activation, and binding of the pioneer factor Zelda. The findings reveal a complex, highly structured organization underlying homolog pairing, first discovered a century ago in Drosophila. Finally, the versatility of this haplotype-resolved approach was demonstrated by applying it to mammalian embryos (Erceg, 2019).
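The general idea of separating trans-homolog from cis contacts in haplotype-resolved Hi-C can be sketched as follows. This is not the Ohm pipeline: it is a minimal illustration that assumes each read end has already been assigned a chromosome and, where a parent-specific variant was covered, a haplotype label. The type, field, and label names ("mat", "pat") are hypothetical.

```python
from collections import Counter
from typing import NamedTuple, Optional

class ReadEnd(NamedTuple):
    chrom: str
    haplotype: Optional[str]  # "mat", "pat", or None if no variant was covered

def classify_contact(a: ReadEnd, b: ReadEnd) -> str:
    """Label one Hi-C read pair by chromosome and haplotype of its two ends."""
    if a.haplotype is None or b.haplotype is None:
        return "ambiguous"          # cannot tell cis from trans-homolog
    if a.chrom != b.chrom:
        return "trans-heterolog"    # different chromosomes
    if a.haplotype == b.haplotype:
        return "cis"                # same chromosome, same homolog
    return "trans-homolog"          # same chromosome, different homologs

# Illustrative read pairs only.
pairs = [
    (ReadEnd("2L", "mat"), ReadEnd("2L", "mat")),
    (ReadEnd("2L", "mat"), ReadEnd("2L", "pat")),
    (ReadEnd("2L", "pat"), ReadEnd("3R", "pat")),
    (ReadEnd("X", None), ReadEnd("X", "mat")),
]
counts = Counter(classify_contact(a, b) for a, b in pairs)
```

Discarding pairs with an unassigned end is one conservative way to limit homolog misassignment, at the cost of using fewer reads.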
Trans-homolog interactions have been studied extensively in Drosophila, where homologs are paired in somatic cells and transvection is prevalent. Nevertheless, the detailed structure of pairing and its functional impact have not been thoroughly investigated. Accordingly, this study generated a diploid cell line from divergent parents and applied haplotype-resolved Hi-C, showing that homologs pair with varying precision genome-wide, in addition to establishing trans-homolog domains and compartments. This study revealed at least two forms of pairing: tight pairing, spanning contiguous small domains, and loose pairing, consisting of single larger domains. Strikingly, active genomic regions (A-type compartments, active chromatin, expressed genes) correlated with tight pairing, suggesting that pairing has a functional implication genome-wide. Finally, using RNAi and haplotype-resolved Hi-C, it was shown that disruption of pairing-promoting factors results in global changes in pairing, including the disruption of some interaction peaks (AlHaj Abed, 2019).
Metazoan chromosomes are folded into discrete sub-nuclear domains, referred to as chromosome territories (CTs). The molecular mechanisms that underlie the formation and maintenance of CTs during the cell cycle remain largely unknown. This paper reports the development of high-resolution chromosome paints to investigate CT organization in Drosophila cycling cells. Large-scale chromosome folding patterns and levels of chromosome intermixing are shown to be remarkably stable across various cell types. The data also suggest that the nucleus scales to accommodate fluctuations in chromosome size throughout the cell cycle, which limits the degree of intermixing between neighboring CTs. Finally, this study shows that the cohesin and condensin complexes are required for different scales of chromosome folding, with condensin II being especially important for the size, shape, and level of intermixing between CTs in interphase. These findings suggest that large-scale chromosome folding driven by condensin II influences the extent to which chromosomes interact, which may have direct consequences for cell-type specific genome stability (Rosin, 2018).
Metazoan genomes are arranged into a nested hierarchy of structural features, ranging from small chromatin loops to larger insulated neighborhoods or topologically associated domains (TADs). TADs are believed to direct and insulate gene regulatory networks, which can engage in long-range interactions with each other, ultimately packaging chromosomes into sub-nuclear compartments termed chromosome territories (CTs) (Rosin, 2018).
CTs are a widespread feature of nuclear organization across a variety of cell types and species, as revealed by both fluorescence in situ hybridization (FISH) and chromosome-conformation-capture (3C)-based studies. Recently, several studies have implicated the ring-shaped SMC (structural maintenance of chromosomes) complexes, cohesin and condensin, in the regulation of large-scale chromatin folding and CT formation. However, efforts to determine the contribution of each complex to local topology, large-scale chromatin folding, and chromosome individualization at single-cell resolution have been hindered by technical limitations. The consequence of CT loss during interphase also remains unclear. This is due, in part, to both the paucity of factors known to directly influence this level of organization and the difficulty in visualizing their effects at single-cell resolution. However, CT intermixing has been theorized to influence the location and frequency of translocations, and the position of a gene within and between CTs seems to influence its access to the machinery responsible for specific nuclear functions, such as transcription, splicing, and DNA repair (Rosin, 2018).
This study leveraged the flexible, scalable Oligopaint FISH technology to generate high-resolution chromosome paints covering the entire Drosophila genome. Combined with a custom 3D segmentation pipeline, these paints provided a comprehensive picture of chromosome size, shape, and position at single-cell resolution. The results show that various cell types in Drosophila harbor spatially partitioned CTs. Interestingly, widespread somatic homolog pairing in Drosophila results in homologs sharing a single CT, suggesting that homologous and heterologous chromosomes are distinguished at the cellular level in this species. Further, this study characterized the differential roles of cohesin and condensin complexes in local chromatin compaction, large-scale chromatin folding, and CT formation. Cohesin and condensin II were shown to drive different scales of chromatin folding during interphase, with condensin II being especially important for large-scale interactions and the spatial partitioning of chromosomes. These findings indicate that condensin II-driven large-scale chromatin conformations during interphase influence the extent to which chromosomes interact, which has the potential to affect gene regulation and genome stability (Rosin, 2018).
This study demonstrated that Drosophila cells harbor spatially distinct CTs, with remarkably consistent levels of intermixing in a variety of cell types and throughout the cell cycle. While the vast majority of cells showed contact between all three major chromosomes, it was possible to measure that, on average, only 40% of the Drosophila genome is intermixed (not accounting for homologous chromosomes). This is strikingly similar to the estimate of 40-46% CT intermixing in human lymphocytes, possibly indicating a widespread and conserved restraint on inter-chromosomal interactions. However, it is noted that a small population of cells does exhibit >90% overlap between neighboring CTs. The fate of these cells will be important to explore in the future (Rosin, 2018).
Further, the condensin II complex was identified as an essential factor for CT formation in cycling cells. These results are consistent with those reported on condensin in yeast, Tetrahymena, and post-mitotic polytene cells of Drosophila. These data are also in line with previous work showing that condensin II serves as an 'anti-pairing' factor that disrupts pairing interactions and separates homologous loci. Additionally, it was shown that condensin II overexpression can further compact chromosomes and reduce the level of CT intermixing. Together, these data highlight the highly conserved role of the condensin II complex in controlling the level of inter-chromosomal associations in eukaryotic cells (Rosin, 2018).
If condensin II has the capacity to spatially separate homologous and heterologous chromosomes, how does somatic pairing persist in Drosophila cells that have CTs? One possibility is that pairing interactions are established prior to CT formation and thus, homologous chromosomes would be folded in concert. This would be consistent with some persistence of homolog pairing through mitosis and suggests a model in which chromosomes are folded into CTs through post-mitotic condensin II activity. In addition, pairing interactions may require additional condensin activity to separate homologous versus heterologous interactions. Indeed, these studies showed that condensin II overexpression increases whole-chromosome unpairing in Kc167 and BG3 cells. It is speculated that interphase condensin II levels and thus inter-chromosomal associations are tightly regulated, and could be modified in a cell-type-specific manner. For instance, in contrast to virtually all other cell types in Drosophila, homologous chromosomes in germline stem cells remain unpaired throughout development. This separation between homologs could potentially reflect increased levels of condensin II activity and may indicate that inter-chromosomal associations are reduced to protect the stem-cell population from potentially deleterious rearrangements. Indeed, previous work has shown that different extents of chromosome intermixing correlate with translocation frequencies, both those occurring naturally in the human population and those induced experimentally in human and mouse lymphocytes. Therefore, an alteration in condensin II activity and subsequent CT intermixing levels has the potential to influence the location and frequency with which translocations occur. Intriguingly, mice carrying a hypomorphic allele of cap-H2 were recently shown to frequently develop T-cell lymphomas with highly rearranged chromosomes in the transformed cells. It will be important to determine whether this increased genome instability is associated with increased CT contact prior to the rearrangement event (Rosin, 2018).
When accounting for the popular model of loop extrusion and the stabilizing function of SMC complexes, condensin II activity could potentially fold whole chromosomes into a configuration that limits their interactions with the rest of the genome. While the nature of these interactions remains unknown, they are clearly distinct from cohesin-driven interactions given that cohesin depletion does not significantly change intermixing levels in Drosophila or yeast. Consistent with this hypothesis, a recent study demonstrated that depletion of the cohesin complex in mammals eliminates chromatin looping and TAD formation but does not disrupt long-range interactions between similar chromatin states, highlighting the notion that local insulation and higher-order folding must rely on distinct molecular determinants. Combined with the current findings that large-scale configurations are stable throughout the cell cycle and require condensin II activity, it is proposed that condensin II drives long-range interactions that are established early in interphase. In this model, condensin II may act as an 'organizational bookmark' by prioritizing intra-chromosomal folding immediately following mitotic exit. As condensin II is enriched at highly active regions of the genome marked by H3K4me3, its activity could potentially allow gene regulatory networks and chromatin compartments to favor intra- versus inter-chromosomal interactions. Further studies identifying the interactions driven by condensin II in relation to cohesin will be critical for understanding how these molecular machines cooperatively guide the genome through the cell cycle and development (Rosin, 2018).
Finally, this report describes an efficient and scalable method of high-resolution chromosome painting using Oligopaint FISH technology. Combined with a custom 3D-segmentation pipeline, quantitative measurements of chromosome size, shape, position, and overlap can be analyzed in a systematic and potentially high-throughput fashion. Moreover, the ability to conduct sequential rounds of hybridization with Oligopaints permits 3D analysis of many, if not all, CTs simultaneously. It is anticipated that this technology will lead to an enhanced ability to visualize and karyotype chromosomes in a number of systems, providing a novel battery of assays to better characterize how chromatin is packaged and spatially partitioned in the nucleus (Rosin, 2018).
Interaction domains in Drosophila chromosomes form by segregation of active and inactive chromatin in the absence of CTCF loops, but the role of transcription versus other architectural proteins in chromatin organization is unclear. This study finds that positioning of RNAPII via transcription elongation is essential in the formation of gene loops, which in turn interact to form compartmental domains. Inhibition of transcription elongation or depletion of cohesin decreases gene looping and formation of active compartmental domains. In contrast, depletion of condensin II, which also localizes to active chromatin, causes increased gene looping, formation of compartmental domains, and stronger intra-chromosomal compartmental interactions. Condensin II has a similar role in maintaining inter-chromosomal interactions responsible for pairing between homologous chromosomes, whereas inhibition of transcription elongation or cohesin depletion has little effect on homolog pairing. The results suggest distinct roles for cohesin and condensin II in the establishment of 3D nuclear organization in Drosophila (Rowley, 2019).
Inter- and intra-chromosomal interactions among DNA-bound proteins establish patterns of chromatin organization detectable by Hi-C. The original low-resolution genome-wide Hi-C maps described the segregation of active and inactive chromatin into A and B compartments. Later, higher-resolution maps identified domains characterized by preferential intra- versus inter-domain contacts. Interaction domains have been described in different organisms and are commonly referred to as topologically associating domains (TADs). In addition to these features, intense point-to-point loops have been detected by high-resolution Hi-C in mammals. The anchors of these loops are enriched in CTCF and cohesin, and predominantly contain CTCF motifs in convergent orientation (Rowley, 2019).
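The notion of domains defined by "preferential intra- versus inter-domain contacts" can be made concrete with a small sketch. It assumes a hypothetical symmetric contact matrix over genomic bins and a list of non-overlapping domain intervals, and it reports the ratio of mean intra-domain to mean inter-domain contact. The function name and the toy numbers are assumptions for illustration, and real Hi-C analyses normalize for sequencing bias and genomic distance, which this sketch does not.

```python
from typing import Dict, List, Sequence, Tuple


def domain_contact_ratio(matrix: Sequence[Sequence[float]],
                         domains: Sequence[Tuple[int, int]]) -> float:
    """Mean intra-domain contact divided by mean inter-domain contact.

    `domains` holds half-open (start_bin, end_bin) ranges. Only bin pairs
    i < j that both fall inside some domain are counted.
    """
    label: Dict[int, int] = {}
    for d, (start, end) in enumerate(domains):
        for i in range(start, end):
            label[i] = d

    intra: List[float] = []
    inter: List[float] = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            if i not in label or j not in label:
                continue
            if label[i] == label[j]:
                intra.append(matrix[i][j])
            else:
                inter.append(matrix[i][j])

    if not intra or not inter:
        raise ValueError("need at least one intra- and one inter-domain pair")
    mean_inter = sum(inter) / len(inter)
    if mean_inter == 0:
        raise ValueError("mean inter-domain contact is zero")
    return (sum(intra) / len(intra)) / mean_inter


# Toy 4-bin map with two 2-bin domains: strong contacts within, weak between.
toy_map = [
    [0, 8, 2, 2],
    [8, 0, 2, 2],
    [2, 2, 0, 8],
    [2, 2, 8, 0],
]
print(domain_contact_ratio(toy_map, [(0, 2), (2, 4)]))
```

A ratio well above 1 indicates that the chosen intervals behave like interaction domains in the toy map.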
CTCF loops are an important component of chromatin organization in vertebrates, yet plants and invertebrates either lack a homolog or CTCF does not appear to form stable loops. Instead, chromosomal domains in these organisms, including Drosophila, correspond to the transcriptional state of specific sequences in the genome. Borders between these domains form at discontinuities between active and inactive regions containing proteins and histone modifications characteristic of their transcriptional state. This pattern of 3D organization is similar to that observed in mammals after depletion of CTCF or Rad21 and has been studied in detail in Drosophila, where analyses of high-resolution Hi-C data show that chromatin is predominately organized by the fine-scale segregation of active and inactive chromatin into A and B compartmental domains (Rowley, 2017). Indeed, transcriptional state alone can be used to computationally simulate the experimental Hi-C interaction pattern at 1-kb resolution with great accuracy (Rowley, 2017). In further support for a role of transcription or factors associated with the transcriptional state of genes in chromatin organization, inhibition of transcription initiation and subsequent degradation of RNA polymerase II (RNAPII) using triptolide disrupts Drosophila compartmental domains and their interactions. Interestingly, the extent of disruption of 3D organization correlates with the levels of RNAPII after triptolide treatment. Drosophila Hi-C maps also show a few hundred punctate signals corresponding to specific point-to-point interactions, but these loops are not associated with CTCF. Instead, the loop anchors are enriched for developmental enhancers, Pc, and Rad21. It is unclear whether these Pc loops are formed by cohesin-mediated loop extrusion as it has been proposed for CTCF loops in mammals (Rowley, 2019).
In addition to inter- and intra-chromosomal interactions, Drosophila chromosomes participate in extensive pairing with their homologs. Pairing between homologs is responsible for the transvection phenomenon, which involves interactions between enhancers and promoters of genes located in two homologous chromosomes. Analysis of the extent of this pairing typically makes use of fluorescence in situ hybridization (FISH) probes hundreds of kilobases long, making it difficult to determine whether pairing occurs at discrete loci or in large regions. Several proteins have been shown to affect homolog pairing including condensin II, the levels of which are regulated by the SCFSlimb ubiquitin ligase. Depletion of Slimb increases levels of condensin II and decreases homolog pairing, while depletion of condensin II increases homolog pairing, suggesting that condensin II antagonizes chromosome pairing. While the role of condensin II in this aspect of nuclear organization is well known, its relationship to other aspects of chromosome organization is largely unexplored (Rowley, 2019).
This study examined the contribution of condensin II, cohesin, and the distribution of RNAPII to the establishment of various features of Drosophila 3D chromatin organization. Furthermore, analysis of homologous pairing interactions using Hi-C data suggests that pairing occurs at discrete loci with an average length of 6.4 kb enriched for architectural proteins. The results highlight the importance and distinct roles of RNAPII or other components of the transcription complex, cohesin, and condensin II in the establishment of nuclear organization (Rowley, 2019).
These results support a model of chromatin organization where RNAPII and cohesin promote interactions within genes to create small gene domains. Interactions between adjacent gene domains result in the formation of active compartmental domains, and interactions among these domains give rise to the characteristic plaid pattern of Hi-C heatmaps often referred to as the A compartment. The frequency of interactions within and between genes and A compartmental domains correlates with the amount of RNAPII and cohesin, which co-localize extensively in the genome. Because of this, the allocation of a specific sequence to the A compartment should not be done in absolute terms. Rather, sequences in the A compartment have different positive eigenvector values that correlate with the amount of RNAPII and cohesin. Contiguous sequences lacking RNAPII and cohesin have a negative eigenvector value and form B compartmental domains. Interactions among B compartmental domains in Drosophila are more infrequent compared to those among A compartmental domains, that is, the plaid pattern of Hi-C heatmaps in Drosophila arises in large part due to interactions between A compartments. However, sequences within B compartmental domains interact as frequently as those located in A domains. These interactions may arise as a consequence of proteins present in silenced genes. Alternatively, or in addition, interactions within B compartmental domains may result from interactions between adjacent A domains, which enclose B domains within loops similar to those formed by CTCF/cohesin in vertebrates. This is supported by results showing that inhibition of transcription initiation with triptolide or using the heat shock response, which result in the loss of A compartmental domains, also result in decreased interaction frequencies within B domains (Rowley, 2019).
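The relationship between eigenvector sign and compartmental domains described above can be sketched as follows. The code assumes a per-bin eigenvector has already been computed from a Hi-C correlation matrix (not shown) and simply merges contiguous bins of the same sign into domains. The function name, the toy values, and the choice to treat a value of exactly zero as B are illustrative assumptions, not the procedure of Rowley (2019).

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def compartmental_domains(eigenvector: Sequence[float]) -> List[Tuple[int, int, str]]:
    """Merge contiguous bins with the same eigenvector sign into domains.

    Returns half-open (start_bin, end_bin, label) tuples, where positive
    values are labeled 'A' and all other values 'B' (an assumption).
    """
    labels = ["A" if value > 0 else "B" for value in eigenvector]
    domains: List[Tuple[int, int, str]] = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        domains.append((start, start + length, label))
        start += length
    return domains


# Toy eigenvector: magnitudes vary within A, as the text notes for RNAPII/cohesin levels.
toy_eigenvector = [0.8, 0.3, -0.2, -0.5, -0.1, 0.6, 0.1]
print(compartmental_domains(toy_eigenvector))
```

The sketch keeps only the sign, whereas the text stresses that the magnitude of positive values also carries information about RNAPII and cohesin levels.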
These findings suggest that, whereas interaction frequency of sequences in active genes correlates with transcription elongation, it is likely that the presence of RNAPII, or other components of the transcription/elongation complexes, is a better candidate to explain the correlation between transcription and 3D organization. Inhibition of transcription results in dramatic changes to chromatin domains in Drosophila, yet transcription inhibition was reported to have little effect in mammalian embryonic nuclei. It is speculated that transcription inhibition studies in mammalian cells could be affected by the prevalence of CTCF loop domains. These loops may tether chromatin together such that inhibition of transcription for short periods of time is insufficient to disrupt chromatin organization. Meanwhile, in organisms that lack CTCF loops, such as Drosophila and prokaryotes, the larger effect of transcription inhibition may be due to the lack of point-to-point chromatin tethering by CTCF loops. It would be interesting to analyze whether absence of transcription or depletion of RNAPII with inhibitors such as triptolide have a stronger effect in cells depleted of CTCF (Rowley, 2019).
Previous results have shown a role for condensin II in chromatin structure during interphase. Condensin II colocalizes extensively with Drosophila architectural proteins, but in spite of the similar distribution, some observations suggest a distinct role for Cap-H2 in chromatin biology with respect to other architectural proteins. For example, all architectural proteins, including Rad21, are re-distributed during the heat shock response and they accumulate at enhancer sequences. However, the amount of enhancer-bound Cap-H2 and the number of occupied enhancers decrease after temperature stress. These observations may be explained by the opposing roles that condensin II and cohesin play in mediating intra-chromosomal interactions. Condensin II is present in active chromatin but it antagonizes the formation of gene domains and A compartmental domains, and condensin II depletion results in an increase in long-range A-A compartmental interactions. These results are in line with recent observations indicating that chromosome volume, as detected by Oligopaint, increases in Cap-H2 knockdown Drosophila cells (Rosin, 2018). The mechanism by which these two SMC motors play opposing roles in chromatin interactions is unclear. Presumably, their function in chromatin 3D organization is related to their ability to extrude loops, as was proposed for cohesin in mammals. Condensin has also been shown to extrude loops in vitro (Ganji, 2018), and it would be interesting to understand whether its role, opposite to that of cohesin, is based on different potential extrusion mechanisms between these two complexes. Thus, condensin II could antagonize cohesin interactions by directly inhibiting these same interactions or by promoting different interactions (Rowley, 2019).
Drosophila chromosomes participate in extensive homologous chromosome pairing, but the details of the mechanisms underlying this phenomenon are not well understood. Analysis of Hi-C data supports a button model of pairing, where the buttons are short pairing sites likely corresponding to binding sites for specific proteins, rather than large domains. These pairing sites are enriched in architectural proteins, including Rad21 and Cap-H2. Although depletion of Rad21 alone has no effect on pairing, it is possible that some architectural proteins may promote pairing while others act as anti-pairers, as is the case for Cap-H2. The general antagonistic role of condensin II in the establishment of interactions between homologs as well as short- and long-range intra-chromosomal contacts suggests common mechanisms responsible for these apparently different phenomena (Rowley, 2019).
In the nucleus, chromatin is intricately structured into multiple layers of 3D organization important for genome activity. How distinct layers influence each other is not well understood. In particular, the contribution of chromosome pairing to 3D chromatin organization has been largely neglected. This study addresses this question in Drosophila, an organism that shows robust chromosome pairing in interphasic somatic cells. The extent of chromosome pairing depends on the balance between pairing and anti-pairing factors, with the anti-pairing activity of the CAP-H2 condensin II subunit being the best documented. This study identifies the zinc-finger protein Z4 as a strong anti-pairer that interacts with and mediates the chromatin binding of CAP-H2. It is also reported that hyperosmotic cellular stress induces fast and reversible chromosome unpairing that depends on Z4/CAP-H2. Most importantly, by combining Z4 depletion and osmostress, this study shows that chromosome pairing reinforces intrachromosomal 3D interactions. On the one hand, pairing facilitates RNAPII occupancy that correlates with enhanced intragenic gene-loop interactions. In addition, acting at a distance, pairing reinforces chromatin-loop interactions mediated by Polycomb (Pc). In contrast, chromosome pairing does not affect which genomic intervals segregate to active (A) and inactive (B) compartments, with only minimal effects on the strength of A-A compartmental interactions. Altogether, these results unveil the intimate interplay between inter-chromosomal and intra-chromosomal 3D interactions, unraveling the interwoven relationship between different layers of chromatin organization and the essential contribution of chromosome pairing (Puerto, 2023).
High-throughput assays of three-dimensional interactions of chromosomes have shed considerable light on the structure of animal chromatin. Despite this progress, the precise physical nature of observed structures and the forces that govern their establishment remain poorly understood. This study presents high resolution Hi-C data from early Drosophila embryos. Boundaries between topological domains of various sizes were shown to map to DNA elements that resemble classical insulator elements: short genomic regions sensitive to DNase digestion that are strongly bound by known insulator proteins and are frequently located between divergent promoters. Further, a striking correspondence was shown between these elements and the locations of mapped polytene interband regions. It is likely this relationship between insulators, topological boundaries, and polytene interbands extends across the genome, and a model is proposed in which decompaction of boundary-insulator-interband regions drives the organization of interphase chromosomes by creating stable physical separation between adjacent domains (Stadler, 2017).
Several Hi-C studies in flies have identified enrichments of insulator proteins at topologically associating domain (TAD) boundaries. These studies varied in their resolution (due to use of 4- vs. 6-cutter enzymes and sequencing depth), methods (solution vs. in situ Hi-C), and, critically, in the methods used to identify TAD boundaries. As a result, each study relied on distinct sets of boundaries for analyses of the molecular features of these structures. This study explored several methods to identify topological domains and associated boundaries and found that no single approach was sufficient to exhaustively identify all of these features in the genome. Rather, by using a combination of visual inspection of Hi-C maps at a large number of loci, unbiased hand-calling, and computational searches, a very close, two-way association was consistently observed between sites of combinatorial insulator protein binding (insulators) and the boundaries between topological domains. This result supports prior studies which found enriched insulator protein binding at topological boundaries, and extends this finding by localizing boundaries to discrete insulator elements. Hi-C data are exceptionally complex and reveal many layers of genomic organization, and it is suspected that many questions in this field will only be resolved by the combined work of multiple groups using distinct analysis strategies and techniques (Stadler, 2017).
The most intriguing finding of this study is the association of TAD boundaries with polytene interbands. The implication that these elements are decompacted, extended chromatin regions provides an attractive model in which simple physical separation explains multiple activities associated with insulators, including the ability to block enhancer-promoter interactions, prevent the spread of silenced chromatin, and organize chromatin structure (Stadler, 2017).
A number of prior observations are consistent with the identity of insulators/boundaries as interbands. First, estimates suggest that there are ~5000 interbands constituting 5% of genomic DNA, with an average length of 2 kb, numbers that are in line with the current estimates of boundary element length and number. Second, interbands are associated with insulator proteins, with CP190 appearing to be a constitutive feature of all or nearly all interbands, which is precisely what was observed for boundary elements. Third, interbands and boundary elements are highly sensitive to DNase digestion. Fourth, interbands have been shown to contain the promoters and 5' ends of genes, and this study shows a strong enrichment for promoters oriented to transcribe away from boundaries, which would place upstream regulatory elements within or near the interband. Finally, deletion of both isoforms of BEAF-32, the second-most highly enriched insulator protein at boundary elements, results in polytene X chromosomes that exhibit loss of banding and are wider and shorter than wild type, consistent with a loss of decompacted BEAF-32-bound regions. It is possible that interbands in polytene chromosomes result from multiple underlying molecular phenomena, but it is likely that decompacted insulator elements constitute a significant fraction of these structures (Stadler, 2017).
While frequent looping of insulators has not been seen in Hi-C data from fly tissue, the current model of chromatin compaction at insulators is not mutually exclusive with a role for looping in the function of some insulators. Indeed, a limited set of cases was seen in which interactions between boundaries seem to organize special genome structures with, at least in the case of the Scr locus, clear functional implications. It is likely that additional boundary-associated distal interactions will be found in other tissues and stages of fly development. However, it is emphasized that these interactions are rare and do not appear to be general features of the function of boundary elements (Stadler, 2017).
The data presented in this study offer a picture of the structure of the interphase chromatin of Drosophila that attempts to unify years of studies of polytene chromosomes with modern genomic methods. In this picture, interphase chromatin consists of alternating stretches of compacted, folded chromatin domains separated by regions of decompacted, stretched regions. The compacted regions vary in size from a few to hundreds of kilobases and correspond to both polytene band regions and TADs in Hi-C data. Decompacted regions that separate these domains are short DNA elements that are defined by the strong binding of insulator proteins and correspond to polytene interbands and TAD boundaries (insulators). An intuitive view of this structure in a non-polytene context might resemble the well-worn 'beads on a string', in which insulator/interband regions are the string and bands/TADs form beads of various sizes. Future work, including experimental manipulation of the sequences underlying these structures, will focus on validating and refining this model, exploring how it fits into hierarchical levels of genome organization, and understanding its implications for genome function (Stadler, 2017).
The Polycomb group (PcG) proteins are key conserved regulators of development, initially discovered in Drosophila and now strongly implicated in human disease. Nevertheless, differing silencing properties between the Drosophila and mammalian PcG systems have been observed. While specific DNA targeting sites for PcG proteins called Polycomb response elements (PREs) have been identified only in Drosophila, involvement of non-coding RNAs for PcG targeting has been favored in mammals. Another difference lies in the distribution patterns of PcG proteins. In mouse and human cells, PcG proteins show broad distributions, significantly overlapping with H3K27me3 domains. In contrast, only sharp peaks on PRE regions are observed for most PcG proteins in Drosophila, raising the question of how large domains of H3K27me3, up to many tens of kilobases, are formed and maintained in Drosophila. This study provides evidence that PcG distributions on silent chromatin in Drosophila are considerably broader than previously detected. Using BioTAP-XL, a chromatin crosslinking and tandem affinity purification approach, a broad, rather than PRE-limited overlap of PcG proteins with H3K27me3 was found, suggesting a conserved spreading mechanism for PcG in flies and mammals (Jung, 2016).
Epigenetic inheritance models posit that during Polycomb repression, Polycomb Repressive Complex 2 (PRC2) propagates histone H3K27 tri-methylation (H3K27me3) independently of DNA sequence. This study shows that insertion of Polycomb Response Element (PRE) DNA into the Drosophila genome creates extended domains of H3K27me3-modified nucleosomes in the flanking chromatin and causes repression of a linked reporter gene. After excision of PRE DNA, H3K27me3 nucleosomes become diluted with each round of DNA replication and reporter gene repression is lost, whereas in replication-stalled cells, H3K27me3 levels stay high and repression persists. Hence, H3K27me3-marked nucleosomes provide a memory of repression that is transmitted in a sequence-independent manner to daughter strand DNA during replication. In contrast, propagation of H3K27 tri-methylation to newly incorporated nucleosomes requires sequence-specific targeting of PRC2 to PRE DNA (Laprell, 2017).
The ability of certain histone-modifying enzymes to bind to the modification they generated has led to models where such enzymes might propagate modified chromatin domains by a positive feedback loop, independently of the underlying DNA sequence. Two paradigms of chromatin states have been proposed to be maintained by such an epigenetic inheritance mechanism: constitutive heterochromatin with histone H3 lysine 9 di- and tri-methylation (H3K9me2/3) generated by Suv39/Clr4 enzymes, and Polycomb-repressed chromatin marked with H3K27me3 by PRC2. In both chromatin states, these histone modifications are essential for repressing gene transcription. To date, there is compelling evidence that H3K9me2/3- and H3K27me3-modified nucleosomes are transmitted to daughter strand DNA during replication. However, the steps required to propagate these modifications are much less understood. Fission yeast Clr4 has the capacity to propagate ectopically induced H3K9me2/3 domains over many cell divisions by an H3K9me2/3-based positive feedback loop but only in cells mutated for H3K9me2/3 demethylase activity. In the case of PRC2, allosteric activation of the enzyme induced by binding to H3K27me3 has been proposed to be the foundation for propagating H3K27me3 chromatin. In mammalian cells, transient DNA-tethering of PRC2 generates short ectopic H3K27me3 domains that were at least partially maintained for several cell divisions after release of DNA-tethered PRC2. However, in Drosophila, where PRC2 and other Polycomb group (PcG) protein complexes are targeted to PREs, repression imposed by insertion of PRE DNA next to a reporter gene was lost upon excision of PRE DNA. This study investigated how insertion and excision of PRE DNA at ectopic sites in Drosophila affects binding of PcG proteins and H3K27me3 at the molecular level (Laprell, 2017).
Two previously described strains were analyzed that each carried a single copy of the >PRE>dppWE-TZ reporter gene, integrated at different chromosomal locations. >PRE>dppWE-TZ contains a 1.6 kilobase (kb) DNA fragment of the bxd PRE from the HOX gene Ultrabithorax (Ubx), flanked by FRT recombination sites (>PRE>) to permit excision of PRE DNA by Flp-mediated recombination. Adjacent to the >PRE> cassette, the construct contains a reporter gene comprising the wing imaginal disc enhancer from decapentaplegic (dpp) (E), linked to the hsp70 TATA box minimal promoter (T) and LacZ sequences encoding β-galactosidase (Z). In the presence of the >PRE> cassette, the transgene was silenced and no β-galactosidase activity could be detected in wing imaginal discs of >PRE>dppWE-TZ transgenic animals. In contrast, >dppWE-TZ transgenic animals, generated by excision of the >PRE> cassette in the germ line, showed strong β-galactosidase expression in the characteristic pattern driven by the dpp enhancer. The observation that silencing of the intact >PRE>dppWE-TZ reporter gene is lost in mutants lacking PRC2 function prompted determination of the H3K27 methylation profile and binding of PcG proteins across the transgene. In both lines, the transgene had inserted into a genomic location normally devoid of H3K27me3 and PcG protein binding. Chromatin immunoprecipitation (ChIP) assays were performed on batches of wing imaginal discs from >PRE>dppWE-TZ and the corresponding >dppWE-TZ transgenic animals, and the immunoprecipitates were analyzed by quantitative real-time PCR (qPCR). For qPCR, primer pairs were used that selectively amplified transgene sequences and sequences in the genomic regions flanking the transgene insert. As controls, primer pairs were used amplifying sequences at the endogenous bx PRE in Ubx that are known to be bound by PcG proteins (C2) or enriched for H3K27me3 (C1 and C3) and at two regions elsewhere in the genome (C4 and C5) without PcG protein binding or H3K27me3 (Laprell, 2017).
The PRC1 subunits Polycomb (Pc) and Polyhomeotic (Ph) and the PRC2 subunit E(z) were specifically enriched at the transgene PRE in animals carrying >PRE>dppWE-TZ and, as expected, no binding was detected in >dppWE-TZ animals. In both >PRE>dppWE-TZ transgenic lines, H3K27me3 was present at high levels across a domain that extended about 4-5 kb to either side of the >PRE> cassette, spanning almost the entire construct. No enrichment of H3K27me3 was detectable at the >dppWE-TZ transgene. At >PRE>dppWE-TZ, PRC2 thus tri-methylates H3K27 across a chromatin interval that spans about 8-10 kb (Laprell, 2017).
To estimate to what extent nucleosomes at the >PRE>dppWE-TZ transgene are tri-methylated at H3K27, the H3K27me2 profile was determined. H3K27me2 levels across the >PRE>dppWE-TZ transgene were much lower than at C4 and C5 and comparable to the levels at Ubx (regions C1-C3) that is repressed and predominantly tri-methylated at H3K27 in wing imaginal discs. Conversely, across >dppWE-TZ, H3K27me2 levels were much higher and comparable to those seen at C4 and C5. This suggests that the nucleosomes across the >PRE>dppWE-TZ transgene are predominantly tri-methylated at H3K27 (Laprell, 2017).
Excision of the >PRE> cassette from >PRE>dppWE-TZ transgenic animals by heat-shock induced expression of Flp during larval development results in appearance of β-galactosidase expression in the dpp pattern 12 hours after the heat shock. Efficiency of PRE excision was measured and it was found that 8 hours after a single 1-hour heat shock, excision had occurred in about 95% of wing imaginal disc cells. The delayed increase of β-galactosidase expression over time suggests a gradual rather than abrupt loss of repression. ChIP analyses were performed on chromatin prepared from batches of entire wing imaginal discs dissected from >PRE>dppWE-TZ transgenic animals 12, 32 or 56 hours after Flp-induction. This allowed monitoring the consequences of PRE excision in cells that had undergone at least one (+12 hours), at least two (+32 hours), or more than four (+56 hours) cell divisions. 12 hours after Flp-induction, H3K27me3 levels were at least two-fold reduced across the entire transgene and further reduced by at least two-fold at the 32 hour time point. 56 hours after Flp-induction, H3K27me3 levels across the transgene were nearly as low as in >dppWE-TZ animals derived from >dppWE-TZ germ cells. The histone H3 profile was unaltered at all time points, suggesting that PRE excision does not cause global disruption of nucleosome occupancy across the transgene. The loss of H3K27me3 after PRE excision suggests that PRC2 is unable to propagate H3K27me3 across the >dppWE-TZ transgene in the absence of PRE DNA (Laprell, 2017).
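The roughly two-fold drops in H3K27me3 per time point are what a simple dilution model would predict. The sketch below assumes, as a simplification, that parental H3K27me3 nucleosomes are split evenly between daughter strands at each replication and that no new H3K27me3 is deposited once the PRE is excised. The function name and the starting level of 1.0 are hypothetical, and the model is an illustration rather than an analysis from Laprell (2017).

```python
def h3k27me3_after_divisions(initial_level: float, divisions: int) -> float:
    """Relative H3K27me3 level after a number of replication rounds.

    Assumes even partitioning of marked nucleosomes to daughter strands and
    no re-methylation of newly incorporated nucleosomes (illustrative model).
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return initial_level / (2 ** divisions)


# Relative level after 0 to 4 divisions, starting from an arbitrary level of 1.0.
for n in range(5):
    print(n, h3k27me3_after_divisions(1.0, n))
```

Under this model one division halves the signal, two divisions leave a quarter, and four leave about 6%, in line with the near-background levels reported at the latest time point. With zero divisions the level is unchanged.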
In parallel, Pc protein binding was monitored after PRE excision. Pc, unlike Ph or other PRC1 subunits, is not only bound at PREs but also associates with the chromatin flanking PREs, likely reflecting its interaction with H3K27me3-modified nucleosomes. 12 hours after PRE excision, Pc binding at the transgene was already reduced almost to background levels (Laprell, 2017).
The H3K27me3 profile at the >PRE>dppWE-UZ transgene that contains a 4.1 kb fragment of the Ubx promoter instead of the hsp70 minimal promoter was then analyzed. At >PRE>dppWE-UZ, the H3K27me3 domain spans about 12 kb and is thus about 4 kb longer than at >PRE>dppWE-TZ. Nevertheless, after PRE excision, H3K27me3 at dppWE-UZ was lost at a rate comparable to that seen at dppWE-TZ. Ubx promoter DNA thus does not enable H3K27me3 propagation. It is concluded that even at a domain that spans 12 kb and therefore comprises about 60 nucleosomes, PRC2 is unable to propagate H3K27me3 in the absence of PRE DNA (Laprell, 2017).
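The nucleosome estimate in the preceding paragraph is simple arithmetic. A minimal sketch, assuming a nucleosome repeat length of about 200 bp (a value implied by "12 kb, about 60 nucleosomes" but not stated in the text; the function and variable names are illustrative only):

```python
def nucleosome_count(domain_bp: int, repeat_length_bp: int = 200) -> int:
    """Approximate number of nucleosomes covering a chromatin domain.

    repeat_length_bp is an assumed average spacing (core DNA plus linker).
    """
    if repeat_length_bp <= 0:
        raise ValueError("repeat_length_bp must be positive")
    return round(domain_bp / repeat_length_bp)


# H3K27me3 domain at >PRE>dppWE-UZ (Ubx promoter): about 12 kb
ubx_promoter_domain = nucleosome_count(12_000)

# H3K27me3 domain at >PRE>dppWE-TZ (hsp70 minimal promoter): about 4 kb shorter
hsp70_promoter_domain = nucleosome_count(8_000)

print(ubx_promoter_domain, hsp70_promoter_domain)
```

Under this assumption, the shorter domain corresponds to roughly 40 nucleosomes and the longer one to roughly 60.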
The H3K27me3 profile and reporter gene repression were then analyzed after PRE excision in animals in which DNA replication had been blocked. Larvae were reared in liquid medium containing Aphidicolin, an inhibitor of DNA polymerases α and δ, which resulted in a complete block of DNA replication in imaginal discs. In larvae reared in Aphidicolin-containing medium, Flp-induced PRE excision from >PRE>dppWE-TZ was as efficient as under normal growth conditions, but 12 hours after excision, H3K27me3 levels at the transgene were undiminished compared to +PRE control larvae. In larvae reared in liquid medium without Aphidicolin, PRE excision resulted in the expected two-fold reduction of H3K27me3 levels after 12 hours. Together, this suggests that the loss of H3K27me3 nucleosomes after PRE excision in proliferating cells reflects their dilution as they become transmitted to DNA daughter strands during replication. Unlike under normal growth conditions, Aphidicolin-treated larvae lacked detectable β-galactosidase expression 12 hours after PRE excision. When these animals were permitted to recover in medium lacking Aphidicolin, they resumed DNA replication and began expressing β-galactosidase. If DNA replication is blocked and H3K27me3 levels stay high, repression is thus also sustained in the absence of PRE DNA, possibly by PRC1 (Laprell, 2017).
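The dilution interpretation can be illustrated with a toy calculation. The sketch below is not the authors' analysis; it assumes that parental H3K27me3 nucleosomes are split evenly between daughter strands at each replication, that no new H3K27me3 is added once the PRE is gone, and that cells which failed to excise the PRE keep full H3K27me3 levels. The 95% excision efficiency is taken from the text; the function name and parameters are hypothetical.

```python
def expected_h3k27me3(divisions: int, excised_fraction: float = 0.95) -> float:
    """Relative H3K27me3 level in a cell population after PRE excision.

    Assumes pure replicative dilution: the mark halves with every round of
    replication in cells that lost the PRE and stays at 1.0 in cells that
    retained it.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    if not 0.0 <= excised_fraction <= 1.0:
        raise ValueError("excised_fraction must lie between 0 and 1")
    retained = 1.0 - excised_fraction
    return retained + excised_fraction * 0.5 ** divisions


# 0 divisions corresponds to the replication-blocked (Aphidicolin) case
for n in (0, 1, 2, 4):
    print(n, round(expected_h3k27me3(n), 3))
```

With these assumptions the level stays at 1.0 without replication, falls to roughly half after one division and to roughly a quarter after two, which is qualitatively the pattern described for the 12-hour and 32-hour time points.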
Finally, PRE excision was induced from >PRE>dppWE-TZ in larvae that were hemizygous for UtxΔ, a null mutation in the single H3K27me3 demethylase in Drosophila. 12 hours after Flp-induction, H3K27me3 levels at the transgene were reduced about two-fold, like in wild-type animals. This suggests that demethylation of H3K27me3 by Utx does not contribute to the disappearance of H3K27me3 from transgene chromatin after PRE excision (Laprell, 2017).
These results lead to the following conclusions. First, PRE cis-regulatory DNA provides the genetic basis not only for generating but also for propagating H3K27me3-modified chromatin. This argues against a simple epigenetic model where PRC2 binding to parental H3K27me3 nucleosomes after replication would suffice to propagate H3K27 tri-methylation in daughter strand chromatin. PRC2 needs to be recruited to PRE DNA first, before allosteric activation through interaction with H3K27me3 nucleosomes in flanking chromatin may then facilitate methylation of newly incorporated nucleosomes. Secondly, following PRE excision and replication, parental H3K27me3 nucleosomes remain associated with the same underlying DNA in daughter cells and thus provide epigenetic memory. However, while in replication-stalled cells high H3K27me3 levels allow repression to be sustained even in the absence of PRE DNA, their dilution in proliferating cells is accompanied by loss of repression after one cell division. H3K27me3 nucleosomes therefore only appear to provide short-term epigenetic memory of the repressed state. Hence, DNA targeting of PRC2 after replication to replenish H3K27me3 is critical to preserve repression (Laprell, 2017).
Drosophila HOX and other large-size PcG target genes often contain multiple PREs and H3K27me3 domains that span dozens of kilobases. Deletion of single PREs from these genes typically results in only minor diminution of the H3K27me3 profile and misexpression is less severe than misexpression of the native genes in PcG mutants. Furthermore, when the same >PRE> cassette that was used in this study was excised from a Ubx-LacZ reporter gene with more extended Ubx upstream regulatory sequences, repression was lost with a longer delay, suggesting that additional elements with PRE properties in those Ubx sequences allowed repression to be sustained through more cell divisions. The evolution of PRE DNA sequences and of their frequency and arrangement within target genes may thus ultimately determine stability and heritability of H3K27me3 chromatin and PcG repression (Laprell, 2017).
O-GlcNAc Transferase (OGT/SXC) is essential for Polycomb repression suggesting that the O-GlcNAcylation of proteins plays a key role in regulating development. OGT transfers O-GlcNAc onto serine and threonine residues in intrinsically disordered domains of key transcriptional regulators; O-GlcNAcase (OGA) removes the modification. To pinpoint genomic regions that are regulated by O-GlcNAc levels, ChIP-chip and microarray analysis was performed after OGT or OGA RNAi knockdown in S2 cells. After OGA RNAi, a genome-wide increase was observed in the intensity of most O-GlcNAc-occupied regions. In contrast, O-GlcNAc levels were strikingly insensitive to OGA RNAi at sites of polycomb repression. Microarray analysis suggested that altered O-GlcNAc cycling perturbed the expression of genes associated with morphogenesis and cell cycle regulation. A viable null allele of oga (ogadel.1) was produced in Drosophila allowing visualization of altered O-GlcNAc cycling on polytene chromosomes. Trithorax (Trx), Absent small or homeotic discs 1 (Ash1) and Compass member Set1 histone methyl-transferases were O-GlcNAc-modified in ogadel.1 mutants. The ogadel.1 mutants displayed altered expression of a distinct set of cell cycle related genes. These results show that the loss of Oga in Drosophila globally impacts the epigenetic machinery allowing O-GlcNAc accumulation on RNA Polymerase II and numerous chromatin factors including Trx, Ash1 and Set1 (Akin, 2016).
Polycomb group (PcG) complexes PRC1 and PRC2 are well known for silencing specific developmental genes. PRC2 is a methyltransferase targeting histone H3K27 and producing H3K27me3, essential for stable silencing. Less well known but quantitatively much more important is the genome-wide role of PRC2 that dimethylates approximately 70% of total H3K27. H3K27me2 occurs in inverse proportion to transcriptional activity in most non-PcG target genes and intergenic regions and is governed by opposing roaming activities of PRC2 and complexes containing the H3K27 demethylase UTX. Surprisingly, loss of H3K27me2 results in global transcriptional derepression proportionally greatest in silent or weakly transcribed intergenic and genic regions and accompanied by an increase of H3K27ac and H3K4me1. H3K27me2 therefore sets a threshold that prevents random, unscheduled transcription all over the genome and even limits the activity of highly transcribed genes. PRC1-type complexes also have global roles. Unexpectedly, a pervasive distribution of histone H2A ubiquitylated at lysine 118 (H2AK118ub) was found outside of canonical PcG target regions, dependent on the RING/Sce subunit of PRC1-type complexes. It was shown, however, that H2AK118ub does not mediate the global PRC2 activity or the global repression and is predominantly produced by a new complex involving L(3)73Ah, a homolog of mammalian PCGF3 (Lee, 2016).
Mitosis brings about major changes to chromosome and nuclear structure. This study used recently developed proximity ligation assay-based techniques to investigate the association with DNA of chromatin-associated proteins and RNAs in Drosophila embryos during mitosis. All groups of tested proteins, histone-modifying and chromatin-remodeling proteins and methylated histones remained in close proximity to DNA during all phases of mitosis. RNA transcripts were found to be associated with DNA during all stages of mitosis. Reduction of H3K27me3 levels or elimination of RNAs had no effect on the association of the components of PcG and TrxG complexes to DNA. Using a combination of proximity ligation assay-based techniques and super-resolution microscopy, the number of protein-DNA and RNA-DNA foci was found to undergo significant reduction during mitosis, suggesting that mitosis may be accompanied by structural re-arrangement or compaction of specific chromatin domains (Black, 2016).
Eukaryotic DNA replicates asynchronously, with discrete genomic loci replicating during different
stages of S phase. Drosophila larval tissues undergo endoreplication without cell division, and the
latest replicating regions occasionally fail to complete endoreplication, resulting in
underreplicated domains of polytene chromosomes. This study shows that linker histone H1 is required for the underreplication (UR) phenomenon
in Drosophila salivary glands. H1 directly interacts with the
Suppressor of UR (SUUR) protein and is
required for SUUR binding to chromatin in vivo. These observations implicate H1 as a critical factor
in the formation of underreplicated regions and an upstream effector of SUUR. It was also
demonstrated that the localization of H1 in chromatin changes profoundly during the endocycle. At
the onset of endocycle S (endo-S) phase, H1 is heavily and specifically loaded into late replicating
genomic regions and is then redistributed during the course of endoreplication. The data suggest
that cell cycle-dependent chromosome occupancy of H1 is governed by several independent processes.
In addition to the ubiquitous replication-related disassembly and reassembly of chromatin, H1 is
deposited into chromatin through a novel pathway that is replication-independent, rapid, and
locus-specific. This cell cycle-directed dynamic localization of H1 in chromatin may play an
important role in the regulation of DNA replication timing (Andreyeva, 2017).
This study demonstrated that virtually all major sites of UR throughout the Drosophila genome exhibit a
substantial increase in salivary gland DNA copy number upon depletion of the linker histone H1, thus implicating H1 in the regulation of endoreplication. In control knockdown salivary glands, 46 underreplicated domains were identified. While these regions are in general agreement with previous efforts
to map underreplicated domains by less sensitive microarray analyses, fewer underreplicated sites were identified than in a recent report that used high-throughput
sequencing of salivary gland DNA (Yarosh, 2014). Notably, the underreplicated domains
that the current analyses failed to detect represent sites with the weakest degree of UR. One possible
source of variation is the distinct technical approach that was used compared with Yarosh (2014), as simultaneous sequencing of a nonpolytenized (embryonic) genome as a means to
normalize the reads from underrepresented sequences in polytenized tissues (Yarosh, 2014) likely provides additional sensitivity. Another potential explanation could lie in the
relative sequencing depth of the respective assays (approximately fourfold lower in the current study),
considered crucial for the analyses of next-generation sequencing data. However, this explanation is less likely, as subsampling of the current reads to much lower
depths yielded no appreciable difference in the number and location
of identified underreplicated sites or the change in copy number upon H1 knockdown (Andreyeva, 2017).
On average, a moderate knockdown of H1 led to an ~50% copy number gain at the center of
underreplicated domains in intercalary heterochromatin (IH; large dense bands scattered in euchromatin comprising clusters of repressed genes). The copy number is not restored to the same degree as that
in a SuUR genetic mutant. The difference is
likely attributable to the incomplete depletion of H1. In fact, in an independent biological
validation experiment that resulted in an ~95% depletion of H1, an almost complete restoration of
copy number was observed. The observation of an almost complete reversal of UR
in cells depleted of H1 (but still wild type for SuUR) strongly suggests an epistatic mechanism of
action in which both H1 and SUUR act together in the same biochemical pathway (Andreyeva, 2017).
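The distinction between a percentage copy number gain and restoration of full copy number can be made concrete with a small calculation. This is an illustrative sketch only: the copy number values are hypothetical, expressed relative to fully replicated flanking DNA (set to 1.0), and the function names are not taken from the study.

```python
def percent_gain(control_cn: float, knockdown_cn: float) -> float:
    """Copy number gain upon knockdown, as a percentage of the control value."""
    if control_cn <= 0:
        raise ValueError("control_cn must be positive")
    return 100.0 * (knockdown_cn - control_cn) / control_cn


def fraction_restored(control_cn: float, knockdown_cn: float,
                      full_cn: float = 1.0) -> float:
    """Share of the underreplication deficit that is recovered upon knockdown."""
    deficit = full_cn - control_cn
    if deficit <= 0:
        raise ValueError("region is not underreplicated in the control")
    return (knockdown_cn - control_cn) / deficit


# Hypothetical domain center: 25% of flanking copy number in the control,
# rising to 37.5% after a moderate H1 knockdown.
control, knockdown = 0.25, 0.375
print(percent_gain(control, knockdown))       # 50.0
print(fraction_restored(control, knockdown))  # about 0.17
```

In this hypothetical case a 50% gain recovers only about one sixth of the deficit, which shows why a moderate knockdown can produce a clear gain while remaining far from the near-complete reversal seen with stronger H1 depletion.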
This study found that H1 and SUUR are also involved in UR of pericentric heterochromatin (PH). For instance, both the mapped pericentric
regions and transposable element (TE) sequences, which are
highly abundant in pericentric regions, exhibit an
increase of DNA copy number upon H1 knockdown. The SuURES mutation also results in a
robust loss of UR at PH, as measured by changes in DNA copy number at TEs. The abrogation of H1
expression gives rise to a somewhat weaker effect on the UR of PH than that of IH, which is consistent with an almost complete elimination of SUUR protein from polytene
chromosome arms in salivary glands depleted of H1 by RNAi but the persistence of residual SUUR at
their PH. The role of H1 in maintaining the underreplicated state of PH may be relevant to
its important regulatory functions in constitutive heterochromatin, where it recruits Su(var)3-9,
facilitates H3K9 methylation, and maintains TEs in a transcriptionally repressed state. Recently, it was proposed that TE repression in ovarian somatic cells involves an H3K9
methylation-independent process through recruitment of H1 by Piwi-piRNA complexes, resulting in
reduced chromatin accessibility. The current results also implicate UR of TE sequences
in polytenized cells as yet another putative mechanism that contributes to regulation of their
expression. Interestingly, it was shown previously that double mutants encompassing both the
Su(var)3-9 and SuUR mutant alleles exhibit a synthetically increased predominance of novel
band-interband structures at PH compared with the mutation of SuUR alone.
While the evidence suggests a relationship between UR and transcriptionally repressive epigenetic
states, such as H3K9 methylation, the nature of this relationship remains largely speculative (Andreyeva, 2017).
This study demonstrated that SUUR
protein physically interacts with H1 in both a complex mixture of whole-cell extracts that contain
endogenous native H1 and recombinant purified H1 polypeptides. Furthermore, the particular structural domains of the two proteins that are required for the interaction were delimited. SUUR protein contains several sequence features that have
been implicated in regulation of UR and binding to specific proteins. Although SUUR possesses a
putative bromodomain, it contains no identifiable DNA-binding domain, so the
mechanism that allows SUUR to exhibit a preference for specific genomic underreplicated loci is
unknown. The positively charged central region is both necessary and sufficient to interact with
heterochromatin protein 1a (HP1a), which suggests a possible involvement of
HP1a in tethering SUUR to H3K9me2/3-rich PH. However, the specific localization of SUUR to
underreplicated IH, which is not enriched for H3K9me2/3, remains enigmatic. This study
now demonstrates that the central region of SUUR is also sufficient for binding directly to H1 in
vitro. Considering that the central region of SUUR is essential for the faithful localization of the
protein to chromatin in vivo, including underreplicated IH, it seems likely that H1 directly mediates the tethering of SUUR to chromatin in underreplicated regions (Andreyeva, 2017).
The tripartite structure of H1 provides multiple binding interfaces for interacting proteins and
thus allows H1 to mediate several biochemically separable functions in vivo. For
instance, the globular domain and proximal 25% of the C-terminal domain (CTD) are required for H1 loading into
chromatin, while the proximal 75% of the CTD is needed for normal polytene morphology, H3K9
methylation, and physical interactions with Su(var)3-9. This study discovered a previously
unknown function for the distal 25% of the H1 CTD, which is shown to be essential for binding to
SUUR. Deletion of this region of H1 results in a near-complete loss of the interaction with SUUR. Thus, in addition to its critical functions in heterochromatin structure and activity,
the CTD of H1 is likely also important in facilitating UR (Andreyeva, 2017).
One of the most
striking findings in this study is the observation that the genomic occupancy of H1 undergoes
profound changes during the endoreplication cycle. It also remains largely mutually exclusive with
that of the DNA polymerase sliding clamp PCNA, which is consistent with the observed depletion
of H1 in nascent chromatin compared with mature chromatin (Andreyeva, 2017).
H1 is heavily loaded into late replicating loci at the onset of replication (when these loci are
silent for replication). Combined, the current observations indicate that the chromosome distribution of H1
during the endocycle is governed by at least three independent processes. Two
of them [replication-dependent (RD) eviction of H1 and RD deposition of H1 after the passage of
replication fork] are related to the well-recognized obligatory processes of chromatin disassembly
and reassembly during replication. The third
pathway, which directs early deposition of H1 into late replicating loci, has not been described
previously. This process is (1) replication-independent (RI); (2) locus-specific, with a strong
preference for late replicating sites; and (3) apparently more rapid than the RD deposition of H1,
since very high levels of H1 occupancy are observed in all nuclei immediately after the initiation
of endo-S. It is possible that the RI pathway of H1 loading into chromatin is mediated by a
selective recruitment of H1 based on epigenetic core histone modification-dependent mechanisms. For
instance, mammalian H1.2 was reported to recognize H3K27me3, and this modification
is very abundant in IH (Sher et al. 2012) (Andreyeva, 2017).
Also, the RI mechanism for deposition of H1 probably does not involve de novo nucleosome assembly,
as H1 is known to exhibit a mutually exclusive distribution with RI core histone variants, and there is no known nuclear process during early S phase that requires
core histone turnover. In the future, it will be interesting to further confirm that RI nucleosome
assembly does not take place during early replication in salivary gland polytene chromosomes.
Finally, the locus-specific RI deposition of H1 in early endo-S chromatin may be conserved in the
normal S phase of diploid tissues, and it will require independent experimentation with sorted
mitotically dividing cells to confirm this possibility (Andreyeva, 2017).
This study also provides cytological evidence that
the functions of H1 and SUUR are biochemically linked. Specifically, it was demonstrated that SUUR
localizes to a subset of H1-positive bands and requires H1 for its precise distribution
in polytene chromosomes, nuclear localization, and stability in salivary gland
cells. These observations implicate H1 as an upstream effector of SUUR functions in vivo
and an essential component of the biological pathway that maintains loci of reduced ploidy in
polytenized cells. Importantly, this finding adds to a growing list of biochemical partners of H1
that mediate their chromatin-directed functions in an H1-dependent fashion (Andreyeva, 2017).
Interestingly, even a moderate depletion of H1 (to ~30% of normal) results in a complete removal of
SUUR from chromosome arms. Thus, H1-dependent localization of SUUR
requires high concentrations of the linker histone in chromatin. This conclusion is also consistent
with SUUR colocalization with polytene loci that are the most strongly stained for H1.
In contrast, elimination of the H3K9me2 mark from polytene spreads requires very extensive depletion
of H1, whereas the moderate depletion of H1 does not strongly affect H3K9
dimethylation in the chromocenter or polytene arms. Therefore, the robust
effect of even moderate H1 depletion on SUUR localization in chromatin is unlikely to be mediated
indirectly through disorganization of heterochromatin structure (Andreyeva, 2017).
Unexpectedly, the cell cycle-dependent temporal pattern of H1 localization is not identical
to that of SUUR. In contrast to H1, SUUR protein (1) is only weakly present in IH during early
endo-S phase, (2) achieves the maximal occupancy at IH loci only in the late endo-S, and (3)
colocalizes with PCNA at certain sites. The observations made in this
study and in previous works can be summarized in the following model for H1-mediated regulation of
SUUR association with chromatin. The initiation of the deposition of SUUR in
chromosomes is strongly dependent on H1. More specifically, SUUR is preferentially localized to
chromatin domains that are highly enriched for H1. For instance, the tremendously elevated
concentration of H1 in IH of early endo-S cells promotes and nucleates the initiation of deposition
of SUUR into these regions. However, the pattern of SUUR occupancy at these sites does not occur
temporally in parallel with that of H1. Initially, the exceptionally high abundance of H1 in late
replicating loci during early endo-S is not paralleled by a simultaneous comparable increase of SUUR
occupancy. Rather, loading of SUUR into these sites lags
significantly behind H1 occupancy. Thus, the rate of SUUR localization to H1-rich IH appears to be
much slower than that of the RI deposition of H1 into these loci. After the initial recruitment,
further loading of SUUR does not require H1, and SUUR continues (in a slower fashion) to accumulate
at IH throughout the endo-S phase even when H1-enriched domains dissipate in the course of DNA
endoreplication. The additional loading of SUUR in chromatin is likely facilitated by its
self-association through dimerization of the N terminus and physical
interactions with the replication fork, as proposed previously. In this
fashion, SUUR achieves its maximal concentration in IH loci by the late endo-S (Andreyeva, 2017).
This study has demonstrated that H1 has a pivotal
function in the establishment of UR of specific IH loci in polytenized salivary gland cells. The
findings that H1 interacts directly with SUUR in vitro and is required for SUUR localization to late
replicating IH in polytene chromosomes in vivo strongly suggest that the H1-mediated recruitment of
SUUR promotes UR by obstructing replication fork progression in its cognate underreplicated loci but
does not affect replication origin firing. However, the remarkable temporal
pattern of H1 distribution in endoreplicating polytene chromosomes suggests that it may also play a
direct SUUR-independent role in regulation of endoreplication. This is especially plausible
considering that the temporal distribution patterns of SUUR and H1 are dissimilar (Andreyeva, 2017).
In contrast to the role of SUUR in slowing down the replication fork progression during late endo-S
phase, H1 (acting in the absence of SUUR during early endo-S) may function to repress the initiation
of endoreplication, as proposed in several studies. DNA-seq analyses also suggest this mechanism. Compared with the relatively smooth, flat profiles of
DNA copy numbers in SuURES mutant salivary glands, the profiles in
H1-depleted cells exhibit a jagged, uneven appearance, indicative of aberrant local
initiation of replication. Unfortunately, the experimental system (cytological analyses of salivary
glands) cannot be used to further confirm this idea. First, an extensive depletion of H1 results in
the loss of polytene morphology; second, since the staging of endo-S progression is
based on PCNA staining, a spurious activation of ectopic replication origins would result in an
incorrect calling of the stage. To further complicate these analyses, polytenized cells are not
amenable to other methods of cell cycle staging, such as fluorescence-activated cell sorting (FACS).
In the future, it will be important to examine the role of H1 in regulation of DNA replication
timing in sorted Drosophila diploid cells (Andreyeva, 2017).
Post-translational modifications (PTMs) of core histones are important epigenetic determinants that correlate with functional chromatin states. This study addresses the function of linker histone PTMs in Drosophila, which encodes a single somatic linker histone, dH1. It has been reported that dH1 is dimethylated at K27 (dH1K27me2). This study shows that dH1K27me2 is a major PTM of Drosophila heterochromatin. At mitosis, dH1K27me2 accumulates at pericentromeric heterochromatin, while, in interphase, it is also detected at intercalary heterochromatin. ChIPseq experiments show that >98% of dH1K27me2 enriched regions map to heterochromatic repetitive DNA elements, including transposable elements, simple DNA repeats and satellite DNAs. Moreover, expression of a mutated dH1K27A form, which impairs dH1K27me2, alters heterochromatin organization, upregulates expression of heterochromatic transposable elements and results in the accumulation of RNA:DNA hybrids (R-loops) in heterochromatin, without affecting H3K9 methylation and HP1a binding. The pattern of dH1K27me2 is H3K9 methylation independent, as it is equally detected in flies carrying a H3K9R mutation, and is not affected by depletion of Su(var)3-9, HP1a or Su(var)4-20. Altogether these results suggest that dH1K27me2 contributes to heterochromatin organization independently of H3K9 methylation (Bernues, 2022).
Transgenerational epigenetic inheritance (TEI) describes the transmission of alternative functional states through multiple generations in the presence of the same genomic DNA sequence. Very little is known about the principles and the molecular mechanisms governing this type of inheritance. In this study, by transiently enhancing 3D chromatin interactions, stable and isogenic Drosophila epilines were established that carry alternative epialleles, as defined by differential levels of Polycomb-dependent trimethylation of histone H3 Lys27 (forming H3K27me3). After being established, epialleles can be dominantly transmitted to naive flies and can induce paramutation. Importantly, epilines can be reset to a naive state by disruption of chromatin interactions. Finally, it was found that environmental changes modulate the expressivity of the epialleles, and this paradigm was extended to naturally occurring phenotypes. This work sheds light on how nuclear organization and Polycomb group (PcG) proteins contribute to epigenetically inheritable phenotypic variability (Ciabrelli, 2017).
Developmental gene expression is tightly regulated through enhancer elements, which initiate dynamic
spatio-temporal expression, and Polycomb response elements
(PREs), which maintain stable gene silencing. These two cis-regulatory functions are thought to
operate through distinct dedicated elements. By examining the occupancy of the Drosophila pleiohomeotic repressive complex (PhoRC) during embryogenesis,
extensive co-occupancy was revealed at developmental enhancers. Using an established in vivo assay
for PRE activity, it was demonstrated that a subset of characterized developmental enhancers can
function as PREs, silencing transcription in a Polycomb-dependent manner. Conversely, some classic
Drosophila PREs can function as developmental enhancers in vivo, activating spatio-temporal
expression. This study therefore uncovers elements with dual function: activating transcription in
some cells (enhancers) while stably maintaining transcriptional silencing in others (PREs). Given
that enhancers initiate spatio-temporal gene expression, reuse of the same elements by the Polycomb
group (PcG) system may help fine-tune gene expression and ensure the timely maintenance of cell
identities (Erceg, 2017).
While enhancers initiate spatio-temporal transcriptional activity, PREs maintain a previously determined transcriptional state of their target genes, thus leading to transcriptional memory. PREs are generally thought to be dedicated solely to gene silencing and not to contain enhancer-like features to activate gene expression. This study presents evidence to the contrary, that both functions can be encoded in the same cis-regulatory element, depending on the cellular context. This is not a rare event -- almost 25% of PhoRC occupancy is at developmental enhancers. Of the 16 elements that this study tested experimentally (either enhancers for PRE activity or PREs for enhancer activity), nine have dual function, being sufficient to activate transcription in a specific spatio-temporal pattern and mediate PcG-dependent silencing in vivo (Erceg, 2017).
These dual elements have interesting implications for transcriptional regulation during embryonic development. First, at the level of PcG protein recruitment, this subset of enhancers is highly enriched in the Pho motif, which distinguishes them from other developmental enhancers. This suggests that the recruitment of Pho to PhoRC enhancers is direct via sequence-specific DNA binding, consistent with an instructive model of recruitment, although other factors are likely involved. PcG proteins and developmental TFs bind in close proximity to each other within the same element (a single DNase hypersensitive site), raising the possibility of direct interplay between the two. The results
indicate that the activity of PhoRC-bound enhancers is dominated by tissue-specific TFs that activate transcription in some cells while being dominated by a functional PcG complex in other cells. Is this due to mutually exclusive occupancy of developmental TFs and PcG proteins in different tissues, or do they compete functionally at these elements? The dramatic derepression of enhancer activity in different cell types upon PcG protein removal suggests that other tissue-specific TFs must occupy these enhancers in the PcG silenced cell. This has interesting implications for enhancer activity, as it is well known that TFs bind to thousands of sites (tens of thousands in mammalian cells), but only a subset of associated target genes changes expression when the TF is removed. This has led to the general assumption that the majority of binding events is nonfunctional or neutral. These data suggest that at least a subset of this embryonic occupancy can be functional if not actively antagonized by the presence of PcGs (Erceg, 2017).
Second, enhancer-mediated polycomb recruitment has interesting implications for the mechanism of PcG-mediated silencing. The current models suggest that PcG proteins silence transcription mainly by silencing a gene's promoter, in keeping with PcG recruitment to CpG islands in vertebrates, or by coordinating a three-dimensional repressive topology, where the entire gene's locus is silenced. In either mode, a gene's promoter would not be permissive to enhancer activation. The data suggest that there may be a third mode of very local silencing at an individual enhancer, leaving the promoter and the rest of the gene's regulatory landscape open for activation by other enhancers, as was observed at the prat2 locus. This would allow for much more fine-tuning of silencing in individual tissues and stages. It also suggests that PcG proteins could play a more dynamic role, similar to a 'standard' transcriptional repressor at enhancers (Erceg, 2017).
Third, this may have broader implications for cell fate decisions during rapid developmental transitions. When multipotent cells become specified into different lineages, a specific transcriptional program often needs to be activated in one cell while being repressed in other cells from the same progenitor population. Having active enhancers in the precursor cells remain accessible to directly recruit the PcG complexes would ensure that these enhancers become silenced in a timely manner. Conversely, having maternally deposited PcG proteins already bound to enhancers early in development may serve as placeholders to ensure that these dual elements remain open and available for TFs to activate at the appropriate developmental stage. Interestingly, in the majority of the tested cases, PcG proteins and developmental TFs use these dual elements to regulate the same target gene, the vast majority of which are key developmental regulators of cell identity (Erceg, 2017).
The identification of PREs in other species has remained a key challenge, with only a handful of PREs identified in mammals and plants to date. In mammals, the PcG system is recruited to inactive CpG islands, with few specific sequence features. Although there are mammalian homologs of the Drosophila Pho and dSfmbt proteins, Yin Yang 1 (YY1) and SFMBT, respectively, the conservation of PhoRC as a complex and its involvement in mammalian PcG silencing remain unclear. It is proposed that such dual enhancers/PREs will also exist in mammals, although, given this apparent lack of conservation of YY1 function, their mechanism of PcG recruitment may have diverged (Erceg, 2017).
The Polycomb group (PcG) and Trithorax group (TrxG) proteins are key epigenetic regulators controlling the silenced and active states of genes in multicellular organisms, respectively. While the precise mechanisms of PcG/TrxG protein recruitment remain unknown, an important role has been suggested for sequence-specific DNA-binding factors. At the same time, it was demonstrated that the PRE DNA-binding proteins are not exclusively localized to PREs but can bind other DNA regulatory elements, including enhancers, promoters, and boundaries. To gain an insight into the PRE DNA-binding protein regulatory network, in this study, differences in abundance of the Combgap, Zeste, Psq, and Adf1 PRE DNA-binding proteins were sought. While there were no conspicuous differences in co-localization of these proteins with other functional transcription factors, it was shown that Combgap and Zeste are more tightly associated with the Polycomb repressive complex 1 (PRC1), while Psq interacts strongly with the TrxG proteins, including the BAP SWI/SNF complex. The Adf1 interactome contained Mediator subunits as the top interactors. In addition, Combgap efficiently interacted with AGO2, NELF, and TFIID. Combgap, Psq, and Adf1 have architectural proteins in their networks. This study further investigated the existence of direct interactions between different PRE DNA-binding proteins and demonstrated that Combgap-Adf1, Psq-Dsp1, and Pho-Spps can interact in the yeast two-hybrid assay (Chetverina, 2022).
Polycomb-mediated repression of gene expression is essential for development, with a pivotal role played by trimethylation of histone H3 lysine 27 (H3K27me3), which is deposited by Polycomb Repressive Complex 2 (PRC2). The mechanism by which PRC2 is recruited to target genes has remained largely elusive, particularly in vertebrates. This study demonstrates that MTF2, one of the three vertebrate homologs of Drosophila melanogaster Polycomblike, is a DNA-binding, methylation-sensitive PRC2 recruiter in mouse embryonic stem cells. MTF2 directly binds to DNA and is essential for recruitment of PRC2 both in vitro and in vivo. Genome-wide recruitment of the PRC2 catalytic subunit EZH2 is abrogated in Mtf2 knockout cells, resulting in greatly reduced H3K27me3 deposition. MTF2 selectively binds regions with a high density of unmethylated CpGs in a context of reduced helix twist, which distinguishes target from non-target CpG islands. These results demonstrate instructive recruitment of PRC2 to genomic targets by MTF2 (Perino, 2018).
A ten-eleven translocation (TET) ortholog exists as a DNA N(6)-methyladenine (6mA) demethylase (DMAD) in Drosophila. However, the molecular roles of 6mA and DMAD remain unexplored. Through genome-wide 6mA and transcriptome profiling in Drosophila brains and neuronal cells, this study found that 6mA may epigenetically regulate a group of genes involved in neurodevelopment and neuronal functions. Mechanistically, DMAD interacts with the Trithorax-related complex protein Wds to maintain active transcription by dynamically demethylating intragenic 6mA. Accumulation of 6mA upon DMAD depletion coordinates with Polycomb proteins and contributes to transcriptional repression of these genes. These findings suggest that active 6mA demethylation by DMAD plays essential roles in the fly CNS by acting in concert with additional epigenetic mechanisms (Yao, 2018).
During central nervous system (CNS) development, spatiotemporal gene expression programs mediate specific lineage decisions to generate neuronal and glial cell types from neural stem cells (NSCs). However, little is known about the epigenetic landscape underlying these highly complex developmental events. This study performed ChIP-seq on distinct subtypes of Drosophila FACS-purified neural stem cells (NSCs) and their differentiated progeny to dissect the epigenetic changes accompanying the major lineage decisions in vivo. By analyzing active and repressive histone modifications, this study shows that stem cell identity genes are silenced during differentiation by loss of their activating marks and not via repressive histone modifications. This analysis also uncovers a new set of genes specifically required for altering lineage patterns in type II neuroblasts, one of the two main Drosophila NSC identities. Finally, it was demonstrated that this subtype specification in NBs, unlike NSC differentiation, requires Polycomb-group (PcG)-mediated repression (Abdusselamoglu, 2019).
During development of the central nervous system (CNS), neural stem cells (NSCs) divide asymmetrically to generate daughter cells with self-renewing capacity but also complex neurogenic and gliogenic lineages. Regulation of this process requires tight and highly dynamic control of multiple cell fate decisions. For cells to commit to their ultimate cell identity, spatiotemporal gene expression programs are required. It is assumed that activation of lineage-specific genes and silencing of stem cell genes is accompanied by changes in chromatin states. How histone modifications change over time during neurogenesis in vivo, however, is not very well described (Abdusselamoglu, 2019).
The Drosophila larval CNS has become a key model for the fundamental mechanisms underlying brain development and chromatin states. The larval CNS is populated by distinct types of NSCs, or neuroblasts (NBs), which vary in abundance, neuronal output and division mode. Together, these NBs give rise to the majority of the neurons of the adult brain. The majority of the central brain NBs are of type I (NBIs). Each NBI gives rise to another NBI and a ganglion mother cell (GMC), which divides once more to generate two differentiated neurons or glia. Type II NBs (NBIIs) are a rare subpopulation with only eight NBII per brain lobe. Unlike NBIs, NBIIs divide into one NBII and one transit-amplifying cell called an intermediate neural progenitor (INP). NBIIs generate many more neurons, because INPs continue to divide asymmetrically five or six times, each time giving rise to a GMC that divides into two neurons or glia cells. Other than lineage structure and size, cell markers can also be used to differentiate NB subtypes. Whereas NBIs express both Asense (Ase) and Deadpan (Dpn), NBIIs only express Dpn. During neurogenesis, both NB subtypes divide asymmetrically to give rise to their respective progeny. Brain tumors form if the asymmetric segregation of cell fate determinants during NB cell division is disrupted. Among these determinants are the TRIM-NHL protein Brain tumor (Brat) and the Notch inhibitor Numb. Although Brat depletion results in the generation of ectopic NBII-like tumor NBs (tNBs) at the expense of differentiated brain cells, simultaneous loss of Brat and Numb causes the NBI-like tNBs to overproliferate (Abdusselamoglu, 2019).
In many cell types, transitions in chromatin states are regulated by the evolutionarily conserved Polycomb group (PcG) and Trithorax group (TrxG) proteins. PcG and TrxG have emerged as antagonistic regulators that silence or activate gene expression, respectively. These multimeric protein complexes regulate the transcriptional state of genes by post-translationally modifying amino acid residues of histone tails. PcG proteins exert a repressive activity via two main complexes, the Polycomb repressive complexes 1 and 2 (PRC1 and PRC2). Although PRC1 and PRC2 can exist in various compositions and associate with context-specific accessory proteins, both have been shown to contain a specific core set of proteins including subunits with catalytic activity. Within PRC2, Enhancer of zeste [E(z) in Drosophila, EZH1/2 in mammals] catalyzes the trimethylation of lysine 27 on histone 3 (H3K27me3). H3K27me3 is recognized by PRC1, which in turn includes the histone H2A ubiquityltransferase Sce [RING1A (RING1) and RING1B (RNF2) in mammals]. Histone modifications associated with active transcription are deposited by TrxG proteins, which counteract repressive marks by histone acetylation or methylation, in particular by trimethylation of lysine 4 on histone H3 at active promoters (Abdusselamoglu, 2019).
Although well-known for their role in long-term transcriptional memory, PcG and TrxG complexes are highly dynamic during development and thus facilitate cellular plasticity. In the last decade, it has been shown that PcG and TrxG complexes are crucial for ensuring correct neurogenesis in mammals as well as in Drosophila. Despite the strength of genetic in vivo experiments, however, global analysis of the histone modifications underlying their function, and therefore target genes, has mainly been performed in vitro. This constitutes a real knowledge gap, as recent studies have demonstrated that the chromatin states may vary significantly between in vivo tissues and their related in vitro cell lines, mainly owing to culture conditions. Given also that epigenetic changes are highly context- and developmental time-dependent, providing in vivo datasets to investigate chromatin states of different cell types in complex tissues will increase understanding of how the epigenetic landscape dynamically defines cellular states (Abdusselamoglu, 2019).
In recent years, in vivo studies made use of Drosophila to shed light on the dynamics of chromatin state changes during embryonic neural differentiation and during larval stages. Profiling the binding of chromatin remodelers has highlighted the plasticity of chromatin states during differentiation. Although binding of chromatin factors is associated with active or repressive chromatin, binding does not necessarily reflect downstream histone modifications. For example, the histone marks can change drastically between parasegments of the Drosophila embryo, whereas the occupancy of PcG proteins remains unchanged. Thus, investigating the dynamics of chromatin states based on chromatin marks is crucial for understanding the functional specialization of cells during development. Moreover, how PcG/TrxG complexes target genes on the chromatin level between different subtypes of progenitor cells during neuronal differentiation or tumorigenic transformation has remained elusive (Abdusselamoglu, 2019).
This study has used the Drosophila larval CNS to track in vivo changes of histone modifications not only upon differentiation, but also between different populations of NSCs and their tumorigenic counterparts. A fluorescence-activated cell sorting (FACS)-based method was developed to sort different cell types and perform ChIP-seq for the active histone mark, H3K4me3, and the repressive mark, H3K27me3. The FACS-based approach provides an in vivo dataset that reveals dynamic histone modifications during neuronal differentiation. In particular, it was observed that self-renewal and cell-division genes are repressed independently of H3K27me3 levels. In contrast, it was further shown that H3K27me3-mediated repression is crucial for silencing lineage-specific stem cell factors, including known factors as well as a new set of genes that are specific to NBIIs. Finally, genetic evidence is presented for the requirement of these new NBII-specific factors for self-renewal, and the role of PcG complexes in defining different subtypes of neural stem cells is demonstrated (Abdusselamoglu, 2019).
This study provides a resource of histone modification datasets for different types of NSCs and their differentiated progeny. In combination with chromatin accessibility and binding maps of chromatin remodelers of Drosophila brain cells, it is hoped that this dataset will serve as a useful community resource. During differentiation, stem cell identity genes are silenced in a PcG-independent manner, which supports previous findings showing that these genes are silenced through HP1-enriched chromatin. In addition, PcG-mediated silencing is unlikely to instruct the stepwise inactivation of stem cell genes during differentiation as loss of H3K27me3 did not induce ectopic NBs (Abdusselamoglu, 2019).
This study has taken advantage of in vivo genetic labeling to investigate chromatin dynamics of different NB subtypes. As type II NBs are of very low abundance, tumor NBs of type I and type II origins were used as a proxy in order to obtain enough material to be able to compare these two cell types. Each change was further validated by comparing tumor with healthy type I NBs, and artifacts arising from the tumorigenic state of the cells were excluded. The data show that both TrxG and PcG are required to establish NBII identity. A set of NBII-specific genes was identified, including the previously identified btd and Sp1. Dll and eya, which are specifically required for NBII maintenance, were also identified. It has been previously described that btd acts as an activator of Dll in the development of the ventral imaginal discs. This suggests that in NBII-identity specification the Trithorax-target btd could act together with Dll and eya. Such a mechanism would explain why the loss of btd causes a distinct phenotype compared with the loss of Dll and eya. Interestingly, an NBI to NBII conversion is observed only in 18% of NBIs ectopically overexpressing btd, indicating that either co-factors are missing or that the chromatin of btd targets is inaccessible. The finding that NB subtype-specific genes are marked by H3K27me3 repressive chromatin favors the latter. Therefore, as opposed to TrxG-activated stem cell and mitosis genes, the repression of NBII-specific genes is ensured by PcG-mediated H3K27me3 histone modifications, suggesting that Polycomb plays a role in defining the diversity of NSC lineages. Moreover, the data indicate that PcG repression is required not only for the silencing of HOX genes but also for the self-renewal capacity of NBs. Unlike TrxG, the loss of catalytic subunits of PcG complexes did not convert NBIIs to NBIs or vice versa.
This suggests that NB subtype specification cannot be explained solely by an absence of repression but requires a further activation mechanism. Strikingly, loss of PcG complexes caused a significant decrease in the number of NBs. Interestingly, across all the cell types, developmental genes such as caudal, eve, peb, scr and slp1, as well as genes involved in embryonic NB temporal patterning [hb, kr, pdm (nub), cas and grh], are heavily marked with H3K27me3. It is therefore possible that PcG-mediated repression is required to silence these developmentally crucial genes in addition to the Hox genes. Thus, the observed reduction in NB stemness might be caused by the de-repression of these genes (Abdusselamoglu, 2019).
Besides an overall decrease in NB maintenance, an increased sensitivity of NBII lineages to a reduction in PRC2 activity was observed. Interestingly, opa and ham, two previously described temporal switch genes in NBII lineages, are also enriched in H3K27me3 marks in NBs. Opa and ham are expressed in the immediate NBII progeny, the INPs, and ectopic expression of these genes limits self-renewal of NBIIs and causes NBIIs to disappear. Even though these two genes are heavily marked with H3K27me3 across all NB samples, NBIIs might be specifically sensitive to PRC2 depletion because they could be more receptive to premature de-repression of genes whose expression is normally restricted to NBII progeny only (Abdusselamoglu, 2019).
In the future, investigating the downstream targets of PcG in different NB subtypes could reveal the underlying mechanisms of subtype specification. In conclusion, these data provide a useful resource to investigate how chromatin state dynamics orchestrate the diversity and correct progression of NSC lineages (Abdusselamoglu, 2019).
Polycomb Group (PcG) proteins form memory of transient transcriptional repression that is necessary for development. In Drosophila, DNA elements termed Polycomb Response Elements (PREs) recruit PcG proteins. How PcG activities are targeted to PREs to maintain repressed states only in appropriate developmental contexts has been difficult to elucidate. PcG complexes modify chromatin, but also interact with both RNA and DNA, and RNA is implicated in PcG targeting and function. This study shows that R-loops, three-stranded nucleic acid structures formed when an RNA hybridizes to a complementary DNA strand, thereby displacing the second DNA strand, form at many PREs in Drosophila embryos, and correlate with repressive states. In vitro, both PRC1 and PRC2 can recognize R-loops and open DNA bubbles. Unexpectedly, this study found that PRC2 [E(z), Esc and Su(z)12] drives formation of RNA-DNA hybrids, the key component of R-loops, from RNA and dsDNA. These results identify R-loop formation as a feature of Drosophila PREs that can be recognized by PcG complexes, and RNA-DNA strand exchange as a PRC2 activity that could contribute to R-loop formation (Alecki, 2020).
During Drosophila embryogenesis, transiently expressed transcription factors activate homeotic (Hox) genes in certain regions of the embryo and repress them in others to dictate the future body plan. Polycomb Group (PcG) proteins form a memory of these early cues by maintaining patterns of Hox gene repression for the rest of development. This paradigm for transcriptional memory is believed to be used by the PcG at many genes in Drosophila, and to underlie the conserved and essential functions of PcG proteins in cell differentiation and development from plants to mammals. Polycomb response elements (PREs) are DNA elements that can recruit PcG proteins, but they also recapitulate the memory function of the PcG: when combined with early acting, region-specific enhancers in transgenes, they maintain transgene repression in a PcG-dependent manner only in regions where the early enhancer was not active. PREs contain a high density of binding sites for transcription factors that can recruit PcG proteins through physical interactions. However, the widespread expression, binding pattern, and properties of factors that bind PREs cannot explain how PREs can exist in alternate, transcription-history dependent states to maintain restricted patterns of gene expression, or how they can switch between states. Furthermore, DNA sequences with PRE-like properties have been difficult to identify in other species despite the conservation of PcG complexes, their biochemical activities, and their critical roles in development (Alecki, 2020).
RNAs may provide context specificity to PcG protein recruitment and function. Some PREs, and some PcG-binding sites in mammalian and plant cells, are transcribed into ncRNA, while others reside in gene bodies, and thus are transcribed when the gene is expressed (Herzog, 2014). Both the direction and level of transcription have been correlated with the functional state of PREs. The PcG complex Polycomb Repressive Complex 2 (PRC2) has a well-described high affinity for RNA. RNA is suggested to recruit PRC2 to specific chromatin sites, but RNA binding can also compete for chromatin binding and inhibit PRC2 activity. One way for RNA to interact with the genome is by the formation of R-loops, three-stranded nucleic acid structures formed when an RNA hybridizes to a complementary DNA strand, thereby displacing the second DNA strand. R-loops have been linked to regulation of transcription and chromatin previously, through a variety of mechanisms. This includes links to PcG regulation in mammalian cells. The formation of R-loops over genes with low to moderate expression is associated with increased PcG binding and H3K27 trimethylation (H3K27me3) in human cells, and R-loops have recently been implicated in promoting PRC1 and PRC2 recruitment in mammalian cells, although other evidence suggests they antagonize recruitment of PRC2. It was hypothesized that R-loop formation could biochemically link RNA to PcG-mediated silencing through PREs, and this idea was tested in the Drosophila system (Alecki, 2020).
This study identified R-loop-forming sequences in Drosophila embryos and S2 cells and observed that ~30% of PREs form R-loops. Interestingly, PREs that form R-loops are more likely to be bound by PcG proteins compared with PREs that do not form R-loops, suggesting that R-loops may be involved in PcG targeting. In vitro, PRC1 and PRC2 recognize R-loops and open DNA bubbles. Further, when provided dsDNA and RNA, PRC2 induces the formation of RNA-DNA hybrids, the key components of R-loops. These data suggest a mechanism for RNA to contribute to targeting of PcG proteins via R-loop formation induced by the RNA-DNA strand exchange activity of PRC2 (Alecki, 2020).
The demonstration that PRC2 induces the formation of RNA-DNA hybrids in vitro, that PRC2 and PRC1 recognize R-loops in vitro, and that R-loops are present at PREs in vivo suggests a mechanistic model for how RNAs could induce or maintain the OFF state of PREs. If PREs (or the gene they control and in many cases are embedded in) are highly transcribed, the RNA could compete for PRC2 binding to chromatin, as has been demonstrated in vitro and in vivo. However, a lower level of transcription through a PRE (or transcription in an orientation that is favourable for R-loop formation) could allow R-loops to form, possibly via the RNA-DNA hybrid forming activity of PRC2. R-loop formation will repress additional RNA production by preventing RNA polymerase passage, allowing recruitment of additional PRC2 (by PRE-binding transcription factors or interactions with other PcG proteins) and its retention on chromatin. PRC2 could then modify histones to maintain a repressive chromatin state. The R-loop, in conjunction with H3K27me3 and PRE-binding transcription factors, would also promote binding of PRC2 and PRC1. R-loops may also interfere with binding or function of proteins that promote the active state of PREs, although this remains to be tested. The data indicate that both coding and ncRNAs form R-loops. The regulation of these RNAs and therefore of R-loops could provide transcriptional memory and developmental context specificity to PcG recruitment by transcription factors that constitutively recognize PREs. A conceptually similar model for how high levels of RNA production at PREs could promote the ON state and low levels the OFF state has been proposed; R-loop formation provides one mechanism by which it can occur. Although this model is highly speculative at this time, it integrates many observations, and provides testable hypotheses (Alecki, 2020).
Observations in Drosophila are also consistent with a possible connection between R-loops and PcG function. The helicase Rm62 interacts genetically with both PcG and TrxG genes, and colocalizes with the PRE-binding protein Dsp1 on polytene chromosomes. Rm62 is the Drosophila homologue of the DDX5 helicase, which can unwind RNA-DNA hybrids in vitro and is implicated in R-loop resolution in vivo. A recent genome-wide RNAi screen for TrxG interacting genes (which should antagonize PcG function) identified the gene for RNaseH1. RNA has been suggested to be important in switching PREs between OFF and ON states, although this has been contested by experiments aiming to test whether transcription through a PRE can switch it to the active state. Resolution of R-loops by cellular RNases or RNA-DNA helicases could contribute to switching PRE states, which will be intriguing to test. It is also likely that even in the simple model suggested in the paper (see Model for the role of R-loop formation driven by PRC2 in PcG gene silencing), the levels of RNA corresponding to 'low' and 'high', and the strength of the effect will depend both on the genomic context and the sequences of the RNAs that are produced (Alecki, 2020).
R-loop formation is observed at ~30% of PREs; these may represent a specific class of PREs. Most R-loops are believed to form co-transcriptionally, so that R-loops would be predicted to depend on PRE transcription. Indeed, >70% of R-loops formed at PREs overlap an annotated coding or non-coding RNA, and PREs with R-loops are more likely to have RNA Pol II signal in ChIP-seq experiments. However, ~67% of PREs where R-loops were not observed also overlap an annotated transcript. Further, a fraction of PREs with R-loops (and a fraction of total R-loops) either do not overlap any annotated transcripts, or overlap a transcript in the opposite orientation as the R-loop. While some of these discrepancies likely reflect incomplete annotation of rare transcripts, they raise the intriguing possibility that the RNA used to form the R-loops could be supplied in trans. Careful analysis of the RNA component of R-loops at PREs will be needed to resolve this. Although speculative at this time, the ability of PRC2 to induce RNA-DNA hybrids could contribute to non-co-transcriptional R-loop formation (Alecki, 2020).
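The overlap percentages quoted above come down to asking, for each PRE, whether it intersects at least one interval from another set (R-loop peaks, annotated transcripts). The following Python fragment is only a minimal sketch of that bookkeeping. The `Interval` type, the function names, and the half-open coordinate convention are assumptions made for illustration, the coordinates in the usage note are made-up toy values, and none of this reproduces the pipeline actually used in the study.

```python
from typing import NamedTuple, Sequence


class Interval(NamedTuple):
    """A genomic interval (hypothetical type); half-open [start, end)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def fraction_overlapping(queries: Sequence[Interval],
                         targets: Sequence[Interval]) -> float:
    """Fraction of query intervals that overlap at least one target interval."""
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if any(overlaps(q, t) for t in targets))
    return hits / len(queries)
```

As a usage illustration with invented coordinates: given three toy PREs, `Interval("chr2L", 100, 200)`, `Interval("chr2L", 500, 600)` and `Interval("chr3R", 100, 200)`, and two toy R-loop peaks, `Interval("chr2L", 150, 300)` and `Interval("chr3R", 200, 250)`, only the first PRE overlaps a peak, so `fraction_overlapping` returns one third. The nested scan is quadratic, which is adequate for a sketch; genome-scale comparisons would normally use sorted intervals or a dedicated interval tool.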
This study finds that PRC2 can induce RNA-DNA strand exchange from RNA and linear dsDNA in vitro. A small number of other proteins have been shown to have similar activity, using various types of substrates. These include the repair proteins Rad52/RecA and PALB2, the human capping enzyme (CE), the viral protein ICP8 and the telomere-binding protein TRF2. Like the activity of PRC2, none of these reactions require ATP hydrolysis (although R-loop formation by RecA is stimulated by ATPγS), and most use linear DNA substrates or an unpaired or ssDNA region. The exceptions are TRF2 and ICP8. ICP8 can mediate R-loop formation from an RNA and a supercoiled plasmid. TRF2 stimulates invasion of RNA oligos into a supercoiled plasmid encoding a telomeric DNA array, but the mechanism is believed to be induction of positive supercoiling by TRF2 that facilitates DNA unwinding and RNA invasion. RNA-DNA strand exchange has been investigated most closely for Rad52, and its homologue RecA. Rad52 has been shown both to carry out 'inverse strand exchange' where Rad52 first binds the dsDNA, allowing RNA strand exchange, and to use an RNA-bridging mechanism, in which Rad52 first binds the RNA, and can bridge two dsDNA fragments by forming RNA-DNA hybrids with segments of each of them. Both of these mechanisms are candidates to mediate RNA-mediated repair of DSBs. PRC2 requires a DNA end for RNA-DNA strand exchange in vitro; for this activity to occur in vivo, either a DNA break would be required, or PRC2 would need to be able to use DNA opened by (an)other factors, or by transcription. These requirements may limit PRC2 strand exchange activity at PREs. In order to fully understand the impact of this activity in vivo and to what extent PRC2 contributes to R-loop formation at PREs, additional experiments will be necessary.
Interestingly, Topoisomerase II interacts with a subunit of PRC1, colocalizes with PcG proteins in the BX-C, and is implicated in PRE-mediated silencing; transient Topo II induced breaks have been implicated in regulation of transcription and chromatin compaction, and could also be used by PRC2. It is also possible that the activity of PRC2 contributes to RNA-DNA strand exchange at DNA breaks where RNA-DNA hybrids have been shown to form and where PRC2 is recruited (Alecki, 2020).
The connection between RNA and PRC2 has been recognized for some time, in species from plants to humans, but mechanisms beyond RNA binding by PRC2 have not previously been described. This discovery of PRC2-mediated RNA-DNA strand exchange suggests one mechanism to connect RNA to PcG targeting and function (Alecki, 2020).
R-loops are involved in transcriptional regulation, DNA and histone post-translational modifications, genome replication and genome stability. To what extent R-loop abundance and genome-wide localization are actively regulated during metazoan embryogenesis is unknown. Drosophila embryogenesis provides a powerful system to address these questions due to its well-characterized developmental program, the sudden onset of zygotic transcription and available genome-wide data sets. This study measured the overall abundance and genome localization of R-loops in early and late-stage embryos relative to Drosophila cultured cells. It was demonstrated that absolute R-loop levels change during embryogenesis and that RNaseH1 catalytic activity is critical for embryonic development. R-loop mapping by strand-specific DRIP-seq reveals that R-loop localization is plastic across development, both in the genes which form R-loops and where they localize relative to gene bodies. Importantly, these changes are not driven by changes in the transcriptional program. Negative GC skew and absolute changes in AT skew are associated with R-loop formation in Drosophila. Furthermore, this study demonstrated that while some chromatin binding proteins and histone modifications such as H3K27me3 are associated with R-loops throughout development, other chromatin factors are associated with R-loops in a development-specific manner. These findings highlight the importance and developmental plasticity of R-loops during Drosophila embryogenesis (Munden, 2022).
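For readers unfamiliar with the skew statistics mentioned above, GC skew is conventionally defined as (G - C)/(G + C) and AT skew as (A - T)/(A + T), each computed over one strand of a sequence window. The Python fragment below is a minimal sketch of those standard definitions. The function names, the window and step parameters, and the choice to return 0.0 for a window lacking the relevant bases are assumptions made for illustration; the window sizes and strand conventions used in the study are not stated here.

```python
from typing import List, Tuple


def gc_skew(seq: str) -> float:
    """(G - C) / (G + C) on the given strand; 0.0 if the sequence has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def at_skew(seq: str) -> float:
    """(A - T) / (A + T) on the given strand; 0.0 if the sequence has no A or T."""
    s = seq.upper()
    a, t = s.count("A"), s.count("T")
    return (a - t) / (a + t) if (a + t) else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """GC skew in successive full-length windows, returned as (start, skew) pairs."""
    return [(i, gc_skew(seq[i:i + window]))
            for i in range(0, len(seq) - window + 1, step)]
```

Because the sign of either skew flips when the reverse-complement strand is read, any comparison with strand-specific R-loop maps has to fix which strand the sequence is taken from.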
Diffuse midline gliomas and posterior fossa type A ependymomas contain the recurrent histone H3 lysine 27 (H3 K27M; see Drosophila H3) mutation and express the H3 K27M-mimic EZHIP (CXorf67), respectively. H3 K27M and EZHIP are competitive inhibitors of Polycomb Repressive Complex 2 (PRC2) lysine methyltransferase activity. In vivo, these proteins reduce overall H3 lysine 27 trimethylation (H3K27me3) levels; however, residual peaks of H3K27me3 remain at CpG islands (CGIs) through an unknown mechanism. This study reports that EZHIP and H3 K27M preferentially interact with PRC2 that is allosterically activated by H3K27me3 at CGIs and impede its spreading. Moreover, H3 K27M oncohistones reduce H3K27me3 in trans, independent of their incorporation into the chromatin. Although EZHIP is not found outside placental mammals, expression of human EZHIP reduces H3K27me3 in Drosophila melanogaster through a conserved mechanism. These results provide mechanistic insights for the retention of residual H3K27me3 in tumors driven by H3 K27M and EZHIP (Jain, 2020).
Polycomb repressive complexes 1 and 2 have been historically described as transcriptional repressors, but recent reports suggest that PRC1 might also support activation, although the underlying mechanisms remain elusive. This study shows that stage-specific PRC1 binding at a subset of active promoters and enhancers during Drosophila development coincides with the formation of three-dimensional (3D) loops, an increase in expression during development and repression in PRC1 mutants. Dissection of the dachshund locus indicates that PRC1-anchored loops are versatile architectural platforms that persist when surrounding genes are transcriptionally active and fine-tune their expression. The analysis of mammalian RING1B binding profiles and 3D contacts during neural differentiation in mice suggests that this role is conserved in mammals (Loubiere, 2020).
Polycomb group proteins (PcG) assemble into two main epigenetic complexes called Polycomb repressive complexes 1 and 2 (PRC1 and PRC2), which are highly conserved across metazoans and collaborate at multiple levels to maintain their target genes in a repressed state. PRC1 also binds a subset of active promoters and enhancers devoid of the PRC2-mediated H3K27me3 repressive mark in both Drosophila and mammals. Loss-of-function experiments suggest that PRC1 might contribute to the transcriptional activation of a subset of genes. However, since PRC1 binds to a large number of sites, disentangling direct from indirect regulatory effects has proven difficult, and the molecular mechanisms that might support transcriptional activation by PRC1 are obscure. This study tested whether PRC1 might mediate gene activation by forming enhancer-promoter loops, in addition to the repressive chromatin loops that were previously described (Loubiere, 2020).
PRC1 plays important roles during normal physiology and in cancer, but how it might play a dual silencing and activating role is a matter of great interest. The data provided in this study suggest that, rather than behaving as transcriptional repressors, PRC1-mediated loops establish versatile architectural platforms that can induce repression and activation. In the absence of transcription factors, PRC1 might cooperate with PRC2 to establish repressive loops and to form silent Polycomb domains, whereas the binding of developmental transcription factors might exploit PRC1-dependent enhancer-promoter contacts to coordinate the timely induction of cognate genes during development. The net effect of looping appears to be gene specific since expression of the CG5888 gene is more sensitive to disruption of a PRC1-dependent loop than the neighboring dac gene. At dac, one of the PRC1 binding sites is within a few hundred base pairs (bp) from the TSS, whereas the PRC1-binding site closest to the two alternative CG5888 promoters is located around 16 and 25 kb away, respectively. One possibility is therefore that the relative location of regulatory elements might modulate the effects of PRC1-dependent loops. In addition to the role in 3D looping, recent work has shown that PRC1 might assist transcription by modulating occupancy and phosphorylation of RNA polymerase II, as well as association of the pausing-elongation factor Spt5 to enhancers and promoters. In future studies, it will be important to analyze the relative role and the interplay of these mechanisms in individual cells at the onset of silencing, as well as during transcriptional activation to understand the chain of molecular events that triggers this dual function for Polycomb-bound regions (Loubiere, 2020).
Chromatin loops between gene pairs have been observed in diverse contexts in both flies and vertebrates. Combining high-resolution Capture-C, DNA fluorescence in situ hybridization, and genetic perturbations, this study dissected the functional role of three loops between genes with related function during Drosophila embryogenesis. By mutating the loop anchor (but not the gene) or the gene (but not loop anchor), loop formation and gene expression were disentangled, and the 3D proximity of paralogous gene loci supports their co-regulation. Breaking the loop leads to either an attenuation or enhancement of expression and perturbs their relative levels of expression and cross-regulation. Although many loops appear constitutive across embryogenesis, their function can change in different developmental contexts. Taken together, these results indicate that chromatin gene-gene loops act as architectural scaffolds that can be used in different ways in different contexts to fine-tune the coordinated expression of genes with related functions and sustain their cross-regulation (Pollex, 2023).
Many studies in flies, mice, and humans have observed prominent chromatin loops (high-frequency interactions) using chromatin conformation capture-based techniques. Examining a high-confidence set during Drosophila embryogenesis revealed that the majority of loop anchors (LAs) are close to, or overlapping, gene promoters and often involve genes with related function and overlapping expression that cross-regulate each other, either paralogous genes or genes in the same biological process. This suggests a regulatory function for the physical proximity of these genes in chromatin loops in their co-regulation. Genetic dissection indicates that this is the case: in all three cases examined, deleting an LA at one side of the loop strongly diminishes the loop and impacts the expression of the gene on the other side, spanning 190 kb, 240 kb, and even more than 2.5 Mb away. These results uncover a new role for paralog gene-gene loops in facilitating the cross-regulation of their co-expression (Pollex, 2023).
Examining three loci with constitutive loops, the results suggest that although the loops can be detected throughout embryogenesis, their functional role in gene expression appears to change in a context-dependent manner. The salm-salr loop is associated with (PcG-mediated) co-repression in the early blastoderm embryo (stage 5), while this study showed that proximity of salm-salr is compatible with their co-expression at mid-embryogenesis, although this was not functionally assessed. Similarly, close proximity of tsh-tio is associated with silencing or co-repression in the anterior part of the embryo and with co-expression in the posterior part, where perturbation of the loop leads to a down-regulation of expression of both genes. At later stages, an additional stage-specific loop was observed between the tsh-tio promoters that is associated with their co-expression. For scyl-chrb, the outer loop is associated with co-activation at the blastoderm stage, while at mid-embryogenesis the loop attenuates both genes' expression, as deletion of the loop leads to up-regulation (Pollex, 2023).
In all three cases (tsh-tio, scyl-chrb, and GluRIA-B), the looped topology modulates (dampening and/or boosting) gene expression, rather than being strictly required for full activation or repression. In the case of dampening, competition for shared cis or trans regulatory resources (i.e., TFs or enhancers) in a "hub" or condensate might be rate limiting when in spatial proximity. This is expected to be the case at the scyl-chrb and GluRIA-IB loci, as these factors do not regulate transcription directly (they are components of the Tor pathway or transmembrane receptors). For example, when the loci of scyl and chrb are in close proximity, the expression of one modulates the expression of the other, such that both genes have lower expression. When the loop is broken, both genes have higher expression, indicating that their (indirect) cross-regulation is broken. At other loci, spatial proximity is "beneficial" for transcription, perhaps due to the sharing of enhancers or TFs. These results suggest that moving away from thinking of these high-frequency interactions as strictly activating or repressing loops is necessary, and rather they should be viewed as a topological framework to facilitate an expression state that can be flexibly used to either boost or attenuate co-expression, depending on the biological context, and thereby help maintain the cross-regulation of genes with related function (Pollex, 2023).
Such gene-gene loops between paralogous genes can also occur over very large genomic distances, which appears to be a specific property of late stages of embryogenesis and likely differentiated tissues. The 2.6-Mb high-frequency loop between the GluRIA-GluRIB genes is a nice example, as it only occurs in differentiating neurons. By both Capture-C and DNA FISH, the loop initiates at 6-8 h and is established by 10-12 h, which is earlier than the expression of both genes (12-14 h, FlyBase). When these loops are diminished, either by genomic inversions or deletions of the LA, it leads to an up-regulation of the two genes' expression. Similar to the scyl-chrb locus, this long-range loop is therefore associated with transcriptional dampening of the genes' expression. This is somewhat counter-intuitive, as in both cases the loop is present in the tissues and stages when the genes are expressed, and suggests that the loop might be required to maintain the relative levels of the two genes' expression. Similar to the other cases examined, the expression of one GluRI gene modulates or cross-regulates the expression of the other (Pollex, 2023).
Accumulating evidence suggests that the nervous system might be special in using such very-long-range gene-gene loops in the regulation of genes involved in neuronal function, as recently observed in mouse and Drosophila, as well as in the regulation of olfactory receptor genes. The results suggest that the function of these long-range loops is also likely to be context dependent, modulating (either enhancing or dampening) the relative levels of the two genes' expression in different ways depending on the developmental context and the activity of the regulatory elements in the vicinity of the LAs (Pollex, 2023).
While many loops appear constitutive over time (as seen from chromatin conformation capture methods), the underlying function, and perhaps also their mechanism of formation, seems to change between developmental stages or tissues. In the early blastoderm Drosophila embryo and Kc cells, many LAs are bound by PcG proteins or GAF or both, which was also observed in this study for the salm-salr and tsh-tio loci. GAF, also called GAGA factor and Trl, is a trithorax protein that was recently proposed to promote loop formation by binding to regions close to promoters ("tethering elements") and forming a loop that facilitates transcriptional coupling. However, the role of GAF/Trl is complex, as it also binds to Polycomb response elements (PREs), and GAF/Trl binding is essential for full PcG recruitment and loop formation, at least at the dac locus. It is therefore difficult to disentangle the direct contribution of GAF for loop formation. Moreover, depletion of GAF in the early embryo impacted only a small fraction of loops (∼6% [12/186]), indicating that other factors are required for loop formation (Pollex, 2023).
Genetic dissection of a number of loci in Drosophila revealed that many insulator elements also contain a PRE, either in close proximity (e.g., eve, en) or even embedded within them. It is possible that this overlap of different biological elements at LAs (an insulator [or architectural] element generating the loop and a PRE/TRE leading to repression or activation) might partially explain context-dependent loop function. In addition, some regulatory elements have inherent dual functionality, acting as enhancers in one context and as repressive PREs in another (Pollex, 2023).
The system is likely more complex: the majority (~70%) of constitutive loop anchors identified in this study are bound by PcG, GAF, or both proteins, while the majority of stage-specific loops are not, at least based on the currently available ChIP data. This again indicates that other as yet unidentified proteins are involved in gene-gene loop formation. Moreover, the identification of nested or overlapping tissue/stage-specific loops close to constitutive loops suggests that different mechanisms are likely involved. At the tsh-tio locus, for example, the constitutive loop encloses a nested highly stage-specific loop, which forms adjacent to it and only forms at late stages of embryogenesis coinciding with the increased overlap of tsh-tio co-expression. Perhaps this and other loops are transcription dependent, so that when both genes are active, cross-regulation will be properly ensured. It is also currently not clear if the long-range gene-gene loops spanning megabases (e.g., GluRIA-B) are formed by the same mechanism as the shorter intra-TAD loops (50- to 250-kb range, e.g., scyl/chrb, tsh/tio, salm/salr, kni/knrl). PcG and GAF have been proposed to function over both scales, but as both are ubiquitously expressed and bind to thousands of sites, it is currently not clear where the specificity comes from, especially over megabase scales (Pollex, 2023).
One interesting feature of the genes involved in paired gene-gene chromatin loops is that many modulate or cross-regulate each other's expression (either directly or indirectly), which is important as many are functionally redundant. At the scyl-chrb locus, for example, deletion of one of the two genes (without impacting the loop) leads to up-regulation of the other gene’s expression, suggesting some kind of compensatory cross-regulatory mechanism. Breaking the loop, and thereby decreasing the 3D physical proximity between the two paralogous genes, perturbs this ability to cross-regulate each other's expression. When the scylla LA is deleted, for example, both genes are mis-regulated in the same direction, suggesting a perturbation of their compensatory cross-regulation. Similarly, Tsh and Tio cross-regulate each other's expression and fully or partially compensate for the deletion of the other in the developing embryo and larvae. Again, this cross-regulation becomes uncoupled after the loop is perturbed, which leads to the mis-regulation of both genes' expression in the same direction (down-regulation), suggesting a requirement of their looped topology for their cross-regulation (Pollex, 2023).
Cross-regulation has been demonstrated for many of the genes that are observed to be looped together: for example, negative regulatory feedback between H15 and mid (constitutive loop), e.g., overexpression of mid leads to H15 down-regulation. Therefore, such gene-gene loops might facilitate positive or negative regulatory feedback between gene pairs to help maintain their coordinated levels of co-expression or exclusive expression. Examples of this could include the regulation of the bric-a-brac (bab1 and bab2) or nubbin-pdm2 paralogous genes that are connected by constitutive loops. At nub-pdm2, a shared enhancer element acts differently on nubbin and pdm2 due to a pdm2-specific silencer element that is regulated by nub. Upon perturbation of cis interaction, one layer of these cross-regulatory networks might be lost. The functional contribution of each gene-gene loop might thereby reflect the evolution of different strategies for tissue- and stage-specific cross-regulation depending on what factors and contexts modulate the underlying regulatory elements and LAs (Pollex, 2023).
The Hi-C data on whole embryos could identify hundreds of chromatin loops. As this is averaged over all cells and tissues in the embryo (and across thousands of embryos), it will identify the most prominent, high-frequency loops at that time point. However, it will miss many tissue-specific loops, as this study showed at the tsh locus, in addition to more transient loops. Tissue-specific Capture-C or Hi-C has much higher sensitivity and the added advantage of providing information from a selected subset of nuclei (i.e., a given cell type or tissue). The functional dissection of chromatin loops in gene expression is limiting but extremely important, as the outcome was not apparent and is different for different loci and even for the same locus in different contexts. The overlap of LAs with promoters renders the dissection of cause and consequence of loop formation, transcription, and the regulation of gene expression challenging but feasible with careful dissection. However, there is a clear need to scale up to assess the role of many more loops. This is currently not very tractable in embryos with the methods used here (Pollex, 2023).
Polycomb group (PcG) mutants were first identified in Drosophila on the basis of their failure to maintain proper Hox gene repression during development. The proteins encoded by the corresponding fly genes mainly assemble into one of two discrete Polycomb repressive complexes: PRC1 or PRC2. However, biochemical analyses in mammals have revealed alternative forms of PRC2 and multiple distinct types of noncanonical or variant PRC1. Through a series of proteomic analyses, this study identified analogous PRC2 and variant PRC1 complexes in Drosophila, as well as a broader repertoire of interactions implicated in early development. These data provide strong support for the ancient diversity of PcG complexes and a framework for future analysis in a longstanding and versatile genetic system (Kang, 2022).
The Polycomb group (PcG) complexes, PRC1 and PRC2, each encompass numerous alternative subunits and configurations in mammalian cells. This has not been explored to the same extent in Drosophila, where Polycomb complexes comprise a reduced number of paralogous and accessory subunits. In the absence of extensive analyses, it has been assumed that PcG complexes have greatly diversified in mammals. However, numerous subunits of alternative PcG complexes are highly conserved, including the RING1 and YY1 binding protein (RYBP) and PcG RING finger (PCGF) proteins, which have ancient origins (Kang, 2022).
The previously described Drosophila dRAF and PhoRC complexes have multiple subunits in common with mammalian variant PRC1 complexes vPRC1.1 and vPRC1.6, respectively. However, several subunits thought to play defining roles in mammals were not detected in the fly complexes, including RYBP. Furthermore, orthologous PRC1.3/5 complexes have not been reported. These observations support the need for additional analyses in Drosophila (Kang, 2022).
This study used cross-linking, tandem affinity purification, and mass spectrometry (BioTAP-XL) to find that fly embryos use RYBP (Ring and YY1 Binding Protein; CG12190) and three PCGF subunits, namely Psc, Su(z)2, and L(3)73Ah (CG4195), to assemble complexes related to all previously described vPRC1 subtypes. CG14073 (BCOR) is a signature subunit of PRC1.1, and CG8677 (RSF1) is a newly identified interactor. Fly PRC1.3/5 may have a conserved role in the nervous system based on Tay (CG9056), its defining subunit. Sfmbt (CG16975) interactions, which encompass the previously described PhoRC-L, suggest unexpectedly broad modularity of a potential fly vPRC1.6.
The modularity of PRC2 in Drosophila was also confirmed, with Pcl (CG5109) and Scm (CG9495) restricted to PRC2.1 and Jarid2 (CG3654) and Jing (CG9397) restricted to PRC2.2. The conservation appears to extend to the association of PRC2.1 with stable repression and PRC2.2 with a heterogeneous or transitional role. Phenotypes from overexpression of Jing or compensatory knockdown of Jarid2 provide further evidence for the importance of a proper balance between PRC2.1 and PRC2.2 during development (Kang, 2022).
The Polycomb repressive complexes PRC1, PRC2, and PR-DUB repress target genes by modifying their chromatin. In Drosophila, PRC1 compacts chromatin and monoubiquitinates histone H2A at lysine 118 (H2Aub1), whereas PR-DUB is a major H2Aub1 deubiquitinase, but how H2Aub1 levels must be balanced for Polycomb repression remains unclear. This study shows that in early embryos, H2Aub1 is enriched at Polycomb target genes, where it facilitates H3K27me3 deposition by PRC2 to mark genes for repression. During subsequent stages of development, H2Aub1 becomes depleted from these genes and is no longer enriched when Polycomb maintains them repressed. Accordingly, Polycomb targets remain repressed in H2Aub1-deficient animals. In PR-DUB catalytic mutants, high levels of H2Aub1 accumulate at Polycomb target genes, and Polycomb repression breaks down. These high H2Aub1 levels do not diminish Polycomb protein complex binding or H3K27 trimethylation but increase DNA accessibility. H2Aub1 interferes with nucleosome stacking and chromatin fiber folding in vitro. Consistent with this, Polycomb repression defects in PR-DUB mutants are exacerbated by reducing PRC1 chromatin compaction activity, but Polycomb repression is restored if PRC1 E3 ligase activity is removed. PR-DUB therefore acts as a rheostat that removes excessive H2Aub1 that, although deposited by PRC1, antagonizes PRC1-mediated chromatin compaction (Bonne, 2022).
This study investigated the molecular function of H2Aub1 in developing Drosophila with a focus on elucidating why and how H2Aub1 deubiquitinase activity of PR-DUB is needed for Polycomb repression (Bonne, 2022).
The following main conclusions can be drawn from the work reported in this study. First, the genomic H2Aub1 profile shaped by the antagonistic actions of PRC1 and PR-DUB is unexpectedly dynamic during development. This balance permits PRC1 to generate H2Aub1 domains at Polycomb target genes in early embryos, but subsequently, it is H2Aub1 deubiquitination by PR-DUB that dominates and shapes a near-uniform, low-level H2Aub1 landscape in late stage embryos and larvae. Second, the early H2Aub1 domains are required for the rapid establishment of H3K27me3 domains; however, at HOX genes, this pathway does not appear to be critical because, with a delay, PRC2 generates H3K27me3 domains at these loci also in animals lacking H2Aub1. In contrast, the formation of noncanonical H3K27me3 domains strictly depends on H2Aub1. Third, in wild-type Drosophila, H2Aub1 is no longer enriched at Polycomb target genes during developmental stages when the Polycomb machinery acts to repress transcription of these genes. Together with the finding that Polycomb repression appears intact in animals lacking H2Aub1, this corroborates that chromatin compaction but not H2Aub1 is central to the PRC1 repression mechanism. Fourth, H2Aub1 inhibits chromatin fiber folding of nucleosome arrays in vitro. H2Aub1 therefore directly impacts the structural organization of chromatin. Fifth, in PR-DUB mutants, excessive H2Aub1 accumulation at HOX genes increases DNA accessibility and disrupts Polycomb repression at these loci. High H2Aub1 levels at these target genes are therefore detrimental to Polycomb repression. Sixth, removal of PRC1 E3 ligase activity restores Polycomb repression in PR-DUB mutants, whereas reduction of PRC1 chromatin compaction activity exacerbates the repression defects. This suggests that PR-DUB preserves Polycomb repression by antagonizing persistent H2Aub1 deposition by PRC1 in order to allow effective chromatin compaction by PRC1 (Bonne, 2022).
Atomic structures of diverse nucleosome arrays revealed that stacking interactions between the solvent-exposed surfaces of the histone octamers in n and n+2 nucleosomes are a common feature. In vivo, chromatin does not exist as long regularly folded fibers, but direct contacts between n and n+2 nucleosomes are also widely observed using sequencing-based approaches. Moreover, tomographic studies suggest that nucleosome stacking is a prevalent feature of chromatin organization in vertebrate nuclei. FRET measurements in this study show that these stacking interactions are critical for folding of unmodified nucleosome arrays in solution and that the presence of ubiquitin at H2AK119 sterically prevents nucleosome-nucleosome stacking. In conclusion, the results show that in addition to providing a binding site for PRC2.2 and variant PRC1 complexes, H2Aub1 can also directly impact the structural organization of chromatin (Bonne, 2022).
To relate the observations from these in vitro studies to the effects of H2Aub1 on chromatin compaction in vivo, it is important to consider the density of H2Aub1-modified nucleosomes within a given stretch of chromatin in cells. In late stage wild-type embryos, only a few percent of total H2A carry the ubiquitin modification. Considering that H2Aub1 is globally distributed across the entire genome in these embryos, one would expect that only a small fraction of nucleosomes across any given stretch of chromatin contains H2Aub1. During the stages when Polycomb repression acts, nucleosome stacking in Polycomb target gene chromatin is therefore unlikely to be grossly impaired. In contrast, the 20-fold to 30-fold increase in H2Aub1 nucleosome density in HOX gene chromatin in late stage PR-DUB mutant embryos could interfere with regular nucleosome stacking. The increased DNA accessibility at several Polycomb target genes in PR-DUB mutants is consistent with a less compact chromatin organization at these loci (Bonne, 2022).
The data argue that the high H2Aub1 accumulation in PR-DUB mutants does not interfere with Polycomb complex binding to PREs or H3K27me3 deposition in the flanking chromatin. The enhancement of the Polycomb repression defects in PR-DUB mutants with reduced PRC1 dosage (i.e., in Pc/+; Asx0 animals) is consistent with a scenario in which the excess of H2Aub1 directly impacts the ability of target gene-associated PRC1 to compact chromatin. This conclusion is in contrast to the recent suggestion that defective Polycomb repression in PR-DUB mutant mouse ESCs might be caused by reduced H3K27me3 deposition at Polycomb target genes. Specifically, these studies proposed that the genome-wide low increase in H2Aub1 levels in PR-DUB mutants would sequester PRC2.2 and thereby titrate it away from Polycomb target genes. The current data provide no support for such a scenario in Drosophila (Bonne, 2022).
The morphological defects in PR-DUB mutants suggest that Polycomb repression is primarily disrupted at HOX genes. Moreover, the repression defects at HOX genes are remarkably tissue- and stage-specific and not as widespread as in the case of mutants lacking PRC1 or PRC2. Even at HOX genes, the high H2Aub1 levels therefore still permit the Polycomb machinery to sustain repression in a fraction of cells. A possible explanation could be that the high H2Aub1 nucleosome density antagonizes chromatin compaction by PRC1 but does not fully abrogate it, and consequently, only some but not all tissue-specific enhancers in HOX genes are able to overcome this residual repression (Bonne, 2022).
What is the molecular basis for the formation of very different genomic H2Aub1 profiles in early and late embryos? Absolute quantification of proteins revealed that the copy number of Sce molecules per nucleus is comparable in early and late stage embryos. In contrast, the copy numbers of Calypso and Asx proteins in nuclei from early stage embryos are very low, and both proteins are fivefold to sevenfold more abundant in nuclei from late stage embryos. The lower abundance of PR-DUB relative to PRC1 in 0- to 6-h-old embryos could therefore provide a simple mechanistic explanation why PRC1 is able to generate H2Aub1 domains at Polycomb target genes early but not late in embryogenesis. Alternative scenarios to explain the different H2Aub1 landscapes in early and late embryos could be modulation of PRC1 or PR-DUB activity or differences in genomic targeting of these complexes during the different stages of embryogenesis (Bonne, 2022).
The formation of the H3K27me3 landscape is thought to rely on the combined action of PRC2.1 and PRC2.2, whereby the relative contributions by these two complexes vary at different genomic locations. This study used H2Aub1-deficient embryos that lack the nucleosomal binding site specific for PRC2.2. H2Aub1-deficient embryos show a genome-wide reduction of the H3K27me3 profile that is particularly pronounced in early embryos. During this stage of development, H2Aub1 and PRC2.2 therefore contribute significantly to H3K27me3 domain formation, analogously to what has been reported in mouse ESCs. Intriguingly, in later stage H2Aub1-deficient embryos, regular domains with near-normal coverage of H3K27me3 appear at HOX and many other canonical Polycomb target genes. This recovery of H3K27me3 domains explains why H2Aub1-deficient animals do not show the repression defects associated with loss of PRC2 activity. The recovery of near-normal levels of H3K27me3 at canonical H3K27me3 domains in late stage H2Aub1-deficient Drosophila embryos argues that the system, at least in flies, is considerably more plastic than anticipated (Bonne, 2022).
A straightforward finding of this study was that H2Aub1 is not enriched at Polycomb target genes during the embryonic and larval stages when PRC1 is critically required to repress these genes. This is consistent with the finding that SceI48A mutants show no Polycomb mutant phenotypes and no general transcriptional deregulation of Polycomb target genes in late stage embryos. Together, these observations all argue against a critical role of H2Aub1 in the actual repression mechanism by which PRC1 blocks transcription. Identifying the reason why SceI48A or H2AK117R/K118R/K121R/K122R mutants die as late stage embryos will require further studies, including experiments to investigate whether the low uniform levels of H2Aub1 across the genome might have an essential function in a process other than transcriptional regulation (Bonne, 2022).
The Polycomb group (PcG) complex PRC1 localizes in the nucleus in condensed structures called Polycomb bodies. The PRC1 subunit Polyhomeotic (Ph) contains an oligomerizing sterile alpha motif (SAM) that is implicated in both PcG body formation and chromatin organization in Drosophila and mammalian cells. A truncated version of Ph containing the SAM (mini-Ph) forms phase-separated condensates with DNA or chromatin in vitro, suggesting that PcG bodies may form through SAM-driven phase separation. In cells, Ph forms multiple small condensates, while mini-Ph typically forms a single large nuclear condensate. It is therefore hypothesized that sequences outside of mini-Ph, which are predicted to be intrinsically disordered, are required for proper condensate formation. This study identified three distinct low-complexity regions in Ph based on sequence composition. The role of each of these sequences was systematically tested in Ph condensates using live imaging of transfected Drosophila S2 cells. Each sequence uniquely affected Ph SAM-dependent condensate size, number, and morphology, but the most dramatic effects occurred when the central, glutamine-rich intrinsically disordered region (IDR) was removed, which resulted in large Ph condensates. Like mini-Ph condensates, condensates lacking the glutamine-rich IDR excluded chromatin. Chromatin fractionation experiments indicated that the removal of the glutamine-rich IDR reduced chromatin binding and that the removal of either of the other IDRs increased chromatin binding. These data suggest that all three IDRs, and functional interactions among them, regulate Ph condensate size and number. The results can be explained by a model in which tight chromatin binding by Ph IDRs antagonizes Ph SAM-driven phase separation. These observations highlight the complexity of regulation of biological condensates housed in single proteins (Kapur, 2022).
EZH1, a polycomb repressive complex-2 component, is involved in a myriad of cellular processes. EZH1 represses transcription of downstream target genes through histone H3 lysine 27 (H3K27) trimethylation (H3K27me3). Genetic variants in histone modifiers have been associated with developmental disorders, while EZH1 has not yet been linked to any human disease. However, the paralog EZH2 is associated with Weaver syndrome. This study reports a previously undiagnosed individual with a novel neurodevelopmental phenotype identified to have a de novo missense variant in EZH1 through exome sequencing. The individual presented in infancy with neurodevelopmental delay and hypotonia and was later noted to have proximal muscle weakness. The variant, p.A678G, is in the SET domain, known for its methyltransferase activity, and an analogous somatic or germline mutation in EZH2 has been reported in patients with B-cell lymphoma or Weaver syndrome, respectively. Human EZH1/2 are homologous to fly Enhancer of zeste (E(z)), an essential gene in Drosophila, and the affected residue (p.A678 in humans, p.A691 in flies) is conserved. To further study this variant, null alleles were obtained, and transgenic flies were generated expressing wildtype [E(z)WT] and the variant [E(z)A691G]. When expressed ubiquitously, the variant rescues null-lethality similar to the wildtype. Overexpression of E(z)WT induces homeotic patterning defects, but notably the E(z)A691G variant leads to dramatically stronger morphological phenotypes. A dramatic loss of H3K27me2 and a corresponding increase in H3K27me3 are reported in flies expressing E(z)A691G, suggesting this acts as a gain-of-function allele. In conclusion, this study presents a novel EZH1 de novo variant associated with a neurodevelopmental disorder and shows that this variant has a functional impact in Drosophila (Jangam, 2023).
Polycomb repressive complex 1 (PRC1) strongly influences 3D genome organization, mediating local chromatin compaction and clustering of target loci. Several PRC1 subunits have the capacity to form biomolecular condensates through liquid-liquid phase separation in vitro and when tagged and over-expressed in cells. This study used 1,6-hexanediol, which can disrupt liquid-like condensates, to examine the role of endogenous PRC1 biomolecular condensates on local and chromosome-wide clustering of PRC1-bound loci. Using imaging and chromatin immunoprecipitation, this study showed that PRC1-mediated chromatin compaction and clustering of targeted genomic loci, at different length scales, can be reversibly disrupted by the addition and subsequent removal of 1,6-hexanediol to mouse embryonic stem cells. Decompaction and dispersal of polycomb domains and clusters cannot be solely attributed to reduced PRC1 occupancy detected by chromatin immunoprecipitation following 1,6-hexanediol treatment, as the addition of 2,5-hexanediol has similar effects on binding despite this alcohol not perturbing PRC1-mediated 3D clustering, at least at the sub-megabase and megabase scales. These results suggest that weak hydrophobic interactions between PRC1 molecules may have a role in polycomb-mediated genome organization (Williamson, 2023).
Little is understood about how the two major types of heterochromatin domains (HP1 and Polycomb) are kept separate. In the yeast Cryptococcus neoformans, the Polycomb-like protein Ccc1 prevents deposition of H3K27me3 at HP1 domains. This study shows that phase separation propensity underpins Ccc1 function. Mutations of the two basic clusters in the intrinsically disordered region or deletion of the coiled-coil dimerization domain alter phase separation behavior of Ccc1 in vitro and have commensurate effects on formation of Ccc1 condensates in vivo, which are enriched for PRC2. Notably, mutations that alter phase separation trigger ectopic H3K27me3 at HP1 domains. Supporting a direct condensate-driven mechanism for fidelity, Ccc1 droplets efficiently concentrate recombinant C. neoformans PRC2 in vitro whereas HP1 droplets do so only weakly. These studies establish a biochemical basis for chromatin regulation in which mesoscale biophysical properties play a key functional role (Lee, 2023).
Polycomb group (PcG) proteins maintain the silenced state of key developmental genes, but how these proteins are recruited to specific regions of the genome is still not completely understood. In Drosophila, PcG proteins are recruited to Polycomb response elements (PREs) comprised of a flexible array of sites for sequence-specific DNA binding proteins, "PcG recruiters," including Pho, Spps, Cg, and GAF. Pho is thought to play a central role in PcG recruitment. Early data showed that mutation of Pho binding sites in PREs in transgenes abrogated the ability of those PREs to repress gene expression. In contrast, genome-wide experiments in pho mutants or by Pho knockdown showed that PcG proteins can bind to PREs in the absence of Pho. This study directly addressed the importance of Pho binding sites in 2 engrailed (en) PREs at the endogenous locus and in transgenes. The results show that Pho binding sites are required for PRE activity in transgenes with a single PRE. In a transgene, 2 PREs together lead to stronger, more stable repression and confer some resistance to the loss of Pho binding sites. Making the same mutation in Pho binding sites has little effect on PcG-protein binding at the endogenous en gene. Overall, these data support the model that Pho is important for PcG binding but emphasize how multiple PREs and chromatin environment increase the ability of PREs to function in the absence of Pho. This supports the view that multiple mechanisms contribute to PcG recruitment in Drosophila (Brown, 2023).
Diet profoundly influences brain physiology, but how metabolic information is transmuted into neural activity and behavior changes remains elusive. This study shows that the metabolic enzyme O-GlcNAc Transferase (OGT) moonlights on the chromatin of the D. melanogaster gustatory neurons to instruct changes in chromatin accessibility and transcription that underlie sensory adaptations to a high-sugar diet. OGT works synergistically with the Mitogen Activated Kinase/Extracellular signal Regulated Kinase (MAPK/ERK) rolled and its effector stripe (also known as EGR2 or Krox20) to integrate activity information. OGT also cooperates with the epigenetic silencer Polycomb Repressive Complex 2.1 (PRC2.1) to decrease chromatin accessibility and repress transcription in the high-sugar diet. This integration of nutritional and activity information changes the taste neurons' responses to sugar and the flies' ability to sense sweetness. These findings reveal how nutrigenomic signaling generates neural activity and behavior in response to dietary changes in the sensory neurons (Sung, 2023).
Maintenance of appropriate cell states involves epigenetic mechanisms, including Polycomb-group (PcG)-mediated transcriptional repression. While PcG proteins are known to induce chromatin compaction, how PcG proteins gain access to DNA in compact chromatin to achieve long-term silencing is poorly understood. This study shows that the p300/CREB-binding protein (CBP) co-activator is associated with two-thirds of PcG regions and required for PcG occupancy at many of these in Drosophila and mouse cells. CBP stabilizes RNA polymerase II (Pol II) at PcG-bound repressive sites and promotes Pol II pausing independently of its histone acetyltransferase activity. CBP and Pol II pausing are necessary for RNA-DNA hybrid (R-loop) formation and nucleosome depletion at Polycomb Response Elements (PREs), whereas transcription beyond the pause region is not. These results suggest that non-enzymatic activities of the CBP co-activator have been repurposed to support PcG-mediated silencing, revealing how chromatin regulator interplay maintains transcriptional states (Hunt, 2023).
Pruning that selectively eliminates unnecessary or incorrect neurites is required for proper wiring of the mature nervous system. During Drosophila metamorphosis, dendritic arbourization sensory neurons (ddaCs) and mushroom body (MB) γ neurons can selectively prune their larval dendrites and/or axons in response to the steroid hormone ecdysone. An ecdysone-induced transcriptional cascade plays a key role in initiating neuronal pruning. However, how downstream components of ecdysone signalling are induced is not entirely understood. This study identified that Scm, a component of Polycomb group (PcG) complexes, is required for dendrite pruning of ddaC neurons. Two PcG complexes, PRC1 and PRC2, are important for dendrite pruning. Interestingly, depletion of PRC1 strongly enhances ectopic expression of Abdominal B (Abd-B) and Sex combs reduced, whereas loss of PRC2 causes mild upregulation of Ultrabithorax and Abdominal A in ddaC neurons. Among these Hox genes, overexpression of Abd-B causes the most severe pruning defects, suggesting its dominant effect. Knockdown of the core PRC1 component Polyhomeotic (Ph) or Abd-B overexpression selectively downregulates Mical expression, thereby inhibiting ecdysone signalling. Finally, Ph is also required for axon pruning and Abd-B silencing in MB γ neurons, indicating a conserved function of PRC1 in two types of pruning. This study demonstrates important roles of PcG and Hox genes in regulating ecdysone signalling and neuronal pruning in Drosophila. Moreover, these findings suggest a non-canonical and PRC2-independent role of PRC1 in Hox gene silencing during neuronal pruning (Bu, 2023).
Heat shock inducible expression of genes through the use of heat inducible promoters is commonly used in research despite leaky expression of downstream genes of interest without targeted induction (i.e. heat shock). The development of non-leaky inducible expression systems that precisely control gene expression is of broad interest for both basic and applied studies. This study characterizes the use of Polycomb response elements and the inducible Heat shock protein 70Bb promoter, previously described as a non-leaky inducible system, to regulate Cas9 endonuclease levels and function in Drosophila melanogaster after varying both heat shock durations and rearing temperatures. Polycomb response elements were shown to significantly reduce expression of Cas9 under Heat shock protein 70Bb promoter control using a range of conditions, corroborating previously published results. It was further demonstrated that this low transcript level of heat-induced Cas9 is sufficient to induce mutant mosaic phenotypes. Incomplete suppression of an inducible Cas9 system by Polycomb response elements with no heat shock suggests that further regulatory elements are required to precisely control Cas9 expression and abundance (Warsinger-Pepe, 2023).
Under stress conditions, the coactivator Multiprotein bridging factor 1 (Mbf1) translocates from the cytoplasm into the nucleus to induce stress-response genes. However, its role in the cytoplasm, where it is mainly located, has remained elusive. This study shows that Drosophila Mbf1 associates with E(z) mRNA and protects it from degradation by the exoribonuclease Pacman (Pcm), thereby ensuring Polycomb silencing. In genetic studies, loss of mbf1 function enhanced a Polycomb phenotype in Polycomb group mutants, and was accompanied by a significant reduction in E(z) mRNA expression. Furthermore, a pcm mutation suppressed the Polycomb phenotype and restored the expression level of E(z) mRNA, while pcm overexpression exhibited the Polycomb phenotype in the mbf1 mutant but not in the wild-type background. In vitro, Mbf1 protected E(z) RNA from Pcm activity. These results suggest that Mbf1 buffers fluctuations in Pcm activity to maintain an E(z) mRNA expression level sufficient for Polycomb silencing (Nishioka, 2018).
Polycomb silencing is essential for the developmental regulation of gene expression. The silencing needs to be robust to tightly repress the expression of developmental genes in undifferentiated cells, such as stem cells, but should also be flexible for rapid release upon differentiation. However, this paradoxical aspect of Polycomb silencing is not well understood (Nishioka, 2018).
Mbf1 was originally identified as an evolutionarily conserved coactivator that connects a transcriptional activator with the TATA element-binding protein (Li, 1994; Takemaru, 1997; Takemaru, 1998). Usually, Mbf1 is present in the cytoplasm; however, under stress conditions, Mbf1 translocates into the nucleus to induce stress-response genes. Previous studies have revealed roles for the coactivator in axon guidance, oxidative stress response, defense against microbial infection, and resistance to drugs such as tamoxifen. However, the cytoplasmic role of Mbf1 has remained elusive, except for mRNA or ribosomal binding (Nishioka, 2018).
Pacman (Pcm/Xrn1) is an evolutionarily conserved 5'-3' exoribonuclease that degrades decapped mRNA. Genetic studies have demonstrated that Drosophila pcm is involved in epithelial closure, male fertility, apoptosis and growth control. Null mutants of pcm are lethal during early pupal stages, suggesting the enzyme plays an essential role in development (Nishioka, 2018 and references therein).
Using a genetic approach in Drosophila, this study shows that cytoplasmic Mbf1 ensures Polycomb silencing by protecting E(z) mRNA from degradation by Pcm. The results thus demonstrate an unexpected component of the regulatory mechanism underlying Polycomb silencing. This mechanism might also allow flexibility in Polycomb silencing, as Mbf1 protein expression declines upon differentiation (Nishioka, 2018).
To address the cytoplasmic role of Mbf1, novel genes were sought that interact with mbf1. Surprisingly, the mbf1 mutation enhanced a classical Polycomb phenotype of Psc and Pc mutants, namely the appearance of an ectopic sex comb tooth or teeth on the male mid-leg. Although mbf12/+ or mbf12/mbf12 flies never exhibited the Polycomb phenotype, penetrance of the phenotype in Psc1/+ increased significantly in Psc1/+; mbf12/+, and further increased in Psc1/+; mbf12/mbf12. The penetrance was restored to the Psc1/+ level by expressing wild-type Mbf1 protein from a transgene. Similar effects of the mbf12 allele were observed with the Pc6 mutation (Nishioka, 2018).
To gain insight into the mechanism underlying the genetic interaction between Psc and mbf1, the expression of the representative Polycomb group genes Pc, E(z) and pho was analyzed. Results of reverse transcription-quantitative PCR (RT-qPCR) analyses demonstrated a prominent reduction in the expression level of E(z) mRNA in Psc1/+; mbf12/+ larvae, whereas Pc and pho mRNA levels remained unchanged. Immunostaining of wing discs demonstrated that E(z) protein expression was severely compromised in Psc1/+; mbf12/+ compared with that in wild type, mbf12/+ or Psc1/+. By contrast, the expression of Pc and Pho proteins was not significantly affected. Western blot analyses confirmed the marked decrease in the E(z) protein level in both wing and leg discs from Psc1/+; mbf12/+. Consistently, Psc1/+; E(z)731/+ exhibited the extra sex comb phenotype, which was comparable to Psc1/+; mbf12/+ (Nishioka, 2018).
It is unlikely that Mbf1 affects E(z) transcription because no significant difference was detected in the E(z) mRNA level between wild-type and mbf12/mbf12 larvae. Consistently, it was not possible to detect any significant difference in the expression of E(z) in the wing disc upon knockdown or overexpression of Mbf1 using a posterior compartment-specific Gal4 driver. When cytoplasmic and nuclear RNA fractions from wing discs were analyzed by RT-qPCR, the nuclear E(z) mRNA level was similar between wild type and Psc1/+; mbf12/+. However, the cytoplasmic E(z) mRNA level in Psc1/+; mbf12/+ decreased to ~20% of the wild-type level. Collectively, these results suggest that mbf1 regulates the E(z) mRNA level post-transcriptionally in the cytoplasm (Nishioka, 2018).
Considering that Mbf1 binds to mRNA, it was hypothesized that cytoplasmic Mbf1 might bind to E(z) mRNA to protect it from degradation, and thereby regulate the E(z) mRNA level. Results of RNA-immunoprecipitation (RIP) experiments revealed a preferential binding of Mbf1 to E(z) mRNA. A ~10-fold enrichment of E(z) mRNA was found in the anti-Mbf1 antibody pull-down fraction from cytoplasmic extracts of embryos. The pull-down was clearly selective, as enrichment of abundant mRNAs, such as RpL32 and RpL30, was not observed. By contrast, E(z) mRNA was barely detectable in the anti-Mbf1 antibody pull-down fraction from embryonic extracts of the mbf1 mutant, used as a negative control. This is not due to absence of E(z) mRNA in the mbf1 mutant (Nishioka, 2018).
Following the observed preferential binding of Mbf1 to E(z) mRNA, this study focused on the Polycomb phenotype and reduced E(z) mRNA expression level, which were not caused by the mbf1 mutation alone. Enhancement of the Polycomb phenotype and the reduction of E(z) mRNA were only detected in the double mbf1 and Polycomb group gene mutant. To explain the synergistic effect of mbf1 and Polycomb group mutations, it was posited that a component of the mRNA degradation pathway was only activated in the Polycomb group mutant background. Therefore, attempts were made to identify the component of the pathway that was activated in the Psc or Pc mutants. Among the mRNAs tested, only pcm mRNA, which encodes the 5'-exoribonuclease, was upregulated in Psc1/+ and Pc6/+ larvae. Neither the decapping enzyme (Dcp2), components of the exosome [Dis3, Prp6 (CG6841) and Prp40 (CG3542)], nor components in the 3'-deadenylation-mediated pathway (twin and Nab2) appeared to be activated. Western blot analyses revealed a 2-fold increase in the Pcm protein level in wing discs from Psc1/+ or Pc6/+ larvae compared with that from wild type. These results led to an investigation of the effects of the pcm mutation on Polycomb silencing and E(z) mRNA expression (Nishioka, 2018).
Strikingly, the pcmΔ1 mutation resulted in significant suppression of the Polycomb phenotype in Psc1/+ and Psc1/+; mbf12/+. This suppression was rescued by expressing the wild-type Pcm protein from a transgene. Similar results were obtained using the Pc6 mutant. Consistent with this result, the pcmΔ1 mutation restored the E(z) mRNA levels in Psc1/+ and Psc1/+; mbf12/+ to near wild-type levels (Nishioka, 2018).
In addition to the extra sex comb phenotype, Psc1/+; mbf12/+ exhibited misexpression of Ubx in wing discs. The signals appeared as spots consisting of clusters of Ubx-positive cells. The pcmΔ1 mutation decreased the number of spots per wing disc. The misexpression occurred predominantly around the dorsoventral border in the posterior compartment. Consistently, adult wing defects were observed along the posterior wing margin, which was also suppressed by pcmΔ1 (Nishioka, 2018).
Importantly, the extra sex comb phenotype was detected under mild overexpression of pcm in mbf12/hs-pcm double heterozygotes at 25°C, even in the wild-type Polycomb group background. hs-pcm/+ exhibited an ~2.5-fold overexpression of Pcm at 25°C. Nevertheless, hs-pcm heterozygotes in the wild-type mbf1 background did not show any Polycomb phenotype. These results suggest that Mbf1 stabilizes Polycomb silencing against fluctuations in the Pcm protein level in vivo. Enhancement of the Polycomb phenotype was also observed in Psc1/+; hs-pcm/+ compared with that in Psc1/+ (Nishioka, 2018).
Biochemical analyses using purified recombinant Mbf1 and Pcm proteins revealed that Mbf1 protects E(z) RNA from degradation by Pcm. RNA protection assays were performed in which in vitro-synthesized E(z) RNA was treated with the RNA pyrophosphatase RppH to convert the 5'-triphosphoryl end into the 5'-monophosphoryl form, which is a Pcm substrate. The RNA was digested with Pcm in the presence or absence of Mbf1. Mbf1 inhibited the digestion of E(z) RNA. In the absence of RppH, RNA degradation was barely detectable, suggesting that the digestion was due to 5'-exoribonuclease activity. Gel filtration of a mixture of Pcm and Mbf1 resulted in the elution of each protein in a clearly separated peak. Furthermore, Mbf1 did not co-immunoprecipitate with Pcm and vice versa. These results suggest that Mbf1 does not inhibit Pcm activity through protein-protein interactions. Collectively, it is concluded that Mbf1 protects E(z) mRNA from degradation by Pcm both in vivo and in vitro (Nishioka, 2018).
It is proposed that cytoplasmic Mbf1 ensures Polycomb silencing by protecting E(z) mRNA from the activity of Pcm. In the mbf1 mutant, E(z) mRNA is free from Mbf1 protein, but pcm expression is downregulated by Polycomb group genes. In the Polycomb group mutant, Pcm expression is upregulated, but E(z) mRNA is partly protected by Mbf1. In the mbf1 Polycomb group double mutant, E(z) mRNA is free from Mbf1 protein and is subject to Pcm attack. Whereas Mbf1 is highly expressed in undifferentiated cells, such as those of embryos, larval testis, ovary, imaginal discs and neuroblasts, its expression is reduced in differentiated tissues, similar to the situation in the mbf1 mutant. This would facilitate the rapid release of developmental genes from Polycomb silencing upon differentiation. Interestingly, expression of mammalian Mbf1 [also termed endothelial differentiation-related factor 1 (Edf1)] and Ezh2 declines immediately after the onset of differentiation (Nishioka, 2018).
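The four genotype cases in the proposed model can be written out as a small decision rule. The following is only a qualitative sketch of the reasoning in the paragraph above, not an analysis from the study: the class `Genotype`, the function `ez_mrna_status` and the three status labels are hypothetical names introduced here for illustration, and real E(z) mRNA levels are quantitative rather than categorical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    """Hypothetical, purely qualitative description of a genetic background."""
    name: str
    mbf1_present: bool  # is Mbf1 protein available to bind E(z) mRNA?
    pcg_mutant: bool    # is this a Polycomb group mutant background?


def pcm_upregulated(g: Genotype) -> bool:
    # Per the model, Pcm expression rises only in a Polycomb group mutant.
    return g.pcg_mutant


def ez_mrna_status(g: Genotype) -> str:
    """Return an illustrative label for the fate of cytoplasmic E(z) mRNA."""
    if pcm_upregulated(g) and not g.mbf1_present:
        return "degraded"          # unprotected mRNA meets elevated Pcm
    if pcm_upregulated(g):
        return "partly protected"  # elevated Pcm, but Mbf1 shields the mRNA
    return "stable"                # Pcm stays low, with or without Mbf1


GENOTYPES = [
    Genotype("wild type", mbf1_present=True, pcg_mutant=False),
    Genotype("mbf1 mutant", mbf1_present=False, pcg_mutant=False),
    Genotype("Polycomb group mutant", mbf1_present=True, pcg_mutant=True),
    Genotype("mbf1 Polycomb group double mutant", mbf1_present=False, pcg_mutant=True),
]

for g in GENOTYPES:
    print(f"{g.name}: {ez_mrna_status(g)}")
```

The rule captures why neither single mutation alone produces strong loss of E(z) mRNA in this model: degradation needs both elevated Pcm and the absence of Mbf1.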
A recent study demonstrated that Pcm prevents apoptosis in imaginal discs and downregulates specific transcripts such as hid and reaper. However, suppression of apoptosis did not rescue the lethality of a pcm null mutation at the early pupal stage. Therefore, there might be other targets of Pcm that are essential for early pupal development. The present study indicates that E(z) mRNA could be one such target (Nishioka, 2018).
The mRNA-binding activity of Mbf1 was selective, but might not be strictly specific to E(z) mRNA. Although Polycomb silencing is central to the developmental regulation of gene expression, there could be other mRNAs that bind to Mbf1 in a similar manner, thereby modulating another biological function. Therefore, RIP-seq analysis was conducted to identify Mbf1-bound mRNAs. To ensure robustness of the RIP-seq data, the results were compared independently with two publicly available datasets, and 804 commonly enriched mRNAs were identified. Among these, the enrichment of four representative mRNAs (GstD5, Ide, Tep2 and Pebp1) was confirmed by RIP RT-qPCR analyses. Interestingly, the expression levels of these four mRNAs decreased in Psc1/+; mbf12/+ and increased in pcmΔ1/Y compared with those in wild type, suggesting that the model can be applied to a wider range of mRNAs than just E(z). However, dependency on the Mbf1/Pcm antagonism appears to differ among the mRNAs (Nishioka, 2018).
Gene ontology and pathway analyses of the 804 genes revealed some interesting properties of the Mbf1-associated mRNAs. The gene ontology terms 'glutathione metabolic process', 'oxidation-reduction process' and 'neurogenesis', which includes E(z), are consistent with previous studies that found defects in oxidative stress defense and axon guidance in the mbf1 mutant. Also of interest are the groups 'positive regulation of innate immune response' and 'defense response to Gram-negative bacterium', as Arabidopsis MBF1 is involved in host defense against microbial infection. Moreover, pathway analysis of the enriched genes implicated Mbf1 in 'drug metabolism', as previously suggested for tamoxifen resistance. This raises the intriguing possibility that Mbf1 contributes to various types of stress defense, metabolic processes and neurogenesis as both a nuclear coactivator and a cytoplasmic mRNA-stabilizing protein. Although mbf1 null mutants are viable under laboratory conditions, evolutionary conservation of mbf1 suggests that it has one or more essential roles under real-world stress conditions (Nishioka, 2018).
Polycomb silencing represses gene expression and provides a molecular memory of chromatin state that is essential for animal development. This study shows that Drosophila female germline stem cells (GSCs) provide a powerful system for studying Polycomb silencing. GSCs have a non-canonical distribution of PRC2 activity and, like embryonic progenitors, lack silenced chromatin. As GSC daughters differentiate into nurse cells and oocytes, nurse cells, like embryonic somatic cells, silence genes in traditional Polycomb domains and in generally inactive chromatin. Developmentally controlled expression of two Polycomb repressive complex 2 (PRC2)-interacting proteins, Pcl and Scm, initiates silencing during differentiation. In GSCs, abundant Pcl inhibits PRC2-dependent silencing globally, while in nurse cells Pcl declines and newly induced Scm concentrates PRC2 activity on traditional Polycomb domains. These results suggest that PRC2-dependent silencing is developmentally regulated by accessory proteins that either increase the concentration of PRC2 at target sites or inhibit the rate at which PRC2 samples chromatin (DeLuca, 2020).
The work described here shows that the Drosophila female germline has multiple advantages for studying the developmental regulation of chromatin silencing both before and during differentiation. Female GSCs continuously divide to produce new undifferentiated progenitors, which expand and differentiate into nurse cells or oocytes, generating large amounts of a much simpler tissue than a developing embryo. Additionally, an inducible reporter assay compatible with the female germline was developed that sensitively responds to developmental changes in local chromatin repression in individual cells. In contrast to RNAseq, which measures steady state RNA levels, or ChIPseq, which correlates chromatin epitopes with their perceived function on gene expression, the reporters directly test how local chromatin influences the inducibility of surrounding genes, and are easily combined with tissue specific knockdowns to identify trans-acting factors contributing to reporter inducibility. Finally, a genetic engineering approach allows any construct (not just hsGFP) to be efficiently integrated into many pre-existing 'donor' sites, including those used previously with other reporters, or sites heavily silenced by repressive chromatin in differentiated cells. Although the number of different donor sites in certain types of chromatin is currently limited, new sites continue to be generated using CRISPR/Cas9 targeting and the method can ultimately be applied virtually anywhere in the genome (DeLuca, 2020).
Analysis of Polycomb repression with reporters, ChIP, and PcG-gene knockdowns provided numerous insights into how chromatin affects gene expression and female germline development in Drosophila. GSCs, the precursors of oocytes and nurse cells, contain a non-canonical, binary distribution of moderate H3K27me3 enrichment on all transcriptionally inactive loci and very low enrichment on active chromatin. A similar non-canonical H3K27me3 distribution was observed in early fly embryos, suggesting that noncanonical chromatin represents a 'ground state' for progenitors that will propagate future generations of undifferentiated germ cells or somatic cells that differentiate into specialized tissues. Such non-canonical chromatin was first identified in mouse oocytes and preimplantation embryos. If non-canonical H3K27me3 chromatin is a characteristic of undifferentiated, totipotent cells, what function might it confer to account for its conservation (DeLuca, 2020)?
Experiments confirmed previous work showing that chromatin modified by PRC2 is essential for the Drosophila female germ cell cycle. Germline cysts lacking PRC2 are unable to stably generate oocytes, and E(z) GermLine-specific RNAi Knock Down (GLKD) nurse cells mis-express multiple genes and degenerate at about stage 5. In contrast, GSCs lacking PRC2 properly populate their niche, divide, and produce daughters that interact with female follicle cells and begin nurse cell differentiation. Removing PRC2 activity from GSCs did not generally increase the steady state abundance of genes or the inducibility of reporters in H3K27me3-enriched inactive or PcG domains. These results suggest that PRC2 and non-canonical chromatin lack vital functions in undifferentiated germline progenitors but are critical for repressing genes upon differentiation. However, a requirement for PRC2 or non-canonical chromatin under stress conditions or prolonged aging cannot be dismissed. For example, PRC2 could promote the long-term maintenance of female GSCs, similarly to how it maintains male germline progenitors in flies and mice (DeLuca, 2020).
Germline cysts and nurse cells are found in diverse animal species across the entire phylogenetic spectrum, but their function has been well studied mostly in insects such as Drosophila where they persist throughout most of oogenesis. While nurse cells have traditionally been considered germ cells rather than late-differentiating somatic cells, this study shows that Drosophila nurse cells initiate Polycomb silencing and enrich PRC2 activity on a nearly identical collection of PcG domains as somatic cells. In more distant species, such as mice, nurse cells initially develop in a similar manner within germline cysts and contribute their cytoplasm to oocytes, but undergo programmed cell death before the vast majority of oocyte growth. Consequently, it remains an open question whether somatic differentiation plays a role in nurse cell function in mammals and many other groups (DeLuca, 2020).
The size and composition of oocyte cytoplasm are uniquely tailored to promote optimal fecundity and meet the demands of early development. In some species, including flies, nurse cells synthesize large amounts of specialized ooplasm to rapidly produce multitudes of large, pre-patterned embryos. In others, including mammals, oocytes more slowly synthesize the majority of ooplasm. Interestingly, both ooplasm synthesis strategies apparently require Polycomb silencing. However, the nurse cell-based strategy in flies primarily requires PRC2 but not PRC1 to silence hundreds of somatic genes, while the oocyte-based strategy in mice requires PRC1 but not PRC2 (DeLuca, 2020).
Different strategies of ooplasm synthesis may have evolved to be compatible with noncanonical germ cell chromatin. Staining experiments show that Drosophila oocytes maintain a widely distributed, non-canonical H3K27me3 distribution similar to pre-meiotic precursors or mouse oocytes, suggesting that non-canonical chromatin is conserved and maintained throughout the germ cell cycle. Similar to mouse oocytes, Drosophila spermatocytes also contain non-canonical chromatin and autonomously synthesize large amounts of cytoplasm by deploying PRC1 but not PRC2. Thus, three different types of germ cells are filled with large amounts of differentiated cytoplasm that requires Polycomb silencing for its synthesis, but nevertheless maintain a non-canonical, silencing-deficient PRC2 activity (DeLuca, 2020).
The conservation of undifferentiated, non-canonical chromatin despite a strong selection for Polycomb silencing during ooplasm synthesis argues that non-canonical chromatin must have a presently unappreciated fundamental purpose in germ cells. Non-canonical chromatin could regulate multigenerational processes like mutation, recombination, or transposition, that are not easily assayed in sterile individuals. Tests of these ideas will require a better understanding of how non-canonical chromatin is regulated and methods to disrupt non-canonical chromatin without disrupting other functions required for germline viability. Additionally, non-canonical chromatin could simply result from the silencing-incompetent PRC2 that was observed in progenitors (DeLuca, 2020).
Pcl was uncovered as both an inhibitor of PRC2 silencing and promoter of non-canonical chromatin in GSCs. PclGLKD dramatically altered the footprint of PRC2 activity in GSCs. PclGLKD favored H3K27me3 enrichment on PREs versus inactive domains, increased the total amount of H3K27me1 by 13-fold and H3K27me2 by 1.4-fold, and decreased the total amount of H3K27me3 by 1.8-fold. By binding DNA through its winged-helix domain, Pcl triples PRC2's residence time on chromatin and promotes higher states of H3K27 methylation in vitro. In GSCs, Pcl could simply change the result of each PRC2-chromatin binding event from H3K27me1 to me3. However, it is hard to imagine how an equivalent number of nucleosomes bearing a higher H3K27 methylation state could explain how Pcl inhibits silencing. Instead, it is proposed that Pcl inhibits silencing by reducing the number of PRC2-chromatin binding events per unit time by increasing the residence time of PRC2 on chromatin with each binding event. In this model, PclGLKD would not only convert many H3K27me3 nucleosomes into H3K27me1 nucleosomes, it would also convert many unmethylated nucleosomes into H3K27me1 nucleosomes. PclGLKD would more subtly affect H3K27me2 abundance because it simultaneously increases the number of PRC2-chromatin binding events while reducing the probability of each binding event leading to H3K27me2 versus me1 (DeLuca, 2020).
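The trade-off between residence time and sampling rate in this proposal can be illustrated with simple arithmetic. The sketch below is a toy calculation, not a model from the study: it assumes each PRC2 complex alternates between a search phase and a bound phase, so that the number of binding events per unit time is the reciprocal of the cycle length. The function name, the search time and the core-PRC2 residence time are hypothetical values chosen for illustration; only the threefold increase in residence time comes from the in vitro observation cited above.

```python
def binding_events_per_unit_time(search_time: float, residence_time: float) -> float:
    """Events per unit time for one complex that cycles: search, bind, release.

    Assumes (for illustration only) that one cycle lasts
    search_time + residence_time, in arbitrary time units.
    """
    if search_time <= 0 or residence_time <= 0:
        raise ValueError("times must be positive")
    return 1.0 / (search_time + residence_time)


SEARCH_TIME = 1.0              # hypothetical, arbitrary units
CORE_RESIDENCE_TIME = 1.0      # hypothetical, arbitrary units
PCL_RESIDENCE_FOLD = 3.0       # Pcl triples residence time (in vitro, per the text)

# Core-PRC2: short residence, so more chromatin sites are sampled per unit time.
core_rate = binding_events_per_unit_time(SEARCH_TIME, CORE_RESIDENCE_TIME)

# Pcl-PRC2: longer residence per event, so fewer events per unit time.
pcl_rate = binding_events_per_unit_time(
    SEARCH_TIME, CORE_RESIDENCE_TIME * PCL_RESIDENCE_FOLD
)

print(f"core-PRC2 events per unit time: {core_rate}")
print(f"Pcl-PRC2 events per unit time:  {pcl_rate}")
```

Under these assumed values the Pcl-bound complex samples chromatin half as often as core-PRC2. The size of the effect depends entirely on the assumed ratio of search time to residence time, which the text does not specify.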
By reducing the number of PRC2 binding events, Pcl could increase the abundance of unmethylated H3K27 residues available for acetylation - a transcription promoting modification. In both flies and mammals, PRC2 transiently associates with chromatin to mono- and dimethylate H3K27 outside of traditional PcG domains, blocking H3K27 acetylation and antagonizing transcription. This study similarly found strong and widespread PRC2-dependent silencing in H3K27me1/2 enriched chromatin in nurse cells. Because inactive domain silencing was not affected by depletion of Pcl, Jarid2, or H3K27me3, it is proposed that core-PRC2, but not Pcl-PRC2 or H3K27me3, primarily silences inactive chromatin (DeLuca, 2020).
In GSCs, abundant Pcl could saturate PRC2, effectively depleting faster-sampling core-PRC2 complexes in favor of slower sampling Pcl-PRC2. In somatic embryonic cells, Pcl is present in a small fraction of PRC2 complexes. Compared to other fly tissues, Pcl mRNA is most abundant in the ovary, and within the ovary, Pcl protein is much more abundant in GSCs and nurse cell precursors than differentiated nurse cells and somatic cells. Within each differentiated germline cyst, Pcl mRNA is depleted from nurse cells and enriched in oocytes, suggesting that Pcl protein levels may be regulated by an mRNA transport mechanism induced in region 2 that also triggers the differentiation of oocytes from nurse cells (DeLuca, 2020).
Pcl and a second PRC2-interacting protein, Scm, regulate the transition from non-canonical to canonical chromatin and initiate Polycomb repression. During nurse cell differentiation, it is proposed that Pcl depletion frees core-PRC2 to rapidly sample and silence inactive domains, while induction of Scm (which is absent from the GSC) recruits high levels of PRC1 and PRC2 activity around PREs. ScmGLKD nurse cell chromatin retained a non-canonical H3K27me3 pattern characteristic of GSCs, as if differentiation at PcG domains had not occurred. In mice, the Scm homologue, Scml2, similarly associates with PcG domains to recruit PRC1/2 and silence PcG targets during male germline development. However, unlike its fly orthologue in female GSCs, Scml2 is expressed in male germline precursors. This difference could explain why mammalian PGCs partially enrich PRC2 activity on CGIs while fly female GSCs do not enrich PRC2 on specific sites. While PcG domain-associated Scm is sufficient to enrich PRC2 activity above background levels found throughout inactive chromatin, a second PRC2 interacting protein, Pcl, is additionally required to promote full PRC2 and H3K27me3 enrichment on PcG domains. Because Scm oligomerizes and interacts with PRC2 in vitro, it could form an array of PRC2 binding sites anchored to PREs through Sfmbt. It is proposed that two cooperative interactions, PRC2 with PRE-tethered Scm, and Pcl with DNA, preferentially concentrate H3K27me3-generating Pcl-PRC2 versus H3K27me1/2-generating core-PRC2 on PcG domains (Figure 6D). H3K27me3 could then be further enriched by H3K27me3-induced allosteric PRC2 activation through the Esc subunit (DeLuca, 2020).
By promoting PRC1 and PRC2 concentration on PcG domains, Scm enhances silencing on PcG-localized reporters. While Scm-depleted nurse cells completed oogenesis, a subset of PcG domain-localized genes, including chinmo and the posterior Hox gene Abd-b, escaped repression and were potentially loaded into embryos. Eggs derived from ScmGLKD nurse cells failed to hatch, and mis-expressed Abd-b in anterior segments following germ band elongation. This defect more closely resembled maternal plus zygotic than maternal-only Scm mutants, suggesting that ScmGLKD may deplete both maternal and zygotic Scm (DeLuca, 2020).
However, the additional possibility that mis-regulation of maternal Polycomb targets like chinmo contribute to the subtle embryonic defects observed in maternal-only Scm mutant clones cannot be excluded (DeLuca, 2020).
Further study of the Polycomb-mediated repression described in this study will help define the gene regulation program of Drosophila nurse cells and its contribution to oocyte growth. Additional characterization and perturbation of non-canonical chromatin throughout the germ cell cycle will yield further insights into its function in development. Finally, incorporating studies of other chromatin modifications, including H3K9me3-based repression, during germ cell development will contribute to a fuller understanding of how chromatin contributes to an immortal cell lineage (DeLuca, 2020).
Epigenetic silencing by Polycomb group (PcG) complexes can promote epithelial-mesenchymal transition (EMT) and stemness and is associated with malignancy of solid cancers. This study reports a role for Drosophila PcG repression in a partial EMT event that occurs during wing disc eversion, an early event during metamorphosis. A screen for genes required for eversion uncovered the PcG genes Sex combs extra (Sce) and Sex combs midleg (Scm). Depletion of Sce or Scm resulted in internalised wings and thoracic clefts, and loss of Sce inhibited the EMT of the peripodial epithelium and basement membrane breakdown, ex vivo. Targeted DamID (TaDa) using Dam-Pol II showed that Sce knockdown caused a genomic transcriptional response consistent with a shift towards a more stable epithelial fate. Surprisingly, only 17 genes were significantly upregulated in Sce-depleted cells, including Abd-B, abd-A, caudal, and nubbin. Each of these loci was enriched for Dam-Pc binding. Of the four genes, only Abd-B was robustly upregulated in cells lacking Sce expression. RNAi knockdown of all four genes could partly suppress the Sce RNAi eversion phenotype, though Abd-B had the strongest effect. The results suggest that in the absence of continued PcG repression, peripodial cells express genes such as Abd-B, which promote an epithelial state and thereby disrupt eversion. These results emphasise the important role that PcG suppression can play in maintaining cell states required for morphogenetic events throughout development and suggest that PcG repression of Hox genes may affect epithelial traits that could contribute to metastasis (Jefferies, 2020).
Dynamic changes in the three-dimensional (3D) organization of chromatin are associated with central biological processes, such as transcription, replication and development. Therefore, the comprehensive identification and quantification of these changes is fundamental to understanding evolutionary and regulatory mechanisms. This study presents Comparison of Hi-C Experiments using Structural Similarity (CHESS), an algorithm for the comparison of chromatin contact maps and automatic differential feature extraction. The robustness of CHESS to experimental variability is demonstrated, and its biological applications are showcased on (1) interspecies comparisons of syntenic regions in human and mouse models; (2) intraspecies identification of conformational changes in Zelda-depleted Drosophila embryos; (3) patient-specific aberrant chromatin conformation in a diffuse large B-cell lymphoma sample; and (4) the systematic identification of chromatin contact differences in high-resolution Capture-C data. In summary, CHESS is a computationally efficient method for the comparison and classification of changes in chromatin contact data (Galan, 2020).
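The summary above names structural similarity as the basis of the comparison but gives no formula. As a rough illustration only, the following sketch computes a single-window structural similarity index (the standard image-analysis SSIM) between two toy contact matrices. The function name, the stabilising constants c1 and c2, and the toy matrices are all assumptions made for this example; this is not the CHESS implementation, and real contact maps would need normalisation and windowing that are not shown here.

```python
from statistics import fmean


def global_ssim(a, b, c1=1e-4, c2=9e-4):
    """Single-window structural similarity between two matrices.

    a, b: lists of rows with the same total number of entries.
    c1, c2: small constants (assumed values) that keep the ratio stable
    when means or variances are close to zero.
    """
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    if not x or len(x) != len(y):
        raise ValueError("matrices must be non-empty and equally sized")

    mx, my = fmean(x), fmean(y)
    # Population variances and covariance over all matrix entries.
    vx = fmean([(v - mx) ** 2 for v in x])
    vy = fmean([(v - my) ** 2 for v in y])
    cov = fmean([(p - mx) * (q - my) for p, q in zip(x, y)])

    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


# Hypothetical 2x2 "contact maps": one reference and one with reversed structure.
reference = [[1.0, 2.0], [3.0, 4.0]]
reversed_map = [[4.0, 3.0], [2.0, 1.0]]

same = global_ssim(reference, reference)
different = global_ssim(reference, reversed_map)
```

In this toy case identical matrices score 1, while the reversed matrix scores below zero because its entries are anti-correlated with the reference. A region-by-region scan of such scores is one way differences between two maps could be ranked.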
Polycomb group proteins (PcGs) drive target gene repression and form large chromatin domains. In Drosophila, DNA elements known as Polycomb group response elements (PREs) recruit PcGs to the DNA. This study shows that, within the invected-engrailed (inv-en) Polycomb domain, strong, constitutive PREs are dispensable for Polycomb domain structure and function. It is suggested that the endogenous chromosomal location imparts stability to this Polycomb domain. To test this possibility, a 79-kb en transgene was inserted into other chromosomal locations. This transgene is functional and forms a Polycomb domain. The spreading of the H3K27me3 repressive mark, characteristic of PcG domains, varies depending on the chromatin context of the transgene. Unlike at the endogenous locus, deletion of the strong, constitutive PREs from the transgene leads to both loss- and gain-of function phenotypes, demonstrating the important role of these regulatory elements. These data show that chromatin context plays an important role in Polycomb domain structure and function (De, 2019).
Polycomb group proteins (PcGs) are critical for organismal development and stem cell maintenance. PcGs were first found in Drosophila as repressors of homeotic genes, and PcG repression is one of the earliest epigenetic regulatory mechanisms to be identified. In Drosophila, nearly all PcG proteins are subunits of one of four principal protein complexes: Polycomb repressive complexes 1 and 2 (PRC1 and PRC2), Pho repressive complex (PhoRC), and Polycomb repressive deubiquitinase (PR-DUB). PcG protein complexes bind to DNA elements known as Polycomb group response elements (PREs), deposit the repressive chromatin modification mark H3K27me3, and drive chromatin compaction leading to gene repression. Genes repressed by PcG are covered with H3K27me3 and are thought to form their own topologically associating domains (TADs). In Drosophila, TADs vary from a few kilobases to several hundred kilobases in size. Regulatory DNAs present within TADs preferentially interact with genes located within the same domain, with limited contacts outside of the TAD boundaries. In Drosophila, genome-wide data suggest that mini-domains formed by actively transcribed regions form the boundaries of some TADs, while other TAD boundaries are demarcated by insulator elements. In mammals, the insulator protein CTCF colocalizes with a subset of TAD boundaries (De, 2019).
The current understanding in chromatin biology hypothesizes that the folding of chromatin into domains assists in the packaging of long stretches of DNA inside the eukaryotic nucleus. Further, it is the organization of these domains that facilitates spatial and temporal regulation of genes within them. Thus, understanding how the domains are formed is of prime importance. How large PcG domains/TADs are formed is a central question. To date, researchers have extensively studied how PcG proteins are recruited to specific DNA sequences and which proteins are present in PcG complexes. PREs are required for the recruitment of PcG proteins in Drosophila and are thought to initiate the formation of a Polycomb domain. Previous work examined the 113-kb PcG domain that encompasses the invected (inv) and engrailed (en) genes of Drosophila. Unexpectedly, deletion of strong, constitutive PREs from the endogenous inv-en domain had little effect on inv-en PcG domain organization. Weak PREs present in the inv-en domain were sufficient to establish and maintain the overall domain organization, and some weak PREs overlap with enhancers present within the inv-en domain. Similarly, deletion of the bxd PRE had a mild effect on Ubx expression, whereas deletion of the iab7 PRE resulted in misexpression of Abd-B in very specific parasegments. A recent report showed that deletion of two PREs from the dac locus causes prominent structural and functional changes in the locus (De, 2019).
The primary aim in this work was to study the effect of "chromatin context" on a PcG domain; the following main conclusions can be drawn from this study. First, the en PcG domain at ectopic genomic sites rescues the null mutants of inv-en. This shows that the information to form the chromatin domain is primarily present within the domain itself. Second, spreading of the chromatin state at different sites of the genome is very much dependent on the local "context" itself. Third, a repressive chromatin domain can interact with null chromatin (has no chromatin mark) but stays segregated from the active chromatin. Fourth, a fragment of DNA containing the strong, constitutive PREs is required at an ectopic location to ensure PcG silencing in all tissues. Strikingly, the endogenous locus is resilient to loss of the same DNA fragment, indicating the importance of context on PcG chromatin domain formation and function. Fifth, ubiquitously expressed flanking genes may act as boundaries for enhancers within PcG domains. These data provide experimental evidence for the hypothesis that chromosomal neighborhood plays an important role in regulating gene expression (De, 2019).
Transgenic assays to test 'functional PREs' have highlighted the effect of chromatin context on reporter gene expression. For example, the strong PREs upstream of en, present in the 1.5-kb fragment examined in this study, only act to repress reporter transgene expression in about 50% of chromosomal insertion sites. Previous work has shown that these PREs are dispensable from the endogenous inv-en locus. The normal phenotype of en80Δ1.5 flies also supports these data. However, this study shows that the HAen79@attP40 transgene has a higher level of H3K27me3 accumulation than HAen79Δ1.5@attP40. This reduced level of H3K27me3 is sufficient to maintain correct en expression in embryos and imaginal discs. Notably, the absence of strong PREs in HAen79Δ1.5@attP40 caused misexpression of en during adult abdomen development. These data emphasize the importance of context itself on the stability and resiliency of the PcG domain toward modification (mutation or deletion) of regulatory sequences present within the domain. The data also show that deletion of the 1.5-kb DNA fragment that includes the strong en PREs rendered the HAen79 transgene unable to rescue inv en mutants. Strikingly, HA-en expression from HAen79Δ1.5 was very low in the embryonic nervous system, a result not seen with HAen79 or in en80Δ1.5 flies. These data suggest that the embryonic nervous system enhancers are not able to interact well with the HA-en promoter in the absence of the 1.5-kb DNA fragment. This same deletion, when present at the endogenous locus, does not cause a loss of en expression in the nervous system. This suggests that the structure of the endogenous locus is resilient to the loss of this DNA. This study suggests that the chromosomal location of the endogenous locus aids in the proper folding of en to facilitate enhancer-promoter communication in the embryonic nervous system. Another explanation for this observation is that the 1.5-kb PRE could be acting as a Trithorax response element (TRE) and bind to Trithorax group proteins for proper en expression. In any case, the essential role of the 1.5-kb PRE/TRE at the ectopic locus is alleviated by the 'local context' effect at the endogenous locus (De, 2019).
It has been proposed that establishment and inheritance of H3K27me3 are dependent on two factors: (i) sequence-specific recruitment of the chromatin modifiers and (ii) the ability of H3K27me3 to act as a template for PRC2 to bind and modify other nucleosomes present in the vicinity. The current experiments show that the spreading of H3K27me3 to the flanking chromatin surrounding the insertion site is not very efficient, although no active chromatin mark was present in the vicinity (particularly at attP3). However, 4C-seq analysis showed that the ectopic en PcG domain interacted with flanking chromatin significantly until it encountered the active domain. These observations indicate that mere interactions between the PcG domain and flanking chromatin were not able to spread the repressive mark efficiently. This observation highlights the importance of PcG recruitment via PREs for PcG domain formation (De, 2019).
One of the surprising observations in this study was the deposition of the repressive mark over the exons of Msp300 in larvae that contain HAen79@attP40. This accumulation is also observed in Kc167 cells but not in larval samples that did not have HAen79 inserted at attP40. It is assumed that some regions in the genome are more susceptible to the PcG regulation and that the weak binding peaks of PcGs near the attP40 site might be acting as PREs to facilitate spreading of the mark. It is posited that, when a PcG domain is inserted into attP40, it renders the adjacent chromatin more likely to form an H3K27me3 domain. It is noted that the Msp300 gene is covered by both H3K27me3 and H3K36me3. This is likely due to the mixed cell population in larval brains and discs. What about the spreading of the H3K27me3 mark over the exons? This is unusual; however, pre-mRNA and cotranscriptional activity have been linked to the local chromatin structure. In a genomic study of many histone modifications in human and Caenorhabditis elegans DNA, the H3K27me3 mark was enriched over exons. In addition, physical interaction between mammalian splicing factors (U2snRNP and Sf3b1) and PcG proteins (Zfp144 and Rnf2) was reported to be required for proper repression of Hox genes. It is speculated that cotranscriptional recruitment of PcG proteins over the Msp300 gene in the transgenic line might have established the exon-specific deposition of the H3K27me3 mark. It is proposed that the exon-specific H3K27me3 accumulation can act as an intermediate step to repress transcriptionally active target genes and establish PcG domains during development (De, 2019).
Two interesting questions for chromatin biologists are the following: How does a meter-long genome fit into a nucleus and how does this folding influence genome function? High-resolution Hi-C experiments have given structural insights into interphase chromosomes in eukaryotic nuclei. According to the Hi-C data, chromatin is organized into TADs or 'contact domains', and the TADs form compartments A (enriched with active domains) and B (enriched with inactive domains). While the Drosophila genome has 'compartments,' the existence of TADs in Drosophila is disputed, and despite the evidence that TADs and compartments are important for chromatin organization and function, basic information about how these structures are formed and maintained has been incomplete. The current data show that it is the intrinsic property of chromatin to segregate based on histone modifications and gene activity. This property of chromatin is not locus specific; in different chromatin contexts, repressed chromatin tends to segregate from active domains. Biochemical and molecular evidence on the antagonistic behavior between the H3K27me3 and H3K36me3 modifications also supports this claim and points toward H3K36me3 as a chromatin component that restricts the PcG-mediated spread of H3K27me3. How are large PcG domains formed and maintained? The establishment of the PcG domain and the spreading of H3K27me3 start from the strong PREs present in the PcG domain during cell cycle 14. Repressive loops between PREs within PcG domains are also formed during cell cycle 14. These loops are proposed to play a synergistic role in establishing PcG domains. Surprisingly, deletion of some strong PREs in situ resulted in weak phenotypes, suggesting redundancy of PcG recruitment. This study provides evidence that apart from the minor PREs present in the large PcG domains, chromatin context itself is a critical factor that determines robustness and function of PcG domains (De, 2019).
Polycomb-group (PcG) proteins are epigenetic regulators that maintain the transcriptional repression of target genes following their initial repression by transcription factors. PcG target genes are repressed in some cells, but active in others. Therefore, a mechanism must exist by which PcG proteins distinguish between the repressed and active states and only assemble repressive chromatin environments at target genes that are repressed. This study presents experimental evidence that the repressed state of a Drosophila PcG target gene, giant (gt), is not identified by the presence of a repressor. Rather, de novo establishment of PcG-mediated silencing at gt is the default state that is prevented by the presence of an activator or coactivator, which may inhibit the catalytic activity of Polycomb-repressive complex 2 (PRC2) (Ghotbi, 2021).
Tissue homeostasis requires long-term lineage fidelity of somatic stem cells. Whether and how age-related changes in somatic stem cells impact the faithful execution of lineage decisions remains largely unknown. This study addressed this question using genome-wide chromatin accessibility and transcriptome analysis as well as single-cell RNA-seq to explore stem-cell-intrinsic changes in the aging Drosophila intestine. These studies indicate that in stem cells of old flies, promoters of Polycomb (Pc) target genes become differentially accessible, resulting in the increased expression of enteroendocrine (EE) cell specification genes. Consistently, age-related changes were found in the composition of the EE progenitor cell population in aging intestines, as well as a significant increase in the proportion of EE-specified intestinal stem cells (ISCs) and progenitors in aging flies. This study further confirmed that Pc-mediated chromatin regulation is a critical determinant of EE cell specification in the Drosophila intestine. Pc is required to maintain expression of stem cell genes while ensuring repression of differentiation and specification genes. These results identify Pc group proteins as central regulators of lineage identity in the intestinal epithelium and highlight the impact of age-related decline in chromatin regulation on tissue homeostasis (Tauc, 2021).
This study identifies a loss of lineage fidelity in ISCs of aging flies that results in an imbalance of EE vs EC differentiation and contributes to epithelial dysplasia. This loss of lineage fidelity is caused by age-related deregulation of Pc function in ISCs, resulting in de-repression and preferential expression of EE genes. It is proposed that the chronic activation of stress signaling in ISCs, triggered by local and systemic inflammatory stimuli in the aging intestine, promotes the deregulation of Pc-controlled gene activity. This is supported by the fact that genetically elevating JNK activity in ISCs disrupts lineage fidelity and causes an increase in the proportion of EEs in the gut epithelium. The increase in EE numbers contributes to epithelial dysplasia in the aging gut, as EEs can promote ISC proliferation. The stress-induced changes in lineage fidelity in ISCs thus likely set up a vicious cycle that causes progressive dysplasia and results in disruption of epithelial structure and function in the aging intestine (Tauc, 2021).
The scRNA-seq analysis of the Drosophila gut is consistent with a recent scRNA-seq study from young flies (Hung, 2020), but also captures the changes in intestinal cell states across aging. Notably, the well-characterized age-associated increase in mitotically active ISCs was observed, but a unique 'stressed ISC' cell population was identified that increases with age. The transcriptional signature distinguishing this cell population encompasses over 25% of significantly upregulated genes observed in a bulk RNA-seq study from purified old ISCs, supporting the robustness and complementarity of the two methods. This transcriptional signature was enriched in genes involved in glutathione metabolic processes, chaperone-mediated protein folding, response to heat, and regulation of cytoskeleton organization, consistent with a stressed or damaged cell state. The appearance of this 'stressed' stem cell population is further consistent with the previously described increase in inflammatory and oxidative stress in the aging intestinal epithelium, and may reflect the age-associated accumulation of oxidative, proteostatic and genomic damage in these cells. Overall, the majority of old ISCs reside in an activated cell state (~50%), whereas the 'stressed ISC' population makes up only a small percentage (<10%). How this 'stressed' population affects tissue homeostasis requires further studies. Intriguingly, the p53 and DNA repair pathways are upregulated in this cell population, while cell cycle genes are repressed, indicating that these cells may represent a correlate to mammalian senescent cells (Tauc, 2021).
The EE progenitor population within one cluster not only increases in size with age, but also upregulates pro-neural genes that are markers for neural stem cells (NSCs). It was recently shown that in a neuroendocrine tumor model, ISCs undergo an identity switch that results in the acquisition of NSC-like features (Li, 2020). This NSC gene signature was also upregulated in old ISCs analyzed by bulk RNA-seq, suggesting that a general upregulation of these genes may contribute to age-related ISC phenotypes (Tauc, 2021).
ATAC-seq data from purified ISC populations revealed only moderate changes in chromatin organization in ISCs of increasingly older animals, suggesting that ISC gene regulation is tightly controlled throughout life. At the same time, the significant increase in H3K27 dimethylation levels in aging ISCs, the fact that H3K27me2 levels are higher in EEs than in ISCs in young guts, and the observation that E(z)-mediated methylation of H3K27 is required for EE specification, all support a role for increases in H3K27me2 in skewing ISC identity towards the EE fate. Despite the high genomic abundance of H3K27me2, which accounts for up to 70% of total histone H3, the functional role of H3K27me2 remains largely uncharacterized. The broad genomic distribution of H3K27me2 was shown to suppress aberrant gene activation by controlling enhancer fidelity in mammals, and by limiting access of transcription factors and RNA Pol II to DNA in flies. If and how genomic abundance and/or distribution of H3K27me2 affects cell identity or other cellular functions has not been well explored. One study found that perturbing the ratio of H3K27me2/H3K27me3 in mouse embryonic stem cells (ESCs) affected the acquisition and repression of specific fates of these cells, indicating the importance of appropriate regulation of these marks in different cell types. The differential abundance of H3K27me2 that was observed in young ISCs and EEs further supports the importance of dynamic H3K27me2 regulation in the ISC lineage and of appropriate control of this mark to maintain lineage commitment. The loss of EE cell differentiation upon Pc or E(z) depletion in ISCs further supports a critical role for PRC in regulating H3K27 methylation status and thereby lineage fidelity. Of note, the expression of Pc, E(z) and other PcG genes, as well as the expression of trx, Trl and z, was not significantly altered in aging ISCs, suggesting that aging most likely affects their post-transcriptional regulation and/or function (Tauc, 2021).
Age-related changes in H3K27 methylation have been reported in mammalian SC populations: in aging HSCs and muscle SCs, the H3K27me3 signal exhibits broader coverage and increased intensity at transcriptional start sites and intergenic regions, indicating that there is an evolutionarily conserved effect of aging on PRC function in tissue stem cells. It would be intriguing to explore whether these alterations in H3K27me3 may underlie the age-related dysfunction in lineage potential observed in HSCs of old mice (Tauc, 2021).
Since loss of Pc induces the expression of EC genes, represses EE gene expression, and results in less accessible chromatin associated with ISC identity genes (esg, spdo) as well as pro-neural genes (dpn), it is proposed that Pc activity regulates multiple aspects of ISC specification. Despite the upregulation of EC genes after Pc depletion, ISCs did not spontaneously differentiate, ISC numbers remained normal and ISCs could still mount a proliferative response to infection. Thus, ISC function remained largely intact suggesting the primary function of Pc in ISCs is to regulate lineage commitment (Tauc, 2021).
The contrasting function of trx in EE differentiation is consistent with the known antagonism between Trx and Pc complexes and exemplifies a tightly regulated interplay of these systems in lineage commitment. In addition to upregulating EE genes, loss of trx also induced cell cycle genes and ISC proliferation, suggesting additional roles in controlling ISC function. This finding is in line with a recently published study showing TrxG factors Kismet and Trr limit ISC proliferation in the fly midgut. The fact that this group did not report changes in the EE lineage most likely reflects functional differences across TrxG complexes, the composition of which varies greatly (Tauc, 2021).
A similar function in stem cell specification has been described for PRC1 and PRC2 in the mouse, where both complexes are important to preserve ISC and progenitor cell identity in the gut, while regulating specification into specific daughter cell lineages. Loss of PRC1 function in the intestinal epithelium resulted in impairment of ISC self-renewal via de-repression of non-intestinal lineage genes as well as negative regulators of the Wnt signaling pathway. Interestingly, the effects of PRC1 loss were independent of H3K27me3, revealing instead the role of H2AK119 mono-ubiquitination. PRC2 was shown to be important in ISCs only during damage-induced regeneration. In contrast, another study found significant degeneration of the SC compartment under homeostatic conditions as well. Reconciling these findings, a recent study revealed that in the absence of PRC2, mammalian cells shed H3K27me3 exclusively by replicational dilution of modified nucleosomes, and that the effects of PRC2 deletion are thus only observed in lineage progeny rather than in stem cells themselves. Both previous studies are in agreement, however, that PRC2 controls cell fate decisions, as loss of PRC2 leads to an accumulation of secretory cells, evidently due to de-repression of the secretory lineage master regulator, Atoh1, resulting in ISC differentiation. It remains unknown how aging affects PRC function and H3K27 methylation in the mammalian intestine (Tauc, 2021).
Only a few studies have rigorously investigated age-associated changes in the mammalian intestine on a histological and cell-type-specific level. One study reported changes in crypt architecture, decreased mitotic potential of ISCs and an increase in the secretory cell lineage, most likely due to increased Atoh1 expression. Given the role of PRC2 in regulating the secretory lineage in both the mammalian intestine and the fly, it is tempting to speculate that there is a conserved age-related increase in the secretory lineage that stems from deregulation of PRC. The results support a role for increased stress signaling in driving this lineage imbalance, as overactive JNK in ISCs promotes EE differentiation. While JNK signaling has been reported to suppress Pc complex function, the data indicate that in the ISC lineage, this interaction is more complex, as JNK activation promotes EE specification, while Pc knockdown inhibits it. Further studies are needed to explore the molecular mechanisms mediating JNK/Pc interactions in the ISC lineage (Tauc, 2021).
Chronic elevation of inflammatory signaling is a well-characterized hallmark of the aging fly intestine and a hallmark of many intestinal disorders including inflammatory bowel disease (IBD), infection and colorectal cancer. Alterations in EE cell numbers and secretory activity have been reported to play a role in many diseases. In IBD, for example, EE cells were shown to contribute to pathogenesis by producing pro-inflammatory cytokines. In another study, increased numbers of EE cells were reported in human patients with chronic ulcerative colitis, potentially promoting IBD associated neoplasias. Notably, this study showed that lowering EE numbers by long-term depletion of Pc in ISCs inhibited age-induced intestinal dysplasia, supporting a pathological role for EEs in aging (Tauc, 2021).
The role of epigenetic alterations, and specifically the role of PRC, in inflammatory diseases and cancer is still under investigation. It was recently reported that suppressing EZH2 activity ameliorates experimental intestinal inflammation and delays the onset of colitis-associated cancer. However, these effects may be a consequence of EZH2 suppression in myeloid cells rather than in intestinal stem cells. Disruption in PRC2 function may also underlie human cancers, where PRC2 is often hyperactive or overexpressed. Activating EZH2 mutations, which increase total H3K27me3 levels, increase tumor survival and growth in pre-clinical models and are found in up to 24% of diffuse large B cell lymphomas. Additional work will be needed to establish whether age-related changes in PRC activity contribute to the increased onset of gastrointestinal cancers during aging (Tauc, 2021).
Taken together, these findings provide evidence for altered ISC cell states in old flies that affect intestinal homeostasis and contribute to tissue dysplasia. The results exemplify the importance of maintaining appropriate lineage decisions, as overproduction of EE cells is detrimental to the epithelium, but can be rescued by re-balancing the system towards normal EE numbers. Age-associated deregulation of lineage fidelity of ISCs due to elevated stress and misregulation of Pc are proposed as key drivers of functional decline of the intestinal epithelium. Pc group proteins may thus represent valuable therapeutic targets for age-related morbidities (Tauc, 2021).
Maternally-deposited morphogens specify the fates of embryonic cells via hierarchically regulating the expression of zygotic genes that encode various classes of developmental regulators. Once the cell fates are determined, Polycomb-group proteins frequently maintain the repressed state of the genes. This study investigates how Polycomb-group proteins repress the expression of tailless, which encodes a developmental regulator in the Drosophila embryo. Previous studies have shown that maternal Tramtrack69 facilitates maternal GAGA-binding factor and Heat shock factor binding to the torso response element (tor-RE) to initiate tailless repression in the stage-4 embryo. Chromatin-immunoprecipitation and genetic-interaction studies show that maternally-deposited Polycomb repressive complex 1 (PRC1) recruited by the tor-RE-associated Tramtrack69 represses tailless expression in the stage-4 embryo. A noncanonical Polycomb-group response element (PRE) is mapped to the tailless proximal region. High levels of Bric-a-brac, Tramtrack, and Broad (BTB)-domain proteins are fundamental for maintaining tailless repression in the stage-8 to -10 embryos. Tramtrack69 sporadically distributes in the linear BTB-domain oligomer, which recruits and retains a high level of PRC1 near the GCCAT cluster for repressing tll expression in the stage-14 embryos. Disrupting the retention of PRC1 decreases the levels of PRC1 and Pleiohomeotic protein substantially on the PRE and causes tailless derepression in the stage-14 embryo. Furthermore, the retained PRC1 potentially serves as a second foundation for assembling the well-characterized polymer of the Sterile alpha motif domain in Polyhomeotic protein, which compacts chromatin to maintain the repressed state of tailless in the embryos after stage 14 (Liaw, 2022).
Polycomb group (PcG) and trithorax group (trxG) proteins contribute to the specialization of cell types by maintaining differential gene expression patterns. This study aimed at discovering novel factors that elicit an anti-silencing effect to facilitate trxG-mediated gene activation. This study has developed a cell-based reporter system and performed a genome-wide RNAi screen to discover novel factors involved in trxG-mediated gene regulation in Drosophila. More than 200 genes were discovered affecting the reporter in a manner similar to trxG genes. From the list of top candidates, Enoki mushroom (Enok), a known histone acetyltransferase, was characterized as an important regulator of trxG in Drosophila. Mutants of enok strongly suppressed the extra sex comb phenotype of Pc mutants and enhanced homeotic transformations associated with trx mutations. Enok colocalizes with both TRX and PC at chromatin. Moreover, depletion of Enok specifically resulted in an increased enrichment of PC and consequently silencing of trxG targets. This downregulation of trxG targets was also accompanied by a decreased occupancy of RNA-Pol-II in the gene body, correlating with an increased stalling at the transcription start sites of these genes. It is proposed that Enok facilitates trxG-mediated maintenance of gene activation by specifically counteracting PcG-mediated repression. This ex vivo approach led to identification of new trxG candidate genes that warrant further investigation. Presence of chromatin modifiers as well as known members of trxG and their interactors in the genome-wide RNAi screen validated the reverse genetics approach. Genetic and molecular characterization of Enok revealed a hitherto unknown interplay between Enok and PcG/trxG system. It is concluded that histone acetylation by Enok positively impacts the maintenance of trxG-regulated gene activation by inhibiting PRC1-mediated transcriptional repression (Umer, 2019).
This study has developed an ex vivo approach that led to the discovery of several new genes regulating trxG-mediated gene activation. Using a well-characterized bxd-PRE-reporter, comprised of Ubx promoter and enhancers, a cell-based assay was developed, and a genome-wide RNAi screen in Drosophila was performed. Based on the Z scores of trx and ash1 knockdown, a stringent cut-off was defined and more than 200 genes affecting the reporter in a manner similar to trxG genes were identified. Identification of known members of trxG and their interactors as well as chromatin modifiers in the genome-wide RNAi screen validated the reverse genetics approach and efficacy of the reporter system to discover new regulators of trxG. Moreover, presence of chromatin modifiers like members of TIP60 complex and proteins associated with RNA polymerase II, known to interact with trxG, further substantiates that regulators of gene activation were predominantly identified. Although only a subset of known trxG members were identified in the screen, failure to identify all can be attributed to the highly context-dependent working of PcG/trxG system. Since two specific enhancers of Ubx drive the expression of the reporter, it might be regulated by only a subset of trxG members, which could further explain the failure to identify all members of trxG. Interestingly, some of the top scoring candidates in the screen were also recently found to be a part of the interaction network of GAGA factor, a known trxG member (Umer, 2019).
TrxG-like behavior of Enok was characterized, and its genetic and molecular link with trxG was established. Although Drosophila Enok has previously been shown to interact with PC and Ash1, its physiological relevance to PcG/trxG or epigenetic cellular memory remains elusive. The current results demonstrate that enok behaves like a trxG gene, by antagonizing PcG, and is essential for maintaining active gene expression in Drosophila. Appearance of extra sex combs in Pc heterozygous males is a consequence of ectopic activation of homeotic genes which relies upon the trxG. However, depletion of trxG proteins counteracts the reduced dose of PC, restoring normal regulation of homeotic genes and suppressing the extra sex comb phenotype. Strong suppression of extra sex comb phenotype by two different mutants of enok illustrates that it acts as a trxG gene, consequently counteracting repression maintained by PcG. This finding is further supported by the fact that both mutant alleles of enok strongly enhance trx mutant phenotype, which is also consistent with the drastic reduction in transcript levels of trxG target genes in embryos lacking functional enok. A significant overlap between Enok and TRX at chromatin further validates the genetic analysis. Since depletion of enok led to increased PC binding and enhanced H2AK118ub1 at trxG targets, it is suggested that Enok may specifically inhibit PRC1 and facilitate anti-silencing activity of trxG. In contrast, no change in enrichment of E(z) and its associated mark, H3K27me3, was observed at TSS of trxG targets in cells with reduced enok, indicating recruitment of PRC1 in a potentially H3K27me3-independent manner. Such PRC2-independent recruitment of PRC1 has also been reported previously (Umer, 2019).
In light of these results, it is proposed that Enok counteracts PRC1-mediated block of transcription, evident in the form of stalled Pol-II at the TSS of pnr and pnt in cells with depleted enok. Molecular interaction of Enok with PRC1 on developmental genes in flies and humans (Kang, 2017) further supports the notion that Enok facilitates trxG by inhibiting PRC1. In mice, MOZ (homolog of Enok) is known to play an antagonistic role to PcG member BMI1 in regulating Hox genes (Sheikh, 2015). In agreement with the finding that PC chromodomain binding to H3K27me3 requires an unmodified H3K23, the data suggest that Enok-mediated H3K23ac inhibits binding of PC to its target genes. It is proposed that in the presence of Enok at active loci, acetylated H3K23 inhibits binding of PRC1 leading to increased transcriptional activity of Pol-II. In contrast, loss of Enok leads to decreased H3K23ac, thus allowing PRC1 binding and consequent stalling of Pol-II at TSS. Since Enok was also found to associate with silent loci (bxd, Dfd, iab-7) and interact with PRC1, it is suggested that Enok is kept in an inactive state on these loci by PC in a manner similar to the inhibitory interaction between PC and CBP. Further molecular and biochemical characterization of this intricate relationship between PcG and Enok will help discover how trxG maintains dynamic gene expression patterns during development (Umer, 2019).
In summary, this study has developed a cell-based assay for an ex vivo genome-wide RNAi screen to identify potential trxG regulators in Drosophila. The RNAi screen led to the discovery of more than 200 genes that perturbed the luciferase-based reporter in a manner similar to known trxG members. This study has also provided evidence that Enok, a top trxG candidate in the screen, contributes to anti-silencing action of trxG by counteracting PcG proteins. It is proposed that H3K23 acetylation by Enok counteracts PcG-mediated suppression by inhibiting PRC1 recruitment, contributing to gene activation. Genetic and molecular evidence obtained suggests that Enok interacts with trxG and as a result with their major developmental regulatory targets, thus providing a possible molecular link through which it could influence epigenetic cell memory (Umer, 2019).
The MOZ/MORF histone acetyltransferase complex is highly conserved in eukaryotes and controls transcription, development, and tumorigenesis. However, little is known about how its chromatin localization is regulated. Inhibitor of growth 5 (ING5) tumor suppressor is a subunit of the MOZ/MORF complex. Nevertheless, the in vivo function of ING5 remains unclear. This study reports an antagonistic interaction between Drosophila Translationally controlled tumor protein (TCTP) (Tctp) and ING5 (Ing5) required for chromatin localization of the MOZ/MORF (Enok) complex and H3K23 acetylation. Yeast two-hybrid screening using Tctp identified Ing5 as a unique binding partner. In vivo, Ing5 controlled differentiation and down-regulated epidermal growth factor receptor signaling, whereas it is required in the Yorkie (Yki) pathway to determine organ size. Ing5 and Enok mutants promoted tumor-like tissue overgrowth when combined with uncontrolled Yki activity. Tctp depletion rescued the abnormal phenotypes of the Ing5 mutation and increased the nuclear translocation of Ing5 and chromatin binding of Enok. Nonfunctional Enok promoted the nuclear translocation of Ing5 by reducing Tctp, indicating a feedback mechanism between Tctp, Ing5, and Enok to regulate histone acetylation. Therefore, Tctp is essential for H3K23 acetylation by controlling the nuclear translocation of Ing5 and chromatin localization of Enok, providing insights into the roles of human TCTP and ING5-MOZ/MORF in tumorigenesis (Kim, 2023).
Regulatory decisions in Drosophila require Polycomb group (PcG) proteins to maintain the silent state and Trithorax group (TrxG) proteins to oppose silencing. Since PcG and TrxG are ubiquitous and lack apparent sequence specificity, a long-standing model is that targeting occurs via protein interactions; for instance, between repressors and PcG proteins. Instead, this study found that Pc-repressive complex 1 (PRC1) purifies with coactivators Fs(1)h [female sterile (1) homeotic] and Enok/Br140 during embryogenesis. Fs(1)h is a TrxG member and the ortholog of BRD4, a bromodomain protein that binds to acetylated histones and is a key transcriptional coactivator in mammals. Enok and Br140, another bromodomain protein, are orthologous to subunits of a mammalian MOZ/MORF acetyltransferase complex. This study confirmed PRC1-Br140 and PRC1-Fs(1)h interactions and identified their genomic binding sites. PRC1-Br140 bind developmental genes in fly embryos, with analogous co-occupancy of PRC1 and a Br140 ortholog, BRD1, at bivalent loci in human embryonic stem (ES) cells. It is proposed that identification of PRC1-Br140 'bivalent complexes' in fly embryos supports and extends the bivalency model posited in mammalian cells, in which the coexistence of H3K4me3 and H3K27me3 at developmental promoters represents a poised transcriptional state. It is further speculated that local competition between acetylation and deacetylation may play a critical role in the resolution of bivalent protein complexes during development (Kang, 2017).
Inappropriate activation and/or repression of gene expression underlies many human diseases, yet the mechanisms that execute transitions in developmental gene expression remain poorly defined. How are genes chosen to be initially active or repressed, and how are transitions in gene activity managed with fidelity? Transcription factors clearly regulate these changes, but how can this regulation occur with such specificity when their consensus binding sites and genomic occupancy appear so promiscuous? Together, proteomic and ChIP-seq analyses suggest a model in which PRC1 and MOZ/MORF function to create a poised regulatory state during development (see Model for the role of bivalent complexes in developmental transitions of transcriptional state). As cells differentiate, bivalent protein complexes may eventually be diminished locally, as most loci resolve into either an active or silent state. It is speculated that the choice of activation may occur via increased acetylation, influenced by nearby transcription factors, and subsequent enrichment of Fs(1)h and TrxG proteins such as Ash1, which was specifically recovered in a Br140 pull-down. A transition toward silencing may involve deacetylation and a decrease in TrxG (Kang, 2017).
The retention of some bivalency after initial transcriptional choices are made in embryogenesis is likely to allow critical reversibility for subsequent gene expression programming. However, if transcriptional state is not dictated strictly by the occupancy of bivalent components, how are these states manifested? It is speculated that local post-translational modifications (PTMs) may be critical for the specification of transcriptional state and for reversibility. For example, the Enok subunit of dMOZ/MORF is known to acetylate H3K23, while this mark is incompatible with Pc chromodomain binding to H3K27me3 on the same histone tail. Interestingly, enrichment of H3K23ac from modENCODE data sets on the set of potentially bivalent genes was not observed, but further analysis will be required to investigate the significance of this finding. Competition between the cognate enzymatic activities within bivalent complexes and their interactors may be central to their ability to act as reversible switches of transcriptional state. Future studies to address this hypothesis will require improved approaches to comprehensive PTM detection as well as in vitro reconstitution of key interactions and biochemical activities of bivalent complexes containing the appropriately modified subunits (Kang, 2017).
The results are consistent with recent studies in which PRC1 is found on active genes in many systems, and PRC1 targeting is largely independent of PRC2 (Kahn, 2016). Most exciting is the likely conservation in zebrafish (Laue, 2008) and mice (Sheikh, 2015), based on the opposing genetic activities of PRC1 and MOZ/MORF complexes in regulation of the Hox genes. The reliance on a universal transducer of transcription factor activity in developmental decisions would be an elegant solution to the problem of widespread binding of sequence-specific regulators, as, in the model (see Model for the role of bivalent complexes in developmental transitions of transcriptional state), only local interactions with preset bivalency will result in functional consequences (Kang, 2017).
Key fundamental questions remain. In particular, how are PRC1 and MOZ/MORF targeted in the first place? PREs are cis-acting regulatory elements that can recruit PRC1 and PRC2 to target genes in Drosophila. PREs lack universal consensus sequences but contain combinations of motifs for many DNA-binding proteins. Therefore, diverse protein–protein interactions with the PcG could be critical for initial binding, as postulated from classical genetics. A speculative alternative is that the 5′ TSSs of developmentally regulated genes may remain epigenetically marked throughout the life cycle of the organism to specify the initial association of bivalent complexes. Both BRD4 and BRPF1 have been identified as 'bookmarking proteins' that may retain vital information throughout the cell cycle, based on their ability to remain at their chromosomal binding sites through mitosis (Dey, 2003; Laue, 2008). Furthermore, Fs(1)h and Enok are essential for oogenesis, and genic acetylation is detected very early in embryogenesis (Li, 2014). Finally, the importance of maternal E(z) suggests that H3K27me3 could be at least part of such an inherited mark for developmental genes (Kang, 2017).
In summary, the results provide evidence for bivalent protein complexes that may correspond to a bivalent transcriptional state in Drosophila embryos and mammalian stem cells. Beyond identification of these intriguing protein interactions in flies, it is speculated that their identity reveals a likely role for acetylation in the resolution of bivalency. It is envisioned that the choice toward activation may be triggered and maintained by cell-specific transcription factors that drive the acetylated state, favoring MOZ/MORF and BRD4 bromodomain-dependent association with chromatin. Cell type decisions may be governed by a constant assessment of the amount of acetylation at each TSS, consistent with the enrichment of deacetylases on even very active genes. Deacetylation would favor loss of bromodomain–acetyl interactions and ultimately the loss of coactivators, leading toward the establishment of a stably silenced state. The ability to regulate genes while only partially resolving bivalent complexes is likely to be critical for reversibility in response to changes in cell type-specific transcription factor expression and binding. It is proposed that regulatory elements possess the intrinsic ability to switch fate dependent on this local balance, with de novo targeting rarely required (Kang, 2017).
Neural progenitors produce diverse cells in a stereotyped birth order, but can specify each cell type for only a limited duration. In the Drosophila embryo, neuroblasts (neural progenitors) specify multiple, distinct neurons by sequentially expressing a series of temporal identity transcription factors with each division. Hunchback (Hb), the first of the series, specifies early-born neuronal identity. Neuroblast competence to generate early-born neurons is terminated when the hb gene relocates to the neuroblast nuclear lamina, rendering it refractory to activation in descendent neurons. Mechanisms and trans-acting factors underlying this process are poorly understood. This study identified Corto, an enhancer of Trithorax/Polycomb (ETP) protein, as a new regulator of neuroblast competence. The GAL4/UAS system was used to drive persistent misexpression of Hb in neuroblast 7-1 (NB7-1), a model lineage for which the early competence window has been well characterized, to examine the role of Corto in neuroblast competence. Immuno-DNA fluorescence in situ hybridization (DNA FISH) was used in whole embryos to track the position of the hb gene locus specifically in neuroblasts across developmental time, comparing corto mutants to control embryos. Finally, immunostaining was used in whole embryos to examine Corto's role in repression of Hb and a known target gene, Abdominal B (Abd-B). In corto mutants, the hb gene relocation to the neuroblast nuclear lamina was found to be delayed and the early competence window was extended. The delay in gene relocation occurs after hb transcription is already terminated in the neuroblast and is not due to prolonged transcriptional activity. Further, it was found that Corto genetically interacts with Posterior Sex Combs (Psc), a core subunit of polycomb group complex 1 (PRC1), to terminate early competence. Loss of Corto does not result in derepression of Hb or its Hox target, Abd-B, specifically in neuroblasts.
These results show that in neuroblasts, Corto genetically interacts with PRC1 to regulate timing of nuclear architecture reorganization and support the model that distinct mechanisms of silencing are implemented in a step-wise fashion during development to regulate cell fate gene expression in neuronal progeny (Hafer, 2022).
To maintain cellular identities during development, gene expression profiles must be faithfully propagated through cell generations. The reestablishment of gene expression patterns upon mitotic exit is mediated, in part, by transcription factor (TF) mitotic bookmarking. However, the mechanisms and functions of TF mitotic bookmarking during early embryogenesis remain poorly understood. This study took advantage of the naturally synchronized mitoses of Drosophila early embryos, providing evidence that GAGA pioneer factor (GAF) acts as a stable mitotic bookmarker during zygotic genome activation. During mitosis, GAF remains associated with a large fraction of its interphase targets, including at cis-regulatory sequences of key developmental genes with both active and repressive chromatin signatures. GAF mitotic targets are globally accessible during mitosis and are bookmarked via histone acetylation (H4K8ac). By monitoring the kinetics of transcriptional activation in living embryos, this study reports that GAF binding establishes competence for rapid activation upon mitotic exit (Bellec, 2022).
This study set out to determine how gene regulation by a transcription factor might be propagated through mitosis in a developing embryo. By using a combination of quantitative live imaging and genomics, evidence is provided that the pioneer-like factor GAF acts as a stable mitotic bookmarker during zygotic genome activation in Drosophila embryos (Bellec, 2022).
The results indicate that during mitosis, GAF binds to an important fraction of its interphase targets, largely representing cis-regulatory sequences of key developmental genes. It was noticed that GAF mitotically retained targets contain a larger number of GAGA repeats than GAF interphase-only targets and that this number of GAGA repeats correlates with the broadness of accessibility. Multiple experiments, with model genes in vitro (e.g., hsp70, hsp26) or from genome-wide approaches clearly demonstrated that GAF contributes to the generation of nucleosome-free regions. The general view is that this capacity is permitted through the interaction of GAF with nucleosome remodeling factors such as PBAP (SWI/SNF), NURF (ISWI), or FACT. Although not yet confirmed with live imaging, immunostaining data suggest that NURF is removed during metaphase but re-engages chromatin by anaphase. If the other partners of GAF implicated in chromatin remodeling are evicted during early mitosis, chromatin accessibility at GAF mitotic targets could be established prior to mitosis onset and then maintained through mitosis owing to the remarkable stability of GAF binding. However, GAF interactions with other chromatin remodelers (e.g., PBAP) during mitosis and a scenario whereby mitotic accessibility at GAF targets would be dynamically established during mitosis thanks to the coordinated action of GAF and its partners cannot be excluded (Bellec, 2022).
It is proposed that the function of GAF as a mitotic bookmarker is possible because GAF has the intrinsic property to remain bound to chromatin for long periods (residence time in the order of minutes). This long engagement of GAF with DNA is in sharp contrast with the binding kinetics of many other TFs, such as Zelda or Bicoid in Drosophila embryos or pluripotency TFs in mouse ES cells. Another particularity of GAF binding, contrasting with other TFs, resides in the multimerization of its DNA-binding sites as GAGAG repeats in a subset of its targets (76% of mitotically retained peaks display four or more repetitions of GAGAG motifs). Given the known oligomerization of GAF and as GAF is able to regulate transcription in a cooperative manner, it is tempting to speculate that GAF cooperative binding on long stretches of GAGAG motifs may contribute to a long residence time (Bellec, 2022).
Collectively, it is proposed that the combination of long residence time and the organization of GAF-binding sites in the genome may allow the stable bookmarking of a subset of GAF targets during mitosis (Bellec, 2022).
In this study, it was also discovered that a combination of GAF and histone modification could be at play to maintain the chromatin state during mitosis. Indeed, mitotic bookmarking may also be supported by the propagation of histone tail modifications from mother to daughter cells. Work from mammalian cultured cells revealed widespread mitotic bookmarking by epigenetic modifications, such as H3K27ac and H4K16ac. Moreover, H4K16ac transmission from maternal germline to embryos has recently been established. In the case of GAF, it is proposed that the combinatorial action of GAF and epigenetic marks, possibly selected via GAF interacting partners, will contribute to the propagation of various epigenetic programs. It would be therefore interesting to employ the established mitotic ChIP method to survey the extent to which cis-regulatory regions exhibit different mitotic histone mark modifications during embryogenesis (Bellec, 2022).
A key aspect of mitotic bookmarking is to relate mitotic binding to the rapid transcriptional activation after mitosis. This study has shown that GAF plays a role in the timing of reactivation after mitosis. However, it is noted that GAF binding during mitosis is not the only means to accelerate gene activation. Indeed, it has been shown that mechanisms such as enhancer priming by Zelda, paused polymerase or redundant enhancers contribute to fast gene activation. Moreover, a transcriptional memory bias can occur for a transgene not regulated by GAF. By modeling the transcriptional activation of the gene scylla, it was revealed that GAF accelerates the epigenetic steps prior to activation, selectively in the descendants of active nuclei. A model is proposed where GAF binding helps in the decision-making of the postmitotic epigenetic path. In this model, mitotic bookmarking by GAF would favor an epigenetic path with fast transitions after mitosis. In the context of embryogenesis, bookmarking would lead to the fast transmission of select epigenetic states and may contribute to gene expression precision (Bellec, 2022).
Interestingly, GAF vertebrate homolog (vGAF/Th-POK) has recently been implicated in the maintenance of chromatin domains during zebrafish development. It is therefore suspected that GAF action as a stable bookmarking factor controlling transcriptional memory during Drosophila ZGA might be conserved in vertebrates (Bellec, 2022).
Environmental temperature can affect chromatin-based gene regulation, in particular in ectotherms such as insects. Genes regulated by the Polycomb group (PcG) vary in their transcriptional output in response to changes in temperature. Expression of PcG-regulated genes typically increases with decreasing temperatures. This study examined variations in temperature-sensitive expression of PcG target genes in natural populations of Drosophila melanogaster from different climates, and differences thereof across different fly stages and tissues. Temperature-induced expression plasticity was found to be stage- and sex-specific with differences in the specificity between the examined PcG target genes. Some tissues and stages, however, showed a higher number of PcG target genes with temperature-sensitive expression than others. Overall, higher levels of temperature-induced expression plasticity were found in African tropical flies from the ancestral species range than in flies from temperate Europe. Differences were also observed between temperate flies, however, with more reduction of expression plasticity in warm-temperate than in cold-temperate populations. Although in general, temperature-sensitive expression appeared to be detrimental in temperate climates, there were also cases in which plasticity was increased in temperate flies, as well as no changes in expression plasticity between flies from different climates (Voigt, 2021).
Following acute genotoxic stress, both normal and tumorous stem cells can undergo cell-cycle arrest to avoid apoptosis and later re-enter the cell cycle to regenerate daughter cells. However, the mechanism of protective, reversible proliferative arrest, "quiescence," remains unresolved. This study shows that mitophagy is a prerequisite for reversible quiescence in both irradiated Drosophila germline stem cells (GSCs) and human induced pluripotent stem cells (hiPSCs). In GSCs, mitofission (Drp1) or mitophagy (Pink1/Parkin) genes are essential to enter quiescence, whereas mitochondrial biogenesis (PGC1α) or fusion (Mfn2) genes are crucial for exiting quiescence. Furthermore, mitophagy-dependent quiescence lies downstream of mTOR- and PRC2-mediated repression and relies on the mitochondrial pool of cyclin E. Mitophagy-dependent reduction of cyclin E in GSCs and in hiPSCs during mTOR inhibition prevents the usual G1/S transition, pushing the cells toward reversible quiescence (G0). This alternative method of G1/S control may present new opportunities for therapeutic purposes (Taslim, 2023).
The mechanisms coordinating energy consumption during stress response in multicellular organisms are not well understood. This study shows that loss of the epigenetic regulator G9a in Drosophila causes a shift in the transcriptional and metabolic responses to oxidative stress (OS) that leads to decreased survival time upon feeding the xenobiotic paraquat. During OS exposure, G9a mutants show overactivation of stress response genes, rapid depletion of glycogen, and inability to access lipid energy stores. The OS survival deficiency of G9a mutants can be rescued by a high-sugar diet. Control flies also show improved OS survival when fed a high-sugar diet, suggesting that energy availability is generally a limiting factor for OS tolerance. Directly limiting access to glycogen stores by knocking down glycogen phosphorylase recapitulates the OS-induced survival defects of G9a mutants. It is proposed that G9a mutants are sensitive to stress because they experience a net reduction in available energy due to (1) rapid glycogen use, (2) an inability to access lipid energy stores, and (3) an overinduced transcriptional response to stress that further exacerbates energy demands. This suggests that G9a acts as a critical regulatory hub between the transcriptional and metabolic responses to OS. These findings, together with recent studies that established a role for G9a in hypoxia resistance in cancer cell lines, suggest that G9a is of wide importance in controlling the cellular and organismal response to multiple types of stress (Riahi, 2019).
3' end cleavage of metazoan replication-dependent histone pre-mRNAs requires the multi-subunit holo-U7 snRNP and the stem-loop binding protein (SLBP). The exact composition of the U7 snRNP and details of SLBP function in processing remain unclear. To identify components of the U7 snRNP in an unbiased manner, a novel approach was developed for purifying processing complexes from Drosophila and mouse nuclear extracts. In this method, catalytically active processing complexes are assembled in vitro on a cleavage-resistant histone pre-mRNA containing biotin and a photo-sensitive linker, and eluted from streptavidin beads by UV irradiation for direct analysis by mass spectrometry. In the purified processing complexes, Drosophila and mouse U7 snRNP have a remarkably similar composition, always being associated with CPSF73, CPSF100, symplekin and CstF64. Many other proteins previously implicated in the U7-dependent processing are not present. Drosophila U7 snRNP bound to histone pre-mRNA in the absence of SLBP contains the same subset of polyadenylation factors but is catalytically inactive and addition of recombinant SLBP is sufficient to trigger cleavage. This result suggests that Drosophila SLBP promotes a structural rearrangement of the processing complex, resulting in juxtaposition of the CPSF73 endonuclease with the cleavage site in the pre-mRNA substrate (Skrajna, 2018).
In metazoans, 3' end processing of replication-dependent histone pre-mRNAs occurs through a single endonucleolytic cleavage, generating mature histone mRNAs that lack a poly(A) tail. This specialized 3' end processing reaction depends on the U7 snRNP, the core of which consists of a ∼60-nt U7 snRNA and a unique heptameric Sm ring. In the ring, the spliceosomal subunits SmD1 and SmD2 are replaced by the related Lsm10 and Lsm11 proteins, whereas the remaining subunits (SmB, SmD3, SmE, SmF and SmG) are shared with the spliceosomal snRNPs (Skrajna, 2018).
Lsm11 contains an extended N-terminal region that interacts with the N-terminal region of the 220 kDa protein FLASH. Together, they recruit a specific subset of the proteins that participate in 3' end processing of canonical pre-mRNAs by cleavage and polyadenylation, resulting in formation of the holo-U7 snRNP (Skrajna, 2014). This subset of polyadenylation factors is referred to as the histone pre-mRNA cleavage complex (HCC) and in mammalian nuclear extracts includes symplekin, all subunits of CPSF (CPSF160, WDR33, CPSF100, CPSF73, Fip1 and CPSF30) and CstF64 as the only CstF subunit. The remaining components of the cleavage and polyadenylation machinery, including CstF50 and CstF77, the two CF Im subunits of 68 and 25 kDa, and the two subunits of CF IIm (Clp1 and Pcf11) were consistently absent in the HCC (Yang, 2013). A similar subset of polyadenylation factors is associated with the Drosophila holo-U7 snRNP (Sabath, 2013; Skrajna, 2018 and references therein).
The substrate specificity in the processing reaction is provided by the U7 snRNA, which through its 5' terminal region base pairs with the histone downstream element (HDE), a sequence in histone pre-mRNA located downstream of the cleavage site. This interaction is assisted by the stem-loop binding protein (SLBP), which binds the highly conserved stem-loop structure located upstream of the cleavage site (Wang, 1996; Martin, 1997; Tan, 2013) and stabilizes the complex of U7 snRNP with histone pre-mRNA (Dominski, 1999), likely by contacting FLASH and Lsm11 (Skrajna, 2017). In mammalian nuclear extracts, histone pre-mRNAs that form a strong duplex with the U7 snRNA are cleaved efficiently in the absence of SLBP. In contrast, Drosophila nuclear extracts lacking SLBP are inactive in cleaving histone pre-mRNAs, suggesting that Drosophila SLBP plays an essential role in processing in addition to stabilizing binding of the U7 snRNP to histone pre-mRNA (Skrajna, 2017; Sabath, 2013; Dominski, 2003; Lanzotti, 2002; Dominski, 2005; Skrajna, 2018 and references therein).
Within the HCC, CPSF73 is the endonuclease, acting in a close partnership with its catalytically inactive homolog, CPSF100, and the heat-labile scaffolding protein symplekin (Kolev, 2005). RNAi-mediated depletion of these three HCC subunits in Drosophila cultured cells results in generation of polyadenylated histone mRNAs, an indication of their essential role in the U7-dependent processing. Depletion of the remaining components of the HCC had no effect on the 3' end of histone mRNAs and their function in the U7 snRNP, if any, is less clear. Previous in vivo studies implicated multiple other proteins, in addition to SLBP and components of the U7 snRNP, in generation of correctly processed histone pre-mRNAs. These proteins include ZFP100, CDC73/parafibromin, NELF E, Ars2, CDK9, CF Im68 and RNA-binding protein FUS/TLS (Fused in Sarcoma/Translocated in Sarcoma). ZFP100, CF Im68 and FUS were shown to interact with Lsm11, whereas Ars2 was shown to interact with FLASH, raising the possibility that they may be essential components of the cleavage machinery (Skrajna, 2018).
To determine which factors are required for the cleavage reaction, a novel method for purification of in vitro assembled Drosophila and mouse processing complexes was developed. In this method, histone pre-mRNAs containing biotin and a photo-cleavable linker in either cis or trans are incubated with a nuclear extract and the assembled processing complexes are immobilized on streptavidin beads, washed and released into solution by irradiation with long wave UV. This approach yielded remarkably pure processing complexes that were suitable for direct and unbiased analysis by mass spectrometry, providing a complete view of the holo-U7 snRNP and other proteins that associate with histone pre-mRNA for 3' end processing (Skrajna, 2018).
In this method, processing complexes were assembled in a nuclear extract on a synthetic histone pre-mRNA containing biotin and a photo-cleavable linker at the 5' end. The major cleavage site and the two neighboring nucleotides on each side were modified with a 2'O-methyl group, hence preventing endonucleolytic cleavage of the pre-mRNA and increasing the efficiency of capturing intact processing complexes. Following immobilization on streptavidin beads, the pre-mRNA and the bound proteins were washed and released to solution by irradiation with long wave UV. This UV-elution step, by eliminating all background proteins non-specifically bound to streptavidin beads, resulted in isolation of remarkably pure processing complexes that were suitable for direct analysis by mass spectrometry. This is the first successful use of the photo-cleavable linker and the UV-elution step for purification of an in vitro assembled RNA/protein complex. Parallel experiments with pre-mRNA substrates lacking 2'O-methyl nucleotides at the cleavage site demonstrated that the immobilized processing complexes retain catalytic activity. Thus, the mass spectrometry analysis of the UV-eluted material is likely to provide a global and unbiased view of all essential proteins that associate with histone pre-mRNA for 3' end processing (Skrajna, 2018).
Since chemical synthesis of RNAs containing covalently attached biotin and the photo-cleavable linker (cis configuration) is both expensive and limited to sequences not exceeding 60-70 nt, longer histone pre-mRNAs generated by T7 transcription were tested. Biotin and the photo-cleavable linker can be attached to the 3' end of these pre-mRNAs in trans via a short complementary oligonucleotide. This modification makes the UV-elution method more cost effective and potentially applicable for purification of RNA-protein complexes that require longer RNA binding targets, including spliceosomes and complexes involved in cleavage and polyadenylation (Skrajna, 2018).
In the UV-eluted mouse and Drosophila processing complexes, mass spectrometry identified SLBP and all known subunits of the U7-specific Sm ring, including Lsm10 and Lsm11. Readily detectable in mouse and Drosophila processing complexes were also FLASH and subunits of the HCC. The HCC is remarkably similar in composition between the two species, with symplekin, CPSF100, CPSF73 and CstF64 being most abundant and present in close to stoichiometric amounts, as determined by both silver staining and emPAI value analysis. The remaining CPSF subunits (CPSF160, WDR33, Fip1 and CPSF30) are present in lower amounts, suggesting that they are substoichiometric, being stably associated only with a fraction of the U7 snRNP (Skrajna, 2018).
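The emPAI comparison mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition of the exponentially modified protein abundance index, emPAI = 10^(observed peptides / observable peptides) - 1; the text does not spell out the formula, and the peptide counts below are hypothetical placeholders, not values from Skrajna (2018).

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Return emPAI = 10**(observed / observable) - 1 for one protein."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_fractions(counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Convert (observed, observable) peptide counts into relative molar fractions."""
    values = {name: empai(obs, total) for name, (obs, total) in counts.items()}
    grand_total = sum(values.values())
    if grand_total == 0:
        # No peptides observed for any protein: fractions are undefined, report zeros.
        return {name: 0.0 for name in values}
    return {name: value / grand_total for name, value in values.items()}


# Hypothetical counts, for illustration only (not measured data).
example_counts = {
    "symplekin": (20, 40),
    "CPSF100": (18, 36),
    "CPSF73": (17, 34),
    "CPSF160": (5, 50),
}
fractions = molar_fractions(example_counts)
```

In this toy input the first three proteins share the same observed/observable ratio and therefore receive equal molar fractions, which is the kind of pattern described as "close to stoichiometric"; the fourth protein, with a lower ratio, comes out as substoichiometric.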
In both mouse and Drosophila experiments, SLBP and the components of the U7 snRNP were the only proteins that consistently failed to bind histone pre-mRNAs in the presence of processing competitors: SL RNA and αU7 oligonucleotide. Other proteins were detected both in the samples containing processing complexes and in the matching negative controls, where formation of processing complexes was blocked. Among them, the most prevalent were non-specific RNA binding proteins, including hnRNP Q in mouse nuclear extracts, and IGF2BP1 in Drosophila nuclear extracts. All these proteins likely bind to sites in histone pre-mRNAs unoccupied by SLBP and U7 snRNP, and play no essential role in processing (Skrajna, 2018).
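The logic of separating specific components from background binders can be sketched as a set comparison: a protein counts as specific only if it is detected in every processing sample and in none of the competitor-blocked negative controls. This is an illustrative simplification, not the authors' analysis pipeline, and the protein lists below are hypothetical examples assembled from names mentioned in the text.

```python
def specific_binders(samples: list[set[str]], controls: list[set[str]]) -> set[str]:
    """Proteins found in all samples and absent from every negative control."""
    if not samples:
        return set()
    in_all_samples = set.intersection(*samples)
    in_any_control = set().union(*controls)
    return in_all_samples - in_any_control


def background_binders(samples: list[set[str]], controls: list[set[str]]) -> set[str]:
    """Proteins found in at least one sample and also in at least one control."""
    in_any_sample = set().union(*samples)
    in_any_control = set().union(*controls)
    return in_any_sample & in_any_control


# Hypothetical detection lists, for illustration only.
samples = [
    {"SLBP", "Lsm10", "Lsm11", "FLASH", "CPSF73", "hnRNP Q"},
    {"SLBP", "Lsm10", "Lsm11", "FLASH", "CPSF73", "hnRNP Q", "CstF77"},
]
controls = [
    {"hnRNP Q"},   # complex formation blocked by SL RNA and the anti-U7 oligonucleotide
    {"hnRNP Q"},
]

specific = specific_binders(samples, controls)
background = background_binders(samples, controls)
```

Requiring presence in every replicate is a deliberately strict choice; a protein such as CstF77 in this toy input, seen in only one sample, is classified as neither specific nor background.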
CstF50 and CstF77 were not detected in the UV-eluted mouse processing complexes and were present only in some Drosophila complexes, always with low scores, consistent with a previous conclusion that of the three CstF subunits only CstF64 stably associates with the U7 snRNP (Yang, 2013). No peptides were detected for CF Im (68 and 25 kDa) and CF IIm (Clp1 and Pcf11) in any of the mouse experiments, suggesting that these factors are also uniquely involved in cleavage and polyadenylation. Mass spectrometry identified the orthologues of the 68 and 25 kDa subunits in some Drosophila experiments, but they were clearly contaminants, persisting in the presence of the SL RNA and αU7 oligonucleotide. CF Im68 was previously reported to interact with Lsm11 and to co-purify with U7 snRNP. Based on this analysis, this subunit is unlikely to interact with Lsm11 in the processing complex (Skrajna, 2018).
Catalytically active mouse processing complexes also lacked ZFP100 (ZN473), a zinc finger protein that co-localizes with Lsm11 and stimulates expression of a reporter gene containing U7-dependent processing signals. ZFP100 was initially identified by the yeast two-hybrid system as a protein interacting with SLBP bound to the SL RNA and suggested to function as a bridging factor in the SLBP-mediated recruitment of the U7 snRNP to histone pre-mRNA. However, the absence of ZFP100 in the UV-eluted mouse processing complexes containing both SLBP and U7 snRNP strongly argues against this function. ZFP100 may instead participate in a different aspect of histone gene expression in vivo, perhaps acting as a coupling factor that integrates transcription of histone genes with 3' end processing of the nascent histone pre-mRNAs (Skrajna, 2018).
A similar role in vivo may be played by the multi-functional protein FUS and other proteins previously linked to 3' end processing of histone pre-mRNAs in mammalian cells, including Ars2, CDC73/parafibromin, NELF E and CDK9. These factors were never specifically detected in the UV-eluted mouse processing complexes, suggesting that they have no direct role in processing in vitro. Their downregulation by RNAi results in production of a small amount of polyadenylated histone mRNAs, which may be due to a defect in coupling of histone gene transcription with processing and/or cell-cycle progression (Skrajna, 2018).
Although this study identified several polyadenylation subunits in a stable association with the U7 snRNP, the experiments do not directly address which of them are essential for processing of histone pre-mRNAs. In Drosophila cultured cells, RNAi-mediated depletion of each of only three U7-associated polyadenylation subunits, symplekin, CPSF100 and CPSF73, consistently resulted in accumulation of histone mRNAs terminated with a poly(A) tail, an indication of a defect in the U7-dependent processing mechanism. Depletion of the remaining HCC subunits had no effect, suggesting that their association with the U7 snRNP is not essential for 3' end processing of histone pre-mRNAs. Symplekin, CPSF100 and CPSF73 are present in Drosophila cells as a stable sub-complex (Sullivan, 2009) and likely act together as an autonomous cleavage module recruited for processing to either histone or canonical pre-mRNAs by specialized RNA recognition sub-complexes. For canonical pre-mRNAs, this role is played by the remaining CPSF subunits, CPSF160, WDR33, Fip1 and CPSF30, recently shown to co-operate in recognizing the AAUAAA signal during the polyadenylation step. In 3' end processing of histone pre-mRNAs, the recruitment of the cleavage sub-complex is mediated by the U7 snRNA, which recognizes the substrate by the base pairing interaction, further arguing that CPSF160, WDR33, Fip1 and CPSF30 are likely non-essential bystanders in the U7 snRNP (Skrajna, 2018).
A less clear role in 3' end processing of histone pre-mRNAs is played by CstF64, which in spite of being relatively abundant in Drosophila U7 snRNP can be depleted from Drosophila cells without causing a detectable misprocessing of histone pre-mRNAs. A defect in the U7-dependent processing was however observed in human cells partially depleted of CstF64, suggesting that in mammalian cells this subunit may play a more critical role, perhaps helping to stabilize the three essential subunits of the HCC on the FLASH/Lsm11 complex. Clearly, determining which subunits are essential for cleavage will require reconstitution of a catalytically active processing complex from recombinant components (Skrajna, 2018).
This study brings a new perspective on the essential role of Drosophila SLBP in processing. It was recently demonstrated that Drosophila SLBP, like its mammalian counterpart, enhances the recruitment of U7 snRNP to histone pre-mRNA (Skrajna, 2017). A small amount of U7 snRNP binds to histone pre-mRNA in the absence of Drosophila SLBP, but the bound U7 snRNP, in spite of containing all major HCC subunits, is catalytically inactive. This study now shows that processing complexes assembled in the absence of SLBP can be activated for cleavage by simply adding recombinant WT SLBP, providing evidence that SLBP is the only missing factor in the assembled complexes. A mutant Drosophila SLBP that is deficient in recruiting U7 snRNP to histone pre-mRNA is also unable to activate the assembled complex for cleavage. Based on these results, it is proposed that the interaction of Drosophila SLBP with the U7 snRNP promotes an essential structural rearrangement of the entire processing complex that juxtaposes the catalytic site of CPSF73 with the pre-mRNA (see A hypothetical model explaining essential role of Drosophila SLBP in processing). It is possible that higher metazoans developed an additional positioning mechanism for the CPSF73 endonuclease, resulting in efficient cleavage in the absence of SLBP (Skrajna, 2018).
The early embryos of many animals including flies, fish, and frogs have unusually rapid cell cycles and delayed onset of transcription. These divisions are dependent on maternally supplied RNAs and proteins including histones. Previous work suggests that the pool size of maternally provided histones can alter the timing of zygotic genome activation (ZGA) in frogs and fish. This study examined the effects of under- and overexpression of maternal histones in Drosophila embryogenesis. Decreasing histone concentration advances zygotic transcription, cell cycle elongation, Chk1 activation, and gastrulation. Conversely, increasing histone concentration delays transcription and results in an additional nuclear cycle before gastrulation. Numerous zygotic transcripts are sensitive to histone concentration, and the promoters of histone-sensitive genes are associated with specific chromatin features linked to increased histone turnover. These include enrichment of the pioneer transcription factor Zelda and lack of SIN3A and associated histone deacetylases. These findings uncover a critical regulatory role for histone concentrations in ZGA of Drosophila (Wilky, 2019).
To understand the effects of histone concentration on the MBT, maternally supplied histones were reduced by downregulating the gene encoding a crucial histone regulator, Stem-Loop Binding Protein (Slbp), via maternally driven RNAi. Under these conditions, histone H2B was reduced by ~50% and H3 by ~60% at the MBT. Approximately 50% of embryos laid by Slbp RNAi mothers (henceforth Slbp embryos) that form a successful blastoderm do not undergo the final division and attempt gastrulation in NC13. Another ~30% exhibit an intermediate phenotype of partial arrest, with only part of the embryo entering NC14. A minority of Slbp embryos begin gastrulation with all nuclei in NC14. NC12 duration was predictive of NC13 arrest, with NC12 being an average of ~5 min longer in Slbp embryos that went on to arrest compared with those that did not arrest (Wilky, 2019).
Cellularization was first detected in wild-type (WT) embryos ~20 min into NC14. Partially arrested Slbp embryos also began cellularization ~20 min into NC14, with nuclei that arrested in NC13 waiting until the remainder of the embryo had entered NC14 to cellularize. Fully arrested embryos began cellularization ~20 min into NC13, initiating cellularization one cycle early and ~20 min earlier in overall developmental time than WT. Despite their reduced cell number, these embryos form mitotic domains and gastrulate without obvious defects; however, they die before hatching (Wilky, 2019).
To examine the effects of increased histone concentration on developmental timing, cell cycle progression was monitored in embryos from abnormal oocyte (abo) mutant mothers (henceforth abo embryos). abo is a histone locus-specific transcription factor, the knockdown of which increases the production of replication-coupled histones, particularly H2A and H2B (Berloco, 2001). The abo mutation increased H2B by ~90%, whereas total (combined replication-coupled and replication-independent) H3 was not affected in NC14 embryos. Approximately 60% of abo embryos displayed fertilization defects or catastrophic early nuclear divisions. Of abo embryos that formed a functioning blastoderm, ~6% underwent a complete extra nuclear division before gastrulating in NC15, whereas ~4% underwent a partial extra nuclear division. Embryos from abo mothers that completed total extra divisions had faster NC14s in which they did not cellularize and spent 40-60 min in NC15 before gastrulating. This suggests an alteration of the normal transcription-dependent developmental program. In some cases, the cell cycle program and transcriptional program may be decoupled, as evidenced by the fact that some abo embryos attempted to gastrulate while still in the process of division. abo embryos that underwent extra divisions exhibited a range of gastrulation defects including expanded mitotic domains and ectopic furrow formation (Wilky, 2019).
Since alterations in histone levels can both decrease and increase the number of divisions before cell cycle slowing, it was reasoned that histone levels might affect activation of checkpoint kinase 1 (Chk1, also known as grp), which is required for cell cycle slowing at the MBT. To test this, a fluorescent biosensor of Chk1 activity was crossed into the Slbp background. Even in Slbp embryos that did not undergo early gastrulation, Chk1 activity was higher than in WT, consistent with the lengthened cell cycle. This result indicates that the observed cell cycle phenotypes in the histone-manipulated embryos are likely mediated through changes in Chk1 activity (Wilky, 2019).
As cellularization and gastrulation require zygotic transcription, it was suspected that embryos with altered development likely have altered gene expression. Single-embryo RNA-seq was performed on staged Slbp embryos that remained in NC13 for more than 30 min. These were compared with either NC-matched (NC13) or time-matched (NC14) WT embryos. To control for maternal effects of Slbp RNAi, pre-blastoderm stage WT and Slbp embryos were compared. The Slbp embryos underwent ZGA one NC earlier than WT. ~5000 genes were identified that were differentially expressed between Slbp and WT NC13, with ~60% being upregulated. Most of the upregulated genes have previously been identified as new zygotic transcripts, including cell cycle regulators such as fruhstart (frs, also known as Z600) and signaling molecules such as four-jointed (fj), whereas the downregulated genes are enriched for maternally degraded transcripts. This is thought to represent a coherent change in ZGA timing instead of global transcription dysregulation, as 98% of the genes that are overexpressed in Slbp are expressed before the end of NC14 in the control or previously published datasets. Indeed, the transcriptomes of histone-depleted embryos that stopped in NC13 are more similar to WT NC14 than WT NC13, which suggests a role for cell cycle elongation in ZGA. Nonetheless, ~1500 genes are differentially expressed between Slbp NC13 and WT NC14 without accounting for differences in ploidy. Of these, the majority of the ~1000 overexpressed genes are again associated with zygotic transcription, and the downregulated genes are associated with maternal products. Thus, ZGA is accelerated in the histone knockdown even further than can be explained by time alone (Wilky, 2019).
As ZGA is accelerated by histone depletion, it was asked whether ZGA would be delayed in the histone overexpression mutant. RNA-seq was performed on pools of abo and WT embryos collected 15-30 min into NC14. >1000 genes were identified that were differentially expressed between abo and WT, with approximately equal numbers of genes up- and down-regulated. As expected, the downregulated genes in abo were enriched for previously identified zygotically expressed transcripts, and upregulated transcripts were enriched for maternally deposited genes. Thus, histone overexpression delays the onset of ZGA (Wilky, 2019).
Zygotic genes, the transcription of which is upregulated by histone depletion and downregulated by histone overexpression, contain many important developmental and cell cycle regulators including: frs, hairy (h), fushi tarazu (ftz) and odd-skipped (odd). Conversely, the maternally degraded transcripts that are destabilized by histone depletion and stabilized by histone overexpression include several cell cycle regulators such as Cyclin B (CycB), string (stg, also known as Cdc25string) and Myt1. Therefore, histone concentration can modulate the expression and stability of specific cell cycle regulators, which may contribute to the onset of MBT (Wilky, 2019).
Since histone concentration has previously been implicated in sensing the nuclear-cytoplasmic (N/C) ratio (Amodeo, 2015), this study compared the genes that are changed in both the histone under- and overexpression embryos with those that had previously been found to be dependent on either the N/C ratio or developmental time (Lu, 2009). Both previously identified N/C ratio-dependent and time-dependent genes (Lu, 2009) followed the same general trends as the total zygotic gene sets, indicating that histone availability cannot explain these previous classifications (Wilky, 2019).
Next, attempts were made to disentangle the effects of cell cycle length from transcription in the histone overexpression mutant. Single-embryo time-course RNA-seq was performed on abo and WT embryos collected 3 min before mitosis of NC10-NC13 and 3 min into NC14. In addition, unfertilized embryos (henceforth NC0) of both genotypes were collected to control for differences in maternal contribution. Even with a stringent selection process that accounted for cell cycle time and embryo health, a small set of robustly upregulated (179) and downregulated (260) genes was detected across NC10-NC14. Of the newly transcribed genes, 111 showed delayed transcription, including frs, and only 37 were upregulated. These results were confirmed using qPCR. When compared with previous datasets, zygotic genes tend to be underexpressed, as was the case for the pooled abo dataset; however, the majority of these enrichments are not statistically significant. Nonetheless, the majority of these underexpressed genes are expressed during NC14 in WT. This gene set, in combination with the time-matched Slbp comparison, enables further examination of the chromatin features that underlie histone sensitivity for transcription independent of cell cycle changes (Wilky, 2019).
To identify chromatin features associated with histone sensitivity, the presence of 143 modENCODE chromatin signals was compared near the transcriptional start site (TSS±500 bp) of genes whose expression was altered by changes in histone concentration independent of cell cycle time. A clear pattern of unique chromatin features was found for the histone-sensitive genes, compared with all newly transcribed genes, and this pattern was highly similar between the histone over- and underexpression experiments. The pioneer transcription factor Zld, known to be important for nucleosome eviction during ZGA, was enriched in the promoters of histone-sensitive genes. Insulator proteins such as BEAF-32 and CP190 were depleted in histone-sensitive genes. Promoters of histone-sensitive genes also show a strong reduction for SIN3A, a transcriptional repressor associated with cell cycle regulation. SIN3A is known to recruit HDACs to TSSs, and almost all HDACs also show significant de-enrichment at the TSSs of histone-sensitive genes. Taken together, these marks make up a unique chromatin signature that may sensitize a locus to changes in histone concentration, as is likely for pioneer factors such as Zld. Other aspects of this signature may indicate that these genes are subsequently subject to later developmental regulation, as indicated by H3K4me3 and H3K27me3 (Wilky, 2019).
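The TSS±500 bp test used in this kind of comparison can be sketched as a simple interval overlap. The example below is a hedged illustration rather than the published analysis: it assumes 0-based, half-open peak coordinates (as in BED files), and the `Peak` type, the function names, and the coordinates are all hypothetical.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def tss_window(tss: int, flank: int = 500) -> tuple[int, int]:
    """Half-open window covering the TSS position and `flank` bp on each side."""
    return max(0, tss - flank), tss + flank + 1


def has_signal(chrom: str, tss: int, peaks: list[Peak], flank: int = 500) -> bool:
    """True if any peak on the same chromosome overlaps the TSS window."""
    lo, hi = tss_window(tss, flank)
    return any(p.chrom == chrom and p.start < hi and p.end > lo for p in peaks)


def fraction_with_signal(tss_sites: list[tuple[str, int]], peaks: list[Peak]) -> float:
    """Fraction of TSSs in a gene set that carry the signal within the window."""
    if not tss_sites:
        return 0.0
    hits = sum(has_signal(chrom, tss, peaks) for chrom, tss in tss_sites)
    return hits / len(tss_sites)


# Hypothetical peaks and TSS positions, for illustration only.
peaks = [Peak("chr2L", 1400, 1450), Peak("chr3R", 9000, 9100)]
sensitive_tss = [("chr2L", 1000), ("chr3R", 9600)]
all_new_tss = [("chr2L", 1000), ("chr3R", 9600), ("chr2L", 5000), ("chrX", 1000)]

sensitive_fraction = fraction_with_signal(sensitive_tss, peaks)
background_fraction = fraction_with_signal(all_new_tss, peaks)
```

Comparing the two fractions for each chromatin signal gives a crude measure of enrichment or depletion in the histone-sensitive set; a real analysis would add a statistical test and multiple-testing correction.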
This study has demonstrated that histone concentration regulates the timing of the MBT in Drosophila, resulting in both early gastrulation and extra pre-MBT divisions from histone reduction and increase, respectively. Histone concentration also regulates ZGA. Thousands of genes are prematurely transcribed in histone-depleted embryos and hundreds of genes are delayed in histone-overexpressing embryos. The majority of these genes appear to be downstream of changes in cell cycle duration, suggesting a model in which histones directly regulate cell cycle progression. In other cell types, histone loss halts the cell cycle via accumulation of DNA damage and stalled replication forks. In the early embryo, changes in histone availability may similarly create replication stress to directly or indirectly activate Chk1 as this study has shown. In turn, Chk1 would inhibit Stg and/or Twine to slow the cell cycle. This mechanism is supported by previous observations that loss of zygotic histones causes the downregulation of Stg in the first post-MBT cell cycle. In this case, the observed transcriptional changes would be independent or downstream of the altered cell cycle (Wilky, 2019).
Alternatively, direct changes in transcription downstream of histone availability may feed into the cell cycle. In bulk, histone-sensitive transcripts might underlie the replication stress that has been previously proposed to slow the cell cycle at the MBT. Consistent with this, the cell cycle lengthening and partial arrest phenotypes observed in mutant RNA Pol II embryos occur at a similar frequency to those observed as the result of histone depletion. Another possibility is that specific histone-sensitive transcripts are responsible for cell cycle elongation. One promising candidate for a histone-sensitive cell cycle regulator is the N/C ratio-sensitive CDK inhibitor frs, as zygotic transcription of frs plays a crucial role in stopping the cell cycle at the MBT. In contrast, tribbles, an N/C ratio-dependent inhibitor of Twine that has also been implicated in cell cycle slowing, does not show a consistent response between histone perturbations. In this previously proposed model, maternal histone stores may compete with pioneer transcription factors to set the timing of transcription initiation. Indeed, the central Drosophila pioneer transcription factor Zld is enriched at the promoters of histone-sensitive genes. Moreover, this study has identified a broader set of chromatin features that may sensitize individual loci to changes in histone concentrations. These include less obvious candidates for global early transcriptional regulators, such as SIN3A, HDACs and class I insulator proteins that may protect transcripts from changes in histone concentrations. This work highlights the importance of histone concentration in regulating the timing of MBT and provides evidence that promoters of histone-sensitive genes possess a unique chromatin signature. However, future studies will be required to isolate the specific downstream effectors that respond to changes in histone concentrations in the early embryo (Wilky, 2019).
Regulation of transcription is the main mechanism responsible for precise control of gene expression. Whereas the majority of transcriptional regulation is mediated by DNA-binding transcription factors that bind to regulatory gene regions, an elegant alternative strategy employs small RNA guides, Piwi-interacting RNAs (piRNAs) to identify targets of transcriptional repression. This study shows that in Drosophila the small ubiquitin-like protein SUMO and the SUMO E3 ligase Su(var)2-10 are required for piRNA-guided deposition of repressive chromatin marks and transcriptional silencing of piRNA targets. Su(var)2-10 links the piRNA-guided target recognition complex to the silencing effector by binding the piRNA/Piwi complex and inducing SUMO-dependent recruitment of the SetDB1 (Eggless)/Wde histone methyltransferase effector. It is proposed that in Drosophila, the nuclear piRNA pathway has co-opted a conserved mechanism of SUMO-dependent recruitment of the SetDB1/Wde chromatin modifier to confer repression of genomic parasites (Ninova, 2020a).
The majority of transcriptional control is achieved by transcription factors that bind short sequence motifs on DNA. In many eukaryotic organisms, transcriptional repression can also be guided by small RNAs, which (in complex with Argonaute proteins) recognize their genomic targets using complementary interactions with nascent RNA. Small RNA-based regulation provides flexibility in target selection without the need for new transcription factors and as such is well suited for genome surveillance systems to identify and repress the activity of harmful genetic elements such as transposons (Ninova, 2020a).
Transcriptional repression guided by small RNAs correlates with the deposition of repressive chromatin marks, particularly histone 3 lysine 9 methylation (H3K9me) in S. pombe, plants, and animals. In addition, plants and mammals also employ CpG DNA methylation for target silencing. Small RNA/Ago-induced transcriptional gene silencing is best understood in S. pombe, where the RNA-induced transcriptional silencing complex (RITS) was studied biochemically and genetically. In contrast to yeast, the molecular mechanism of RITS in Metazoans remains poorly understood. Small RNA-induced transcriptional repression mechanisms might have independently evolved several times during evolution and thus might mechanistically differ from that of S. pombe (Ninova, 2020a).
In Metazoans, small RNA-guided transcriptional repression is mediated by Piwi proteins, a distinct clade of the Argonaute family, and their associated Piwi-interacting RNAs (piRNAs). Both in Drosophila and in mouse, the two best-studied Metazoan systems, nuclear Piwis are responsible for transcriptional silencing of transposons. Based on the current model, targets are recognized through binding of the Piwi/piRNA complex to nascent transcripts of target genes. In both Drosophila and mouse, piRNA-dependent silencing of transposons correlates with accumulation of repressive chromatin marks (H3K9me3 and, in mouse, CpG methylation of DNA) on target sequences. These marks can recruit repressor proteins, such as HP1, providing a mechanism for transcriptional silencing. However, how recognition of nascent RNA by the Piwi/piRNA complex leads to deposition of repressive marks at the target locus is not well understood. Several proteins, Asterix (Arx)/Gtsf1, Panoramix (Panx)/Silencio, and Nxf2, were shown to associate with Piwi and are required for transcriptional silencing. Accumulation of H3K9me3 on Piwi/Panx targets requires the activity of the histone methyltransferase SetDB1 (also known as Egg). However, a mechanistic link between the Piwi/Arx/Panx/Nxf2 complex, which recognizes targets, and the effector chromatin modifier has not been established (Ninova, 2020a and references therein).
This study identified Su(var)2-10/dPIAS as the link between the Piwi/piRNA and the SetDB1 complex in piRNA-induced transcriptional silencing. In Drosophila, Su(var)2-10 mutation causes suppression of position effect variegation, a phenotype indicative of its involvement in chromatin repression. Su(var)2-10 associates with chromatin and regulates chromosome structure. It also emerged in screens as a putative interactor of the central heterochromatin component HP1, a repressor of enhancer function, and a small ubiquitin-like modifier (SUMO) pathway component. However, its molecular functions in chromatin silencing were not investigated. Su(var)2-10 belongs to the conserved PIAS/Siz protein family, of which the yeast, plant, and mammalian homologs act as E3 ligases for SUMOylation of several substrates. This paper reports the role of Su(var)2-10 in germ cells of the ovary, where chromatin maintenance and transposon repression are essential to grant genomic stability across generations. Germ cell depletion of Su(var)2-10 phenocopies loss of Piwi; both lead to strong transcriptional activation of transposons and loss of repressive chromatin marks over transposon sequences. Su(var)2-10 genetically and physically interacts with Piwi and its auxiliary factors, Arx and Panx. It was demonstrated that the repressive function of Su(var)2-10 is dependent on its SUMO E3 ligase activity and the SUMO pathway. These data point to a model in which Su(var)2-10 acts downstream of the piRNA/Piwi complex to induce local SUMOylation, which in turn leads to the recruitment of the SetDB1/Wde complex. SUMO modification was shown to play a role in the formation of silencing chromatin in various systems from yeast to mammals, including the recruitment of the silencing effector SETDB1 and its co-factor MCAF1 by repressive transcription factors. Together, these findings indicate that the piRNA pathway utilizes a conserved mechanism of silencing complex recruitment through SUMOylation to confer transcriptional repression (Ninova, 2020a).
In both insects and mammals, piRNA-guided transcriptional silencing is associated with the deposition of repressive chromatin marks on genomic targets. In Drosophila, the conserved histone methyltransferase SetDB1 (Egg) is responsible for deposition of the silencing H3K9me3 mark at Piwi targets. However, the molecular mechanism leading to the recruitment of SetDB1 by the Piwi/piRNA complex remained unknown. This study showed that in Drosophila SUMO and the SUMO E3 ligase Su(var)2-10 act together downstream of the piRNA-guided complex to recruit the histone methyltransferase complex SetDB1/Wde and cause transcriptional silencing. The results suggest a model for the molecular mechanism of piRNA-guided transcriptional silencing in which Su(var)2-10 provides the connection between the target recognition complex composed of piRNA/Piwi/Panx/Arx and the chromatin effector complex composed of SetDB1 and Wde (Ninova, 2020a).
This study has identified a new role for the SUMO pathway in piRNA-guided transcriptional silencing. The SUMO pathway plays important roles in heterochromatin formation and maintenance, and genome stability in different organisms from yeast to humans. Among different functions, SUMO is required for recruitment and activity of the histone methyltransferase complex composed of SetDB1 and MCAF1 (Wde in Drosophila), which confers transposon silencing in mammals. Remarkably, SUMO-dependent recruitment of SetDB1 to TEs in mammalian somatic cells does not require piRNAs but is instead mediated by the large vertebrate-specific family of Krüppel-associated box domain-zinc finger proteins (KRAB-ZFPs) that bind specific DNA motifs. Although distinct members of the KRAB-ZFP family recognize different sequence motifs in target transposons, repression of all targets by various KRAB-ZFPs requires the universal co-repressor KAP1/TIF1b (KRAB-associated protein 1). KAP1 is a SUMO E3 ligase, and its auto-SUMOylation leads to SetDB1 recruitment. The current results suggest that Drosophila Su(var)2-10 can be SUMOylated, and SetDB1 and Wde have functional SIMs, suggesting that Su(var)2-10 auto-SUMOylation might induce SetDB1/Wde recruitment. These results suggest that two distinct transposon repression pathways, by DNA-binding proteins and by piRNAs, both rely on SUMO-dependent recruitment of the conserved silencing effector to the target (Ninova, 2020a).
The results in Drosophila and studies in mammals suggest that in both clades self-SUMOylation of SUMO E3 ligases might be involved in recruitment of SetDB1 to chromatin. However, these results do not exclude the possibility that the recruitment of SetDB1 is facilitated by SUMOylation of additional chromatin proteins by Su(var)2-10. Studies in yeast led to the 'SUMO spray' hypothesis that postulates that SUMOylation of multiple different proteins localized in physical proximity promotes the assembly of multi-unit effector complexes. Local concentration of multiple SUMO moieties leads to efficient recruitment of SUMO-interacting proteins. According to this hypothesis, multiple SUMO-SIM interactions within a protein complex act synergistically, and thus SUMOylation of any single protein is neither necessary nor sufficient to trigger downstream processes. Assembly of such 'SUMO spray' on chromatin might be governed by the same principles of multiple weak interactions as was recently recognized for the formation of various phase-separated liquid-droplet compartments in the cell. The presence of Su(var)2-10 on a chromatin locus might lead to SUMOylation of multiple chromatin-associated proteins that are collectively required for the recruitment of effector chromatin modifiers. The SUMOylation consensus (ΨKxE/D) is very simple and therefore quite common in the fly proteome. Consistent with this, several hundred SUMOylated proteins were identified in proteomic studies in Drosophila. Thus, it is possible that collective SUMOylation of multiple chromatin-associated proteins contributes to recruitment and stabilization of the SetDB1 complex on chromatin (Ninova, 2020a).
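The point that the ΨKxE/D consensus is simple, and therefore common, can be illustrated with a short motif scan. The following is a minimal sketch, not the method used in the study: it assumes that Ψ (a large hydrophobic residue) can be approximated by I, L, or V, and the function name and the toy peptide are hypothetical. A consensus match only marks a candidate lysine; it does not show that the site is SUMOylated in vivo.

```python
import re

# Assumption: the hydrophobic position (Ψ) is approximated as I, L, or V.
# Other definitions of Ψ include additional residues.
HYDROPHOBIC = "ILV"

# Ψ-K-x-E/D, wrapped in a lookahead so that overlapping motifs are all reported.
CONSENSUS = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z][ED]))")


def find_sumo_consensus(sequence):
    """Return (1-based position of the candidate lysine, motif) for each match."""
    seq = sequence.upper()
    # m.start() is the 0-based index of Ψ; the lysine is the next residue.
    return [(m.start() + 2, m.group(1)) for m in CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    toy_peptide = "MSLKAEGGVKDDQQIKQEPA"  # hypothetical sequence, not a real protein
    for position, motif in find_sumo_consensus(toy_peptide):
        print(f"candidate lysine K{position} in motif {motif}")
```

Because the pattern is only four residues long with one wildcard position, even a short arbitrary peptide can contain several matches, which is consistent with the large number of SUMOylated proteins reported in proteomic studies.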
The cascade of events leading to repression initiated by target recognition by piRNA/Piwi, followed by interaction with Su(var)2-10 and subsequent SUMO-dependent recruitment of SetDB1/Wde, suggests that the three complexes tightly cooperate. But do these three complexes (Piwi, Su(var)2-10, and SetDB1) always work together, or does each complex have additional functions independent of the other two? Genome-wide analysis suggests that the vast majority of Piwi targets are repressed through SUMO/Su(var)2-10 and, likely, SetDB1/Wde, suggesting that Piwi always requires these other complexes for its function in transcriptional silencing. On the other hand, multiple instances were found of host genes that are repressed by Su(var)2-10 and SetDB1 but do not require piRNAs. Su(var)2-10 and SetDB1 are also expressed outside of the gonads and were implicated in chromatin silencing in somatic tissues that lack an active piRNA pathway. It is speculated that Su(var)2-10 might bind to specific targets directly through its SAP domain or might get recruited by specific DNA-binding proteins, similar to the way SetDB1 is recruited to ERVs by KRAB-ZFP in mammals, though specific factors are yet to be uncovered (Ninova, 2020a).
Though both Drosophila and mouse have nuclear Piwi proteins involved in transcriptional silencing of transposons, these proteins, PIWI and MIWI2, are not one-to-one orthologs. Unlike Drosophila, other insects including the silkworm Bombyx mori, the flour beetle Tribolium castaneum, and the honeybee Apis mellifera encode only two Piwi proteins, and at least in B. mori, these proteins do not localize to the nucleus. These observations suggest that the nuclear Piwi pathway in Drosophila has evolved independently in this lineage. In light of this evolutionary interpretation, the interaction of the Piwi complex and the E3 SUMO ligase Su(var)2-10 indicates that in Drosophila the nuclear piRNA pathway co-opted an ancient mechanism of SUMO-dependent recruitment of the histone-modifying complex for transcriptional silencing of transposons. The molecular mechanism of piRNA-induced transcriptional repression in other clades such as mammals might have evolved independently of the corresponding pathway in flies. It will be interesting to investigate if mammals also use SUMO-dependent recruitment of silencing complexes for transcriptional repression of piRNA targets (Ninova, 2020a).
Chromatin is critical for genome compaction and gene expression. On a coarse scale, the genome is divided into euchromatin, which harbors the majority of genes and is enriched in active chromatin marks, and heterochromatin, which is gene-poor but repeat-rich. The conserved molecular hallmark of heterochromatin is the H3K9me3 modification, which is associated with gene silencing. This study found that in Drosophila, deposition of most of the H3K9me3 mark depends on SUMO and the SUMO ligase Su(var)2-10, which recruits the histone methyltransferase complex SetDB1 (Eggless)/Wde. In addition to repressing repeats, H3K9me3 influences expression of both hetero- and euchromatic host genes. High H3K9me3 levels in heterochromatin are required to suppress spurious transcription and ensure proper gene expression. In euchromatin, a set of conserved genes is repressed by Su(var)2-10/SetDB1-induced H3K9 trimethylation, ensuring tissue-specific gene expression. Several components of heterochromatin are themselves repressed by this pathway, providing a negative feedback mechanism to ensure chromatin homeostasis (Ninova, 2020b).
This study shows that in addition to the effects on TE silencing (Ninova, 2020a), Su(var)2-10 and H3K9me3 influence the expression of protein-coding genes. Su(var)2-10-dependent H3K9me3 deposition on TEs affects the expression of genes located in heterochromatin and of euchromatic genes adjacent to TE insertions. Su(var)2-10 is also involved in TE-independent H3K9me3 deposition on host genes, which is essential for the suppression of ectopic expression of tissue-specific genes, thereby conferring correct cell type identity (Ninova, 2020b).
Approximately half of the human genome comprises TE sequences, and the TE fraction is as high as 90% in several plant species. One new TE insertion per generation is estimated to propagate to the offspring. Somatic TE insertions, although difficult to detect, are likely even more prevalent. Thus, TE activity is a major source of genetic variation that can occur on a very short timescale. The effects of TEs on the host transcriptome have been the subject of many studies ever since Barbara McClintock identified 'control' elements that regulate gene expression before genome compositions were known. TEs can disrupt gene expression by inserting into coding regions or into or close to cis-regulatory sequences. TE insertions are not always disruptive: insertions into non-coding regions can bring new regulatory elements that change gene expression patterns, resulting in increased fitness. Instances of positive selection for TE insertions are well documented in Drosophila. TE-derived promoters also drive the expression of numerous mouse and human genes, suggesting that TE insertions can be co-opted into gene regulatory pathways (Ninova, 2020b).
In addition to changes in the DNA sequence, TE insertions may introduce local epigenetic effects. Active TEs are transcriptionally silenced by H3K9 trimethylation and/or DNA methylation. The H3K9me3 mark can spread several kilobases outside the TE region, affecting adjacent cis-regulatory elements of host genes, and thereby interfering with their normal expression. TE insertions with high levels of H3K9me3 are strongly selected against, supporting a model that TEs can alter the expression of host genes through epigenetic changes (Ninova, 2020b).
The finding that Su(var)2-10 is responsible for the deposition of H3K9me3 on TE bodies and flanking sequences allows separation of the effect of direct damage to cis-regulatory elements from the effect on chromatin. Evidence was found that TE insertions can lead to H3K9me3-dependent changes in gene expression, as shown for the jheh3 and frl loci. Notably, the BARI insertion at the jheh3 locus was shown to be positively selected in the D. melanogaster population, indicating that Su(var)2-10-dependent epigenetic silencing caused by a TE insertion can be used for beneficial rewiring of host gene regulatory networks (Ninova, 2020b).
The current results suggest that TEs can rewire gene regulatory networks on a short timescale, at least in part via their effects on chromatin. Euchromatic H3K9me3 peaks due to TE insertions are widespread in Drosophila, indicating that TE insertions may be a common cause of gene regulatory variation. New TE insertions during development generate genomic diversity between different cell types in human and mouse with implications for tumorigenesis and brain development. Future studies are required to elucidate the epigenetic effects of somatic TE insertions on gene regulatory networks (Ninova, 2020b).
Heterochromatin domains include nearly 30% of the fly genome. Although relatively gene-poor, heterochromatin hosts several hundred protein-coding genes. Studies of chromosomal rearrangements suggested that heterochromatic localization is required for the proper expression of heterochromatic genes. However, the molecular mechanism of the positive effect of the heterochromatin environment on expression is not fully understood (Ninova, 2020b).
Consistent with previous studies, this study observed many active genes in H3K9me3-rich heterochromatic regions and found that for many active heterochromatic genes, Su(var)2-10-induced H3K9 methylation is not only permissive but also required for proper expression (Ninova, 2020b).
How can the same chromatin mark lead to the repression of genes in euchromatin and activation in heterochromatin? H3K9me3 is present over the gene bodies and regions flanking heterochromatic genes, but is depleted at promoters, which instead carry typical active marks such as H3K4me3 and Pol II occupancy. Thus, H3K9me3 over gene bodies appears to be compatible with transcription. H3K9me3 loss upon Su(var)2-10 germline knockdown (GLKD) correlated with increased levels of intronic RNAs and the appearance of H3K4me2/3 and Pol II signals in introns, indicating the upregulation of spurious transcripts originating from within host-gene introns. One possible source of such transcripts is the activation of TE promoters that are highly abundant within introns and flanking sequences of heterochromatic genes. It is proposed that transcription from TE promoters located in introns and flanking sequences disrupts proper gene expression through transcriptional interference (Ninova, 2020b).
H3K9me3 loss also disrupted the normal isoform regulation of heterochromatic genes, as both truncated and extended mRNA isoforms with coding potential distinct from the canonical gene mRNA were observed upon the depletion of Su(var)2-10. The activation of cryptic promoters may disrupt proper gene expression through multiple mechanisms, such as reduction in canonical mRNA output or dominant negative effects of the extended or truncated protein isoforms. Not all heterochromatic genes that lose H3K9me3 upon Su(var)2-10 germline knockdown (GLKD) show signs of interfering transcripts or cryptic promoters, indicating that H3K9me3 may have other functions in heterochromatic gene activation. For example, the compaction of heterochromatin by HP1 may bring distant enhancers of heterochromatic genes into physical proximity of promoters to activate expression. The results, combined with previous studies, indicate that genes positioned in heterochromatin require high H3K9me3 levels for proper expression and isoform selection (Ninova, 2020b).
Discrete Su(var)2-10-dependent H3K9me3 peaks are present in a number of euchromatic genes. Some of these peaks have no TEs in their vicinity, and their H3K9me3-based repression is conserved between D. melanogaster and D. virilis, two species that separated >45 million years ago and have no common TE insertions. The expression of many of these TE-independently repressed genes is restricted to specific tissues such as testis, the digestive system, or the CNS, and the loss of H3K9me3 leads to ectopic expression in the female germline. The finding is in line with a recent report that SetDB1 depletion in the female germline was associated with the loss of H3K9me3 and the mis-expression of male-specific genes. H3K9me3, SetDB1, and the SUMO pathway were also implicated in lineage-specific gene expression and cell fate commitment in mammals. These data suggest that a TE-independent H3K9me3 deposition via the SUMO-SetDB1 pathway plays an evolutionarily conserved role in restricting gene expression to proper cell lineages (Ninova, 2020b).
SUMO- and Su(var)2-10-dependent H3K9me3 repression also regulates several factors involved in heterochromatin formation and maintenance, such as SUMO (smt3), Wde, Sov, and CG30403. Wde is the homolog of the mammalian MCAF1/ATF7IP, which is required for the nuclear localization and stability of SetDB1 and promotes its methyltransferase activity. Drosophila Wde also associates with SetDB1, and their germline depletion results in a similar phenotype, supporting the role of Wde as a conserved SetDB1 co-factor. The current data in Drosophila and studies in mammals suggest that SUMO is involved in SetDB1/Wde recruitment to its targets. HP1 is an H3K9me3 reader that is responsible for the structural properties of heterochromatin and also serves as a hub for many other heterochromatin proteins. Both Sov and CG30403 interact with HP1, and Sov is critical for heterochromatin maintenance (Ninova, 2020b).
The genes encoding Wde, SUMO, Sov, and CG30403 reside in euchromatin and are repressed by local H3K9me3. Unlike tissue-restricted genes, which are often completely repressed by Su(var)2-10 in the female germline, these factors are not fully silenced, although they are upregulated upon Su(var)2-10 depletion. The results indicate that these four genes are part of a negative feedback mechanism that controls heterochromatin formation. Negative feedback in biological circuits maintains protein levels within a certain range, providing homeostatic regulation. It is proposed that SUMO-dependent repression of heterochromatin proteins provides such homeostatic regulation to maintain the proper ratio and boundaries of hetero- and euchromatin. According to this model, specific genes, such as wde, act as sensors of the overall H3K9me3 level. Insufficient levels of H3K9 methylation lead to elevated sensor gene expression due to decreased H3K9me3 at their promoters, which in turn enhances H3K9me3 deposition and heterochromatin formation throughout the genome. Concomitant repression of sensor genes ensures that H3K9me3 is restricted to proper genomic domains and does not spread to euchromatic regions that should remain active. Inspection of ENCODE data showed that the mammalian homolog of wde, ATF7IP, is decorated by H3K9me3 in some human cell lines, suggesting that this mode of regulation may be deeply conserved (Ninova, 2020b).
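The homeostatic logic of this proposed feedback can be illustrated with a toy numerical model. The sketch below is purely illustrative and is not taken from the study: the two-variable equations, the function name, and every parameter value are assumptions chosen for simplicity. A "sensor" product (standing in for a factor such as Wde) is produced at a rate that falls as the global H3K9me3 level rises, and H3K9me3 is deposited in proportion to the sensor level.

```python
def simulate_feedback(h0, s0=0.0, steps=5000, dt=0.01,
                      k_s=1.0, k_h=1.0, d_s=1.0, d_h=1.0, K=1.0):
    """Euler integration of a toy negative feedback loop.

    h: global H3K9me3 level, s: sensor gene product level (arbitrary units).
    All rate constants are hypothetical and dimensionless.
    """
    h, s = h0, s0
    for _ in range(steps):
        # Sensor expression is repressed as H3K9me3 rises (repression term 1 / (1 + h/K)).
        ds = k_s / (1.0 + h / K) - d_s * s
        # H3K9me3 deposition depends on the sensor; the mark is lost at rate d_h.
        dh = k_h * s - d_h * h
        s += ds * dt
        h += dh * dt
    return h, s


if __name__ == "__main__":
    for start in (0.0, 5.0):
        h_final, s_final = simulate_feedback(h0=start)
        print(f"initial H3K9me3 = {start}: final H3K9me3 = {h_final:.3f}, sensor = {s_final:.3f}")
```

In this toy system, a run that starts with too little H3K9me3 and a run that starts with too much both settle at the same intermediate level, which is the behavior expected of a homeostatic sensor circuit. The sketch says nothing about the real kinetics or about which genes act as sensors.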
A similar negative feedback loop was identified in yeast. The single H3K9 methyltransferase clr4 is suppressed by H3K9me3 to restrict ectopic spreading of silencing chromatin. In mammals, genes encoding proteins from the KRAB-ZFP family of transcriptional repressors reside in H3K9me3- and HP1-enriched loci. Thus, autoregulation of heterochromatin effectors is a conserved mode of chromatin regulation, although the genes involved in the feedback mechanism differ between organisms. In the future, it will be important to dissect the network architecture of heterochromatin regulation. As heterochromatin formation and maintenance were reported to be disrupted in cancer and during aging, this mechanism may be a promising target of therapeutic interventions (Ninova, 2020b).
H3K9me3 writer enzymes are targeted to genomic loci by different mechanisms. In the case of TE repression in germ cells, piRNAs bound to nuclear Piwi proteins serve as sequence-specific guides that bind complementary nascent transcripts and recruit Su(var)2-10, which induces H3K9me3 deposition by SetDB1. Su(var)2-10 identifies non-TE targets in a piRNA-independent fashion, in agreement with a broader function of Su(var)2-10 in development. The observation that H3K9me3 peaks at homologous euchromatic genes are also present in the distantly related D. virilis points to a conserved mechanism of H3K9me3 deposition in host-gene regulation (Ninova, 2020b).
The molecular mechanism of piRNA-independent recruitment of Su(var)2-10 remains to be explored. Su(var)2-10 has a putative DNA binding SAP domain that may be sufficient for its binding to DNA. However, motif enrichment analysis failed to identify a common sequence motif among TE-independent Su(var)2-10 targets (MEME-ChIP), suggesting that different partners may recruit Su(var)2-10 to distinct targets. In mammals, a large family of transcription factors, the KRAB-ZFPs, are responsible for SetDB1 recruitment and H3K9me3 deposition on many different targets, primarily endogenous retroviruses. Individual members of the KRAB-ZFP family influence distinct targets due to differences in DNA-binding specificities of their zinc-finger DNA-binding domains. Notably, SetDB1 recruitment through KRAB-ZFPs occurs through a SUMO-dependent mechanism. The KRAB-ZFP family is vertebrate specific, and there are no known proteins in D. melanogaster that can recruit H3K9me3 activity. A preliminary search for direct Su(var)2-10 interactors using a yeast two-hybrid screen identified several proteins with putative DNA-binding domains. Thus, it is proposed that analogous to the KRAB-ZFP pathway in mammals, Su(var)2-10 may link DNA-binding proteins to the SetDB1 silencing machinery. Future studies are necessary to identify the proteins that guide Su(var)2-10 to target loci and to elucidate TE-independent recruitment mechanisms of the silencing machinery (Ninova, 2020b).
Linker histones H1 are principal chromatin components, whose contribution to the epigenetic regulation of chromatin structure and function is not fully understood. In metazoa, specific linker histones are expressed in the germline, with female-specific H1s being normally retained in the early embryo. Embryonic H1s are present while the zygotic genome is transcriptionally silent and they are replaced by somatic variants upon activation, suggesting a contribution to transcriptional silencing. This study directly addresses this question by ectopically expressing dBigH1 in Drosophila S2 cells, which lack dBigH1. dBigH1 was shown to bind across chromatin, replace somatic histone H1 and reduce nucleosome repeat length (NRL). Concomitantly, dBigH1 expression down-regulates gene expression by impairing RNApol II binding and histone acetylation. These effects depend on the acidic N-terminal ED-domain of dBigH1, since a truncated form lacking this domain binds across chromatin and replaces dH1 like full-length dBigH1, but affects neither NRL nor transcription. In vitro reconstitution experiments using Drosophila preblastodermic embryo extracts corroborate these results. Altogether these results suggest that the negatively charged N-terminal tail of dBigH1 alters the functional state of active chromatin, compromising transcription (Climent-Canto, 2020).
Linker histones H1 constitute an evolutionarily conserved family of chromosomal proteins that play an important structural role in regulating chromatin compaction and higher order chromatin organization. In metazoan species, histones H1 usually exist as multiple variants, some of which are specifically expressed in the germline. For instance, four of the eleven mouse/human H1 isoforms are germline specific, of which three are expressed in males (H1T, HILS1 and H1T2) and one in females (H1oo). Female-specific variants usually accumulate in the oocyte and are retained during early embryogenesis. In comparison to most metazoa, H1 complexity in Drosophila is much reduced since it contains a single somatic dH1 variant, which is ubiquitously expressed throughout development, and a single germline-specific variant, dBigH1, which is expressed in both the female and male germlines and is retained in the early embryo. Embryonic H1s persist as long as the zygotic genome remains transcriptionally silent, being replaced by somatic variants when transcription begins during zygotic genome activation (ZGA). In Drosophila, dBigH1 is present during early embryogenesis until ZGA onset at cellularization. At this stage, dBigH1 is replaced by somatic dH1 in somatic cells, whereas it is retained in the primordial germ cells (PGC), which remain transcriptionally silent (Climent-Canto, 2020).
These observations suggest that dBigH1, and embryonic H1s in general, are general transcriptional regulators that contribute to silencing. Linker histones H1 have been usually associated with transcription repression, but somatic H1s are readily detected across expressed genes. In this regard, it has been reported that somatic H1s can even enhance the synergism between transcription factors. In contrast, in the presence of dBigH1, chromatin appears to be transcriptionally silent, suggesting that dBigH1 enhances transcriptional silencing. This study analyzed the mechanisms of action of dBigH1. For this purpose, ectopic expression experiments were performed in Drosophila S2 cells, which lack dBigH1. These experiments confirm the contribution of dBigH1 to transcriptional silencing, identifying the acidic N-terminal ED-domain as responsible for its negative effect on transcription (Climent-Canto, 2020).
The mechanism of dBigH1 action in transcription regulation was addressed using ectopic dBigH1 expression experiments in S2 cells. The genomic distribution of ectopically expressed dBigH1 correlates positively, though weakly, with those observed in embryos and testes, where dBigH1 is naturally expressed, suggesting that the mechanisms governing dBigH1 deposition might be partially conserved in S2 cells. The results suggest that binding of dBigH1 negatively affects transcription. Upon dBigH1 expression, more than two-thirds of the differentially expressed genes were down-regulated. This effect was probably underestimated, since a global decrease in gene expression was observed (though only in one replicate) that, considering the methodology used for normalization, could hamper identification of differentially down-regulated genes. This down-regulation occurred at the transcriptional level, as down-regulated genes showed reduced RNApol II content. Conversely, RNApol II content of up-regulated genes was not increased upon dBigH1 expression, suggesting that the observed up-regulation was not transcriptional. Moreover, in vitro experiments showed that dBigH1 inhibited transcription of a chromatin template. Consistent with the negative effect on transcription, dBigH1 expression specifically decreased H3K36me3 levels at CDS of down-regulated genes (Climent-Canto, 2020).
The results show that dBigH1 replaces dH1. In these experiments, dBigH1 binding to chromatin was likely taking place in the absence of DNA replication, as dBigH1 induction was sustained for 24h and, during this time, cell density did not increase noticeably. Whether dBigH1 deposition involves active dH1 replacement remains to be determined. In vitro, incubation of purified nuclei with Drosophila embryo extract results in binding of dBigH1 to chromatin without dH1 displacement, suggesting that the replacement observed in S2 cells responds to an active process. Along the same lines, it was observed that dBigH1 was preferentially deposited at regions enriched in dH1. Replacement of somatic H1s by embryonic H1s has been reported in nuclear transfer experiments and NAP-1 has been shown to be involved in both B4/H1M deposition and somatic H1s removal in Xenopus. Further work is required to determine the mechanisms regulating dBigH1 deposition (Climent-Canto, 2020).
The acidic ED-domain of dBigH1 is required to inhibit transcription, since expression of the truncated dBigH1ΔED form, which also replaced somatic dH1, neither down-regulated gene expression nor affected RNApol II loading or H3Kac levels. The presence of the negatively charged acidic ED-domain in dBigH1 is peculiar as histones are highly positively charged. It is possible that, due to the negative charge of the ED-domain, the structural organization of chromatin is compromised in the presence of dBigH1. Indeed, the overall NRL changed upon dBigH1 expression, but not when dBigH1ΔED was expressed. Interestingly, although the ED-domain of dBigH1 is not conserved outside of the Drosophila genus, embryonic H1s are generally more acidic than somatic ones. In this regard, it was shown that both the Xenopus B4/H1M and the mammalian H1oo embryonic linker histones alter chromatin organization and dynamics. An altered chromatin organization would perturb access to chromatin and/or functioning of chromatin remodelers/modifiers and transcription factors that, ultimately, would affect RNApol II loading and transcription. In fact, regardless of the actual transcriptional outcome, dBigH1 expression globally affected H3Kac. In contrast, it was reported earlier that incubation of purified nuclei with Drosophila embryo extract, which also results in dBigH1 binding, increased H3Kac levels. However, it is important to note that the increase in H3Kac levels observed in this case was independent of dBigH1 binding (Climent-Canto, 2020).
It might be argued that the down-regulation observed upon dBigH1 expression is a consequence of increased global linker histones content. However, similar or even higher levels of expression of the truncated dBigH1ΔED and dBigH1ΔNTD forms did not affect transcription. Moreover, binding of dBigH1 is compensated by removal of dH1, thus total linker histones content is not greatly increased (Climent-Canto, 2020).
dBigH1 binding affected expression of a relatively small subset of genes. This may reflect the fact that in the experimental setup used in this study, dBigH1 accounted for only 20-25% of total linker histones. Thus, from this point of view, affected genes appear to correspond to a subset of genes more sensitive to dBigH1 levels. In this regard, it was observed that down-regulated genes had strong RNApol II pausing, which tended to decrease upon dBigH1 expression. In addition, though dBigH1 binding reduced H3Kac globally, only the down-regulated genes were transcriptionally affected. Interestingly, impairing RNApol II pausing generally down-regulates gene expression, while reduced H3Kac levels preferentially affect expression of highly paused genes. These observations suggest that the higher sensitivity to dBigH1 expression of down-regulated genes is likely due to the way their transcription is regulated (Climent-Canto, 2020).
In summary, this study has presented evidence supporting that the acidic N-terminal tail of the embryonic dBigH1 linker histone of Drosophila compromises transcription by altering the functional epigenetic state of active chromatin. Other embryonic H1s might share similar properties since they are generally more acidic than their somatic counterparts (Climent-Canto, 2020).
Many membraneless organelles form through liquid-liquid phase separation, but how their size is controlled and whether size is linked to function remain poorly understood. The histone locus body (HLB) is an evolutionarily conserved nuclear body that regulates the transcription and processing of histone mRNAs. This study shows that Drosophila HLBs form through phase separation. During embryogenesis, the size of HLBs is controlled in a precise and dynamic manner that is dependent on the cell cycle and zygotic histone gene activation. Control of HLB growth is achieved by a mechanism integrating nascent mRNAs at the histone locus, which facilitates phase separation, and the nuclear concentration of the scaffold protein Multi-sex combs (Mxc), which is controlled by the activity of cyclin-dependent kinases. Reduced Cdk2 activity results in smaller HLBs and the appearance of nascent, misprocessed histone mRNAs. Thus, these experiments identify a mechanism linking nuclear body growth and size with gene expression (Hur, 2020).
Before zygotic genome activation (ZGA), the quiescent genome undergoes reprogramming to transition into the transcriptionally active state. However, the mechanisms underlying euchromatin establishment during early embryogenesis remain poorly understood. This study shows that histone H4 lysine 16 acetylation (H4K16ac) is maintained from oocytes to fertilized embryos in Drosophila and mammals. H4K16ac forms large domains that control nucleosome accessibility of promoters prior to ZGA in flies. Maternal depletion of MOF acetyltransferase leading to H4K16ac loss causes aberrant RNA Pol II recruitment, compromises the 3D organization of the active genomic compartments during ZGA, and causes downregulation of post-zygotically expressed genes. Germline depletion of histone deacetylases revealed that other acetyl marks cannot compensate for H4K16ac loss in the oocyte. Moreover, zygotic re-expression of MOF was neither able to restore embryonic viability nor onset of X chromosome dosage compensation. Thus, maternal H4K16ac provides an instructive function to the offspring, priming future gene activation (Samata, 2020).
The fusion of the maternal and paternal gametes triggers a remarkable transition from two fully differentiated cells to a totipotent zygote that gives rise to all tissues during embryogenesis. In flies, the development of the embryo during the first 13 synchronized nuclear divisions relies on maternally provided proteins and transcripts. These maternal elements are replaced by newly synthesized ones during the major wave of zygotic genome activation (ZGA) at nuclear cycle (nc) 14, at embryonic stage (st) 5, when the zygotic genome has been reorganized to accommodate the transcriptionally active state. An increase in nucleosome accessibility as well as gradual enrichment of RNA Polymerase II (RNA Pol II) are observed from nc11. The repressive mark H3K27me3 is inherited from the maternal germline, restricting the activation of developmental genes, but most of the other acetyl and methyl marks only become prominent genome-wide at ZGA. The few transcripts that are activated before ZGA (during the minor zygotic wave) are under the control of the pioneer transcription factor Zelda, which mediates local chromatin accessibility. However, the mechanisms that guide the reprogramming of the entire genome are not fully understood (Samata, 2020).
Acetylation of histone tails is known to promote transcriptional activation. Among the histone modifications positively correlated with transcription activation, H4K16ac is unique because it prevents chromatin compaction in vitro. However, the developmental dynamics and the biological significance of this modification in the embryonic genome prior to ZGA remain unclear. H4K16ac is deposited by the histone acetyltransferase (HAT) males absent on the first (MOF). The MOF-containing male-specific lethal (MSL) complex is responsible for chromosome-wide upregulation of the male X chromosome to equalize its expression to the female X as well as to autosomal genes. The complex consists of five proteins (MSL1, MSL2, MSL3, MOF, and MLE) together with two long non-coding RNAs, RNA on the X 1 and 2 (roX1, roX2), and is capable of specifically recognizing the single male X chromosome. Interestingly, all MSL proteins, apart from MSL2, are maternally deposited as transcripts and proteins, which remain stable through the early embryonic stages. However, it has not been determined whether they form a functional complex (Samata, 2020).
By analyzing precisely staged Drosophila embryos before and after ZGA and by performing genetic and genomic experiments, this study shows that H4K16ac is intergenerationally transmitted from the female germline and has a fundamental role in controlling chromatin accessibility in the absence of ongoing transcription during early embryogenesis. Furthermore, it poises promoters for future gene activation (Samata, 2020).
MOF represents the major enzyme catalyzing H4K16ac. It is proposed that maternally provided MOF plays a dual role in 'depositing' and 'maintaining' H4K16ac. First, maternal MOF establishes H4K16ac in the maturing oocyte. Following fertilization, MOF exploits its unique ability to remain associated with mitotic chromosomes in order to actively propagate H4K16ac, hence acting as the maintenance factor for H4K16ac during the first and subsequent embryonic divisions. Continuous presence of MOF is essential because histone acetylation marks typically exhibit fast turnover. Thus, the proposed intergenerational H4K16ac transmission model is distinct from the mechanisms of inheritance of methylation marks, many of which rely on different catalytic modes for de novo deposition and propagation (Samata, 2020).
The combination of maternal deposition and early embryonic maintenance of the H4K16ac information is critical for marking genes prior to ZGA for future activation. Absence of this information leads to misregulation of H4K16ac targets and subsequently to increased embryonic lethality. Other acetyl histone marks were not able to restore proper gene expression in the absence of MOF, demonstrating a specific requirement for H4K16ac in oocytes. Furthermore, expression of the 'maintaining' (zygotic) MOF upon ZGA cannot compensate for loss of the 'depositing' (maternal) MOF function. It will be interesting to characterize the specificities of maternal and zygotic MOF and further explore whether genome structure, developmental timing, or other determinants affect the H4K16ac deposition pattern (Samata, 2020).
Maternally deposited H4K16ac primes a subset of genes for subsequent transcriptional activation upon the onset of ZGA and later in development. The transcription factor Zelda is responsible for activation of the first zygotic transcripts. However, neither Zelda-mediated transcription nor the chromatin marks around these Zelda-dependent regions explain the global emergence of chromatin accessibility in early embryos. The current data indicate that the establishment of H4K16ac-mediated nucleosome accessibility on numerous Zelda-independent promoters before ZGA creates a permissive chromatin state that enhances RNA Pol II recruitment and facilitates gene expression activation (Samata, 2020).
Maternal depletion of MOF also led to profound changes in chromatin architecture. Even though the structure of TAD boundaries remained unaffected, substantial defects were observed in compartmentalization during ZGA. Analysis of Hi-C datasets from Drosophila st5 embryos whose transcription was abrogated by drug treatment (Hug, 2017) revealed phenotypes similar to those observed in embryos after maternal H4K16ac loss. However, the compartmentalization defects in maternal mof RNAi offspring were apparent only in early and not late embryos, despite persistent transcription misregulation at both stages. Thus, an aberrant transcriptional program is not the primary driving force for the genome compartmentalization defects observed upon early H4K16ac loss. It is concluded that although maternal H4K16ac contributes toward establishing global genome organization early on, other factors can compensate for this loss at later embryonic stages (Samata, 2020).
The 'future' dosage compensated genes on the X chromosome are among the numerous targets that show H4K16ac signal prior to the onset of their transcription. By characterizing the developmental dynamics of H4K16ac, this study describes the sequence of events that leads to establishment of dosage compensation on the male X chromosome. H4K16ac decorates all chromosomes prior to ZGA but becomes strongly enriched on the male X chromosome during later stages. It is proposed that initiation of dosage compensation at both X-linked genes and high-affinity sites (HASs) relies on the instructive H4K16ac signal from the mother. Without maternal H4K16ac, MSL complex targeting is compromised in males and the mature dosage-compensated phase cannot be reached. Thus, the X chromosome serves as a readout of H4K16ac memory on a chromosomal scale (Samata, 2020).
H4K16ac is deposited in oocytes by a maternal MSL sub-complex composed of MSL1, MOF, and MSL3. This first step prepares the chromatin landscape for establishment of nucleosome accessibility and dosage compensation initiation in a sex-independent manner. Assembly of the canonical MSL complex requires the expression of the male-specific protein MSL2, whose expression starts at stage 5. MSL2 targets the X chromosome at ZGA and together with MOF mediates transcriptional activation of genes close to HASs in a male-specific manner. Interestingly though, no MSL complex 'spreading' in the vicinity of HASs is observed at this stage. Because X-chromosome territory formation coincides with the expression of the roX2 long non-coding RNA, it is possible that efficient MSL complex spreading is mediated by the contribution of roX2/MSL interactions. Thus, the MSL complex targeting and spreading on the X chromosome represent two temporally discrete steps with distinct requirements. Moreover, the three-dimensional organization of HASs may function as an additional stabilizing factor for X-territory maturation. Indeed, interactions that were observed between HASs were more abundant in stage 15 compared to stage 5 embryos, possibly because of the stronger chromatin compartmentalization in late embryos. It is therefore possible that maturation of the active compartment is required for the stronger clustering of HASs (Samata, 2020).
Although the dosage compensation defects represent a clear readout of the importance of maternal H4K16ac, the influence of this early mark is not restricted to male progeny. Furthermore, this study found the H4K16ac marking of the oocyte to be evolutionarily conserved in three Drosophila species (Drosophila melanogaster, D. virilis, and D. busckii) as well as in mammals. Given that mammals have a different dosage compensation mechanism, retention of H4K16ac in the early mammalian zygote likely indicates the importance of this histone modification in embryogenesis (Samata, 2020).
Maternal inheritance of H3K27me3 mediates gene silencing in both Drosophila and mammals. DNA methylation, H3K4me3, and H3K36me3 mediate zygotic genome activation in other organisms. However, these modifications are absent from young Drosophila zygotes. A variety of mechanisms have thus evolved to propagate instructions to the next generation via histone modifications in the germline. Future work will elucidate the function of the early presence of H4K16ac in mice and humans (Samata, 2020).
The histone locus body (HLB) assembles at replication-dependent (RD) histone loci and concentrates factors required for RD histone mRNA biosynthesis. The D. melanogaster genome has a single locus comprised of ~100 copies of a tandemly arrayed 5-kb repeat unit containing one copy of each of the 5 RD histone genes. To determine sequence elements required for D. melanogaster HLB formation and histone gene expression, transgenic gene arrays were used containing 12 copies of the histone repeat unit that functionally complement loss of the ~200 endogenous RD histone genes. A 12x histone gene array in which all H3-H4 promoters were replaced with H2a-H2b promoters [12x(PR)] does not form an HLB or express high levels of RD histone mRNA in the presence of the endogenous histone genes. In contrast, this same transgenic array is active in HLB assembly and RD histone gene expression in the absence of the endogenous RD histone genes and rescues the lethality caused by homozygous deletion of the RD histone locus. The HLB formed in the absence of endogenous RD histone genes on the mutant 12x array contains all known factors present in the wild type HLB including CLAMP, which normally binds to GAGA repeats in the H3-H4 promoter. These data suggest that multiple protein-protein and/or protein-DNA interactions contribute to HLB formation, and that the large number of endogenous RD histone gene copies sequester available factor(s) from attenuated transgenic arrays, thereby preventing HLB formation and gene expression (Koreski, 2020).
An important organizing principle in cells is the use of membraneless compartments to spatially and temporally regulate diverse biological processes. Numerous membraneless compartments have been identified in both the nucleus (e.g., nucleoli, Cajal bodies, histone locus bodies) and the cytoplasm (e.g., P-bodies, stress granules, germ granules) and are collectively referred to as biomolecular condensates. There is increasing evidence suggesting that biomolecular condensates are formed through liquid-liquid phase separation or condensation (Alberti, 2019). This occurs when proteins and/or nucleic acids in the nucleoplasm or cytoplasm coalesce or demix into a condensed phase that often resembles liquid droplets. Large nuclear condensates that are visible under light microscopy are most often referred to as nuclear bodies (NBs) and represent an important organizing feature of the nucleus (Koreski, 2020).
The histone locus body (HLB) is a conserved NB that assembles at replication-dependent (RD) histone genes and concentrates factors required for RD histone mRNA biogenesis. RD histone mRNAs are the only eukaryotic mRNAs that are not polyadenylated. The unique stem loop at the 3'-end of RD histone mRNAs results from a processing reaction requiring a specialized suite of factors, some of which are constitutively localized in the HLB (Duronio, 2017). The HLB provides a powerful system to study how NBs form and function because it contains a well-characterized set of factors involved in producing a unique class of cell-cycle-regulated mRNAs. Previous work demonstrated that concentrating factors (e.g., FLASH [FLICE-associated huge protein] and U7 snRNP) in the Drosophila melanogaster HLB is critical for efficient histone pre-mRNA processing. However, a full understanding of how the HLB participates in histone mRNA biosynthesis requires knowledge of HLB assembly at the molecular level (Koreski, 2020).
Prior studies of NBs have provided several important assembly concepts that are applicable to the HLB. Many NB components have an intrinsic ability to self-associate, an observation leading to two different models of NB assembly: (1) interactions among NB components occur stochastically, wherein individual factors can be recruited to the body in any order; or (2) components assemble in an ordered or hierarchical pathway, wherein the recruitment of components is predicated on prior recruitment of others. The HLB appears to employ a hybrid version of these two possibilities. For example, genetic loss of function experiments suggest a partially ordered assembly pathway of the Drosophila HLB with some components being required for subsequent recruitment of others. The scaffolding protein Mxc (multi sex combs), the Drosophila ortholog of human NPAT (nuclear protein, ataxia-telangiectasia locus), and FLASH likely form the core of the HLB and are required for subsequent recruitment of other factors (White, 2011). Tethering experiments in mammalian cells indicate that ectopic HLB formation also may be induced by several different HLB components, supporting a stochastic model of assembly (Koreski, 2020).
The initiation event in self-organizing NB assembly is the key step in the process and is not well understood. A prevalent model postulates a 'seeding' event that initiates the nucleation of critical components that form a platform for further recruitment of other components. In some instances, RNA is thought to help seed NB assembly, and NBs such as the nucleolus and ectopic paraspeckles can form at sites of specific transcription. Blocking transcription prevents complete HLB assembly in both zebrafish and flies. However, the HLB is present at RD histone genes even in G1 when the genes are not active, raising the possibility that histone genes themselves participate in seeding HLB assembly (Koreski, 2020).
In Drosophila, the RD histone genes are present at a single locus with ~100 copies of a tandemly arrayed 5-kb repeat unit, each of which contains one copy of the divergently transcribed H2a-H2b and H3-H4 gene pairs as well as the gene for linker histone H1. Using transgenes containing a wild-type or mutant derivative of a single histone repeat, previous work demonstrated that the bidirectional H3-H4 promoter stimulated HLB assembly and transcription of the single histone repeat in salivary glands (Salzler, 2013). It was subsequently shown that the conserved GAGA repeat elements present in the H3-H4 promoter region are targeted by the zinc-finger transcription factor CLAMP (chromatin-linked adaptor for MSL proteins), and that this interaction promotes HLB assembly (Rieder, 2017). Thus, the H3-H4 promoter region might act to seed HLB assembly (Koreski, 2020).
This work leveraged transgenic histone gene arrays to test whether the H3-H4 promoter region is necessary for in vivo function of the RD histone locus. Replacement of H3-H4 promoters with H2a-H2b promoters was shown to result in an attenuated transgenic histone gene array that does not function in the presence of the intact endogenous RD histone locus, but surprisingly provides full in vivo function, including normal HLB assembly and histone gene expression, when the endogenous RD histone locus is absent. These results suggest that multiple elements in the histone genes and core HLB proteins are involved in HLB assembly (Koreski, 2020).
This study used a histone gene replacement platform to analyze the cis-acting elements within the Drosophila histone repeat unit that are necessary for HLB formation and histone gene expression. Previously it was shown using a single, transgenic histone gene repeat unit that the promoter region of the divergently transcribed H3-H4 gene pair is capable of stimulating HLB formation (Salzler, 2013). Subsequently this functionality was further mapped using a 12x gene array to conserved GAGA repeats in this region that are targeted by the CLAMP protein (Rieder, 2017). This study presents evidence that a 12xPR histone gene array devoid of the H3-H4 promoter and lacking any CLAMP binding elements cannot assemble an HLB in the presence of the ~100 RD histone gene copies at the endogenous locus (HisC). However, the 12xPR array surprisingly can rescue homozygous deletion of HisC and fully support the entire Drosophila life cycle. In the HisC deletion background, the 12xPR array assembles an HLB and expresses the same amount of properly processed histone mRNAs as the endogenous genes or as a 12xDWT wild-type array. Below, the implications of these observations on HLB assembly and organization are discussed (Koreski, 2020).
Biomolecular condensates form via a seeding event that promotes a high concentration of factors at a discrete location, leading to recruitment of additional factors that ultimately result in a structure that can be observed by light microscopy. A number of putative seeding events for biomolecular condensates have been described, but in many cases the precise mechanism of seeding is not known. Nucleic acids, particularly RNA, have been proposed to seed different NBs. Both the nucleolus and the HLB are associated with specific genomic loci, and it is likely that the DNA (or chromatin) and/or nascent RNA at the locus participates in the seeding event. The activation of zygotic transcription of rRNA leads to the precise spatiotemporal formation of the nucleolus in Drosophila embryos (Falahati, 2016). In the absence of rDNA, Drosophila nucleolar components still form high concentration assemblies, but these are smaller, more numerous, and do not form at the same time in the early embryo as the wild-type nucleolus (Koreski, 2020).
Drosophila HLB components also stochastically assemble smaller and more unstable foci in embryos lacking the RD histone locus, suggesting that HLBs and the nucleolus form via similar seeding events. Indeed, the dynamics of HLB assembly in single early Drosophila embryos display properties consistent with liquid-liquid phase transition seeded by HisC (Hur, 2020). Blocking transcription in the early embryo prevents normal HLB growth (Hur, 2020), and a defective H3-H4 promoter (with mutated TATA boxes) does not support HLB formation in the context of a single copy histone gene repeat in salivary glands (Salzler, 2013). These data suggest that active transcription is essential for forming a complete HLB (Koreski, 2020).
It is important to note that HLBs assemble and persist in nonproliferating Drosophila tissues that do not express histone mRNAs and are also present in G0/G1 mammalian cells. Histone gene expression is activated as a result of phosphorylation of Mxc/NPAT by Cyclin E/Cdk2, resulting in changes in the HLB that promote histone gene transcription and pre-mRNA processing. It is proposed that in early embryonic development the histone locus DNA and/or chromatin seeds HLB assembly in Drosophila, with the H3-H4 promoter region being particularly important. It is further proposed that subsequent transcriptional activation of histone genes then drives HLB growth and maturation (Koreski, 2020).
Formation of an HLB on a transgenic RD histone gene array requires that this array compete effectively with the endogenous HisC locus for recruitment of HLB components. This is the situation with 12xHWT and 12xDWT arrays, which form HLBs in the presence of HisC. These results also indicate that there are no other elements within HisC that are necessary for HLB formation. Because the 12xPR array does not form an HLB in the presence of HisC, but it does so in the absence of HisC, it is hypothesized that the endogenous RD histone gene array sequesters critical HLB components, likely including Mxc and CLAMP, thereby preventing HLB assembly at the transgenic locus. By removing the H3-H4 promoter from the transgene, an element was removed that provided additional interactions with HLB components, notably CLAMP, weakening the overall ability of the locus to stably nucleate an HLB (Koreski, 2020).
Interactions among multivalent proteins, or multivalent protein-nucleic acid interactions, are driving forces in the assembly of biomolecular condensates. Mxc is likely the critical factor that together with histone genes seeds Drosophila HLB formation and activates histone gene expression. Mxc is a large (~1800 aa) protein that oligomerizes in vivo and likely provides a scaffold for multivalent interactions. A C-terminal truncation mutant of Mxc that fails to recruit histone pre-mRNA processing factors still forms an HLB and activates histone gene expression at sufficient levels to complete development, underscoring the multivalent nature of Mxc (Koreski, 2020).
Surprisingly, the HLB that assembles on the 12xPR array in the absence of HisC contains CLAMP, even though this study removed all of the known CLAMP binding sites from the histone repeat. Although CLAMP may bind another sequence in the 12xPR array, no other favorable GAGA repeats are present, and it was not possible to detect CLAMP bound to any other location in the histone array by ChIP-qPCR and ChIP-seq experiments. More likely, CLAMP interacts with other HLB components, possibly Mxc or the Mxc-FLASH complex, providing multivalent contacts between CLAMP and other HLB components. Deleting the GAGA sequences from the H3-H4 promoter did not affect transcription of the H3 or H4 genes in the absence of HisC, suggesting that CLAMP's major function is to promote HLB assembly and not to act as a canonical DNA binding transcription factor. Supporting this interpretation is the observation that another, more abundant transcription factor that binds to GAGA repeats, GAF, is not found at the HLB unless CLAMP is absent (Rieder, 2017), consistent with CLAMP's critical interactions with both the GAGA repeats and the HLB factors in seeding the HLB (Koreski, 2020).
Because the 12xPR array is capable of assembling a completely functional HLB in the absence of HisC, the H3-H4 promoter is not absolutely essential for HLB formation. One possibility is that there are multiple pathways for assembling functional HLBs. Previous work suggests that not all seeding events are equivalent in their ability to assemble biomolecular condensates. In artificial systems, changes in scaffold stoichiometry, which can stem from changes in valence, alter the recruitment of components. Further, mathematical modeling has revealed that scaffolds can nucleate distinct complexes when at different concentrations, and that this can qualitatively alter the transcriptional output. Additionally, P-bodies can form in multiple ways through different protein-protein or protein-nucleic acid interactions, with different interactions predominating under different conditions. Therefore, different nucleators of the HLB (i.e., the H3-H4 promoter or other sequences in the locus) may result in similar but not identical outcomes. Collectively these results suggest that HLB formation results from the contribution of many molecular interactions, and the loss of any single one may be overcome by other multivalent interactions within the body (Koreski, 2020).
Cells orchestrate histone biogenesis with strict temporal and quantitative control. To efficiently regulate histone biogenesis, the repetitive Drosophila melanogaster replication-dependent histone genes are arrayed and clustered at a single locus. Regulatory factors concentrate in a nuclear body known as the histone locus body (HLB), which forms around the locus. Historically, HLB factors have largely been discovered by chance, and few are known to interact directly with DNA. It is therefore unclear how the histone genes are specifically targeted for unique and coordinated regulation. To expand the list of known HLB factors, a candidate-based screen was performed by mapping 30 publicly available ChIP datasets covering 27 factors to the Drosophila histone gene array. Novel transcription factor candidates were identified, including the Drosophila Hox proteins Ultrabithorax, Abdominal-A and Abdominal-B, suggesting a new pathway for these factors in influencing body plan morphogenesis. Additionally, six other factors were identified that target the histone gene array: JIL-1, Hr78, the long isoform of fs(1)h, and the general transcription factors TAF-1, TFIIB, and TFIIF. This foundational screen provides several candidates for future studies into factors that may influence histone biogenesis. Further, this study emphasizes the powerful reservoir of publicly available datasets, which can be mined as a primary screening technique (Hodkinson, 2023).