InteractiveFly: GeneBrief
Chromatin-linked adaptor for MSL proteins: Biological Overview | References
Gene name - Chromatin-linked adaptor for MSL proteins
Synonyms -
Cytological map position - 40F2-40F2
Function - zinc finger transcription factor
Keywords - dosage compensation - essential in both males and females - also functions outside of the dosage compensation pathway
Symbol - Clamp
FlyBase ID: FBgn0032979
Genetic map position - chr2L:22,165,720-22,169,143
NCBI classification - Zinc finger, C2H2 type
Cellular location - nuclear
Recent literature
Urban, J., Kuzu, G., Bowman, S., Scruggs, B., Henriques, T., Kingston, R., Adelman, K., Tolstorukov, M. and Larschan, E. (2017). Enhanced chromatin accessibility of the dosage compensated Drosophila male X-chromosome requires the CLAMP zinc finger protein. PLoS One 12(10): e0186855. PubMed ID: 29077765
Summary: The essential process of dosage compensation is required to equalize gene expression of X-chromosome genes between males (XY) and females (XX). In Drosophila, the conserved Male-specific lethal (MSL) histone acetyltransferase complex mediates dosage compensation by increasing transcript levels from genes on the single male X-chromosome approximately two-fold. Consistent with its increased levels of transcription, the male X-chromosome has enhanced chromatin accessibility, distinguishing it from the autosomes. This study demonstrates that the non-sex-specific CLAMP (Chromatin-linked adaptor for MSL proteins) zinc finger protein that recognizes GA-rich sequences genome-wide promotes the specialized chromatin environment on the male X-chromosome and can act over long genomic distances (~14 kb). Although MSL complex is required for increasing transcript levels of X-linked genes, it is not required for enhancing global male X-chromosome chromatin accessibility, and instead works cooperatively with CLAMP to facilitate an accessible chromatin configuration at its sites of highest occupancy. Furthermore, CLAMP regulates chromatin structure at strong MSL complex binding sites through promoting recruitment of the Nucleosome Remodeling Factor (NURF) complex. In contrast to the X-chromosome, CLAMP regulates chromatin and gene expression on autosomes through a distinct mechanism that does not involve NURF recruitment. Overall, these results support a model where synergy between a non-sex-specific transcription factor (CLAMP) and a sex-specific cofactor (MSL) creates a specialized chromatin domain on the male X-chromosome. |
Urban, J. A., Urban, J. M., Kuzu, G. and Larschan, E. N. (2017). The Drosophila CLAMP protein associates with diverse proteins on chromatin. PLoS One 12(12): e0189772. PubMed ID: 29281702
Summary: Gaining new insights into gene regulation involves an in-depth understanding of protein-protein interactions on chromatin. A powerful model for studying mechanisms of gene regulation is dosage compensation, a process that targets the X-chromosome to equalize gene expression between XY males and XX females. Previous work has identified a zinc finger protein in Drosophila melanogaster that plays a sex-specific role in targeting the Male-specific lethal (MSL) dosage compensation complex to the male X-chromosome, called the Chromatin-Linked Adapter for MSL Proteins (CLAMP). CLAMP has been found to have non-sex-specific roles as an essential protein that regulates chromatin accessibility at promoters genome-wide. To identify associations between CLAMP and other factors in both male and female cells, two complementary mass spectrometry approaches were used. This study demonstrates that CLAMP associates with the transcriptional regulator complex Negative Elongation Factor (NELF; see Nelf-E) in both sexes and determines that CLAMP reduces NELF recruitment to several target genes. In sum, this study has identified many new CLAMP-associated factors and provides a resource for further study of this little understood essential protein.
Kaye, E. G., Booker, M., Kurland, J. V., Conicella, A. E., Fawzi, N. L., Bulyk, M. L., Tolstorukov, M. Y. and Larschan, E. (2018). Differential occupancy of two GA-binding proteins promotes targeting of the Drosophila dosage compensation complex to the male X chromosome. Cell Rep 22(12): 3227-3239. PubMed ID: 29562179
Summary: Little is known about how variation in sequence composition alters transcription factor occupancy to precisely recruit large transcription complexes. A key model for understanding how transcription complexes are targeted is the Drosophila dosage compensation system in which the male-specific lethal (MSL) transcription complex specifically identifies and regulates the male X chromosome. The chromatin-linked adaptor for MSL proteins (CLAMP) zinc-finger protein targets MSL to the X chromosome but also binds to GA-rich sequence elements throughout the genome. Furthermore, the GAGA-associated factor (GAF) transcription factor also recognizes GA-rich sequences but does not associate with the MSL complex. This study demonstrated that MSL complex recruitment sites are optimal CLAMP targets. Specificity for CLAMP binding versus GAF binding is driven by variability in sequence composition within similar GA-rich motifs. Therefore, variation within seemingly similar cis elements drives the context-specific targeting of a large transcription complex. |
Bag, I., Dale, R. K., Palmer, C. and Lei, E. P. (2019). The zinc-finger protein CLAMP promotes gypsy chromatin insulator function in Drosophila. J Cell Sci. PubMed ID: 30718365
Summary: Chromatin insulators are DNA-protein complexes that establish independent higher order DNA domains to influence transcription. Insulators are functionally defined by two different properties: they can block communication between an enhancer and a promoter and also act as a barrier between heterochromatin and euchromatin. In Drosophila, the gypsy insulator complex contains three core components: Su(Hw), CP190 and Mod(mdg4)67.2. This study identified a novel role for Chromatin-linked adaptor for MSL proteins (CLAMP) in promoting gypsy chromatin insulator function. When clamp is depleted, gypsy-dependent enhancer blocking and barrier activities are strongly reduced. CLAMP associates physically with the core gypsy insulator complex, and ChIP-seq analysis reveals extensive overlap particularly with promoter-bound CP190 on chromatin. Depletion of CLAMP disrupts CP190 binding at a minority of shared sites, but depletion of CP190 results in extensive loss of CLAMP chromatin association. Finally, reduction of CLAMP disrupts CP190 localization within the nucleus. These results support a positive functional relationship between CLAMP and CP190 to promote gypsy chromatin insulator activity.
Tikhonova, E., Fedotova, A., Bonchuk, A., Mogila, V., Larschan, E. N., Georgiev, P. and Maksimenko, O. (2019). The simultaneous interaction of MSL2 with CLAMP and DNA provides redundancy in the initiation of dosage compensation in Drosophila males. Development. PubMed ID: 31320325
Summary: The binding of the Drosophila male-specific lethal dosage compensation complex (DCC) exclusively to male X chromosome provides an excellent model system to understand mechanisms of selective recruitment of protein complexes to chromatin. Previous studies showed that the male-specific organizer of the complex, MSL2, and ubiquitous DNA-binding protein CLAMP are key players in the specificity of X chromosome binding. The CXC domain of MSL2 binds to genomic sites of DCC recruitment in vitro. Another conserved domain of MSL2, named Clamp-binding domain (CBD) directly interacts with the N-terminal zinc finger domain of CLAMP. This study found that inactivation of CBD or CXC individually only modestly affected recruitment of the DCC to the X chromosome in males. However, combination of these two genetic lesions within the same MSL2 mutant resulted in an increased loss of DCC recruitment to the X chromosome. Thus, proper MSL2 positioning requires an interaction with either CLAMP or DNA to initiate dosage compensation in Drosophila males. |
Rieder, L. E., Jordan, W. T., 3rd and Larschan, E. N. (2019). Targeting of the dosage-compensated male X-chromosome during early Drosophila development. Cell Rep 29(13): 4268-4275. PubMed ID: 31875538
Summary: Dosage compensation, which corrects for the imbalance in X-linked gene expression between XX females and XY males, represents a model for how genes are targeted for coordinated regulation. However, the mechanism by which dosage compensation complexes identify the X chromosome during early development remains unknown because of the difficulty of sexing embryos before zygotic transcription using X- or Y-linked reporter transgenes. This study used meiotic drive to sex Drosophila embryos before zygotic transcription and ChIP-seq to measure the dynamics of dosage compensation factor targeting. The Drosophila male-specific lethal dosage compensation complex (MSLc) requires the ubiquitous zinc-finger protein chromatin-linked adaptor for MSL proteins (CLAMP) to identify the X chromosome. A multi-stage process was observed in which MSLc first identifies CLAMP binding sites throughout the genome, followed by concentration at the strongest X-linked MSLc sites. Insight is provided into the dynamics of binding site recognition by a large transcription complex during early development. |
Jordan, W. and Larschan, E. (2021). The zinc finger protein CLAMP promotes long-range chromatin interactions that mediate dosage compensation of the Drosophila male X-chromosome. Epigenetics Chromatin 14(1): 29. PubMed ID: 34187599
Summary: Drosophila dosage compensation is an important model system for defining how active chromatin domains are formed. The male-specific lethal dosage compensation complex (MSLc) increases transcript levels of genes along the length of the single male X-chromosome to equalize with that expressed from the two female X-chromosomes. The strongest binding sites for MSLc cluster together in three-dimensional space largely independent of MSLc because clustering occurs in both sexes. CLAMP, a non-sex specific, ubiquitous zinc finger protein, binds synergistically with MSLc to enrich the occupancy of both factors on the male X-chromosome. This study demonstrates that CLAMP promotes the observed three-dimensional clustering of MSLc binding sites. Moreover, the X-enriched CLAMP protein more strongly promotes longer-range three-dimensional interactions on the X-chromosome than autosomes. Genome-wide, CLAMP promotes three-dimensional interactions between active chromatin regions together with other insulator proteins. This study has defined how long-range interactions which are modulated by a locally enriched ubiquitous transcription factor promote hyper-activation of the X-chromosome to mediate dosage compensation. |
Duan, J., Rieder, L., Colonnetta, M. M., Huang, A., McKenney, M., Watters, S., Deshpande, G., Jordan, W., Fawzi, N. and Larschan, E. (2021). CLAMP and Zelda function together to promote Drosophila zygotic genome activation. Elife 10. PubMed ID: 34342574
Summary: During the essential and conserved process of zygotic genome activation (ZGA), chromatin accessibility must increase to promote transcription. Drosophila is a well-established model for defining mechanisms that drive ZGA. Zelda (ZLD) is a key pioneer transcription factor (TF) that promotes ZGA in the Drosophila embryo. However, many genomic loci that contain GA-rich motifs become accessible during ZGA independent of ZLD. Therefore, it was hypothesized that other early TFs that function with ZLD have not yet been identified, especially those that are capable of binding to GA-rich motifs such as CLAMP. This study demonstrated that Drosophila embryonic development requires maternal CLAMP to: 1) activate zygotic transcription; 2) increase chromatin accessibility at promoters of specific genes that often encode other essential TFs; 3) enhance chromatin accessibility and facilitate ZLD occupancy at a subset of key embryonic promoters. Thus, CLAMP functions as a pioneer factor which plays a targeted yet essential role in ZGA. |
Eggers, N. and Becker, P. B. (2021). Cell-free genomics reveal intrinsic, cooperative and competitive determinants of chromatin interactions. Nucleic Acids Res 49(13): 7602-7617. PubMed ID: 34181732
Summary: Metazoan transcription factors distinguish their response elements from a large excess of similar sequences. This study explored underlying principles of DNA shape read-out and factor cooperativity in chromatin using a unique experimental system. Chromatin on Drosophila genomes was reconstructed in extracts of preblastoderm embryos, mimicking the naive state of the zygotic genome prior to developmental transcription activation. The intrinsic binding specificities of three recombinant transcription factors, alone and in combination, were then compared with GA-rich recognition sequences genome-wide. For MSL2, all functional elements reside on the X chromosome, which allows physiological elements to be distinguished from non-functional 'decoy' sites. The physiological binding profile of MSL2 is approximated through interaction with other factors: cooperativity with CLAMP and competition with GAF, which sculpts the profile by occluding non-functional sites. An extended DNA shape signature is differentially read out in chromatin. These results reveal novel aspects of target selection in a complex chromatin environment.
Tikhonova, E., Mariasina, S., Efimov, S., Polshakov, V., Maksimenko, O., Georgiev, P. and Bonchuk, A. (2022). Structural basis for interaction between CLAMP and MSL2 proteins involved in the specific recruitment of the dosage compensation complex in Drosophila. Nucleic Acids Res 50(11): 6521-6531. PubMed ID: 35648444
Summary: Transcriptional regulators select their targets from a large pool of similar genomic sites. The binding of the Drosophila dosage compensation complex (DCC) exclusively to the male X chromosome provides insight into binding site selectivity rules. Previous studies showed that the male-specific organizer of the complex, MSL2, and ubiquitous DNA-binding protein CLAMP directly interact and play an important role in the specificity of X chromosome binding. The highly specific interaction between the intrinsically disordered region of MSL2 and the N-terminal zinc-finger C2H2-type (C2H2) domain of CLAMP was examined in this study. The NMR structure was obtained of the CLAMP N-terminal C2H2 zinc finger, which has a classic C2H2 zinc-finger fold with a rather unusual distribution of residues typically used in DNA recognition. Substitutions of residues in this C2H2 domain had the same effect on the viability of males and females, suggesting that it plays a general role in CLAMP activity. The N-terminal C2H2 domain of CLAMP is highly conserved in insects. However, the MSL2 region involved in the interaction is conserved only within the Drosophila genus, suggesting that this interaction emerged during the evolution of a mechanism for the specific recruitment of the DCC on the male X chromosome in Drosophilidae.
Colonnetta, M. M., Schedl, P. and Deshpande, G. (2023). Germline/soma distinction in Drosophila embryos requires regulators of zygotic genome activation. Elife 12. PubMed ID: 36598809
Summary: In Drosophila melanogaster embryos, somatic versus germline identity is the first cell fate decision. Zygotic genome activation (ZGA) orchestrates regionalized gene expression, imparting specific identity on somatic cells. ZGA begins with a minor wave that commences at nuclear cycle (NC)8 under the guidance of chromatin accessibility factors (Zelda, CLAMP, GAF), followed by the major wave during NC14. By contrast, primordial germ cell (PGC) specification requires maternally deposited and posteriorly anchored germline determinants. This is accomplished by a centrosome coordinated release and sequestration of germ plasm during the precocious cellularization of PGCs in NC10. This study reports a novel requirement for Zelda and CLAMP during the establishment of the germline/soma distinction. When their activity is compromised, PGC determinants are not properly sequestered, and specification is disrupted. Conversely, the spreading of PGC determinants from the posterior pole adversely influences transcription in the neighboring somatic nuclei. These reciprocal aberrations can be correlated with defects in centrosome duplication/separation that are known to induce inappropriate transmission of the germ plasm. Interestingly, consistent with the ability of bone morphogenetic protein (BMP) signaling to influence specification of embryonic PGCs, reduction in the transcript levels of a BMP family ligand, decapentaplegic (dpp), is exacerbated at the posterior pole. |
Ray, M., Conard, A. M., Urban, J., Mahableshwarkar, P., Aguilera, J., Huang, A., Vaidyanathan, S. and Larschan, E. (2023). Sex-specific splicing occurs genome-wide during early Drosophila embryogenesis. Elife 12. PubMed ID: 37466240
Summary: Sex-specific splicing is an essential process that regulates sex determination and drives sexual dimorphism. Yet, how early in development widespread sex-specific transcript diversity occurs was unknown because it had yet to be studied at the genome-wide level. This study used the powerful Drosophila model to show that widespread sex-specific transcript diversity occurs early in development, concurrent with zygotic genome activation. A new pipeline is presented called time2Splice to quantify changes in alternative splicing over time. Furthermore, it was determined that one of the consequences of losing an essential maternally deposited pioneer factor called CLAMP (chromatin-linked adapter for MSL proteins) is altered sex-specific splicing of genes involved in diverse biological processes that drive development. Overall, this study shows that sex-specific differences in transcript diversity exist even at the earliest stages of development. |
Aguilera, J., Duan, J., Lee, S. M., Ray, M., Larschan, E. (2023). The CLAMP GA-binding transcription factor regulates heat stress-induced transcriptional repression by associating with 3D loop anchors. bioRxiv, PubMed ID: 37873306
Summary: In order to survive when exposed to heat stress (HS), organisms activate stress response genes and repress constitutive gene expression to prevent the accumulation of potentially toxic RNA and protein products. Although many studies have elucidated the mechanisms that drive HS-induced activation of stress response genes across species, little is known about repression mechanisms or how genes are targeted for activation versus repression context-specifically. The mechanisms of heat stress-regulated activation have been well-studied in Drosophila, in which the GA-binding transcription factor GAF is important for activating genes upon heat stress. This study shows that a functionally distinct GA-binding transcription factor (TF) protein, CLAMP (Chromatin-linked adaptor for MSL complex proteins), is essential for repressing constitutive genes upon heat stress but not activation of the canonical heat stress pathway. HS induces loss of CLAMP-associated 3D chromatin loop anchors associated with different combinations of GA-binding TFs prior to HS if a gene becomes repressed versus activated. Overall, this study demonstrated that CLAMP promotes repression of constitutive genes upon HS, and repression and activation are associated with the loss of CLAMP-associated 3D chromatin loops bound by different combinations of GA-binding TFs. |
Heterogametic species require chromosome-wide gene regulation to compensate for differences in sex chromosome gene dosage. In Drosophila melanogaster, transcriptional output from the single male X-chromosome is equalized to that of XX females by recruitment of the male-specific lethal (MSL) complex, which increases transcript levels of active genes 2-fold. The MSL complex contains several protein components and two non-coding RNA on the X (roX) RNAs that are transcriptionally activated by the MSL complex. Targeting of the MSL complex to the X-chromosome has been shown to be dependent on the Chromatin-linked adapter for MSL proteins (CLAMP) zinc finger protein. To better understand CLAMP function, the CRISPR/Cas9 genome editing system was used to generate a frameshift mutation in the clamp gene that eliminates expression of the CLAMP protein. clamp null females were found to die at the third instar larval stage, while almost all clamp null males die at earlier developmental stages. Moreover, it was found that in clamp null females roX gene expression is activated, whereas in clamp null males roX gene expression is reduced. Therefore, CLAMP regulates roX abundance in a sex-specific manner. These results provide new insights into sex-specific gene regulation by an essential transcription factor (Urban, 2017).
Many species employ a sex determination system that generates an inherent imbalance in sex chromosome copy number, such as the XX/XY system in most mammals and some insects. In this system, one sex has twice the number of X-chromosome-encoded genes compared to the other. Therefore, a mechanism of dosage compensation is required to equalize levels of X-linked transcripts, both between the sexes and between the X-chromosome and autosomes. Dosage compensation is an essential mechanism that corrects for this imbalance by coordinately regulating the gene expression of most X-linked genes (Urban, 2017).
In Drosophila melanogaster, transcription from the single male X-chromosome is increased 2-fold by recruitment of the male-specific lethal (MSL) complex. The MSL complex is composed of two structural proteins, MSL1 and MSL2, three accessory proteins, MSL3, males absent on the first (MOF), and maleless (MLE), and two functionally redundant non-coding RNAs, RNA on the X (roX1) and roX2. Previous work has shown that recruitment of the MSL complex to the X-chromosome requires the zinc finger protein chromatin-linked adapter for MSL proteins (CLAMP) (Soruco, 2013; Urban, 2017 and references therein).
In addition to its role in male MSL complex recruitment, it was suggested that CLAMP has an additional non-sex-specific essential function because targeting of the clamp transcript by RNA interference results in a pupal lethal phenotype in both males and females (Soruco, 2013). Further understanding of CLAMP function in the context of the whole organism required a null mutant. However, due to the pericentric location of the clamp gene, no deficiencies or null mutations were available. Using the CRISPR/Cas9 system, a frameshift mutation was introduced in the clamp gene, leading to an early termination codon before the major zinc finger binding domain. This frameshift mutation generated the clamp2 allele, which eliminates detectable CLAMP protein production and is therefore a protein null allele. The majority of clamp2 mutant males die prior to the third instar stage. On the other hand, females die at the third instar stage, suggesting sex-specific functions for CLAMP. Furthermore, CLAMP regulates the roX genes in a sex-specific manner, activating their accumulation in males and repressing their accumulation in females. Overall, we present a new tool for studying dosage compensation and suggest that CLAMP functions to assure that roX RNA accumulation is sex specific (Urban, 2017).
Previous work demonstrated that CLAMP has an essential role in MSL complex recruitment to the male X-chromosome (Soruco, 2013). However, it was not possible to perform in vivo studies to further investigate CLAMP function because there was no available null mutant line. The current work presents a CLAMP protein null mutant and determines that this protein is essential in both sexes. This allele will provide a key tool for future in vivo studies on the role of CLAMP in dosage compensation, as well as identification of the essential function of CLAMP in both sexes (Urban, 2017).
The initial characterization of the clamp2 protein null allele revealed sexually dimorphic roles for CLAMP in regulation of the roX genes. CLAMP was seen to promote roX2 transcription in males but to repress transcription of both roX genes in females. It is likely that recruitment of the MSL complex to the roX2 locus by CLAMP promotes roX2 expression in males. In females, where the MSL complex is not present, CLAMP may function to repress these loci as an additional mechanism to ensure that dosage compensation is male-specific. Additionally, it was determined that most clamp2 homozygous males die earlier in development than clamp2 homozygous females. Earlier lethality in males is likely due to a misregulation of the dosage compensation process as a result of the loss of CLAMP-mediated MSL complex recruitment. However, CLAMP is enriched at the 5' regulatory regions of thousands of genes across the genome. Therefore, it is likely that other non-sex-specific regulatory pathways are disrupted, resulting in female lethality (Urban, 2017).
Furthermore, CLAMP is an essential protein because our CRISPR/Cas9-generated protein null clamp allele is homozygous lethal in both males and females. These results indicate that CLAMP has a previously unstudied non-sex-specific role that is essential to the viability of both males and females. An interesting observation that arose from this characterization is that polytene chromosome organization is disrupted in clamp2 mutant females, suggesting that CLAMP may play a role in regulation of genome-wide chromatin organization of interphase chromosomes. A function in regulating chromatin organization provides one possible explanation for how CLAMP performs sexually dimorphic functions. For example, CLAMP may repress roX expression in females by promoting the recruitment of a repressive chromatin-modifying factor in the absence of the MSL complex. In contrast, CLAMP may activate roX2 in males by creating a chromatin environment permissive for MSL complex recruitment. Although roX1 and roX2 are functionally redundant, the results suggest that CLAMP specifically activates roX2 but not roX1 in males. Interestingly, Villa (2016) recently reported that roX2, but not roX1, is likely to be an early site of MSL complex recruitment, suggesting that CLAMP may function early in the process of dosage compensation (Urban, 2017).
Overall, the newly generated clamp2 protein null allele provides an important tool to study how the essential CLAMP protein regulates its many target genes in vivo. The generation of the clamp2 allele will facilitate future studies that will reveal a mechanistic understanding of how a single transcription factor can promote different sex-specific functions within an organism (Urban, 2017).
Drosophila Histone Locus Body assembly and function involves multiple interactions
The histone locus body (HLB) assembles at replication-dependent (RD) histone loci and concentrates factors required for RD histone mRNA biosynthesis. The D. melanogaster genome has a single locus comprised of ~100 copies of a tandemly arrayed 5 kb repeat unit containing one copy of each of the 5 RD histone genes. To determine sequence elements required for D. melanogaster HLB formation and histone gene expression, transgenic gene arrays were used containing 12 copies of the histone repeat unit that functionally complement loss of the ~200 endogenous RD histone genes. A 12x histone gene array in which all H3-H4 promoters were replaced with H2a-H2b promoters [12x(PR)] does not form an HLB or express high levels of RD histone mRNA in the presence of the endogenous histone genes. In contrast, this same transgenic array is active in HLB assembly and RD histone gene expression in the absence of the endogenous RD histone genes and rescues the lethality caused by homozygous deletion of the RD histone locus. The HLB formed in the absence of endogenous RD histone genes on the mutant 12x array contains all known factors present in the wild type HLB including CLAMP, which normally binds to GAGA repeats in the H3-H4 promoter. These data suggest that multiple protein-protein and/or protein-DNA interactions contribute to HLB formation, and that the large number of endogenous RD histone gene copies sequester available factor(s) from attenuated transgenic arrays, thereby preventing HLB formation and gene expression (Koreski, 2020).
An important organizing principle in cells is the use of membraneless compartments to spatially and temporally regulate diverse biological processes. Numerous membraneless compartments have been identified in both the nucleus (e.g., nucleoli, Cajal bodies, histone locus bodies) and the cytoplasm (e.g., P-bodies, stress granules, germ granules) and are collectively referred to as biomolecular condensates. There is increasing evidence suggesting that biomolecular condensates are formed through liquid-liquid phase separation or condensation (Alberti, 2019). This occurs when proteins and/or nucleic acids in the nucleoplasm or cytoplasm coalesce or demix into a condensed phase that often resembles liquid droplets. Large nuclear condensates that are visible under light microscopy are most often referred to as nuclear bodies (NBs) and represent an important organizing feature of the nucleus (Koreski, 2020).
The histone locus body (HLB) is a conserved NB that assembles at replication-dependent (RD) histone genes and concentrates factors required for RD histone mRNA biogenesis. RD histone mRNAs are the only eukaryotic mRNAs that are not polyadenylated. The unique stem loop at the 3'-end of RD histone mRNAs results from a processing reaction requiring a specialized suite of factors, some of which are constitutively localized in the HLB (Duronio, 2017). The HLB provides a powerful system to study how NBs form and function because it contains a well-characterized set of factors involved in producing a unique class of cell-cycle-regulated mRNAs. Previous work demonstrated that concentrating factors (e.g., FLASH [FLICE-associated huge protein] and U7 snRNP) in the Drosophila melanogaster HLB is critical for efficient histone pre-mRNA processing. However, a full understanding of how the HLB participates in histone mRNA biosynthesis requires knowledge of HLB assembly at the molecular level (Koreski, 2020).
Prior studies of NBs have provided several important assembly concepts that are applicable to the HLB. Many NB components have an intrinsic ability to self-associate, an observation leading to two different models of NB assembly: (1) interactions among NB components occur stochastically, wherein individual factors can be recruited to the body in any order; or (2) components assemble in an ordered or hierarchical pathway, wherein the recruitment of components is predicated on prior recruitment of others. The HLB appears to employ a hybrid version of these two possibilities. For example, genetic loss of function experiments suggest a partially ordered assembly pathway of the Drosophila HLB with some components being required for subsequent recruitment of others. The scaffolding protein Mxc (multi sex combs), the Drosophila ortholog of human NPAT (nuclear protein, ataxia-telangiectasia locus), and FLASH likely form the core of the HLB and are required for subsequent recruitment of other factors (White, 2011). Tethering experiments in mammalian cells indicate that ectopic HLB formation also may be induced by several different HLB components, supporting a stochastic model of assembly (Koreski, 2020).
The initiation event in self-organizing NB assembly is the key step in the process and is not well understood. A prevalent model postulates a 'seeding' event that initiates the nucleation of critical components that form a platform for further recruitment of other components. In some instances, RNA is thought to help seed NB assembly, and NBs such as the nucleolus and ectopic paraspeckles can form at sites of specific transcription. Blocking transcription prevents complete HLB assembly in both zebrafish and flies. However, the HLB is present at RD histone genes even in G1 when the genes are not active, raising the possibility that histone genes themselves participate in seeding HLB assembly (Koreski, 2020).
In Drosophila, the RD histone genes are present at a single locus with ~100 copies of a tandemly arrayed 5-kb repeat unit, each of which contains one copy of the divergently transcribed H2a-H2b and H3-H4 gene pairs as well as the gene for linker histone H1. Using transgenes containing a wild-type or mutant derivative of a single histone repeat, previous work demonstrated that the bidirectional H3-H4 promoter stimulated HLB assembly and transcription of the single histone repeat in salivary glands (Salzler, 2013). It was subsequently shown that the conserved GAGA repeat elements present in the H3-H4 promoter region are targeted by the zinc-finger transcription factor CLAMP (chromatin-linked adaptor for MSL proteins), and that this interaction promotes HLB assembly (Rieder, 2017). Thus, the H3-H4 promoter region might act to seed HLB assembly (Koreski, 2020).
This work leveraged transgenic histone gene arrays to test whether the H3-H4 promoter region is necessary for in vivo function of the RD histone locus. Replacement of H3-H4 promoters with H2a-H2b promoters was shown to result in an attenuated transgenic histone gene array that does not function in the presence of the intact endogenous RD histone locus, but surprisingly provides full in vivo function, including normal HLB assembly and histone gene expression, when the endogenous RD histone locus is absent. These results suggest that multiple elements in the histone genes and core HLB proteins are involved in HLB assembly (Koreski, 2020).
This study used a histone gene replacement platform to analyze the cis acting elements within the Drosophila histone repeat unit that are necessary for HLB formation and histone gene expression. Previously it was shown using a single, transgenic histone gene repeat unit that the promoter region of the divergently transcribed H3-H4 gene pair is capable of stimulating HLB formation (Salzler, 2013). Subsequently this functionality was further mapped using a 12x gene array to conserved GAGA repeats in this region that are targeted by the CLAMP protein (Rieder, 2017). This study presents evidence that a 12xPR histone gene array devoid of the H3-H4 promoter and lacking any CLAMP binding elements cannot assemble an HLB in the presence of the ∼100 RD histone gene copies at the endogenous locus (HisC). However, the 12xPR array surprisingly can rescue homozygous deletion of HisC and fully support the entire Drosophila life cycle. In the HisC deletion background, the 12xPR array assembles an HLB and expresses the same amount of properly processed histone mRNAs as the endogenous genes or as a 12xDWT wild-type array. Below the implications of these observations on HLB assembly and organization are discussed (Koreski, 2020).
Biomolecular condensates form via a seeding event that promotes a high concentration of factors at a discrete location, leading to recruitment of additional factors that ultimately result in a structure that can be observed by light microscopy. A number of putative seeding events for biomolecular condensates have been described, but in many cases the precise mechanism of seeding is not known. Nucleic acids, particularly RNA, have been proposed to seed different NBs. Both the nucleolus and the HLB are associated with specific genomic loci, and it is likely that the DNA (or chromatin) and/or nascent RNA at the locus participates in the seeding event. The activation of zygotic transcription of rRNA leads to the precise spatiotemporal formation of the nucleolus in Drosophila embryos (Falahati, 2016). In the absence of rDNA, Drosophila nucleolar components still form high concentration assemblies, but these are smaller, more numerous, and do not form at the same time in the early embryo as the wild-type nucleolus (Koreski, 2020).
Drosophila HLB components also stochastically assemble smaller and more unstable foci in embryos lacking the RD histone locus, suggesting that HLBs and the nucleolus form through similar seeding events. Indeed, the dynamics of HLB assembly in single early Drosophila embryos display properties consistent with liquid-liquid phase transition seeded by HisC (Hur, 2020). Blocking transcription in the early embryo prevents normal HLB growth (Hur, 2020), and a defective H3-H4 promoter (with mutated TATA boxes) does not support HLB formation in the context of a single copy histone gene repeat in salivary glands (Salzler, 2013). These data suggest that active transcription is essential for forming a complete HLB (Koreski, 2020).
It is important to note that HLBs assemble and persist in nonproliferating Drosophila tissues that do not express histone mRNAs and are also present in G0/G1 mammalian cells. Histone gene expression is activated as a result of phosphorylation of Mxc/NPAT by Cyclin E/Cdk2, resulting in changes in the HLB that promote histone gene transcription and pre-mRNA processing. It is proposed that in early embryonic development the histone locus DNA and/or chromatin seeds HLB assembly in Drosophila, with the H3-H4 promoter region being particularly important. It is further proposed that subsequent transcriptional activation of histone genes then drives HLB growth and maturation (Koreski, 2020).
Formation of an HLB on a transgenic RD histone gene array requires that this array compete effectively with the endogenous HisC locus for recruitment of HLB components. This is the situation with 12xHWT and 12xDWT arrays, which form HLBs in the presence of HisC. These results also indicate that there are no other elements within HisC that are necessary for HLB formation. Because the 12xPR array does not form an HLB in the presence of HisC, but it does so in the absence of HisC, it is hypothesized that the endogenous RD histone gene array sequesters critical HLB components, likely including Mxc and CLAMP, thereby preventing HLB assembly at the transgenic locus. By removing the H3-H4 promoter from the transgene, an element was removed that provided additional interactions with HLB components, notably CLAMP, weakening the overall ability of the locus to stably nucleate an HLB (Koreski, 2020).
Interactions among multivalent proteins, or multivalent protein-nucleic acid interactions, are driving forces in the assembly of biomolecular condensates. Mxc is likely the critical factor that together with histone genes seeds Drosophila HLB formation and activates histone gene expression. Mxc is a large (~1800 aa) protein that oligomerizes in vivo and likely provides a scaffold for multivalent interactions. A C-terminal truncation mutant of Mxc that fails to recruit histone pre-mRNA processing factors still forms an HLB and activates histone gene expression at sufficient levels to complete development, underscoring the multivalent nature of Mxc (Koreski, 2020).
Surprisingly, the HLB that assembles on the 12xPR array in the absence of HisC contains CLAMP, even though this study removed all of the known CLAMP binding sites from the histone repeat. Although CLAMP may bind another sequence in the 12xPR array, no other favorable GAGA repeats are present, and it was not possible to detect CLAMP bound to any other location in the histone array by ChIP-qPCR and ChIP-seq experiments. More likely, CLAMP interacts with other HLB components, possibly Mxc or the Mxc-FLASH complex, providing multivalent contacts between CLAMP and other HLB components. Deleting the GAGA sequences from the H3-H4 promoter did not affect transcription of the H3 or H4 genes in the absence of HisC, suggesting that CLAMP's major function is to promote HLB assembly and not to act as a canonical DNA binding transcription factor. Supporting this interpretation is the observation that another, more abundant transcription factor that binds to GAGA repeats, GAF, is not found at the HLB unless CLAMP is absent (Rieder, 2017), consistent with CLAMP's critical interactions with both the GAGA repeats and the HLB factors in seeding the HLB (Koreski, 2020).
Because the 12xPR array is capable of assembling a completely functional HLB in the absence of HisC, the H3-H4 promoter is not absolutely essential for HLB formation. One possibility is that there are multiple pathways for assembling functional HLBs. Previous work suggests that not all seeding events are equivalent in their ability to assemble biomolecular condensates. In artificial systems, changes in scaffold stoichiometry, which can stem from changes in valence, alter the recruitment of components. Further, mathematical modeling has revealed that scaffolds can nucleate distinct complexes when at different concentrations, and that this can qualitatively alter the transcriptional output. Additionally, P-bodies can form in multiple ways through different protein-protein or protein-nucleic acid interactions, with different interactions predominating under different conditions. Therefore, different nucleators of the HLB (i.e., the H3-H4 promoter or other sequences in the locus) may result in similar but not identical outcomes. Collectively these results suggest that HLB formation results from the contribution of many molecular interactions, and the loss of any single one may be overcome by other multivalent interactions within the body (Koreski, 2020).
MSL2, the DNA-binding subunit of the Drosophila dosage compensation complex, cooperates with the ubiquitous protein CLAMP to bind MSL recognition elements (MREs) on the X chromosome. This study explored the nature of the cooperative binding to these GA-rich, composite sequence elements in reconstituted naive embryonic chromatin. It was found that the cooperativity requires physical interaction between both proteins. Remarkably, disruption of this interaction does not lead to indirect, nucleosome-mediated cooperativity as expected, but to competition. The protein interaction apparently not only increases the affinity for composite binding sites, but also locks both proteins in a defined dimeric state that prevents competition. High Affinity Sites of MSL2 on the X chromosome contain variable numbers of MREs. The cooperation between MSL2/CLAMP was not influenced by MRE clustering or arrangement, but happens largely at the level of individual MREs. The sites where MSL2/CLAMP bind strongly in vitro locate to all chromosomes and show little overlap to an expanded set of X-chromosomal MSL2 in vivo binding sites generated by CUT&RUN. Apparently, the intrinsic MSL2/CLAMP cooperativity is limited to a small selection of potential sites in vivo. This restriction must be due to components missing in the reconstitution, such as roX2 lncRNA (Eggers, 2023).
The experimental strategy of this study involves the assembly of physiological chromatin on genomic DNA to provide a complex substrate for TF interaction assays. Since the assembly system is derived from Drosophila preblastoderm embryos, the chromatin that arises has all hallmarks of pre-MBT embryos: it does not contain significant concentrations of endogenous TFs or components of the transcription machinery. The addition of defined TFs somewhat mimics aspects of zygotic genome activation, when newly translated proteins initiate the complex genome expression program. The analysis of large compendia of potential TF binding sites in different genomic contexts allows statistical generalization, in addition to anecdotal illustration (Eggers, 2023).
This experimental system is well suited to discovering intrinsic protein-DNA interactions of well-characterized components. These may even include heterologous TFs of other species that bind short recognition motifs statistically distributed in Drosophila genomes. Characterizing the intrinsic binding properties of TFs in a complex chromatin environment is important, even if TF binding specificities commonly observed in vitro do not explain physiological chromosome interactions (Eggers, 2023).
The focus of this study has been on the DNA binding proteins that target the 'male-specific-lethal' (MSL) dosage compensation complex (DCC) exclusively to X chromosome-specific DNA elements to deploy the vital activation of transcription. In vivo, the MSL proteins do not bind autosomal DNA. This study found that (i) MSL2, the DNA-binding subunit of the DCC, bears intrinsic sequence selectivity for a complex X chromosome-specific DNA signature, which is termed PionX sites; (ii) MSL2 cooperates with the abundant GA-binding protein CLAMP to associate with bona fide high affinity sites (HAS) on the X chromosome in vivo. In vitro this cooperation results in the recruitment of MSL2 to hundreds of strong CLAMP binding sites on all chromosomes; (iii) the GA-binding GAF competes with CLAMP/MSL2, preventing binding to a class of unphysiological sites. This study investigated the mechanism underlying the CLAMP-MSL2 cooperativity (Eggers, 2023).
The cooperativity between CLAMP and MSL2 was found to depend on the physical interaction between the two proteins. Although a protein complex is formed in solution, it is possible or even likely that the interaction is promoted by DNA. It is expected that upon disruption of the TF interaction the direct cooperativity would turn into indirect, nucleosome-mediated cooperativity, the dominant mechanism of synergism between TFs. Surprisingly, it was found that under those conditions, the stronger GA-binder, CLAMP, competes with MSL2 for binding to composite sites. These sites are unusually long: MEME detects PWMs of 20 bp and more in peaks, so there should be space for two or more proteins to bind side by side. This is in line with the comprehensive study of Taipale and colleagues, who showed that often cooperating TFs bind composite sequence elements where individual TF recognition motifs overlap (Jolma, 2015). These findings argue that the physical interaction not just serves to provide additional surfaces to reduce the off-rate of TFs, but in this case assures that the two cooperating factors bind DNA in a defined arrangement that assures that the long, composite binding sites are fully used and to prevent dominance by the more avid GA-binder CLAMP, which outcompetes MSL2 if not physically connected to it (Eggers, 2023).
Since CLAMP binds degenerate GA repeats, it is possible that two CLAMP molecules can bind to these long sequences, or that single CLAMP molecules bind with variable translational positions, thus occluding GA sequences required for MSL2 interaction. Accordingly, MSL2 can resist competition at sites that contain the PionX signature, but will be competed away from sites of lower affinity that contain degenerate GA repeats. In such a situation, the physical interaction between CLAMP ZF1 and the CBD of MSL2 may ensure that both proteins lock into a defined geometry that prevents competition (Eggers, 2023).
MSL2 and CLAMP mutually cooperate to bind stably to X-chromosomal HAS in vivo. However, the chromosomal MSL2 binding profile only overlaps to a low extent with the pattern of MSL2-CLAMP binding in vitro. In some ways, this is not surprising given that MSL2 functions as a subunit of the large DCC. It is possible that MSL1-induced dimerization improves the selectivity of DNA recognition. The DCC also contains the long, non-coding roX RNA. Indeed, several studies concluded that the presence of roX RNA is required for faithful localization of MSL2 on the X chromosome. This study found MSL2 delocalized to strong CLAMP binding sites on all chromosomes in the absence of roX RNA in vivo. Although the underlying mechanism is unknown, it is noted that roX2 interacts with the C-terminus of MSL2, which also harbors the intrinsic GA-binding specificity of MSL2 as well as the CBD. It was speculated that roX may modulate the DNA and protein interactions of MSL2, perhaps by restricting the cooperativity with CLAMP (Eggers, 2023).
Remarkably, Rieder (2019) recently observed that at nuclear division cycle 13, prior to the expression of roX2 and the assembly of a mature DCC, MSL2 and CLAMP localize to CLAMP binding sites on all chromosomes. This study hypothesized that the binding profile in reconstituted preblastoderm chromatin might reflect some of these earliest MSL2 interactions with chromosomes prior to roX2 expression. However, a comparison of the profiles did not support this hypothesis (Eggers, 2023).
Zygotic gene activation is also accompanied by the appearance of the linker histone H1, which may modulate TF cooperativity. This possibility was explored in earlier work, but it was found that genome-wide incorporation of H1 did not affect the cooperativity of CLAMP and MSL2 qualitatively, but led to a general dampening of all TF binding to chromatin (Eggers, 2023).
Taking advantage of the superior resolution and signal/noise output of C&R analyses, MSL2 binding sites were determined in S2 cells. More than 700 distinct peaks of MSL2 interaction were obtained, all of which were localized on the X chromosome. Stratifying the peaks according to their signal strength reveals a hierarchy of sequences. Motifs with 5' extension, resembling the PionX motif, ranged among the sites with highest affinity, followed by sites of intermediate affinity that display the MRE consensus sequence. Weaker binding sites contain degenerate MRE motifs dominated by GA repeats. These data support earlier conclusions from low-resolution studies on polytene chromosomes about the existence of a hierarchy of binding sites, where sites of higher affinity are primary attractants for MSL proteins and sites of lower affinity profit from local TF enrichment (Eggers, 2023).
The X-chromosomal HAS or 'chromosome entry sites, CES' were initially defined at low resolution on polytene chromosomes. With the advent of the ChIP-seq methodology it became clear that the peaks of DCC binding are several hundred base pairs long and often contain two or more MRE sequences. MEME analysis often focuses only on the strongest MRE within a larger ChIP-seq peak. Intuitively, loci with multiple MREs may attract more MSL2 than sites with a single MRE. It is also conceivable that higher affinity sites have a defined arrangement of sites, which may facilitate cooperative interactions between MSL complexes bound to different elements. However, these questions have not been systematically investigated (Eggers, 2023).
With respect to the diversity of MRE number and arrangement, which may or may not include PionX motifs, the intrinsic binding sites obtained in vitro and the physiological X chromosomal binding sites appear very similar. In both cases, no strong correlation was found between the number, length or arrangement of MRE sequences with ChIP-seq peak intensity. It is concluded that the binding strength is mainly determined by the affinity of MSL2 for individual MRE motifs, where the cooperativity with CLAMP plays an important role. The predominant binding event can be inferred from EChO analyses. MSL2 appears particularly dependent on this cooperation, since embryonic nuclei only contain very limited amounts of MSL2, barely enough to distribute one copy for each binding site mapped by C&R (Eggers, 2023).
It has been suggested that MREs originally evolved from pyrimidine-rich splicing enhancers in introns and were not under selection for precise location, but rather for proximity to genes. The finding that MSL2/CLAMP binding does not require a defined architecture of binding sites resonates with the observation that defined spacing and orientation of multiple binding sites are commonly not needed to assure integration of different TF input (Eggers, 2023).
In conclusion, these analyses show that the direct contact between MSL2 and CLAMP ensures cooperativity and avoids competition between the two proteins at individual MREs. Remarkably, in vivo this intrinsic mechanism only operates at X chromosomal sites where the DCC is functional. Future research will focus on finding the molecular mechanism that restricts this cooperative binding to the X chromosome (Eggers, 2023).
Dosage compensation is an essential process that equalizes transcript levels of X-linked genes between sexes by forming a domain of coordinated gene expression. Throughout the evolution of Diptera, many different X-chromosomes acquired the ability to be dosage compensated. Once each newly evolved X-chromosome is targeted for dosage compensation in XY males, its active genes are upregulated two-fold to equalize gene expression with XX females. In Drosophila melanogaster, the CLAMP zinc finger protein links the dosage compensation complex to the X-chromosome. However, the mechanism for X-chromosome identification has remained unknown. This study combined biochemical, genomic and evolutionary approaches to reveal that expansion of GA-dinucleotide repeats likely accumulated on the X-chromosome over evolutionary time to increase the density of CLAMP binding sites, thereby driving the evolution of dosage compensation. Overall, this study presents new insight into how subtle changes in genomic architecture, such as expansions of a simple sequence repeat, promote the evolution of coordinated gene expression (Kuzu, 2016).
Upon the evolution of heterogametic species, the process of dosage compensation became essential to ensure the appropriate balance of gene expression between males and females and the X and autosomes. Distinguishing the X-chromosome from autosomes is the key step in this process because MSL complex must be targeted to the correct chromosome to ensure the fidelity of dosage compensation. This study has demonstrated that in several species this process likely involved enriching the evolving X-chromosomes for long GA-repeat binding sites that can be recognized by the highly conserved CLAMP protein that recruits MSL complex (Kuzu, 2016).
CLAMP binding sites are not X-specific as the CLAMP protein binds to similar GA-rich sequences all over the genome. It is proposed that a higher density of sites within CES that contain longer GA-repeats evolved to optimize CLAMP binding on X to better target MSL complex for dosage compensation. Then, it is likely that the increased density of CLAMP at CES functions together with other cofactors with known roles in MSL complex recruitment such as H3K36me3 and roX RNAs. Once this initial process of X-chromosome identification occurs, synergistic interactions between maternally loaded CLAMP and the MSL complex increase the X-enrichment of both factors (Kuzu, 2016).
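The notion of "longer GA-repeats" at a "higher density" can be made concrete with a small scan for runs of consecutive GA dinucleotides. The following is only a minimal sketch under stated assumptions: the function names, the `min_units` threshold and the toy sequence are hypothetical, only the given strand is scanned, and this is not the analysis pipeline used by Kuzu (2016).

```python
import re

def ga_repeat_runs(seq, min_units=2):
    """Return (start, units) for each maximal run of at least min_units GA dinucleotides.

    Only the given strand is scanned; the reverse strand (TC runs) is ignored.
    """
    pattern = re.compile(r"(?:GA){%d,}" % min_units)
    return [(m.start(), (m.end() - m.start()) // 2)
            for m in pattern.finditer(seq.upper())]

def ga_run_density(seq, min_units=2):
    """Number of qualifying GA runs per kilobase of sequence."""
    if not seq:
        return 0.0
    return 1000.0 * len(ga_repeat_runs(seq, min_units)) / len(seq)

# Hypothetical toy sequence, not taken from any real locus.
toy = "TTGAGAGAGACCGATTGAGATT"
print(ga_repeat_runs(toy))  # [(2, 4), (16, 2)]
```

Comparing `ga_run_density` between X-linked and autosomal windows, with a larger `min_units`, would be one simple way to ask whether long repeats are enriched on one chromosome.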
Interestingly, the CLAMP motif is much longer than most transcription factor binding sites. It is possible that the length of the CLAMP binding site ensures specificity by reducing the promiscuity of its binding and allowing it to compete with other similar proteins. In addition, recent work on transcriptional regulators in budding yeast has implicated the sequence context of transcription factor binding sites outside of the core binding site as critical for the recognition process. Therefore, current approaches to identifying transcription factor binding site motifs have likely underestimated their length because they often allow detection of only short motifs. In the future, it will be important to determine transcription factor recognition motifs using approaches like gcPBM that use in vivo sequences to identify direct binding site motifs (Kuzu, 2016).
There are several mechanisms by which the GA-repeat number could have been increased including expansions due to slippage of DNA polymerase. Helitron transposons containing GA-rich sequences have also been implicated in the X-enrichment of these sequences in D. miranda. It is possible that expansions of GA dinucleotides occurred within these transposons after they landed on the X-chromosome. These GA-repeat expansions could have been further propagated by gene conversion events that also occurred during the evolution of dosage compensation. Finally, long repeat sequences such as the 1.688 elements that produce siRNAs function during dosage compensation via an unknown mechanism (Menon, 2014). Therefore, it is possible that GA-repeat elements have been expanded over evolutionary time because of a general role in promoting dosage compensation. To support this hypothesis, a recent report identified GA-rich binding motifs almost identical to those characterized in this study as CLAMP binding sites within the strongest MSL complex binding sites in three additional Drosophila species (Kuzu, 2016).
Motifs that contain GA-repeats have been implicated in diverse processes that all involve generating open chromatin regions. GA-repeat containing motifs are highly enriched at sites that promote pausing of RNA Polymerase II and at developmentally regulated DNase I hypersensitivity sites. Furthermore, a GA-repeat motif is one of the two motifs that are enriched at genes that are activated first during the maternal to zygotic transition. The well-studied GAGA factor (GAF) protein also recognizes similar sequences to the CLAMP protein and has been implicated in pausing of RNA Polymerase II and opening of chromatin. Overall, it is likely that the dosage compensation machinery has evolved to take advantage of targeting GA-repeats that mark open chromatin regions to ensure that it only identifies active genes for further transcriptional upregulation by the MSL complex (Kuzu, 2016).
It is possible that GA-rich sequences have roles in dosage compensation outside of Diptera. For example, it has been proposed that upregulation of the single active X occurs in mammals and this process is mediated by targeting the conserved MOF histone acetyltransferase component of MSL complex. Moreover, GA-repeats were found to be significantly enriched within regions of the X-chromosome that escape X-inactivation (X escape regions). There are no strong homologues of CLAMP in mammals but there are several possible functional orthologs such as the ETS family transcription factor GABP1 (GA binding protein-1). Furthermore, in C. elegans, there is an early upregulation of both X-chromosomes that is also mediated by the MOF histone acetyltransferase. One of the zinc finger proteins that targets the C. elegans dosage compensation machinery is SCC-2 (sister chromatid cohesion-2) which recognizes a GA-repeat sequence very similar to the CLAMP binding motif. Therefore, it is possible that GA-repeats are involved in dosage compensation beyond Diptera and this will be an exciting area for future investigation (Kuzu, 2016).
The Drosophila male-specific lethal (MSL) dosage compensation complex increases transcript levels on the single male X chromosome to equal the transcript levels in XX females. However, it is not known how the MSL complex is linked to its DNA recognition elements, the critical first step in dosage compensation. This study demonstrated that a previously uncharacterized zinc finger protein, CLAMP (chromatin-linked adaptor for MSL proteins), functions as the first link between the MSL complex and the X chromosome. CLAMP directly binds to the MSL complex DNA recognition elements and is required for the recruitment of the MSL complex. The discovery of CLAMP identifies a key factor required for the chromosome-specific targeting of dosage compensation, providing new insights into how subnuclear domains of coordinate gene regulation are formed within metazoan genomes (Soruco, 2013).
To determine whether CLAMP and the MSL complex colocalize at 'chromatin entry sites' (CESs) in vivo, CLAMP chromatin immunoprecipitation (ChIP) sequencing (ChIP-seq) experiments were performed in male SL2 cells and CLAMP occupancy profiles were compared with available MSL complex occupancy profiles. CLAMP occupancy was found at many loci throughout the genome (X: 2695; autosomes: 10,009) and strong colocalization was found with MRE sequences. Moreover, CLAMP and the MSL complex exhibit their highest occupancy levels when colocalized. Nonoverlapping CLAMP occupancy sites occur at 5' ends of active genes on all chromosomes, and nonoverlapping MSL-binding sites are localized over the bodies of active X-linked genes. Furthermore, CLAMP sites occur precisely over MREs within CESs compared with the broader domains of MSL complex occupancy. Together, these patterns are consistent with a role for CLAMP in recruiting the MSL complex to CESs, followed by spreading of the MSL complex to the bodies of active genes and additional non-sex-specific roles for CLAMP at 5' ends of active genes (Soruco, 2013).
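The distinction between colocalized and nonoverlapping occupancy sites comes down to interval overlap between two peak sets. The sketch below illustrates that idea only; the function names and coordinates are hypothetical, all intervals are assumed to lie on one chromosome and to be half-open (start, end) pairs, and it does not reproduce the peak-calling or comparison procedure of Soruco (2013).

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]

def split_by_overlap(peaks, others):
    """Partition peaks into those overlapping any interval in others, and the rest."""
    shared, unique = [], []
    for peak in peaks:
        if any(overlaps(peak, other) for other in others):
            shared.append(peak)
        else:
            unique.append(peak)
    return shared, unique

# Hypothetical coordinates for illustration only.
clamp_peaks = [(100, 200), (500, 600), (900, 950)]
msl_domains = [(150, 400), (940, 2000)]

shared, clamp_only = split_by_overlap(clamp_peaks, msl_domains)
print(shared)      # [(100, 200), (900, 950)]
print(clamp_only)  # [(500, 600)]
```

The pairwise comparison is quadratic in the number of peaks, which is adequate for a sketch; sorted or indexed intervals would be the usual choice for genome-scale peak sets.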
Next, the MEME software package was used to generate a position weight matrix (PWM) representing the most enriched sequence within 500 regions surrounding CLAMP occupancy sites (2 kb each) at the 5' and 3' ends of genes genome-wide. All CLAMP localization motifs on the X chromosome and on autosomes shared an 8-bp GA-rich core region that is very similar to the previously identified MRE consensus sequence. Therefore, it was hypothesized that CLAMP functions as an MRE recognition factor (Soruco, 2013).
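For readers unfamiliar with position weight matrices, the sketch below shows how a log-odds PWM can be computed from sites that are already aligned, and how a candidate sequence is scored against it. This is not MEME, which also discovers the motif and its alignment; the aligned sites, the pseudocount and the uniform background are hypothetical assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM (one dict per position) from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        column = [site[i].upper() for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def score(pwm, seq):
    """Sum of per-position log-odds for seq, which must match the PWM length."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must equal PWM length")
    return sum(column[base] for column, base in zip(pwm, seq.upper()))

# Hypothetical aligned 8-bp GA-rich sites, not real CLAMP or MRE data.
sites = ["GAGAGAGA", "GAGAGAGA", "GAGAGCGA", "GAGAGAGA"]
pwm = position_weight_matrix(sites)
print(score(pwm, "GAGAGAGA") > score(pwm, "TTTTTTTT"))  # True
```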
To test this hypothesis, the in vitro DNA-binding specificity of CLAMP was characterized. Protein-binding microarrays (PBMs) were used to assay CLAMP binding to all possible 10-bp sequences. Each PBM is a dsDNA array that contains all possible 10-bp sequences embedded within variable flanking sequences. After testing the complete seven-finger CLAMP construct and many additional combinations of zinc fingers, it was possible to express soluble protein for the C-terminal six-finger and four-finger zinc finger regions of CLAMP. The in vitro PBM analysis for these proteins yielded statistically significant 8-bp motifs with high PBM enrichment, in excellent agreement with the CLAMP in vivo binding-site motif and the previously defined MRE consensus. Therefore, CLAMP binds directly to the MRE motif in vitro (Soruco, 2013).
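The phrase "all possible 10-bp sequences" implies a search space of 4^10 = 1,048,576 sequences. On double-stranded DNA a sequence and its reverse complement describe the same site, so for even k the number of distinct sites is (4^k + 4^(k/2))/2, which gives 524,800 for k = 10. The sketch below checks this arithmetic by enumeration for small k; it illustrates the counting only, and the function names are hypothetical and say nothing about how the arrays themselves were designed.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def canonical_kmers(k):
    """One representative (the lexicographically smaller) per reverse-complement pair."""
    seen = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)

def count_distinct_ds_sites(k):
    """Closed-form count of k-mers up to reverse complement, valid for even k."""
    if k % 2:
        raise ValueError("closed form used here assumes even k")
    return (4 ** k + 4 ** (k // 2)) // 2

print(len(canonical_kmers(2)))      # 10
print(count_distinct_ds_sites(10))  # 524800
```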
Next, whether MREs are required for the recruitment of CLAMP in vivo in male and female larvae was examined using available stocks with three intact or mutated MREs embedded within a minimal 150-bp fragment of CES5C2 inserted site specifically into an autosomal location. Occupancy of both the MSL complex and CLAMP is greater on the X chromosome compared with the autosomal insertion. The autosomal insertion with three intact MREs shows greater occupancy of CLAMP than that in which the 8-bp core of each of the three MREs is mutated. Therefore, MREs are required for CLAMP recruitment in both males and females, as expected for a non-sex-specific factor (Soruco, 2013).
To determine whether enrichment at CESs is specific to CLAMP or is a common feature shared with other factors that can recognize MRE-like sequences, a well-characterized Drosophila DNA-binding protein, the GAGA factor (GAF), was examined. In contrast to CLAMP, which is strongly enriched at CESs, GAF occupancy was lower at CESs compared with other GAF-binding sites throughout the genome. These data suggest that CLAMP, but not GAF, directly recognizes MRE elements within CESs. In addition, CLAMP was recently shown to physically associate with the MSL complex on chromatin using an approach that combined ChIP with mass spectrometry (Wang, 2013). Therefore, CLAMP directly recognizes MREs and physically associates with the MSL complex, providing the first link between the MSL complex and the X chromosome (Soruco, 2013).
Previous studies demonstrated that CLAMP is required for recruitment of the MSL complex to five CESs in SL2 male tissue culture cells (Larschan, 2012). To determine whether CLAMP is required for MSL complex recruitment to the entire X chromosome in flies, a publicly available RNAi transgenic Drosophila line was used to reduce CLAMP mRNA levels. Because CLAMP is highly expressed in all tissues, the RNAi construct was expressed ubiquitously. Although the CLAMP RNAi construct did not completely eliminate CLAMP mRNA levels, immunostaining of polytene chromosomes for CLAMP and CLAMP ChIP occupancy were strongly reduced. CLAMP RNAi causes a pupal lethal phenotype in both males and females (Larschan, 2012; Soruco, 2013 and references therein).
Next, ChIP for the MSL2 core component was performed to examine MSL complex recruitment to five CESs in the presence and absence of CLAMP RNAi treatment. The results showed that CLAMP RNAi strongly reduced the occupancy of the MSL complex in male larvae. H3 occupancy was measured by ChIP to determine whether CLAMP RNAi globally disrupts chromatin and only modest changes were found after CLAMP RNAi. Also, polytene staining for the MSL3 protein was performed after CLAMP RNAi and no detectable MSL3 staining was observed, revealing that CLAMP is required for MSL complex recruitment to the entire X chromosome. Thus, CLAMP promotes MSL complex localization to the entire X chromosome in flies (Soruco, 2013).
It was previously shown that loss of targeting of the MSL complex reduces complex stability, likely due to ubiquitylation of complex members by MSL2. Therefore, measuring MSL protein levels after CLAMP RNAi would not distinguish between effects on complex stability and targeting to the X chromosome. However, it has been demonstrated previously that CLAMP RNAi does not affect mRNA levels of MSL complex components (Larschan, 2012), and CLAMP is strongly enriched at 92% of CESs, suggesting that indirect effects on MSL complex stability are unlikely (Larschan, 2012; Soruco, 2013 and references therein).
The MSL complex is highly enriched on the X chromosome compared with autosomal controls but mediates at most a twofold increase in transcription. Furthermore, only a 1.4-fold decrease in expression of X-linked genes is seen after MSL2 RNAi in male SL2 cells. To determine how CLAMP regulates gene expression globally, mRNA sequencing (mRNA-seq) experiments were conducted in male SL2 cells before and after CLAMP RNAi treatment. As expected for a non-sex-specific factor present on both the X chromosome and autosomes, CLAMP RNAi caused changes in gene expression genome-wide. However, CLAMP RNAi caused a greater decrease in the expression of X-linked genes compared with autosomal genes, consistent with a function in MSL complex targeting. In summary, consistent with its localization throughout the genome and enrichment on the X chromosome, CLAMP RNAi affects gene expression on both the X chromosome and autosomes with a bias toward X-linked genes (Soruco, 2013).
In contrast to average X-linked genes, the genes that encode the roX noncoding RNA components are highly dependent on the MSL complex. Therefore, this study measured the levels of roX RNAs before and after CLAMP RNAi treatment by quantitative RT-PCR (qRT-PCR) in male and female third instar larvae. In males, roX2 levels were reduced fourfold after CLAMP RNAi, indicating that CLAMP serves as an activator of roX2 expression. In contrast, roX1 levels were largely unchanged in males after CLAMP RNAi. In females, roX1 was significantly increased after CLAMP RNAi treatment (6.5-fold), and roX2 levels remained largely unchanged. These patterns of CLAMP-mediated regulation agree with the established differential regulation of roX1 and roX2 and recent genetic evidence implicating roX2 as the first site of X identification (Soruco, 2013).
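Fold changes from qRT-PCR are often derived from cycle-threshold (Ct) values with the 2^-ΔΔCt method. The text does not state which quantification method was used, so the following Python sketch is only an illustration of that common approach. The function name, the reference-gene normalization, and the Ct values are assumptions chosen so that the result matches a fourfold reduction; they are not measurements from the study.

```python
def fold_change(ct_target_rnai, ct_ref_rnai, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (RNAi vs. control) by the 2^-ddCt method.

    Assumes roughly 100% primer efficiency, i.e. product doubles each cycle.
    """
    delta_rnai = ct_target_rnai - ct_ref_rnai   # normalize to reference gene
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2 ** -(delta_rnai - delta_ctrl)

# Illustrative Ct values only (not data from the study): the target crosses
# threshold two cycles later after RNAi, with an unchanged reference gene.
relative_level = fold_change(24.0, 18.0, 22.0, 18.0)
print(relative_level)  # a value of 0.25 corresponds to a fourfold reduction
```

A shift of two cycles in the normalized Ct therefore corresponds to the fourfold change reported for roX2 in males.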
CLAMP is enriched on the X chromosome independent of the MSL complex
Identifying CLAMP as the protein that directly recognizes MREs provides a key opportunity to reveal new insight into the process of X identification. It was hypothesized that X enrichment of CLAMP independent of the MSL complex would indicate that CLAMP could function as a beacon on the X chromosome to distinguish it from autosomes (Soruco, 2013).
To determine how the MSL complex regulates CLAMP occupancy, a comparison was made of CLAMP occupancy at MREs on the X chromosome and on autosomes in male cells before and after MSL2 RNAi and in female cells that lack the MSL complex. Overall, the percentage of MREs occupied by CLAMP was greater on the X chromosome when compared with autosomes in both male cells (50% on X vs. 32% on autosomes) and female cells (62% on X vs. 38% on autosomes). Therefore, CLAMP is enriched at X-chromosome MREs compared with autosomal MREs, independent of the MSL complex (Soruco, 2013).
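The X-versus-autosome comparison above reduces to a ratio of occupied fractions. A minimal Python sketch using only the percentages reported in the text; the dictionary layout and the function name are illustrative and not taken from the study:

```python
# Percentage of MREs occupied by CLAMP, as reported in the text (Soruco, 2013).
# The data layout and names below are illustrative assumptions.
occupied_pct = {
    "male": {"X": 50, "autosomes": 32},
    "female": {"X": 62, "autosomes": 38},
}

def x_enrichment(pct):
    """Ratio of X-linked to autosomal MRE occupancy."""
    return pct["X"] / pct["autosomes"]

ratios = {sex: round(x_enrichment(pct), 2) for sex, pct in occupied_pct.items()}
print(ratios)
```

Both ratios are about 1.6, and the female value, obtained in cells that lack the MSL complex, is no lower than the male value, which is the basis for calling the enrichment MSL-independent.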
Next, the dependence of CLAMP occupancy on the MSL complex was determined at MREs within CESs. To define the total number of likely CESs for subsequent analysis, the largest class of previously characterized CESs (306 loci) was compared with CLAMP sites present in male cells, and 265 common sites (87% overlap) were found. These sites were then clustered based on their overall binding pattern and occupancy level of CLAMP in male cells, male cells after hypomorphic MSL2 RNAi, and female cells. The average occupancy at each of these classes of CESs was plotted and compared with the average CLAMP site on the X chromosome and on autosomes. In this way, three distinct classes of CESs were defined. Group A: Largely MSL-dependent CESs exhibit a strongly decreased CLAMP occupancy in the absence of the MSL complex, and their occupancy in female cells is much below that of the average CLAMP site (177 sites). Group B: Partially MSL-dependent CESs, including both roX loci, exhibit higher than average levels of CLAMP occupancy in the absence of the MSL complex and a further increase in their occupancy and breadth in the presence of the MSL complex (43 sites). Group C: Largely MSL-independent CESs have higher than average CLAMP occupancy in both male and female cell lines (45 sites). ChIP-qPCR was used to validate these classes in both cell culture and larvae (Soruco, 2013).
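The counts above can be checked directly, and the three group definitions can be restated as a simple decision rule. The Python sketch below is a toy illustration only: the study assigned groups by clustering occupancy profiles, whereas `classify_ces`, its arguments, and the `dependence_ratio` threshold are hypothetical names and values introduced here.

```python
# Counts reported in the text (Soruco, 2013).
previously_characterized_ces = 306
ces_with_clamp = 265
group_sizes = {"A": 177, "B": 43, "C": 45}

overlap_pct = round(100 * ces_with_clamp / previously_characterized_ces)
groups_total = sum(group_sizes.values())
print(overlap_pct, groups_total)

def classify_ces(no_msl, with_msl, average, dependence_ratio=1.5):
    """Toy rule, NOT the clustering used in the study.

    no_msl:   CLAMP occupancy without the MSL complex (e.g. female cells)
    with_msl: CLAMP occupancy with the MSL complex (male cells)
    average:  occupancy of the average CLAMP site
    dependence_ratio: hypothetical cutoff for a "further increase" with MSL
    """
    if no_msl < average:
        return "A"  # largely MSL-dependent
    if with_msl >= dependence_ratio * no_msl:
        return "B"  # partially MSL-dependent
    return "C"      # largely MSL-independent

print(classify_ces(no_msl=0.5, with_msl=3.0, average=1.0))
```

The three group sizes sum to the 265 shared sites, and 265 of 306 rounds to the reported 87% overlap.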
Thus, group B and group C CESs, including both roX loci, have higher than average levels of MSL-independent CLAMP occupancy compared with group A sites, which have lower than average MSL-independent CLAMP occupancy. These data suggest that group B and group C CESs may function as beacons to promote X identification because they are occupied by CLAMP independent of the MSL complex. In addition, CLAMP associates with chromatin interdependently at group A and group B sites. Therefore, synergistic interactions between CLAMP and the MSL complex at group A and group B sites likely increase the occupancy of both factors (Soruco, 2013).
To define additional common features that distinguish each subclass of CESs, the following properties were compared: the number of tandem MREs within each CES, the two-dimensional clustering of CESs along the X chromosome, and the average nucleosome occupancy in the absence of the MSL complex. First, a strong increase was found in the percentage of group B and group C sites with three or more tandem MREs present within each CES locus compared with group A sites. Second, group B sites are clustered in two dimensions along the length of the X within a 5-Mb region surrounding roX2, with the majority of sites between the roX genes. In contrast, group A sites are uniformly distributed, and group C sites exhibit minimal clustering distal to the group B sites. Third, chromatin organization was examined at each subclass of sites in the absence of the MSL complex using publicly available female cell data sets. Under high-salt extraction conditions, which are likely to provide the most accurate nucleosome occupancy profiles over CESs, the decreased nucleosome density at group B and group C loci in female cells resembles that previously reported for male cells (Soruco, 2013).
In summary, the number of tandem MREs within each CES locus, two-dimensional clustering along the length of the X chromosome, and MSL-independent nucleosome occupancy are features that distinguish subclasses of CESs and were not previously identified by analysis of the average properties of all CESs. By defining three distinct subclasses of CESs, this study has provided a new tool for examining how the MSL complex identifies the X chromosome (Soruco, 2013).
The identification of CLAMP as the previously unknown link that tethers the MSL complex to the X chromosome supports the following mechanism for X identification: (1) Prior to MSL complex targeting, maternally loaded CLAMP binds directly to MREs within group B and group C CESs, including roX loci, due to clustering of tandem MREs within these loci. (2) During early embryogenesis, MSL-independent transcription of roX genes is stimulated and is likely to facilitate cotranscriptional MSL complex assembly. CLAMP is enriched at the CESs adjacent to roX loci independent of the MSL complex and tethers the MSL complex to the X chromosome. (3) The MSL complex catalyzes acetylation of H4K16 (H4K16ac), thereby opening chromatin and increasing the accessibility of MRE sequences for recognition by CLAMP. Synergistic interactions between the MSL complex and CLAMP are likely to contribute to the high occupancy of both factors at the roX loci, followed by spreading to additional CESs. Through this proposed mechanism, it is suggested that CLAMP functions together with roX RNAs to target the MSL complex to the X chromosome. The high degree of conservation of both CLAMP and MREs across Drosophila species (Alekseyenko, 2013) suggests that they are key components of a conserved mechanism for establishing an active chromatin domain within a metazoan genome (Soruco, 2013).
References
Alberti, S., Gladfelter, A. and Mittag, T. (2019). Considerations and challenges in studying liquid-liquid phase separation and biomolecular condensates. Cell 176(3): 419-434. PubMed ID: 30682370
Alekseyenko, A. A., Ellison, C. E., Gorchakov, A. A., Zhou, Q., Kaiser, V. B., Toda, N., Walton, Z., Peng, S., Park, P. J., Bachtrog, D. and Kuroda, M. I. (2013). Conservation and de novo acquisition of dosage compensation on newly evolved sex chromosomes in Drosophila. Genes Dev 27(8): 853-858. PubMed ID: 23630075
Duronio, R. J. and Marzluff, W. F. (2017). Coordinating cell cycle-regulated histone gene expression through assembly and function of the Histone Locus Body. RNA Biol 14(6): 726-738. PubMed ID: 28059623
Eggers, N., Gkountromichos, F., Krause, S., Campos-Sparr, A. and Becker, P. B. (2023). Physical interaction between MSL2 and CLAMP assures direct cooperativity and prevents competition at composite binding sites. Nucleic Acids Res. PubMed ID: 37602401
Falahati, H., Pelham-Webb, B., Blythe, S. and Wieschaus, E. (2016). Nucleation by rRNA dictates the precision of nucleolus assembly. Curr Biol 26(3): 277-285. PubMed ID: 26776729
Hur, W., Kemp, J. P., Jr., Tarzia, M., Deneke, V. E., Marzluff, W. F., Duronio, R. J. and Di Talia, S. (2020). CDK-regulated phase separation seeded by histone genes ensures precise growth and function of Histone Locus Bodies. Dev Cell 54(3): 379-394. PubMed ID: 32579968
Jolma, A., Yin, Y., Nitta, K. R., Dave, K., Popov, A., Taipale, M., Enge, M., Kivioja, T., Morgunova, E. and Taipale, J. (2015). DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527: 384-388. PubMed ID: 26550823
Koreski, K. P., Rieder, L. E., McLain, L. M., Chaubal, A., Marzluff, W. F. and Duronio, R. J. (2020). Drosophila Histone Locus Body assembly and function involves multiple interactions. Mol Biol Cell: mbcE20030176. PubMed ID: 32401666
Kuzu, G., Kaye, E. G., Chery, J., Siggers, T., Yang, L., Dobson, J. R., Boor, S., Bliss, J., Liu, W., Jogl, G., Rohs, R., Singh, N. D., Bulyk, M. L., Tolstorukov, M. Y. and Larschan, E. (2016). Expansion of GA dinucleotide repeats increases the density of CLAMP binding sites on the X-chromosome to promote Drosophila dosage compensation. PLoS Genet 12(7): e1006120. PubMed ID: 27414415
Larschan, E., Soruco, M. M., Lee, O. K., Peng, S., Bishop, E., Chery, J., Goebel, K., Feng, J., Park, P. J. and Kuroda, M. I. (2012). Identification of chromatin-associated regulators of MSL complex targeting in Drosophila dosage compensation. PLoS Genet 8(7): e1002830. PubMed ID: 22844249
Menon, D. U., Coarfa, C., Xiao, W., Gunaratne, P. H. and Meller, V. H. (2014). siRNAs from an X-linked satellite repeat promote X-chromosome recognition in Drosophila melanogaster. Proc Natl Acad Sci U S A 111(46): 16460-16465. PubMed ID: 25368194
Rieder, L. E., Jordan, W. T., 3rd and Larschan, E. N. (2019). Targeting of the dosage-compensated male X-chromosome during early Drosophila development. Cell Rep 29(13): 4268-4275. PubMed ID: 31875538
Salzler, H. R., Tatomer, D. C., Malek, P. Y., McDaniel, S. L., Orlando, A. N., Marzluff, W. F. and Duronio, R. J. (2013). A sequence in the Drosophila H3-H4 Promoter triggers histone locus body assembly and biosynthesis of replication-coupled histone mRNAs. Dev Cell 24(6): 623-634. PubMed ID: 23537633
Soruco, M. M., Chery, J., Bishop, E. P., Siggers, T., Tolstorukov, M. Y., Leydon, A. R., Sugden, A. U., Goebel, K., Feng, J., Xia, P., Vedenko, A., Bulyk, M. L., Park, P. J. and Larschan, E. (2013). The CLAMP protein links the MSL complex to the X chromosome during Drosophila dosage compensation. Genes Dev 27(14): 1551-1556. PubMed ID: 23873939
Urban, J. A., Doherty, C. A., Jordan, W. T., Bliss, J. E., Feng, J., Soruco, M. M., Rieder, L. E., Tsiarli, M. A. and Larschan, E. N. (2016). The essential Drosophila CLAMP protein differentially regulates non-coding roX RNAs in male and females. Chromosome Res [Epub ahead of print]. PubMed ID: 27995349
Villa, R., Schauer, T., Smialowski, P., Straub, T. and Becker, P. B. (2016). PionX sites mark the X chromosome for dosage compensation. Nature 537(7619): 244-248. PubMed ID: 27580037
Wang, C. I., Alekseyenko, A. A., LeRoy, G., Elia, A. E., Gorchakov, A. A., Britton, L. M., Elledge, S. J., Kharchenko, P. V., Garcia, B. A. and Kuroda, M. I. (2013). Chromatin proteins captured by ChIP-mass spectrometry are linked to dosage compensation in Drosophila. Nat Struct Mol Biol 20(2): 202-209. PubMed ID: 23295261
date revised: 19 June 2024
Home page: The Interactive Fly © 2011 Thomas Brody, Ph.D.