
Gene name - bicoid

Synonyms -

Cytological map position - 84A1

Function - transcription factor

Keywords - morphogen, anterior-posterior axis - anterior group

Symbol - bcd

FlyBase ID:FBgn0000166

Genetic map position - 3-[47.5]

Classification - homeodomain

Cellular location - nuclear



Recent literature
Durrieu, L., Kirrmaier, D., Schneidt, T., Kats, I., Raghavan, S., Hufnagel, L., Saunders, T. E. and Knop, M. (2018). Bicoid gradient formation mechanism and dynamics revealed by protein lifetime analysis. Mol Syst Biol 14(9): e8355. PubMed ID: 30181144
Summary:
Embryogenesis relies on instructions provided by spatially organized signaling molecules known as morphogens. Understanding the principles behind morphogen distribution and how cells locally interpret this information remains a major challenge in developmental biology. This study introduces morphogen-age measurements as a novel approach to test models of morphogen gradient formation. Using a tandem fluorescent timer as a protein age sensor, this study found a gradient of increasing age of Bicoid along the anterior-posterior axis in the early Drosophila embryo. Quantitative analysis of the protein age distribution across the embryo reveals that the synthesis-diffusion-degradation model is the most likely model underlying Bicoid gradient formation, and rules out other hypotheses for gradient formation. Moreover, the timer can detect transitions in the dynamics associated with syncytial cellularization. These results provide new insight into Bicoid gradient formation and demonstrate how morphogen-age information can complement knowledge about movement, abundance, and distribution, which should be widely applicable to other systems.
Mir, M., Stadler, M. R., Ortiz, S. A., Hannon, C. E., Harrison, M. M., Darzacq, X. and Eisen, M. B. (2018). Dynamic multifactor hubs interact transiently with sites of active transcription in Drosophila embryos. Elife 7. PubMed ID: 30589412
Summary:
The regulation of transcription requires the coordination of numerous activities on DNA, yet how transcription factors mediate these activities remains poorly understood. This study used lattice light-sheet microscopy to integrate single-molecule and high-speed 4D imaging in developing Drosophila embryos to study the nuclear organization and interactions of the key transcription factors Zelda and Bicoid. In contrast to previous studies suggesting stable, cooperative binding, this study shows that both factors interact with DNA with surprisingly high off-rates. Both factors form dynamic subnuclear hubs, and Bicoid binding is enriched within Zelda hubs. Remarkably, these hubs are both short lived and interact only transiently with sites of active Bicoid-dependent transcription. Based on these observations, it is hypothesized that, beyond simply forming bridges between DNA and the transcription machinery, transcription factors can organize other proteins into hubs that transiently drive multiple activities at their gene targets.
Osman, N. M., Kitapci, T. H., Vlaho, S., Wunderlich, Z. and Nuzhdin, S. V. (2018). Inference of transcription factor regulation patterns using gene expression covariation in natural populations of Drosophila melanogaster. Biophysics (Oxf) 63(1): 43-51. PubMed ID: 30739944
Summary:
Gene regulatory networks control the complex programs that drive development. Deciphering the connections between transcription factors (TFs) and target genes is challenging, in part because TFs bind to thousands of places in the genome but control expression through a subset of these binding events. It is hypothesized that natural variation of expression levels and predictions of TF binding sites can be combined to identify TF targets. RNA-seq data were combined from 71 genetically distinct F1 Drosophila melanogaster embryos, and the correlations between TF and potential target gene expression levels were calculated, a measure termed "regulatory strength." To separate direct and indirect TF targets, it was hypothesized that direct TF targets will have a preponderance of binding sites in their upstream regions. Using 14 TFs active during embryogenesis, it was found that 12 TFs showed a significant correlation between their binding strength and regulatory strength on downstream targets, and 10 TFs showed a significant correlation between the number of binding sites and the regulatory effect on target genes. The general roles, e.g. Bicoid's role as an activator, and the particular interactions that were observed between these TFs, e.g. Twist's role as a repressor of sloppy paired and odd paired, generally coincide with the literature.
Shah, K., Cao, W. and Ellison, C. E. (2019). Adenine methylation in Drosophila is associated with the tissue-specific expression of developmental and regulatory genes. G3 (Bethesda). PubMed ID: 30988038
Summary:
N6-methyladenine (6mA or m6dA) is a DNA modification that has long been known to play an important role in a variety of biological functions in prokaryotes. This modification has only recently been described in eukaryotes, where it seems to have evolved species-specific functions ranging from nucleosome positioning to transposon repression. In Drosophila, 6mA has been shown to be important for enforcing the tissue specificity of neuronal genes in the brain and suppressing transposable element expression in the ovaries. This study analyzes the raw signal data from nanopore sequencing to identify 6mA positions in the D. melanogaster genome at single-base resolution. This modification is enriched upstream from transcription start sites, within the introns and 3' UTRs of genes, as well as in simple repeats. These 6mA positions are enriched for sequence motifs that are recognized by known transcriptional activators involved in development, such as Bicoid and Caudal, and the genes that carry this modification are enriched for functions involved in development, regulation of transcription, and neuronal activity. These genes show high expression specificity in a variety of tissues besides the brain, suggesting that this modification may play a more general role in enforcing the specificity of gene expression across many tissues, throughout development, and between the sexes.
Cai, X., Fahmy, K. and Baumgartner, S. (2019). bicoid RNA localization requires the trans-Golgi network. Hereditas 156: 30. PubMed ID: 31528161
Summary:
The formation of the bicoid (bcd) mRNA gradient is a crucial step for Bcd protein gradient formation in Drosophila. In the past, a microtubule (MT)-based cortical network had been shown to be indispensable for bcd mRNA transport to the posterior. This study reports the identification of a MT-binding protein CLASP/Chb as the first component associated with this cortical MT network. Since CLASPs in vertebrates were shown to serve as an acentriolar microtubule organization center (aMTOC) in concert with trans-Golgi proteins, this study examined the effect of the Drosophila trans-Golgins on bcd localization and gradient formation. Using a genetic approach, it was demonstrated that the Drosophila trans-Golgins dGCC88, dGolgin97 and dGCC185 indeed affect bcd mRNA localization during oocyte development. Consequently, the bcd mRNA is already mislocalized before the egg is fertilized. The expression domains of genes downstream of bcd in the hierarchy, e.g. of the gap gene empty spiracles or of the pair-rule gene even-skipped, are changed, indicating altered segmental anlagen due to a faulty bcd gradient. Thus, at the end of embryogenesis, trans-Golgin mutants show bcd-like cuticle phenotypes. These data provide evidence that the Golgi, as a cellular member of the secretory pathway, exerts control on bcd localization, which indicates that bcd gradient formation is probably more intricate than previously presumed.
Huang, A., Rupprecht, J. F. and Saunders, T. E. (2020). Embryonic geometry underlies phenotypic variation in decanalized conditions. Elife 9. PubMed ID: 32048988
Summary:
During development, many mutations cause increased variation in phenotypic outcomes, a phenomenon termed decanalization. Phenotypic discordance is often observed in the absence of genetic and environmental variations, but the mechanisms underlying such inter-individual phenotypic discordance remain elusive. Using the anterior-posterior (AP) patterning of the Drosophila embryo, embryonic geometry was identified as a key factor predetermining patterning outcomes under decanalizing mutations. With the wild-type AP patterning network, it was found that AP patterning is robust to variations in embryonic geometry; segmentation gene expression remains reproducible even when the embryo aspect ratio is artificially reduced by more than twofold. In contrast, embryonic geometry is highly predictive of individual patterning defects under decanalized conditions of either increased bicoid (bcd) dosage or bcd knockout. The phenotypic discordance can be traced back to variations in the gap gene expression, which is rendered sensitive to the geometry of the embryo under mutations.
Eck, E., Liu, J., Kazemzadeh-Atoufi, M., Ghoreishi, S., Blythe, S. A. and Garcia, H. G. (2020). Quantitative dissection of transcription in development yields evidence for transcription factor-driven chromatin accessibility. Elife 9. PubMed ID: 33074101
Summary:
Thermodynamic models of gene regulation can predict transcriptional regulation in bacteria, but in eukaryotes chromatin accessibility and energy expenditure may call for a different framework. This study systematically tested the predictive power of models of DNA accessibility based on the Monod-Wyman-Changeux (MWC) model of allostery, which posits that chromatin fluctuates between accessible and inaccessible states. The regulatory dynamics of hunchback by the activator Bicoid and the pioneer-like transcription factor Zelda were dissected in living Drosophila embryos; no thermodynamic or non-equilibrium MWC model could recapitulate hunchback transcription. Therefore, a model was explored where DNA accessibility is not the result of thermal fluctuations but is catalyzed by Bicoid and Zelda, possibly through histone acetylation; this model did predict hunchback dynamics. Thus, this theory-experiment dialogue uncovered potential molecular mechanisms of transcriptional regulatory dynamics, a key step toward reaching a predictive understanding of developmental decision-making.
Onal, P., Imaya Gunasinghe, H., Yui Umezawa, K., Zheng, M., Ling, J., Azeez, L., Dalmeus, A., Tazin, T. and Small, S. (2021). Suboptimal Intermediates Underlie Evolution of the Bicoid Homeodomain. Mol Biol Evol. PubMed ID: 33599280
Summary:
Changes in regulatory networks generate materials for evolution to create phenotypic diversity. For transcription networks, multiple studies have shown that alterations in binding sites of cis-regulatory elements correlate well with the gain or loss of specific features of the body plan. Less is known about alterations in the amino acid sequences of the transcription factors (TFs) that bind these elements. This study examined the evolution of Bicoid (Bcd), a homeodomain (HD) protein that is critical for anterior embryo patterning in Drosophila. The ancestor of Bcd (AncBcd) emerged after a duplication of a Zerknullt (Zen)-like ancestral protein (AncZB) in a suborder of flies. AncBcd diverged from AncZB, gaining novel transcriptional and translational activities. This study focused on the evolution of the HD of AncBcd, which binds to DNA and RNA, and is composed of four subdomains: an N-terminal arm (NT) and three helices: H1, H2, and the Recognition Helix (RH). Using chimeras of subdomains and gene rescue assays in Drosophila, this study showed that robust patterning activity of the Bcd HD (high frequency rescue to adulthood) is achieved only when amino acid substitutions in three separate subdomains (NT, H1, and RH) are combined. Other combinations of subdomains also yield full rescue, but with lower penetrance, suggesting alternative suboptimal activities. The results suggest a multi-step pathway for the evolution of the Bcd HD that involved intermediate HD sequences with suboptimal activities, which constrained and enabled further evolutionary changes. They also demonstrate critical epistatic forces that contribute to the robust function of a DNA-binding domain.
Shlemov, A., Alexandrov, T., Golyandina, N., Holloway, D., Baumgartner, S. and Spirov, A. V. (2021). Quantification reveals early dynamics in Drosophila maternal gradients. PLoS One 16(8): e0244701. PubMed ID: 34411119
Summary:
The Bicoid (Bcd) protein is a primary determinant of early anterior-posterior (AP) axis specification in Drosophila embryogenesis. This study produced confocal microscope images of whole early embryos, stained for bcd mRNA or the Staufen (Stau) protein involved in its transport. Each profile was quantified by a two- (or three-) exponential equation. The parameters of these equations were used to analyze the early developmental dynamics of bcd. Analysis of 1D profiles was compared with 2D intensity surfaces from the same images. This approach reveals strong early changes in bcd and Stau, which appear to be coordinated. Three stages in early development can be unambiguously discriminated using the exponential parameters: pre-blastoderm (1-9 cleavage cycle, cc), syncytial blastoderm (10-13 cc) and cellularization (from 14A cc). Key features which differ in this period are how fast the first exponential (anterior component) of the apical profile drops with distance and whether it is higher or lower than the basal first exponential. Both bcd and Stau show several redistributions in the head cytoplasm, quite probably related to nuclear activity. The continued spreading of bcd can be tracked from the time of nuclear layer formation (later pre-blastoderm) to the later syncytial blastoderm stages by the progressive loss of steepness of the apical anterior exponential (for both bcd and Stau). Finally, at the beginning of cc14 (cellularization stage) a distinctive flip is seen from the basal anterior gradient being higher to the apical gradient being higher (for both bcd and Stau). Quantitative analysis reveals substantial (and correlated) bcd and Stau redistributions during early development, supporting that the distribution and dynamics of bcd mRNA are key factors in the formation and maintenance of the Bcd protein morphogenetic gradient.
Cai, X., Rondeel, I. and Baumgartner, S. (2021). Modulating the bicoid gradient in space and time. Hereditas 158(1): 29. PubMed ID: 34404481
Summary:
The formation of the Bicoid (Bcd) gradient in the early Drosophila embryo is one of the most fascinating observations in biology and serves as a paradigm for gradient formation, yet its mechanism is still not fully understood. Two distinct models were proposed in the past, the SDD and the ARTS model. This study defines novel cis- and trans-acting factors that are indispensable for gradient formation. The first one is the poly A tail length of the bcd mRNA, which this study demonstrates changes not only in time, but also in space. Posterior bcd mRNAs were shown to possess a longer poly A tail than anterior ones and this elongation is likely mediated by wispy (wisp), a poly A polymerase. Consequently, modulating the activity of Wisp results in changes of the Bcd gradient, in the control of downstream targets such as the gap and pair-rule genes, and also in the cuticular pattern. Attempts to modulate the Bcd gradient by subjecting the egg to an extra nuclear cycle, i.e. a 15th nuclear cycle, by means of the maternal haploid (mh) mutation showed no effect, either on the appearance of the gradient or on the control of downstream targets. This suggests that the segmental anlagen are determined during the first 14 nuclear cycles. Finally, the Cyclin B (CycB) gene was identified as a trans-acting factor that modulates the movement of Bcd such that Bcd is allowed to move through the interior of the egg. This analysis demonstrates that Bcd gradient formation is far more complex than previously thought, requiring a revision of the models of how the gradient is formed.
Sankaranarayanan, M., Emenecker, R. J., Wilby, E. L., Jahnel, M., Trussina, I., Wayland, M., Alberti, S., Holehouse, A. S. and Weil, T. T. (2021). Adaptable P body physical states differentially regulate bicoid mRNA storage during early Drosophila development. Dev Cell. PubMed ID: 34655524
Summary:
Ribonucleoprotein condensates can exhibit diverse physical states in vitro and in vivo. Despite considerable progress, understanding of the relevance of condensate physical states for in vivo biological function remains limited. This study investigated the physical properties of processing bodies (P bodies) and their impact on mRNA storage in mature Drosophila oocytes. The conserved DEAD-box RNA helicase Me31B forms viscous P body condensates, which adopt an arrested physical state. Structurally distinct proteins and protein-protein interactions, together with RNA, regulate the physical properties of P bodies. Using live imaging and in situ hybridization, this study shows that the arrested state and integrity of P bodies support the storage of bicoid (bcd) mRNA and that egg activation modulates P body properties, leading to the release of bcd for translation in the early embryo. Together, this work provides an example of how physical states of condensates regulate cellular function in development.
Singh, A. P., Wu, P., Ryabichko, S., Raimundo, J., Swan, M., Wieschaus, E., Gregor, T. and Toettcher, J. E. (2022). Optogenetic control of the Bicoid morphogen reveals fast and slow modes of gap gene regulation. Cell Rep 38(12): 110543. PubMed ID: 35320726
Summary:
Developmental patterning networks are regulated by multiple inputs and feedback connections that rapidly reshape gene expression, limiting the information that can be gained solely from slow genetic perturbations. This study shows that fast optogenetic stimuli, real-time transcriptional reporters, and a simplified genetic background can be combined to reveal the kinetics of gene expression downstream of a developmental transcription factor in vivo. Light-controlled versions of the Bicoid transcription factor were generated and their effects on downstream gap genes were studied in embryos. The results recapitulate known relationships, including rapid Bicoid-dependent transcription of giant and hunchback and delayed repression of Krüppel. In addition, it was found that the posterior pattern of knirps exhibits a quick but inverted response to Bicoid perturbation, suggesting a noncanonical role for Bicoid in directly suppressing knirps transcription. Acute modulation of transcription factor concentration while recording output gene activity represents a powerful approach for studying developmental gene networks in vivo.
Thukral, S., Kaity, B., Mitra, D., Dey, B., Dey, P., Uttekar, B., Mitra, M. K., Nandi, A. and Rikhy, R. (2022). Pseudocleavage furrows restrict plasma membrane-associated PH domain in syncytial Drosophila embryos. Biophys J 121(12): 2419-2435. PubMed ID: 35591789
Summary:
Syncytial cells contain multiple nuclei and have local distribution and function of cellular components despite being synthesized in a common cytoplasm. The syncytial Drosophila blastoderm embryo shows reduced spread of organelle and plasma membrane-associated proteins between adjacent nucleo-cytoplasmic domains. Anchoring to the cytoarchitecture within a nucleo-cytoplasmic domain is likely to decrease the spread of molecules; however, its role in restricting this spread has not been assessed. In order to analyze the cellular mechanisms that regulate the rate of spread of plasma membrane-associated molecules in the syncytial Drosophila embryos, a pleckstrin homology (PH) domain was expressed in a localized manner at the anterior of the embryo by tagging it with the bicoid mRNA localization signal. Anteriorly expressed PH domain forms an exponential gradient in the anteroposterior axis with a longer length scale compared with Bicoid. Using a combination of experiments and theoretical modeling, it was found that the characteristic distribution and length scale emerge due to plasma membrane sequestration and restriction within an energid. Loss of plasma membrane remodeling to form pseudocleavage furrows shows an enhanced spread of PH domain but not Bicoid. Modeling analysis suggests that the enhanced spread of the PH domain occurs due to the increased spread of the cytoplasmic population of the PH domain in pseudocleavage furrow mutants. This analysis of cytoarchitecture interaction in regulating plasma membrane protein distribution and constraining its spread has implications on the mechanisms of spread of various molecules, such as morphogens in syncytial cells.
Wang, J., Zhang, S., Lu, H. and Xu, H. (2022). Differential regulation of alternative promoters emerges from unified kinetics of enhancer-promoter interaction. Nat Commun 13(1): 2714. PubMed ID: 35581264
Summary:
Many eukaryotic genes contain alternative promoters with distinct expression patterns. How these promoters are differentially regulated remains elusive. This study applied single-molecule imaging to quantify the transcriptional regulation of two alternative promoters (P1 and P2) of the Bicoid (Bcd) target gene hunchback in syncytial blastoderm Drosophila embryos. Contrary to the previous notion that Bcd only activates P2, this study found that Bcd activates both promoters via the same two enhancers. P1 activation is less frequent and requires binding of more Bcd molecules than P2 activation. Using a theoretical model to relate promoter activity to enhancer states, this study showed that the two promoters follow common transcription kinetics driven by sequential Bcd binding at the two enhancers. Bcd binding at either enhancer primarily activates P2, while P1 activation relies more on Bcd binding at both enhancers. These results provide a quantitative framework for understanding the kinetic mechanisms of complex eukaryotic gene regulation.
Baltruk, L. J., Lavezzo, G. M., Machado-Lima, A., Digiampietri, L. A. and Andrioli, L. P. (2022). An additive repression mechanism sets the anterior limits of anterior pair-rule stripes 1. Cells Dev 171: 203802. PubMed ID: 35934285
Summary:
Segments are repeated anatomical units forming the body of insects. In Drosophila, the specification of the body takes place during the blastoderm through the segmentation cascade. Pair-rule genes such as hairy (h), even-skipped (eve), runt (run), and fushi-tarazu (ftz) are of the intermediate level of the cascade and each pair-rule gene is expressed in seven transversal stripes along the antero-posterior axis of the embryo. Stripes are formed by independent cis-regulatory modules (CRMs) under the regulation of transcription factors of maternal source and of gap proteins of the first level of the cascade. The initial blastoderm of Drosophila is a syncytium and it also coincides with the mid-blastula transition when thousands of zygotic genes are transcribed and their products are able to diffuse in the cytoplasm. Thus, a complex regulation of the CRMs of the pair-rule stripes is anticipated. The CRMs of h 1, eve 1, run 1, ftz 1 are able to be activated by bicoid (bcd) throughout the anterior blastoderm and several lines of evidence indicate that they are repressed by the anterior gap genes slp1 (sloppy-paired 1), tll (tailless) and hkb (huckebein). The modest activity of these repressors led to the premise of a combinatorial mechanism regulating the expression of the CRMs of h 1, eve 1, run 1, ftz 1 in more anterior regions of the embryo. This possibility was tested by progressively removing the repression activities of slp1, tll and hkb. In doing so, it was possible to expose a mechanism of additive repression limiting the anterior borders of stripes 1. Stripes 1 respond depending on their distance from the anterior end and repressors operating at different levels.
Lopez, C. H., Puliafito, A., Xu, Y., Lu, Z., Talia, S. D. and Vergassola, M. (2023). Two-fluid dynamics and micron-thin boundary layers shape cytoplasmic flows in early Drosophila embryos. bioRxiv. PubMed ID: 36993669
Summary:
Cytoplasmic flows are widely emerging as key functional players in development. In early Drosophila embryos, flows drive the spreading of nuclei across the embryo. This study combined hydrodynamic modeling with quantitative imaging to develop a two-fluid model that features an active actomyosin gel and a passive viscous cytosol. Gel contractility is controlled by the cell cycle oscillator, the two fluids being coupled by friction. In addition to recapitulating experimental flow patterns, this model explains observations that remained elusive, and makes a series of new predictions. First, the model captures the vorticity of cytosolic flows, which highlights deviations from Stokes' flow that were observed experimentally but remained unexplained. Second, the model reveals strong differences in the gel and cytosol motion. In particular, a micron-sized boundary layer is predicted close to the cortex, where the gel slides tangentially whilst the cytosolic flow cannot slip. Third, the model unveils a mechanism that stabilizes the spreading of nuclei with respect to perturbations of their initial positions. This self-correcting mechanism is argued to be functionally important for proper nuclear spreading. Fourth, the model was used to analyze the effects of flows on the transport of the morphogen Bicoid, and the establishment of its gradients. Finally, the model predicts that the flow strength should be reduced if the shape of the domain is more round, which is experimentally confirmed in Drosophila mutants. Thus, the two-fluid model explains flows and nuclear positioning in early Drosophila, while making predictions that suggest novel future experiments.
Xu, R., Dai, F., Wu, H., Jiao, R., He, F. and Ma, J. (2023). Shaping the scaling characteristics of gap gene expression patterns in Drosophila. Heliyon 9(2): e13623. PubMed ID: 36879745
Summary:
How patterns are formed to scale with tissue size remains an unresolved problem. This study investigated embryonic patterns of gap gene expression along the anterior-posterior (AP) axis in Drosophila. Embryos were used that greatly differed in length and, importantly, possess distinct length-scaling characteristics of the Bicoid (Bcd) gradient. The dynamic movements of gap gene expression boundaries were systematically analyzed in relation to both embryo length and Bcd input as a function of time. The process through which such dynamic movements drive both the emergence of a global scaling landscape and the evolution of boundary-specific scaling characteristics was documented. Despite initial differences in pattern scaling characteristics that mimic those of Bcd in the anterior, such characteristics of final patterns converge. This study thus partitions the contributions of Bcd input and regulatory dynamics inherent to the AP patterning network in shaping the scaling characteristics of embryonic patterns.
Kawasaki, K. and Fukaya, T. (2023). Functional coordination between transcription factor clustering and gene activity. Mol Cell 83(10): 1605-1622. PubMed ID: 37207625
Summary:
The prevailing view of metazoan gene regulation is that transcription is facilitated through the formation of static activator complexes at distal regulatory regions. This study employed quantitative single-cell live-imaging and computational analysis to provide evidence that the dynamic assembly and disassembly process of transcription factor (TF) clusters at enhancers is a major source of transcriptional bursting in developing Drosophila embryos. It was further shown that the regulatory connectivity between TF clustering and burst induction is highly regulated through intrinsically disordered regions (IDRs). Addition of a poly-glutamine tract to the maternal morphogen Bicoid demonstrated that extended IDR length leads to ectopic TF clustering and burst induction from its endogenous target genes, resulting in defects in body segmentation during embryogenesis. Moreover, this study successfully visualized the presence of 'shared' TF clusters during the co-activation of two distant genes, which provides a concrete molecular explanation for the newly proposed "topological operon" hypothesis in metazoan gene regulation.

BIOLOGICAL OVERVIEW

The maternally transcribed gene bicoid organizes anterior development in Drosophila. Its mRNA remains localized at the anterior tip of the oocyte and later in the early embryonic stages. Maternal bcd transcription is regulated by the maternal transcription factor Serendipity delta. Bicoid mRNA translation is inhibited in the posterior by Nanos (Payre, 1994 and Gavis, 1995).

The protein cannot be detected in oocytes, indicating that translation of bicoid is inhibited prior to fertilization. Zygotic translation can be detected at the anterior tip of the embryo immediately after fertilization, shortly after egg deposition. As a consequence of the anterior localization of the RNA, a gradient of Bicoid protein becomes established prior to cellularization (Driever, 1988).
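
The link between an anteriorly localized source and an exponential protein gradient can be illustrated with the steady-state form of the synthesis-diffusion-degradation (SDD) model named in the literature summaries above. The following is a minimal sketch, not a fitted model. It assumes a one-dimensional embryo with a point source at the anterior pole, uniform diffusion and first-order degradation, for which the steady-state profile is C(x) = C0 exp(-x/lambda) with lambda = sqrt(D * tau). The function names and all numerical values (D, tau, C0, the positions sampled) are illustrative assumptions, not measurements from the cited studies.

```python
import math


def sdd_length_scale(diffusion_um2_per_s, lifetime_s):
    """Length scale lambda = sqrt(D * tau) of the steady-state SDD gradient."""
    return math.sqrt(diffusion_um2_per_s * lifetime_s)


def sdd_profile(x_um, c0=1.0, diffusion_um2_per_s=1.0, lifetime_s=10_000.0):
    """Steady-state concentration C(x) = c0 * exp(-x / lambda).

    Assumes a 1D domain much longer than lambda, with the source at x = 0
    (the anterior pole). Default parameters are placeholders, not data.
    """
    lam = sdd_length_scale(diffusion_um2_per_s, lifetime_s)
    return c0 * math.exp(-x_um / lam)


if __name__ == "__main__":
    # Placeholder parameters chosen only so that lambda comes out at 100 um.
    for x in range(0, 501, 100):
        print(f"x = {x:3d} um  relative Bcd = {sdd_profile(x):.3f}")
```

With these placeholder values the concentration falls by a factor of e every 100 um. Changing either D or tau rescales the gradient without changing its exponential shape.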

Localization of BCD RNA is a multi-step process including an early event taking place in nurse cells requiring Exuperentia and a later event in the oocyte requiring Gurken and involving the cytoskeleton.

Initial stages in localization of BCD mRNA take place in nurse cells, mediated by a cis-acting localization signal, the Bicoid localization element (BLE1), which is present as part of the 3' untranslated region of Bicoid mRNA. The exuperentia gene product, required for Bicoid anterior localization (Berleth, 1988), has been considered to be the BLE binding agent. However, RNA binding activity has not been detected for EXU. Instead, another protein from wild type exu flies has been found associated with BCD RNA. This BCD RNA binding protein has been termed Exu-like (EXL). It is possible that both EXL and EXU interact with BLE (MacDonald, 1995).

BCD ultimately finds its way to the anterior pole of the oocyte as a ribonucleoprotein complex. It may also be transported along microtubules in vesicles (Wang, 1994). Zygotic translation of Bicoid RNA ensues rapidly upon fertilization. Transition from maternal to zygotic conditions requires an increase in polyadenylation of the Bicoid messenger RNA. The successful translation of RNA into proteins requires polyadenylation. In this process adenyl residues attach to the 3' end of the Bicoid RNA, contributing to its stability and preparing it for translation. Maternal RNAs must be kept inactive until needed. A major cellular mechanism for the maintenance of inactive mRNA in the oocyte is the lack of adenyl residues (Salles, 1994).

The anterior-posterior gradient of Bicoid, together with its partner Hunchback, is required to either activate or inhibit transcription of a variety of zygotic genes, including hunchback, gap genes such as empty spiracles, Krüppel and knirps, pair rule genes like even-skipped and runt, and even some homeotic and Polycomb group genes. Thus Bicoid has an essential role in establishing the anterior-posterior axis of Drosophila, its gradient acting to position the transcription of gap and pair rule genes along the anterior-posterior axis.

Bicoid mRNA translation is posttranscriptionally regulated by Nanos protein. It would seem that one requirement for NOS mRNA localization, involving 11 proteins of the posterior group, is imposed by the presence of the Nanos response elements in BCD and HB mRNAs (Wharton, 1991).

High Bicoid levels render the terminal system dispensable for Drosophila head development

In Drosophila, the gradient of the Bicoid (Bcd) morphogen organizes the anteroposterior axis while the ends of the embryo are patterned by the maternal terminal system. At the posterior pole, expression of terminal gap genes is mediated by the local activation of the Torso receptor tyrosine kinase (Tor). At the anterior, terminal gap genes are also activated by the Tor pathway but Bcd contributes to their activation. Evidence is presented that Tor and Bcd act independently on common target genes in an additive manner. Furthermore, the terminal maternal system is shown not to be required for proper head development, since high levels of Bcd activity can functionally rescue the lack of terminal system activity at the anterior pole. This observation is consistent with a recent evolution of an anterior morphogenetic center consisting of Bcd and anterior Tor function (Schaeffer, 2000).

The terminal maternal system directly modifies Bcd by phosphorylation at several MAPK sites in a Ser/Thr (S/T)-rich region located between the homeodomain and the identified transcriptional activation domains. A deletion variant of Bcd that lacks all these activation domains but still contains the S/T-rich region (BcdDeltaQAC) is able to rescue to viability bcd loss-of-function mutants. Hence, it is conceivable that the ability of the tor pathway to create negative charges through phosphorylation of this region of Bcd might result in an acidic-rich transcriptional activation domain that compensates for the loss of all the other activation domains. If this were the case, then the transcriptional activity of the BcdDeltaQAC deletion variant should be highly dependent on tor function. To test this hypothesis, the ability of a BcdDeltaQAC transgene to rescue the bcd phenotype in embryos derived from bcd;tsl double mutant mothers was assayed. BcdDeltaQAC rescues the bcd phenotype of the bcd;tsl double mutant similarly to a wild-type bcd transgene, resulting in a tsl only phenotype. Since BcdDeltaQAC is functionally independent of the tor pathway, it is concluded that the terminal system is not responsible for BcdDeltaQAC's activation potential. This result is also consistent with the notion that, in transient transfection experiments and transgenic studies, Bcd transcriptional activity is not significantly modified by mutations of the putative MAPK consensus sites. Thus, the described direct modification of Bcd by the tor pathway does not appear to be necessary for Bcd's function (Schaeffer, 2000).

tor function is necessary to allow a normal expression pattern of most Bcd target genes: many Bcd target genes such as otd are expressed in a reduced anterior domain in tor mutants. Furthermore, the expression domain of these genes is expanded in tor gain-of-function backgrounds, again suggesting that the tor pathway potentiates Bcd function. This effect could be direct, as Bcd transcriptional activity might be enhanced by direct modification of the protein (for instance by phosphorylation). Alternatively, the effect might be indirect, since most Bcd target promoters might also be responsive to Tor through distinct elements (Schaeffer, 2000).

A direct effect should be detectable with simply organized Bcd target promoters that only contain Bcd-response elements and no Tor-response elements. The proximal hb promoter (P2) resembles such a simple Bcd-response element, which uses activators to set an expression border without the assistance of repressors. The hb P2 promoter is not directly responsive to the terminal pathway; in the absence of Tor activity, the posterior border of hb expression moves only very slightly towards the anterior and, in tor gain-of-function embryos (tor4021), the posterior expression border does not respond significantly to ubiquitously activated Tor (Schaeffer, 2000).

However, the hb P2 promoter is still 300 bp long and might contain elements that are not well defined, and the hb pattern is very dynamic. Therefore, in addition an artificial Bcd responder gene was used whose promoter elements are all known. This promoter contains only Bcd and Hb binding sites (Bcd3Hb3-LacZ), and its expression is reminiscent of the hb P2 promoter, with an anterior cap expression domain from 100% to 65% EL. If Bcd were a direct target of Tor, the posterior border of the reporter gene expression domain should move in response to a tor gain-of-function allele. However, the expression pattern does not change in a tor4021 background. This argues for a Bcd activator function that is not under direct control of the terminal system. Thus, Bcd and Tor seem to be part of two independent pathways, which share common target genes (Schaeffer, 2000).

When a complete series of Bcd deletion variants was assayed for their ability to rescue the bcd loss-of-function phenotype in the absence of terminal system activity, one transgenic line was found that not only rescues the bcd phenotype but also the anterior part of the tsl phenotype (labrum and dorsal bridge), resulting in a posterior terminal mutant phenotype only. This particular transgenic line carries a bcd variant that deletes an alanine-rich domain (BcdDeltaA) and has been shown to activate the bcd target gene hb in a widely enlarged expression domain. Using Bcd immunostaining, it has been shown that this transgenic line exhibits levels of Bcd that are approximately 2- to 3-fold higher than wild type. Since other BcdDeltaA lines did not exhibit the same ability to rescue the tsl phenotype, it is concluded that the higher expression level of this particular line rather than the lack of a specific negative protein element (alanine-rich domain) is responsible for overcoming the requirement for the terminal pathway at the anterior (Schaeffer, 2000).

To further address whether high levels of bcd activity are sufficient to rescue the anterior terminal system phenotype, or whether only a particular Bcd deletion variant is capable of doing so, the ability of increased doses of wild-type bcd transgenes to rescue several terminal mutant backgrounds was tested. Since the previous experiments were performed with the tsl1 allele, which might only represent a strong hypomorphic allele rather than a null, another tsl mutant, tsl4, was included that is among the strongest in the allelic series, as well as null mutant alleles of the terminal genes trk and tor. To increase the Bcd expression level, flies containing an X chromosome or a third chromosome each carrying two wild-type bcd rescue constructs were used; these flies carry up to six copies of bcd. The phenotypes of all terminal mutants (tsl, trk or tor) are similar: lack of labrum and dorsal bridge in the anterior and deletion of all structures posterior to A7. Four copies of the bcd gene were able to rescue anterior structures including labrum and dorsal bridge in about 40% of all embryos derived from a tsl4 mutant background, while the posterior terminal phenotype is unaffected. Six copies of bcd are necessary to obtain the same anterior rescue in about 15% of all embryos derived from trk mutants and in about 5% of all embryos derived from tor mutants. However, not all embryos with rescued labrum and dorsal bridge had a perfectly aligned head skeleton. This might be due to incomplete rescue, but it could also be due to Bcd-mediated overexpression of hb at the anterior pole, which results in terminal-like phenotypes (Schaeffer, 2000).

Actually 50%, 70% or 85% of the head cuticles of tsl, trk or tor mutants, respectively, could not be analyzed for rescue due to severe anterior defects, which seemed more severe than normal terminal phenotypes. Nonetheless, some of the rescued embryos (less than 2%) were able to hatch and move around, which suggests complete anterior rescue. These probably represent embryos where just enough Bcd was present to overcome the lack of the terminal system but not too much to induce the phenotype due to high ectopic expression of hb. All larvae died within 2 hours, likely due to the posterior terminal defects. It should be noted that very few embryos exhibited the type of abdominal segment fusions that have been described for embryos derived from mothers carrying excess copies of the bcd gene. This might be due to the lack of terminal system function at the posterior pole in these experiments. Since no tail is made, there is probably more space for fate-map shifts towards the posterior, resulting in the correct establishment of abdominal segments A1 to A6. The rescue of the anterior terminal phenotype by high levels of bcd further indicates that the major role of the anterior terminal system is the potentiation of Bcd activity (Schaeffer, 2000).
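The percentages reported above can be combined into a rough per-genotype summary. The sketch below is a back-of-the-envelope calculation, not part of the original analysis. It assumes that the rescue frequencies (quoted per all embryos) and the fractions of unscorable head cuticles describe the same embryo populations, and it treats each "about" figure as exact. The names `RESCUE` and `rescued_among_scorable` are illustrative only.

```python
# Approximate figures quoted in the text (Schaeffer, 2000).
# Assumption: both percentages for a genotype describe the same embryos.
RESCUE = {
    # genotype: (bcd copies, % of all embryos rescued anteriorly, % unscorable)
    "tsl4": (4, 40, 50),
    "trk": (6, 15, 70),
    "tor": (6, 5, 85),
}


def rescued_among_scorable(pct_rescued, pct_unscorable):
    """Percentage of scorable embryos showing labrum and dorsal bridge rescue."""
    pct_scorable = 100 - pct_unscorable
    return 100 * pct_rescued / pct_scorable


for genotype, (copies, rescued, unscorable) in RESCUE.items():
    share = rescued_among_scorable(rescued, unscorable)
    print(f"{genotype}: {copies} copies of bcd, "
          f"{share:.0f}% of scorable embryos rescued")
```

Under these assumptions the rescue frequency among scorable embryos still falls from tsl to trk to tor, in the same order as the raw figures.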

In the posterior region of the embryo, the tor pathway activates the zygotic effectors tll and hkb, which are sufficient to specify the most posterior anlagen and the gut of the larva. At the anterior, the function of the terminal system is more difficult to interpret and, in tor mutants, hkb expression is only reduced. It actually requires bcd;tsl double mutants to lose all anterior hkb expression, which indicates additive functions of the anterior and terminal systems on this common target gene. hkb seems particularly interesting in this context, as its function is required for the formation of the labrum: reduction of hkb expression, as observed in a terminal mutant background, leads to the deletion of this particular structure (Schaeffer, 2000).

Therefore, it was asked whether the rescue of anterior structures (e.g. the labrum) mediated by high levels of Bcd in terminal system mutants is correlated with the restoration of the hkb expression pattern. Expression of hkb is first detected in the terminal regions (anterior and posterior) of the syncytial blastoderm. In terminal mutant embryos, the posterior domain is absent, whereas the anterior domain is reduced. In a tsl background with four or six copies of bcd, however, hkb expression extends further towards the posterior. Hence, the level of hkb expression can be regained by increasing the levels of Bcd in a terminal system mutant, even though its exact expression domain cannot be restored. It is likely that fate-map shifts are able to absorb the slightly changed expression domain of hkb. This suggests that the lack of terminal system activity at the anterior can simply be overcome by another system through enhancement of transcriptional activation of common target genes (Schaeffer, 2000).

Tor has been shown to antagonize Groucho-mediated repression of genes such as hkb and tll, probably by acting on the HMG-box transcription factor Capicua. Therefore, it is likely that Tor enhances Bcd activity by derepression, i.e. by inactivating potential repressors of Bcd target genes, thereby rendering any transcriptional activator more potent. As the cis-regulatory control regions of most developmental genes comprise both repressor and activator sites, the inactivation of potential repressors should lead to enhanced expression, or enlarged expression domains, as observed for several bcd target genes in a tor gain-of-function background (Schaeffer, 2000).

Since bcd and tor appear to function independently of each other, it is conceivable that anterior tor activity can also enhance the function of other transcriptional activators through derepression. Therefore, in long-germband insects that might lack a true bcd homolog, the anterior terminal system could also assist other activators, like homologs of Otd or Hb. Moreover, the fact that, in certain situations in the Drosophila embryo, anterior Tor activity can be dispensable for proper head development is consistent with the observation that a posterior morphogenetic center is more frequently found in insects than an anterior center. Although flies accumulate BCD mRNA and tor activity at the anterior pole of the egg, this might not be useful for most short-germband insects as their embryos develop in the posterior part of the egg. In this case, the anterior of the embryo is far away from the anterior of the egg and the morphogenetic role of bcd and tor would not be effective in patterning the head. In Tribolium castaneum, an intermediate-germband beetle, the activity of the terminal system is conserved at the anterior of the embryo, but it does not appear to have a specific function for pattern formation. Correspondingly, the tll homolog of Tribolium is only expressed at the posterior pole of the embryo. This is in contrast to Drosophila, where tll is expressed at both poles of the early embryo. Moreover, in spite of arguments supporting the presence of a bcd-like function in Tribolium, no bcd homologous gene has been identified outside higher Diptera, notwithstanding Bcd's homeodomain and bcd's location in the Hox cluster. It is therefore reasonable to assume that an anterior morphogenetic center consisting of Bcd and anterior Tor activity is not a general feature of insects (Schaeffer, 2000).

Homeodomain position 54 specifies transcriptional versus translational control by Bicoid

Bicoid (Bcd) controls embryonic gene expression by transcriptional activation and translational repression. Both functions require the homeodomain (HD), which recognizes DNA motifs at target gene enhancers and a specific sequence interval in the 3' untranslated region of Caudal (CAD) mRNA. The Bcd HD has been shown to be a nucleic acid-binding unit. Its helix III contains an arginine-rich motif (ARM), similar to the RNA-binding domain of the HIV-1 protein REV, needed for both RNA and DNA recognition. Replacement of arginine 54, within this motif, alters the RNA but not the DNA binding properties of the HD. Corresponding BCD mutants fail to repress CAD mRNA translation, whereas the transcriptional target genes are still activated (Niessing, 2000).

In order to characterize portions and individual amino acid residues of the Bcd HD that are specifically required for one or both Bcd regulatory functions, transgenes expressing wild-type or mutant bcd cDNAs were placed into the genome of homozygous bcd mutant females and their ability to rescue wild-type zygotic hb activation and cad mRNA translation in their embryos was assayed. Such embryos, referred to as 'bcd embryos,' fail to exert Bcd-dependent transcriptional activation of the zygotic target gene hb in their anterior half. Instead, the embryos show a duplication of the posterior Bcd-independent stripe of hb expression in the anterior region (Niessing, 2000).

Expressed Bcd mutant proteins that lack the helices I and II of the HD (BcdDeltaH1-2) or the amino acid interval between positions 42 and 51 in helix III (BcdT42-N51) fail to restore Bcd-dependent hb transcriptional activation and translational repression of CAD mRNA in the anterior region of bcd embryos. This indicates that the integrity of the Bcd HD is necessary for the control of transcription and translation. Transgene-dependent expression of BcdhIIIAntp, in which the C-terminal half of the Bcd HD is exchanged for the corresponding sequence of the Antennapedia (Antp) HD, rescues Bcd-dependent hb expression in the anterior region of bcd embryos, but no Cad gradient is formed. Bcd mutations in which two adjacent arginines at positions 53-54 and 54-55 of the HD, respectively, were replaced, fail to control Bcd-dependent transcription and translation. Thus, helix III of the Bcd HD is necessary for both transcriptional activation and translational repression, and amino acids within helix III are essential for specifying not only DNA binding but also RNA recognition by the HD. This proposal is consistent with the observation that part of the helix III of the Bcd HD has characteristics of an arginine-rich motif (ARM) (Niessing, 2000).

To test whether the conserved amino acids of Bcd's ARM are indeed required for RNA target recognition and whether single amino acid replacements may allow the DNA and RNA binding properties to be separated, alanine replacement mutants of the Bcd HD were generated and their in vitro binding properties assayed. The Bcd HD (HDwt) binds both DNA and RNA, whereas HDK50A, HDN51A, HDR53A, and HDR55A failed to bind either target. Bcd HDR54A, which contains alanine in place of arginine in position 54 of the HD, bound DNA properly, but its RNA binding was reduced by more than one order of magnitude. The binding properties of HDK57A were indistinguishable from HDwt. In summary, arginine at position 54 of the HD is critical for specifying RNA versus DNA binding, and its replacement shifts the binding property of the HD to prefer DNA over RNA recognition (Niessing, 2000).

In order to test the in vivo relevance of these binding studies, the corresponding Bcd HD mutants were examined by transgene-dependent expression in bcd embryos. The Bcd mutants were generated in the context of an 8.7 kb genomic DNA fragment spanning the entire bcd locus, which fully rescues bcd embryos after P element-mediated transformation. The transgene-expressed BcdK57A protein, which contains an HD with normal DNA and RNA binding properties, causes Bcd-dependent hb expression and Cad gradient formation, and the embryos developed into normal-looking larvae and fertile adults. BcdN51A, BcdR53A, and BcdR55A, which contain HD mutations that cause the loss of DNA and RNA binding properties in vitro, fail to activate Bcd-dependent hb transcription and to repress translation of CAD mRNA; such embryos develop a bcd mutant phenotype. The BcdR54A mutant, which contains an HD with DNA, but no RNA, binding properties, was able to activate the transcription of hb but not to repress the translation of CAD mRNA. This observation is consistent with the result obtained using the transgene bearing the BcdR54S mutation, which contains a serine residue in place of arginine at position 54. Thus, both Bcd mutants that contain a replacement of arginine at position 54 of the HD fail to control CAD mRNA translation but do activate transcription of hb (Niessing, 2000).
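The correspondence between the in vitro binding results and the in vivo transgene results can be laid out as a small lookup table. The sketch below only restates what the text reports. The rule in `predict` (DNA binding implies hb activation, RNA binding implies CAD mRNA repression) is a simplifying assumption made for illustration, and HDR54A is recorded as lacking RNA binding even though its binding was reduced rather than shown to be absent. All identifiers are hypothetical.

```python
from typing import NamedTuple


class Binding(NamedTuple):
    dna: bool  # binds the DNA target in vitro
    rna: bool  # binds the CAD mRNA target in vitro


# In vitro results as summarized in the text (Niessing, 2000).
# R54A: RNA binding reduced by more than one order of magnitude -> rna=False.
IN_VITRO = {
    "wt": Binding(dna=True, rna=True),
    "K50A": Binding(dna=False, rna=False),
    "N51A": Binding(dna=False, rna=False),
    "R53A": Binding(dna=False, rna=False),
    "R54A": Binding(dna=True, rna=False),
    "R55A": Binding(dna=False, rna=False),
    "K57A": Binding(dna=True, rna=True),
}

# In vivo outcomes reported for the transgenes:
# (hb transcription activated, CAD mRNA translation repressed).
IN_VIVO = {
    "K57A": (True, True),
    "N51A": (False, False),
    "R53A": (False, False),
    "R55A": (False, False),
    "R54A": (True, False),
}


def predict(binding):
    """Assumed rule: DNA binding -> hb activation, RNA binding -> CAD repression."""
    return (binding.dna, binding.rna)


for mutant, observed in IN_VIVO.items():
    agrees = predict(IN_VITRO[mutant]) == observed
    print(f"Bcd{mutant}: predicted {predict(IN_VITRO[mutant])}, "
          f"observed {observed}, agreement={agrees}")
```

With this encoding every mutant tested in vivo agrees with the in vitro prediction, and R54A is the only entry in which the two functions are separated.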

Mutations of bcd that interfere with the control of CAD mRNA translation but not with the activation of transcription cause temperature-dependent head involution defects. The corresponding larvae develop the normal number and identity of head segments, which, however, fail to be properly assembled. The same phenotype would be expected for the BcdR54A mutant embryos, ensuring that the replacement affects only CAD mRNA translational control. bcd embryos expressing the BcdR54A mutant develop a normal segment pattern at 18°C and give rise to normal-looking and fertile adults. At 29°C, however, the majority of the embryos (more than 90%) die as unhatched larvae, and all of them express a strong head defect. The embryos show a normal expression pattern of the segment polarity gene engrailed (en) at stages 9-11, indicating that segments are generated normally. Furthermore, all discernible head markers can be observed in larval cuticle preparations, but, as observed with mutations affecting the translational repressor region of Bcd, the assembly of the head elements is strongly perturbed. The same temperature-dependent phenotype is observed when cad cDNA lacking the Bcd-responsive BBR in the 3'UTR is expressed in the preblastoderm embryo using the GAL4/UAS system. Taken together, the in vivo transgene studies and the in vitro binding results establish that a single amino acid replacement in the ARM of the Bcd HD specifically interferes with Bcd-dependent RNA binding and translational repression of CAD mRNA, without affecting DNA binding and transcriptional activation. The finding is consistent with the observation that an arginine residue at this position is conserved in ARMs but rare in HDs (Niessing, 2000).

The results provide strong evidence that the Bcd HD functions as a nucleic acid-binding unit that enables Bcd to function in transcriptional and translational control. In addition, the findings establish that the direct interaction of Bcd with the BBR of CAD mRNA shown in vitro is necessary to prevent Cad activity from interfering with head morphogenesis. Helix III of the Bcd HD has been identified as a region in which a single amino acid replacement shifts the in vitro binding property of the HD to prefer DNA over RNA recognition and abolishes CAD mRNA translational repression without affecting transcriptional activation by Bcd in vivo. The alpha-helical structure and sequence comparison between HIV-1 REV and the third helix of the Bcd HD indicate that Bcd formally fits as a member of the ARM family of RNA-binding proteins that show a low degree of amino acid sequence identity. The sequence similarity between the ARMs of HIV-1 REV and the Bcd HD is therefore remarkable. However, there is no corresponding sequence similarity observed between the RNA target sequences to which they bind. Furthermore, REV fails to bind the BBR, and Bcd-HD does not recognize the REV response element. Thus, the high degree of amino acid identity and conservation of the critical arginine residue in the ARMs of the Bcd HD and HIV-1 REV is not correlated with similarity at the level of the targets (Niessing, 2000).

Asparagine is absolutely conserved at position 51 of HDs and is also found in the corresponding position in ARM family members. It has been shown to provide base contacts in DNA/HD complexes and RNA target recognition by ARM proteins, respectively. Consistently, mutation of asparagine in position 51 of the Bcd HD abolished DNA binding as well as RNA binding. In contrast, the 52-57 region of HDs interacts with DNA electrostatically, whereas some of the corresponding REV arginine residues are hydrogen bonded to bases. Mutating arginine at position 54, which is rare in other HDs, affects RNA binding without altering the DNA binding. In summary, these and earlier findings with respect to the DNA binding properties of HDs support the proposal that the ARM within the helix III of the Bcd HD is necessary for both RNA and DNA target recognition, and that individual amino acids within this portion of the HD specify RNA versus DNA binding (Niessing, 2000).

Although the Bcd HD is so far the only known HD with RNA binding properties, it has been noted that the ARM-containing RNA-binding domain of EIAV-TAT and the ribosomal protein L11 can fold into HD-like structures with the RNA-binding domain exposed as a helix III equivalent. The recently solved crystal structure of L11 bound to a ribosomal RNA fragment shows binding to the minor groove of RNA that is similar in width to a DNA major groove. The results also indicate that L11 uses the same surface as the HD does in binding DNA. The structural similarities and the fact that helix III regions of HDs are generally rich in basic amino acids suggest that HDs hold a high potential to either exert or to adopt RNA binding properties during evolution. The possibility that other HDs also bind RNAs and thereby provide HD proteins with dual regulatory functions is a challenging proposal (Niessing, 2000 and references therein).

A system of repressor gradients spatially organizes the boundaries of Bicoid-dependent target genes

The homeodomain (HD) protein Bicoid (Bcd) is thought to function as a gradient morphogen that positions boundaries of target genes via threshold-dependent activation mechanisms. This study analyzed 66 Bcd-dependent regulatory elements, and their boundaries were shown to be positioned primarily by repressive gradients that antagonize Bcd-mediated activation. A major repressor is the pair-rule protein Runt (Run), which is expressed in an opposing gradient and is necessary and sufficient for limiting Bcd-dependent activation. Evidence is presented that Run functions with the maternal repressor Capicua and the gap protein Krüppel as the principal components of a repression system that correctly orders boundaries throughout the anterior half of the embryo. These results put conceptual limits on the Bcd morphogen hypothesis and demonstrate how the Bcd gradient functions within the gene network that patterns the embryo (Chen, 2012).

This study identified 32 enhancers that respond to Bcd-dependent activation and form expression boundaries at different positions along the AP axis of fly embryos. Adding these elements to the 34 previously known enhancers constitutes the largest data set of in vivo-tested and -confirmed enhancers regulated by a specific transcription factor in all of biology (Chen, 2012).

The 32 confirmed enhancers were identified among 77 tested genomic fragments, which were selected because they showed in vivo-binding activity, or they conformed to a stringent homotypic-clustering model for predicted Bcd-binding sites, or both. All seven previously unknown fragments showing in vivo binding and a predicted site cluster directed Bcd-dependent transcription in the early embryo. Other fragments from the top 50 ChIP-Chip signals (which do not conform to the clustering model) were also very likely (21 of 26) to test positive in the in vivo test, but this likelihood drops significantly (9 of 25) in a set of fragments from lower on the list of ChIP-Chip fragments. Interestingly, of 19 tested fragments that contain clusters of predicted sites, but no in vivo binding activity, not a single one tested positive in vivo. These results suggest that in vivo binding assays are much better predictors of regulatory function than simple site-clustering algorithms alone (Chen, 2012).
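The per-class counts quoted above can be turned into hit rates to make the comparison between selection criteria explicit. The sketch below is a simple tabulation, not a reanalysis. The class labels are paraphrases, and it does not try to reconcile the per-class counts with the overall figure of 32 confirmed enhancers.

```python
# Counts quoted in the text (Chen, 2012): (positive in vivo, fragments tested).
CLASSES = {
    "in vivo binding + predicted site cluster": (7, 7),
    "top 50 ChIP-Chip signals, no cluster": (21, 26),
    "lower-ranked ChIP-Chip fragments": (9, 25),
    "predicted site cluster, no in vivo binding": (0, 19),
}


def hit_rate(positive, tested):
    """Fraction of tested fragments that drove Bcd-dependent expression."""
    return positive / tested


total_tested = sum(tested for _, tested in CLASSES.values())
print(f"fragments tested: {total_tested}")
for name, (positive, tested) in CLASSES.items():
    print(f"{name}: {positive}/{tested} = {hit_rate(positive, tested):.0%}")
```

The four classes add up to the 77 tested fragments, and the hit rate falls from 100% to roughly 81%, 36% and 0% as the evidence for in vivo binding weakens.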

One explanation for the failure of these predicted site clusters to bind Bcd in vivo is that they lie in heterochromatic regions of the genome that prevent site access. However, because they fail to function when taken out of their normal context (in reporter genes), whatever is preventing activation must be a property of the fragment itself and not its location in the genome. Interestingly, a number of Bcd site cluster-containing fragments drive expression later in development. It is proposed that these fragments fail to bind Bcd because they lack sites for cofactors that facilitate Bcd binding. In preliminary experiments it was observed that Bcd-activated fragments contain on average more binding sites for the ubiquitous activator protein Zelda (Zld) than those that fail to activate. Zld has been shown to be critical for timing the zygotic expression of hundreds of genes in the maternal to zygotic transition (Chen, 2012).

These results suggest strongly that a gradient of Run protein plays a major role in limiting Bcd-dependent activation. Run seems to work as part of a repression system that also includes Cic and possibly Kr. Expression boundaries in the region anterior to the presumptive cephalic furrow shift toward the posterior in run and cic mutants, and the double mutant causes boundaries that are normally well separated to collapse into a single position (Chen, 2012).
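One way to picture how opposing repressors could order boundaries is a toy calculation. The sketch below is not the analysis used in the study. The exponential Bcd profile, the sigmoidal Run and Cic profiles, the rule that an enhancer is active wherever Bcd exceeds a basal requirement plus weighted repression, and every numerical parameter (`LAMBDA`, `BASAL`, the weights and midpoints) are assumptions chosen only for illustration.

```python
import math

LAMBDA = 0.2  # assumed Bcd length constant (fraction of egg length)
BASAL = 0.05  # assumed Bcd level needed when no repressor is present


def bcd(x):
    """Toy Bcd activator level; x = 0 is the anterior pole, x = 1 the posterior."""
    return math.exp(-x / LAMBDA)


def run(x):
    """Toy Run repressor level, rising toward the posterior (assumed shape)."""
    return 1.0 / (1.0 + math.exp(-(x - 0.3) / 0.05))


def cic(x):
    """Toy Cic repressor level, low at the anterior pole (assumed shape)."""
    return 1.0 / (1.0 + math.exp(-(x - 0.15) / 0.05))


def boundary(w_run, w_cic, run_on=True, cic_on=True, steps=1000):
    """Most anterior position where Bcd no longer exceeds total repression."""
    for i in range(steps + 1):
        x = i / steps
        repression = BASAL
        if run_on:
            repression += w_run * run(x)
        if cic_on:
            repression += w_cic * cic(x)
        if bcd(x) <= repression:
            return x
    return 1.0


# Two hypothetical enhancers with different repressor sensitivities.
ENHANCERS = {"enhancer A": (1.0, 1.0), "enhancer B": (1.0, 0.0)}  # (w_run, w_cic)
GENOTYPES = {
    "wild type": (True, True),
    "run mutant": (False, True),
    "cic mutant": (True, False),
    "run cic double mutant": (False, False),
}

for genotype, (run_on, cic_on) in GENOTYPES.items():
    positions = {
        name: boundary(w_run, w_cic, run_on, cic_on)
        for name, (w_run, w_cic) in ENHANCERS.items()
    }
    print(genotype, positions)
```

In this toy setting the two hypothetical enhancers form distinct boundaries when both repressors are present, each boundary stays put or moves posteriorly when a repressor is removed, and both collapse onto the same position when Run and Cic are removed together, qualitatively matching the behavior described above.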

The use of multiple repressors permits flexibility in binding site architecture within enhancers that establish boundaries at similar positions. For example type I enhancers show overrepresentations of both Run and Cic sites, but 27% lack strong matches to the Cic PWM, and 12% lack strong matches to the Run PWM. Importantly, however, all type I enhancers lacking Cic sites contain Run sites, and those lacking Run sites contain Cic sites. Multiple Kr sites were observed in a large number of Bcd-dependent enhancers, which suggests that Kr is also a major component of the repression system that orders Bcd-dependent expression boundaries. Taken together, these data suggest that antagonistic repression of Bcd-mediated activation is a key design principle of the system that organizes the AP body plan. The repressors identified so far (Run, Cic, and Kr) are expressed in overlapping domains with gradients at different positions, consistent with the formation and ordering of a relatively large number of boundaries throughout the anterior half of the embryo (Chen, 2012).
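The percentages given for type I enhancers imply how many carry strong matches to both repressor motifs. The sketch below is a short inclusion-exclusion calculation. It assumes that the 27% and 12% figures refer to the same set of type I enhancers and reads the statement that enhancers lacking one site type contain the other as meaning that no enhancer lacks both.

```python
# Percentages of type I enhancers quoted in the text (Chen, 2012).
PCT_LACKING_CIC = 27  # no strong match to the Cic PWM
PCT_LACKING_RUN = 12  # no strong match to the Run PWM
PCT_LACKING_BOTH = 0  # assumed from the text: none lack both site types


def pct_with_both(lack_a, lack_b, lack_both):
    """Inclusion-exclusion: 100 minus the share lacking at least one site type."""
    lacking_either = lack_a + lack_b - lack_both
    return 100 - lacking_either


print(pct_with_both(PCT_LACKING_CIC, PCT_LACKING_RUN, PCT_LACKING_BOTH))
```

Under these assumptions about 61% of type I enhancers carry strong matches to both the Run and Cic PWMs, while the remainder rely on a strong match to only one of them.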

The close linkage between repressor sites and Bcd sites within discrete enhancers suggests that repression occurs via short-range interactions that interfere directly with Bcd binding or activation. Interestingly, Cic also shows repressive effects that seem to be binding site independent. For example some type I enhancers do not contain recognizable Cic sites, but their expression boundaries expand posteriorly in cic mutants. This could be caused by the reduced expression of run and Kr in cic mutants. However, genetically removing both Kr and run causes a less dramatic expansion than that seen in the absence of cic. This suggests that Cic binds these enhancers via suboptimal sites or that it is required for the correct patterning of another unknown repressor. Another possibility is that these expansions are caused indirectly by changing the balance of MAPK phosphorylation events that control terminal patterning (Chen, 2012).

These results do not strictly falsify the Bcd morphogen hypothesis, but they support the idea that the Bcd gradient can establish only a 'rough framework that is elaborated by the interaction of the zygotic segmentation genes'. What is the nature of this framework, and what role does it play in the network that precisely positions target gene boundaries (Chen, 2012)?

One component of the system, the Cic repression gradient, is maternally produced and formed by downregulation at the poles via the terminal patterning system. This gradient is formed independently of Bcd but is critical for establishing boundaries of Bcd-dependent target genes. In contrast, Bcd is involved in activating the expression patterns of run and Kr and in repressing them in anterior regions. Both run and Kr expand anteriorly in bcd mutants. There is no evidence that Bcd functions directly as a transcriptional repressor, so these repressive activities are probably indirect. Previous work showed that the Bcd target gene gt is involved in setting the anterior Kr boundary, and it is hypothesized that another Bcd target gene, slp1, encodes a forkhead domain (FKH) protein that sets the anterior boundary of the early run pattern. slp1 is expressed in a pattern reciprocal to the run pattern and was previously shown to position the anterior boundaries of several pair-rule gene stripes including run stripe 1 (Chen, 2012).

These results suggest that a major function of the Bcd gradient is the differential positioning of two repressors, Slp1 and Gt, which set the positions of the Run and Kr repression gradients, which then feed back to repress Bcd-dependent target genes. How are slp1 and gt differentially positioned? One possibility is that slp1 and gt enhancers respond to specific concentrations within the Bcd gradient, consistent with the original model for morphogen activity. However, the fact that the slp1 and gt expression domains form boundaries at the same positions in embryos lacking the Cic and Run repressors argues against this model for these genes (Chen, 2012).

It was also shown that Bcd target genes normally expressed in cephalic regions form and correctly position posterior boundaries in embryos containing flattened Bcd gradients. Run is still expressed in these embryos, specifically in a domain that consistently abuts the boundaries of the anterior Bcd target genes, regardless of copy number. This suggests that a mutually repressive interaction between Slp1 and Run is maintained in these embryos but does not explain how these boundaries are consistently oriented perpendicularly to the AP axis. The answer might lie in the fact that the flattened Bcd gradients in these embryos are not completely flat but are present as shallow gradients with slightly higher levels in anterior regions. In these embryos the slight changes in concentration along the AP axis might cause a bias that enables the orientation of the mutual repression interaction. In wild-type embryos, Bcd is much more steeply graded, which makes this bias stronger and the boundary between these mutual repressors more robust (Chen, 2012).

These results suggest that antagonistic repression precisely orders Bcd-dependent expression boundaries. However, repression may not be required for the activity of all morphogens. For example the extracellular signal activin has been shown to activate target genes in a threshold-dependent manner in isolated animal caps from frog embryos. Also, a gradient of the transcription factor Dorsal (Dl) is critical for setting boundaries between different tissue types along the dorsal-ventral (DV) axis of the fly embryo. It is thought that the major mechanism in Dl-specific patterning is threshold-dependent activation, which is quite different from the system described in this paper. One major difference between Bcd and Dl is the number of boundaries specified: three for Dl and more than ten for Bcd. It is proposed that the robust ordering of more boundaries simply requires a more complex system (Chen, 2012).

In general, though, it seems that antagonistic mechanisms are involved in controlling the establishment or interpretation of most morphogen activities. For example in the Drosophila wing disc, the TGF-β signal Dpp forms an activity gradient that is refined by interactions with multiple extracellular factors. Also, in vertebrates the signaling activity of the extracellular morphogen Sonic hedgehog (Shh) is affected by positive and negative interactions with specific molecules on the surfaces of receiving cells (Chen, 2012).

There is some evidence that transcriptional repression is also used for refining the patterning activities of extracellular molecules. Dpp acts as a long-range morphogen that activates two major target genes (optomotor blind [omb] and spalt [sal]) in nested patterns with boundaries at different positions with respect to the source of Dpp. Although these boundaries could in theory be formed by differential responses to the morphogen, it is clear that the transcriptional repressor Brinker (Brk), which is expressed in an oppositely oriented gradient, also plays an important role. The Brk gradient is itself positioned by Dpp activity in a manner analogous to positioning of the Run and Kr repressor gradients by Bcd. Also, a similar transcriptional network functions in Shh-mediated patterning of the vertebrate neural tube, where a series of spatially oriented repressors feeds back to limit the expression boundaries of Shh-mediated cell fate decisions (Chen, 2012).

Conceptually, these more complex systems are reminiscent of the reaction-diffusion model proposed by Turing, in which a localized activator would activate a repressor, which would diffuse more rapidly than the activator, and feed back on its activity. These systems strongly suggest that the patterning activity of a single monotonic gradient is insufficiently robust for establishing precise orders of closely positioned expression boundaries. By integrating gradients with repressive mechanisms that refine gradient shape or influence outputs, systems are generated that ensure consistency in body plan establishment while still maintaining the flexibility required for complex systems to evolve (Chen, 2012).

Transcription factor binding affinities and DNA shape readout

An essential event in gene regulation is the binding of a transcription factor (TF) to its target DNA. Models considering the interactions between the TF and the DNA geometry proved to be successful approaches to describe this binding event, while conserving data interpretability. However, a direct characterization of the DNA shape contribution to binding is still missing due to the lack of accurate and large-scale binding affinity data. This study used a recently established binding assay to measure with high sensitivity the binding specificities of 13 Drosophila TFs, including dinucleotide dependencies to capture non-independent amino acid-base interactions. Correlating the binding affinities with all DNA shape features, this study found that shape readout is widely used by these factors. A shape readout/TF-DNA complex structure analysis validates this approach while providing biological insights, such as the finding that positively charged or highly polar amino acids often contact nucleotides that exhibit strong shape readout (Schnepf, 2020).

The binding of transcription factors (TFs) to specific DNA sequences is a key event for the regulation of gene expression. The features defining a binding site have been the focus of several decades of research starting from simple consensus motif binding sites, later replaced by probabilistic models of TF binding assuming that each base contributes independently to the overall affinity, the so-called position-specific weight matrices (PWMs). With the advent of high-throughput methods, binding specificities became available for thousands of TFs and it has become clear that more complex models for binding sites using non-independent nucleotide interactions lead to more accurate predictions than PWMs. Nucleotide correlations can originate from amino acids that contact multiple bases simultaneously or from stacking interactions that determine binding through DNA shape readout. Hence, although determining binding specificities is crucial to predict binding sites in the genome, such data alone are not sufficient to fully describe TF-DNA binding interactions as they do not provide insights about the mechanism the TF employs to bind to different DNA sequences. To elucidate how the TF 'reads' the DNA is of paramount importance not only to improve algorithms predicting binding sites but also to refine fundamental understanding of how TFs are recruited to specific DNA regulatory sequences. To date, two distinct modes of protein-DNA recognition are known: base readout, which reflects the interplay at nucleobase-amino acid contacts mainly driven by the formation of hydrogen bonds, and shape readout, dominated by van der Waals interactions and electrostatic potentials (EPs), that recognizes the 3D structure of the DNA double helix. As a consequence, one can assume that, if the TF uses the shape readout, models incorporating DNA structural information should improve prediction of TF-DNA binding specificities. 
To test this hypothesis and thereby help model development, it would thus be highly desirable to (1) determine accurately TF-DNA binding specificities, including non-independent nucleotide interactions since deviations from linear binding can carry information about the influence of DNA shape, and (2) use these data to assess the contribution of DNA shape readout to the binding interaction. Despite the availability of techniques able to measure protein-DNA interactions at high throughput such as protein binding microarray (PBM), SELEX-seq, and SMiLE-seq, the accurate measurement of binding affinities remains problematic. Moreover, these methods require a resin- or filter-based selection step that introduces bias and/or use stringent washing protocols resulting in the loss of weak binders, which can lead to erroneously over-specific binding specificities. These limitations are critical, especially to determine higher-order binding interactions, which are intrinsically weak (Schnepf, 2020).
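The independence assumption behind a PWM, as described above, can be made concrete with a minimal sketch. The energy matrix below is hypothetical (arbitrary units, 0 for the preferred base at each position) and is not measured data for any TF; the names `PWM`, `pwm_energy` and `best_site` are illustrative only.

```python
# Hypothetical per-position energy penalties (arbitrary units, 0 = preferred base).
# These numbers are made up for illustration; they are NOT measured values.
PWM = [
    {"A": 1.2, "C": 1.5, "G": 1.8, "T": 0.0},
    {"A": 0.0, "C": 2.0, "G": 1.7, "T": 1.1},
    {"A": 0.0, "C": 1.9, "G": 2.2, "T": 1.4},
    {"A": 1.6, "C": 0.9, "G": 1.3, "T": 0.0},
]


def pwm_energy(site, pwm=PWM):
    """Score a site by summing independent per-position contributions."""
    if len(site) != len(pwm):
        raise ValueError("site length must match the number of PWM columns")
    return sum(column[base] for column, base in zip(pwm, site))


def best_site(pwm=PWM):
    """Return the sequence with the lowest summed energy under the PWM."""
    return "".join(min(column, key=column.get) for column in pwm)


if __name__ == "__main__":
    print(best_site(), pwm_energy(best_site()))
    # Under a pure PWM, a double mutant costs exactly the sum of its single mutants.
    print(pwm_energy("GCAT"), pwm_energy("GAAT") + pwm_energy("TCAT"))
```

Because every position is scored independently, a pure PWM cannot represent the non-additive effects that arise when neighboring bases jointly influence binding, which is why the higher-order measurements discussed in this section are needed.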

Evaluating the contribution to binding of DNA shape readout also poses challenges. First, although it has been known for a long time from crystal structures that TFs read out the DNA shape, it is still not possible to determine experimentally the DNA shape features at a large scale for any given DNA sequence. However, this would be necessary to quantitatively assess DNA shape influence on TF-DNA binding. This issue has been tackled by Zhou, who introduced 'DNAShape' (Zhou, 2013), an algorithm that predicts structural DNA features from nucleotide sequences, considering at each DNA position a local 5-mer nucleotide environment. The original set of four geometric shape features was later completed by Li (2017), who made tables available to calculate an expanded repertoire of 13 DNA shape features in total. Finally, Chiu (2017) added in a comparable fashion the EP, which approximates the minor-groove EPs. The EP reflects the mean charge density of the DNA backbone sensed by positively charged amino acid residues of the binding protein. Another difficulty in analyzing the influence of DNA shape on binding is that, in spite of all the advances made possible by 'DNAShape' and the succeeding studies, it is still not clear to what degree shape readout can be described as a function of the underlying DNA sequence. It is indeed very difficult to tease apart whether a binding protein favors a given nucleotide sequence because it recognizes certain bases of this sequence or rather certain shape features of the DNA helix. An important step was made with homeodomain TFs by Abe (2015), who was able to specifically remove the ability of the binding proteins to read a certain structural feature of DNA and to switch between different modes of DNA shape readouts. Another approach computationally dissects TF binding specificity in terms of base and shape readout (Rube, 2018). 
Remarkably, that study determined that 92-99% of the variance in the shape features can be explained with a model considering only dinucleotide dependencies. That study also found that interactions were much stronger between neighboring nucleotides than for non-adjacent positions, indicating that these dinucleotide features are the most important for binding. Hence, determining neighboring dinucleotide dependencies should be enough to capture most of the higher-order binding interactions. Unfortunately, although these studies shed new light on the role of DNA shape in TF-DNA recognition, they were limited to the analysis of only a few factors and used only four different shape features. This was due to the lack of quantitative data on higher-order binding specificities and to the lack of tables to calculate other shape features. Thus, a more comprehensive analysis of TF-DNA binding - especially including higher-order dependencies - is urgently needed to better understand TF-DNA binding in general and to what extent DNA shape features are recognized by TFs in particular. Recently, high-performance fluorescence anisotropy (HiP-FA) (Jung, 2018; Jung, 2019) was presented as a method that determines TF-DNA binding energies directly in solution with high sensitivity and at a large scale and allows for measuring the affinity of a TF to any given DNA sequence. These features make HiP-FA an ideal tool to measure TF-DNA binding specificities, in particular the higher-order dependencies since these interactions are generally weak and their accurate measurement is both difficult and indispensable. This study used HiP-FA to measure binding energies for 13 TFs of the Drosophila segmentation gene network belonging to 8 different binding domain families. 
Their 0th order of binding specificities were determined taking only into account independent base contributions (PWM) and their first order of binding specificities accounting for dinucleotide dependencies represented by the dinucleotide position weight matrices (DPWMs). This work defines DPWMs as being the scoring matrices characterizing the deviations in the dinucleotide binding energies compared to pure PWMs (Schnepf, 2020).
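The DPWM definition given above (the deviation of dinucleotide binding energies from what a pure PWM predicts) can be sketched as a simple subtraction. This is an illustration under stated assumptions, not the analysis code of Schnepf (2020): the dictionary layout, the function name `build_dpwm` and all energy values are hypothetical.

```python
def build_dpwm(single, double):
    """Return dinucleotide deviations from additivity.

    single[(pos, base)]         -> energy change of a single mutation at pos
    double[(pos, base1, base2)] -> energy change of the double mutation at
                                   the adjacent positions pos and pos + 1
    Each DPWM entry is: measured double-mutant energy minus the sum of the
    two single-mutant energies (the pure-PWM prediction).
    """
    dpwm = {}
    for (pos, b1, b2), measured in double.items():
        additive = single[(pos, b1)] + single[(pos + 1, b2)]
        dpwm[(pos, b1, b2)] = measured - additive
    return dpwm


# Hypothetical energies in arbitrary units; NOT measured data.
SINGLE = {(0, "G"): 1.8, (1, "C"): 2.0, (1, "G"): 1.7}
DOUBLE = {(0, "G", "C"): 3.1, (0, "G", "G"): 3.5}

if __name__ == "__main__":
    for key, deviation in build_dpwm(SINGLE, DOUBLE).items():
        print(key, round(deviation, 3))
```

An entry near zero means the two positions behave additively, as a PWM assumes; a clearly non-zero entry flags a dinucleotide dependency of the kind that may reflect shape readout.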

Correlating the affinity data with the 13 known DNA shape features and the EP, it was found that nearly all the factors extensively use shape readout for DNA recognition, independently of the binding domain family. For 11 TFs for which structural information is available, the correlations were examined between their nuclear magnetic resonance (NMR)/co-crystal structures or structures of analog proteins obtained by homology-based modeling and the shape attributes obtained from this analysis. Finally, a cluster analysis was run to test if certain shape features tend to co-occur in the DNA shape readout used by these TFs (Schnepf, 2020).

Correlation between DNA shape readout and structural information is presented for homeodomain proteins Bicoid, Goosecoid and Ocelliless, for the bZip transcription factor Giant, and for the zinc finger transcription factor GATAe (see Correlation between DNA shape readout and structural information) (Schnepf, 2020).

HiP-FA constitutes a powerful tool to quantify TF-DNA binding specificity, especially the non-independent interactions requiring to be determined with high accuracy. The throughput of the method is not sufficient to discover de novo shape motifs or to explore the large sequence space possible with sequencing-based methods like HT-SELEX or SMiLE-seq. However, this is not a major limitation since the prior knowledge that HiP-FA requires (some information about the TF's binding preferences) is known for many TFs, and dinucleotide mutations are sufficient to cover most of the non-independent amino acid-nucleotide interactions. It would also be straightforward to extend the measurements in the flanking regions of the core binding motif (Schnepf, 2020).

By directly combining TF-DNA binding affinities, DNA shape features, and structural information, this study gained insights into their correlation, a debated topic due to their intrinsic covariation. Importantly, the results suggest that DNA shape readout is widespread among the TFs. The extended use of DNA shape readout by TFs has become increasingly apparent over the past years, which comes as no surprise considering that the van der Waals interactions enabling shape readout account for two-thirds of the protein-DNA interactions (Rube, 2018). The correlation analysis of the shape readout values with protein-DNA complex structures leads to a generalization of the influence of the charged amino acids on the shape readout that has been described so far only for homeodomains in the minor groove region of the DNA. This effect is attributed to other secondary structures (such as α-helices) and to other binding domains. In addition, for the POU domain Nub non-charged but polar residues are described that can also lead to a strong DNA shape readout. These effects on DNA shape readout have not been reported previously. The difficulty in detecting the effects of charged and non-charged residues, especially in the major groove, is that they are obscured by the interactions involved in the base readout. This analysis was able to resolve even subtle effects due to the high sensitivity of the binding affinity measurements, and the shape analysis was able to deconvolve, to some extent, shape from base readout. In summary, this study determined the binding specificities of 13 Drosophila TFs including first-order dependencies, provided insights into the correlation between their binding affinities to DNA and the shape features of the DNA helix, and gave structural insights into the shape readout. This method could easily be extended to more factors and to different organisms to provide a refined catalog of TF-DNA shape readout landscapes (Schnepf, 2020).

Although the HiP-FA assay allows determination of accurate binding affinities at a relatively large scale, the whole sequence space cannot be covered as high-throughput methods do. To restrict the number of measurements, this study thus focussed on the core binding motif of the TFs, and on all mononucleotide and dinucleotide mutations of the consensus sequence rather than all possible mutations. This should however cover most of the TF-DNA interactions since it has been shown that dinucleotide models explain >92% of the variance for the MGW, ProT, Roll, and HelT shape features (Rube, 2018). In addition, this analysis based on the direct correlation between binding affinities and shape features can only indirectly and partially tease apart the respective contributions of base and DNA shape readouts. Note that how to achieve the deconvolution between base and shape readouts is a longstanding issue in the field (Schnepf, 2020).

Synthetic reconstruction of the hunchback promoter specifies the role of Bicoid, Zelda and Hunchback in the dynamics of its transcription

For over 40 years, the Bicoid-hunchback (Bcd-hb) system in the fruit fly embryo has been used as a model to study how positional information in morphogen concentration gradients is robustly translated into step-like responses. A body of quantitative comparisons between theory and experiment has since questioned the initial paradigm that the sharp hb transcription pattern emerges solely from diffusive biochemical interactions between the Bicoid transcription factor and the gene promoter region. Several alternative mechanisms have been proposed, such as additional sources of positional information, positive feedback from Hb proteins or out-of-equilibrium transcription activation. By using the MS2-MCP RNA-tagging system and analysing in real time the transcription dynamics of synthetic reporters for Bicoid and/or its two partners Zelda and Hunchback, this study showed that all the early hb expression pattern features and temporal dynamics are compatible with an equilibrium model with a short decay length Bicoid activity gradient as a sole source of positional information. Meanwhile, Bicoid's partners speed up the process by different means: Zelda lowers the Bicoid concentration threshold required for transcriptional activation while Hunchback reduces burstiness and increases the polymerase firing rate (Fernandes, 2022).

Recently, synthetic approaches have been used to understand how the details of gene regulation emerge from the plethora of binding sites for transcription factors buried in genomes. In developmental systems, these approaches are starting to help us unravel the evolution of gene regulatory modules. In many cases, using high-throughput analysis of systematically mutagenized regulatory sequences, expression was measured through synthesis of easily detectable fluorescent proteins, RNA sequencing or antibody or FISH staining on fixed samples. Even though these approaches allowed screening for a high number of mutated sequences with a very high resolution (single nucleotide level), the output measurements remained global and it was hard to capture the temporal dynamics of the transcription process itself. In addition, because effects of single mutations are frequently compensated by redundant sequences, it remained often difficult from these studies to highlight the mechanistic roles of the TF they bind to. This work combined the MS2 tagging system, which allows for a detailed measurement of the transcription process dynamics at high temporal resolution, with an orthogonal synthetic approach focusing on a few cis-regulatory elements with the aim of reconstructing from elementary blocks most features of hb regulation by Bcd. The number and placement of TF BS in the MS2 reporters are not identical to those found on the endogenous hb promoter and the number of combinations tested was very limited when compared to the high throughput approaches mentioned above. Nevertheless, this synthetic approach combined with quantitative analyses and modeling sheds light on the mechanistic steps of transcription dynamics (polymerase firing rate, bursting, licensing to be ON/OFF) involving each of the three TFs considered (Bcd, Hb, and Zld). 
Based on this knowledge from synthetic reporters and the known differences between them, an equilibrium model of transcription regulation was built that agrees with the data from the hb-P2 reporter expression (Fernandes, 2022).

Expression from the Bcd-only synthetic reporters indicates that increasing the number of Bcd BS from 6 to 9 shifts the transcription pattern boundary position toward the posterior region. This is expected as an array with more BS will be occupied faster with the required amount of Bcd molecules. Increasing the number of Bcd BS from 6 to 9 also strongly increases the steepness of the boundary indicating that cooperativity of binding, or more explicitly a longer time to unbind as supported by the model fitting, is likely to be at work in this system. In contrast, adding three more BS to the 9 Bcd BS has very limited impact, indicating that either Bcd molecules bound to the more distal BS may be too far from the TSS to efficiently activate transcription or that the system is saturated with a binding site array occupied with 9 Bcd molecules. In the anterior with excess Bcd, the fraction of time when the loci are active at steady state also increases when adding 3 Bcd BS from B6 to B9. By assuming a model of transcription activation by Bcd proteins bound to target sites, the activation rate increases by a much greater fold (~4.5 times) than the number of BS (1.5-2 times), suggesting a synergistic effect in transcription activation by Bcd (Fernandes, 2022).

The burstiness of the Bcd-only reporters in regions with saturating amounts of Bcd led to the construction of a model in two steps. The first step of this model accounts for the binding/unbinding of Bcd molecules to the BS arrays. It is directly related to the positioning and the steepness of the expression boundary and thus to the measurement of positional information. The second step of this model accounts for the dialog between the bound Bcd molecules and the transcription machinery. It is directly related to the fluctuation of the MS2 signals including the number of firing RNAP at a given time (intensity of the signal) and bursting (frequency and length of the signal). Interestingly, while the first step of the process is achieved with an extreme precision (10% EL), the second step reflects the stochastic nature of transcription and is much noisier. This model therefore also helps to understand and reconcile this apparent contradiction in the Bcd system (Fernandes, 2022).
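The bursting behaviour attributed to the second step can be illustrated with a generic two-state (telegraph) promoter that switches stochastically between OFF and ON. This is a swapped-in textbook model used here only as a sketch; the exact form of the model fitted by Fernandes (2022) is not given in this summary, and the rate names `k_on` and `k_off` and their values are hypothetical.

```python
import random


def simulate_on_fraction(k_on, k_off, t_end, seed=0):
    """Simulate a two-state promoter and return the fraction of time spent ON.

    OFF -> ON occurs at rate k_on, ON -> OFF at rate k_off; dwell times in
    each state are drawn from exponential distributions.
    """
    rng = random.Random(seed)
    t, is_on, time_on = 0.0, False, 0.0
    while t < t_end:
        rate = k_off if is_on else k_on
        dwell = min(rng.expovariate(rate), t_end - t)
        if is_on:
            time_on += dwell
        t += dwell
        is_on = not is_on
    return time_on / t_end


def steady_state_on_fraction(k_on, k_off):
    """Analytical steady-state ON fraction of the two-state model."""
    return k_on / (k_on + k_off)


def mean_burst_duration(k_off):
    """Mean length of an ON period (a burst) in the two-state model."""
    return 1.0 / k_off


if __name__ == "__main__":
    k_on, k_off = 1.0, 3.0  # hypothetical rates, arbitrary time units
    print(simulate_on_fraction(k_on, k_off, t_end=20000.0))
    print(steady_state_on_fraction(k_on, k_off), mean_burst_duration(k_off))
```

In such a model, lowering `k_off` lengthens bursts and raises the fraction of time a locus is active, which is one simple way to picture a factor that reduces burstiness without itself carrying positional information.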

As predicted by an original theoretical model, 9 Bcd BS in a synthetic reporter appear sufficient to reproduce experimentally almost entirely the spatial features of the early hb expression pattern, i.e. measurements of positional information. This is unexpected as the hb-P2 promoter is supposed to only carry 6 Bcd BS and leaves open the possibility that the number of Bcd BS in the hb promoter might be higher. Alternatively, it is also possible that even though it contains 9 Bcd BS, the B9 reporter can only be bound simultaneously by fewer than 9 Bcd molecules. This possibility must be considered if, for instance, the binding of a Bcd molecule to one site prevents the binding of another Bcd molecule to another nearby site (direct competition or steric hindrance). Even though this possibility cannot be excluded, it is thought to be unlikely for several reasons: (1) some of the Bcd binding sites in the hb-P2 promoter are also very close to each other and the design of the synthetic constructs was made by multimerizing a series of 3 Bcd binding sites with a similar spacing as found for the closest sites in the hb-P2 promoter; (2) the binding of Bcd or other homeodomain containing proteins to two BS is generally increased by cooperativity when the sites are close to each other (as close as two base pairs for the paired homeodomain) compared to binding without cooperativity when they are separated by five base pairs or more (Fernandes, 2022).

Importantly, even though it is not really known if the B9 and the hb-P2 promoter contain the same number of effective Bcd BS, the B9 reporter which solely contains Bcd BS recapitulates most spatial features of the hb-P2 reporter, clearly arguing that Bcd on its own brings most of the spatial (positional) information to the process. Interestingly, the B9 reporter is however much slower (2-fold) to reach the final boundary position than the hb-P2 reporter. This suggested that other maternally provided TFs binding to the hb-P2 promoter contribute to fast dynamics of the hb pattern establishment. Among these TFs, this study focused on two known maternal partners of Bcd: Hb which acts in synergy with Bcd and Zld, the major regulator of early zygotic transcription in fruit fly. Interestingly, adding Zld or Hb sites next to the Bcd BS array reduces the time for the pattern to reach steady state and modifies the promoter activity in different ways: binding of Zld facilitates the recruitment of Bcd at low concentration, making transcription more sensitive to Bcd and faster to initiate, while the binding of Hb strongly affects both the activation/deactivation kinetics of transcription (burstiness) and the RNAP firing rate. Thus, these two partners of Bcd contribute differently to Bcd-dependent transcription. Consistent with an activation process in two steps as proposed in this model, Zld will contribute to the first step favoring the precise and rapid measurements of positional information by Bcd without bringing itself positional information. Meanwhile, Hb will mostly act through the second step by increasing the level of transcription through a reduction of its burstiness and an increase in the polymerase firing rate. Interestingly, both Hb and Zld binding to the Bcd-dependent promoter speed up the establishment of the boundary, a property that Bcd alone is not able to achieve. 
Of note, the hb-P2 and Z2B6 reporters contain the same number of BS for Bcd and Zld but they nonetheless have very different boundary positions and mean onset times of transcription T0 following mitosis when Bcd is limiting. This is likely due to the fact that the two Zld BS in the hb-P2 promoter are not fully functional: one of the Zld BS is a weak BS while the other Zld BS has the sequence of a strong BS but is located too close to the TATA Box (5 bp) to provide full activity (Fernandes, 2022).

Zld functions as a pioneer factor by potentiating chromatin accessibility, transcription factor binding and gene expression of the targeted promoter. Zld has recently been shown to bind nucleosomal DNA and proposed to help establish or maintain cis-regulatory sequences in an open chromatin state ready for transcriptional activation. In addition, Zld is distributed in nuclear hubs or microenvironments of high concentration. Interestingly, Bcd has also been shown to be distributed in hubs even at low concentration in the posterior of the embryo. These Bcd hubs are Zld-dependent and harbor a high fraction of slow moving Bcd molecules, presumably bound to DNA. Both properties of Zld, binding to nucleosomal DNA and/or the capacity to form hubs with increased local concentration of TFs, can contribute to reducing the time required for the promoter to be occupied by enough Bcd molecules for activation. In contrast to Zld, knowledge on the mechanistic properties of the Hb protein in the transcription activation process is much more elusive. Hb synergizes with Bcd in the early embryo and the two TFs contribute differently to the response with Bcd providing positional and Hb temporal information to the system. Hb also contributes to the determination of neuronal identity later during development. Interestingly, Hb is one of the first expressed members of a cascade of temporal TFs essential to determine the temporal identity of embryonic neurons in neural stem cells (neuroblasts) of the ventral nerve cord. In this system, the diversity of neuronal cell-types is determined by the combined activity of TFs specifying the temporal identity of the neuron and spatial patterning TFs, often homeotic proteins, specifying its segmental identity. How spatial and temporal transcription factors mechanistically cooperate for the expression of their target genes in this system is not known. 
The current work indicates that Hb is not able to activate transcription on its own but that it strongly increases RNAP firing probability and burst length of a locus licensed to be ON. Whether this capacity will be used in the ventral nerve cord and shared with other temporal TFs would be interesting to investigate (Fernandes, 2022).

The Bcd-only synthetic reporters also provided an opportunity to scrutinize the effect of Bcd concentration on the positioning of the expression domain boundaries. This question has been investigated with endogenous hb in the past, always giving a smaller shift than expected given the decay length of 20% EL for the Bcd protein gradient and arguing against the possibility that positional information in this system could solely be dependent on Bcd concentration. When comparing the transcription patterns of the B9 reporter in Bcd-2X flies and Bcd-1X flies, a shift was detected of ~10.5 ± 1% EL of the boundary position. This shift revealed a gradient of Bcd activity with an exponential decay length of ~15 ± 1.4% EL (~75 μm), significantly smaller than the value observed directly (20% EL, ~ 100 μm) with immuno-staining for the Bcd protein gradient but closer to the value of 16.4% EL obtained with immuno-staining for Bcd of the Bcd-GFP gradient. Given the discrepancies of previous studies concerning the measurements of the Bcd protein gradient decay length, this work calls for a better quantification to determine how close the decay length of the Bcd protein gradient is to the decay length of the Bcd activity gradient uncovered here. This work opens the possibility that the effective decay length of 15% EL corresponds to a population of 'active' or 'effective' Bcd distributed in a steeper gradient than the Bcd protein gradient observed by immunodetection, which would include all Bcd molecules. Bcd molecules have been shown to be heterogenous in intranuclear motility, age and spatial distributions but to date, it is not known which population of Bcd can access the target gene and activate transcription. The existence of two (or more) Bicoid populations with different mobilities obviously raises the question of the underlying gradient for each of them. Also, the dense Bcd hubs persist even in the posterior region where the Bcd concentration is low. 
As the total Bcd concentration decreases along the AP axis, these hubs accumulate Bcd with increasing proportion in the posterior, resulting in a steeper gradient of free-diffusing Bcd molecules outside the hubs. In addition, the gradient of newly translated Bcd was also found to be steeper than the global gradient. Finally and most importantly, reducing by half the Bcd concentration in the embryo induced a similar shift in the position of the hb-P2 reporter boundary as that of the Bcd-only reporters. This further argues that this gradient of Bcd activity is the principal and direct source of positional information for hb expression (Fernandes, 2022).
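The link between the boundary shift and the decay length reported above follows from simple arithmetic, sketched here under two assumptions that are stated in the summary only implicitly: the activity gradient is exponential, c(x) = c0 * exp(-x / lambda), and the boundary forms at a fixed concentration threshold. Halving c0 then moves the boundary by lambda * ln(2). The function names are hypothetical.

```python
import math


def decay_length_from_shift(shift_percent_el, dose_ratio=2.0):
    """Decay length (in % EL) implied by a boundary shift between two doses.

    Assumes c(x) = c0 * exp(-x / lam) and a fixed threshold at the boundary,
    so that shift = lam * ln(dose_ratio).
    """
    return shift_percent_el / math.log(dose_ratio)


def expected_shift(decay_length_percent_el, dose_ratio=2.0):
    """Boundary shift (in % EL) expected for a given decay length."""
    return decay_length_percent_el * math.log(dose_ratio)


if __name__ == "__main__":
    # Shift of ~10.5% EL between Bcd-2X and Bcd-1X, as quoted in the text.
    print(round(decay_length_from_shift(10.5), 1))
    # Shift expected if the decay length were 20% EL.
    print(round(expected_shift(20.0), 1))
```

With the quoted shift of ~10.5% EL this gives a decay length of about 15.1% EL, in line with the ~15% EL reported, whereas a 20% EL decay length would predict a shift of about 13.9% EL.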

The effective Bcd gradient found here rekindles the debate on how a steep hb pattern can be formed in the early nuclear cycles. With the previous value of λ=20% EL for the decay length of the Bcd protein gradient, the Hill coefficient inferred from the fraction of loci's active time at steady state PSpot is ~6.9, beyond the theoretical limit of the equilibrium model of Bcd interacting with six target BS of the hb promoter. This led to hypotheses of energy expenditure in Bcd binding and unbinding to the sites, out-of-equilibrium transcription activation, hb promoters containing more than 6 Bcd sites or additional sources of positional information to overcome this limit. The effective decay length λeff ~15% EL, found here with a Bcd-only reporter but also hb-P2, corresponds to a Hill coefficient of ~5.2, just below the physical limit of an equilibrium model of concentration sensing with 6 Bcd BS alone. Of note, a smaller decay length also means that the effective Bcd concentration decreases faster along the AP axis. In the Berg & Purcell limit (Biophys. J., 1977), the time needed to achieve the measurement error of 10% at the hb-P2 expression boundary with λ=15% EL is ~2.1 times longer than with λ=20% EL. This points again to the trade-off between reproducibility and steepness of the hb expression pattern and reinforces the importance of Hb and Zelda in speeding up the process (Fernandes, 2022).
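The relation between the two Hill coefficients quoted above can be reproduced with a short calculation. The sketch assumes a Hill-type response P = c^H / (c^H + K^H) read out from an exponential gradient c(x) = c0 * exp(-x / lambda); under these assumptions the spatial profile is a logistic curve whose slope at the boundary is H / (4 * lambda), so for a fixed observed steepness the inferred H scales linearly with the assumed decay length. Function names are hypothetical.

```python
def hill_from_boundary_slope(slope_per_percent_el, decay_length_percent_el):
    """Hill coefficient implied by the spatial slope of the pattern at its boundary.

    For P(x) = 1 / (1 + exp(H * (x - x_b) / lam)), |dP/dx| at x_b is H / (4 * lam).
    """
    return 4.0 * decay_length_percent_el * abs(slope_per_percent_el)


def rescale_hill(hill_old, decay_length_old, decay_length_new):
    """Re-express a Hill coefficient for a different assumed decay length,
    keeping the observed spatial steepness of the pattern unchanged."""
    return hill_old * decay_length_new / decay_length_old


if __name__ == "__main__":
    # Values quoted in the text: H ~ 6.9 with lambda = 20% EL.
    print(round(rescale_hill(6.9, 20.0, 15.0), 2))
```

Rescaling the quoted value of ~6.9 from a 20% EL to a 15% EL decay length gives about 5.2, matching the coefficient reported for the effective gradient and placing it below the limit of 6 expected for six binding sites at equilibrium.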

Hunchback activates Bicoid in Pair1 neurons to regulate synapse number and locomotor circuit function

Neural circuit function underlies cognition, sensation, and behavior. Proper circuit assembly depends on the identity of the neurons in the circuit (gene expression, morphology, synapse targeting, and biophysical properties). Neuronal identity is established by spatial and temporal patterning mechanisms, but little is known about how these mechanisms drive circuit formation in postmitotic neurons. Temporal patterning involves the sequential expression of transcription factors (TFs) in neural progenitors to diversify neuronal identity, in part through the initial expression of homeodomain TF combinations. This study addresses the role of the Drosophila temporal TF Hunchback and the homeodomain TF Bicoid in the assembly of the Pair1 (SEZ_DN1) descending neuron locomotor circuit, which promotes larval pausing and head casting. Both Hunchback and Bicoid are expressed in larval Pair1 neurons, Hunchback activates Bicoid in Pair1 (opposite of their embryonic relationship), and the loss of Hunchback function or Bicoid function from Pair1 leads to ectopic presynapse numbers in Pair1 axons and an increase in Pair1-induced pausing behavior. These phenotypes are highly specific, as the loss of Bicoid or Hunchback has no effect on Pair1 neurotransmitter identity, dendrite morphology, or axonal morphology. Importantly, the loss of Hunchback or Bicoid in Pair1 leads to the addition of new circuit partners that may underlie the exaggerated locomotor pausing behavior. These data are the first to show a role for Bicoid outside of embryonic patterning and the first to demonstrate a cell-autonomous role for Hunchback and Bicoid in interneuron synapse targeting and locomotor behavior (Lee, 2022).

Neural circuit formation underlies the generation of behavior, and aberrant neural circuit development has been associated with many neural disorders, such as autism and attention deficit hyperactivity disorder. It is widely accepted that circuit formation requires the assembly of precise interconnectivity between diverse neuron subtypes. Although the mechanisms for generating molecularly and morphologically distinct neurons are well studied, little is known about how these developmental mechanisms regulate 'higher-order' neuronal properties such as pre- and post-synapse numbers or circuit partner choice (Lee, 2022).

In Drosophila, neuronal identity is specified by the combination of spatial and temporal transcription factors (TFs) acting on neuronal stem cells (neuroblasts in Drosophila). Spatial patterning creates molecularly distinct neuroblasts, followed by each neuroblast sequentially expressing a series of temporal TFs beginning with Hunchback > Kruppel. Temporal TFs are known to specify axon and dendrite morphology and targeting as well as behavior. For example, in neuroblast 7-1, the best characterized lineage in the embryo, the zinc-finger temporal TF Hunchback promotes expression of the homeodomain TF even-skipped that is required for proper motor neuron morphology and connectivity; and the combination of Kruppel and Pdm temporal TFs promotes expression of the homeodomain TF Nkx6 (FlyBase: HGTX) that is required for proper ventral projecting motor neuron morphology and connectivity. In both cases, transient temporal TF expression activates a homeodomain TF that persists in the postmitotic neuron to determine neuron morphology and neuromuscular connectivity. Similarly, work from the Hobert lab in C. elegans supports a model in which each of the 302 neurons is specified by a unique combination of homeodomain TFs. Overall, from worms to flies to mammals, temporal TFs activate homeodomain TFs to specify molecular and morphological neuronal identity (Lee, 2022).

Although homeodomain TFs are well known to specify these early aspects of motor neuron identity, their role in specifying later aspects of neuronal identity such as synapse number, position, and connectivity remains poorly understood. To address this question, the Pair1 (SEZ_DN1) locomotor circuit in Drosophila was used. Pair1 is a GABAergic interneuron with ipsilateral dendrites and contralateral descending axonal projections. The moonwalker descending neurons (MDN) provide inputs to Pair1, and Pair1 sends outputs to A27h neurons in the ventral nerve cord (VNC). When optogenetically activated, the Pair1 neurons induce a pause in forward locomotion and increase in head casting, in part by inhibiting the A27h neurons, which drive forward locomotion. Importantly, it was previously reported that the temporal TF Hunchback and the homeodomain TF Bicoid are expressed in Pair1 neurons throughout life, providing candidates to study the transcriptional regulation of Pair1 neuronal identity and connectivity (Lee, 2022).

Hunchback is the first temporal TF to be expressed in the Drosophila embryo and acts transiently to generate early born neurons. In the embryonic CNS, Hunchback is not required to maintain neuronal identity, although it is required to maintain proper dendrite morphology of the mAL interneuron in adult males. Bicoid is a homeodomain TF; however, its expression and function outside the early embryo had not been reported until recent work from this lab. Bicoid is well known to form an anterior-posterior morphogen gradient that directly activates hunchback to properly pattern the anterior-posterior body axis. Although the role of Hunchback in temporal patterning is conserved in mammals, Bicoid is found only in higher dipteran insects, making it an interesting contributor to insect evolution. This study tested the model that the temporal TF Hunchback activates the homeodomain TF Bicoid (opposite of their early embryo relationship) and whether Hunchback and Bicoid play a role in Pair1 neurotransmitter expression, neuron morphology, synapse number, circuit function, and behavior. The data support the emerging model that temporal TFs drive expression of homeodomain TFs that maintain distinct aspects of neuronal identity including synapse number/position, connectivity, and behavior (Lee, 2022).

The results show that Hunchback activates Bicoid in postmitotic Pair1 neurons, where it regulates specific and important aspects of neuronal identity: synapse number, synapse density, and connectivity. When Hunchback or Bicoid levels are decreased, synapse density is increased, with a corresponding disruption of the function of the Pair1 locomotor neural circuit. This work demonstrates a novel role for Hunchback and Bicoid, which function postmitotically to regulate synapse number and to ensure proper circuit function. Importantly, this work also reproduces a phenotype previously seen in C. elegans, in which a single homeobox gene (unc-4) specifically regulates synaptic connectivity but not other aspects of neuronal identity. Interestingly, unc-4 expression is also regulated by a nonhomeodomain TF, suggesting that this regulatory pathway may be conserved between species to specify highly specific aspects of neuronal identity (Lee, 2022).

Unlike most early born neurons in the VNC, which express Hunchback only transiently, and unlike the early embryo, where Bicoid is expressed only in the first few hours of embryogenesis, the Pair1 neuron maintains both Hunchback and Bicoid expression into the adult. This suggests that a Pair1-specific regulatory mechanism may underlie the persistent Hunchback and Bicoid expression and function. Given that the Pair1 neuron persists into adulthood, still expresses Hunchback, and functions within a similar locomotor neural circuit, it is hypothesized that Hunchback and Bicoid expression may be required in Pair1 neurons throughout life for the maintenance of the Pair1 locomotor neural circuit (Lee, 2022).

Surprisingly, Bicoid protein expression in larval Pair1 neurons was often detected in one or more spherical puncta located in the cytoplasm; this was observed with two independent Bicoid antibodies and a third FLAG-tagged Bicoid protein and was abolished by Bicoid RNAi. Given that Bicoid contains highly disordered regions with an abundance of glutamine and glycine, the spherical puncta may represent a phase-separation condensate, perhaps to keep nuclear Bicoid levels low. Interestingly, Bicoid does not form spherical puncta outside of the larvae. Further investigation is needed to understand the nature of the Bicoid cytoplasmic puncta, but these studies have the potential to elucidate a novel role for phase separation in mature neurons (Lee, 2022).

Previous work showed that Bicoid activates hunchback in the early embryo. This study is the first to demonstrate the reverse: that Hunchback can promote Bicoid expression in vivo. Hunchback may regulate Bicoid directly or indirectly; supporting the former possibility are the findings that Hunchback protein binds two distinct regions at the 3' and 5' ends of the bicoid locus. Alternatively, Hunchback may act indirectly by promoting Bicoid phase separation in larval neurons. Regardless, this finding supports the initial hypothesis that temporal TFs, like Hunchback, can activate homeodomain TFs, like Bicoid, to specify some or all aspects of neuronal identity. Other morphogens have been previously associated with establishing properties of neuronal identity, further suggesting that early developmental TFs may be important regulators of neuronal identity, connectivity, and circuit function in general (Lee, 2022).

Hunchback and Bicoid had no detectable role in regulating dendrite morphology, axon morphology, or GABA expression, all key aspects of Pair1 neuronal identity. However, both Hunchback and Bicoid are required for maintaining synapse number and functional connectivity of the Pair1 neuron. Trans-Tango experiments show that reduced Hunchback levels resulted in the addition of new synaptic partners of Pair1, although it cannot be excluded that these may be normal partners that are too weak to see in controls. Although the novel neuronal partners were not formally identified, the Drosophila larval TEM volume was used to speculate that Pair1 could be synapsing with the A27h neurons located in the thoracic region. Given that A27h neurons are involved in forward locomotion, additional thoracic A27h neurons synapsing onto Pair1, and therefore being inhibited by Pair1 activation, could explain the increased pausing phenotype observed when Hunchback is knocked down in Pair1. Alternatively, abdominal A27h neurons could be forming more synapses with Pair1 in the posterior axonal regions (Lee, 2022).

Interestingly, it appears that Bicoid is not the only homeodomain TF functioning downstream of Hunchback in Pair1. When Hunchback is knocked down in Pair1, pausing speed is increased, head casting is increased, and recovery speeds are decreased. However, Bicoid knockdown only replicated the decreased recovery speed phenotype; this suggests that another homeodomain TF may be functioning downstream of Hunchback to regulate pausing speed and head casting. The data presented in this study begin to support this hypothesis, but additional work is needed to identify other homeodomain TFs functioning downstream of Hunchback (Lee, 2022).

This work is the first to demonstrate a role for Hunchback and Bicoid in postmitotic neurons to regulate synapse number, connectivity, and circuit function. These results raise the question of which is the more ancestral function of these two TFs: in segmentation, temporal patterning in neuroblasts, or postmitotic neuronal circuit maintenance (Lee, 2022)?


GENE STRUCTURE

bicoid is located in the Antennapedia complex between Deformed and zerknüllt.
cDNA clone length - 2.6 kb

Bases in 5' UTR - 169

Exons - four

Bases in 3' UTR - 825


PROTEIN STRUCTURE

Amino Acids - 494

Structural Domains

Bicoid has an N-terminal region, consisting of alternating histidines and prolines, and a central homeodomain. Following the homeodomain there is a region of repetitive glutamines known as an OPA repeat (Berleth, 1988). The homeodomain has no more than 40% homology to any other known homeodomain protein (Frigerio, 1986).

The maternal gene bicoid (bcd) determines pattern in the anterior half of the Drosophila embryo. It is reported here that injecting bcd mutant embryos with messenger RNAs that encode proteins consisting of heterologous acidic transcriptional activating sequences fused to the DNA-binding portion of the bcd gene product can completely restore the anterior pattern of the embryo (Driever, 1989).

Bicoid is a molecular morphogen, controlling embryonic patterning in Drosophila. It is a homeodomain-containing protein that activates specific target genes during early embryogenesis. A domain of Bcd located outside its homeodomain has been identified and referred to as a self-inhibitory domain; this domain can dramatically repress Bicoid's ability to activate transcription. Evidence is presented that the self-inhibitory function is evolutionarily conserved. A systematic analysis of this domain reveals a composite 10-amino acid motif with interdigitating residues that regulate Bcd activity in opposite manners. Mutations within the Bcd motif can exert their respective effects when the self-inhibitory domain is grafted to an entirely heterologous activator, but they do not affect DNA binding in vitro or subcellular localization of Bcd in cells. It is further shown that the self-inhibitory domain of Bcd can interact with Sin3A, a component of the histone deacetylase co-repressor complex. This study suggests that the activity of Bcd is intricately controlled by multiple mechanisms involving the actions of co-repressor proteins (Zhao, 2005).

The solution structure of the homeodomain of the Drosophila morphogenic protein Bicoid (Bcd) complexed with a TAATCC DNA site is described. Bicoid is the only known protein that uses a homeodomain to regulate translation, as well as transcription, by binding to both RNA and DNA during early Drosophila development; in addition, the Bcd homeodomain can recognize an array of different DNA sites. The dual functionality and broad recognition capabilities signify that the Bcd homeodomain may possess unique structural/dynamic properties. Bicoid is the founding member of the K50 class of homeodomain proteins, containing a lysine residue at the critical 50th position (K50) of the homeodomain sequence, a residue required for DNA and RNA recognition; Bcd also has an arginine residue at the 54th position (R54), which is essential for RNA recognition. Bcd is the only known homeodomain with the K50/R54 combination of residues. The Bcd structure indicates that this homeodomain conforms to the conserved topology of the homeodomain motif, but exhibits a significant variation from other homeodomain structures at the end of helix 1. On the consensus TAATCC DNA site, both side-chains make direct and water-mediated contacts to bases in the DNA (ATTAGG for R54, and TAATCC/ATTAGG for K50). A key result is the observation that the side-chains of the DNA-contacting residues K50, N51 and R54 all show strong signs of flexibility in the protein-DNA interface. This finding is supportive of the adaptive-recognition theory of protein-DNA interactions (Baird-Titus, 2005).
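To make the strand notation above concrete, the following minimal sketch shows how the TAATCC consensus site relates to its complementary strand (ATTAGG when read base by base, GGATTA when read 5' to 3') and how such a site could be located on either strand of a sequence. The helper names and the example sequence are hypothetical and chosen only for illustration; they do not come from the study.

```python
# Base-pairing table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Return the paired bases in the same left-to-right order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Return the opposite strand read 5' to 3'."""
    return complement(seq)[::-1]

def find_sites(sequence, site="TAATCC"):
    """List (position, strand) for each exact match of the site.

    Positions are 0-based on the given (top) strand; "-" marks a match
    to the reverse complement, i.e. a site on the opposite strand.
    """
    sequence = sequence.upper()
    rc = reverse_complement(site)
    hits = []
    for i in range(len(sequence) - len(site) + 1):
        window = sequence[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits

# Hypothetical sequence with one site on each strand.
example = "GCTAATCCAAGGATTAGC"
hits = find_sites(example)
print(complement("TAATCC"), reverse_complement("TAATCC"), hits)
```

Exact string matching is used here only to illustrate the notation. As the text notes, the Bcd homeodomain recognizes an array of different DNA sites, so a realistic search would need a scoring scheme rather than a single fixed hexamer.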



date revised:  20 December 2023 

Home page: The Interactive Fly © 1995, 1996 Thomas B. Brody, Ph.D.

The Interactive Fly resides on the
Society for Developmental Biology's Web server.