Dept. of Biochemistry, Dept. of Biological Sciences

Halfon Lab

Rotation students welcome!

Genomic Approaches to Elucidating Developmental Regulatory Networks

My laboratory investigates the genetic regulatory circuitry responsible for assigning cell fates during development, using the Drosophila embryonic mesoderm as our primary model system. Our work combines genomics and bioinformatics with the traditional molecular and genetic techniques of Drosophila research to investigate two key components of developmental regulatory networks: intercellular signaling and transcriptional regulation. This combination of in silico and in vivo approaches enables us not only to make predictions but also to validate them within specific biological contexts. Our approaches are broadly applicable to genomes other than that of Drosophila, including the human genome.

Current research in the laboratory falls into two main areas: (a) discovery and characterization of transcriptional cis-regulatory modules (CRMs), and (b) mechanisms of specificity for receptor tyrosine kinase (RTK) signaling. Together, the results of our studies will provide insight into gene regulation, genome structure, intercellular signaling, and the regulatory networks that govern embryonic development.

Regulatory Networks

Gene expression is controlled by the binding of transcription factors to specific cis-regulatory elements. In higher eukaryotes, these elements can lie 5' to, 3' to, or within introns of a gene; in some cases, they can even be found within protein-coding sequences. Spatial and temporal aspects of gene expression are often controlled in a modular fashion, with individual cis-regulatory elements (termed "modules" or "enhancers") regulating expression at a particular time and place. An emerging theme is that a specific combination of transcription factors activated as a result of intercellular signaling binds a regulatory module in conjunction with tissue-specific transcription factors ("selectors"), forming a "transcriptional code" that regulates the expression of a given gene (see Figure 1).

Defining cis-regulatory elements

cis-Regulatory modules (CRMs) are critical nodes in developmental regulatory networks, as it is here that signaling pathways and transcription factors are integrated to give rise to changes in the expression of specific genes. Mutations within CRMs have been implicated in a number of diseases, underscoring the importance of being able to identify and characterize them. However, CRM identification has traditionally been difficult, relying on a trial-and-error approach using the non-coding DNA flanking the gene of interest. We are using a number of computational approaches to locate, rapidly and comprehensively, the cis-regulatory modules responsible for directing specific patterns of gene expression. Our focus in particular has been on elements that modulate expression in the progenitors of the Drosophila embryonic muscles.

Our basic strategy is to empirically define at least one model enhancer in detail, identifying the majority of the transcription factors that bind to it. We then search the entire genome for other regions that contain a compact cluster of the same binding sites. This approach is effective, although it has a high false positive rate (i.e., many incorrect predictions). We are finding that a considerable gain in accuracy is achieved by incorporating comparative genome data from additional Drosophila species, as true regulatory modules tend to be well conserved over the course of evolution. All of our predictions are extensively tested in vivo using reporter gene assays in the fly embryo, so that we can definitively assess our success rate and refine our approach to achieve better performance.

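The genome search for compact clusters of binding sites described above can be illustrated with a minimal Python sketch. It is not the laboratory's actual software: the function names, the three motif patterns, and the window size are hypothetical placeholders, motifs are assumed to be expressible as simple regular expressions, and only one strand is scanned. The sketch reports regions in which sites for a required number of distinct factors fall within a fixed-width window.

```python
import re

def find_sites(sequence, motifs):
    """Return sorted (start, end, factor) tuples for every motif match.

    `motifs` maps a factor name to a regular expression (hypothetical
    patterns; real binding sites are usually modeled less rigidly).
    """
    hits = []
    for factor, pattern in motifs.items():
        # A lookahead with a capture group reports overlapping matches.
        for m in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((m.start(1), m.end(1), factor))
    return sorted(hits)

def find_clusters(sequence, motifs, window=500, min_factors=None):
    """Return merged (start, end) regions in which sites for at least
    `min_factors` distinct factors lie within `window` bp of each other.

    By default every factor in `motifs` must be represented.
    """
    if min_factors is None:
        min_factors = len(motifs)
    hits = find_sites(sequence, motifs)
    clusters = []
    left = 0
    for right in range(len(hits)):
        # Advance the left edge until the span of sites fits in the window.
        while left < right and hits[right][1] - hits[left][0] > window:
            left += 1
        in_window = hits[left:right + 1]
        if len({factor for _, _, factor in in_window}) >= min_factors:
            start = in_window[0][0]
            end = max(e for _, e, _ in in_window)
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous region: extend it.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], end))
            else:
                clusters.append((start, end))
    return clusters

if __name__ == "__main__":
    # Toy sequence with placeholder motifs; "N" is masked filler.
    toy_motifs = {"factorA": "ACGTGA", "factorB": "TTGCAA", "factorC": "CCGGAT"}
    toy_seq = ("N" * 50 + "ACGTGA" + "N" * 20 + "TTGCAA" + "N" * 20
               + "CCGGAT" + "N" * 1000 + "ACGTGA")
    print(find_clusters(toy_seq, toy_motifs, window=200))
```

A scan of this kind will report many regions that are not functional enhancers, which is consistent with the high false positive rate noted above. One possible extension would be to filter the reported regions by their conservation in other Drosophila species.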
In addition to this basic strategy, we are exploring other computational and empirical approaches to characterizing cis-regulatory elements. These include combining phylogenetic footprinting with sub-sequence profiling ("word" counts) as a way to identify functionally related enhancers, developing high-throughput ways to test predictions in vivo, and incorporating data from transcriptional profiling experiments (see below).

Although our primary focus has been on CRM discovery, much can be learned from studying already-known CRMs using bioinformatics approaches. However, these studies are significantly hindered by the absence of readily available data on large numbers of CRMs. To address this shortcoming, we have constructed the REDfly database of published Drosophila CRMs. This database contains more than 600 CRMs associated with over 200 genes, along with their sequences and the expression patterns for which they are responsible. Computational analysis of this collection will allow us to discover previously unrecognized transcription factor binding sites, as well as to begin to explore the "grammar" of CRMs: how differences in the order and spacing of individual binding sites affect the overall functioning of the module. We can address issues such as the extent to which clustering of binding sites is important for enhancer activity, how modular versus nested or interrelated regulatory elements tend to be, and how subtle differences in structure and sequence affect enhancer activity.

Downstream responses to intercellular signaling

To characterize developmental regulatory networks, we must also understand how upstream intercellular signaling events establish the transcriptional codes that act at the cis-regulatory elements. To this end, we have used DNA microarrays to determine the downstream target genes of signaling pathways in the embryo. By marking subpopulations of cells with GFP or cell-surface markers, we are able to isolate RNA from cells of interest in wild-type as well as in loss- and gain-of-function genetic backgrounds. We are extending these studies to look at the downstream effects of combinations of signaling pathways, to better understand the combinatorial nature of intercellular signaling. In addition to the microarray studies, we make extensive use of real-time quantitative RT-PCR and of whole-mount RNA in situ hybridization to visualize gene expression patterns.

An important yet unresolved issue in microarray experiments is how best to analyze the data. Although numerous algorithms have been proposed for normalizing arrays, producing gene expression summaries, and statistically validating the results, it is unclear which of these methods work best. The main difficulty in determining this has been the lack of a comprehensive data set in which all of the RNAs being hybridized to the array, and their relative concentrations, are known. We have created such a set of over 4000 RNAs and hybridized them to Affymetrix arrays with relative concentration differences ranging from 1- to 4-fold. Using this data set, we have assessed various methods of analysis to determine which give results that most accurately reflect the known input. In the future, we will repeat these experiments using other microarray platforms and analysis methods. The data sets should be of considerable value in benchmarking the performance of both existing and newly developed microarray analysis algorithms. These results are described in Choe et al. (2005), listed below.

We have been particularly interested in receptor tyrosine kinase (RTK) signaling pathways, including the receptors Heartless (a fibroblast growth factor (FGF) receptor homolog) and Egfr (an epidermal growth factor (EGF) receptor homolog), which play important roles in establishing mesodermal cell fates. Although the FGF and EGF receptors are often believed to act via identical downstream signaling cascades, our data suggest that there are significant points of divergence within these pathways. Thus, these studies will contribute to our knowledge of RTK pathway regulation as well as identify additional genes important for mesodermal development.

Selected Recent Publications

Halfon, M. S., Gallo, S. M. and Bergman, C. M. (2007). REDfly 2.0: an integrated database of cis-regulatory modules and transcription factor binding sites in Drosophila. Nucleic Acids Research, doi:10.1093/nar/gkm876.

Li, L., Zhu, Q., He, X., Sinha, S. and Halfon, M. S. (2007). Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses. Genome Biology 8:R101.

Halfon, M. S. (2006). (Re)modeling the transcriptional enhancer. Nat Genet 38(10):1102-1103.

Choe, S. E., Boutros, M., Michelson, A. M., Church, G. M. and Halfon, M. S. (2005). Preferred analysis methods for Affymetrix GeneChips revealed by a wholly-defined control dataset. Genome Biology 6:R16.

Grad, Y., Roth, F. P., Halfon, M. S. and Church, G. M. (2004). Prediction of similarly-acting cis-regulatory modules by subsequence profiling and comparative genomics in D. melanogaster. Bioinformatics 20:2738-2750.

Halfon, M. S., Grad, Y., Church, G. and Michelson, A. M. (2002). Computation-based discovery of related transcriptional regulatory modules and motifs using a combinatorial model. Genome Res. 12:1019-1028.

Halfon, M. S. and Michelson, A. M. (2002). Exploring genetic regulatory networks in metazoan development: methods and models. Physiological Genomics 10:131-143.

Halfon, M. S., Carmena, A., Gisselbrecht, S., Sackerson, C. M., Jiménez, F., Baylies, M. K. and Michelson, A. (2000). Ras Pathway Specificity Is Determined by the Integration of Multiple Signal-Activated and Tissue-Restricted Transcription Factors. Cell 103:63-74.