One of the main discoveries in biology in the post-genomic era is the realization that the difference between organisms (e.g. mouse and human) is due not only to genetic variation, but mainly to the algorithm or program that controls when, where, and at what level particular genes are expressed. A seminal work (Wilson, M.D. et al., Science 2008) showed that replacing the genomic content of a mouse cell with human DNA (leaving the original mouse proteomic profile intact) resulted in a gene-expression profile that closely resembled that of the human cell. Had the expression profile resembled that of the mouse, the genes or their protein products could have been considered the crucial difference between the two organisms. As a result of this work and others, a new picture is beginning to emerge in which the genome contains not only genes (protein-coding and RNA), but also a complex operating system that determines when, where, and how much each gene is expressed.
In eukaryotic organisms, this program is executed via a host of regulatory sequences termed enhancers. Enhancers are complex regulatory regions that consist of many binding sites for several transcription factors (TFs). Enhancers are assumed to occupy ~10% of the genome (as compared with 1-2% for protein-coding genes), and likely number in the hundreds of thousands in the human genome alone.
While recent high-throughput analysis of enhancers has substantially increased our understanding of particular enhancers, generalizing the findings to other systems has proven difficult. This is because, unlike protein-coding genes, where the vast majority of mutations are detrimental to the protein, nearly every mutation in an enhancer produces some sort of functional change. Deciphering a particular natural enhancer would therefore require millions of mutations in order to collect enough functional changes to develop a generalized mutational model. Instead, we design synthetic enhancers based on the predictions of a numerical simulation, which uses a self-avoiding worm-like chain model to predict the probability of enhancer looping given the presence of particular binding sites. We use the model to make genomic predictions, and use both bioinformatic analysis and genome-editing techniques to test those predictions on real genomes.
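To give a feel for what a looping-probability simulation of this kind involves, here is a minimal Python sketch. It is an illustration only, not the lab's simulation code. It assumes a discretized worm-like chain of beads joined by fixed-length segments, a bending stiffness `kappa` taken as the ratio of persistence length to segment length (a reasonable approximation only when segments are short compared with the persistence length), hard-sphere beads for self-avoidance, and a "looped" state defined as the two chain ends lying within a capture radius of each other. All function names, parameter names, and numerical values are hypothetical, and bound transcription factors are not modelled.

```python
import math
import random


def _cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _next_tangent(t, kappa, rng):
    """Draw a unit tangent with weight exp(kappa * cos(theta)) around t."""
    u = rng.random()
    # Inverse-CDF sampling of cos(theta) for the bending weight above.
    cos_theta = 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * kappa)) / kappa
    cos_theta = max(-1.0, min(1.0, cos_theta))
    sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    # Build an orthonormal pair (e1, e2) perpendicular to t.
    helper = (1.0, 0.0, 0.0) if abs(t[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = _cross(t, helper)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = tuple(c / norm for c in e1)
    e2 = _cross(t, e1)
    return tuple(
        cos_theta * t[i]
        + sin_theta * (math.cos(phi) * e1[i] + math.sin(phi) * e2[i])
        for i in range(3)
    )


def sample_chain(n_segments, seg_len, persistence_len, rng):
    """Return bead positions of one discrete worm-like chain configuration."""
    kappa = persistence_len / seg_len  # assumed stiffness parameter
    t = (0.0, 0.0, 1.0)
    beads = [(0.0, 0.0, 0.0)]
    for k in range(n_segments):
        if k > 0:
            t = _next_tangent(t, kappa, rng)
        last = beads[-1]
        beads.append(tuple(last[i] + seg_len * t[i] for i in range(3)))
    return beads


def is_self_avoiding(beads, bead_diameter):
    """True if no two non-adjacent beads are closer than one bead diameter."""
    for i in range(len(beads)):
        for j in range(i + 2, len(beads)):
            if math.dist(beads[i], beads[j]) < bead_diameter:
                return False
    return True


def looping_probability(n_segments, seg_len, persistence_len,
                        bead_diameter, capture_radius, n_trials, seed=0):
    """Fraction of self-avoiding chains whose ends lie within capture_radius."""
    rng = random.Random(seed)
    accepted = 0
    looped = 0
    for _ in range(n_trials):
        beads = sample_chain(n_segments, seg_len, persistence_len, rng)
        if not is_self_avoiding(beads, bead_diameter):
            continue  # discard overlapping configurations
        accepted += 1
        if math.dist(beads[0], beads[-1]) < capture_radius:
            looped += 1
    return looped / accepted if accepted else 0.0


if __name__ == "__main__":
    # Arbitrary units; the values are illustrative, not fitted to DNA.
    for lp in (1.0, 5.0, 20.0):
        p = looping_probability(20, 1.0, lp, 0.5, 2.0, 2000, seed=1)
        print(f"persistence length {lp}: estimated looping probability {p:.4f}")
```

Simple rejection sampling is used here because it is easy to verify. It becomes inefficient for long or very stiff chains, where looped configurations are rare, and a production simulation would likely need a more sophisticated sampling scheme.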
Relevant publications:
Using synthetic bacterial enhancers to reveal a looping-based mechanism for quenching-like repression
Brunwasser-Meirom, Michal, Yaroslav Pollak, Sarah Goldberg, Lior Levy, Orna Atar, and Roee Amit. Nature Communications (2016).
Building Enhancers from the Ground Up: A Synthetic Biology Approach
Amit, Roee, Hernan G. Garcia, Rob Phillips, and Scott E. Fraser. Cell (2011).