2003 — 2006 |
Petrov, Dmitri |
N/A |
Population Analysis of All Transposable Elements in the Sequenced Drosophila Genome
The project will determine the population prevalence of all ~1500 transposable elements ("jumping genes") found in the sequenced genome of the fruit fly (Drosophila melanogaster). The work will take advantage of the known location of all transposable elements in the D. melanogaster genome to design specific tools to estimate the prevalence of each copy in populations. Unusually frequent transposable elements will be identified and assayed for potential functional roles.
Transposable elements are parasitic genes that persist by actively multiplying within genomes. As a result they are the most genetically active component of many eukaryotic genomes. For example more than 90% of the human genome consists of old transposable elements. In Drosophila more than half of all mutations are caused by transposable elements. This project will be the first to comprehensively investigate population variability of transposable elements in any eukaryotic genome. It will shed light on the forces that maintain these parasitic genes in limited numbers, and the consequences they have for genome evolution. It will help determine the frequency with which transposable elements generate advantageous, adaptive changes and will identify these for further study. This project will also provide key information for annotation of the sequenced Drosophila genome.
|
1 |
2006 — 2010 |
Petrov, Dmitri A |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Patterns of Background Nucleotide Substitution in the Human Lineage
DESCRIPTION (provided by applicant): Knowledge of the patterns and rates of background substitution is essential for the identification and analysis of functional sequences in the human genome. Given this knowledge, it should be possible to identify functional sequences by comparing two or more genomes. Such sequences will stand out as those that have changed either significantly less or significantly more than expected under the estimated rates of background substitution. Despite this central importance, patterns of background substitution are poorly known. Questions of whether rates of background substitution vary across the human genome and whether these rates have been evolving in the human lineage remain controversial. Major difficulties lie in identifying sequences that evolve under no functional constraint and also in devising methods of inference given the existence of a rapid neighbor-dependent CpG to TpG/CpA transition prevalent in mammalian DNA. The high rate and the neighbor-dependence of this process substantially complicate all inferences of substitution, even those of single-nucleotide substitutions at non-CpG sites. This project will utilize a new maximum likelihood method capable of simultaneous inference of the rates of CpG to TpG/CpA transition and of the rates of single-nucleotide substitution significantly beyond the point of naive saturation. This method will be applied to the abundant sequences of dead copies of transposable elements in the human and other mammalian genomes deposited over the last 200-300 million years. This analysis will provide essential new information regarding the evolution of patterns of substitution in mammalian genomes, will create fine-scale (1-5 Mbp) genomic maps of substitution patterns and rates of the human and other mammalian genomes, and will investigate genomic determinants of background substitution patterns in mammals.
|
1 |
2010 — 2013 |
Petrov, Dmitri |
R01 |
Population Genomics of Adaptive Transposition in Drosophila
DESCRIPTION (provided by applicant): The project will discover and investigate most of the transposable elements that have been adaptive during or after the migration of Drosophila melanogaster out of Africa. Transposable elements comprise a ubiquitous, extremely active and abundant part of eukaryotic genomes. Upward of 50% of the human genome, for example, is composed of transposable elements. Transposable elements are responsible for a large fraction of mutations of every type, from subtle regulatory mutations to gross genomic rearrangements. Recently, we have shown that transposable elements have been responsible for a large number of adaptive mutations in D. melanogaster. Here we will identify these TEs (estimated 30-60 in total) via computational work with wholly sequenced genomes of a number of D. melanogaster strains and via PCR measurements of the frequency of these TEs in a number of North American and sub-Saharan African populations. For the TEs that are likely to be adaptive we will carry out tests of population differentiation across environmental gradients and will also test for the presence of signatures of positive selection in the flanking genomic sequences. We will determine the timing of the spread of these TEs using several population genetic approaches and will determine experimentally their effect on the regulation of neighboring genes. We will also use the functional annotation and molecular evolutionary and population genetic analyses of the genes adjacent to putatively adaptive transposable elements to shed light on which kinds of genes and processes are impacted by positive natural selection. Finally, we will use a panel of 192 isogenic strains, which are being fully sequenced and phenotyped for a large number of traits, to carry out association mapping of the phenotypic effects of all putatively adaptive and neutral TEs. We expect to detect consistent phenotypic effects for multiple independent adaptive mutations.
In this way, we will not only identify multiple adaptive TE-derived mutations, describe their population genetic and evolutionary history, and investigate their molecular effects, but should also provide an insight into a key question in evolutionary biology -- which traits positive natural selection has been acting upon in the history of a species? PUBLIC HEALTH RELEVANCE: Transposable elements constitute much of the human genome and the genomes of other organisms. They are also the most dynamic part of genomes, commonly generating genetic aberrations that are often deleterious and cause disease. Understanding transposable element behavior is essential for understanding genomic and genetic determinants of human health.
|
1 |
2012 — 2015 |
Petrov, Dmitri; Schmidt, Paul (co-PI) |
R01 |
Adaptation in 6 Dimensions
DESCRIPTION (provided by applicant): Over the last 10,000 years Drosophila melanogaster and D. simulans have spread through the world in the wake of human migration. As a consequence, these now cosmopolitan species have been exposed to a variety of novel environments and habitats. For flies living in temperate environments, winter cold represents a novel environment and these species have evolved in response to cold temperatures in several distinct ways. D. melanogaster has evolved a diapause syndrome that confers extreme lifespan extension, reduction in metabolic rate, and elevated stress tolerance; although diapause increases winter survival, individuals able to diapause suffer a fitness disadvantage during the summer and thus this adaptation remains at intermediate frequencies even at high latitudes. D. simulans, on the other hand, appears to have a more modest adaptive response to winter conditions: this species performs well at cool temperatures but does not appear to have the capacity to survive prolonged exposure to the harsh temperate winter. Little is known about other overwintering mechanisms in D. simulans, or indeed any evolutionary response to novel temperate environments. Herein, we propose to identify polymorphisms underlying adaptation to temperate climates in D. melanogaster and D. simulans, link these polymorphisms to function and test hypotheses about the evolutionary origin of these polymorphisms. To do this, we will collect large samples of individuals from both of these species (i) across latitude on the East and West coasts of North America, (ii) through the growing season at several sites near Philadelphia, PA and (iii) along an altitudinal transect in Northern California for four replicate years. First, we will identify polymorphisms that vary in a consistent fashion through time and space through high-throughput sequencing technologies and we will verify changes in allele frequency through pyrosequencing.
Second, we will identify the functional consequences of a subset of these polymorphisms using quantitative genetic and transcriptomic techniques. We hypothesize that many of the polymorphisms we identify will be associated with phenotypes known to vary in a clinal fashion amongst many drosophilid flies. Finally, we will test hypotheses about the evolutionary origin of these polymorphisms by assessing worldwide haplotype diversity at surrounding loci. We hypothesize that most of these adaptive alleles will be subject to soft sweeps, which are characteristic of species with extremely large population sizes or those with large amounts of genetic variation in ancestral populations, such as both D. melanogaster and D. simulans.
|
1 |
2015 — 2019 |
Rosenberg, Noah; Petrov, Dmitri |
N/A |
Collaborative Research: ABI Innovation: Computational Population-Genetic Analysis For Detection of Soft Selective Sweeps
The molecular process of adaptation (the rise in frequency of genetic variants that enable organisms to succeed in their environments) is a central process in evolutionary biology. Surmounting significant challenges such as the ability of infectious agents to evolve resistance to drugs and the ability of crop pests to defeat a diverse array of increasingly powerful insecticides requires an understanding of the nature of adaptation. Recent advances have demonstrated that adaptation often occurs via "soft selective sweeps," in which an adaptive genetic variant originates multiple times or has become favored only after it has been present at a substantial frequency in the population. This project contributes to advancing knowledge of the fundamental evolutionary process of adaptation by developing new computational tools to detect and study the occurrence of adaptation by soft selective sweeps. Through the interactions of a multidisciplinary team spanning evolutionary biology and bioinformatics, the project integrates advances in evolutionary simulation with modern and efficient computational methods in order to produce progress on understanding adaptation, while simultaneously developing efficient computational tools applicable in the modern "big-data" era of inexpensive sequencing. In addition, its joint mentorship efforts from evolutionary and bioinformatics perspectives promote interdisciplinary training of graduate students and postdoctoral scientists.
The project has four objectives: (1) To design new tests for detecting selection in the case in which soft selective sweeps occur from standing genetic variation; (2) To identify haplotypes that carry a beneficial allele in genomic regions known to be experiencing positive selection; (3) To enhance new methods of analysis of natural selection to make them robust to confounding demographic scenarios; (4) To apply new selection methods in a series of data sets from multiple species, including humans, Drosophila, and Plasmodium malaria parasites. The project will use algorithmic techniques from combinatorial optimization and machine learning, and it will exploit ideas from population genetics and coalescent theory. It breaks ground on several fronts, providing a deeper understanding of the patterns in site-frequency spectra and haplotype data as a basis for selection signatures, and assisting in the design of subtyping studies for complex regions of the genome. As it becomes increasingly possible to sequence whole genomes of multiple individuals within a population, the intellectual challenge of designing tools for detecting selection to accommodate new phenomena such as soft sweeps coincides with the computational challenge of incorporating genomic data sets into selection studies. These challenges are addressed by the project, whose results will be available at http://proteomics.ucsd.edu/vbafna/research-2/nsf1458059/.
|
1 |
2015 |
Petrov, Dmitri |
R01 |
High-Resolution Study of Adaptation in Haploid and Diploid Populations of Yeast
DESCRIPTION (provided by applicant): Cancer is a disease of adaptation in which some cells acquire beneficial mutations that allow them to proliferate within the body. Cancer, and adaptation in diploids in general, is driven by beneficial mutations that must be at least partly dominant to be detected by natural selection and might even be commonly overdominant in fitness (i.e. more beneficial as heterozygotes than as homozygotes). It is therefore likely that adaptation in diploids will be driven by a qualitatively different set of mutations than in haploids and might obey a qualitatively different set of rules. In order to understand the dynamics of adaptation in diploids and to contrast it with that of haploids it is necessary to (i) identify a large number of individual beneficial mutations in both haploids and diploids, (ii) determine their molecular nature, and (iii) measure their fitness with high precision in both heterozygotes and homozygotes. Unfortunately this has not been possible due to the difficulty of isolating more than a handful of large-effect beneficial mutations in any system. Here, we will use an ultra-high-resolution barcoding system to uniquely tag hundreds of thousands of yeast cells, making it possible to identify thousands of adaptive mutations in large haploid and diploid yeast populations (~10^8 cells/population). We will identify and measure the fitness, molecular nature, and heterozygous effects of hundreds of distinct beneficial mutations arising in the same environment in haploids and diploids. We will use these data to test theoretical predictions about dominance of adaptive mutations arising and spreading in haploids and diploids and will generate the first detailed joint distribution of molecular identity/fitness benefit/heterozygote effect of several hundred individual adaptive mutations.
We anticipate that the insight gained from this project will (i) inform our understanding of adaptation in a regime - large populations - that is especially relevant for human diseases such as cancer and (ii) reveal the likely qualitatively different ways in which evolution proceeds in haploids and diploids.
|
1 |
2016 — 2018 |
Petrov, Dmitri; Winslow, Monte Meier |
R01 |
A Quantitative Multiplexed Platform For the Pharmacogenomic Analysis of Lung Cancer
PROJECT SUMMARY: Lung cancer is a major health burden, leading to more deaths than the next four major cancer types combined. Despite advances in clinical cancer genome sequencing and the development of many targeted therapies, understanding the relationship of tumor genotype to therapeutic response remains a major obstacle to translating existing drugs into effective cancer treatments in the clinic. Pharmacogenomic analysis of tumor response is often extrapolated from the analysis of patients' tumor responses or modeled using in vitro cultured cell line systems, but investigating the effect of tumor genotype on drug response in cell lines, patient-derived xenograft models, or patients themselves has severe limitations. Genetically-engineered mouse models have emerged as particularly rigorous in vivo systems with which to test early stage oncology therapies and represent tractable models with which to investigate the impact of tumor genotype on therapy response. Current genetically-engineered mouse models are time-consuming, cost-intensive, and have unavoidable technical and experimental variability that has limited their use in translational studies. We have established a novel multiplexed somatic genome-editing approach that will allow the quantification of genotype-specific drug responses. This in vivo approach will increase the precision and scope of translational cancer pharmacogenomics studies. To quantify the effect of tumor suppressor gene inactivation on lung cancer growth, we established a system that combines somatic Cas9-mediated gene inactivation with existing genetically-engineered mouse models to generate ~30 different lung tumor genotypes. To quantify the exact size of each tumor and determine the size distribution of each tumor genotype, we induce tumors with barcoded vectors and use high-throughput sequencing and statistical approaches to determine the number of cancer cells in each tumor.
We will combine our quantitative pooled genome-editing approach with pre-clinical treatments to uncover genotype-specific therapy responses. We will quantify the responses of ~30 different genotypes of tumors to several therapies that have been shown to have genotype-specific effects in lung adenocarcinoma models. This will extend our understanding of the genomic modifiers of treatment responses and define the experimental and statistical parameters to enable the most efficient use of these models for translational studies. Finally, by performing pre-clinical/co-clinical trials for targeted therapies across >30 tumor genotypes in parallel we will generate a pharmacogenomic map connecting lung adenocarcinoma genotype to targeted therapy response. Our ongoing clinical interactions will allow validation of our pharmacogenomic predictions in lung adenocarcinoma patients. This flexible system can incorporate additional tumor suppressors, allows for the investigation of genotype-specific responses to other therapies including immunotherapies, and can be adapted to other cancer types. The techniques described in this proposal are ideally positioned to become a mainstay of pre-clinical/co-clinical trial design.
|
1 |
2016 — 2021 |
Petrov, Dmitri |
R35 Activity Code Description: To provide long term support to an experienced investigator with an outstanding record of research productivity. This support is intended to encourage investigators to embark on long-term projects of unusual potential. |
Genomics of Rapid Adaptation in the Lab and in the Wild
Project Summary: Adaptation is the central concept in evolution and biology in general. I aim to build a quantitative and predictive theory of the adaptive process by focusing on interrelated empirical, computational, and theoretical studies of rapid adaptation in a range of systems. The MIRA grant will support three specific umbrella projects that focus on: (i) inference of the dynamics of adaptation from genomic data, (ii) studies of rapid seasonal adaptation in Drosophila, and (iii) high-throughput studies of adaptive mutation in experimental evolution studies in yeast.
|
1 |
2018 — 2021 |
Petrov, Dmitri; Winslow, Monte Meier |
R01 |
(PQ4) Quantitative and Multiplexed Analysis of Gene Function in Cancer in Vivo
PROJECT SUMMARY: Genome sequencing has catalogued the somatic alterations in human cancers and identified many putative driver genes. However, human cancers generally evolve through the sequential acquisition of multiple genomic alterations and simply identifying recurrent genomic alterations does not necessarily reveal their functional importance to cancer growth. Genetically engineered mouse models have become a mainstay for the analysis of gene function in cancer in vivo; however, the breadth of their utility is limited by the fact that they are neither readily scalable nor sufficiently quantitative. To increase the scope and precision of in vivo cancer modeling, we previously integrated conventional genetically-engineered mouse models, CRISPR/Cas9-based somatic genome engineering, and quantitative genomics with mathematical approaches. We developed methods to inactivate multiple genes in parallel in mouse models of lung cancer using pools of barcoded sgRNA-containing lentiviral vectors. This tumor barcoding with sequencing (Tuba-seq) approach uncovers the size of each tumor, enables the parallel investigation of multiple tumor genotypes in individual mice, and allows the generation of large-scale maps of gene function within autochthonous cancer models. Our preliminary data and novel genetic systems, as well as our dedicated and collaborative team of investigators with expertise in cancer genetics, mouse models, genome-editing, clinical cancer care, and quantitative modeling make us uniquely positioned to conduct these studies. In this proposal, we will extend Tuba-seq to quantify the effect of combinatorial genetic alterations through the development and validation of a platform for the rapid and quantitative analysis of interactions between genetic alterations on tumor growth in vivo.
To enable multiplexed and quantitative analysis of the impact of temporally controlled genomic alterations on cancer cell growth in vivo, we will also develop a system for inducible genome editing in established lung tumors. Finally, we will develop novel in vivo approaches to comprehensively and broadly uncover the gene expression programs in cancer cells of different genotypes in parallel. Through multiplexed in vivo genetic alterations, the effect of putative cancer drivers can be uncovered at an unprecedented scale and resolution. The results of this proposal will be significant because innovative methods for the cost-effective, quantitative, and multiplexed analysis of the genetic determinants of cancer pathogenesis will illuminate novel aspects of tumorigenesis and accelerate our ability to understand cancer evolution, drug responses, and therapy resistance.
|
1 |
2019 — 2021 |
Petrov, Dmitri; Winslow, Monte Meier |
R01 |
Unraveling Mechanisms of Tumor Suppression in Lung Cancer
PROJECT SUMMARY: Genome sequencing has catalogued the somatic alterations in human cancers and identified many putative tumor suppressor genes. However, human cancers generally evolve through the sequential acquisition of multiple genomic alterations and simply identifying recurrent genomic alterations does not necessarily reveal their functional importance to cancer growth. Genetically engineered mouse models uniquely enable the introduction of defined genetic alterations into normal adult cells, which results in the initiation and growth of tumors entirely within their natural in vivo setting. However, the breadth of their utility is limited by the fact that they are neither readily scalable nor sufficiently quantitative. To increase the scope and precision of in vivo cancer modeling, we previously integrated conventional genetically engineered mouse models, CRISPR/Cas9-based somatic genome engineering, and quantitative genomics with mathematical approaches. Tumor barcoding coupled with CRISPR/Cas9-mediated gene inactivation and high-throughput barcode sequencing (Tuba-seq) enables the parallel investigation of multiple tumor genotypes in individual mice and allows the large-scale analysis of pairwise tumor suppressor alterations. In Aim 1, we will employ our multiplexed and quantitative Tuba-seq approach to quantify the impact of inactivating many uncharacterized putative tumor suppressor genes on tumor growth in vivo and across time. This analysis will broaden our understanding of the driving forces of tumorigenesis and uncover the potential clinical meaning of these genomic alterations. In Aim 2, we will uncover epistatic genetic interactions between tumor suppressor genes by generating de novo tumors with pairwise combinations of tumor suppressor alterations. We will generate the first broad-scale functional understanding of the combinatorial effects of genomic alterations within an autochthonous cancer model.
We will uncover the epistatic interactions of these genes and pathways, illuminating novel aspects of tumorigenesis, and potentially highlighting therapeutic vulnerabilities. In Aim 3, we will uncover the molecular programs in cancer cells of different genotypes. To gain insight into how the molecular outputs of single genomic alterations relate to the effects of pairwise alteration, we will also characterize tumors with combined inactivation of cooperative tumor suppressors. This will provide a molecular framework to understand the effects of novel tumor suppressors and uncover the molecular logic that drives the pattern of genomic alterations in human cancer. Our preliminary data, novel genetic systems, and strong collaborative team make us uniquely positioned to conduct these studies. The results of this proposal will be significant because these innovative, multidisciplinary, and highly quantitative approaches will accelerate our understanding of the determinants of cancer growth and will begin the systematic deconvolution of gene function during lung cancer growth in vivo.
|
1 |