2016 — 2017 |
Lareau, Liana Faye |
R21 Activity Code Description: To encourage the development of new research activities in categorical program areas. (Support generally is restricted in level of support and in time.) |
A Computational Framework For Statistical Analysis of Ribosome Profiling Data @ University of California Berkeley
DESCRIPTION (provided by applicant): The human genome project led to intense efforts to profile gene expression globally to understand the difference between normal and disease states, tissue types, or developmental stages. The ultimate product of gene expression is protein, but researchers have largely been limited to measuring mRNA production as a proxy for protein. However, the translation of mRNA into protein can be highly regulated. Global measurements of protein production thus have enormous potential for understanding human physiology and fulfilling the promise of the human genome project to reveal how the genome encodes normal and disease states. Ribosome profiling is a powerful new technique to measure translation genome-wide by sequencing and counting the fragments of mRNA protected by translating ribosomes. This method has quickly been adopted by many labs, but at present, the data analysis requires computational expertise, and the analysis so far has used ad hoc methods without statistical rigor. This project aims to characterize the statistical properties of ribosomal protein reads, design methods to account for biases in fragment counts, and incorporate these into a maximum likelihood framework for estimating translation of each human transcript. The first aim will generate high-quality, high-coverage data from human ribosome profiling experiments, and then use these data to develop maximum likelihood models for fragment positions, the effect of ligation bias on fragment recovery, and the proper proportional assignment of reads between alternative splice forms of a gene. The second aim will combine the models of aim 1 into a piece of software that uses expectation maximization to generate genome-wide estimates of ribosome occupancy per codon and overall estimates of translation per transcript. This software will be designed to fit into existing RNA-seq analysis pipelines.
The proposed work will provide an important and much-needed tool for genome-wide measurement of gene expression. By providing a well-designed pipeline for ribosome profiling data, we will put this method within reach of many research groups. We will increase the accessibility of ribosome profiling as a method for understanding gene regulation in a wide range of medically relevant conditions.
|
1.009 |
2019 — 2022 |
Lareau, Liana |
N/A |
Collaborative Research: BBSRC: Riboviz For Reliable, Reproducible and Rigorous Quantification of Protein Synthesis From Ribosome Profiling Data @ University of California-Berkeley
All cells make proteins by using molecular machines called ribosomes, which read a messenger RNA template and "translate" the RNA code into the protein code. Cells need to make the right proteins, at the right time, in the right quantities, and so this process is carefully controlled by signals that are also encoded in the RNA. These signals are complex and only just beginning to be understood because there are thousands of different RNA sequences in a cell and each is hundreds to thousands of nucleotides ("letters") long. Recent advances in DNA and RNA sequencing technology mean that we can now measure which parts of RNA are translated into protein, and how much, by using a technique called ribosome profiling. Although this technique is powerful, it is not perfect, and statistical tools are needed to separate the interesting biological signals in the data from unwanted biases of the experimental measurement. These tools need to be implemented in usable and reliable software so that all scientists studying protein synthesis can get the maximum possible information from ribosome profiling data, which is expensive and time-consuming to collect. The RiboViz software suite, which is open source and free to use by anyone in the world, already takes raw data from sequencing machines and puts it through a series of processing steps. RiboViz estimates how much each part of RNA is translated, and how the amount of translation is controlled by the code of that RNA. RiboViz produces tables, figures and graphs that are accessible online, so it is useful for both experts and non-experts. This kind of data sharing makes science more reproducible and more accessible.
This project will accelerate understanding of the mechanism and regulation of protein synthesis by extending the RiboViz open-source computational pipeline to extract biological insight from high-throughput data measuring protein synthesis. The goal is to further develop the RiboViz open-source software pipeline (https://github.com/shahpr/RiboViz) for accessible, reliable, reproducible, rigorous and bias-aware analysis and visualization of ribosome profiling data. Specific aims are to refactor RiboViz following best practices for scientific computing, by writing tests akin to experimental controls for each step of the setup, processing and analysis, and by containerization of the pipeline to enable running on different computers with full control of software dependencies; to develop likelihood-based statistical methods for quantification of differential translation of open reading frames and codons while correcting for sequence-level bias, building on best practices in differential RNA abundance analysis, and implement these analysis and visualization tools within RiboViz; and to generate standardized ribosome profiling datasets by re-analyzing published datasets for all eukaryotes to quantify rigorously how codon usage and other sequence features predict protein synthesis. The improved RiboViz pipeline will accelerate studies of translation regulation and produce tested and rigorous tools that will be disseminated as an open-source resource to the entire community studying translation.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.915 |
2019 — 2021 |
Lareau, Liana Faye |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Determinants and Consequences of Translation Elongation Rate @ University of California Berkeley
Project summary: We propose to use quantitative, high-throughput methods to understand the consequences of variation in translation elongation on gene expression. New methods including ribosome profiling provide a genome-wide view of the motion of ribosomes along transcripts. We recently developed a neural network based on ribosome profiling data that captures information about what makes a ribosome move faster or slower, and then used that same information to design synonymous sequences that are not just decoded at different rates but also actually make more or less protein. This raises two questions that we propose to address here: how does slow decoding result in diminished protein expression, and what consequences does this rate variation have in vivo? First, we will expand our preliminary results from yeast, adapting our neural network model to understand the impact of variation in translation elongation in mammalian systems. We will measure changes in translation elongation in different cell types and with differential activity of translation elongation factors. Second, we will investigate and model the interplay between translation initiation and translation elongation rate to understand how translation elongation can be rate-limiting for protein production. We will also identify trans-acting factors that modulate this effect, using a genome-wide CRISPR screen. Third, we will develop a more complete neural network model relating gene sequence to ultimate translation output, incorporating not just local sequence context but also positional effects and other factors. This proposal presents new experimental systems that can quickly and sensitively measure the consequences of codon choice and identify factors affecting how different codons determine translation output.
|
1.009 |