1989 — 2017 |
Stormo, Gary D |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
DNA Pattern Identification and Analysis
DESCRIPTION (provided by applicant): We will continue the development of computer methods for analyzing gene regulation. We will further enhance methods to identify regulatory sites from the promoter regions of co-regulated genes, with special emphasis on taking advantage of orthologous promoter regions from additional species. Included in the improvements will be better ways to identify multiple transcription factors that act coordinately to regulate gene expression. We will develop computational methods to help determine which transcription factors within a genome interact with which regulatory sites. The methods will be developed initially using bacterial genomes, where a large number of genome sequences, from a wide range of phylogenetic distances, already exist. We will test the ability of different types of information, including genomic location, phylogenetic correlation and recognition code predictions, to aid in the identification of the associations between factors and sites. We will continue the development and enhancement of methods to predict RNA motifs composed of both sequence and structure constraints. We will go beyond the capabilities of programs like FOLDALIGN to detect conserved structures that are complex, including pseudoknots. Two approaches will be tested: a general one that should work on any collection of common RNA motifs, and another designed specifically to take advantage of phylogenetically conserved motifs in orthologous regions of multiple species. Each of these projects will be enhanced through collaborations with other groups, primarily experimentalists, who are interested in the application of our methods to their biological problems.
|
0.958 |
1997 |
Stormo, Gary D |
P41 Activity Code Description: Undocumented code - click on the grant title for more information. |
Sequence Analysis @ University of Washington
technology/technique development; plants; statistics/biometry; genome; genetics; computers; biotechnology; biomedical resource;
|
0.947 |
1997 |
Stormo, Gary D |
P41 Activity Code Description: Undocumented code - click on the grant title for more information. |
Sequence Analysis Postdoctoral Training @ University of Washington
informatics; biomedical resource;
|
0.947 |
1998 |
Stormo, Gary D |
P41 Activity Code Description: Undocumented code - click on the grant title for more information. |
Sequence Analysis Component @ University of Washington
The primary focus of this component of the Resource is to apply pattern recognition methods to discover regulatory elements, such as transcription factor binding sites, in sets of co-regulated genes. We have not yet received datasets from the DNA array group for analysis. However, we have been assessing the ability of our programs to identify the correct functional domains in yeast promoters using collections of genes with known regulatory sites. These controls have allowed us to test different approaches and to make refinements that improve their detection capability. We have also obtained a couple of experimental datasets from Pat Brown and have run analyses on those as further tests of the methods' ability to reliably identify regulatory sites. We have also explored the ability of our methods to elucidate cooperative interactions when multiple proteins are involved in the regulation. We have done this using data from the MetJ repressor from E. coli because adequate data exist for this analysis. A paper describing these results has been submitted. We think that the ability to detect multiple binding sites and account for their cooperative interactions will be important for some yeast genes, and the approach we have adopted and refined on an E. coli example is useful experience and preparation. We have also been developing and refining an adaptation of our method to specifically identify motifs involved in protein-protein interactions. For this we have used data on MHC-peptide interactions to identify the sub-domain being recognized and to model the energetic contributions of individual amino acids. This method appears to be working well and will be further refined with additional datasets. We think this method will be applicable to other types of data concerning protein-protein interactions, specifically for the identification of motifs that allow for docking of the proteins into complexes.
|
0.947 |
2001 — 2006 |
Stormo, Gary; Zhang, Weixiong |
N/A Activity Code Description: No activity code was retrieved: click on the grant title for more information |
ITR/AP (CISE) Collaborative Research: Best-First Search Algorithms for Sequence Alignment Problems in Computational Biology
Abstract Zhang 0113618 Korf 0113313
Molecular biologists are currently faced with very challenging computational problems. For example, a draft of the human genome has been completed, a sequence of about three billion base pairs. A draft of the mouse genome will soon be completed. We know that mice and men share over 90% of their genetic material. What we don't know is exactly which parts of the human and mouse genomes are common to both species. This information can be used to identify human genes, and to translate results from mouse studies to studies of human health and disease. The problem of identifying the common elements between these two DNA sequences is an example of sequence alignment, which is a computational problem. Other examples of sequence alignment problems include gene identification, and RNA and protein structure prediction. Current computer algorithms are either too slow, or require too much memory, to directly solve a problem as large as the human-mouse genomic sequence alignment. We propose to develop new algorithms for various sequence-alignment problems, based on heuristic search algorithms in artificial intelligence. Our goal is to provide much more efficient sequence-alignment algorithms for use by molecular biologists.
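The abstract does not describe the proposed algorithms in detail. As a rough illustration of the general idea only, the following minimal sketch treats pairwise alignment as a shortest-path problem on the alignment grid and solves it with A*, a standard best-first search. The function name, the unit mismatch and gap costs, and the length-difference heuristic are assumptions chosen for illustration and are not taken from the project.

```python
import heapq


def best_first_alignment_cost(a, b, mismatch=1, gap=1):
    """Return the minimum alignment cost of strings a and b using A* search.

    Illustrative sketch only: costs and heuristic are assumptions.
    A state (i, j) means a[:i] has been aligned with b[:j].
    """

    def heuristic(i, j):
        # Admissible lower bound: the difference in remaining lengths
        # must be covered by at least that many gaps.
        return gap * abs((len(a) - i) - (len(b) - j))

    start, goal = (0, 0), (len(a), len(b))
    best = {start: 0}  # cheapest known cost to reach each state
    frontier = [(heuristic(0, 0), 0, start)]  # (cost + heuristic, cost, state)

    while frontier:
        _, cost, (i, j) = heapq.heappop(frontier)
        if (i, j) == goal:
            return cost
        if cost > best[(i, j)]:
            continue  # stale queue entry; a cheaper path was found already

        moves = []
        if i < len(a) and j < len(b):
            # Align a[i] with b[j]: free on a match, otherwise a mismatch.
            moves.append((i + 1, j + 1, 0 if a[i] == b[j] else mismatch))
        if i < len(a):
            moves.append((i + 1, j, gap))  # gap in b
        if j < len(b):
            moves.append((i, j + 1, gap))  # gap in a

        for ni, nj, step in moves:
            new_cost = cost + step
            if new_cost < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = new_cost
                heapq.heappush(
                    frontier, (new_cost + heuristic(ni, nj), new_cost, (ni, nj))
                )

    raise RuntimeError("goal state was not reached")
```

With unit costs this computes the edit distance, so `best_first_alignment_cost("kitten", "sitting")` evaluates to 3. The heuristic lets the search skip grid cells that full dynamic programming would fill, which is the kind of saving in time and memory the abstract is aiming for at genome scale.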
|
1 |
2001 — 2010 |
Stormo, Gary D |
T32 Activity Code Description: To enable institutions to make National Research Service Awards to individuals selected by them for predoctoral and postdoctoral research training in specified shortage areas. |
Training Program in Computational Biology
DESCRIPTION (provided by applicant): We request renewed support for our graduate training Program in Computational Biology. This program attracts students from traditional biological backgrounds and also from non-traditional backgrounds such as computer science, mathematics, engineering, and others. A specialized curriculum gives the students broad training in current molecular biology research as well as fundamental methods in computer science, mathematics and statistics that can be applied to important biological problems. The curriculum includes course work, including advanced electives and special topics courses, rotations in "wet" and "dry" labs, teaching experience, instruction in the responsible conduct of research, journal clubs and other opportunities to present research in public talks. Thesis research is performed under the guidance of faculty actively involved in computational biology research, including several new ones who have been added since the previous application. In its first few years this program has been successful in the recruitment of top candidate students, including some from underrepresented minorities. This program interacts synergistically with other programs within the Division, broadening the scope of opportunities and fostering interactions between students and faculty in computational biology with those in more traditional disciplines. We request seven trainee slots per year.
|
0.958 |
2007 — 2010 |
Stormo, Gary D |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Deciphering the Regulatory Code of a Cell
DESCRIPTION (provided by applicant): Although the regulation of gene expression has been intensively studied in the yeast S. cerevisiae, much about this process remains unknown. This is exemplified by our inability to predict, as opposed to explain, the expression pattern of any gene given its promoter sequence. Our long-term goal is to provide a comprehensive map of the S. cerevisiae gene regulatory network that can be used to develop predictive models of gene expression. The first task is to complete the catalog of transcription factors and their binding sites. We will use a combination of existing in vitro and in vivo methods to accomplish that goal. We will identify the binding sites of the more than 100 transcription factors of yeast whose specificity remains unknown (Aim 1) using electrophoretic gel mobility shift assays, a yeast one-hybrid assay, and a novel method to probe protein microarrays with DNA oligonucleotides. We will then develop comprehensive weight matrices of the binding sites of yeast transcription factors (Aim 2) using a novel implementation of the SELEX method we have developed. These results will be extended by determining the in vivo targets of selected transcription factors (Aim 3) using genome-wide chromatin immunoprecipitation (ChIP-chip). We expect that the combination of these approaches will enable us to determine the binding sites and target genes of nearly all transcription factors of yeast. We will then attempt to learn the architectural principles of yeast promoters by determining how transcription factor binding sites contribute to gene expression. By creating large libraries of potential gene promoters in which a set of binding sites have been randomly distributed, we can ascertain the combinations of binding sites that determine specific expression patterns. This approach will be initially developed and tested using a few well characterized binding sites; we expect it will provide a general tool for more comprehensive studies of the logic of gene regulation.
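For readers unfamiliar with the weight matrices mentioned in Aim 2, the following minimal sketch shows the usual textbook construction: a log-odds score per base and position, built from aligned binding sites and then used to scan a sequence. The example sites, the pseudocount of 1, and the uniform background frequency of 0.25 are assumptions for illustration, not data or parameters from the project, and the SELEX-based method in the abstract may differ.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned sites.

    Returns a list with one {base: score} dict per site position.
    Pseudocount and uniform background are illustrative assumptions.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for position in range(width):
        counts = Counter(site[position] for site in sites)
        matrix.append(
            {
                base: math.log2((counts[base] + pseudocount) / total / background)
                for base in BASES
            }
        )
    return matrix


def score(matrix, candidate):
    """Sum the per-position scores of a candidate site of matrix width."""
    return sum(column[base] for column, base in zip(matrix, candidate))


def best_site(matrix, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(matrix)
    windows = [
        (score(matrix, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(windows)


# Hypothetical aligned sites, invented for this example.
example_sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
example_matrix = build_weight_matrix(example_sites)
```

Scanning `"AAATGACTCAAA"` with `best_site(example_matrix, ...)` reports the window starting at index 3, where the consensus `TGACTC` occurs. Because scores add across positions, this simple model assumes each position contributes independently to binding.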
|
0.958 |
2010 — 2011 |
Stormo, Gary D |
R21 Activity Code Description: To encourage the development of new research activities in categorical program areas. (Support generally is restricted in level of support and in time.) |
Exploiting Microbiome Sequences For Improved Models of Protein-DNA Interactions
DESCRIPTION (provided by applicant): This project will develop computer programs to exploit the Human Microbiome Project (HMP) DNA sequences to better understand DNA-protein interactions. The interactions between transcription factors and the DNA sites that they bind to are critical to controlling the expression of the genes within each species, and therefore also the characteristics of each species and its interactions with the human host. The transcription factors themselves can be readily identified from DNA sequences and we will take advantage of the fact that most bacterial transcription factors regulate themselves and/or adjacent genes within their chromosomes. Transcription factors can be clustered into groups that are expected to recognize the same patterns of DNA, based on known structures for similar proteins from well studied bacteria. Together the clusters of proteins with very similar specificity and the probable regulatory regions of nearby promoters will give us a very large number of potential DNA-protein interacting sites on which to apply pattern discovery algorithms. This should not only help us to learn about the regulatory networks within the HMP species, but also lead to more general understanding about the relationships between transcription factor proteins and the DNA patterns that they recognize. This will have broader implications across several areas of biological research and may lead to the design of new proteins with novel specificities that could be useful as research tools and for therapeutics. PUBLIC HEALTH RELEVANCE: The Human Microbiome Project will obtain DNA sequences from many different species inhabiting many different microenvironments of the human body. This project will develop computer programs to analyze those DNA sequences to help discover how the expression of the genes in those species is regulated. 
The regulation of gene expression is a key element in understanding the interactions between the microbial communities and the human host.
|
0.958 |
2018 — 2021 |
Stormo, Gary D; White, Michael Aaron (co-PI) |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. R21 Activity Code Description: To encourage the development of new research activities in categorical program areas. (Support generally is restricted in level of support and in time.) |
Single Cell Tagging of Localized RNA From Whole Populations
Project Summary/Abstract: The objective of this proposal is to develop a broadly applicable technology, called SCALoP (Single Cell Analysis of Localized RNA on whole Populations), to measure gene expression in single cells without the need to perform technically challenging manipulations on individual cells. Currently, large-scale genetic screens based on new technologies are leading to important advances in our understanding of mammalian genomes and genetic variation linked to human disease. However, the power of these screens is limited by the lack of methods to measure gene expression in them. Our proposed technology would fill this major unmet need, and thus have a broad impact on mammalian genetics. Our goal is to develop a method to attach single-cell-specific sequence barcodes to transcripts, using RNA proximity ligation in pooled samples. We propose to do this by designing barcoded "tagRNAs" which are expressed in cells and targeted to specific RNA binding proteins. These tagRNAs are attached to transcripts derived from the same single cell by proximity ligation. To develop this method, our aims are to 1) optimize and quantify the efficiency of methods to link tagRNAs and cellular RNA, 2) optimize in vivo specificity and single cell resolution of SCALoP, and 3) develop RNA aptamers to natural proteins and alternate location targets.
|
0.958 |