2001 — 2007 |
Singh, Mona |
N/A |
PECASE: Computational Methods For Genome-Wide Prediction of Protein-Protein Interactions
The long-term goal of this research is to develop computational methods for predicting protein-protein interactions at a genomic level. Protein-protein interactions play a central role in how an organism functions, and computational methods for predicting these interactions will be key to understanding functional pathways within biological systems. The vast amount of biosequence data in a genome makes sophisticated computational analysis a necessity. While computational methods have already proven to be a useful first step for rapid genome-wide identification of putative protein function and structure, research on the problem of computationally determining biologically relevant partners for given protein sequences is just beginning.
This project looks at the problem of predicting protein-protein interactions from two complementary viewpoints. For both approaches, the constraint of genomic-level analysis favors development of fast, informatics-based methods. The first part of this proposal focuses on a specific well-characterized structural motif that mediates protein-protein interactions: the parallel, 2-stranded coiled coil. The goal is to develop novel computational techniques that can predict whether two coiled-coil proteins interact with each other. The second part will extend several existing non-structural whole- and cross-genome methodologies that were initially designed for inferring protein function to the problem of predicting protein-protein interactions mediated by particular protein interaction domains.
The educational goals of this project include: (1) bioinformatics curriculum development, including a graduate "computing certificate" for molecular biology graduate students; (2) development of interdisciplinary bioinformatics courses at the introductory level and at the graduate research seminar level; and (3) dissemination of instructional material via the internet.
|
1 |
2005 — 2007 |
Chazelle, Bernard (co-PI) [⬀] Singh, Mona |
N/A |
SGER: Algorithms For Predicting Protein Function Using Interaction Maps
Intellectual Merit
The goal of this project is to develop algorithms for analyzing protein interaction maps in order to make novel predictions about a protein's biological process. The aim is to provide a framework for moving from individual pairwise linkages to exploiting entire interaction networks, where each interaction may arise from experimental methods, computational methods, or both.
Methods for analyzing interaction networks are in their infancy, and most current approaches predict the function of a protein by considering only the annotations of its direct interaction partners. In contrast, the proposed methods will use the global connectivity of interaction networks, the relationships between functions, and several high-throughput data sources in making predictions. The hope is that by developing novel network-based algorithms, we will obtain functional predictions for many as-yet-uncharacterized proteins.
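For illustration only, the direct-neighbor baseline described above (predicting a protein's function from the annotations of its immediate interaction partners) might be sketched as follows. This is a minimal sketch and not the project's method; the function name `predict_by_neighbors` and the toy protein names and annotations are hypothetical.

```python
from collections import Counter


def predict_by_neighbors(interactions, annotations, protein):
    """Return the most common annotation among the direct interaction
    partners of `protein`, or None if no partner is annotated.

    interactions: iterable of (protein_a, protein_b) undirected pairs
    annotations:  dict mapping protein name -> function label
    """
    neighbors = set()
    for a, b in interactions:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)
    neighbors.discard(protein)  # ignore self-interactions

    # Count one vote per annotated neighbor.
    votes = Counter(annotations[n] for n in neighbors if n in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical toy network: P1 and P5 are unannotated.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
annotations = {"P2": "DNA repair", "P3": "DNA repair", "P4": "transport"}

print(predict_by_neighbors(interactions, annotations, "P1"))
```

A baseline of this kind uses only local information, which is the limitation the abstract contrasts with methods that exploit the global connectivity of the network.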
Broader Impact
Both PIs teach cross-disciplinary courses in computational biology, and the research outlined here will further enhance their educational efforts. The co-PI, Chazelle, has designed and is teaching a new undergraduate course: an integrated, quantitative introduction to the natural sciences. Together with biologists, physicists, and chemists, the PI, Singh, has designed a graduate course, "Introduction to Computational Molecular Biology and Genomics"; she has co-taught it with a molecular biologist for the past four years.
The proposed work will develop methods using interaction maps for baker's yeast and fruit fly. Humans share many proteins and pathways with these model organisms. Thus, network analysis methods may allow transfer of information from these organisms to human, potentially revealing critical information about proteins and pathways implicated in human disease. Predictions and software will be made available on the web (www.cs.princeton.edu/mona/software.html).
|
1 |
2006 — 2015 |
Singh, Mona |
R01 (Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.) |
Predicting and Analyzing Protein Interaction Networks
DESCRIPTION (provided by applicant): Large-scale protein interaction networks have been determined experimentally for several organisms, and computational analysis of these networks provides new opportunities to uncover protein functions and pathways. At the same time, despite improvements in high-throughput technologies, it will not be feasible in the near future to apply them to all sequenced genomes. Thus, for the vast majority of sequenced genomes, only a small fraction of protein interactions have been experimentally determined, and novel computational approaches provide a promising alternative means for building large, high-confidence interaction maps. The broad, long-term goal of this research is to build a comprehensive research program for understanding protein interactions, by developing algorithms for the complementary problems of analyzing and predicting protein interaction maps. Our specific aims are: (1) To develop algorithms that exploit the topology of whole-genome protein interaction maps and the relationships between protein functions, in order to make novel predictions about a protein's biological process. (2) To build a system for interrogating protein interaction networks using "templates" specifying common patterns of interactions or pathways, in order to help uncover novel instances. (3) To develop a general structural bioinformatics approach for leveraging properties of specific protein interaction interfaces, and to apply this approach in order to help predict Cys2His2 zinc finger protein-DNA interactions at the genomic scale. Taken together, we hope that the proposed tools will significantly advance the state of the art in computational approaches for characterizing proteins within the context of their cellular interactions, pathways and networks. All software and predictions will be made publicly available via the internet.
|
1 |
2006 — 2010 |
Funkhouser, Thomas [⬀] Singh, Mona |
N/A |
SEI: New Shape Analysis Methods For Structural Bioinformatics
A complete understanding of any biological system or disease necessitates a detailed analysis of how its proteins interact with other molecules. Most methods for predicting and understanding protein function have focused on determining evolutionary relationships in amino acid sequences. However, the molecular function of a protein is determined also by its 3D structure (i.e., how atoms interact within its active sites), and thus a great deal of attention has recently been devoted towards solving the 3D structures of proteins with the hope that computer algorithms can infer functional relationships between them. 3D atomic coordinates are available for tens of thousands of proteins and the number has been increasing exponentially over the last several years. The goal of this project is to develop novel computer algorithms for analyzing protein structures, detecting similarities between them, visualizing how they interact with other molecules, and automatically providing functional classifications for them. For example, given a novel protein structure, new geometric algorithms will be used to determine the locations and shapes of its active sites. Next, the model of the structural and chemical properties of those sites will be used to search large databases for sites with similarities. Finally, the best matches are aligned so that functional annotations can be transferred from the active site of one protein to another. These algorithms will not only be useful for molecular biology, but they will drive research on a broader class of computational methods for detecting features in noisy 3D data, matching shapes of complex 3D structures, and searching large repositories of 3D data. Beyond the research, the project will have impact through its interdisciplinary collaborations, educational and outreach programs, and public dissemination of information.
The project is a collaborative effort across diverse disciplines, which will promote cross-pollination of ideas between fields and provide new educational opportunities for students to learn in an interdisciplinary environment. Everything developed as part of this proposal will be made freely available to the public through talks, workshops, web pages, course notes, software libraries, bibliographies, and data sets.
|
1 |
2009 — 2013 |
Singh, Mona |
N/A |
Discovery of Complex Recurring Protein Interaction Patterns Within Interactomes: Algorithms, Applications and Software
A grant has been awarded to Princeton University to develop computational tools to facilitate research in biological networks through discovery and analysis of recurring patterns of relations among biological units such as proteins. Searching for recurring patterns in biological data has been the backbone of much research and analysis in computational biology, and has been essential in uncovering biological function. In the last few years, there has been an explosion in the availability of protein-protein interaction data, and we now have large-scale interaction networks for human and many model organisms.
The overall goal of this research is to develop the necessary computational infrastructure for applying recurring pattern analysis---which has already proven to be useful in analyzing biological sequence, structure and expression data---to biological networks. The specific goals of this research are: (1) To develop a computational framework for uncovering what types of proteins preferentially work together, with the goal of revealing patterns underlying cellular organization and protein functioning. (2) To apply these algorithms to existing interaction networks across the evolutionary range, with the goals of understanding how the recurring network units within organisms differ and of revealing how new types of proteins are incorporated into existing networks. (3) To develop software that, given a particular protein sequence, uncovers the recurring network interaction patterns it participates in, with the goal of placing the input protein within the context of its cellular pathways and modules, and thereby gaining insight into its function. The proposed research is coupled with the development of a new undergraduate course in bioinformatics, with a final project focusing on network analysis. Software and results of this project will be available from the website http://compbio.cs.princeton.edu.
|
1 |
2010 — 2011 |
Singh, Mona |
R21 (Activity Code Description: To encourage the development of new research activities in categorical program areas; support generally is restricted in level of support and in time.) |
Computational Methods For Uncovering Protein Function in Plasmodium falciparum
DESCRIPTION (provided by applicant): Malaria is one of the most common human infectious diseases, with an estimated 300-500 million cases a year and between one and three million yearly deaths. Malaria is caused by protozoan parasites, with the most serious forms of the disease in humans caused by Plasmodium falciparum. The P. falciparum genome has been fully sequenced. Remarkably, only 55% of its identified proteins have any predicted or known functional annotations, and much of the organism's core machinery remains unidentified, thereby significantly hampering our understanding of this organism and of malaria. Since traditional bioinformatics approaches have had limited success in uncovering P. falciparum protein functions, the long-term goal of this research is to develop novel computational approaches that are more effective for this task. Our framework is centered on better identification of protein domains, the structural, functional and evolutionary units of proteins, and linking uncovered P. falciparum protein domains to well-characterized domains associated with known protein functions. Our approaches leverage comparative genomics, graph-theoretic methods, and sensitive probabilistic profile-profile comparisons, all within a robust computational pipeline. The specific aims of this proposal are (1) To uncover putative domains within P. falciparum protein sequences using homologous sequences in closely related genomes, and to use these to identify similarity to known functionally characterized protein domains. (2) To increase the number of P. falciparum proteins with predicted functional motifs and domains by exploiting the tendency of certain motifs and domains to occur together within the same sequence. (3) To experimentally test a representative set of predictions, in order to uncover new P. falciparum biology and to evaluate our computational pipeline.
The proposed techniques have significant potential for expanding the number of protein functional annotations within P. falciparum, thereby accelerating ongoing research efforts aimed at developing anti-malarial drug targets against the causative agent of human malaria. PUBLIC HEALTH RELEVANCE: Malaria is one of the most common human infectious diseases, with an estimated 300-500 million cases a year and between one and three million yearly deaths. The most serious forms of the disease in humans are caused by P. falciparum. The proposed research aims to significantly expand the number of protein functional annotations for P. falciparum, in order to accelerate our understanding of the causative agent of human malaria.
|
1 |
2011 — 2015 |
Singh, Mona |
N/A |
Collaborative Research: ABI Development: Algorithms and Software For Discovery of Non-Sequential Protein Structure Similarities
The University of Illinois at Chicago and Princeton University are awarded collaborative grants to develop efficient and scalable computational methods for comparing protein structures. Protein sequences and structures are being determined at an increasingly rapid rate. To date, there are more than 1000 sequenced genomes with several thousand more in progress and over 50,000 three-dimensional structures in the Protein Data Bank. Whether considering proteins at the level of sequence or structure, comparing or aligning two proteins is the fundamental technique for uncovering principles of protein structure, function and evolution. The vast majority of research efforts have focused on sequence comparisons. However, since protein structures are generally better conserved than protein sequences, identifying structural similarity between proteins can yield valuable clues to protein function and can be used to classify proteins, analyze their evolutionary histories and even help predict protein interactions. Though considerable advances have been made in recent years in comparing protein structures, a key difficulty is detecting shared, conserved structures between proteins where the individual structural elements appear in different orders on the two sequences. This project will develop innovative methods to enable discovery of sequence-order-independent substructure similarity, with a long-term goal of performing a large-scale comparison over all protein structures. The research team will formulate precise theoretical problems, design efficient algorithms for them, and implement the resulting algorithms and evaluate their accuracy and efficiency. The final software for comparing protein structures will be released to the scientific community and is expected to provide a significant and demonstrable impact on further research in structural bioinformatics.
Scientifically, the methodologies to be developed for substructure comparison are general and will have broader impacts beyond structural proteomics and bioinformatics. For example, a biomedical application of the proposed project lies in guiding protein engineering and rational drug design via a systematic identification of all such substructures and their underlying sequences. The project will involve undergraduates and under-represented minority (URM) groups in active research. A central component is to engage URM undergraduate students from the urban UIC campus and involve them in summer research at Princeton with the goal of possible recruitment into Princeton's graduate program in Quantitative and Computational Biology. Additionally, the PIs are planning course and curriculum development, dissemination of research, mentoring of undergraduate and graduate students, outreach and community involvement.
The outcomes of the project will be made available through the websites of all the investigators: http://www.cs.uic.edu/~dasgupta http://gila.bioengr.uic.edu/lab http://www.cs.princeton.edu/~mona
|
1 |
2011 — 2013 |
Singh, Mona |
N/A |
RECOMB Special Session: Computational Challenges and Emerging Areas Within Computational Biology
The field of computational biology has undergone explosive growth in the last decade. To understand the computational challenges facing computational biology in the next 5-8 years, the PI proposes to run a special session at RECOMB 2012 (http://recomb2012.crg.cat/) focused on emerging areas. RECOMB is an international scientific conference bridging the computational, mathematical and biological sciences, and is an excellent forum for disseminating the latest developments in computational biology.
|
1 |
2011 — 2013 |
Singh, Mona |
N/A |
Student Support For RECOMB 2012
RECOMB (http://recomb2012.crg.cat/) is an international scientific conference bridging the computational, mathematical, and biological sciences. The conference features keynote talks by preeminent scientists, together with presentations of refereed research papers in computational biology, special sessions and poster sessions. This proposal requests travel support for US conference participants, with priority going to graduate students who will be presenting papers or posters.
|
1 |
2013 — 2014 |
Singh, Mona |
N/A |
Student Support For RECOMB 2013
RECOMB2013: 17th International Conference on Research in Computational Molecular Biology
The RECOMB Conference Series (http://www.recomb.org/) was founded in 1997 by Sorin Istrail, Pavel Pevzner, and Michael Waterman to provide a scientific forum for theoretical advances in computational biology and their applications in molecular biology and medicine. Previous RECOMB conferences have been held annually in North America, Europe, and Asia.
RECOMB 2013 is the seventeenth in a series of well-established scientific conferences bridging the areas of computational, mathematical, statistical and biological sciences. The conference features 6 keynote talks by preeminent scientists in life sciences, including Scott Fraser (USC and Caltech, Imaging), Takashi Gojobori (Japan, comparative genomics), Deborah Nickerson (U Washington, Exon sequencing), Nadia A Rosenthal (Monash University, Australia, Stem cell), Chung-I Wu (U Chicago, evolutionary genomics), and Xiaoliang Sunny Xie (Harvard, single molecule). The conference will also feature 32 state-of-the-art scientific presentations selected from 167 submissions. The topics cover essentially every aspect of computational biology and bioinformatics, plus emerging areas such as molecular imaging and single molecule sequencing. There will also be 5 highlight talks, selected from 47 submissions of work published from 2012 through 02/10/2013, plus about 200 poster presentations.
The conference attracts research contributions in all areas of computational molecular biology, including but not limited to: molecular sequence analysis; recognition of genes and regulatory elements; molecular evolution; protein structure; structural genomics; analysis of gene expression; biological networks; sequencing and genotyping technologies; drug design; probabilistic and combinatorial algorithms; systems biology; computational proteomics; structural and functional genomics; information systems for computational biology and imaging.
The many events at RECOMB provide an excellent way for students to be exposed to new and exciting research areas, and to learn of cutting-edge advances in their own research areas and beyond. Furthermore, students will have ample opportunity to meet leading researchers in both informal and formal settings. Thus, attending RECOMB 2013 will greatly benefit students' career development.
|
1 |
2015 — 2018 |
Singh, Mona |
N/A |
ABI: Innovation: Computationally Uncovering Dynamic Transcription Factor Interactions Within and Across Organisms
This project entails the development of new computational methods to uncover the dynamic variation of transcriptional regulatory networks. While nearly all cells within an organism have the same DNA, they can exhibit very different characteristics as different genes are turned on, or expressed, within them. Transcription factor interactions, comprising regulatory networks, control which genes are expressed, and thus the dynamics of these interactions across cells, conditions and organisms are a critical feature of proper biological functioning. To date, however, most existing knowledge about regulatory networks is static in nature: for nearly all organisms, transcription factor interactions are known under only a small number of conditions of interest. To help fill this gap and begin to uncover the dynamic nature of transcriptional regulatory networks, novel computational approaches will be developed to predict and compare condition-specific transcriptional interactions within an organism and varying transcriptional interactions across organisms. Software for these tasks will be released and made available to the broader scientific community. Additionally, significant new outreach efforts will be undertaken to recruit a diverse group of students to take part in this research.
More specifically, new computational approaches will be developed to uncover and differentially analyze condition-specific and cross-organism variation both in regulatory interactions between transcription factors and their genomic targets, as well as in interactions amongst regulators themselves, as these interactions are a central mechanism by which regulatory specificity is achieved. The approaches will leverage large numbers of transcription factors with known binding specificities, existing chromatin accessibility data across numerous conditions that reveal which portions of a genome are accessible to be bound by transcription factors in those conditions, and comparative analysis of multiple closely related fully sequenced organisms. The results of this research will be disseminated at http://compbio.cs.princeton.edu.
|
1 |
2016 — 2020 |
Singh, Mona |
R01 |
Interaction-Based Computational Methods For Analyzing Cancer Genomes
Project Summary: Recent cancer genome sequencing efforts have determined the complete protein coding regions for thousands of patients across tens of different cancer types. Initial analyses have revealed that cancer genomes can have numerous genetic alterations, but only a subset are thought to be important for cancer initiation or progression. Further, across patients, there is a high degree of mutational heterogeneity with very few genes altered in a high fraction of cases, and many infrequently altered genes, some of which are functionally important in cancer cells. These factors significantly complicate efforts to identify cancer-related genes. Our long-term goal is to identify cancer-related genes by analyzing the genomes of cohorts of individuals with a particular cancer. The key insight underlying our work is that molecular interactions and networks reveal important aspects of protein functioning, and thus provide an important context by which to tackle the mutational heterogeneity observed across cancers. Our specific aims are: (1) To develop structure-based methods that uncover proteins enriched in somatic mutations in their interaction interfaces, as mutations in these sites are likely to affect protein functioning. (2) To develop network-based methods for de novo discovery of pathways that are mutated across patient samples, as mutations in cancers tend to target specific pathways (even if different genes within them are mutated in different individuals) and genes proximal in networks tend to be functionally related. (3) To develop metabolite-centric methods that use protein-small molecule networks in order to uncover mutated proteins that alter cellular metabolism, as reprogrammed metabolism is increasingly being recognized as a major adaptation of cancer cells.
By pursuing these three complementary and tightly coupled aims, which exploit critical but often overlooked structural and network information, we will vastly advance the state of the art in computational methods for analyzing cancer genomes. These analyses will deepen our understanding of cancer biology, and will ultimately lead to better patient stratification, refined prognostic tools, and novel therapeutics.
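As a purely illustrative sketch of what "enriched in somatic mutations in their interaction interfaces" (aim 1) could mean quantitatively, one simple option is a one-sided binomial test. This is an assumption made for illustration, not the method proposed in the grant: it supposes mutations fall uniformly at random along the protein under the null hypothesis, and the function name and parameters are hypothetical.

```python
from math import comb


def interface_enrichment_pvalue(protein_length, interface_size,
                                n_mutations, n_in_interface):
    """One-sided binomial p-value: the probability of observing at least
    `n_in_interface` of `n_mutations` mutations at interface positions,
    assuming (hypothetically) that each mutation lands uniformly at
    random along a protein of `protein_length` residues.
    """
    p = interface_size / protein_length  # chance a mutation hits the interface
    return sum(
        comb(n_mutations, k) * p**k * (1 - p) ** (n_mutations - k)
        for k in range(n_in_interface, n_mutations + 1)
    )


# Hypothetical protein: 100 residues, 10 of them at an interaction interface,
# with 5 observed somatic mutations, all 5 inside the interface.
pval = interface_enrichment_pvalue(100, 10, 5, 5)
print(pval)
```

A small p-value would flag the protein as a candidate, although a realistic analysis would need to account for background mutation rates, sequence context and multiple testing.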
|
1 |
2016 — 2017 |
Singh, Mona |
N/A |
Student Support RECOMB 2016
RECOMB is an international scientific conference bridging the computational, mathematical, and biological sciences. The RECOMB conference series was founded in 1997 to provide a scientific forum for theoretical advances in computational biology and their applications in molecular biology and medicine. RECOMB 2016 will be the 20th Annual Meeting. RECOMB features keynote talks by preeminent scientists, together with presentations of refereed research papers in computational biology, special sessions and poster sessions. This proposal requests travel support for US conference participants, with priority going to graduate students who will be presenting papers or posters.
The broader impact goal of this proposal is to increase the participation of students at the RECOMB conference by giving travel funding to students. Women, minorities and persons with disabilities will be given the highest priority for travel funding, in order to increase their access to scientific meetings and to broaden participation at RECOMB. All individuals presenting posters or papers will be sent email making them aware of the student support possibilities. RECOMB is one of the top computational biology conferences. The many events at RECOMB provide an excellent way for students to be exposed to new and exciting research areas, and learn of cutting-edge advances. Further, student participation is vital for the growth and continuation of the field.
|
1 |
2019 — 2021 |
Singh, Mona |
R01 |
Predicting and Analyzing Variation in Cellular Interactomes
Project Summary: Over the last two decades, significant experimental efforts have determined large sets of "reference" interactions for humans and other model organisms, along with substantial knowledge about the binding specificities of proteins, including for a large fraction of human transcription factors (TFs). The resulting data have proven to be an incredibly useful resource for understanding how cells function; nevertheless, they do not capture how molecular interactions and networks differ from the reference across individuals. Indeed, while human genomes in both healthy and disease populations are rapidly being sequenced, the corresponding individual-specific interaction networks remain largely unexamined; this represents a major gap in our knowledge, as mutations that alter molecular interactions underlie a wide range of human diseases. Further, the substantial amount of genetic variation across populations makes it infeasible in the near term to experimentally determine per-individual interaction networks. Thus our long-term goal is to develop computational methods to uncover whether and how mutations within coding and non-coding portions of the genome perturb cellular interactions and networks. Our specific aims are: (1) We will develop computational structure-based approaches to identify and catalog, at proteome-scale, variations within proteins that are likely to impact their ability to bind with DNA, RNA, small molecules, peptides or ions, thereby providing a comprehensive resource for analyzing protein interaction variation. (2) We will develop novel structure-based and probabilistic methods to predict how DNA-binding specificities are altered when a TF is mutated; since mutated TFs have been linked to numerous diseases, this will be a great aid in understanding disease networks and pathology.
(3) We will develop new methods to uncover non-coding somatic mutations that alter human regulatory networks in cancer; this is a critical step towards ultimately uncovering patient-specific cancer networks. Overall, by pursuing these aims, which integrate mutational information with existing knowledge about reference interactions, interfaces and specificities, we will develop novel computational methods that will significantly advance our understanding of molecular interactions perturbed in disease and healthy contexts.
|
1 |