1994 — 1996 |
Pevzner, Pavel A |
R01 (Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.) |
Algorithms and Software For Sequencing by Hybridization @ Pennsylvania State University-Univ Park
Sequencing by Hybridization (SBH) is a challenging alternative to the classical DNA sequencing methods. Several SBH problems must still be jointly addressed by biochemists, computer scientists and instrument designers. In this proposal we address the computer science aspects of SBH and hope to bring these communities into closer contact. Some unsolved computer science problems seriously slow down the development of the biochemical and instrumentation aspects of SBH. In close collaboration with biology and instrumentation SBH groups, we identified the most important computer science problems that need to be addressed to benefit the further development of SBH. The proposal describes a project to study these problems:
* User-friendly SBH software in the public domain within the first year.
* Algorithms and software for sequence reconstruction by optimized chips.
* Algorithms and software allowing biologists to check their hypotheses and to optimally choose additional biochemical experiments for SBH.
* Algorithms and software to eliminate "blind" choice of PCR primers by SBH data and reduce the amount of conventional sequencing in molecular evolution studies.
* Algorithms and software to combine small-scale SBH and primer walking DNA sequencing and to parallelize ordered sequencing strategies.
* Computational analysis of PCR-SBH and estimation of the resolving power of PCR-SBH.
|
0.976 |
2001 — 2003 |
Pevzner, Pavel A |
R01 |
A New Approach to Fragment Assembly in DNA Sequencing @ University of California San Diego
DESCRIPTION (provided by applicant): For the last twenty years, fragment assembly in DNA sequencing has followed the "overlap - layout - consensus" paradigm that is used in all currently available assembly tools. Although this approach proved to be useful in clone-by-clone DNA sequencing, it faces difficulties in genomic shotgun assembly: the existing algorithms make assembly errors and are often unable to resolve repeats even in prokaryotic genomes. Biologists are well aware of potential assembly errors and are forced to carry out additional experiments to verify the assembled contigs. We abandon the classical "overlap - layout - consensus" paradigm in favor of a new Eulerian Superpath approach to fragment assembly. This allows us to reduce fragment assembly to a variation of the classical Eulerian path problem and, for the first time, to resolve the problem of repeats in fragment assembly. Our new EULER algorithm resolves all repeats except long 100 percent perfect repeats that are theoretically impossible to resolve without additional experiments. This reduction allows one to generate provably optimal error-free fragment assemblies. The main goal of this proposal is to scale up EULER for assembly of eukaryotic genomes.
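The reduction to an Eulerian path problem can be illustrated with a minimal sketch. This is the textbook de Bruijn graph construction for error-free reads, not the EULER Superpath algorithm itself; the function names `de_bruijn`, `eulerian_path`, and `assemble`, the choice of k, and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: one edge per distinct k-mer,
    from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return a path that uses every edge once (Hierholzer's algorithm).
    Assumes such a path exists; no validation is done in this sketch."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (u for u in sorted(graph) if len(graph[u]) - in_deg[u] == 1),
        min(graph),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical overlapping, error-free reads.
print(assemble(["ATGCTA", "GCTAGC", "CTAGCA"], 4))
```

Because this sketch keeps only distinct k-mers, a repeat longer than k-1 makes several Eulerian paths possible, which is one way to see why repeats are the hard part of assembly.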
|
1 |
2001 — 2007 |
Mcginnis, William (co-PI) Bier, Ethan Pevzner, Pavel |
N/A |
Multiplex in Situ Visualization of the Drosophila Transcriptome in Blastoderm Embryos @ University of California-San Diego
0120728 Bier
A major challenge following the completion of genome sequencing is to determine the expression patterns of all genes during development and in the adult. Obtaining these data is critical if we are to unravel the complex regulatory networks that regulate genome expression. Although whole-genome gene expression can be analyzed by current microarray techniques, this method lacks spatial discrimination and is relatively low resolution in time and magnitude, since by its nature it measures average gene expression levels over large heterogeneous cell populations. To understand the regulatory interrelationships between genes, many of which are regulated in highly dynamic and spatially restricted patterns, one must ultimately know how the genome is expressed on a cell-by-cell basis throughout development. The most obvious way to obtain fine-scale gene expression data at single-cell resolution is by performing genome-scale in situ hybridization experiments.
Dr. Bier, Dr. McGinnis, and Dr. Pevzner will address this problem jointly by developing a multiplex in situ hybridization method that will greatly facilitate and enable the acquisition of genome expression data at single cell resolution. In addition, this collaborative team will validate the method by applying it to two well defined hypothesis driven questions. The specific goals of this proposal are to: 1) Develop a multiplex RNA in situ hybridization labeling technique, 2) Analyze Hox gene regulatory networks repressing limb development, and 3) Identify genes mediating cross-talk between signaling pathways.
Impact Statement: Because the same genetic systems create pattern during development in diverse metazoans, fine spatial and temporal scale analysis of these regulatory relationships in Drosophila will provide an essential framework for analyzing how these core genetic pathways have served as substrates for modification by natural selection during evolution to tailor body plans to different environments and ecological niches. This knowledge is essential for resolving deep structures of metazoan phylogeny and may reveal whether multicellular metazoans co-opted a polarity generating mechanism present in facultative colonial unicellular organisms to create metameric pattern along the A/P axis. In addition, the methodologies we develop and the understanding we gain of cellular responses to developmental signals will form the basis for creating detailed mathematical models of cellular states and will be critical for evaluating how adult organisms respond
|
0.915 |
2002 — 2004 |
Pevzner, Pavel A |
R01 |
Computer Mass-Spectrometry @ University of California San Diego
DESCRIPTION (provided by applicant): Protein identification is an important problem in many proteomics projects, including studies of post-translational modifications, protein-protein interactions, and protein functions. This grant proposal focuses on the computational aspect of tandem mass spectrometry (MS-MS), a technology of choice in many proteomics projects. Computational analysis is often a bottleneck in applying this technology for protein identification since the corresponding combinatorial problems remain largely unsolved. In close collaboration with leading mass-spectrometry groups we propose to develop new algorithms and software for identification of mutated and modified proteins, de-novo protein sequencing, DXMS, and gene validation via mass-spectrometry. We also propose to implement MS-ALIGNMENT, MS-SEQUENCING, and MS-GENESEARCH software tools as a web server available for mass-spectrometry researchers.
|
1 |
2004 — 2008 |
Pevzner, Pavel A |
P41 |
Eulerian Path Approach to DNA Fragment Assembly @ University of California San Diego
Algorithms; CRISP; Computer Retrieval of Information on Scientific Projects Database; Consensus; DNA Sequence; Face; Funding; Genome; Genomics; Grant; Institution; Investigators; Masks; NIH; National Institutes of Health; National Institutes of Health (U.S.); Paper; Research; Research Personnel; Research Resources; Researchers; Resources; Shotguns; Source; United States National Institutes of Health; experiment; experimental research; experimental study; facial; new approaches; novel approaches; novel strategies; novel strategy; research study; tool
|
1 |
2005 — 2007 |
Pevzner, Pavel |
N/A |
Recomb Conference Support @ University of California-San Diego
ABSTRACT
NSF-0507056
Pevzner, Pavel
Research on Computation in Molecular Biology (RECOMB) is the leading meeting for bringing computer scientists and molecular biologists together to focus on the computational aspects of this interdisciplinary field. The steering committee members are internationally recognized leaders in computer science, statistics and computational molecular biology. The meeting provides peer-reviewed plenary presentations, poster sessions, and publication of papers and poster abstracts. The atmosphere encourages researchers at all levels to engage in discussion and exploration. Funds enable graduate students to participate in the meeting. The recruiting focus is on junior researchers and members of under-represented groups.
|
0.915 |
2006 — 2008 |
Pevzner, Pavel A |
R01 |
Algorithms & Software For Computational Mass Spectrometry @ University of California San Diego
DESCRIPTION (provided by applicant): This grant proposal addresses the following important problems in computational proteomics:
1. Development of new filters for MS/MS database searches that will dramatically reduce the running time for identification of proteins, and of post-translationally modified proteins in particular.
2. Design of new algorithms for matching MS/MS spectra against alternative splicing databases.
3. Development of algorithms for shotgun protein sequencing by clustering and assembly of overlapping spectra. Application of clustering and assembly of MS/MS spectra to analysis of post-translational modifications. Improving the state of the art in de novo sequencing through analysis of paired MS/MS spectra and generation of reliable sequence tags derived from paired spectra.
4. Development of computational tools for analyzing the relative abundance of peptides in protein samples.
Relevance: Mass spectrometry is a key technology for proteomics, and is increasingly used for research that directly impacts human health. Examples include, but are not limited to, discovery of protein biomarkers that can be used as diagnostics, and small peptides that can be used directly as therapeutics. However, computational analysis of mass spectrometry data remains a significant bottleneck. The proposed research addresses computational challenges in the analysis of mass spectrometry data.
|
1 |
2008 — 2013 |
Pevzner, Pavel A |
P41 |
Center For Computational Mass-Spectrometry @ University of California San Diego
DESCRIPTION (provided by applicant): This application seeks support for a center of excellence in computational mass spectrometry and a national and international resource in the broad area of proteomics. It proposes to enlarge the current research activities, to branch into previously unexplored areas of computational proteomics, and to support multiple collaborative efforts. The proposal addresses the computational bottleneck that affects the entire proteomics community and impairs interpretation of data in thousands of experimental labs around the world. The goal is to bring the modern algorithmic technologies to mass-spectrometry and to build a new generation of reliable open access software tools to support both new development in mass-spectrometry instrumentation and the emerging applications of mass-spectrometry. The proposal focuses on four directions: (i) enabling complex mass spectrometry searches, (ii) analyzing unknown proteomes without protein databases, (iii) analyzing altered proteomes, and (iv) constructing proteogenomic annotations and analyzing pathways. These directions cover both well-studied but still inadequately addressed problems (like search for mutations and post-translational modifications) and unexplored problems for which there are no computational tools currently available (like antibody sequencing or analyzing fusion proteins in cancer). These projects require two-way collaborative efforts on a wide range of topics involving biomedical and computational scientists from various institutions. 
While many collaborations have already been established in San Diego (UCSD and Burnham Institute), with sixteen other US universities, hospitals and biotechnology companies, and with foreign research institutions in Germany, Singapore, Spain, Sweden, and the United Kingdom, we propose to further extend these collaborations by developing robust open access mass spectrometry software that will catalyze the exchanges between experimental and computational researchers in proteomics. The biomedical applications addressed in these collaborative projects include but are not limited to (i) discovery of cancer biomarkers, (ii) elucidation of changes in aged cataractous lens, (iii) understanding how bacteria adjust to antibiotics and other harsh conditions, (iv) addressing the need to constantly reformulate the influenza vaccine to keep it effective, and (v) sequencing of snake venoms that proved instrumental in the design of blood clotting drugs. Educational activities in the area of computational proteomics will also be developed, including short courses, a seminar program, an annual conference, and concerted education of students and postdocs.
|
1 |
2008 — 2010 |
Dorrestein, Pieter C Pevzner, Pavel A |
R01 |
New Approaches to Sequencing of Complex Peptides. @ University of California San Diego
DESCRIPTION (provided by applicant): Nonribosomal peptides (NRPs) such as penicillin, vancomycin and related molecules isolated from microbial sources have been a staple of drug discovery for many decades. We propose to employ multi-stage mass-spectrometry (MSn) for de novo sequencing of NRPs, including cyclic NRPs. Analysis of MSn spectra of a cyclic peptide results in the difficult combinatorial problem of interpreting multiple linear peptides from the same spectrum. This proposal develops new combinatorial algorithms for solving this problem. Since MSn-based mass spectrometry analysis of NRPs is fast and inexpensive and requires minimal amounts of material (<1 µg), this approach opens the possibility of high-throughput sequencing of many unknown NRPs accumulated in large bioactivity marine cyanobacterial screening programs. In parallel with the automation of the NRP sequencing efforts, we will harvest a set of orphan gene clusters from marine actinomycetes to generate a library of cyclic peptides. The algorithms developed in this proposal will be used to fully characterize this cyclic imine library. This work not only sets the stage for the automated characterization of NRPs but will also be applicable to the characterization of other peptidic natural products such as peptaibols, peptide-derived toxins or lantibiotics. PUBLIC HEALTH RELEVANCE: This project describes the development and application of a novel mass spectrometry based method and corresponding algorithms that allow the de novo sequencing of complex therapeutic agents that are non-ribosomally derived.
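To make the combinatorial difficulty concrete: a cyclic peptide with n residues can open at any of its n bonds, so one spectrum mixes fragments from n different linear peptides. The sketch below is only an illustration under simplifying assumptions (integer residue masses, no ion types, no water or charge offsets, no noise); it is not the algorithm developed in the proposal, and the names `linearizations` and `cyclic_fragment_masses` are hypothetical.

```python
def linearizations(residues):
    """All linear peptides obtained by opening the ring at each bond."""
    n = len(residues)
    return [residues[i:] + residues[:i] for i in range(n)]


def cyclic_fragment_masses(masses):
    """Sorted masses of every contiguous fragment of a cyclic peptide,
    plus 0 (empty fragment) and the full ring mass.
    `masses` is a list of residue masses in ring order (assumed integers)."""
    n = len(masses)
    doubled = masses + masses  # lets fragments wrap around the ring
    fragments = [0, sum(masses)]
    for length in range(1, n):
        for start in range(n):
            fragments.append(sum(doubled[start:start + length]))
    return sorted(fragments)


# Toy ring of three residues with nominal masses for G, A, S.
ring = [57, 71, 87]
print(linearizations(["G", "A", "S"]))
print(cyclic_fragment_masses(ring))
```

A ring of n residues yields n(n-1) + 2 fragment masses in this simplified model, and sequencing means recovering the ring order from such a list when neither the order nor the residue masses are known in advance.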
|
1 |
2010 |
Pevzner, Pavel A |
P41 |
Proteomic Analysis of Extinct Species to Validate Evolutionary Links @ University of Washington
This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. Researchers (Asara et al., Science 2007) have started using mass spectrometry to identify proteins preserved in fossils of extinct species such as T. rex and mastodon to provide evolutionary links with extant species. These recent studies have identified collagen proteins and a hemoglobin protein that contain peptide sequences matching multiple different extant taxa. There is much controversy over whether these identified peptide sequences are valid or are artifacts of contamination. We are currently analyzing six fossilized dinosaur bone samples, three sediment samples and a bone sample from the femur of a cave bear to see what proteins we can detect using our sample preparation and mass spectrometry methods.
|
0.955 |
2011 — 2015 |
Hoffmann, Alexander (co-PI) Pevzner, Pavel A Subramaniam, Shankar |
T32 (Activity Code Description: To enable institutions to make National Research Service Awards to individuals selected by them for predoctoral and postdoctoral research training in specified shortage areas.) |
Graduate Training Program in Bioinformatics @ University of California San Diego
DESCRIPTION (provided by applicant): Biology is increasingly becoming an information-driven science. To harness the opportunities of the post-genomic era in furthering health sciences research and improving health care, there is an enormous demand for biologists who are trained in mathematics and computer science and can think quantitatively. However, current disciplinary graduate training programs are not designed to accommodate these rapid changes in the biological research perspective. This need serves as the motivation for the development of specialized graduate training programs that will train students at the interface between biology, engineering and computer science. To address this need, UCSD established an interdisciplinary Graduate Program in Bioinformatics in 2001 under the directorship of Dr. Shankar Subramaniam. In 2008, it was renamed the Graduate Program in Bioinformatics and Systems Biology and reorganized, with Drs. Pavel Pevzner and Alex Hoffmann taking on the directorship and with continued guidance from Dr. Subramaniam and an active steering committee containing representative faculty from all five participating UCSD schools and academic divisions. The primary objectives of this renewal application of the Training Grant by the three co-PIs are to continue and expand this premier Graduate Program and to support the highest-quality students in their truly interdisciplinary training, which blends biomedicine, computer science and engineering. Indeed, in the course of their training, Program students have contributed important discoveries and impactful advances in health sciences research. The Program will continue to offer a core curriculum and a host of electives that will prepare a student to solve difficult problems in biomedical research. Research training begins with a set of research rotations in laboratories of faculty members, and continues through doctoral research work under the supervision of a Ph.D. advisor and co-advisor who provide complementary interdisciplinary expertise, as well as the doctoral thesis committee. The Program will continue to hold a recently established weekly Colloquium, the student Journal Club, and an annual retreat. Given the extraordinary number and quality of applicants, the capacity and eagerness of the Program faculty to train the Program's students, and the institutional support for the Program, this application seeks to increase the number of trainee slots to 18. Following Training Grant support of graduate students during their course work and initial research training, all graduate students will be supported by their thesis advisors for the duration of their Ph.D. studies.
PUBLIC HEALTH RELEVANCE: The purpose of this doctoral training program is to train students in the emerging interdisciplinary area of Bioinformatics and Systems Biology. The NIH has recognized that there is a critical need for such interdisciplinary training that integrates the biomedical, computational and engineering sciences, in order to harness the opportunities of the post-genomic era to further health sciences research and improve health care.
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Technology Research and Development Project 2: Antibiotics Sequencing and Drug Discovery @ University of California San Diego
Project Summary: Mass spectrometry is based on fragmenting biological molecules into smaller pieces, and using the fragment masses as a fingerprint for identifying and quantifying bio-molecules. It is the dominant technology for studying active molecules in healthy and diseased tissue, and identifying protein targets and natural products for novel therapeutics. When the initial proposal for the Center for Computational Mass Spectrometry (CCMS) was submitted in 2007, the lack of adequate computational tools for analyzing mass spectrometry data was the key bottleneck. With great success in enabling applications of new experimental techniques such as FTMS, ETD, HCD, top-down mass spectrometry, and many others, the mandate of CCMS continues to be the development of next generation computational technologies and to apply them to open experimental. In this proposal, we will capitalize on our recent results in diverse subfields of computational proteomics and will further branch into previously unexplored MS applications. We will focus specifically on bridging proteomics and genomics technologies using 6 technology research and development platforms. Specifically, we will (a) apply a proteogenomics approach to the discovery of aberrant cancer genes and the analysis of antibody repertoires; (b) sequence natural antibiotics; (c) collate spectral data through spectral archives and networks; (d) develop universal tools for peptide identification; (e) develop tools for top-down proteomics; and (f) analyze multiplexed spectra. The technology platforms are driven by a multitude of collaborative biomedical studies where the use of CCMS-developed tools is essential for their success.
These studies include (a) unraveling the combinatorial histone code in human diseases; (b) a proteogenomics approach to studies of oral microbiome and polybacterial infections; (c) detecting inter-species chemical interactions; (d) developing a systems approach towards the therapeutic modulation of the acetylome; (e) developing tools for monoclonal and polyclonal antibody sequencing; (f) development of breast cancer vaccines; (g) clinical cancer proteogenomics; (h) discovery of lantibiotics; (i) discovering proteomic biomarkers for drug toxicity in cancer patients; and (j) identifying protein-protein interactions and post-translational modifications in cataractous lens. These projects require three-way collaborative efforts on a wide range of topics involving biomedical scientists, mass spectrometrists, and computational scientists from various institutions. CCMS will also train students and practicing scientists from all over the world in computational proteomics, and educate the proteomics community about modern computational mass spectrometry to encourage its wide adoption.
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Technological Research and Development Project 4: Universal Tools For Peptide Identification and Sequencing @ University of California, San Diego
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Technology Research and Development Project 5: Top-Down Proteomics @ University of California, San Diego
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Training @ University of California San Diego
|
1 |
2014 — 2016 |
Pevzner, Pavel A |
R25 Activity Code Description: For support to develop and/or implement a program as it relates to a category in one or more of the areas of education, information, training, technical assistance, coordination, or evaluation. |
Integrated Active Learning Framework for Biomedical BD2K @ University of California San Diego
DESCRIPTION: The proposed project will create active and adaptive open online resources for students and educators. We propose the development of two massive open online courses (MOOCs) for Biomedical Big Data (BBD). BBD for Bioinformaticians will be aimed at bioinformatics students who know some introductory programming and need specialized tutorials focusing on BBD analysis. BBD for Biologists will provide biologists having no previous exposure to programming with the skills required to effectively apply existing software tools in BBD. These MOOCs will have three different adaptive learning tracks that will help guide readers through the courses based on their computational experience. Creating such an adaptive environment would not be possible without our substantial experience in offering the first bioinformatics MOOC, Bioinformatics Algorithms, in fall 2013 on Coursera. By making our learning materials open for use by individual learners and professors, we hope to bring down resource barriers that have prevented BBD courses from growing at individual universities. We will also develop two new problem tracks on our online Rosalind platform that facilitates independent learning of bioinformatics through automatically tested challenges. One of these problem sets will focus on implementing the algorithms required for BBD analysis; the second problem set will focus on applying existing online tools to analyze BBD. By creating a comprehensive set of assessments, we will eliminate the need for BBD professors to ever again think about automating their own homework assignments. Combined with the efforts of our open, adaptive learning environment, these problem sets will help reduce the barriers to creation of new BBD courses at universities. We will foster an open community of BBD educators by forming the BBD Education Alliance. This network will be founded at the RECOMB Conference on Bioinformatics Education at UCSD in 2015, which will focus on BBD education. 
Members in the alliance will create open learning modules to supplement our content as well as provide feedback to other members of the alliance on their modules. These educators will also work to design BBD courses at their own universities. Finally,
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Driving Biomedical Projects @ University of California San Diego
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Dissemination @ University of California San Diego
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Collaborations and Service @ University of California San Diego
|
1 |
2014 — 2018 |
Bafna, Vineet (co-PI); Bandeira, Nuno Filipe Cabrita; Pevzner, Pavel A |
P41 |
Center For Computational Mass Spectrometry @ University of California San Diego
DESCRIPTION: Mass spectrometry is based on fragmenting biological molecules into smaller pieces and using the fragment masses as a fingerprint for identifying and quantifying biomolecules. It is the dominant technology for studying active molecules in healthy and diseased tissue, and for identifying protein targets and natural products for novel therapeutics. When the initial proposal for the Center for Computational Mass Spectrometry (CCMS) was submitted in 2007, the lack of adequate computational tools for analyzing mass spectrometry data was the key bottleneck. With great success in enabling applications of new experimental techniques such as FTMS, ETD, HCD, top-down mass spectrometry, and many others, the mandate of CCMS continues to be the development of next-generation computational technologies and to apply them to open experimental. In this proposal, we will capitalize on our recent results in diverse subfields of computational proteomics and will further branch into previously unexplored MS applications. We will focus specifically on bridging proteomics and genomics technologies using six technology research and development platforms. Specifically, we will (a) apply a proteogenomics approach to the discovery of aberrant cancer genes and the analysis of antibody repertoires; (b) sequence natural antibiotics; (c) collate spectral data through spectral archives and networks; (d) develop universal tools for peptide identification; (e) develop tools for top-down proteomics; and (f) analyze multiplexed spectra. The technology platforms are driven by a multitude of collaborative biomedical studies where the use of CCMS-developed tools is essential for their success.
These studies include (a) unraveling the combinatorial histone code in human diseases; (b) a proteogenomics approach to studies of the oral microbiome and polybacterial infections; (c) detecting inter-species chemical interactions; (d) developing a systems approach towards the therapeutic modulation of the acetylome; (e) developing tools for monoclonal and polyclonal antibody sequencing; (f) development of breast cancer vaccines; (g) clinical cancer proteogenomics; (h) discovery of lantibiotics; (i) discovering proteomic biomarkers for drug toxicity in cancer patients; and (j) identifying protein-protein interactions and post-translational modifications in cataractous lens. These projects require three-way collaborative efforts on a wide range of topics involving biomedical scientists, mass spectrometrists, and computational scientists from various institutions. CCMS will also train students and practicing scientists from all over the world in computational proteomics, and educate the proteomics community about modern computational mass spectrometry to encourage its wide adoption.
|
1 |
2014 — 2018 |
Pevzner, Pavel A |
P41 |
Administration and Management @ University of California San Diego
|
1 |
2016 |
Pevzner, Pavel A |
R25 Activity Code Description: For support to develop and/or implement a program as it relates to a category in one or more of the areas of education, information, training, technical assistance, coordination, or evaluation. |
Development of a New MOOC 'Programming for Biologists' @ University of California San Diego
|
1 |
2017 — 2020 |
Pevzner, Pavel |
N/A |
NSF/MCB-BSF: Developing a Metavirome Assembler for Uncovering the Global Virome @ University of California-San Diego
While viruses are the most diverse biological entities on Earth, challenges in their genome sequencing have prevented surveys of Earth's virome. The key obstacle to exploring viral diversity is the absence of computational methods for assembling virus genomes from the short overlapping sequences of DNA obtained when sequencing DNA from a wide range of environments. This project will develop such methods to enable a more complete cataloging of virus genomes and provide a critical new resource for microbiology. In addition, it will result in new online learning classes at UCSD and Tel Aviv University as well as an online capstone project aimed at analyzing viral genomes. This activity will extend the online Coursera Specialization "Bioinformatics" and the Massive Open Online Course (MOOC) "Gut Check: Exploring Your Microbiome" by including a new metagenomics component. This educational effort will reach thousands of students from all over the world, since these MOOCs have large enrollments.
This project will result in new genome assemblers specifically aimed at reconstructing the viral component of metagenomes. It will enable the development of a new metavirome assembler aimed at discovery of complete viral genomes across a wide range of environments. This research will result in reassembling all publicly available metagenomics datasets to compile the global catalog of complete viral genomes, which will enable the exploration of viral diversity.
This collaborative US/Israel project is supported by the Division of Molecular and Cellular Biosciences, Biological Sciences Directorate and the Computational Biology activity in the Computer and Information Sciences and Engineering Directorate at the US National Science Foundation and by the Israeli Binational Science Foundation.
|
0.915 |
2020 — 2022 |
Pevzner, Pavel |
N/A |
EAGER: Assembling the Immunoglobulin Loci Across Mammalian Species and Across the Human Population @ University of California-San Diego
The ongoing COVID-19 pandemic raised the challenge of developing new computational techniques for understanding the immune response to emerging pathogens. This project will develop a new computer program for analyzing the parts of the genome that are responsible for immune responses to pathogens. This will enable analyses of the development of immunity to the virus in human populations by characterizing the mutations that are essential to the immune response. This research will also enhance understanding of the immune responses in bats, which are hosts to large numbers of coronaviruses, and in llamas, which are potentially an important source for a robust vaccine against the virus. The project will also develop a Massive Open Online Course, "Bioinformatics of SARS-CoV-2", that will cover various aspects of computational analysis of emerging pathogens and the challenges of analyzing these complex and dynamic parts of the human genome.
Information about the immunoglobulin-encoding regions in the genome is critically important for analyzing the immune response to the novel SARS-CoV-2 coronavirus, developing antibody drugs against SARS-CoV-2, and testing future vaccines against SARS-CoV-2. However, there are still no algorithms for inferring the sequences of the highly complex and rapidly evolving immunoglobulin (IG) loci. Moreover, although the immunoglobulin genes within the IG loci form the building blocks of antibodies, there are still no software tools for accurate identification of these genes. This project will develop a new algorithm for assembling and annotating the IG loci. These methods will be applied to multiple mammalian genomes with a focus on assembling the unusually complex IG loci in bats to characterize their immune response to coronaviruses. A second focus will be assembly of camelid IG loci to aid in developing vaccines against SARS-CoV-2, taking advantage of the simpler and more stable antibodies present in this group. The project will also focus on assembling the highly variable immunoglobulin loci in multiple COVID-19 patients with the goal of revealing functionally important mutations in these loci. The project entails substantial risk, because the IG loci have been recalcitrant to all previous assembly attempts, but enhanced understanding of mammalian immunoglobulin genes would greatly enhance the ability of society to predict and respond to current and future pandemics.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.915 |
2020 — 2021 |
Pevzner, Pavel A |
R25 Activity Code Description: For support to develop and/or implement a program as it relates to a category in one or more of the areas of education, information, training, technical assistance, coordination, or evaluation. |
Development of Online Computational Genomics Specialization @ University of California, San Diego
Project Summary: In modern biological research, computing has become an integral component of Biological Big Data (BBD) analysis, yet education in computing has not been fully incorporated into life science education: many biologists are given a diluted treatment of computational genomics that presents the methods central to the field as nothing more than a toolkit. The pedagogical challenge facing the development of a computational genomics curriculum is the need to convey the important ideas without assuming previous exposure to programming. Biologists would also profit from knowing how to effectively apply various existing genomics software tools and, at the same time, understand how these tools work, a condition that is often violated in existing courses. We believe that high-quality online computational genomics education offers a particularly attractive solution to the problem because many universities have failed to address this challenge. It offers a promising pedagogical innovation because it is not replacing anything, but rather is fulfilling an important need. It bypasses the need for extensive curricular reform at the level of individual universities and instead adapts to high-quality, open online resources that lower the cost per student. We believe that our proposed online Computational Genomics Specialization will contribute to various offline courses (e.g., by enabling a flipped course) that will be developed in response to the same NIH Funding Opportunity Announcement, "Initiative to Maximize Research Education in Genomics."
We have published popular bioinformatics and algorithms textbooks, published papers on various challenges of education in computational biology in reputable journals, delivered a TEDx talk on online education, founded a conference specializing in bioinformatics education (RECOMB-BE), developed multiple successful MOOCs in bioinformatics (including the first bioinformatics MOOC), and advised the development of Rosalind, an open online platform for learning bioinformatics through problem solving that has been used by over 100 professors. Our goals are (1) to develop open, modular, extendable, and adaptable MOOCs covering a broad range of topics in modern computational genomics, (2) to use the developed MOOCs to competitively recruit participants into the proposed offline computational genomics short courses and to bring underrepresented minorities to these events, and (3) to establish the Computational Genomics Education Alliance, a community of educators who will help develop open, high-quality, modular online content and serve as instructors at our annual courses.
|
1 |