2009 — 2014 |
Korkin, Dmitry |
N/A |
CAREER: A Computational Approach to Study Molecular Mimicry in Host-Pathogen Interactions @ University of Missouri-Columbia
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
This is a CAREER award to support the research of Dr. Dmitry Korkin, in the Computer Science Department and Informatics Institute at the University of Missouri-Columbia. Dr. Korkin is a second-year, tenure-track Assistant Professor. Infections are complex biological processes that are common among a variety of microbial pathogens, targeting host organisms from virtually all kingdoms of life. The pathogen's strategy for entering the host organism and breaching its immune defenses often involves interactions between host and pathogen proteins. In many cases, the pathogen can alter the host's cellular functions for its own benefit by mimicking either the entire structure of a host protein or an important functional part of it. The PI is developing computational methodologies for accurate detection of pathogen protein mimicry. These methods are improving our ability to predict virulent pathogen mimicry when pathogen and host proteins are structurally similar. In addition, this research is developing new methods to predict mimicry when the structures of the pathogen and host proteins are unrelated but their functional elements are related. A comprehensive web-based database of host-pathogen mimicry interactions is being produced from this research. This research will help biologists elucidate the key mechanisms behind the mimicry phenomenon and will provide crucial information for combating infections in plants, animals, and humans. Databases and tools produced under this project will be accessible via the PI's web site at http://korkinlab.org.
As part of his CAREER plan, Dr. Korkin is developing computational methods for accurate prediction and characterization of host-pathogen interactions. The PI is integrating this research with education by attracting a new generation of scientists to the study of biology using computers (i.e., computational biology), broadening the participation of women and groups traditionally underrepresented in science, and designing unique courses that provide interdisciplinary training in computational biology.
|
0.939 |
2012 — 2016 |
Korkin, Dmitry; Thelen, Jay; Xu, Dong (co-PI) |
N/A |
Systems Analysis of Protein Interactome in Cytosol and Nuclei of Developing Soybean and Rapeseed Embryos @ University of Missouri-Columbia
PI: Jay Thelen (University of Missouri-Columbia) CoPIs: Dong Xu, Dmitry Korkin (University of Missouri-Columbia)
Seeds are carbon- and nitrogen-dense structures that serve as the (plant) propagule and the (human) socioeconomic foundation for modern agriculture. Genetics and environmental conditions dictate the developmental program that ultimately determines both seed yield and composition, although the underlying regulatory proteins and mechanisms are only now beginning to be understood. Research on the regulation of seed development has recently focused on transcriptional and epigenetic control of both embryogenesis and maturation. From a biochemical perspective, few regulatory processes take place within a cell without direct or indirect involvement of proteins and protein-protein interactions. The nucleus and cytosol are the major locations for transcriptional, translational, and post-translational regulation. A quantitative inventory of the proteins involved in these processes and of their interactome network would be a valuable resource for developing testable models of regulatory control of embryo maturation. The main focus of this project is the parallel characterization of nuclear and cytosolic protein interactomes from developing embryos of soybean and rapeseed using a combination of biochemical, proteomic, and bioinformatic approaches.
The major objectives of this project are to: 1) systematically and quantitatively analyze protein-protein interactions from cytosolic and nuclear fractions of developing embryos from soybean and rapeseed using a multi-dimensional biochemical strategy that includes protein crosslinking, co-sedimentation, and quantitative mass spectrometry; 2) perform bioinformatic and statistical analyses of potential interacting proteins using data from aim 1 for comparison between soybean and rapeseed, and develop generic protein complex identification tools for such data; and 3) disseminate software, proteomic and protein interaction data by deposition into extant databases (PRIDE, http://www.ebi.ac.uk/pride/; Database of Interacting Proteins, DIP, http://dip.doe-mbi.ucla.edu/; Soybean Knowledge Base, http://soykb.org) and develop a new web database.
While the research project targets the protein interactome in developing soybean and rapeseed embryos, it has broader significance for studying plants (especially crops) in general. This study will greatly enrich the data describing plant protein interactions, which are currently limited. These data can be used as a template for studying protein interactomes of other plants. Besides the data and their analyses, the techniques and bioinformatic tools that will be developed can also be applied to study interactomes of other species; in particular, experimental protocols and bioinformatic tools (including source code) will be made freely available to the public. The project will provide critical interdisciplinary training in plant biology, proteomics, and bioinformatics for undergraduates, graduate students, and postdoctoral scholars. Current discipline-based education provides very limited training in integrating biology, biotechnology, and bioinformatics, which is often needed in large-scale biological projects. This project offers a problem-based, interdisciplinary environment for such training. Students and postdocs will be mentored in scientific writing, scientific presentations, publishing, grant applications, constructive peer review, project management, time management, and personnel management in an effort to direct the next generation of scientists towards the realistic and professional expectations of modern academic or commercial research. Efforts will be made to recruit people with diverse backgrounds to work on this project. Undergraduate researchers in life sciences and journalism will have opportunities to gain unique experience in computational biology and scientific communication through support from a recently awarded Howard Hughes Medical Institute grant.
|
0.939 |
2015 — 2018 |
Korkin, Dmitry |
N/A |
ABI Innovation: Discovering Elements of Extreme and High Conservation in Eukaryotic Genomes @ Worcester Polytechnic Institute
The project seeks to address the fundamental problem of finding regions of eukaryotic genomes that share a remarkable property: when compared between the genomes of two or more species, the sequences from these regions appear identical, or differ by only a very small number of errors (mutations). Although the phenomenon was discovered more than 10 years ago, its origins and its functional implications for living organisms remain a mystery. This is not surprising: until now, cataloging all such regions between all species has been a computationally infeasible task that requires a conceptually different computational approach. In this project, a series of novel algorithms will be developed that take full advantage of the kind of biological data being processed and the type of hardware they run on. Optimization of the algorithms will be guided by a biological hypothesis on the distribution of the extreme genomic elements. The algorithms will also be designed to make optimal use of the internal memory of the computing processors, one of the main bottlenecks of conventional software. The success of this project will have important implications. It will provide new insights into eukaryotic evolution and introduce a new functional class of genomic elements. Moreover, based on the recent literature and preliminary data from this project's team, these extreme elements may be implicated in a number of complex genetic disorders in humans. Another important advancement is the introduction of a new computational paradigm in genomics and bioinformatics: designing algorithms that are optimized for both the biological data and the computing hardware. The project also proposes a series of interlinked educational activities targeting not only undergraduate and high-school students but also high-school teachers, to help them integrate bioinformatics and computational genomics into the high-school biology curriculum.
By including both the teacher and student components, the goal is to further broaden the impact by encouraging women and underrepresented minorities to pursue careers in genomics and informatics and by involving them in outreach to their parents and high-school peers.
The goal of this project is to develop computational methodology for fast and comprehensive detection of regions of extreme and high conservation within one and across multiple eukaryotic genomes. The two main classes of genomic elements targeted by this study are long identical multispecies elements (LIMEs), which include but are not limited to ultraconserved elements (UCEs), and near-identical multispecies elements (NIMEs), highly similar genomic regions that allow only a few mismatches. The project includes three main research aims. Aim 1 is to develop new tools and improve existing tools for comprehensive genome-wide determination of regions of extreme and high conservation, LIMEs and NIMEs. Aim 2 is to apply the developed algorithms to determine a complete atlas of elements of extreme and high conservation in eukaryotes and to test biological hypotheses on their evolution, structural organization, and relationship with genetic variation within species populations. Finally, Aim 3 is to disseminate data on regions of extreme and high conservation and computational tools for their detection. The educational activities will include three components: (1) attracting high-school students to the computational undergraduate sciences by educating them about research in computational genomics and bioinformatics, (2) attracting new undergraduate students to, and retaining existing ones in, the Ph.D. program in informatics by involving them in interdisciplinary research, and (3) supporting high-school teachers in integrating bioinformatics and computational genomics into the high-school biology curriculum. The datasets and software tools will be freely available to the public at http://korkinlab.org.
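To make the notion of a long identical element concrete, the following is a minimal Python sketch, not the project's algorithm. It assumes a simplified definition (a maximal exact match of at least `min_len` bases between two sequences) and uses a quadratic-time dynamic program suitable only for toy inputs; the function name `shared_identical_regions` and the example sequences are hypothetical.

```python
def shared_identical_regions(seq_a, seq_b, min_len):
    """Return maximal exact matches as (start_a, start_b, length) tuples.

    Illustrative only: O(len(seq_a) * len(seq_b)) time, so it does not
    scale to whole genomes.
    """
    matches = []
    # prev[j] = length of the common suffix of seq_a[:i-1] and seq_b[:j]
    prev = [0] * (len(seq_b) + 1)
    for i in range(1, len(seq_a) + 1):
        curr = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            if seq_a[i - 1] == seq_b[j - 1]:
                curr[j] = prev[j - 1] + 1
                # The match cannot be extended to the right if either
                # sequence ends here or the next characters differ.
                at_end = i == len(seq_a) or j == len(seq_b)
                if (at_end or seq_a[i] != seq_b[j]) and curr[j] >= min_len:
                    length = curr[j]
                    matches.append((i - length, j - length, length))
        prev = curr
    return matches


# Hypothetical toy sequences sharing one identical stretch of 7 bases.
genome_a = "TTACGTACGGA"
genome_b = "GGACGTACGTT"
regions = shared_identical_regions(genome_a, genome_b, min_len=6)
```

Each returned tuple gives the start position in each sequence and the length of the shared identical stretch. Near-identical elements (NIMEs) would additionally tolerate a few mismatches, which this sketch does not attempt.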
|
0.906 |
2018 — 2019 |
Korkin, Dmitry |
R21 (Activity Code Description: To encourage the development of new research activities in categorical program areas. Support generally is restricted in level of support and in time.) |
Functional Characterization of Genetic and Post-Transcriptional Variation Using Machine Learning Methods @ Worcester Polytechnic Institute
The goal of this research proposal is to develop new in-silico approaches for accurate functional annotation of genetic and post-transcriptional variants. The rapid growth of Next-Generation Sequencing (NGS) and high-throughput -omics data has brought us one step closer to a mechanistic understanding, at the molecular level, of complex genetic diseases such as cancer, neurological disorders, diabetes, and others. In particular, these data have revealed that complex diseases commonly manifest changes at the genetic and post-transcriptional levels. Both of these types of changes often affect the structure and function of the corresponding genes and their products. Understanding the functional implications of genetic and post-transcriptional variation is an important task, as it can provide critical insights into the molecular mechanisms underlying the disease. Here, we propose to leverage novel machine learning paradigms to design computational methods for predicting the effect of genetic and alternative splicing variants on macromolecular interactions. Macromolecular interactions underlie many cellular functions in a healthy organism. Disease-induced changes in genes, such as single nucleotide variations (SNVs) and alternative splicing variations (ASVs), have recently been reported to cause rewiring of the protein-protein interaction (PPI) network. Unfortunately, the experimental high-throughput techniques that characterize the large-scale effects of SNVs or ASVs on PPIs are expensive, time-consuming, and far from comprehensive. The current in-silico methods either suffer from limited applicability or are less accurate than the experimental methods. To overcome these challenges, we will use two recent machine learning paradigms: learning under privileged information (LUPI) and semi-supervised learning.
If successful, we expect the proposed methods to provide a critical advancement on the two main challenges of current computational approaches: limited coverage and accuracy lower than that of experimental methods. The methods will be freely available to the community as stand-alone tools as well as web servers.
|
0.906 |