2010 — 2013 |
Cheng, Jianlin |
R01 (Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.) |
Integrated Prediction of Protein Structure At 1d, 2d and 3d Levels @ University of Missouri-Columbia
DESCRIPTION (provided by applicant): Computational prediction of protein structure from the amino acid sequence is one of the most important and challenging problems in bioinformatics and computational biology. With the exponential growth of protein sequences without solved protein structures in the post-genomic era, accurate protein structure prediction methods and tools are urgently needed. Here, we propose to develop an integrated approach to advance protein structure prediction at the 1-dimensional (1D), 2-dimensional (2D) and 3-dimensional (3D) levels. At the 1D level, novel information such as domain evolution signals, alternative gene splicing sites, and 2D protein contact maps will be used to predict protein domain boundaries from the sequences. At the 2D level, new methods such as residue contact propagation, machine learning boosting, linear programming, and Markov Chain Monte Carlo simulations will be used to advance residue-residue contact prediction for a domain or a protein. At the 3D level, 2D contact prediction, fold recognition via machine learning, and multi-template combination will be used to enhance both template-based and ab initio structure prediction. Finally, knowledge-based statistical machine learning methods and model combination algorithms will be developed to reliably evaluate and refine the quality of predicted protein structural models. One of several innovative aspects of this approach is to integrate 1D, 2D, and 3D predictions so that they improve each other through the protein structural unit, the domain. The 1D, 2D, and 3D protein structure prediction methods will be implemented as user-friendly software packages and web services released to the scientific community. These tools and web services will be useful for protein structure prediction, structure determination, functional analysis, protein engineering, protein mutagenesis analysis, and protein design.
|
1 |
2010 — 2015 |
Stacey, Gary; Xu, Dong (co-PI); Cheng, Jianlin |
N/A |
Soybean Root Hairs: a Model For Single-Cell Plant Biology @ University of Missouri-Columbia
PI: Gary Stacey (University of Missouri) Co-PIs: Jianlin Cheng (University of Missouri) and Dong Xu (University of Missouri) Collaborators: David Koppenaal (Pacific Northwest National Laboratory) and Ljiljana Paša-Tolić (Pacific Northwest National Laboratory)
Systems biology is the comprehensive, quantitative analysis of the manner in which all components of a biological system interact. The ultimate goal is a new, predictive view of cellular function, supplanting the older descriptive understanding. However, even with the availability of advanced technologies, systems biology has yet to achieve its promise due to a variety of issues, not the least of which is the overall complexity of biological systems. The project reduces this complexity by studying the response of a single plant cell type, the root hair, to infection by the beneficial nitrogen-fixing bacterium, Bradyrhizobium japonicum. The current project builds on past work that has developed the soybean root hair cell as an ideal model for genomic study. The goal is to focus specifically on the regulatory networks that control the infection process leading to the successful establishment of a nitrogen-fixing symbiosis. Beyond its relevance to systems biology, the work promises to yield new insights into plant cellular function, root development and the nitrogen fixation process, which is of great agronomic and ecological importance.
The root hair is the first site of plant response to bacteria during the nitrogen fixation process. Therefore, identifying the plant response pathways at a systems level could provide new avenues for manipulating nitrogen fixation in crop plants. The project will yield sequence data that will be made publicly available through GenBank and through our project website (www.soyroothair.org), as well as the Soybean Knowledgebase (www.SoyKB.org). The latter is being designed as a comprehensive resource to house, visualize and analyze all types of soybean functional genomic data. The project has a substantial outreach and educational component that partners with the University to address K-12, undergraduate, graduate and postdoctoral training.
|
1 |
2012 — 2017 |
Cheng, Jianlin |
N/A |
Career: Analysis, Construction and Visualization of 3d Genome Structures @ University of Missouri-Columbia
This project will investigate the three-dimensional (3D) structures of genomes. A 3D genome structure is critical for studying genome folding, genome function, and spatial gene regulation, but it has not been well studied in comparison with a one-dimensional (1D) linear genome. The main goal of this CAREER project is to design and develop contact-based computational methods to analyze, construct, and visualize 3D structures of genomes using chromosomal contact (interaction) data generated by genome conformation capturing techniques and next-generation DNA sequencing. The research data, methods and tools will provide materials to develop two new bioinformatics courses, integrating computational optimization, molecular structure modeling, genome sequencing, and genome annotation.
The successful completion of this project will produce a set of novel computational methods and bioinformatics tools to analyze, construct, and visualize 3D genome structures. The methods and tools will boost the study of genome structure, function, and gene regulation in the spatial context, which will have broad applications in almost every aspect of modern biological sciences in the post-genomic era. The research component will generate a rich set of new data and methods to train K-12, undergraduate, graduate and postdoctoral students. The wide dissemination of the two new bioinformatics courses will enrich the curriculum of bioinformatics, computational biology, computer science, large-scale constrained optimization, and molecular structure modeling. More information about the project is available at: http://www.cs.missouri.edu/~chengji/nsf_career.html.
|
1 |
2015 — 2018 |
Cheng, Jianlin; Tanner, John J (co-PI) |
R01 |
Integrated Prediction and Validation of Protein Structures @ University of Missouri-Columbia
DESCRIPTION (provided by applicant): Knowledge of three-dimensional protein structure is indispensable in biomedical research. Protein structure and function are intimately linked, and thus structure facilitates drug discovery, aids investigations of protein-protein interactions, informs mutagenesis analysis, guides protein engineering and the design of new proteins, and provides a foundation for understanding the molecular basis of disease. However, the number of protein sequences available in the genomic era far exceeds the capacity of the main experimental structure determination techniques of X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, resulting in a substantial sequence-structure gap. We address this ever-widening gap by developing and disseminating novel protein structure modeling tools. This renewal project is a new collaboration between experts in computational modeling (Cheng) and experimental structural biology (Tanner). We plan to develop innovative, integrated machine learning (e.g., deep learning), data mining and statistical modeling methods to address major challenges in both template-based structure modeling and template-free (ab initio) structure modeling. We will apply these tools to enzymes in the aldehyde dehydrogenase (ALDH) superfamily, a group of enzymes that are involved in numerous important biological processes and implicated in many diseases due to mutations. The ALDH models will be experimentally validated using X-ray crystallography and biochemical assays. Furthermore, we will combine the modeling power of our structural Input-Output hidden Markov model with experimental small-angle X-ray scattering (SAXS) to predict the tertiary structures of large multi-domain proteins. The integration of computational and experimental sciences in this project positions us uniquely in structure modeling space.
|
1 |
2016 — 2021 |
Birchler, James; Cheng, Jianlin |
N/A |
Research-Pgr: Genomic Balance Analysis in Maize @ University of Missouri-Columbia
Genetic information in plants and animals is carried in the DNA of pairs of chromosomes that are carried forward from generation to generation. Change in the numbers, or doses, of individual chromosomes is known to occur, usually with detrimental effects on the stature and health of the affected organism. Surprisingly, there are no obvious differences if the entire set of chromosomes change. This phenomenon has puzzled scientists for decades: what features of the genome maintain balance when fully altered, but are imbalanced when only some chromosomes are affected? This project will examine how and why this partial change in the chromosomes has such an impact on plants. Maize is an ideal model for this study due to its global crop importance and also because of the powerful genetic resources available to test how the so-called genome balancing act occurs. One hypothesis is that the relative expression of regulatory genes, which are the factors that control the expression of other genes, has a significant impact on development and vigor of plants. The research identifies the underlying molecular mechanisms involved in maintaining genomic balance. This knowledge is essential to understand how plant vigor is controlled and will provide information about the traits needed for improving agriculture. Students and educators of all levels are engaged in addressing this problem through direct hands-on research, via social media outlets and outreach workshops. Understanding how regulatory dosage effects operate will guide breeding programs for crop improvement and will answer basic questions about how plant genomes function.
This project is based on a synthesis emerging from the idea of genomic balance known from classical genetics and more recent molecular studies. The overall hypothesis is that the stoichiometry of assembly of multisubunit gene regulatory complexes affects the function of the whole, which will impact global gene expression and ultimately the phenotype. The analysis of genomic balance issues will address: (1) how genomic imbalance affects gene expression levels of mRNA, siRNA, and miRNA as the baseline for understanding the circuitry involved; (2) how small RNAs are involved with genomic balance in modulating mRNA levels and how genomic balance affects small RNA levels; (3) how genomic imbalance impacts and/or operates through chromatin modifications; and (4) how genomic imbalance works on the single gene level by examining the effect of varying the dosage of single subunits on whole complex formation and function. These aspects will be studied in a set of aneuploids generated by translocation with the supernumerary B chromosome of maize that can be used to vary selected chromosome arms in haploid and diploid plants. By examining more complex changes in dosage with greater or lesser genomic imbalance, how the interactions of regulatory processes alter various aspects of gene expression will be tested. Single regulatory gene candidates responsible for target gene modulations will be examined in a transgenic dosage series to gain insight into the mechanism of genomic balance. Together, the information from all of these fields will contribute to an understanding of genomic balance and allow this information to be applied to issues of world food production.
|
1 |
2020 — 2021 |
Cheng, Jianlin |
R01 |
Distance-Based Ab Initio Protein Structure Prediction @ University of Missouri-Columbia
Project Summary: Predicting the three-dimensional structures of proteins without using known structures from the Protein Data Bank (PDB) as templates (ab initio) remains a grand challenge of computational biology. Whereas template-based modeling is now a mature field, ab initio modeling is a comparatively nascent one, especially for large proteins with complex topologies and multiple domains. The need for advances in ab initio modeling is evident. Many protein sequences do not have (recognizable) templates in the PDB, and the pace of experimental structure determination is incommensurate with the scale of the problem. Herein, we propose a new approach to ab initio modeling that consists of novel deep learning architectures to predict inter-residue distances and domain boundaries as well as robust, iterative optimization methods to construct tertiary structures from the predicted distances. This project builds on the success of our current R01, particularly the outstanding performance of the Cheng group in the 2018 worldwide protein structure prediction experiment (CASP13), where our MULTICOM suite ranked among the top three tertiary structure predictors, alongside Google DeepMind's AlphaFold. The methods will be implemented as open-source tools for the emerging field of distance-based ab initio protein structure modeling. We will apply the methods to study protein homo-oligomers and self-assemblies, based on our novel discovery that the quaternary structure contacts within homo-oligomers can be predicted by deep learning methods from the co-evolutionary signals embedded in multiple sequence alignments of protein monomers. Furthermore, we will apply the methods to predict the folds, functional sites, superfamilies, and protein-protein interactions of proteins that contain "essential Domains of Unknown Function" (eDUFs), a group of evolutionarily conserved, essential proteins that represents an important uncharted region of protein function/fold space.
The predictions for a diverse and representative subset of eDUFs will be experimentally validated through a unique collaboration with the structural biology group of Dr. Tanner.
|
1 |
2022 — 2025 |
Birchler, James; Cheng, Jianlin |
N/A |
The B Chromosome of Maize: Drive and Genomic Conflict @ University of Missouri-Columbia
Varieties of corn have an extra chromosome that is not essential called the B chromosome. This chromosome has properties that, based on analysis of its DNA sequence, have allowed it to persist from one generation to the next for millions of years, despite the fact that it is not needed for the plant. While these properties are beneficial to the success of the B chromosome, they often have negative consequences for the normal corn chromosomes such as chromosome fracture and the activation of transposable elements that have the potential to jump around the genome and cause mutations. A genetic and molecular analysis will be conducted to learn the nature of these B chromosome properties and how the corn genome has been impacted by the presence of the B chromosome. The B chromosome appears to have been involved in restructuring the corn chromosomes, so an understanding of these properties will help provide insight into the nature of, and variation in the maize genome. The research will involve participants at all educational levels and will involve training in genetics, genomics, and computational biology.
The supernumerary B chromosome of maize is a novel, selfish genetic system involving a whole chromosome that impacts genomic processes in general. The B chromosome has manipulated cellular processes to ensure proper segregation in male meiosis by increasing recombination in heterochromatin, by using a novel meiotic process to stabilize itself in meiosis, by delaying replication of the centromere at the one mitosis that makes the two sperm, and by mediating fertilization of the egg by B containing sperm. Experiments will examine the mechanism of centromere nondisjunction, the identity and function of the trans-acting factors needed for nondisjunction, the nature of the univalent stabilization process, the nature of the modulation of recombination across the genome, and aspects of genomic conflict such as de-silencing of transposable elements, identifying the gene responsible for genomic shattering by B chromosomes in some backgrounds, and the gene responsible for variation in preferential fertilization. The completion of this project will provide data that support a new paradigm in the understanding of a suite of co-opted functions by a mega selfish genetic entity. The interdisciplinary team will seek to understand this selfish entity to gain new insight into concepts previously unexplored in any system about genetic drive and genomic conflict. A training program across biological disciplines with computational analyses will be conducted at all educational levels.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |