1997 |
Ghosh, Debashis |
P41 |
Crystals of Enzymes Involved in Estrogen & Progesterone Biosynthesis @ Cornell University Ithaca
structural biology; enzymes; hormones; biomedical resource; biological products;
|
0.94 |
1998 — 1999 |
Ghosh, Debashis |
P41 |
Human Type 1 17β-Hydroxysteroid Dehydrogenase-Equilin Complex: Breast Cancer @ Cornell University Ithaca
During two beam-time accesses at CHESS, each lasting 48 hours, we conducted feasibility studies for data collection on crystals of several complexes of type 1 human 17β-hydroxysteroid dehydrogenase (17β-HSD1) and on small crystals of other steroidogenic enzymes, such as human cytochrome P450 aromatase. We were able to record diffraction under cryogenic conditions and establish the feasibility of data collection on several of these crystals. Two partial data sets were collected on the crystals of the equilin and coumestrol complexes of 17β-HSD1. Despite being small (< 0.2 mm) and thin (< 0.05 mm), both of these crystals diffracted to about 2.5 Å resolution. The data sets were about 55-75% complete. More beam time is required to complete the data collection. Our regular application for more beam time has recently received a favorable rating, permitting us to continue with the process. Preliminary structure analyses with these data sets have enabled us to locate the equilin and coumestrol molecules in the active site of 17β-HSD1. These results are being presented at the upcoming Endocrine Society meeting in New Orleans.
|
0.94 |
1999 — 2002 |
Ghosh, Debashis |
P41 |
Inhibited Complexes of Human Type I 17: Drug Design Target in Breast Cancer @ Cornell University Ithaca
Enzymes that utilize rare transition metals are of interest because they help us understand unique enzymic reactions. Formate dehydrogenase H (FDH) from E. coli is such an enzyme. FDH contains multiple redox centers, which include a molybdopterin cofactor, an iron-sulfur cluster and a natural selenocysteine residue at its active site. This protein shares no sequence homology with other dehydrogenases, which all have the well-known protein binding fold. We hope to understand the mechanism of this class of enzymes that utilizes molybdopterin and selenocysteine to mediate their redox reaction. Another unique class of enzymes whose catalytic mechanism is not understood is that of the blood-converting enzymes, α-N-acetylgalactosaminidase (α-NAGAL) and α-galactosidase (α-GAL). The ABO antigenic specificity in human blood is determined by the nature and linkage of monosaccharides at the end of the carbohydrate chains of either glycoproteins or glycolipids embedded in the cell membrane. One approach to providing large quantities of group O blood is to convert group A and B cells to group O by using the appropriate exoglycosidases. Understanding the mechanism of conversion of the carbohydrate chains through three-dimensional structure determination should provide the information necessary to produce an enzyme with both A- and B-converting function.
|
0.94 |
2004 — 2005 |
Ghosh, Debashis |
P41 |
Rational Design of Selective Estrogen Enzyme Modulators @ Cornell University Ithaca |
0.964 |
2004 |
Ghosh, Debashis |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Statistical Methods For the Analysis of Microarray Data @ University of Michigan At Ann Arbor
DESCRIPTION (provided by applicant): With the advent of high-throughput molecular assay technologies, biologists are having to deal with the analysis of high-dimensional genomic datasets. While statistical methods have been proposed for issues such as differential expression with these data, relatively little work has been done in terms of incorporating biological knowledge in the statistical analysis of high-throughput biological data in human disease settings. In this grant, we propose the development of statistical procedures for modeling of complex high-dimensional biological data with an emphasis towards incorporating functional biological knowledge. The methods we propose will be implemented and distributed in software available to biologists. While the major biological data example in this grant is from a microarray experiment in cancer, the methods proposed here are general and can be developed for studying high-dimensional genotype-phenotype associations in other contexts. Given this, we propose the following aims: 1. Development of hierarchical models for modeling of high-dimensional data in complex cell systems. 2. Development of statistical methodology for the identification of disease progressor genes. 3. Development of statistical methodology for assessing the role of functional pathways based on integration of gene expression and pathway data. 4. Development of statistical methodology for determining regions of overexpression and underexpression based on integration of gene expression and chromosomal location data. 5. Dissemination of these results in user-friendly statistical software.
|
0.945 |
2005 — 2008 |
Ghosh, Debashis |
R01 |
Statistical Methods For the Analysis of Functional Genomic Data @ Pennsylvania State University-Univ Park
With the advent of high-throughput molecular assay technologies, biologists are having to deal with the analysis of high-dimensional genomic datasets. While statistical methods have been proposed for issues such as differential expression with these data, relatively little work has been done in terms of incorporating biological knowledge in the statistical analysis of high-throughput biological data in human disease settings. In this grant, we propose the development of statistical procedures for modeling of complex high-dimensional biological data with an emphasis towards incorporating functional biological knowledge. The methods we propose will be implemented and distributed in software available to biologists. While the major biological data example in this grant is from a microarray experiment in cancer, the methods proposed here are general and can be developed for studying high-dimensional genotype-phenotype associations in other contexts. Given this, we propose the following aims: 1. Development of hierarchical models for modeling of high-dimensional data in complex cell systems. 2. Development of statistical methodology for the identification of disease progressor genes. 3. Development of statistical methodology for assessing the role of functional pathways based on integration of gene expression and pathway data. 4. Development of statistical methodology for determining regions of overexpression and underexpression based on integration of gene expression and chromosomal location data. 5. Dissemination of these results in user-friendly statistical software.
|
0.945 |
2007 — 2011 |
Ghosh, Debashis |
P41 |
Integral Membrane Enzymes in Estrogen Biosynthesis
This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. Human steroid (Estrone/DHEA) sulfatase (STS) and cytochrome P450 aromatase (P450arom), two integral membrane proteins of the endoplasmic reticulum, catalyze biosynthesis of the active estrogens. Selective inhibition of one or more of these enzymes by high affinity, highly specific small molecule inhibitors could provide an effective means of prevention and treatment of hormone-dependent breast carcinoma. We have determined the crystal structure of full-length STS purified from human placenta at 2.1 angstrom resolution. In order to investigate molecular mechanisms of activation and inhibition of the enzyme, we continue to analyze the structures of its complexes with ligands and inhibitors. The plan is to gather X-ray diffraction data on newly grown crystals of inhibited complexes of STS. Recently, we have for the first time grown large single crystals of androstenedione-complex of the full-length P450arom, also purified from human placenta. With CHESS beam time, we plan to conduct cryo-protection optimization experiments and collect diffraction data for solving the crystal structure.
|
0.957 |
2007 — 2008 |
Ghosh, Debashis |
P41 |
Crystals of 143 kDa Bovine Interphotoreceptor Retinoid Binding Pro @ Cornell University Ithaca |
0.94 |
2008 — 2011 |
Ghosh, Debashis; Gonzalez-Fernandez, Federico |
R01 |
Interphotoreceptor Retinoid Binding Protein: Structure and Function @ State University of New York At Buffalo
DESCRIPTION (provided by applicant): Interphotoreceptor retinoid-binding protein (IRBP), the major soluble protein component of the interphotoreceptor matrix (IPM), has access to Müller cells, photoreceptors, and RPE. The mechanism by which IRBP protects retinoids from isomeric and oxidative degradation while targeting their delivery/release between the above cells during the visual cycle is poorly understood. Our long-term goal is to understand at the molecular level how IRBP accomplishes its remarkable functions. The mechanism underlying IRBP's function or role in disease remains unknown because little is understood about its structure-function relationships. Its structure is unusual, being composed of tandem homologous "modules", each ~300 residues in length. Although the individual modules have some functional activity, they are not equivalent, and important interactions take place between them. A critical gap is that little is known about the structure of the full-length protein and quaternary association of the "modules". However, obtaining IRBP at the concentrations needed for X-ray crystallography has been problematic, as the protein denatures and precipitates when concentrated above 3 mg/ml. In the current funding period, we purified to homogeneity full-length bovine, Xenopus, human and zebrafish IRBPs in stable and functionally active pristine forms, devoid of fusion tags. These protein solutions can now be readily concentrated without denaturation or precipitation. We have optimized conditions for growing diffraction-quality crystals of these full-length IRBPs. Preliminary structure elucidation analysis for Xenopus IRBP suggests that the single module structure may be substantially modified in the full-length functional protein.
Analyses of the expression, purification, stability, crystallization, ligand-binding, anti-oxidant activity, and homology-modeling data on these IRBPs have led to our hypothesis that quaternary association of the "modules" contributes to the structural scaffold(s) that bind and protect retinoids from degradation, and that the "modules" contribute unequally in these roles as well as in targeting retinoid delivery and release at the cell surface. This hypothesis will be evaluated through the following complementary aims. Aim 1: To determine the crystal structures of IRBPs composed of four modules (human, bovine, Xenopus), and two modules (zebrafish). Aim 2: To define binding-interactions of physiologically relevant ligands with IRBP and elucidate the molecular basis of IRBP's protective/anti-oxidant roles. Aim 3: To determine how IRBP efficiently targets retinoid removal/delivery. Aim 4: To determine the structural and functional "hot-spots" in IRBP. PUBLIC HEALTH RELEVANCE: The mechanism by which IRBP protects retinoids from isomeric and oxidative degradation while targeting their delivery/release between photoreceptors, retinal pigmented epithelium and Müller cells during the visual cycle is poorly understood. A critical gap is that little is known about the structure of the full-length protein and quaternary association of the "modules" that comprise the structure of IRBP. Analyses of the expression, purification, stability, crystallization, ligand-binding, anti-oxidant activity, and homology-modeling data on these IRBPs have led to our hypothesis that quaternary association of the "modules" contributes to the structural scaffold(s) that bind and protect retinoids from degradation, and that the "modules" contribute unequally in these roles as well as in targeting retinoid delivery and release at the cell surface.
|
0.943 |
2009 — 2012 |
Ghosh, Debashis |
R01 |
Structure and Function of Integral Membrane Enzyme Human Aromatase @ Upstate Medical University
DESCRIPTION (provided by applicant): Human cytochrome P450 aromatase (P450arom), an integral membrane hemeprotein of the endoplasmic reticulum, catalyzes the synthesis of estrogens from androgens in the presence of P450-reductase. Despite intense biochemical and biophysical investigations for the past 35 years, the structure-function relationships of P450arom and its catalytic mechanism remain poorly understood. Inhibition of estrogen biosynthesis by P450arom inhibitors is an effective therapy for hormone-dependent breast cancers. Attempts to obtain diffraction-quality crystals of human P450arom have so far been unsuccessful. We have grown single crystals of the androstenedione complex of the full-length, highly active P450arom purified from human term placenta, gathered complete diffraction data to 2.90 Å resolution and obtained a solution for the structure. Here, we propose to launch an investigation to determine the structure-function relationships of human P450arom. Our hypothesis is that analysis of the atomic structures of human P450arom-ligand complexes in terms of their functional properties will lead to the elucidation of the origin of substrate and inhibitor specificities, roles of the catalytically important residues, nature of the reaction intermediates, as well as the mechanism of action, and that ligand design and optimization guided by this structural basis will lead to novel high-affinity inhibitors that are exclusive for the target. The specific aims to test the hypothesis are to determine the crystal structures of the complexes of P450arom with its (1) natural substrates androstenedione, testosterone, and 16α-hydroxytestosterone, as well as with (2) the inhibitors exemestane, letrozole, formestane, anastrozole, fadrozole and aminoglutethimide.
These findings will reveal the molecular mechanism for substrate and inhibitor specificities and help build a structure-activity database based on the atomic-level descriptions of the enzyme-ligand interactions. Next, in specific aim 3 we will investigate the enzyme mechanism by combining the data from aims 1 and 2 with structural data on reaction intermediates. By initiating the aromatization reaction in an enzyme-substrate complex crystal with X-ray photoelectrons during diffraction and in situ monitoring of the Soret band transition with a spectrophotometer, we plan to capture structural snapshots of the reaction intermediates. Completion of these goals will lead to the implementation of specific aim 4 - discovery of new inhibitors through docking, design, synthesis, testing, and optimization, in collaboration with chemists. As a future direction of the project, we plan to crystallize a complex of P450arom with P450-reductase for investigating the mechanism of redox reactions by electron transfer. PUBLIC HEALTH RELEVANCE: Aromatase is a unique enzyme that makes all estrogens in the human body. We propose a research plan to unravel the molecular details of how aromatase works and how aromatase inhibitors prevent it from making estrogens. Results from this investigation will form the basis for future discovery of novel breast cancer drugs that are highly specific for aromatase but cause minimal side effects.
|
0.939 |
2012 — 2021 |
Ghosh, Debashis; Taylor, Jeremy M.G. |
R01 |
Statistical Methods For Cancer Biomarkers
DESCRIPTION (provided by applicant): Biomarkers in cancer research are considered a central component of the expected improvements in prevention, detection, treatment and monitoring. They are potentially useful in many different types of studies and for many different purposes. Critical questions are whether they are valid to use, how they can be utilized in a valid and efficient way, and, if they are used, how confident one can be in the conclusions that are obtained. The use of biomarkers to advance understanding in cancer science has great potential, but also has some risks. Biomarkers are subject to uncertainty in their measurement, they may not be measuring exactly the quantity of interest, and since they are not explicitly measures of symptoms, their use to aid in decision making or evaluation of therapies in a clinical setting is subject to uncertainty. Thus careful analysis of data from studies that involve biomarkers is crucial. There are many statistical challenges that arise in such studies. This application is concerned with developing, evaluating and applying statistical methods for data that involve biomarkers. The first aim is concerned with adding biomarkers to prediction models that may be used to stratify or classify patients. In this aim we develop approaches for integrating data from other sources to improve the prediction models. This research will have broad applicability. Innovative aspects involve the use of targeted ridge regression, multi-kernel machine modeling, and importance sampling to incorporate information from the literature. The second aim is concerned with clinical trials where the biomarker is to be used to evaluate a therapy as a surrogate endpoint. Because of the nature of the scientific question, causal modeling is very natural in this context. We propose to develop both potential outcomes and structural causal models. We will investigate both single-trial and multi-trial settings with different endpoint types.
The third aim is concerned with therapies that may be effective only for a subgroup of patients; to be useful, this subgroup must be determined by a small number of predictive biomarkers. For data from randomized clinical trials we suggest a unified modeling approach, and will investigate the use of single index models with variable selection and multivariate partial least squares to aid in the subgroup identification. Inference following subgroup identification is challenging; we suggest an innovative scheme to simulate data under an appropriate null distribution. All three aims in this proposal address fundamental and significant problems in translational oncology research. Successful completion of the aims will have an impact both in understanding and utilizing biomarkers and also in developing statistical methodology that can be more broadly applicable to other fields.
|
0.961 |
2013 — 2017 |
Ghosh, Debashis |
N/A |
Multivariate Statistical Methods For Genomic Data Integration @ Pennsylvania State Univ University Park
This project addresses a key modeling issue faced by many data analysts working with genomic data. For a set of individuals or observations, many different types of high-throughput experimental datasets are generated, and the question then becomes how to model these data. In many problems, the goal is to prioritize which parts of the genome one wishes to study. While it is commonly assumed that the different data types are linearly correlated in either an unconditional or conditional sense, in many settings the nature of the correlation is unknown. This research focuses on multivariate methods of analysis with high-dimensional genomic data that relax the linearity assumption. Two classes of methods will be studied during the course of the project. The first is Hidden Markov Models and the second is multiple testing procedures, whose use has become commonplace with genomic datasets. This project proposes novel multivariate extensions of both types of method, with the goal that they rest on sound theoretical statistical principles while remaining computationally feasible on big datasets. The methodology will be evaluated using several real datasets as well as through simulation studies.
This work will involve an interplay between statisticians and biologists. The broader use of this work will be to prioritize molecules for follow-up studies in any biological setting. It will be useful for biologists and scientists studying disease processes who wish to find new therapeutic targets or further advance basic etiological understanding. The educational goals of the project include new course components for graduate students at Penn State and training of graduate students in Statistics.
|
0.943 |
2013 — 2014 |
Ghosh, Debashis; Hardison, Ross C; Shashikant, Cooduvalli S. (co-PI) |
T32 Activity Code Description: To enable institutions to make National Research Service Awards to individuals selected by them for predoctoral and postdoctoral research training in specified shortage areas. |
Computation, Bioinformatics, and Statistics (CBIOS) Training Program @ Pennsylvania State University-Univ Park
DESCRIPTION (provided by applicant): Genomic data are transforming how scientists in medicine and basic science conduct research. The advancement of genome science requires a new generation of scientists with strong computational and statistical skills and the ability to effectively interact with experimentalists. The proposed Penn State Computation, Bioinformatics, and Statistics (CBIOS) Training Program will prepare a cadre of investigators to think innovatively and keep pace with the quickly evolving landscape of high-throughput genomic technologies. The program faculty are interdisciplinary and highly collaborative, with expertise in computation, bioinformatics, statistics, and functional, medical, and evolutionary genomics. Learning these discipline-crossing skills will make trainees competitive for future careers in emerging and rapidly advancing fields of comparative, systems, statistical and medical genomics. The educational objectives of the CBIOS program are to engender in the trainees the following: 1. A thorough understanding of hypothesis testing in the scientific process. 2. The ability to work from theory to data and back. 3. Fluency in the use of computational and statistical tools for high-throughput data. 4. The ability to integrate and innovate computational and statistical analysis of high-throughput data. 5. Excellence in cross-disciplinary scientific communication, including ethical implications of computational and bioinformatics research. 6. The ability to lead cross-disciplinary research teams. The CBIOS training program will accomplish these objectives through a set of existing core and elective courses along with a new practicum course, all of which are integrated with a journal club and seminar series. The program will enhance professional development through invited seminar speakers and retreats, and will specifically develop trainees' communication skills to enable dissemination of genomics research to a broad audience.
Predoctoral trainees will be selected early in their graduate program for two years of intensive training. A total of 15 trainees (10 NIH and 5 PSU supported) will be trained during a five-year granting period. The faculty supporting this training program have a combined annual research funding base of $65 million direct costs, and thus offer a robust mentoring foundation for student research experience and opportunities.
|
0.931 |
2016 |
Epstein, Michael Philip; Ghosh, Debashis |
R01 |
Statistical Tests For Mapping Genetic Determinants of Complex Traits
DESCRIPTION (provided by applicant): Genotyping and emerging sequencing technologies have enabled comprehensive interrogation of genetic variation across the human genome, thereby facilitating a study's ability to map genetic variants that influence phenotypes of interest. Nevertheless, genome-wide association studies (GWAS) and next-generation sequencing (NGS) projects have uncovered only a limited number of trait-influencing loci. While large increases in sample size will improve power to detect such variation, the ascertainment and sequencing/genotyping of such samples are costly and inefficient. Therefore, it is desirable to increase power to detect such variants without requiring additional sample collection. We propose novel methods for improved gene mapping of common and rare susceptibility variants that move beyond standard strategies typically applied to GWAS and NGS studies of complex traits. The first topic we consider is pleiotropic or cross-phenotype effects of genetic variants. Empirical studies have suggested that pleiotropy is widespread throughout the genome and that leveraging this additional information for gene mapping yields a more powerful analysis than an analysis that ignores such information. In Aim 1, we propose novel statistical methods for genetic analysis of high-dimensional phenotype data using an innovative kernel distance-covariance (KDC) framework that allows for an arbitrary number of phenotypes, continuous and/or categorical in nature, as well as an arbitrary number of genotypes (permitting gene-based testing of both rare and common variants). We will use the KDC framework to implement tests of pleiotropy as well as tests of mediation. The second topic we consider is the mapping of rare susceptibility variants using affected pedigrees, which provide many attractive features for rare-variant testing that case-control studies lack.
In Aim 2, we propose a series of powerful statistical methods for rare-variant association testing in affected pedigrees that are based on a framework (recently published in AJHG) for rare-variant association testing in affected sibships. The existing framework compares rare-variant burden in a region by an affected sib pair to the number of regions that the pair shares identical by descent. We have shown that the method is more powerful than case-control association testing given a fixed sample size and, further, is robust to population stratification. In this proposal, we will extend the framework to handle affected pedigrees of arbitrary size and structure (rather than just affected sib pairs) and devise a powerful two-stage screening and validation strategy for rare-variant mapping that first compares familial cases in the pedigrees to external controls and then follows up the most interesting findings using an independent test based on our identity-by-descent sharing statistic among the affected relatives used in the first stage. We will apply the methods in Aims 1-2 to relevant data from genetic studies of complex traits in which we are directly involved. We also will implement the methods in public user-friendly software (Aim 3).
|
0.966 |
2017 — 2019 |
Epstein, Michael Philip; Ghosh, Debashis |
R01 |
Statistical Tests For Mapping Genetic Determinants of Complex Traits
DESCRIPTION (provided by applicant): Genotyping and emerging sequencing technologies have enabled comprehensive interrogation of genetic variation across the human genome, thereby facilitating a study's ability to map genetic variants that influence phenotypes of interest. Nevertheless, genome-wide association studies (GWAS) and next-generation sequencing (NGS) projects have uncovered only a limited number of trait-influencing loci. While large increases in sample size will improve power to detect such variation, the ascertainment and sequencing/genotyping of such samples are costly and inefficient. Therefore, it is desirable to increase power to detect such variants without requiring additional sample collection. We propose novel methods for improved gene mapping of common and rare susceptibility variants that move beyond standard strategies typically applied to GWAS and NGS studies of complex traits. The first topic we consider is pleiotropic or cross-phenotype effects of genetic variants. Empirical studies have suggested that pleiotropy is widespread throughout the genome and that leveraging this additional information for gene mapping yields a more powerful analysis than an analysis that ignores such information. In Aim 1, we propose novel statistical methods for genetic analysis of high-dimensional phenotype data using an innovative kernel distance-covariance (KDC) framework that allows for an arbitrary number of phenotypes, continuous and/or categorical in nature, as well as an arbitrary number of genotypes (permitting gene-based testing of both rare and common variants). We will use the KDC framework to implement tests of pleiotropy as well as tests of mediation. The second topic we consider is the mapping of rare susceptibility variants using affected pedigrees, which provide many attractive features for rare-variant testing that case-control studies lack.
In Aim 2, we propose a series of powerful statistical methods for rare-variant association testing in affected pedigrees that are based on a framework (recently published in AJHG) for rare-variant association testing in affected sibships. The existing framework compares rare-variant burden in a region by an affected sib pair to the number of regions that the pair shares identical by descent. We have shown that the method is more powerful than case-control association testing given a fixed sample size and, further, is robust to population stratification. In this proposal, we will extend the framework to handle affected pedigrees of arbitrary size and structure (rather than just affected sib pairs) and devise a powerful two-stage screening and validation strategy for rare-variant mapping that first compares familial cases in the pedigrees to external controls and then follows up the most interesting findings using an independent test based on our identity-by-descent sharing statistic among the affected relatives used in the first stage. We will apply the methods in Aims 1-2 to relevant data from genetic studies of complex traits in which we are directly involved. We also will implement the methods in public user-friendly software (Aim 3).
|
0.966 |
2019 — 2021 |
Ghosh, Debashis; Kechris-Mays, Katherina |
U01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Addressing Sparsity in Metabolomics Data Analysis @ University of Colorado Denver
Project Summary: Comprehensive profiling of the small molecule repertoire in a sample is referred to as metabolomics, and is being used to address a variety of scientific questions in biomedical studies. Metabolomics offers more immediate measures of the physiology of an individual, and more direct examination of the effects of exposures such as nutrition, smoking and bacterial infections. For human health, metabolomics studies are being used to investigate disease mechanisms, discover biomarkers, diagnose disease, and monitor treatment responses. Metabolomics is increasingly recognized as an important component of precision medicine initiatives to complement and enhance collected genomic data. This is critical as the metabolome cannot be predicted from knowledge of the genome, transcriptome or proteome, but provides important information on the phenotype. Recent technological advances in mass spectrometry-based metabolomics have allowed for more comprehensive and sensitive measurements of metabolites. We focus on untargeted ultra-high pressure liquid chromatography coupled to mass spectrometry, which is one of the more commonly used methods. Despite the technological advances, the bottleneck for taking full advantage of metabolomics data is often the paucity and incompleteness of analytical tools and databases. Our goal is to develop novel statistical methods and software for the research community to improve the utilization of metabolomics data. There are many steps in a metabolomics data analysis pipeline, and we will focus on the downstream steps of normalization, and univariate, multivariate and pathway analyses. In particular, we will address the high levels of sparsity, which is one of the more distinctive aspects of metabolomics data compared to other -omics data sets.
For metabolomics data, there is sparsity in individual metabolites due to a large percentage of missing data for biological or technical reasons, and sparsity in connections between metabolites due to high collinearity and sparsely connected networks in metabolic pathways. The methods and software we develop will maximize the potential of metabolomics to provide new discoveries in disease etiology, diagnosis, and drug development.
|
0.943 |