1991 — 1993 |
Stoeckert, Christian J. |
R29 |
Transferred Globin Regulation in Human Erythroblasts @ University of Pennsylvania |
1 |
1993 |
Stoeckert, Christian J. |
R29 |
Transferred Globin Regulation in Human Erythroblasts @ Children's Hospital of Philadelphia
The severity of sickle cell disease and beta-thalassemia is ameliorated under conditions where individuals produce high levels of fetal hemoglobin. It follows that a rational therapy for these diseases would be to increase fetal globin levels in patients where this does not otherwise occur. To this end, the immediate goal of the research is a better understanding of the mechanisms regulating fetal and adult globin levels in adults. At least three regions of DNA sequences outside of the globin gene transcription unit have been identified which have important roles in regulating fetal gamma- and adult beta-globin gene expression. These are locus activation region (LAR) sequences flanking the beta-globin gene cluster (which contains the two gamma-globin genes), gamma- and beta-globin gene promoter elements, and enhancer elements 3' to the gamma- and beta-globin genes. We plan to study the effects of these sequences on the expression of transferred gamma- and beta-globin genes. In particular, we will determine whether individual gamma- and beta-globin genes can compete for activation by LARs and enhancers thereby influencing their relative expression. An integral set of experiments will be to determine causality of naturally-occurring mutations linked to elevated fetal globin production in otherwise normal adults (the nondeletion types of hereditary persistence of fetal hemoglobin). We will accomplish these aims by retroviral transfer of globin regulatory sequences and globin genes into early erythroid progenitors (burst forming units-erythroid or BFU-E) isolated from human peripheral blood. BFU-Es carrying the transferred gene will be cultured under serum-free conditions to produce mature erythroblasts expressing physiological levels of gamma-globin. Expression of the transferred globin genes will be measured in RNA isolated from the erythroblasts and compared to endogenous globin gene levels. 
Retroviral transfer of globin genes into human erythroblasts will provide information on human globin gene regulation unobtainable in other systems and will also directly help develop gene therapy.
|
0.909 |
1994 — 1995 |
Stoeckert, Christian J. |
R29 |
Transferred Globin Regulation in Erythroblasts @ Children's Hospital of Philadelphia |
0.909 |
2000 — 2003 |
Stoeckert, Christian J. |
R01 - Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
High Throughput Annotation of Genomic DNA Sequence @ University of Pennsylvania
DESCRIPTION (Applicant's abstract): Now that a working draft sequence of the human genome is in hand and an ongoing effort is in place to provide a draft of the mouse genome, the challenge is to identify the genes encoded by these genomes. Several efforts are underway in this regard including our own using ab initio gene finders and transcribed sequences in the form of mRNAs and ESTs. Gene prediction is the first step in identifying genes. Additional steps are to predict the function of those genes and associate any other information such as where (and when) the gene might be expressed. The goal of the proposed project is to provide a public database that will provide a central repository of gene predictions and associated annotation. The project will provide data integration such that predictions and annotations for the same gene (as defined by co-localizing to the same genomic location) will be linked. Associated annotation will be extended to include functional predictions and expression profiles. The intended users of the database are researchers seeking to extend their knowledge of a gene starting with an expression profile, a cDNA, or a genetic locus or to search generally for candidate genes. The prototype annotation framework for genomic sequence, GAIA, has been combined with prototypes for a gene index of ESTs and mRNAs, DoTS, and gene integration, EpoDB. The result is a database based on a global schema, GUS, that integrates sequence-centered entries from GenBank, dbEST, and SWISS-PROT and transforms the entries into gene-centered entities. This process includes data cleansing and adding value through annotation of the resultant genes (mRNAs and proteins). A first pass of this resource is on-line with ad hoc boolean queries and integrated visual tools at www.allgenes.org. The resource will provide an integrated set of known and predicted genes from GenBank, gene finders, and assembled ESTs and mRNA.
Ontologies will be used to structure the annotations of biological concepts and gene function. Gene expression information will be augmented with RAD (RNA Abundance Database). No other public resource of this nature currently exists. Data currency of this resource will be maintained through periodic updates every 2-3 months. The updates will include integration of previously annotated genes with newly available GenBank and dbEST entries and recalculation of gene similarities, gene location, tissue distribution, and gene function. An annotation interface has been developed to complement and extend computational analysis through manual assessment of predictions for genes and their functions. Radiation hybrid mapping data for mouse sequences will be incorporated as has been done for human ESTs. Links between the genes in GUS and gene expression data in RAD will be established. To respond to the public community, queries to the web interface will be incorporated and bulk files provided in response to users of the allgenes.org site. Planned is the inclusion of on-demand annotation of new contigs.
|
1 |
2004 — 2008 |
Stoeckert, Christian J. |
R01 |
Plasmodium Genome Database @ University of Pennsylvania
DESCRIPTION (provided by applicant): The Plasmodium Genome Database was established in 2000 to expedite access of the malaria research community to genomic-scale datasets, beginning with the reference genome sequence for P. falciparum. PlasmoDB incorporates finished and unfinished sequence from the P. falciparum genome project, genetic & optical mapping data, curated annotation of finished sequences provided by the sequencing centers, automated analyses of DNA and predicted genes/protein sequences using multiple algorithms, comparisons with GenBank/EMBL and other databases, unfinished genomic sequences and Plasmodium ESTs from various species, and expression data from SAGE and microarray projects. All information is incorporated into a comprehensive relational database, facilitating user queries. The on-line database receives up to 13,000 hits per day, from users in more than 100 countries, and many thousands of CD-based copies have been distributed worldwide. PlasmoDB was initially supported through a seed grant from the Burroughs Wellcome Fund; this application seeks funding to maintain and extend the database, for the benefit of the research community at large. Database enhancement is an ongoing research and development project, resulting in tools that are broadly applicable to genomic-scale datasets for any organism. Specific aims seek to: 1. Enhance access to PlasmoDB as a community resource, including development of an improved web interface for relational queries. 2. Provide a mechanism for community input and response, including implementation of comment forms (for the web site, gene annotation, and other data types), tracking responses to comments, and posting comments and responses at PlasmoDB. 3. Expand the depth and scope of data available, including expression profiling and proteomics data, etc. 4. Provide comparative analyses of Plasmodium species, facilitating identification of genes and regulatory elements, genetic mapping, evolutionary studies, etc. The overall goal of this project is to develop and maintain a database that expedites biological discovery, including the identification of drug targets and vaccine antigens useful in the fight against malaria.
|
1 |
2005 — 2009 |
Ives, Zachary (co-PI); Stoeckert, Christian; White, Peter; Tannen, Val (co-PI); Davidson, Susan |
N/A |
II: Data Cooperatives: Rapid and Incremental Data Sharing With Applications to Bioinformatics @ University of Pennsylvania
Generic tools and technologies for creating and maintaining data cooperatives (confederations whose purpose is distributed data sharing) will be developed to overcome the difficulties encountered in the sharing of information in the life sciences, specifically in bioinformatics.
The vision of large-scale data sharing has been a long-time goal of the bioinformatics field, much of it proceeding through data integration efforts. However, conventional approaches to data integration do not have the necessary flexibility and adaptability to make the existing and future plethora of data accessible and usable to typical biologists, while keeping it rapidly extensible to new concepts, domains, and types of queries, and thus fostering new research developments. The main reasons are that (1) different biologists work with different types of data and at differing levels of abstraction; (2) schemas in the bioinformatics world are typically large and complex; (3) queries and mappings may "break" without warning because of asynchronous updates; (4) it is logistically, economically and politically difficult to operate centralized data integration facilities. In response to these difficulties, data cooperatives emphasize: decentralization for both scalability and flexibility, incremental development of resources such as schemas, mappings, and queries, rapid discovery mechanisms for finding the resources relevant to a topic, and tolerance for intermittent participation of members and for approximate consistency of mappings. More specifically, the technical goals of the proposal include: (1) collaboratively developed yellow pages of biological topics; (2) schema templates, capturing the part of the structure of data pertaining to a specific interest and functioning also as visual templates from which a query form is created; (3) incremental specification of mappings; (4) reasoning about uncertainty in mappings by measuring with statistical tools their degree of reliability and using it in query answering; (5) multi-path answering for queries with caching and replication in a large-scale data cooperative where the participation of individual members may not always be assured.
Data cooperatives will have broader impact through applications in a variety of scientific and industrial fields, but it is in the field of bioinformatics that they are likely to have an immediate and significant impact. Therefore, a specific data cooperative will serve as a biological testbed for evaluating the proposed technologies. This testbed is based on a small set of databases which are already collaborating and exchanging data related to Plasmodium falciparum. Broader impact will also be achieved through the proposed educational initiatives, specifically through a "computational orchestra" bioinformatics course which will expose students to data integration issues through project work, and a workshop for the Greater Philadelphia Bioinformatics Alliance (GPBA). Minority involvement will also be encouraged through a GPBA internship program.
|
0.915 |
2008 — 2009 |
Stoeckert, Christian J. |
R21 - Activity Code Description: To encourage the development of new research activities in categorical program areas. (Support generally is restricted in level of support and in time.) |
Annotation-Based Meta-Analysis of Microarray Experiments @ University of Pennsylvania
DESCRIPTION (provided by applicant): Meta-analyses of microarray experiments require the usage of meta-data annotations; however, these annotations are often a barrier because they usually entail significant manual evaluation. Combining data from assays in different experiments for analysis is challenging as it requires suitably transforming these data so that they are on equal footing. In particular, extra care needs to be taken to ensure that the results are not driven by confounding factors but rather by biologically-relevant ones. Consideration of annotations can improve meta-analyses through guiding choice of experiments, assays (within each experiment), data transformations, and analysis procedures. We propose to develop software that will extract annotations for use in meta-data analyses and which should motivate better annotation of microarray experiments using established standards. Standardized experiment annotations can be generated using the MGED Ontology (MO) and can be extracted from files based on the MAGE (MicroArray Gene Expression) standard that have information covering the MIAME (Minimal Information About a Microarray Experiment) checklist. Standardized MO-based assay annotations are also available from MAGE-based files, but further relevant information (such as treatment descriptions) also resides in free-text annotation fields in these files. Thus, in order to get fully standardized annotation for assays, more work is needed than just extracting the MO terms associated with them. Our first aim is to develop software that will extract annotations either directly from appropriate MAGE fields or parse them as needed from free-text descriptions. The annotations will be used to generate dissimilarity measures between experiments and assays based on shared annotation. The software will need to recognize synonymous terms when terms from different experiments or assays for the same annotation (e.g., organism part) are drawn from different sources.
Our second aim is to develop software to compute with annotations based on these measures, e.g. to find experiments or assays related to a query experiment/assay, or to cluster experiments or assays based on their annotation (as opposed to clustering based on gene expression profiles). These clusters can be used as the basis for organizing experiments/assays and performing meta-analyses of gene expression profiles. Additionally, annotation-based dissimilarity measures can be used to evaluate existing (gene expression profile based) clusters of experiments or assays and the annotation itself can be input into analyses aimed at identifying over-enriched terms. Narrative: Microarray technology has been used to understand the molecular basis of diseases including heart, lung, blood, and sleep disorders and cancer. We will develop software applications to demonstrate the feasibility and utility of using microarray annotations to drive meta-analyses and quality control (QC) of experiments. The applications will be tested on files from the public repository ArrayExpress but are meant to work with appropriate files from any source. To the best of our knowledge, the usage of annotations for this purpose has not been explored previously and therefore the risk of the proposal is that it is exploring uncharted territory and the severity and type of pitfalls are unclear. The potential high impact of the proposed applications to the bench biologist is the ability to generate additional insights from their microarray results. Moreover, these methods would eventually be extensible to annotated experiments employing high-throughput technologies other than microarrays. This can further facilitate integration of data of different types to address a scientific question of interest. These benefits may encourage better annotation of experiments and use of standards.
|
1 |
2010 — 2012 |
Stoeckert, Christian J. |
R01 |
Integrative Tools For Protozoan Parasite Research @ University of Pennsylvania
DESCRIPTION (provided by applicant): We propose research on data-representation and integration tools to advance understanding of protozoan parasite genome biology. Many of these tools will also be useful for other pathogens and some will be useful for providing links to the host. In this proposal, we build on previous accomplishments with apicomplexan and functional genomic databases to attack critical problems that can be addressed through data integration and enhanced data processing. In order to make data integration possible for pathogen genome databases, algorithms and tools, we will develop the necessary semantics (ontologies) in the context of the Open Biomedical Ontologies (OBO) Foundry. We will also further develop WebProtégé to facilitate collaborative ontology building and integration of ontologies. Taking advantage of these semantics, we will develop new tools to facilitate research on protozoan and links to hosts (human, mouse and other vertebrates), by building upon Web services and the Galaxy open source platform to enable large-scale data analysis without dedicated software development. PUBLIC HEALTH RELEVANCE: Protozoan parasites such as those that cause malaria and toxoplasmosis remain major threats to global health, and a significant biodefense concern. Current treatments are limited and sometimes compromised by acquired resistance. Solutions will come from the integration and mining of ongoing research. The proposed research will provide new frameworks and tools to facilitate data integration and computational mining of genomic-scale datasets.
|
1 |