2004 — 2006
Deelman, Ewa; Gil, Yolanda
Sci/Nmi/Sger: Towards Cognitive Grids: Knowledge-Rich Grid Services For Autonomous Workflow Refinement and Robust Execution @ University of Southern California
This SGER proposal describes research on grid workflow refinement and execution for abstract workflows in two areas: preplanning and advance resource reservation, and context-aware dynamic planning and failure repair. Drawing from AI planning techniques, resource reasoning, and languages for expressive knowledge representation, techniques for workflow refinement will be developed. Mechanisms for monitoring and failure detection based on models for expressive representation of the environments will also be developed. Resulting service implementations will build upon the existing Pegasus workflow mapping system and will be disseminated through Pegasus.
2006 — 2009
Deelman, Ewa (co-PI); Gil, Yolanda
Nsf Workshop On Scientific Workflows Challenges (Wsw-06) @ University of Southern California
Workflows have recently emerged as a paradigm for conducting large-scale scientific analyses. The structure of a workflow specifies what analysis routines need to be executed, the data flow amongst them, and relevant execution details. These workflows often need to be executed in distributed environments, where data sources may be available in different physical locations and the steps may have execution requirements calling for high-end computing and memory resources at remote locations. Workflows help manage the coordinated execution of related tasks. They also provide a systematic way to capture scientific methodology and provide provenance information for their results. Yet, robust and flexible workflow creation, mapping, and execution are largely open research problems.
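The workflow structure described above, analysis routines plus the data flow among them, amounts to a dependency graph from which an execution order can be derived. The following is a hypothetical, minimal sketch in Python; the task names and representation are invented for illustration and are not the API of any particular workflow system:

```python
from collections import defaultdict, deque

# Hypothetical workflow: each task maps to the tasks whose outputs it consumes.
workflow = {
    "extract": [],                        # no dependencies
    "calibrate": ["extract"],
    "simulate": ["calibrate"],
    "visualize": ["calibrate", "simulate"],
}

def execution_order(tasks):
    """Return a topological order honoring the declared data flow."""
    indegree = {t: len(deps) for t, deps in tasks.items()}
    dependents = defaultdict(list)
    for t, deps in tasks.items():
        for d in deps:
            dependents[d].append(t)
    ready = deque(t for t, n in indegree.items() if n == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for nxt in dependents[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(tasks):
        raise ValueError("cycle in workflow specification")
    return order

print(execution_order(workflow))
```

A real system would additionally attach execution requirements and data locations to each task; the sketch only shows how the declared data flow constrains the coordinated execution the paragraph describes.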
Scientific workflows present new challenges over business workflows and other kinds of process models. They typically use very large, distributed data sets, employ computationally intensive tasks, and require high-end and distributed computing technology. They are also often iteratively and interactively designed, since that is the nature of the scientific exploration and analysis process they reflect. On the other hand, scientific workflows also have simplified requirements in terms of their data flow structure, execution management, or security/privacy constraints. Currently, scientific workflows are mostly designed without formal principles and are rarely optimized, scalable or reusable.
The aim of this workshop is to bring together IT researchers and practitioners as well as domain scientists. Application scientists will be asked to describe requirements and desired new analyses and computations that are not possible with today's technologies. IT researchers will be asked to identify problems in their specific areas of expertise. Discussions will focus on four main topics: (1) applications and requirements; (2) dynamic workflows and user steering; (3) data and workflow descriptions; and (4) system-level management to support large-scale workflows.
The outcome of the workshop will be a report outlining research directions and activities that will bring the needed communities together to work on producing a new paradigm for scientific workflows. Easy-to-use tools for building efficient, scalable and reusable scientific workflows are likely to bring benefits to many fields, and can raise the pace and quality of research work in many areas.
The workshop Web site (http://vtcpc.isi.edu/wiki) provides further information about the workshop and will be used for disseminating the workshop report and other results.
2007 — 2011
Deelman, Ewa; Gil, Yolanda
Designing Scientific Software One Workflow At a Time @ University of Southern California
Proposal Number: 0725332. Title: Designing Scientific Software One Workflow at a Time. PIs: Ewa Deelman and Yolanda Gil
Much of science today relies on software to make new discoveries. This software embodies scientific analyses that are frequently composed of several application components and created collaboratively by different researchers. Computational workflows have recently emerged as a paradigm to manage these large-scale and large-scope scientific analyses. Workflows represent computations that are often executed in geographically distributed settings, their interdependencies, their requirements, and their data products. These workflows are at the core of today's scientific discovery processes and must be treated as scientific products in their own right. The focus of this research is to develop the foundations for a science of design of scientific processes embodied in the new artifact that is the computational workflow. The work will integrate best practices and lessons learned in existing workflow applications, and extend them to define and formalize design principles of computational workflows. The result will be a fundamentally new approach to designing workflows that greatly improves scientific software design methodology by defining and formalizing design principles and by familiarizing the scientific community with effective workflow design processes.
2009 — 2012
Gil, Yolanda; Shaw, Erin; Kim, Jihie; Ragusa, Gisele (co-PI)
Hcc: Small: Pedworkflow: Workflows For Assessing Student Learning @ University of Southern California
This pedagogical workflow research project will create a novel hybrid-workflow framework that supports efficient assessment of student learning through interactive generation and execution of various assessment workflows. Unlike in many existing workflow systems, the task of student assessment includes steps that cannot be fully automated, such as obtaining grade, background, and student survey information. The system will provide assistance in executing and integrating the results of the manual steps. Research steps will include (a) knowledge-based modeling of computational and non-computational assessment tools as workflow components; (b) interactive generation of assessment workflows while propagating and combining constraints from both computational and non-computational components; and (c) interactive execution of hybrid workflows that incorporates new constraints inferred from the execution of non-computational components. Evaluations will focus on the effects of Pedagogical Workflow technology on learning assessment performance, especially the assessment of pedagogical discourse in undergraduate engineering courses.
Educational technology to support online learning is now centrally supported by many colleges and universities. The perceived mandate to use technology for instruction, in addition to the enormous amount of information available for consumption on the Web, places a considerable burden on instructors, who must learn to integrate appropriate student practices and learning assessment via the new media. Pedagogical workflows will allow instructors with little or no training in educational assessment to perform large-scale, complex diagnosis and assessment of student learning in ongoing lessons. Facilitating the integration of personal student information into assessment will point to directions to improve STEM participation, learning, and retention. The findings will provide benefits to society by sharing results and technology with instructors and educational experts. The proposed work also has the potential to lead to a new research field on e-Learning workflows, similar to the way in which workflow technology transformed e-Science research with e-Science workflows.
2009 — 2011
Gil, Yolanda
Iii/Eager: Towards Workflows as First-Class Citizens in Cyberinfrastructure: Designing Shared Repositories @ University of Southern California
Scientific computing has entered a new era of scale and sharing with the arrival of cyberinfrastructure for computational experimentation. A key emerging concept is scientific workflows, which provide a declarative representation of scientific applications as complex compositions of software components and the dataflow among them. Workflow systems manage their execution in distributed resources, track provenance of analysis products, and enable rapid reproducibility of results. In current cyberinfrastructure, there are well-understood mechanisms for sharing data, instruments, and computing resources. This is not the case for sharing workflows, though there is an emerging movement for sharing analysis processes in the scientific community.
This project explores computational mechanisms for sharing workflows as a key missing element of cyberinfrastructure for scientific research, with three major research foci: (1) eliciting the new requirements that workflow sharing poses over current techniques for sharing software tools and libraries; (2) understanding how shared workflow catalogs should be designed, since existing data catalogs are a successful model but software components require different representations and access functions; and (3) studying what sharing paradigms might be appropriate for scientific communities.
Expected results from this work include: use cases for workflow sharing and reuse that motivate this research area, a comparison between software reuse and workflow reuse requirements, a specification of a workflow catalog defining expected functions and services, and an investigation of social issues that arise in building this new kind of shared resource in scientific communities. Results are available at the project Web site (http://workflow-sharing.isi.edu).
2011 — 2012
Gil, Yolanda
Iis: Iii: Workshop On Discovery Informatics @ University of Southern California
The workshop aims to identify the research challenges and opportunities for transforming the scientific discovery process through advances in computing and information sciences in general, and intelligent systems in particular. It seeks to define a research agenda in Discovery Informatics. The workshop is organized around three themes: (1) efficient experimentation and discovery processes, (2) practical issues in building and refining predictive models from scientific data, and (3) social computing for science.
The participants include experts and visionaries in the areas of knowledge representation and inference, machine learning and data mining, experiment design and planning, information integration, computational models of discovery, collaborative technologies, robotics, social networks, visualization, and representative application (science) domains.
Research in Discovery Informatics is expected to integrate advances in multiple subdisciplines of artificial intelligence and cognitive science to develop the next generation of informatics-driven exploratory apparatus for scientific discovery. The resulting formal frameworks and computational tools have the potential not only to accelerate discovery but also to enable new modes of discovery by providing tools that empower scientists to reach across disciplinary boundaries. Such tools can also contribute to enhanced modes of teaching and learning in science, technology, engineering, and mathematics (STEM) disciplines.
The results of the workshop (including the workshop report) will be freely disseminated to the larger scientific research and educational community.
2011 — 2014
Gil, Yolanda
Hcc: Small: An Analytical Framework For Provenance-Rich Social Knowledge Collection @ University of Southern California
This project will investigate a new generation of provenance-rich social knowledge collection systems that will greatly improve the ability of people to create online communities of interest and share information. The research will transform the state of the art in social content collection in several important ways. First, social knowledge collection systems will be augmented to help contributors structure factual content, so that information can be aggregated to answer simple factual queries. The work will build on a semantic wiki framework that allows users to create structured factual content as object-property-value triples. It will not assume pre-defined ontologies, but rather develop algorithms that analyze current content and suggest opportunities for structuring contributions so they can be aggregated to answer simple queries. Second, the systems will include detailed provenance records that reflect how the content was created, allowing contributors to enter alternative viewpoints and enabling consumers to make quality and trust judgments. The research will include developing algorithms that derive trust metrics from the provenance records and allowing users to define views on the content based on provenance criteria. It will create novel approaches to propagate trust across content topics and categories, complementing existing algorithms that propagate trust in social networks. Third, the systems will proactively guide contributors to invest effort where it is most needed, by developing novel algorithms to detect knowledge gaps and by allowing users to define queries that drive further contributions.
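The combination of object-property-value triples with provenance records described above can be illustrated with a small sketch. The entities, contributors, and trust scores below are invented for illustration and are not the project's actual schema:

```python
# Hypothetical provenance-rich triples: (object, property, value, provenance).
triples = [
    ("Lake_Mendota", "surface_area_km2", 39.4, {"contributor": "alice", "trust": 0.9}),
    ("Lake_Mendota", "located_in", "Wisconsin", {"contributor": "bob", "trust": 0.7}),
    ("Lake_Monona", "located_in", "Wisconsin", {"contributor": "alice", "trust": 0.9}),
]

def query(prop, value, min_trust=0.0):
    """Aggregate structured content: which objects assert prop == value,
    keeping only assertions whose provenance meets a trust threshold?"""
    return [o for o, p, v, prov in triples
            if p == prop and v == value and prov["trust"] >= min_trust]

print(query("located_in", "Wisconsin"))                 # both lakes
print(query("located_in", "Wisconsin", min_trust=0.8))  # only the higher-trust assertion
```

The trust threshold stands in for the provenance-based views the paragraph mentions: the same content can be filtered differently depending on how much the consumer trusts each contributor.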
This work has the potential for broader impact in many areas where social content collection is already widely used, not only in scientific communities but also for societal issues such as citizen participation in local communities, health, and governance. All these communities would benefit from further structure, provenance models, and guided knowledge collection. Despite their popularity, social content collection sites currently have important limitations. First, because the content has very little structure, they cannot aggregate information and answer many simple questions. Second, contributors have uneven expertise and skills, so the content is of highly uneven quality, yet there is no assistance to help consumers tell the valuable from the dubious. Third, these sites depend on the initiative of contributors to figure out how the content needs to grow, and there is no systematic analysis to expose knowledge gaps and guide contributors proactively. This research project addresses all three of those issues.
2012 — 2013
Deelman, Ewa (co-PI); Gil, Yolanda
Earthcube Community Workshop: Designing a Roadmap For Workflows in Geosciences @ University of Southern California
EarthCube is focused on community-driven development of an integrated and interoperable knowledge management system for data in the geo- and environmental sciences. By utilizing a cooperative, as opposed to competitive, process like that which created the Internet and Open Source software, EarthCube will attack the recalcitrant and persistent problems that so far have prevented adequate access to, and the analysis, visualization, and interoperability of, the vast storehouses of disparate geoscience data and data types residing in distributed and diverse data systems. This award funds a series of broad, inclusive community interactions to gather adequate information and requirements to create a roadmap for a critical capability (workflow) in the development of EarthCube, a major new NSF initiative. Workflow in the context of EarthCube, and cyberinfrastructure in general, encompasses a broad range of topics including distributed execution management, the coupling of multiple models into composite applications, the integration of a wide range of data sources with processing, and the creation of refined data products from raw data. A key benefit of the funded work, in terms of evaluating and creating community consensus on the best way forward for this capability, is the ability to document the provenance of data used in modeling and to reproduce model- and data-enabled scientific results. The funded workshop and information-collecting activity will be open to all interested parties and is being led by a diverse and expert team of cyberinfrastructure developers, computer scientists, and geoscientists. Broader impacts of the work include converging on approaches, protocols, and standards that may be applicable across the sciences. They also include fostering close interaction between communities that do not commonly interact with one another and focusing them on the common goal of creating a new paradigm in data and knowledge management in the geosciences.
2013 — 2017
Gil, Yolanda
Eager: Learning Big Data Analytic Skills Through Scientific Workflows @ University of Southern California
Big data analytics has emerged as a widely desirable skill in many areas. Although courses are now available on a variety of aspects of big data, there is no broad, accessible course covering the range of topics that concern big data analytics. As a result, acquiring practical data analytics skills is out of reach for many students and professionals, posing severe limitations on our ability as a society to take advantage of our vast digital data resources. The goal of this work is to develop curriculum materials for big data analytics that provide broad and practical training in the context of real-world, science-grade datasets and data analytics methods. A key technical basis of the approach is the use of workflows that capture expert analytic methods, which will be presented to users for practice with real-world datasets within pre-defined lesson units. The results of this work include lesson units for learning expert-level skills in big data analytics, a framework for non-programmers to understand basic concepts in big data analytics, and a hands-on workflow framework for learning through direct experimentation and exploration with scientific data. The work focuses on big data problems relevant to geosciences, such as water quality analysis, hydrology, lake ecosystem sustainability, climate science, and earth modeling. This project will supplement existing academic training materials in big data. The PIs will use real-world geosciences data and domain tasks. All the materials will be available under open source licenses. The proposed work will greatly improve the ability of students to pursue careers in big data analytics. The framework will be accessible to students who lack the programming skills required to assemble end-to-end data analysis systems themselves for experimentation and practical learning.
Wide adoption of the proposed approach could ultimately lead to broad societal impact by changing how people interact with data, learn from scientific data, and participate in big data analysis.
2013 — 2016
Gil, Yolanda; Mattmann, Chris; Peckham, Scott; Robinson, Erin; Duffy, Christopher
Earthcube Building Blocks: Software Stewardship For the Geosciences @ University of Southern California
Geoscience and environmental science software is crucial for data analysis and for generating new knowledge and understanding about the Earth. Because reproducibility of operations, calculations, and predictions done with this software is important for science, commercial, and regulatory applications, it is important that the software generated by geoscientists and their colleagues be captured, curated, managed, and made available to all interested parties upon request. This project initiates that process by building partnerships between computer scientists, software developers, and scientists across all geoscience domains, with the goal of creating a software ecosystem and a culture of software stewardship that will empower geoscientists and others to make their software accessible and manage it as a valuable scientific asset. This work will examine the possibility of creating effective on-line tools that will guide users through best practices in software use and development, including componentizing codes, describing and documenting their codes, licensing their programs, and maintaining reusable code. It will also explore the best and most effective means for training geoscientists and providing appropriate educational/training materials to dramatically improve the ability of geoscientists to more effectively develop and curate software generated by themselves, their students, and others. Broader impacts of the work include building infrastructure for science, engagement of early career professionals in both computer science and geoscience, and increasing the cyber sophistication of geoscience practitioners. The project supports a core team that is geographically distributed (California, Pennsylvania, and Colorado) and partners with a nation-wide consortium of academic organizations dedicated to advancing geoscience geospatial needs.
2013 — 2017
Gil, Yolanda; Duffy, Christopher; Hanson, Paul
Inspire Track 1: the Age of Water and Carbon in Hydroecological Systems: a New Paradigm For Science Innovation and Collaboration Through Organic Team Science @ Pennsylvania State Univ University Park
This INSPIRE award is partially funded by the Geobiology & Low Temperature Geochemistry Program in the Division of Earth Sciences in the Directorate for Geoscience; the Human Centered Computing Program in the Division of Information & Intelligent Systems in the Directorate for Computer & Information Science & Engineering; and the Virtual Organizations as Socio-technical Systems Program in the Division of Advanced Cyber-Infrastructure in the Directorate for Computer & Information Science & Engineering.
This project will develop new scientific work practices and cyberinfrastructure tools to advance the fields of hydrology and limnology (lake ecology). The project will develop a socio-technical model of "organic team science" in which scientists are motivated to collaborate across diverse scientific communities and to share and normalize data to solve scientific problems through an open framework, potentially creating new cross-disciplinary collaborations around the modeling problems. The project will advance hydrology by making already-collected geospatial data more usable for analysis and simulations. It will advance limnology by developing an integrated hydrodynamic model of lakes as connected to the broader hydrologic network to quantify water, material, nutrient, and energy fluxes, which is potentially transformative for limnology. The project will be carried out with collaborators including the NSF Susquehanna/Shale Hills Critical Zone Observatory and the GLEON projects.
The project will provide benefits by developing cyberinfrastructure that gives limnology access to climate and geospatial data and models, as well as novel practices for supporting organic team science. The latter is potentially a significant and transformative contribution to the infrastructure for science. The hydrodynamic model could be useful for those managing lakes. The proposal includes plans for outreach to the scientific community to share these findings.
2014 — 2017
Gil, Yolanda; Mattmann, Chris
Earthcube Building Blocks: Collaborative Proposal: Geosoft: Collaborative Open Source Software Sharing For Geosciences @ University of Southern California
Geosciences software embodies crucial scientific knowledge, and as such it should be explicitly captured, curated, managed, and disseminated. The goal of this project is to create a system for software stewardship in geosciences that will empower scientists to manage their software as valuable scientific assets. Scientific software stewardship requires a combination of cyberinfrastructure, social infrastructure, and professional development infrastructure. The framework will result in open, transparent, and broader access to scientific software for other scientists, software professionals, students, and decision makers. It will significantly improve the adoption of open data and open software initiatives, improve reproducibility, and advance scientific scholarship.
The proposed research will advance knowledge and understanding of scientific software as a valuable community asset that is worth sharing, curating, cataloging, validating, reusing, and maintaining. The work has three thrusts. 1) Facilitating software publication through TurboSoft, a personal assistant (analogous to TurboTax) that guides a user through best practices. Users will choose the degree of investment they are willing to make in componentizing, describing, licensing, and maintaining their software. The system will encourage open source publication and the formation of communities around the software, and will set up mechanisms for software citation and credit. 2) Enabling broad software dissemination through GeoSoft, a "software commons" for geosciences that will support software contributions (prepared through TurboSoft or otherwise) and software discovery through multi-faceted search, and will foster social interactions through dynamic formation of communities of interest. GeoSoft will interoperate with existing software repositories and modeling frameworks in geosciences. 3) Providing just-in-time training materials through GeoCamp, an annotated collection of educational units ranging from basic education to professional training on all aspects of software stewardship. GeoCamp will be seamlessly integrated with TurboSoft and GeoSoft, and will present a wide range of options for learning either in the context of a user's interaction with the framework or independently.
2015 — 2016
Gil, Yolanda
Workshop On Intelligent Systems Research to Support Geosciences and Earthcube Mission @ University of Southern California
This workshop will serve as a conduit for jump-starting synergistic research advancing our understanding of the Earth system through innovative cyberinfrastructure that pushes the envelope on information systems research. The workshop will help synthesize a vision and needs for intelligent systems research that will provide new capabilities envisioned by EarthCube to advance geosciences. The workshop will catalyze a community and research agenda in the emerging area of Discovery Informatics grounded in geoscience requirements.
Participants will discuss how to tackle problems in heterogeneous data integration and visualization (e.g., hand-made sketches, aerial imagery, field-data repositories, stakeholder interviews) and in ontological reasoning with scientific metadata and mathematical models (e.g., representing uncertainty, simulation predictions, evolving theories). The salient themes arising from discussions at the workshop will be articulated in detail in a final workshop report, which will be made available to the broad research community.
2015 — 2017
Gil, Yolanda; Emile-Geay, Julien
Earthcube Ia: Collaborative Proposal: Linkedearth: Crowdsourcing Data Curation & Standards Development in Paleoclimatology @ University of Southern California
Natural climate variability significantly modulates anthropogenic global warming, and only paleoclimate observations can adequately constrain it. Moreover, such observations are most powerful when many records are brought together to provide a spatial understanding of past variability. However, there is currently no universal way to share paleoclimate data between users or machines, hindering integration and synthesis. Large-scale, international paleoclimate data syntheses have a long and successful history, but have been needlessly labor-intensive. Recognizing that (1) paleoclimate data curation requires expert knowledge, (2) top-down data management approaches are ineffectual, and (3) existing infrastructure does not foster standardization, there is a critical need for a flexible platform enabling crowdsourced data curation and standards development. The platform will be combined with editorial and community-driven processes, resulting in a system that has the potential to engage a broad user base in geoscientific data curation. The proposed framework will lower barriers to participation in the geosciences, enabling more "dark data" to join the public domain using community-sanctioned protocols. The pilot project will facilitate the work of hundreds of paleoclimate scientists, accelerating scientific discovery and the dissemination of its results to society.
Semantic wikis provide a simple, intuitive interface to semantic languages and infrastructure that build on open Web architecture. Like traditional wikis, they enable the collaborative authoring of content. Secure access and time-stamped content also enable the tracking of changes and the accountability of users, as well as moderation capabilities by community members of recognized expertise. In contrast to traditional wikis, semantic wikis allow contributors to assign meaning to their content, specifying relationships between the objects they describe. This enables artificial intelligence reasoners to parse, process, and translate these data into more useful forms. The technology is well-proven, scalable, and completely transparent to the user, requiring no computer science knowledge or more sophisticated technology than a web browser. The LinkedEarth Wiki will automatically translate this information into Linked Open Data, a universal format to share data across the Web. To demonstrate this concept's broad applicability across paleoclimate science, the project's target community is the PAGES2k consortium, an international collaboration dedicated to the climate of the Common Era. Social technologies will be developed to power collective curation, standards development, and quality control by the community itself. The project will demonstrate applicability to other paleogeosciences, serving as a potential template for other geoscientific disciplines.
2015 — 2017
Gil, Yolanda
Earthcube Ia: Collaborative Proposal: Integrated Geoscience Observatory @ University of Southern California
The habitability of planet Earth depends on a complex interaction between interior regions, solid surface, oceans, atmosphere, near-earth space environment, and Sun. Yet, study of this Sun-Earth system is traditionally broken up into separate geoscience disciplines, so that progress can be made by scientists working in reasonably-sized communities that share a common language and base of knowledge. To broach the bigger question of the interaction of the subsystems studied by the separate communities, it is necessary to overcome the barriers of communication posed by different observational instruments, software tools for interpreting data, and modeling methods. In answer to this challenge, the Integrated Geoscience Observatory is a pilot project that creates an online platform for integrating data and associated software tools contributed by separate geoscience research communities into a unified toolset that brings them together. The vision is to expand the individual domains of geoscience research toward study of the whole Sun-Earth system, and in so doing to uncover the system-level effects critical to the habitability of planet Earth.
EarthCube aims to develop a framework for assisting researchers in understanding the Earth system. This systems science challenge is recognized in the Decadal Survey in Solar and Space Physics [2012], with the conclusion "Data from diverse space- and ground-based instruments need to be routinely combined in order to maximize their multi-scale potential." The Integrated Geoscience Observatory is a pilot project that explores realization of this vision by focusing on the limited context of geospace research. The observatory creates an integrated package of software tools contributed by researchers with specific capabilities, and designed to enable integration of diverse observational data. Features of the toolkit include: (A) linking diverse data sets from multiple data repositories and automatically mapping them to a common user-specified coordinate grid; (B) implementing the well-known Assimilative Mapping of Ionospheric Electrodynamics (AMIE) procedure for assimilation of this data to yield a global picture; and (C) utilization of the EarthCube building blocks GeoSoft, for communicating ontology, and GeoDataspace, for attributing credit to contributors through publication of processed data. The toolset can be accessed and used either through a web-based computing environment, or through download packages for local installation, with a nearly seamless transition between the two.
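Feature (A), mapping diverse data sets onto a common user-specified grid, amounts to resampling each instrument's observations onto shared coordinates. A pure-Python sketch using linear interpolation, with invented instrument names and values:

```python
# Sketch of feature (A): resampling observations from two instruments onto a
# shared, user-specified grid so they can be compared point-for-point.
# Instrument names and sample values are invented for illustration.

def regrid(times, values, grid):
    """Linearly interpolate (times, values) samples onto the target grid."""
    out = []
    for t in grid:
        if t <= times[0]:
            out.append(values[0])          # clamp before first sample
        elif t >= times[-1]:
            out.append(values[-1])         # clamp after last sample
        else:
            # Find the bracketing samples and interpolate between them.
            for i in range(len(times) - 1):
                if times[i] <= t <= times[i + 1]:
                    frac = (t - times[i]) / (times[i + 1] - times[i])
                    out.append(values[i] + frac * (values[i + 1] - values[i]))
                    break
    return out

grid = [0.0, 1.0, 2.0, 3.0]                      # common time grid (hours)
radar = regrid([0.0, 2.0, 4.0], [10.0, 30.0, 50.0], grid)
magnetometer = regrid([0.5, 1.5, 3.5], [5.0, 7.0, 11.0], grid)
print(radar)          # [10.0, 20.0, 30.0, 40.0]
print(magnetometer)   # [5.0, 6.0, 8.0, 10.0]
```

Once both series live on the same grid, assimilation procedures such as AMIE can combine them into a single global picture.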
|
1 |
2016 — 2018 |
Gil, Yolanda Mallick, Parag Kumar [⬀] |
R01Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
A Discovery Engine For Reproducible and Comparable Multi-Omic Analysis
Abstract Development of tools to analyze and integrate the avalanche of heterogeneous, multi-omic data (e.g. genomics, transcriptomics, ChIP-seq, and proteomics) is a major NIH priority. These tools are necessary in order to transform huge datasets into the knowledge that will enable prevention, detection, and treatment of disease. Unfortunately, multi-omics analysis is exponentially more challenging than single-ome analysis, and the vast majority of labs in the biological community are not equipped to perform it. We propose to develop and test an open-source workflow platform to facilitate multi-omic data analysis: The WINGS MultiOmic Discovery Engine (WINGS-MDE) will extend the capabilities of WINGS, an open-source semantic workflow system developed by Dr. Gil. Our innovative approach includes four key features. Ease of use - users interact with a simple, cloud-based web interface for workflow development and execution. Intelligence - a semantic workflow reasoner will significantly automate development, validate new or altered workflows, and perform meta-analyses that will trigger a researcher's workflows on new data and alert them of interesting findings. Flexibility - an adaptable plug-in architecture allows the alteration of parameters, the addition of algorithms, and the assessment of incremental changes in workflow. Designed for multi-omics - WINGS-MDE will support diverse, multi-omic workflows of broad interest. WINGS-MDE will also contain an execution engine able to execute over distributed resources and manage data at large scale. In addition, we will develop a provenance repository to capture how data were analyzed to facilitate reuse and reproducibility.
Our Specific Aims are to: (1) Create a repository of semantic workflows for performing multi-omic analysis, which will contain the most common multi-omic analysis components and enable their use and reuse; (2) Develop a multi-omic discovery meta-workflow engine, which will enable researchers to compare workflows and establish how sensitive results are to particular aspects of a workflow; and (3) Develop an inter-lab workflow sharing environment, which will support the enhanced annotation and dissemination of public datasets. This work will enable diverse researchers to develop and perform multi-omic analysis in a rigorous, reproducible, and shareable manner. We anticipate that the analytical methods produced through this project will improve the ease-of-use, transparency, reproducibility, and testability of multi-omic analysis, thereby improving their impact in understanding disease biology and treatment.
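The semantic validation that a workflow reasoner performs can be sketched in miniature: each component declares input and output data types, and a pipeline is rejected when a step would receive a type it does not accept. This is an illustrative toy, not the actual WINGS API; the component names and types are invented.

```python
# Illustrative sketch (not the actual WINGS API): a semantic workflow reasoner
# rejects a pipeline when a step's declared input type does not match the
# type flowing into it. Component names and types here are invented.

COMPONENTS = {
    "align_reads":   {"in": "FastqFile", "out": "BamFile"},
    "call_variants": {"in": "BamFile",   "out": "VcfFile"},
    "quantify_rna":  {"in": "FastqFile", "out": "CountsTable"},
}

def validate(workflow, initial_type):
    """Check that each step accepts the type produced by the previous one."""
    current = initial_type
    for step in workflow:
        spec = COMPONENTS[step]
        if spec["in"] != current:
            return False, f"{step} expects {spec['in']}, got {current}"
        current = spec["out"]
    return True, current

ok1, result1 = validate(["align_reads", "call_variants"], "FastqFile")
print(ok1, result1)    # True VcfFile

ok2, result2 = validate(["align_reads", "quantify_rna"], "FastqFile")
print(ok2, result2)    # False quantify_rna expects FastqFile, got BamFile
```

A full semantic reasoner goes well beyond simple type matching (constraint propagation, parameter suggestion), but the core idea is the same: machine-readable component descriptions let the system catch invalid analyses before they run.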
|
0.954 |
2017 — 2020 |
Gil, Yolanda |
N/AActivity Code Description: No activity code was retrieved: click on the grant title for more information |
Earthcube Data Infrastructure: Collaborative Proposal: a Unified Experimental-Natural Digital Data System For Cataloging and Analyzing Rock Microstructures @ University of Southern California
When viewed at the micro-scale, rocks reveal structures that help to interpret the processes and forces responsible for their formation. These microstructures help to explain phenomena that occur at the scale of mountains and tectonic plates. Interpretation of microstructures formed in nature during deformation is aided by comparison with those formed during experiments, under known conditions of pressure, temperature, stress, strain and strain rate, and experimental rock deformation benefits from the ground truth offered through comparison with rocks deformed in nature. However, the ability to search for relevant naturally or experimentally deformed microstructures is hindered by the lack of any database that contains these data. The researchers collaborating on this project will develop a single digital data system for rock microstructures to facilitate the critical interaction between and among the communities that study naturally and experimentally deformed rocks. To aid in the comparison of microstructures formed in nature and experiment, we will link to commonly used analytical tools and develop a pilot project for automatic comparison of microstructures using machine learning.
Rock microstructures relate processes at the microscopic scale to phenomena at the outcrop, orogen, and plate scales and reveal the relationships among stress, strain, and strain rate. Quantitative rheological information is obtained through linked studies of naturally formed microstructures with those created during rock deformation experiments under known conditions. The project will develop a single digital data system for both naturally and experimentally deformed rock microstructure data to facilitate comparison of microstructures from different environments. A linked data system will facilitate interaction between practitioners of experimental deformation, those studying natural deformation, and the cyberscience community. The data system will leverage the StraboSpot data system currently under development in Structural Geology and Tectonics. To develop this system requires: 1) Modification of the StraboSpot data system to accept microstructural data from both naturally and experimentally deformed rocks; and 2) Linking the microstructural data to its geologic context, either in nature or in its experimental data/parameters. The researchers will engage the rock deformation community with the goal of establishing data standards and protocols for data collection, and will integrate this work with ongoing efforts to establish protocols and techniques for automated metadata collection and digital data storage. To analyze the microstructures studied and/or generated by these communities, they will ensure StraboSpot data output is compatible with commonly used microstructural tools. They will develop a pilot project for comparing and analyzing microstructures from different environments using machine learning.
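One simple way machine-learning comparison of microstructures can work is to describe each image by a feature vector and rank candidate matches by similarity. The features and numbers below are invented for illustration, not taken from the project:

```python
# Hypothetical sketch of microstructure comparison: represent each sample by a
# feature vector (e.g., mean grain size, aspect ratio, percent recrystallized)
# and rank experimental analogs by cosine similarity to a natural sample.
# The feature choices and values are invented for illustration.

import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Feature vectors: [mean grain size (um), aspect ratio, % recrystallized]
natural_sample = [120.0, 2.1, 35.0]
experiments = {
    "exp_800C_slow": [115.0, 2.0, 33.0],
    "exp_900C_fast": [40.0, 1.2, 80.0],
}

best = max(experiments, key=lambda k: cosine(natural_sample, experiments[k]))
print(best)   # exp_800C_slow
```

A production pilot would likely learn features directly from images rather than hand-picking them, but the retrieval step (rank experimental analogs by similarity to a natural sample) has this shape.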
|
1 |
2017 — 2019 |
Gil, Yolanda |
N/AActivity Code Description: No activity code was retrieved: click on the grant title for more information |
Collaborative Proposal: Earthcube Integration: Accelerating Scientific Workflows Using Earthcube Technologies (Asset) @ University of Southern California
A major need in the geosciences is to reduce the amount of time geoscientists spend on "data wrangling" tasks (finding, accessing, subsetting, gridding, processing, reformatting, visualizing) so that they can spend more time on complex scientific analysis. This project will develop a means for scientists to document and interact with all of the data management and analysis tools that they need to create a scientific result, and it will add tools to find common patterns in their work to help find tools that could reduce the time needed to produce the result. This work may also help to make methods and analyses more transparent by providing the means to track the steps in creating a scientific result.
This project builds on relationships among cyberinfrastructure experts and geoscientists to examine current scientific workflows and the integration of EarthCube tools. The work will include a workflow sketching interface to specify the steps, dependencies, current tools, current duration, and other important aspects of a scientist's workflow. As part of the proposed pilot project, the team will examine two geoscience use cases (hurricane risk and water resources) in detail and assign cyberinfrastructure experts to integrate EarthCube tools into these science workflows, to demonstrate the increases in productivity that are realized. In addition, the team will conduct a workshop/clinic at a major geoscience conference to collect additional use cases from the community, where geoscientists will describe their workflows and receive personalized advice on the use of EarthCube tools. If successful, the researchers may expand on this pilot project to conduct many more workshops at other venues and to map many more scientific workflows to corresponding EarthCube technologies.
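The pattern-finding idea can be sketched concretely: given sketched workflows as ordered step lists, consecutive step pairs that recur across many scientists' workflows are candidates for replacement by an existing EarthCube tool. The step names below are invented:

```python
# Sketch of finding common patterns across sketched workflows: count
# consecutive step pairs (bigrams) that appear in multiple workflows.
# Step names are invented for illustration.

from collections import Counter

workflows = [
    ["download", "subset", "regrid", "visualize"],
    ["download", "subset", "qc_filter", "regrid"],
    ["subset", "regrid", "analyze"],
]

pairs = Counter(
    (steps[i], steps[i + 1])
    for steps in workflows
    for i in range(len(steps) - 1)
)

# Pairs seen in at least two workflows are candidates for tool support.
common = [pair for pair, n in pairs.items() if n >= 2]
print(common)
```

Real workflow mining would also handle reordered and optional steps, but even this bigram count surfaces the "download then subset" and "subset then regrid" bottlenecks that a shared tool could absorb.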
|
1 |
2020 — 2021 |
Gil, Yolanda Altintas, Ilkay Hiers, John Linn, Rodman |
N/AActivity Code Description: No activity code was retrieved: click on the grant title for more information |
Nsf Convergence Accelerator – Track D: Artificial Intelligence and Community Driven Wildland Fire Innovation Via a Wifire Commons Infrastructure For Data and Model Sharing @ University of California-San Diego
The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future. The broader impact and potential societal benefit of this Convergence Accelerator Phase I project is to create the WIFIRE Commons, a data-driven, artificial intelligence (AI) enabled and model-based scientific approach that ultimately aims to limit and even prevent the devastating effects of wildfires by using advanced technologies to support fire mitigation, preparedness, response, and recovery. The combination of wildfire data, AI and the physics of fire behavior in the main design of WIFIRE Commons drives multidisciplinary collaboration and engagement with educators, municipal leaders, and fire managers to ensure the Commons is designed for translational use. Data and model sharing are core to the effort, as are strategic partnerships and close collaboration with the private and public sectors. The project team includes educators from Hispanic-serving institutions and advocates for increasing participation of women in the fire workforce and data science fields. In addition, WIFIRE Commons' AI Gateway machine learning, scalable computing and interactive geospatial analysis tools will be applicable to any area that can benefit from modeling.
This project seeks to undertake convergence research on AI-integrated wildland fire research and response, and to build a framework we call the WIFIRE Commons for using AI to enable innovative optimization of the evolving combinations of physics-based wildfire models and heterogeneous data sets used to monitor and predict wildfires in real-time. The Phase I effort will contribute toward this goal through a design-thinking approach with five streams of deliverables: 1) community convergence workshops; 2) a prototype data and model commons framework; 3) use-inspired case studies to demonstrate the proposed AI innovations; 4) prototyping of educational, outreach, and public information activities; and 5) Phase II planning. The long-term vision is to create a sustainable and open source AI-driven data and model commons to facilitate and leverage collaborations to "harness AI innovations" supporting use-inspired societal and scientific wildland fire applications. Driven by design-thinking and building upon prior research by our team members (WIFIRE, MINT, QUIC-fire), the proposed WIFIRE Commons convergence research and data and model sharing framework will enable development of novel artificial intelligence techniques and reusable models that can be utilized in many applications. This Commons infrastructure will catalog, curate and integrate data and models for AI-driven fire science, maintain open programmatic access to data in a cloud-compatible form that can be integrated into the AI process through a gateway interface, and ensure provenance of data and models over time. This AI-enabled smart data/model integration will transform the agility of science-based wildland fire decision making, allowing for new kinds of models and data to be assimilated rapidly and allowing an expanding base of users to understand levels of uncertainty.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.976 |
2021 — 2023 |
Gil, Yolanda Altintas, Ilkay Hiers, John Linn, Rodman |
N/AActivity Code Description: No activity code was retrieved: click on the grant title for more information |
Nsf Convergence Accelerator – Track D: Artificial Intelligence and Community Driven Wildland Fire Innovation Via a Wifire Commons Infrastructure For Data and Model Sharing @ University of California-San Diego
A century of suppressing wildfires has created a dangerous accumulation of flammable vegetation on landscapes, contributing to megafires that risk human life and property, and permanently destroy ecosystems. Small controllable fires can dramatically reduce the risk of large fires that are uncontrollable. BurnPro3D is a decision support platform to help the fire response and mitigation community understand risks and tradeoffs quickly and accurately to more effectively manage wildfires or conduct controlled burns.
To achieve this vision, this project is developing specific AI innovations to: (i) Use knowledge management techniques to fuse data coming from diverse sources and prepare it for fire modeling; (ii) Conduct physics-based machine learning within next-generation fire models to use deep learning to understand complex processes that drive fire behavior; (iii) Apply constraint optimization methods to address complex tradeoffs in the decision process for the placement and timing of controlled burns; (iv) Employ explainable AI to increase the interpretability of data and models by diverse users all along the decision-making chain.
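Point (iii), constraint optimization for the placement and timing of controlled burns, can be illustrated with a toy budgeted-selection problem: choose which candidate burns to conduct under a resource limit so that total expected risk reduction is maximized. The sites, costs, and benefits below are invented; a real planner would model fire spread physics rather than fixed per-site benefits.

```python
# Toy sketch of point (iii): choose controlled burns under a resource budget
# to maximize total expected risk reduction. Exhaustive search is fine at this
# scale; real planners use constraint optimization over fire-spread models.
# Sites, costs, and benefit numbers are invented for illustration.

from itertools import combinations

# (site, cost in crew-days, expected risk reduction)
candidates = [("ridge", 3, 10.0), ("canyon", 2, 6.0), ("valley", 4, 12.0)]
BUDGET = 6

best_plan, best_benefit = (), 0.0
for r in range(len(candidates) + 1):
    for plan in combinations(candidates, r):
        cost = sum(c for _, c, _ in plan)
        benefit = sum(b for _, _, b in plan)
        if cost <= BUDGET and benefit > best_benefit:
            best_plan, best_benefit = plan, benefit

print([site for site, _, _ in best_plan], best_benefit)
# ['canyon', 'valley'] 18.0
```

Note the tradeoff the search captures: the single best site ("valley") plus the cheapest one ("canyon") beats any plan containing "ridge" within the six-crew-day budget, which is exactly the kind of non-obvious combination a decision-support platform surfaces.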
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.976 |