2006 — 2011
Katz, Daniel; Allen, Gabrielle; Seidel, Edward; Twilley, Robert (co-PI); Wischusen, E. William; Kosar, Tevfik
MRI: Development of PetaShare: A Distributed Data Archival, Analysis and Visualization System for Data Intensive Collaborative Research @ Louisiana State University & Agricultural and Mechanical College
Review Analysis: Major Research Instrumentation (MRI) Program FY06
Proposal #: CNS 06-19843
PI(s): Kosar, Tevfik; Allen, Gabrielle D.; Seidel, Edward; Twilley, Robert R.; Wischusen, E. William
Institution: Louisiana State University, Baton Rouge, LA 70803-2701
Title: MRI/Dev: Development of PetaShare: A Distributed Data Archival, Analysis and Visualization System for Data Intensive Collaborative Research
Ratings: E, V, V, V
Panel Ranking: Competitive (C)
Result: Recommend
Amount Requested: $957,678
Amount Recommended: $957,678
Project Proposed:
This project develops a distributed data archival, analysis, and visualization instrument (called PetaShare) for data-intensive collaborative research. PetaShare enables transparent handling of the underlying data sharing, archival, and retrieval mechanisms and makes data available to the scientist for analysis and visualization on demand. Designed to scale to the petabyte level, the instrument responds to an urgent need of scientists with large data generation, sharing, and collaboration requirements. Involving five universities in the state (LSU, LaTech, Tulane, ULL, and UNO), the infrastructure consists of three layers of storage distributed across multiple sites: primary very high-speed RAM storage for data visualization; secondary disk storage for data analysis and processing; and tertiary tape storage for data archival and long-term studies. Unlike existing approaches, PetaShare treats data resources and the tasks related to data access as first-class entities, just like computational resources and compute tasks, rather than simply a side effect of computation. Expected key technologies include data-aware storage systems and data-aware schedulers, which take over from the user the responsibility of managing data resources and scheduling data tasks, performing these tasks transparently. The instrument supports many important data-intensive applications from different fields, including coastal and environmental modeling, geospatial analysis, bioinformatics, medical imaging, fluid dynamics, petroleum engineering, numerical relativity, and high energy physics.
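As a rough illustration of the tiered, data-aware placement described above, the following Python sketch (with hypothetical names and thresholds, not part of the proposal) shows how a data request might be mapped onto the RAM/disk/tape layers:

    from dataclasses import dataclass
    from enum import Enum

    class Tier(Enum):
        RAM = "primary: very high-speed RAM, for visualization"
        DISK = "secondary: disk, for analysis and processing"
        TAPE = "tertiary: tape, for archival and long-term studies"

    @dataclass
    class DataRequest:
        dataset: str
        size_tb: float
        purpose: str          # "visualize", "analyze", or "archive"

    def place(req: DataRequest) -> Tier:
        """Map a data request onto a storage tier, so the data task itself
        is a scheduled, first-class entity rather than a side effect."""
        if req.purpose == "visualize" and req.size_tb < 1.0:   # illustrative threshold
            return Tier.RAM
        if req.purpose == "analyze":
            return Tier.DISK
        return Tier.TAPE

    # Example: an on-demand visualization of a coastal-modeling dataset
    print(place(DataRequest("storm_surge_run42", size_tb=0.5, purpose="visualize")))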
Broader Impact: The system complements the high-performance computing resources at the five interconnected campuses in this EPSCoR state, boosting interdisciplinary research among them. In addition to directly serving and promoting research, PetaShare contributes to the training of hundreds of students. The system has high potential to increase the accuracy and efficiency of storm surge models and hurricane tracking predictions, thereby enabling rapid and effective disaster responses that could affect millions of people worldwide.
2009 — 2016
Kosar, Tevfik |
CAREER: Data-Aware Distributed Computing for Enabling Large-Scale Collaborative Science @ Louisiana State University & Agricultural and Mechanical College
CAREER: Data-aware Distributed Computing for Enabling Large-scale Collaborative Science
PI: Tevfik Kosar, Louisiana State University
Abstract
Applications and experiments in all areas of science are becoming increasingly complex and more demanding in terms of their computational and data requirements. Some applications generate data volumes reaching petabytes. Sharing, disseminating, and analyzing these large data sets becomes a big challenge, especially when distributed resources are used.
This Faculty Early Career Development (CAREER) project proposes a new distributed computing paradigm called "data-aware distributed computing", which will include a diverse set of algorithms, models, and tools for mitigating the data bottleneck in distributed computing systems and will support a broad range of data-intensive as well as dynamic data-driven applications. As part of this project, research and development will be performed on three main components: i) a data-aware scheduler, which will provide capabilities such as planning, scheduling, resource reservation, job execution, and error recovery for data movement tasks; ii) integration of these capabilities with other layers of distributed computing such as workflow planning, resource brokering, and storage management; and iii) further optimization of data movement tasks via dynamic tuning of the underlying protocol transfer parameters.
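A minimal sketch of what a data-aware scheduler's job-handling loop could look like, assuming hypothetical class and method names (this is not the project's actual implementation):

    import random
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class TransferJob:
        source: str
        destination: str
        attempts: int = 0

    @dataclass
    class DataAwareScheduler:
        """Toy scheduler that plans, executes, and retries data-movement
        tasks independently of any compute job."""
        queue: List[TransferJob] = field(default_factory=list)
        max_retries: int = 3

        def submit(self, job: TransferJob) -> None:
            self.queue.append(job)

        def run(self) -> None:
            while self.queue:
                job = self.queue.pop(0)
                job.attempts += 1
                if self._transfer(job):
                    print(f"done: {job.source} -> {job.destination}")
                elif job.attempts < self.max_retries:
                    self.queue.append(job)      # error recovery: re-queue and retry
                else:
                    print(f"giving up after {job.attempts} attempts: {job.source}")

        def _transfer(self, job: TransferJob) -> bool:
            return random.random() > 0.3        # stand-in for a real, possibly failing transfer

    sched = DataAwareScheduler()
    sched.submit(TransferJob("gsiftp://siteA/results.h5", "file:///scratch/results.h5"))
    sched.run()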
Research will be integrated into all levels of education, including science projects, seminars, and summer camps on data-intensive computing with K-12 students (where 99% are minority students); curriculum development, mentoring, and international student/intern exchange programs for undergraduate and graduate students; and summer internships and workshops specifically for the HBCU community, including faculty members. The tools and software developed in this project will be available to the public via open-source distribution.
2009 — 2013
Ott, Christian; Allen, Gabrielle; Loffler, Frank; Diener, Peter; Schnetter, Erik; Pullin, Jorge (co-PI); Kosar, Tevfik
Collaborative Research: Community Infrastructure for General Relativistic MHD (CIGR) @ Louisiana State University & Agricultural and Mechanical College
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5). The Community Infrastructure for General Relativistic Magnetohydrodynamics (CIGR) collaboration will create a modern, scalable, and open community toolkit and cyberinfrastructure for general relativistic magnetohydrodynamics (GRMHD). These tools will advance computational capabilities across the fields of numerical relativity and computational astrophysics and provide the collaborative infrastructure needed to accelerate the development of simulation codes able to accurately model grand-challenge problems such as the coalescence of binary neutron stars, core-collapse supernovae, and gamma-ray bursts.
CIGR includes four core thrusts that capitalize on accumulated experience with the Cactus framework, the scalable Carpet adaptive mesh refinement driver, and the Whisky code for general relativistic hydrodynamics developed by the European Union Astrophysics Network: (i) providing a featureful toolkit for GRMHD that research groups can use and extend to build their own cutting-edge production codes; (ii) providing an open code for GRMHD that integrates components for general relativistic hydrodynamics, microphysical equations of state, magnetohydrodynamics, and radiation transport; (iii) developing new enabling cyberinfrastructure for numerical relativity, including highly reliable and optimized input/output, distributed storage and archives, and data and simulation classification and provenance; and (iv) supporting these toolkits and tools on increasingly large and complex computing environments such as the NSF's TeraGrid and DOE's LCF, and preparing for soon-to-arrive petascale and data-intensive environments such as the NSF XD, Blue Waters, and DataNet programs.
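To make the toolkit idea concrete, here is a small, purely illustrative Python sketch of composing a simulation from interchangeable physics components; the names are hypothetical and do not reflect the actual Cactus/Carpet/Whisky interfaces:

    from typing import Callable, Dict

    # Hypothetical component registry, illustrating how research groups could
    # plug their own modules (equations of state, solvers, ...) into a shared toolkit.
    registry: Dict[str, Dict[str, Callable]] = {"eos": {}, "mhd": {}, "radiation": {}}

    def register(kind: str, name: str):
        def wrap(fn: Callable) -> Callable:
            registry[kind][name] = fn
            return fn
        return wrap

    @register("eos", "polytrope")
    def polytropic_pressure(rho: float, k: float = 100.0, gamma: float = 2.0) -> float:
        """Polytropic equation of state: P = K * rho**Gamma."""
        return k * rho ** gamma

    def build_run(eos: str) -> Dict[str, Callable]:
        """Assemble a run from registered components, as a group might when
        extending the toolkit into its own production code."""
        return {"eos": registry["eos"][eos]}

    run = build_run(eos="polytrope")
    print(run["eos"](1e-3))    # pressure for a sample rest-mass density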
2009 — 2013
Tohline, Joel; Kosar, Tevfik
STCI: Development of Stork Data Scheduler for Mitigating the Data Bottleneck in Petascale Distributed Computing Systems @ Louisiana State University & Agricultural and Mechanical College
This proposal will be awarded using funds made available by the American Recovery and Reinvestment Act of 2009 (Public Law 111-5), and meets the requirements established in Section 2 of the White House Memorandum entitled, Ensuring Responsible Spending of Recovery Act Funds, dated March 20, 2009.
The STCI Development of Stork Data Scheduler for Mitigating the Data Bottleneck in Petascale Distributed Computing Systems project will enhance the Stork data scheduler to mitigate the end-to-end data handling bottleneck in petascale distributed computing systems and make it available to a wide range of user communities as production-quality software. New functionalities of Stork will include: data aggregation and caching; early error detection, classification, and recovery; integration with workflow planning and management; optimal protocol tuning; and data transfer performance prediction services.
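One of the listed functionalities, optimal protocol tuning, can be illustrated with a simple heuristic: size the number of parallel streams to the bandwidth-delay product of the path. The sketch below uses hypothetical parameter values and is not Stork's actual tuning algorithm:

    import math

    def suggested_parallelism(bandwidth_mbps: float, rtt_ms: float,
                              tcp_buffer_kb: float = 4096) -> int:
        """Estimate how many parallel TCP streams are needed to fill the pipe:
        bandwidth-delay product divided by the per-stream buffer size."""
        bdp_bytes = (bandwidth_mbps * 1e6 / 8) * (rtt_ms / 1e3)
        return max(1, math.ceil(bdp_bytes / (tcp_buffer_kb * 1024)))

    # Example: a 10 Gbps path with a 50 ms round-trip time
    print(suggested_parallelism(bandwidth_mbps=10_000, rtt_ms=50))   # about 15 streams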
Intellectual Merit: The Stork data scheduler will make a distinctive contribution to petascale distributed computing in the areas of planning, scheduling, monitoring, and management of data placement tasks, and application-level, end-to-end optimization of networked I/O for petascale distributed applications. Unlike existing approaches, it will treat data resources and the tasks related to data access and movement as first-class entities, just like computational resources and compute tasks, rather than simply a side effect of computation.
Broader Impact: This project will impact not just traditionally compute-intensive disciplines in science and engineering, but also emerging computational areas in the arts, humanities, business, and education. The PI will collaborate with other leading institutions in the area of distributed data management, such as LBNL, ISI/USC, UNC, UCSD, and the University of Chicago/Argonne, to integrate Stork with their data management solutions and disseminate it to their user communities. The comprehensive education component of the project will include science projects and summer training camps on data-intensive computing with K-12 students (where 99% are minority students), undergraduate and graduate student training, an international student/intern exchange program, and minority workshops.