2010 — 2015 |
Li, Bin; Peng, Lu |
N/A |
SHF: Small: Exploring Statistical Models to Optimize Hardware and Software Under Processor Reliability Constraints @ Louisiana State University & Agricultural and Mechanical College
With technology scaling coupled with increasing power densities, modern processors are susceptible to both soft errors and hard errors. Reliability analysis of multi-threaded processors, e.g., Simultaneous Multithreading (SMT) and Chip-Multiprocessors (CMP), where inter-thread resource contention exists, is a relatively unexplored area. Furthermore, the modeling complexity is exacerbated by two additional factors: (1) the increasing number of cores on a chip; and (2) the heterogeneity introduced by manufacturing process variation. On the software side, traditional compiler designs aim at high performance and, more recently, low power when generating object code. With increasing hardware vulnerabilities, however, high performance computing programs suffer from unexpected errors and exceptions. These can be mitigated with fault-tolerance techniques such as error detection and checkpointing, but such techniques still hurt performance. Beyond relying on a reliable hardware platform, software designers can further improve system reliability by generating error-resilient code. Moreover, analysis of software's architectural vulnerability is still at an ad hoc stage. Therefore, this project proposes a predictive framework that addresses the above challenges by employing modern statistical and machine learning methods. The outcomes of this project include a predictive framework that guides reliability-aware software and hardware optimization, together with its applications to high performance computing.
The broader impact plans include outreach activities and undergraduate and graduate training. The interdisciplinary nature of the proposed work allows students to learn cutting-edge knowledge from different areas, broadening the scope of their training and enhancing their productivity. Students from under-represented groups will be encouraged to join the project and given priority in doing so.
|
0.943 |
2013 — 2014 |
Peng, Lu |
N/A |
Student Travel Support For the 20th Annual IEEE International Symposium On High Performance Computer Architecture (HPCA 2014) @ Louisiana State University & Agricultural and Mechanical College
The award supports student attendance at the 20th Annual IEEE International Symposium on High Performance Computer Architecture (HPCA 2014), to be held February 15-19, 2014 in Orlando, Florida. The HPCA Symposium, a top-tier conference on computer architecture research, has enjoyed a rich history of success. The research results presented at HPCA conferences have had a strong impact on the field and are highly regarded in the computer architecture community. Students have much to gain from attending HPCA, such as learning state-of-the-art methodologies, being exposed to novel techniques, and interacting with senior researchers in their chosen area of expertise.
The importance of formal verification is becoming widely accepted. Increasing student attendance at HPCA will expose more students to the exciting developments in the field and, in the long run, will help build a broad base of expertise that benefits both industry and academia.
|
0.943 |
2014 — 2017 |
Srivastava, Ashok (co-PI); Choi, Jin-Woo (co-PI); Peng, Lu |
N/A |
SHF: Small: Leveraging Dynamic Pin Switching to Power Up Dark Silicon and Increase Off-Chip Bandwidth @ Louisiana State University & Agricultural and Mechanical College
With the end of Dennard scaling, power density no longer stays constant as transistors get smaller. As a result, a large number of transistors on modern chip multi-processors must be left inactive or significantly under-clocked in order to comply with the power budget and prevent overheating. This so-called "dark silicon" is one of the most critical constraints that will hinder scaling in accordance with Moore's Law in the future. Additionally, off-chip memory bandwidth has proven to be a major performance-limiting factor, especially for multi- and many-core processors.
To address these concerns, this project proposes a novel design in which off-chip pins can dynamically switch between supplying power and transmitting signals. The circuit implementation for the proposed dynamic pin switching design, requiring only minor changes to existing processor and motherboard circuitry, will be investigated. Many issues including the impact of interfacing with the DRAM and the power delivery network, signal transmission and integrity, thermal issues, and area overhead will be thoroughly and carefully considered. A switchable pin design could be used to mitigate dark silicon by delivering extra power or to boost processor performance by increasing memory bandwidth. The broader impacts include incorporating the research advances into undergraduate and graduate education, as well as K-12 outreach activities.
|
0.943 |
2015 — 2018 |
Peng, Lu |
N/A |
CSR: Small: Collaborative Research: Comprehensive Algorithmic Resilience (CAR) For Big Data Analytics @ Louisiana State University & Agricultural and Mechanical College
Big Data analytics is the process of mining useful knowledge from very large data sets, and it is critical to the advancement of many research and application fields. As manufacturing technology scales down processor feature sizes and power densities increase, modern computer systems suffer from potential hardware and software failures, which can manifest themselves as errors. Such errors can occur while long-running Big Data analytics execute on these systems, causing system crashes or, worse, returning undetected incorrect results. While numerous hardware-based resilience methods exist, they often come at the cost of significantly reduced power efficiency and substantially increased design complexity, among other drawbacks.
The project aims to reinforce popular Big Data analytics by embracing a host of comprehensive algorithmic resilience (CAR) software techniques, including concurrent error detection, coordinated checkpointing, and execution recovery, to achieve high execution resilience. By detecting potential hardware and software errors concurrently during analytics, CAR enables execution recovery from detected errors without the high overhead common to hardware-based resilience methods. Research activities of the project pursue six main objectives that address several technical challenges in realizing CAR, building on the investigators' encouraging preliminary results and prior work. The success of this project can benefit a wide range of scientific and industrial applications through better support for Big Data analytics and processing. Advances from this research are to be incorporated into undergraduate and graduate education, disseminated broadly through technical presentations and a website, and used to inspire high school students' interest in STEM.
|
0.943 |
2020 — 2022 |
Kutter, Thomas; Peng, Lu; Yan, Le; Byerly, Zachary (co-PI); Chen, Feng |
N/A |
CC* Compute: Deep Bayou: Accelerating Scientific Discoveries With a GPU Cluster @ Louisiana State University
Modern science increasingly relies on high performance computing (HPC) and data analytics to make discoveries about our world. In recent years, graphics processing units (GPUs), specialized hardware originally developed for graphics applications with a unique, highly parallel architecture, have become a key enabling technology for both types of workloads. Through this project, Louisiana State University (LSU) expands its existing computing facilities with the addition of Deep Bayou, a GPU cluster consisting of 12 compute nodes and 26 NVIDIA GPU devices.
The initial research projects enabled by Deep Bayou include particle physics, gravitational wave source characterization with LIGO, ocean and ocean-atmosphere modeling, bioinformatics, infrastructure modeling for disaster management, material sciences, computational biology, and fundamental GPU architecture design. The Deep Bayou infrastructure and organization are also open to additional initiatives and projects through an application process. In addition to serving researchers across the LSU campus, the project partners with the Open Science Grid (OSG) to share the GPU resources with the research community across the nation. Deep Bayou will make a significant contribution to building the future HPC workforce, as many training and education activities plan to leverage its availability. Through existing state- and federal-sponsored programs such as Louisiana Optical Network Infrastructure (LONI), Baton Rouge: Bringing Youth Technology, Education and Success (BRBYTES) and Research Experiences for Undergraduates (REU), K-12 students and undergraduates, many of whom are from underrepresented minority communities, can access courses and workshops hosted on the cluster.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.951 |