2012 — 2015
Bhattacharjee, Abhishek; Bianchini, Ricardo (co-PI)
SHF: Small: Heterogeneous Memory Architectures for Future Many-Core Systems @ Rutgers University New Brunswick
As computing devices become increasingly ubiquitous, they run software from a variety of domains, ranging from the mobile and graphics space all the way to high-performance, large-data server domains. This software must achieve high performance while consuming minimal power for environmental reasons. As a result, computing systems are becoming increasingly heterogeneous, accommodating diverse hardware structures conducive to different software domains on a single platform. While processors have traditionally been the focal point for this proliferation of heterogeneity, memory systems continue to be architected using traditional means. As such, a major challenge in forward-looking heterogeneous computing systems is how best to design memory systems to support this heterogeneity. These designs are crucial to ensure that computing systems continue to adapt to their varied software, performance, and energy requirements.
This research provides the foundation to construct memory systems using a variety of architectures and technologies to run software from a variety of domains in a high-performance, yet energy-efficient manner. This work focuses on (1) constructing a systematic design methodology, software characterizations, and analytical models that guide how diverse current and future memory technologies should be combined to best support heterogeneous computing systems software, and (2) designing a range of detailed, yet flexible experimental platforms on which to conduct such studies. This project impacts society in a number of ways, most keenly by (1) reducing the energy consumption of current and future heterogeneous computing systems without sacrificing their high performance; (2) providing a systematic and rigorous scientific means to match different memory technologies and components to different software requirements; and (3) creating the evaluation tools necessary to accommodate a host of future memory subsystem research.
2013 — 2019
Bhattacharjee, Abhishek |
CAREER: Cross-Core Learning in Future Manycore Systems @ Rutgers University New Brunswick
As computing devices solve increasingly complex and diverse problems, engineers seek to design processors that provide higher performance while remaining energy-efficient for environmental reasons. To achieve this, processor vendors have embraced manycore devices, where thousands of cores cooperate on a single chip to solve large-scale problems in parallel. They have further incorporated heterogeneity, combining cores with different architectures on a single chip in a bid to provide ever-increasing performance per watt. This project advances the search for energy-efficient performance by inventing novel cross-core learning techniques. Cores in current chips individually learn about the behavior of parallel programs in order to run those programs more efficiently in the future, devoting complex and power-hungry hardware structures to the task. However, this research observes that parallel programs tend to exercise the hardware structures of different cores in correlated ways, meaning that the behavior of the program run on one core can be communicated to other cores for various performance and power benefits. As such, this form of intelligent cross-core information exchange is effective in achieving high performance per watt across computing domains, from datacenters to embedded systems.
In this light, this research provides techniques to deduce how similarly a parallel program's various threads exercise their cores' hardware structures, looking at a range of programmer, compiler, and architectural mechanisms to do so. When similarity is detected, cross-core learning hardware gleans the information that is most useful to exchange to improve performance or power, and then transmits this information among heterogeneous cores using low-overhead hardware/software techniques. This project develops a lightweight runtime software layer to orchestrate this information exchange, relying on dedicated hardware support when necessary. Through this framework, cross-core learning is applied to a number of specific cases, ranging from higher-performance manycore cache prefetching and branch prediction, to performance- and power-management techniques for interrupts and exceptions in scale-out systems, to thread and instruction scheduling. Furthermore, this project broadly disseminates knowledge on how to design and program large-scale manycore (or scale-out) systems by involving students at the graduate, undergraduate, and high-school levels through active research and coursework. Overall, this work impacts the engineering community and broader society by: (1) helping to achieve high-performance, but also energy-efficient and environmentally friendly, computing systems; (2) providing academics and chip designers a design methodology and infrastructure to study manycore design; (3) broadening the participation of underrepresented groups in computer science; and (4) educating graduate, undergraduate, and high-school students on parallel programming for manycore systems.
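The core idea of cross-core learning (state learned by one core's predictive hardware warming up a sibling core running a correlated thread) can be illustrated with a toy sketch. The stride prefetcher, its table layout, and the share_learning helper below are hypothetical simplifications for illustration, not the project's actual design:

```python
# Toy model of cross-core learning: a per-core stride prefetcher learns
# (PC -> stride) patterns from one thread's access stream, and its trained
# table is copied to a peer core running a correlated thread.

class StridePrefetcher:
    def __init__(self):
        self.table = {}  # PC -> (last_addr, stride)

    def train(self, pc, addr):
        """Observe an access; return a prefetch address when a stable stride repeats."""
        last, stride = self.table.get(pc, (None, None))
        new_stride = addr - last if last is not None else None
        self.table[pc] = (addr, new_stride)
        if new_stride is not None and new_stride == stride:
            return addr + new_stride  # stride confirmed twice: issue prefetch
        return None

def share_learning(src, dst):
    """Cross-core exchange: copy one prefetcher's trained table into a peer's."""
    dst.table = dict(src.table)

# Core 0 trains on a regular stream (stride of 8 bytes) from thread A.
core0 = StridePrefetcher()
for addr in range(0x1000, 0x1040, 8):
    core0.train(pc=0x400, addr=addr)

# Core 1, about to run a correlated thread, starts warm instead of cold.
core1 = StridePrefetcher()
share_learning(core0, core1)
prediction = core1.train(pc=0x400, addr=0x1040)  # the stride of 8 continues
```

Here core 1 issues a useful prefetch on its very first access because it inherits core 0's trained table, instead of paying the usual cold-start training cost.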
2013 — 2017
Bhattacharjee, Abhishek; Bianchini, Ricardo
SHF: Small: Taming the Combinatorial Explosion of Power Management for Future Manycore Systems @ Rutgers University New Brunswick
We will soon enter the manycore era, in which systems will include hundreds of cores, large multi-level cache structures, sophisticated networks-on-chip, many memory controllers, and huge amounts of main memory. These systems will also embody a rich array of power management mechanisms and will strive to achieve high performance under various power, energy, and thermal constraints. Unfortunately, the uncoordinated power management of a system's components can produce oscillating behavior, higher power/energy/temperature, and/or excessive performance degradation. Given their large number of hardware components and the wide spectrum of available mechanisms, manycore systems will have to coordinate the actions of their various power managers. Moreover, to decide on a (coordinated) course of action, these systems will have to comprehensively and rigorously reason about various performance and power/energy/temperature tradeoffs.
This project will develop global power coordination (GPC) to manage the combinatorial explosion of possible power management configurations in manycore systems. GPC will be realized using an engine that analyzes the space of possible power state configurations for the entire system. It will then decide which configurations are most appropriate, and use per-component power managers to actuate the proper settings. To optimize the search for good configurations, GPC will consider greedy search heuristics that prune the space by relying on novel techniques that estimate the benefit of power state changes, group resources with similar power management hooks, and leverage prior observed behavior and a hierarchical organization.
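As a deliberately simplified illustration of this kind of pruned search, the sketch below greedily downgrades whichever component costs the least performance per watt saved until a power budget is met. The component names, power/performance numbers, and benefit estimator are invented for illustration, not GPC's actual heuristics:

```python
# Greedy coordination sketch: rather than enumerating every combination of
# per-component power states (combinatorial explosion), repeatedly apply the
# single state change with the best estimated benefit until the budget fits.

def greedy_power_config(components, power_budget):
    """components: {name: [(power_watts, perf_score), ...]} sorted high to low.
    Start every component at its highest-power, highest-performance state and
    greedily downgrade the one that loses the least performance per watt saved."""
    state = {name: 0 for name in components}  # index into each state list

    def total_power():
        return sum(components[n][i][0] for n, i in state.items())

    while total_power() > power_budget:
        best = None
        for name, i in state.items():
            if i + 1 >= len(components[name]):
                continue  # no lower-power state left for this component
            p0, f0 = components[name][i]
            p1, f1 = components[name][i + 1]
            cost = (f0 - f1) / (p0 - p1)  # performance lost per watt saved
            if best is None or cost < best[0]:
                best = (cost, name)
        if best is None:
            break  # budget unreachable even at the lowest states
        state[best[1]] += 1
    return state, total_power()

# Hypothetical system: two cores and a network-on-chip, each with a few states.
system = {"core0": [(10, 100), (6, 80), (3, 50)],
          "core1": [(10, 100), (6, 80), (3, 50)],
          "noc":   [(8, 100), (5, 90)]}
config, watts = greedy_power_config(system, power_budget=20)
```

Each step is linear in the number of components, so the search touches only a handful of configurations instead of the full cross-product of power states.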
This project will broadly impact society in many ways. First, addressing the power, energy, and temperature problems of manycore systems can greatly impact the datacenters that run our Internet services and the high-performance systems that advance our science. Second, tackling these problems will address one of the key technological barriers in the computing industry. Third, the project will educate graduate, undergraduate, and high-school students, while broadening the participation of underrepresented groups in computer science.
2013 — 2017
Bianchini, Ricardo (co-PI); Bhattacharjee, Abhishek
XPS: CLCCA: Enhancing the Programmability of Heterogeneous Manycore Systems @ Rutgers University New Brunswick
As computing devices are used to solve increasingly complex and diverse problems with ever-increasing multidimensional data-sets, programmers are tasked with writing high-performance and energy-efficient code. To run this code, processor vendors are adopting heterogeneous systems, where conventional general-purpose cores are integrated with accelerators like graphics processing units (GPUs), cryptographic accelerators, database accelerators, and video encoders/decoders. To ensure the widespread adoption of these systems, it is essential that their programming models are effective and easy to use. Unfortunately, current programming models for these systems are challenging, requiring the programmer to explicitly allocate, manage, and marshal memory back and forth between cores and accelerators. As a result, software is often error-prone and buggy, and suffers overheads from data replication and movement. As future systems incorporate increasing levels of heterogeneity, this problem will worsen.
This project develops unified address spaces for cores and accelerators, a key component of an effective programming model. A unified address space (in both virtual and physical addresses) increases system programmability because: (1) programmers need not manually allocate and manage their data movement between hundreds of heterogeneous compute units; (2) the system automatically allocates, replicates, and migrates data among heterogeneous components as execution shifts; (3) these systems support new algorithms that require simultaneous core and accelerator access to common data structures (e.g., producer-consumer programs where CPUs and GPUs communicate through software task queues); and (4) programs become more portable across systems with alternative memory hierarchies. This work studies mechanisms to support these benefits (while maintaining high performance and low power) by developing novel hardware (e.g., new memory controllers, Translation Lookaside Buffer augmentations, shootdown mechanisms) and operating system (OS) support (e.g., new OS memory allocation mechanisms and support for page allocation, replication, and migration on heterogeneous systems and memory).
2017 — 2020
Nguyen, Thu; Bhattacharjee, Abhishek; Kremer, Ulrich (co-PI); Rodero, Ivan; Parashar, Manish (co-PI)
II-EN: Collaborative Research: Enhancing the Parasol Experimental Testbed for Sustainable Computing @ Rutgers University New Brunswick
This project will enhance an experimental datacenter for sustainable computing. Datacenters consume vast amounts of energy, totaling about 1.8% of US electricity usage in 2014. Thus, the energy efficiency, energy-related costs, and overall sustainability of datacenters are critical concerns. NSF funded an experimental green datacenter called Parasol, which has previously demonstrated that the combination of green design and intelligent software management systems can lead to significant reductions in energy consumption, carbon emissions, and cost. This project will update Parasol's energy sources, network technologies, and management software.
Running real experiments in live conditions on Parasol led to findings that were not possible in simulation. This project seeks to update and enhance Parasol with current- and next-generation power-efficient servers; improve network connectivity and integrate software-defined networking (SDN) and Wi-Fi capabilities; increase solar energy generation capacity; add a low-emission fuel-cell power source; diversify energy storage; and improve the cooling system to advance green computing. The investigators will update and enhance Parasol's software stack for monitoring, programmatic control, and remote access to support the new hardware. The research targets resource management in green datacenters, including coordinated workload, cooling, and energy scheduling against environmental and load variability, to maximize the benefits of green datacenters and to help improve grid power management. Another goal is to leverage accelerators such as GPUs and deep-learning hardware, which promise excellent performance-per-watt ratios.
2018 — 2021
Bhattacharjee, Abhishek |
SHF: Small: Architectural Techniques for Energy-Efficient Brain-Machine Implants @ Rutgers University New Brunswick
This project focuses on the development of neural prostheses, or brain implants, to advance the scientific community's understanding of how the brain works and to take a step toward devising treatments for neurological disorders. Brain implants are devices that are surgically embedded under the skull (of animals in scientific experiments, or of humans in the treatment of neurological disorders) and placed on brain tissue, where they stimulate and record from hundreds of neurons. These devices are being used today to record neuronal electrophysiological data to unlock mysteries of the brain; to treat symptoms of Parkinson's disease, Tourette's syndrome, and epilepsy with techniques like deep brain stimulation; and to offer treatment to those afflicted by paralysis or spinal cord damage via motor cortex implants. A key design issue with brain implants is that they are highly energy-constrained: because they are embedded under the skull, techniques like wireless power can heat up the brain tissue surrounding the implant. This project offers architectural techniques to lower the power consumption and energy usage of processing elements integrated on brain implants, whether they are general-purpose processors, customized integrated circuits, or programmable hardware. In tandem with its scientific studies, this project integrates an educational component to train high-school students, undergraduates, and PhD students in neuro-engineering techniques crucial to society's continued efforts to shed light on how the brain works.
In terms of technical details, this project performs the first study of architectural techniques to improve the energy efficiency of embedded processors on implants by leveraging their existing low-power modes. Low-power modes can be used in the absence of interesting neuronal activity, which corresponds to periods of time when the implant is not performing useful work and the processor can be slowed down. A critical theme of this project is to show that hardware traditionally used to predict program behavior (e.g., branches or cache reuse) can be co-opted to predict brain activity, and hence anticipate interesting and non-interesting neuronal spiking. Such predictors can consequently be used to drive the implant processor in and out of low-power mode. This project studies how to design hardware brain activity predictors that predict neuronal activity accurately, scalably, and efficiently, and how to integrate such predictors with low-power modes on commodity embedded processors. The techniques are drawn from hardware machine-learning approaches for program prediction and consider neuronal spiking data extracted from brain sites in mice, sheep, and monkeys. Successful deployment of these approaches is expected to save as much as 85% of processor energy, effectively quadrupling battery lifetimes on implants being designed for mice.
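The idea of repurposing program-behavior predictors for neuronal activity can be sketched with a toy predictor analogous to a two-bit branch predictor: 2-bit saturating counters, indexed by recent spike history, decide whether to enter a low-power mode. The structure, parameters, and trace below are illustrative, not the project's actual hardware:

```python
# Toy spike predictor gating a low-power mode, modeled on a two-level
# branch predictor: a shift register of recent outcomes indexes a table
# of 2-bit saturating counters.

class SpikePredictor:
    HISTORY = 4  # bits of spike history used as the table index

    def __init__(self):
        self.history = 0
        self.counters = [1] * (1 << self.HISTORY)  # start weakly "no spike"

    def predict(self):
        return self.counters[self.history] >= 2  # True: expect activity

    def update(self, spiked):
        c = self.counters[self.history]
        self.counters[self.history] = min(3, c + 1) if spiked else max(0, c - 1)
        self.history = ((self.history << 1) | int(spiked)) & ((1 << self.HISTORY) - 1)

def run(trace):
    """Count cycles the processor could spend in its low-power mode."""
    pred, sleep_cycles = SpikePredictor(), 0
    for spiked in trace:
        if not pred.predict():
            sleep_cycles += 1  # no activity expected: enter low-power mode
        pred.update(spiked)
    return sleep_cycles

# Mostly quiet trace with one short burst, as in idle neural recordings.
trace = [False] * 50 + [True] * 5 + [False] * 50
saved = run(trace)
```

On this mostly quiet trace the predictor keeps the processor in low-power mode for the vast majority of cycles; a real design would weigh such savings against mispredictions that delay spike processing.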
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2020 — 2021
Cohen, Jonathan; Bhattacharjee, Abhishek; Willke, Theodore
NSF Convergence Accelerator - Track D: A Standardized Model Description Format for Accelerating Convergence in Neuroscience, Cognitive Science, Machine Learning, and Beyond
The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future. Accelerating convergence in science and technology depends on the ability to represent and share not only data, but also theories and models in the most objective, transparent, and reproducible way possible. This project will develop a Model Description Format (MDF) that can be used for computational models that span from neuroscience and psychology to machine learning, and that can serve as the foundation for extensions that serve an even broader scope of models in population biology and the social sciences.
Such an MDF would have numerous benefits, both scientific and technological, including: dissemination and validation of model reproducibility; migration of models across domains (e.g., use of models of brain function in machine learning applications); integration of models at different levels of analysis (e.g., biophysically-realistic neural models into models of cognitive function, cognitive models as agents in population level models); exploitation of complementary strengths of existing packages (e.g., design in a familiar environment but execute in one with better tools for parameter tuning and/or data-fitting); and more efficient development of new tools, by providing developers with a representative diversity of models, all in a common format.
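The reproducibility benefit of a shared format can be illustrated with a toy declarative model and a tiny reference interpreter. The schema and operations below are invented for illustration and are not the actual MDF specification:

```python
# A model declared as data (nodes, parameters, edges) rather than code,
# so any conforming tool can load, validate, and execute it.
import json

model = {
    "format": "toy-mdf/0.1",  # hypothetical format tag for this sketch
    "nodes": {
        "input":  {"op": "const", "params": {"value": 2.0}},
        "square": {"op": "mul", "inputs": ["input", "input"]},
        "scaled": {"op": "mul_const", "params": {"factor": 3.0},
                   "inputs": ["square"]},
    },
    "outputs": ["scaled"],
}

def evaluate(model):
    """Tiny reference interpreter: resolve each node once, in dependency order."""
    values = {}

    def resolve(name):
        if name in values:
            return values[name]
        node = model["nodes"][name]
        ins = [resolve(i) for i in node.get("inputs", [])]
        if node["op"] == "const":
            v = node["params"]["value"]
        elif node["op"] == "mul":
            v = ins[0] * ins[1]
        elif node["op"] == "mul_const":
            v = ins[0] * node["params"]["factor"]
        else:
            raise ValueError(f"unknown op {node['op']}")
        values[name] = v
        return v

    return {out: resolve(out) for out in model["outputs"]}

# The same document round-trips through JSON and evaluates identically.
reloaded = json.loads(json.dumps(model))
result = evaluate(reloaded)
```

Because the model is pure data, any tool that understands the schema can load, validate, execute, or translate the same document, which is what enables model exchange across packages and domains.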
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2021 — 2022
Shao, Zhong; Zhong, Lin (co-PI); Bhattacharjee, Abhishek; Khandelwal, Anurag
PPoSS: Planning: High-Performance Certified Trust for Global-Scale Applications
A global-scale public infrastructure of distributed computing resources, in the form of datacenters of various scales, has emerged in the past decade. Today, a user of this global infrastructure must trust the infrastructure vendors based on their informal textual contracts. This trust model provides limited legal protection of user interests and has become a key barrier to migrating more services into the public infrastructure, stymieing innovation and competition. This project's key novelty is to build highly performant, certified execution environments (CEEs) for large-scale distributed systems. In doing so, the project explores, refines, and discovers design principles for scaling certified trust: scaling up to include the entire software stack, and scaling out to include globally distributed resources. The project's main impact is to enable and promote trustworthy, performant, cost-effective uses of the public global infrastructure, empowering applications and services for a global market. Specifically, it will lower the barrier to entry for startups in a global market and, as a result, foster competition and innovation and make information technologies more accessible. It is intended to profoundly change many industries that traditionally rely heavily on proprietary IT infrastructures, e.g., mobile networks.
The project makes three related scientific contributions. First, it contributes new technologies for building distributed CEE enclaves for running global-scale applications. CEEs extend remote attestation (as in trusted execution environments (TEEs)) with formal verification, so the chain of trust can be used to establish not only the authenticity of enclave binaries but also their trustworthiness properties. Second, it provides hardware and software support to accelerate the underlying mechanisms for isolation, integrity, and confidentiality; these range from better isolation support in CPUs and TEEs to fast mechanisms for emerging hardware accelerators. Finally, the team explores the extension of certifiably trustworthy execution environments to emerging disaggregated datacenter designs using a software-defined-network-based decomposition of functionalities. The insights gleaned from this study guide the development of new algorithm-driven, data-structure-driven, and hardware-driven solutions for trustworthy disaggregated cloud design. During the planning stage, the investigators are developing a prototype testbed to evaluate the feasibility of building a high-performance, trustworthy, global-scale mobile network using cloud-scale disaggregated CEEs. They are compiling a list of challenges that will become the central research agenda for a full-scale, large proposal.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2022 — 2025
Zhong, Lin; Puri, Shruti; Schoelkopf, Robert (co-PI); Ding, Yongshan; Bhattacharjee, Abhishek
MRI: Development of PARAGON: Control Instrument for Post-NISQ Quantum Computing
This project will design and implement PARAGON, an instrument of control systems for superconducting circuit-based quantum computers. Using an ultra-low-latency, scalable network of Field-Programmable Gate Array (FPGA) accelerators, PARAGON will support real-time measurement, error correction, and control of hundreds of qubits. The project will also develop the necessary systems, programming, and debugging support for realizing and evaluating new quantum hardware and algorithms with PARAGON. PARAGON will substantially advance the Nation's research capabilities in quantum computing, enabling operational tests of error-corrected algorithms and accelerating the arrival of fault-tolerant quantum computing.

Toward cost-effective scalability, PARAGON employs a balanced fat tree to organize the large number of building blocks and to distribute data, clock, and time (trigger) signals. The leaves of the tree feature Radio Frequency System-on-Chip (RFSoC) devices for quantum control, and the internal nodes feature Multiprocessor System-on-Chip (MPSoC) devices for integration. PARAGON will empower two broad research communities that tackle quantum computing from different fronts. It will allow physicists to investigate the theory and realization of better qubits, and to experiment with sophisticated error-correction and fault-tolerance methods on real qubits, at a previously impossible scale. It will allow computer scientists to experiment with novel architectures and programming schemes for quantum control. Most importantly, it will serve as a meeting place for both communities, fostering cross-pollination and catalyzing collaboration. Through its open design and open-source software, PARAGON will empower the broad community of academic and industrial researchers in superconducting quantum computing to experiment in previously impossible ways.
While PARAGON will be implemented for quantum computers based on superconducting circuits, its design can be adapted to those based on other technologies, which face similar challenges in their control systems. It will provide critical know-how to the budding quantum control systems industry so that the latter can further lower costs for wider commercial availability. The instrument will advance research agendas in multiple disciplines, creating opportunities for cross-pollination among applied physics, computer science, and engineering. It will create new opportunities to engage both graduate and undergraduate students, especially underrepresented minorities and women, providing unique training for multidisciplinary research. Source materials produced by the project can be found at https://github.com/yale-paragon. The repositories will be actively maintained by the project team during the award period. During the lifetime of PARAGON, the repositories will transition into community-based development and maintenance, with the project team being one of the contributors. The project team will ensure the repositories remain available for at least five years after the lifetime of PARAGON's physical testbed.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.