1997 — 2003 |
Scheuermann, Peter (co-PI); Lee, D. (co-PI); Banerjee, Prithviraj; Sarrafzadeh, Majid (co-PI); Choudhary, Alok (co-PI); Taylor, Valerie; Hauck, Scott |
N/A |
Cise Research Infrastructure: a Distributed High-Performance Computing Infrastructure @ Northwestern University
CDA-9703228. Prithviraj Banerjee, Northwestern University. A Distributed High-Performance Computing Infrastructure. This award is for the acquisition of 20 high-end UNIX workstations, 50 low-end UNIX workstations, three UNIX fileservers, an 8-processor distributed shared memory multiprocessor, and a 64-port ATM switching hub. The machines would be networked together using high-speed OC-3 ATM networks with bandwidths of 155 Mbps. As the use of high-speed networking moves from the laboratory to the workplace, new opportunities arise for the design and implementation of a high-speed distributed computing environment. The goals of this project are: (1) to explore the use of high-speed networking and computing to investigate file systems and data management issues for high-performance distributed computing, (2) to investigate parallel programming support for networks of high-speed workstations and personal computers as an alternative to stand-alone parallel computers, (3) to study high-performance computer-aided design of electronic systems in a heterogeneous environment and to develop a Web-based CAD computing center that takes advantage of high-speed networking, and (4) to explore new instructional techniques that take advantage of the high bandwidth and high speed.
|
1 |
1999 — 2004 |
Hauck, Scott |
N/A |
Career: a Logic Emulation Infrastructure For Research and Teaching @ Northwestern University
CCR-9875564 Hauck
Logic emulation systems provide an essential tool for the design of complex circuits, but there are significant obstacles to the design of the mapping software that translates designs to the emulator. Such obstacles limit the potential to use logic emulation to simulate designs. This research is exploring new algorithms, distributed computing techniques, and quality/performance tradeoffs needed to increase the performance of the mapping software. The project is also developing a publicly available mapping flow to aid research into these systems, as well as a set of large, realistic benchmark circuits. Finally, the project is pursuing methods for integrating these devices into logic design curricula, allowing near-speed execution of student designs and much more realistic design experiences.
|
1 |
2000 — 2007 |
Ebeling, Carl; Allstot, David; Hauck, Scott; Liu, Hui; Bilmes, Jeffrey (co-PI) |
N/A |
Itr: Heterogeneous System Integration in System-On-a-Chip Designs @ University of Washington
This project studies the integration of heterogeneous resources into a system-on-chip (SOC) solution. Heterogeneous SOC integration supports the fabrication of RF, analog, high-performance digital, and reconfigurable subsystems within a single piece of silicon, and includes issues of simulation, design, integration, test, and education. An example SOC is a human/machine transducer chip that provides a speech recognition interface to a ubiquitous wireless network. Such a system represents a standard interface modality. Multiple topics are being researched, including low-power speaker identification, speech processing algorithms, and hardware implementations. Low-power, high-performance wireless protocols are also being developed to support the asymmetric communication loads: sending low-bandwidth control messages produced from the recognized speech, and receiving high-bandwidth information in return for visual, audio, and other feedback to the user.
|
0.952 |
2001 — 2007 |
Ebeling, Carl; Allstot, David (co-PI); Sechen, Carl (co-PI); Soma, Mani (co-PI); Hauck, Scott |
N/A |
Cise Research Infrastructure: An Infrastructure For Integrated Systems Education and Innovation @ University of Washington
0101254 Scott A. Hauck University of Washington
CISE Research Infrastructure: An Infrastructure for Integrated Systems Education and Innovation
The research contained in this proposal represents a wide-ranging investigation into the future of single-chip systems. We will seek to develop a design methodology that can provide the benefits of multiple different resource types for numerous design domains. To support the design of such cutting-edge silicon systems, we will develop innovative techniques to handle numerous design issues. These will include investigations into the following critical issues in chip design: (1) development of techniques for integrating RF and analog components into future 1V SoC designs; (2) creation of high-performance, power-efficient digital logic families for supporting the stringent requirements of these systems; (3) investigation into reconfigurable subsystems for SoC designs, providing post-fabrication customization for support of multi-protocol and multi-algorithm systems; (4) integrated testing methodologies for complex, heterogeneous systems that can provide complete system test; and (5) complete simulation and design methodologies that can handle complete system integration, architectural exploration, and validation. In addition to the development of new approaches to future chip design, we will also develop innovative techniques for educating future chip designers. By providing an integrated curriculum in VLSI/CAD, embedded systems, and complex system design, we will help create system architects capable of harnessing these radically new design techniques and opportunities. We will also seek to increase the opportunities in chip design for new constituents, especially under-represented groups, to help increase the pipeline of new designers.
|
0.952 |
2004 — 2008 |
Hauck, Scott |
N/A |
Achieving High-Performance Reconfigurable Computing in Commodity Devices @ University of Washington
FPGAs are chips that can be programmed and reprogrammed to implement complex digital logic. They combine the performance of hardware with the flexibility of software. Potential applications range from hardware accelerators for high-performance computers to everyday electronic devices. Thus, improving their performance is of significant interest. Although FPGAs can provide high performance for a wide range of applications, their achieved clock rates are typically 5x-10x slower than those of other circuits. This is due to the programmable nature of the underlying hardware, as well as the limitations in the input circuits. FPGAs can support much higher theoretical clock rates than can currently be achieved in practice. This research will develop the architectural features and tools needed to realize this potential. This approach will combine established techniques with new algorithms for generating and mapping highly pipelined circuits. The key is to allow for very significant levels of circuit pipelining in situations that demand it, while trading area for performance. We will also optimize the FPGA architectures to support pipelining, while not adversely affecting general-purpose designs. This will include optimized logic blocks that can support aggressive pipelining, as well as routing designed for interconnect pipelining. This proposal contains new approaches to radically increase the speed of FPGAs, a major building block in today's electronics systems. By providing faster hardware, we can provide greater flexibility, capabilities, and speed in many different systems. These may include high-end computers with reconfigurable hardware units, and versatile electronics like multi-network, multi-service cell phones and enhanced multi-media capable PDAs.
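The area-for-performance trade mentioned in the abstract can be illustrated with a small sketch. It is not the project's algorithm: it assumes a single chain of logic delays, a fixed per-stage register overhead, and a hypothetical helper `min_clock_period` that finds the best way to cut the chain into a given number of contiguous pipeline stages. Each extra stage costs registers (area) but shortens the slowest stage, which sets the clock period.

```python
from typing import List


def min_clock_period(delays: List[float], stages: int, reg_overhead: float) -> float:
    """Smallest clock period when a chain of logic delays is cut into
    `stages` contiguous pipeline stages (illustrative model only)."""
    n = len(delays)
    if not 1 <= stages <= n:
        raise ValueError("stages must be between 1 and len(delays)")

    # prefix[i] is the total delay of the first i elements of the chain.
    prefix = [0.0]
    for d in delays:
        prefix.append(prefix[-1] + d)

    inf = float("inf")
    # best[k][i]: minimal worst-stage delay when the first i elements use k stages.
    best = [[inf] * (n + 1) for _ in range(stages + 1)]
    best[0][0] = 0.0
    for k in range(1, stages + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                worst = max(best[k - 1][j], prefix[i] - prefix[j])
                if worst < best[k][i]:
                    best[k][i] = worst

    # Assumed model: every stage pays the same register setup/clock-to-output cost.
    return best[stages][n] + reg_overhead


if __name__ == "__main__":
    chain = [2.0, 3.0, 1.0, 4.0]  # hypothetical delays in nanoseconds
    for s in range(1, len(chain) + 1):
        print(s, "stage(s):", min_clock_period(chain, s, 0.5), "ns")
```

In this toy model the period drops from 10.5 ns with one stage to 4.5 ns with four, at the cost of three additional sets of pipeline registers.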
|
0.952 |
2007 — 2011 |
Ebeling, Carl; Hauck, Scott |
N/A |
Tools For Exploring Low-Power, High-Performance Reconfigurable Computing Architectures @ University of Washington
Proposal ID: 0702621. PI Institution: University of Washington, Seattle. PI name: Carl Ebeling. Title: Tools for Exploring Low-power, High-performance Reconfigurable Computing Architectures
Abstract: Creating high-performance, low-power systems for embedded communications, media and signal-processing applications in future nanoscale technologies will require new parallel computing architectures that can be reconfigured on-the-fly to match the structure of the application. Computing infrastructures based on these coarse-grained reconfigurable platforms will be better able to meet the future performance and power demands of embedded applications than traditional processor-based architectures.
The potential design space of these large-scale coarse-grained reconfigurable computing architectures is enormous, and efficient, architecture-independent tools will be needed to quickly explore and evaluate the different points in this space for different applications and application domains. This research project will develop architecture-independent tools that map the intermediate representations of algorithms produced by compilers to coarse-grained reconfigurable architectures comprising a wide range of architectural features. These tools will rely on integrating scheduling, placement and routing algorithms to perform this mapping. They will enable the fast exploration of the overall space of large-scale coarse-grained reconfigurable architectures, as well as the impact of specific architectural features and structures.
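To make the scheduling part of such a mapping flow concrete, here is a minimal sketch of resource-constrained list scheduling for a dataflow graph. It is an illustration under stated assumptions, not the project's tools: the function `list_schedule`, the unit-latency operations, and the single `units` parameter standing in for an architecture description are all hypothetical, and placement and routing are left out entirely.

```python
from typing import Dict, List


def list_schedule(deps: Dict[str, List[str]], units: int) -> Dict[str, int]:
    """Assign each operation a cycle so that all of its predecessors finish
    first and at most `units` operations issue per cycle (unit latency)."""
    if units < 1:
        raise ValueError("units must be at least 1")

    schedule: Dict[str, int] = {}
    cycle = 0
    while len(schedule) < len(deps):
        # An operation is ready once every predecessor ran in an earlier cycle.
        ready = sorted(
            op for op in deps
            if op not in schedule
            and all(p in schedule and schedule[p] < cycle for p in deps[op])
        )
        if not ready:
            raise ValueError("dependency graph has a cycle or a missing operation")
        # Issue as many ready operations as there are compute units.
        for op in ready[:units]:
            schedule[op] = cycle
        cycle += 1
    return schedule


if __name__ == "__main__":
    # Hypothetical dataflow graph: operation -> operations it depends on.
    graph = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": ["a"]}
    for n_units in (1, 2):
        result = list_schedule(graph, n_units)
        print(n_units, "unit(s):", result, "length", max(result.values()) + 1)
```

Rerunning the same graph with different `units` values is a crude stand-in for evaluating different points in an architecture space: the schedule length changes while the algorithm description stays the same.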
|
0.952 |
2011 — 2015 |
Ebeling, Carl; Hauck, Scott |
N/A |
Shf: Small: Cgras - Control and Architecture For Next-Generation Fpgas @ University of Washington
This research effort is developing new electronic devices and mapping software to improve the speed, power-efficiency, and cost of digital electronics. They start with the concept of the Field-Programmable Gate Array (FPGA), a logic chip that can be programmed and reprogrammed to implement complex digital circuits. FPGAs are an important driver for the semiconductor industry, reaching almost $3B in annual worldwide sales. Current FPGAs are essentially seas of 1-bit compute units, each configured to do one function over and over. To support more complex operations, modern devices have a sprinkling of more complex units, including multipliers and memories, which have a more multi-bit flavor. All the components of these devices are interconnected via a static, single-bit routing network, and are primarily programmed in hardware description languages such as Verilog or VHDL. An FPGA's single-bit programmability provides a great deal of flexibility for creating arbitrary logic, but has significant inefficiencies as well. Word-based architectures, which compute and route multi-bit values simultaneously, can be much more efficient than standard FPGAs. Word-based alternatives to FPGAs exist, such as CGRAs and MPPAs, but limitations in their control systems significantly reduce their quality and usefulness for many applications. One of the major thrusts of this work is to merge the customizable logic of FPGAs with the time-multiplexing ability of MPPAs and CGRAs, as well as the complex control flow supported by modern multi-core CPUs. Unlike a standard FPGA, which statically configures all of its resources to do a single task, this system allows each compute element in the device to run a small program. This provides a significantly greater compute density in these devices. However, to boost this even further, they are exploring mechanisms to make use of branching and conditional operation.
Specifically, where a microprocessor might take a branch based upon a loop condition or as part of an if-then-else construct, their hardware system can either change the instructions loaded during that cycle, or branch to a different portion of the overall operation. However, unlike MPPAs and CGRAs, their system can perform data-dependent instruction selection within a large, automatically mapped computation region operating in lock-step. Alternatively, for control-heavy portions of a computation they can embed complete, simple VLIW processors into the fabric of their system. To support these efforts, they are developing new compilation strategies to convert computations into efficient implementations on these architectures. They are also looking at the hardware resources required to support these operations. This includes methods for stalling portions of the array when their communication demands temporarily cannot be met, as well as mechanisms to synchronize the program counters of regions of the array operating in lock-step. When combined, they estimate these systems will provide an order of magnitude improvement in area-power product, and at least a factor of 2 performance improvement, over FPGAs. The resulting hardware and software systems should be able to significantly reduce the power consumption, lower the cost, and increase the speed of a large swath of electronic systems. Also, their improved programming models will make these systems easier to develop and maintain. This effort also includes a focus on improving the diversity of the engineering workforce at both the graduate and undergraduate level, with mentoring and research opportunities at each level. All of these activities are done within an overall effort towards outreach to underrepresented groups.
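The idea of data-dependent instruction selection in a lock-step region can be sketched behaviorally. This is a toy model under stated assumptions, not the hardware mechanism from the abstract: the function `run_lockstep`, the two-alternative instruction slots, and the single shared predicate are hypothetical. Every compute element advances through the same program slot in the same cycle, and each element's own data decides which of the slot's two instructions it executes.

```python
from typing import Callable, List, Tuple

Instr = Callable[[int], int]


def run_lockstep(
    program: List[Tuple[Instr, Instr]],
    values: List[int],
    predicate: Callable[[int], bool],
) -> List[int]:
    """Run one value per compute element through a shared program.

    Each program slot holds (taken, not_taken) alternatives. All elements
    share the program counter; the predicate on each element's current
    value selects which alternative that element executes this cycle.
    """
    state = list(values)
    for taken, not_taken in program:  # shared program counter, one slot per cycle
        state = [taken(v) if predicate(v) else not_taken(v) for v in state]
    return state


if __name__ == "__main__":
    # Hypothetical two-cycle program: conditional step, then unconditional doubling.
    program = [
        (lambda v: v - 10, lambda v: v + 1),
        (lambda v: v * 2, lambda v: v * 2),
    ]
    print(run_lockstep(program, [3, 12, 9], lambda v: v >= 10))
```

Because no element leaves the shared schedule, the region stays synchronized even though different elements execute different instructions in the first cycle.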
|
0.952 |
2019 — 2021 |
Hauck, Scott; Hsu, Shih-Chieh |
N/A |
Collaborative Research: Advancing Science With Accelerated Machine Learning @ University of Washington
In the next generation of big science experiments, the demands for computing resources are expected to outstrip the capabilities of existing computing infrastructure. In light of this, a radical rethinking of the cyberinfrastructure is needed to contend with these developments. With the onset of deep learning, parallelized processing architectures have emerged as a solution. Combined with deep learning algorithms, parallelized processing architectures, in particular Field Programmable Gate Arrays (FPGAs), have been shown to give large speedups in computing when compared with conventional CPUs. This project aims to bring machine-learning-based accelerated computing with FPGAs into the scientific community by targeting two big-data physics experiments: the Large Hadron Collider (LHC) and the Laser Interferometer Gravitational-wave Observatory (LIGO). This project will push the frontiers of deep learning at scale, demonstrating the versatility and scalability of these methods to accelerate and enable new physics in the big data era. This project serves the national interest, as stated by NSF's mission, by promoting the progress of science. The PIs and their collaborators will build upon their recent work to design and exploit state-of-the-art neural network models for real-time data analytics, reducing overall computing latency. This new computing paradigm aims to significantly increase the processing capability at the LHC and LIGO, leading to an increased scientific output of these devices and, potentially, foundational discoveries. The students to be mentored and trained in this research will interact closely with industry partners, creating new career opportunities and strengthening synergies between academia and industry. In addition to sharing algorithms with the community through open source repositories, the team will continue to educate the community regarding credit and citation of scientific software.
In this project, the PIs will build upon their recent work developing high quality deep learning algorithms for real-time data analytics of time-series and image datasets using Field Programmable Gate Arrays (FPGAs) to accelerate low-latency inference of machine learning algorithms. The team will develop machine learning based acceleration tools focusing on FPGAs to be used within LIGO and the LHC experiments. The team's immediate goal is to take benchmark examples of LHC high level trigger processing and LIGO gravitational wave processing and construct demonstrators in each scenario. For this benchmark, they aim to design and implement an FPGA based accelerator that can perform low latency gravitational wave identification and LHC event reconstruction. Additionally, the PIs aim to add the capability of graph based neural network accelerators for FPGAs. The open source tools to be developed as part of these activities will be readily shared with LIGO, LHC, and LSST. The project will create an advisory group, including members of large and small projects, members of the neutrino physics, multi-messenger astronomy community, industry partners, computer scientists, and computational biologists. This project aims to bring together representatives of the different communities that will benefit from and can contribute to this work. The PIs will organize deep learning workshops and boot camps to train students and researchers on how to use and contribute to the framework, creating a wide network of contributors and developers across key science missions. This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity.
The effort is jointly funded by the Office of Advanced Cyberinfrastructure.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.952 |