2000 — 2006
Algazi, V. Ralph; Duda, Richard; Davis, Larry; Duraiswami, Ramani; Liu, Qing Huo (co-PI)
ITR: Personalized Spatial Audio Via Scientific Computing and Computer Vision @ University of Maryland College Park
This award provides the first four years of funding for a five-year continuing award. Humans are very good at discerning the spatial origin of sound using a mixture of frequency-dependent interaural time difference (ITD), interaural level difference (ILD), and pinna spectral cues in disparate environments ranging from open spaces to small crowded rooms. This ability helps us to interact with others and the environment by sorting out individual sounds from a mixture, and helps us to survive by warning us of danger over a wider region of space than vision covers. These advantages of spatial sound are important for human-computer interaction.
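As an illustration of the ITD cue mentioned above (not part of the proposal itself), the classic Woodworth spherical-head approximation gives the interaural delay as a function of source azimuth; the head radius and speed of sound below are typical textbook values:

```python
import numpy as np

def woodworth_itd(azimuth_rad, head_radius=0.0875, c=343.0):
    """Woodworth spherical-head approximation of the interaural time
    difference (ITD): (a / c) * (theta + sin(theta)), with head radius
    a in meters and speed of sound c in m/s."""
    return (head_radius / c) * (azimuth_rad + np.sin(azimuth_rad))

# A source straight ahead produces zero ITD; a source at 90 degrees
# gives roughly 0.66 ms for an average adult head.
```

This frequency-independent delay is the "easy" cue referred to in the next paragraph; the ILD and pinna cues require the full HRTF.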
While the frequency-independent ITD cues (delays) associated with the two ears are relatively easy to render over headphones, the ILD (level difference) and pinna elevation cues are not. For a given source location and frequency content, the sound is scattered by the person's torso, head, and pinnae, and is received differently at the two ears, leading to differences in the intensity and spectral features of the received sound. These effects are encoded in a highly individual "Head Related Transfer Function" (HRTF) that depends on the person's anatomical features (structure of the torso, head, and pinnae). This individuality has made it difficult to use the HRTF in the proposed applications. Recent research, including that of members of this team, has focused on measuring the HRTFs of individuals in specific environments, on constructing models of the HRTF, on understanding how the geometry of the body is related to the characteristics of the HRTF, and on how the brain processes the cues to derive spatial information. However, this research has also indicated that the brain is extraordinarily sensitive to errors in cues that result when sound is rendered with an incorrect HRTF.
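A minimal sketch of how an HRTF is applied in rendering: in time-domain form it is a pair of head-related impulse responses (HRIRs), and rendering amounts to convolving the source signal with the left- and right-ear responses. The HRIRs below are toy stand-ins encoding only a delay (ITD) and a gain difference (ILD), not measured data:

```python
import numpy as np

fs = 44100
t = np.arange(0, 0.1, 1 / fs)
mono = np.sin(2 * np.pi * 440 * t)            # source signal

# Hypothetical HRIRs for one direction: a source to the listener's
# left arrives at the right ear later (ITD) and quieter (ILD).
itd_samples = int(0.0006 * fs)                 # ~0.6 ms extra path
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[itd_samples] = 0.5                  # delayed and attenuated

left = np.convolve(mono, hrir_left)
right = np.convolve(mono, hrir_right)
binaural = np.stack([left, right])             # 2 x N stereo stream
```

Real HRIRs carry direction-dependent spectral shaping from the pinnae as well, which is exactly the individual structure this project proposes to compute numerically.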
In this project the PI and his team will use numerical methods to compute individualized HRTFs from accurate 3-D surface models of the body. They will use multiview, multiframe computational vision techniques to extract the surface models from imagery, and then use boundary element methods employing fast multipole/transform techniques and parallel processing to compute the HRTFs from the surface models. The resulting HRTFs will be evaluated both by objective comparisons with acoustically measured HRTFs and by psychoacoustic testing, and will be used in demonstrations of virtual reality, augmented reality, and teleconferencing. A major advantage of this vision-based approach is that it will allow the PI and his team to investigate and model the way HRTFs change with body posture, providing the potential of tracking dynamic environments. Thus, the project will include fundamental research to extend static HRTF measurements to dynamic situations in different environments, using a combination of visual tracking to locate the person in real space and construction of in-room HRTFs from free-field HRTFs using fast iterative techniques. This will provide a scientific foundation for HCI applications of audio rendering. The research will additionally yield algorithms and understanding with impact on varied fields, including computer-vision-based model creation, scientific computing, computational acoustics for noise control and land mine detection, and the neurophysiological understanding of human audition.
2000 — 2004
Kanungo, Tapas (co-PI); Davis, Larry; Duraiswami, Ramani
Textual Information Access For the Visually Impaired @ University of Maryland College Park
An ever-increasing segment of the population suffers from low vision resulting from complications of disease and old age. Surveys conducted by one of the Co-PIs as part of a previous project have determined that the key information unavailable to people with low vision is textual information, usually of a directive or warning nature. For example, shopping in a large department store in a mall might involve looking for signs indicating where the store is, reading aisle signs in the store, and looking at product names, labels, and prices. This research will develop a "seeing-eye" computer to help people with low vision observe and receive such information, so that they can participate more efficiently and comfortably in everyday activities, and thereby lead more fulfilling and productive lives. The system will be composed of a digital video camera, computer, user interface, and speech or magnified visual output that can detect textual information in the environment, understand it using OCR, and provide it to the user, who either has low vision or is blind. To achieve these goals, the PIs, in collaboration with colleagues at Johns Hopkins University, will build prototype systems over the first six months and then over the first two years, using mostly existing technology and extensions to vision algorithms they have developed for identifying text regions in images and for OCR; these prototypes can be evaluated on volunteer patients at the Wilmer Ophthalmological Institute and the National Federation of the Blind. The functionality and range of applicability of the prototypes will necessarily be limited. Simultaneously, the PIs will work on long-term research problems that must be addressed to develop next-generation seeing-eye computers with greater scalability and capability.
In year three, patient-volunteers at Wilmer and at the NFB will perform evaluations of the developed prototypes. Subsequently, successful results will be commercialized and brought to the larger patient body (as have previous developments at Wilmer). Fundamental research problems to be addressed include: real-time algorithms for detection and rectification of text on planes and cylinders subject to perspective distortions; OCR from digital video, and OCR for text on textured backgrounds; and more robust and efficient algorithms and systems for stabilization and super-resolution of text blocks from video streams.
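Rectification of text on a plane under perspective distortion is commonly done with a homography estimated from point correspondences. A minimal numpy sketch of the standard direct linear transform (one common way to perform such rectification, not necessarily the project's algorithm; the corner coordinates are hypothetical):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 homography mapping src -> dst from 4+ point
    pairs via the direct linear transform (DLT); such a homography can
    warp a perspectively distorted text region to a fronto-parallel
    view before OCR."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)           # null-space vector = homography
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply a homography to a 2-D point (homogeneous coordinates)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

Given the four detected corners of a sign as `src` and the corners of an upright rectangle as `dst`, warping with the estimated H produces the rectified text block.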
2002 — 2008
Shneiderman, Ben (co-PI); Davis, Larry (co-PI); Massof, Robert; Shamma, Shihab (co-PI); Duraiswami, Ramani
ITR/AITS: Customizable Audio User Interfaces For the Visually Impaired and the Sighted @ University of Maryland College Park
Although large parts of our brains are devoted to the processing of sound cues and sound plays an important role in the way we interface with the world, this rich channel has not been extensively exploited for displaying information. The mechanisms by which received sound waves are processed neurally to form objects with auditory properties in many perceptual dimensions, including three corresponding to the source location (range, azimuth, elevation) and three to qualities ascribed to the source (timbre, pitch, and intensity), are beginning to be understood. There has been significant progress over the last decade in understanding the mechanisms by which acoustical cues arise, how the biological system performs transduction and neural processing to extract relevant features from sound, and the way we perceive and organize objects in acoustical scenes. Our goal is to exploit this understanding, and to uncover the scientific principles that govern the computerized rendering of artificial sound scenes containing multiple sound objects that are information- and feature-rich. We will test, use, and extend this knowledge by creating auditory user interfaces for the visually impaired and the sighted. The work aims both at developing interfaces and answering fundamental questions such as: Is it possible to usefully map "X" to the auditory axes of a virtual auditory space? Here "X" could be an image (e.g., a face), a map, tabular data, uncertain data, or temporally varying data. Are there neural correlates that can guide natural mappings to acoustic cues? What limitations does our perception place on rendering hardware? How important is
2002 — 2006
Mayergoyz, Isaak (co-PI); O'Leary, Dianne (co-PI); Elman, Howard (co-PI); Duraiswami, Ramani; Gumerov, Nail
ITR/SF&IT: Fast Multipole Translation Algorithms For Solution of the 3D Helmholtz Equation @ University of Maryland College Park
This proposal concerns new improvements that have the potential to achieve significant speed-ups of the fast multipole method (FMM) for solving the Helmholtz equation and other equations used to model phenomena encountered in electromagnetics, acoustics, biology, and other fields. Solving larger problems holds promise for better design on the one hand, and elucidation of new physics and biology on the other. Discretizations of the partial differential equations arising in these models yield large systems of equations for which both direct and iterative solution techniques are expensive.
The introduction of the FMM by Rokhlin and Greengard generated tremendous interest in the scientific computing community, as it demonstrated a way to generate structure and achieve fast solution of equations without relying on the discretization. Despite its promise, the algorithm has not achieved widespread implementation for many practically important problems that could use the promised speedups. Some researchers have reported that the approximate translation integrals make implementation difficult and, in practice, introduce stability problems. We have recently derived exact expressions for the translation and rotation of multipole solutions of the Helmholtz equation, which enable fast computation via simple recursions. Further, we have obtained very promising results on the properties of the translation operators that enable creation of very tight error bounds. Our translations have the same asymptotic complexity as the standard integral expressions, but with much smaller coefficients. We have also found that the translation operator can be decomposed into a product of sparse recurrence matrices, and this can be the basis for a Θ(p²) algorithm, which we propose to pursue. Based on these expressions, we will develop software for solution of different problems using the FMM. To be useful in pushing ahead the information technology revolution, our software will be well documented and published in accessible peer-reviewed forums. Such availability will improve adoption by large numbers of practitioners.
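The multipole machinery rests on separating the Helmholtz Green's function into regular (local) and singular (multipole) factors. A small numerical check of that separation, using the classical addition theorem with spherical Bessel and Hankel functions (an illustrative sketch, not the proposal's translation code):

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre

def green(k, x, y):
    """Free-space 3-D Helmholtz Green's function exp(ikR)/(4 pi R)."""
    R = np.linalg.norm(x - y)
    return np.exp(1j * k * R) / (4 * np.pi * R)

def green_series(k, x, y, nterms=30):
    """Addition-theorem expansion about the origin, valid for |x| < |y|:
    ik/(4 pi) * sum_n (2n+1) j_n(k r_<) h_n(k r_>) P_n(cos gamma).
    This separated form is what multipole translation operators act on."""
    rx, ry = np.linalg.norm(x), np.linalg.norm(y)
    r_lo, r_hi = min(rx, ry), max(rx, ry)
    cosg = np.dot(x, y) / (rx * ry)
    n = np.arange(nterms)
    jn = spherical_jn(n, k * r_lo)                      # regular part
    hn = spherical_jn(n, k * r_hi) + 1j * spherical_yn(n, k * r_hi)
    return 1j * k / (4 * np.pi) * np.sum((2 * n + 1) * jn * hn
                                         * eval_legendre(n, cosg))
```

Truncating the sum at order p gives the p-term expansions whose translations the proposal's recursions accelerate.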
2004 — 2010
Ja'ja', Joseph; O'Leary, Dianne (co-PI); Chellappa, Rama (co-PI); Duraiswami, Ramani; Varshney, Amitabh
High Performance and Visualization Cluster For Research in Coupled Computational Steering and Visualization For Large Scale Applications @ University of Maryland College Park
Researchers at the University of Maryland plan to build a high-performance computing and visualization cluster taking advantage of synergies afforded by coupling central processing units (CPUs), graphics processing units (GPUs), displays, and storage. The infrastructure will be used to support a broad program of computing research that will revolve around understanding, augmenting, and leveraging the power of heterogeneous vector computing enabled by GPU co-processors. The driving force here is the availability of cheap, powerful, and programmable GPUs through their commercialization in interactive 3D graphics applications, including interactive games. The CPU-GPU coupled cluster will enable the pursuit of several new research directions in computing, as well as a better understanding of, and fast solutions to, several existing interdisciplinary problems through a visualization-assisted computational steering environment. In addition, it will foster research to move several problems to a better point on the price-performance curve.
Intellectual Impact: The proposed research that will use this cluster falls into several broad interdisciplinary computing areas. The researchers plan to explore visualization of large datasets and algorithms for parallel rendering. In high-performance scientific computing, they plan to develop and analyze efficient algorithms for use with complex systems when uncertainty is included in the models. The researchers plan to use the cluster for several applications in computational biology, including computational modeling and visualization of proteins; conformational steering in protein structure prediction, folding, and drug design; large-scale phylogeny visualization; and sequence alignment.
The researchers also plan to use the cluster for applications in real-time computer vision, real-time 3D virtual audio, and for efficient compilation of signal processing algorithms.
Broader Impact: An important aspect of this research is to ensure a high impact of the cluster towards educational and outreach goals. The investigators plan to enrich their current coursework with research results obtained on the cluster. The coupled cluster with a large-area high-resolution display screen will serve as a valuable resource to present, interactively explore, evaluate, and validate the ongoing research in visualization, vision, scientific computing, human-computer interfaces, and computational biology with active participation of graduate as well as undergraduate students.
2009 — 2011
Tiglio, Manuel; Duraiswami, Ramani; Gumerov, Nail (co-PI); Dorland, William (co-PI)
Algorithms, Scientific Computing, and Numerical Studies in Classical and Quantum General Relativity @ University of Maryland College Park
A comprehensive set of new algorithmic and scientific computing tools to numerically investigate the modeling of binary black hole collisions with higher efficiency will be developed. Symbolic manipulation tools to automate the derivation of higher-order post-Newtonian expansions of the Einstein equations using techniques from Quantum Field Theory will be developed. The use of heterogeneous computing and, more specifically, general-purpose Graphics Processing Units (GPUs) in numerical relativity will be explored.
The tools developed in this project aim to explore the physics of binary black hole collisions, expected to be one of the main sources of gravitational waves to be detected by ground- and space-based interferometers such as LIGO and LISA. The parameter space is so large that new computing paradigms are needed to explore it. The results of this project should be of interest in the broad areas of general relativity, gravitational waves, symbolic computing algebra, field theory, and high performance computing.
2011 — 2013
Duraiswami, Ramani; Zotkin, Dmitry (co-PI); Daume, Hal (co-PI)
RI: Small: Learning the Relationship Between the Anatomy and Spatial Hearing @ University of Maryland College Park
To apply machine learning to problems in the physical world, one needs models and algorithms that are faithful to the physics. We consider the problem of understanding how the anatomical structure of the body and ears leads to the remarkable ability, innate in most animals and humans, to localize a sound source in a complex and noisy environment. The cues used in localization arise from the process of the acoustic wave scattering off the complex-shaped listener's body and ears. Numerically, these changes in the sound spectrum are characterized by the head-related transfer function (HRTF). Every person's body is unique, and the HRTF is highly individual. It is possible to measure the HRTF; however, the measurement requires specialized hardware and is tedious. There has been considerable interest in convenient methods of obtaining the HRTF. We propose to develop a framework to perform machine learning to establish a relationship between the anatomy and the HRTF. An HRTF database with 100 subjects, along with their anthropometric measurements, is available. A novel LMA (Learning of Multiple Attributes) algorithm will be developed. The key properties of this algorithm are that it can incorporate physical constraints into the learning and predict complex structured outputs in continuous spaces. The algorithm will find the low-dimensional manifold in the high-dimensional HRTF space and map the manifold structure to anatomical parameters.
The research will create novel machine learning algorithms that are able to incorporate physics-based constraints, and these will find application in other problems. HRTF generation from simple body measurements will allow the introduction of personalized spatial audio into fields such as human-computer interaction, consumer electronics, auditory assistive devices for the vision-impaired, robotics, entertainment, education, and surveillance. Training of K-16 and graduate students in the proposed research will add to the nation's talent pool.
2012 — 2013
Duraiswami, Ramani |
I-Corps: Creating Immersive 3D Audio @ University of Maryland College Park
The process by which humans (and other animals) perceive space from auditory input is complex. Over the past decades, researchers worldwide (including those involved in the current proposal) have elucidated the mechanisms by which humans perceive sound, making it possible to develop algorithms for the creation of immersive audio. Applications include audio for games, the mixing of music, and the capture of live events. Because the wavelengths of audible sound are comparable to the sizes of environmental objects and of features on the listeners' bodies, the sound perceived by the listener is "colored" by scattering off these objects. Understanding and efficiently approximating the room impulse response and the Head Related Transfer Function (HRTF), which respectively characterize these scattering processes, has been a major contribution of the PI's research over the past decade. As each listener's physical body features have different shapes and sizes, the HRTF shows considerable inter-personal variation. The PI has developed extremely rapid techniques for measuring HRTFs, for characterizing HRTFs based on the body and ear measurements of listeners, and for rapid creation of immersive sound using HRTFs and room responses. Further, single-microphone recordings lose the spatial information in the sound signal. To retain the spatial information during auditory environment capture, the PI has developed novel spherical microphone array technology. A decomposition of the captured sound in terms of plane-wave basis functions allows the easy incorporation of HRTFs into the captured sound.
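A first step in this style of spherical-array processing is a spherical harmonic transform of the pressure field sampled on the sphere. A minimal sketch using Gauss-Legendre quadrature in the polar angle (illustrative only; the grid, orders, and toy field are assumptions, and the PI's actual array processing is more involved):

```python
import numpy as np
from scipy.special import sph_harm

# Quadrature grid on the sphere: Gauss-Legendre nodes in cos(colatitude),
# uniformly spaced samples in azimuth.
nq = 8
xg, wg = np.polynomial.legendre.leggauss(nq)
colat = np.arccos(xg)
azim = np.linspace(0.0, 2.0 * np.pi, 2 * nq, endpoint=False)
Col, Az = np.meshgrid(colat, azim, indexing="ij")
W = np.outer(wg, np.full(2 * nq, 2.0 * np.pi / (2 * nq)))

# A toy "recorded" field: exactly the (n=1, m=0) spherical harmonic.
# Note scipy's argument order: sph_harm(m, n, azimuthal, polar).
field = sph_harm(0, 1, Az, Col)

def sh_coefficient(field, m, n):
    """Project the sampled field onto Y_n^m by discrete quadrature."""
    return np.sum(W * field * np.conj(sph_harm(m, n, Az, Col)))
```

The transform recovers coefficient 1 for the mode present in the field and (near-)zero for absent modes; in array processing these coefficients are then mapped to plane-wave amplitudes, to which HRTFs can be applied per direction.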
Today's consumers enjoy music, games, and other media on their mobile devices; most settle for lackluster sound produced over headphones. Current headphone sound is simply unable to produce the engaging sound experience that high-end music systems, movie theaters, or live events deliver. With the technologies developed under previous NSF support, the team has allowed entertainment content creators to achieve unmatched realism in sound presentation over headphones. Combined with the team's sound capture hardware, immersive reproduction of live events can be achieved. In the last year, the number of smartphone sales outpaced that of PCs for the first time in history; consequently, the amount of content consumed over headphones is vastly increasing. This technology will allow companies to make headphone listening more immersive and reach that growing consumer base. Several industries produce and consume audio products, including mobile device, gaming, and PC hardware manufacturers, entertainment content producers, and content delivery networks; all will be positively impacted by the technology. Because of the large market, and the strength of the team, they anticipate that a large and successful business can be built around this technology. If successful, this effort will contribute to ongoing U.S. leadership in the fields of mobile technologies, gaming, entertainment, and immersive simulation.
2012 — 2017
Gumerov, Nail (co-PI); Balachandran, Balakumar; Duraiswami, Ramani
Standing On the Fourth Pillar: Data Enabled Understanding of Flapping Flight @ University of Maryland College Park
Flapping flight is known to be the single most successful mode of animal locomotion, exhibited by over 1,000 species of bats, more than 9,000 living species of flighted birds, and somewhere between millions and tens of millions of flying insects. Inspired by animal flight mechanics, proven capabilities of Fast Multipole Methods to study vortex interactions in fluid-structure interaction problems, and advances in general-purpose computation on Graphics Processing Units, an interdisciplinary team of applied mathematicians, mechanical engineers, and computer scientists from Mechanical Engineering and Computer Science has been assembled to pursue a three-year data-intensive program to further our understanding of flapping flight. The overall goal of the proposed effort is to conduct creative computational studies, informed by data from experimental studies, to advance computational algorithms for understanding complex fluid-structure interactions associated with flexible structures, as well as to discover biological clues related to flapping flight. These clues can help answer fundamental questions whose resolution requires advanced computational modeling and simulation in concert with experimental observations of flapping-wing insects. Computational studies coupled with parallel computing are required to investigate and interrogate the system in ways that nature does not permit. From simple parameter sweeps to flow field analyses, computational studies can educate the analyst in ways that experimental studies alone cannot. The specific goals of this work range from using experimental studies as a guide for computational modeling and simulation, to leveraging advanced computing for carrying out complex fluid-structure interaction simulations, to applying advanced computational architectures and algorithms to accelerate these simulations.
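The computational kernel behind the vortex-interaction studies mentioned above is an all-pairs Biot-Savart sum, the O(N²) evaluation that Fast Multipole Methods accelerate. A minimal 2-D point-vortex sketch (illustrative only, not the project's solver):

```python
import numpy as np

def induced_velocity(pos, gamma):
    """Direct O(N^2) Biot-Savart sum for 2-D point vortices.
    pos: (N, 2) vortex positions; gamma: (N,) circulations.
    Returns (u, v), the velocity components induced at each vortex
    by all the others."""
    dx = pos[:, 0][:, None] - pos[:, 0][None, :]
    dy = pos[:, 1][:, None] - pos[:, 1][None, :]
    r2 = dx * dx + dy * dy
    np.fill_diagonal(r2, 1.0)          # dummy; diagonal dx, dy are zero
    u = np.sum(-gamma[None, :] * dy / (2.0 * np.pi * r2), axis=1)
    v = np.sum(gamma[None, :] * dx / (2.0 * np.pi * r2), axis=1)
    return u, v
```

An FMM replaces the dense pairwise sum with hierarchical multipole expansions, reducing the cost to roughly O(N) per time step and making large vortex populations tractable.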
A salient broader impact of the proposed efforts will be the advancement of tools associated with the fourth pillar: data-intensive investigations into multidisciplinary, complex, and subtle systems. By demonstrating how the power of experimental data and computational analyses can be harnessed to a degree not attempted before, the efforts are expected to pave the path for transformative investigations into important and diverse fluid-structure problems such as flows interacting with small-scale micro-air-vehicles, blood flow through arteries, and flows through biological organs. Beyond data mining in the natural sciences, this work will usher in a new generation of engineers and scientists trained to use the tools presented by the fourth pillar, data-intensive investigation. The cross-disciplinary research will provide exceptional learning opportunities for all involved, including a postdoctoral researcher and two graduate students across programs, and contribute to the nation's talent pool. Furthermore, computational modeling and sciences curricula across departments will be enriched by the research findings of the proposed efforts, leading to exciting new additions in undergraduate and graduate courses such as computational dynamics and Fast Multipole Methods. Art-in-science displays featuring captivating fluid-structure interactions and flapping flight will be used to stimulate and nurture the interests of K-12 students who visit campus for different events, including the annually held Maryland Day that draws nearly 75,000 visitors to campus each spring.
2016 — 2017
Gumerov, Nail; Duraiswami, Ramani
I-Corps: On Demand Simulations in the Cloud of the Equations of Mathematical Physics @ University of Maryland College Park
Numerical simulations, data-driven approaches, and computer modeling are used more widely than ever, and have become an essential part of the creation of new products in diverse industries (mineral exploration, drug development, automotive, aerospace, finance, electronics, photonics, mechanical design and development, and defense). Under previous NSF funding the researchers have developed extremely fast approaches based on the fast multipole method for simulations in acoustics, fluid mechanics, and electromagnetics. They have also developed parallel approaches to accelerate their algorithms on cluster hardware accelerated by graphics processors. The simulation industry is currently a niche industry, with several relatively small companies serving different industries with specific pieces of software. However, it is ripe for disruption and broadening by taking advantage of advances in infrastructure (cloud and heterogeneous GPU-accelerated computing) and of software engineering approaches which have shown that many tasks traditionally done locally on a user's computer can now be delivered as a service over widely available network and cloud infrastructure. In the proposed I-Corps project the team will investigate whether such an approach is commercially viable.
A number of commercial entities have already approached this I-Corps team requesting simulation services for electromagnetic and acoustical scattering from complex objects. Through the I-Corps process, the team will be able to place these requests in context and develop a larger plan for creating a successful company. The team will understand the steps needed to develop a service around high-performance computing (HPC) methods and capabilities available via cloud computing, the use of advanced co-processors such as graphics processors, and advanced scalable algorithms such as the fast multipole methods. Based on the feedback received in the I-Corps customer interview process, the team can add a number of capabilities to the codes, such as their efficient mapping onto distributed computing systems, where each computing node consists of several CPU cores and one or more GPUs. At the end of the I-Corps project, the team plans to provide a demonstration in which the proposed software is used to provide high-performance cloud computing and interfacing for some basic problems in electrostatics and acoustics. The selection of the demonstration application will be guided by the customer discovery efforts undertaken in the I-Corps curriculum. The key outcome of this development will be a demo that can subsequently be applied and built upon to show the technology to potential customers and investors.