2009 — 2012
Kanade, Takeo; Sheikh, Yaser
II-EN: The Human Virtualization Studio: From Distributed Sensor to Interactive Audiovisual Environment @ Carnegie-Mellon University
The Virtualization Studio spearheads research to reconstruct, record, and render dynamic events in 3D. The studio creates a "full-body" interactive environment where multiple users are simultaneously given a visceral sense of three-dimensional space, through vision and sound, and are able to interact, through action and speech, unencumbered by 3D glasses, head-mounted displays, or special clothing. The studio pursues the thesis that robust sensors for hard problems, in this case the audiovisual reconstruction of multiple highly dynamic actors/speakers, can be constructed by using a large number of sensors running simple, parallelized algorithms. High-fidelity reconstructions are created using a grid of 1132 cameras, and a 128-node multi-speaker microphone array is used to localize and associate multiple sound sources in the event space. In addition, a multi-viewer lenticular display screen, consisting of 48 projectors, and a front surround-sound speaker are used to render interactive environments. The reconstruction algorithms are parallelized, and a cluster is used to process the data and respond to behaviors in the event space in real time.
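To make the "many simple sensors, parallelized algorithms" thesis concrete, the following is a minimal sketch, assuming a synthetic orthographic camera model and a visual-hull-style fusion step rather than the studio's actual pipeline: each camera independently runs a cheap per-view occupancy test, and the votes are intersected into a voxel grid.

```python
# Sketch: many cameras run a simple, independent per-view test in parallel,
# and a fusion step intersects their votes into a voxel occupancy grid
# (a visual-hull-style reconstruction). The camera model and silhouette
# test are synthetic placeholders, not the studio's actual algorithms.
from multiprocessing import Pool
import numpy as np

GRID = 32                      # voxels per side of the working volume [-1, 1]^3
N_CAMERAS = 16                 # stand-in for a much larger camera grid

def voxel_centers(n=GRID):
    """3D coordinates of every voxel center in the unit cube."""
    axis = np.linspace(-1.0, 1.0, n)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)

def camera_vote(view_dir):
    """One camera's contribution: project voxels orthographically along
    view_dir and keep those whose projection falls inside the silhouette
    of a synthetic sphere of radius 0.5 (the 'simple algorithm')."""
    view_dir = view_dir / np.linalg.norm(view_dir)
    pts = voxel_centers()
    along = pts @ view_dir                         # component along the view axis
    perp = pts - np.outer(along, view_dir)         # component perpendicular to it
    radial = np.linalg.norm(perp, axis=1)
    return radial <= 0.5                           # boolean occupancy vote per voxel

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_dirs = rng.normal(size=(N_CAMERAS, 3))
    with Pool() as pool:                           # one cheap job per camera
        votes = pool.map(camera_vote, list(view_dirs))
    occupancy = np.logical_and.reduce(votes)       # fuse: intersection of votes
    print(f"occupied voxels: {occupancy.sum()} / {GRID**3}")
```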
Audiovisual reconstruction and rendering of scenes containing multiple users will revolutionize research into collaborative interfaces, and will allow digital preservation of culturally significant events, like theatrical performances, sports events, and key speeches. In addition to these core research objectives, the Virtualization Studio will act as a gathering place for multidisciplinary research, bringing together researchers from interactive art, human behavior analysis, computer graphics, computer vision, psychology, big data research, and speech processing. The infrastructure will be used to develop a new course on Human Virtualization, and will be used as a pedagogical tool in several existing courses and outreach projects, introducing the next generation of students to the power of interdisciplinary research in computer science.
2009 — 2013
Sheikh, Yaser
RI: Small: Spacetime Reconstruction of Dynamic Scenes From Moving Cameras @ Carnegie-Mellon University
The proliferation of camera-enabled consumer items, like cellular phones, wearable computers, and domestic robots, has introduced moving cameras, in staggering numbers, into everyday life. These cameras record our social environment, where people engage in different activities and objects like vehicles or bicycles are in motion. State-of-the-art structure from motion algorithms cannot reliably reconstruct these types of scenes. The overarching focus of this work is to develop the theory and practice required to robustly reconstruct a dynamic scene from one moving camera or simultaneously from several moving cameras.
To achieve this, the PI is developing a theory of imaging in dynamic scenes. A useful "device" for analyzing dynamic scenes is to visualize them as constructs in spacetime, analogous to static structures in space. Much of the progress in multi-view geometry for static scenes has centered on the development of tensors that embody the relative positions of cameras in space. This dimensional analogy is being used to define corresponding tensors for multi-view geometry in dynamic scenes. A goal in this work is to derive geometric relationships within a system of independently moving cameras. To reconstruct unconstrained dynamic scenes, factorization approaches are being extended to spacetime to simultaneously reconstruct nonrigid structure from multiple moving cameras.
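For reference, the rank-constrained factorization the project extends can be illustrated with a minimal Tomasi-Kanade-style example on synthetic rigid data; the spacetime, nonrigid, multi-camera extensions described above are substantially more involved, and the projection model and data below are assumptions made for illustration only.

```python
# Minimal Tomasi-Kanade-style factorization on synthetic data: the rank-3
# structure underlying centered 2D tracks is recovered by SVD. The nonrigid,
# spacetime extensions described in the abstract build on this idea.
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_points = 8, 50

# Synthetic rigid scene: random 3D points observed by orthographic cameras.
X = rng.normal(size=(3, n_points))                 # 3 x P structure
W_rows = []
for _ in range(n_frames):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal rotation
    t = rng.normal(size=2)                         # image-plane translation
    W_rows.append(R[:2] @ X + t[:, None])          # orthographic projection
W = np.vstack(W_rows)                              # 2F x P measurement matrix

# Factorization: center each row, then take a rank-3 SVD.
W_centered = W - W.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
M_hat = U[:, :3] * np.sqrt(s[:3])                  # 2F x 3 motion (cameras)
S_hat = np.sqrt(s[:3])[:, None] * Vt[:3]           # 3 x P structure, up to an affine ambiguity

# The product reproduces the centered measurements almost exactly.
print("reconstruction error:", np.linalg.norm(W_centered - M_hat @ S_hat))
```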
The algorithms that result from this research open the door to a host of new technologies in several industries, such as autonomous vehicle navigation, distributed visual surveillance, aerial video monitoring and indexing, cellphone interfaces, urban navigation, coordination and planning for autonomous robots, and human-computer interaction.
2010 — 2016
Dey, Anind; Sheikh, Yaser; Kanade, Takeo (co-PI)
Collaborative Research: Computational Behavioral Science: Modeling, Analysis, and Visualization of Social and Communicative Behavior @ Carnegie-Mellon University
Lead PI/Institution: James M. Rehg, Georgia Institute of Technology

This Expedition will develop novel computational methods for measuring and analyzing the behavior of children and adults during face-to-face social interactions. Social behavior plays a key role in the acquisition of social and communicative skills during childhood. Children with developmental disorders, such as autism, face great challenges in acquiring these skills, resulting in substantial lifetime risks. Current best practices for evaluating behavior and assessing risk are based on direct observation by highly trained specialists, and cannot be easily scaled to the large number of individuals who need evaluation and treatment. For example, autism affects 1 in 110 children in the U.S., with a lifetime cost of care of $3.2 million per person. By developing methods to automatically collect fine-grained behavioral data, this project will enable large-scale objective screening and more effective delivery and assessment of therapy. Going beyond the treatment of disorders, this technology will make it possible to automatically measure behavior over long periods of time for large numbers of individuals in a wide range of settings. Many disciplines, such as education, advertising, and customer relations, could benefit from a quantitative, data-driven approach to behavioral analysis.

Human behavior is inherently multi-modal, and individuals use eye gaze, hand gestures, facial expressions, body posture, and tone of voice along with speech to convey engagement and regulate social interactions. This project will develop multiple sensing technologies, including vision, speech, and wearable sensors, to obtain a comprehensive, integrated portrait of expressed behavior. Cameras and microphones provide an inexpensive, noninvasive means for measuring eye, face, and body movements along with speech and nonspeech utterances. Wearable sensors can measure physiological variables such as heart rate and skin conductivity, which contain important cues about levels of internal stress and arousal that are linked to expressed behavior. This project is developing unique capabilities for synchronizing multiple sensor streams, correlating these streams to measure behavioral variables such as affect and attention, and modeling extended interactions between two or more individuals. In addition, novel behavior visualization methods are being developed to enable real-time decision support for interventions and the effective use of repositories of behavioral data. Methods are also under development for reflecting the capture and analysis process back to users of the technology.

The long-term goal of this project is the creation of a new scientific discipline of computational behavioral science, which draws equally from computer science and psychology in order to transform the study of human behavior. A comprehensive education plan supports this goal through the creation of an interdisciplinary summer school for young researchers and the development of new courses in computational behavior. Outreach activities include significant and ongoing collaborations with major autism research centers in Atlanta, Boston, Pittsburgh, Urbana-Champaign, and Los Angeles.
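As an illustration of the stream-synchronization step, the following is a minimal sketch, assuming two synthetic signals (a 30 Hz video-derived signal and a 4 Hz wearable signal) rather than the project's actual sensors: both streams are resampled onto a common clock and the aligned streams are then correlated.

```python
# Sketch: two sensor streams (e.g., a gaze-related signal from video and a
# physiological signal from a wearable) sampled at different rates are
# resampled onto a shared clock and compared. Stream names, rates, and the
# Pearson correlation are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(1)

# Video-derived signal at 30 Hz and wearable signal at 4 Hz, both 60 s long,
# sharing an underlying slow "arousal" component plus independent noise.
t_video = np.arange(0.0, 60.0, 1 / 30)
t_wear  = np.arange(0.0, 60.0, 1 / 4)
arousal = lambda t: np.sin(2 * np.pi * t / 20.0)
video_signal = arousal(t_video) + 0.3 * rng.normal(size=t_video.size)
wear_signal  = arousal(t_wear)  + 0.3 * rng.normal(size=t_wear.size)

# Synchronize: resample both streams onto a common 10 Hz clock by
# linear interpolation against their own timestamps.
t_common = np.arange(0.0, 60.0, 1 / 10)
video_sync = np.interp(t_common, t_video, video_signal)
wear_sync  = np.interp(t_common, t_wear, wear_signal)

# Correlate the aligned streams as a crude stand-in for measuring a shared
# behavioral variable from two modalities.
r = np.corrcoef(video_sync, wear_sync)[0, 1]
print(f"correlation between aligned streams: {r:.2f}")
```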
2013 — 2015
Sheikh, Yaser
EAGER: 3D Event Reconstruction From Social Cameras @ Carnegie-Mellon University
This EAGER project explores the use of social cameras to reconstruct and understand social activities in the wild. Social cameras are an emerging phenomenon, producing video captures of social activity from the point of view of members of the social group itself. They are proliferating at an unprecedented rate as smartphones, camcorders, and, more recently, wearable cameras become broadly adopted around the world. Users naturally direct social cameras at areas of activity they consider significant, by turning their heads towards them (with wearable cameras) or by pointing their smartphone cameras at them. The core scientific contribution of this work is the joint analysis of both the 3D motion of social cameras (which encodes group attention) and the 3D motion in the scene (which encodes social activity) towards understanding the social interactions in a scene. A number of internal models (such as maximizing rigidity or minimizing effort) for event reconstruction are being investigated to address the ill-posed inverse problems involved.
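One elementary ingredient of jointly analyzing camera motion and scene motion is triangulating a 3D point, such as a point of shared attention, from the rays of several cameras. The following is a minimal least-squares ray-triangulation sketch on synthetic data; the formulation and numbers are illustrative assumptions, not the project's algorithm.

```python
# Sketch: least-squares triangulation of a single 3D point from several
# camera rays (center + viewing direction), e.g. a point that multiple
# social cameras are aimed at. Synthetic data; illustrative only.
import numpy as np

def triangulate_rays(centers, directions):
    """Find the point minimizing the summed squared distance to all rays.
    Each ray contributes (I - d d^T), the projector onto the plane
    orthogonal to its direction d; stacking these gives a 3x3 linear system."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(centers, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # distance-to-ray projector
        A += P
        b += P @ c
    return np.linalg.solve(A, b)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    target = np.array([1.0, 2.0, 0.5])            # ground-truth attention point
    centers = rng.normal(scale=5.0, size=(6, 3))  # six "social camera" positions
    # Each camera points roughly at the target, with a little angular noise.
    directions = (target - centers) + 0.02 * rng.normal(size=(6, 3))
    estimate = triangulate_rays(centers, directions)
    print("estimated point:", np.round(estimate, 3))
```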
This research is establishing a new area of visual analysis by providing the requisite framework for social activity understanding in 3D rather than in 2D. The ability to analyze social videos in 3D space and time provides useful tools for almost any activity that involves social groups working together, such as citizen journalism, search-and-rescue team coordination, or collaborative assembly teams. The project is integrated with education through teaching and student training, and reaches out to industry through collaborations.