2004 — 2007
Limp, W. Frederick; Vranich, Alexei; Shi, Jianbo; Daniilidis, Kostas [⬀]; Biros, George (co-PI) [⬀]
Computing and Retrieving 3D Archaeological Structures From Subsurface Surveying @ University of Pennsylvania
Today's archaeological excavations are slow, and the cost of conservation can easily exceed the cost of excavation. This project is investigating and developing methods for the recovery of 3D underground structures from subsurface non-invasive measurements obtained with ground-penetrating radar, magnetometry, and conductivity sensors. The results will provide not only hints for further excavation but also 3D models that can be studied as if they were already excavated. The three fundamental challenges investigated are the inverse problem of recovering the volumetric material distribution, the segmentation of the underground volumes, and the reconstruction of the surfaces that comprise interesting structures. In the recovery of the underground volume, high-fidelity geophysics models are introduced in their original partial differential equation form. Partial differential equations from multiple modalities are solved simultaneously to yield a material distribution volume. In segmentation, a graph spectral method for estimating graph cuts finds clusters of underground voxels with tight connections within partitions and loose connections between partitions. A method based on multi-scale graph cuts significantly accelerates the process, while the grouping properties of normalized cuts help cluster together multiple fragments of the same material. In surface reconstruction, boundaries obtained from segmentation or from targeted material search are converted from unorganized voxel clouds to connected surfaces. A bottom-up approach is introduced that groups neighborhoods into facets whose amount of overlap guides the triangulation process. The archaeology PIs are providing prior knowledge about what structures are expected to be found, which can guide the segmentation and reconstruction steps.
The geoscience and archaeology PIs lead the effort of data acquisition at the Tiwanaku site in Bolivia. All original data as well as recovered 3D models will be made available to the public.
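As a rough illustration of the spectral segmentation step described above, the following Python sketch bipartitions a voxel affinity graph using the Fiedler vector of the normalized graph Laplacian, the core computation behind normalized cuts. The function name, the affinity construction, and the zero threshold are illustrative assumptions, not the project's actual pipeline.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

def normalized_cut_bipartition(affinity: csr_matrix) -> np.ndarray:
    # Normalized Laplacian L = I - D^{-1/2} W D^{-1/2} of the voxel graph.
    lap = laplacian(affinity, normed=True)
    # Two smallest eigenpairs; the second eigenvector is the Fiedler vector.
    _, vecs = eigsh(lap, k=2, which="SM")
    fiedler = vecs[:, 1]
    # Thresholding the Fiedler vector at zero yields a two-way partition.
    return (fiedler > 0).astype(int)

# Toy example: two tightly connected voxel pairs, weakly linked to each
# other, e.g. with affinities w_ij = exp(-(m_i - m_j)^2 / sigma^2)
# computed from recovered material values m of neighboring voxels.
W = csr_matrix(np.array([[0.0, 1.0, 0.01, 0.01],
                         [1.0, 0.0, 0.01, 0.01],
                         [0.01, 0.01, 0.0, 1.0],
                         [0.01, 0.01, 1.0, 0.0]]))
print(normalized_cut_bipartition(W))  # [0 0 1 1] or its flip

The multi-scale acceleration the abstract mentions would amount to solving this eigenproblem on a pyramid of coarsened graphs rather than on the full voxel grid.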
2004 — 2006
Lee, Daniel (co-PI) [⬀]; Shi, Jianbo; Taylor, Camillo (co-PI) [⬀]; Daniilidis, Kostas (co-PI) [⬀]; Kumar, R. Vijay
RR: MACNet: Mobile Ad-Hoc Camera Networks @ University of Pennsylvania
This project, which is developing an experimental testbed for studying different aspects of control and sensing in mobile networks of cameras and microphones, envisions a system of cameras, MACNet, moving in three dimensions to enable a Mobile Ad Hoc Camera Network. The testbed provides the experimental infrastructure for the following interdisciplinary projects: Monitoring, Evaluation, and Assessment for Geriatric Health Care; Assistive Technology for People with Disabilities; Digital Anthropology; and Visual Servoing for Coordinating Aerial and Ground Robots. MACNet cameras will be used to track patients and salient dynamic features for the first project. MACNet will simulate intelligent homes and museums, with active cameras providing feedback to smart wheelchairs and information about target areas, in the second project. MACNet will allow video datasets to be acquired and analyzed for three-dimensional reconstruction and archiving in the third project; and for the last project, MACNet will provide dynamic camera platforms that simulate aerial vehicles to track, localize, and coordinate with existing ground-based autonomous vehicles, with applications to surveillance and monitoring for homeland security. Finally, MACNet will be used as a testbed for research on distributed active vision, a theme that brings together all of the researchers.
Broader Impact: The results will be directly applicable to a large class of problems in which communication, control, and sensing are coupled, with applications in smart homes, communities for assistive living, and surveillance and monitoring for homeland security. This cross-fertilization will help train students with broader perspectives and potentially new approaches to problem solving. The lab will be used by students, and the institutions will leverage existing outreach programs in which both faculty and students participate, such as Robotics for Girls and PRIME.
2007 — 2012
Smith, Jonathan [⬀]; Shi, Jianbo
CT-ISG: BIRT - Biometric Identification Red Team @ University of Pennsylvania
Summary Statement: Biometric Identification Red Team (BIRT). The BIRT methodology will aid biometric system designers in making effective refinements to their systems. The measurement of biological characteristics (biometrics) such as fingerprints and facial images provides a means of identification that neither needs to be carried nor remembered. Evaluation of biometrics has traditionally focused on the ability of biometric systems to identify members of a population, e.g., for purposes of authentication. As these systems come into more widespread use, attempts will naturally be made to test and frustrate their ability to identify individuals. Understanding these attempts requires a fundamentally new analytic approach, based on modeling the capabilities of an adversary in full generality. BIRT develops the adversary model using the information controlled by the adversary, e.g., for face recognition, the clothing, glasses, and makeup the adversary wears. BIRT uses disinformation theory to abstractly model the adversary's capability to mask their identity from an interested observer. Disinformation theory is inspired by Shannon's information-theoretic model of communication systems, but views the "noise source" as controlled by the adversary, abstractly modeling the adversary's capacity to control the noise in the channel (for example, by transforming the image "signal") between the biometric sender being identified and the biometric system receiving the identifying information. Face recognition systems will be used to gain experience with and refine the disinformation theory models, with a variety of disguises used as disinformation sources.
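One plausible way to make this channel picture concrete (an illustrative formalization, not the proposal's own definitions) is to let the adversary choose a disguise transformation T from a feasible set \mathcal{T}, applied to the identity signal X before ordinary sensor noise N enters:

    I^{*} = \min_{T \in \mathcal{T}} I\bigl(X;\, T(X) + N\bigr)

Here I(\cdot\,;\cdot) is Shannon's mutual information. The adversary succeeds when I^{*} is driven low; a biometric system is robust against this adversary class exactly when I^{*} stays high for every disguise in \mathcal{T}, which is what a red team would probe empirically with physical disguises.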
2008 — 2011
Shi, Jianbo; Pereira, Fernando; Taskar, Ben (co-PI) [⬀]
RI-Medium: From Actors to Actions: Analysis and Alignment of Images, Video and Text @ University of Pennsylvania
Video clips and their corresponding narrations together provide much richer information than either in isolation, yet most current recognition systems process visual and textual information separately. The PIs focus on the task of learning to recognize corresponding actions in video and textual narrative accurately and robustly, concentrating in particular on semantic descriptions of human actions. This research will have broad impact on applications including video retrieval in digital libraries, human behavior modeling, and video surveillance.
The PIs' research will tightly couple methods in computer vision, natural-language processing, and machine learning through robust, automatically learned correspondences. Given a collection of loosely aligned video-text pairs (such as movies or TV shows with their associated screenplays), the task is to learn to associate action descriptions in the text with actions, objects, and actors in the videos. This correspondence is essential for semantic grounding of text in visual action appearance. The fundamental challenge is bridging the semantic gap between images and text: images depict geometrical relationships and properties of image regions, while natural language encodes abstract semantic relationships in grammatical structures. Bridging this gap in the context of action understanding is the focus of the research effort.
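A minimal sketch of the correspondence problem, under heavy assumptions: suppose screenplay action lines and video clips have already been embedded in a common vector space (the embedding models are placeholders, not the PIs' method). Alignment can then be posed as an optimal assignment on cosine similarities:

import numpy as np
from scipy.optimize import linear_sum_assignment

def align_text_to_clips(text_emb: np.ndarray, clip_emb: np.ndarray):
    """text_emb: (T, d) embeddings of action lines; clip_emb: (C, d)
    embeddings of video clips. Returns (line, clip) index pairs that
    maximize total cosine similarity, at most one clip per line."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    c = clip_emb / np.linalg.norm(clip_emb, axis=1, keepdims=True)
    sim = t @ c.T                              # (T, C) cosine similarities
    rows, cols = linear_sum_assignment(-sim)   # negate to maximize
    return list(zip(rows.tolist(), cols.tolist()))

The research described above goes well beyond this sketch: because the pairs are only loosely aligned, the correspondence itself must be learned jointly with the action models rather than read off a fixed similarity matrix.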
The eventual goal is to recognize actions in videos and to generate text descriptions of those actions. While this goal challenges both computer vision and natural language processing, it also opens an exciting new and potentially very fruitful collaboration between the two research areas, in which recognition is achieved by simultaneous learning and inference in both domains.
Information on this project, including papers, results, datasets, and open-source code, will be available at http://www.seas.upenn.edu/~jshi/#research
2009 — 2010
Shi, Jianbo
Collaborative Research: 1st Sino-USA Summer School in Vision, Learning, Pattern Recognition, VLPR 2009 @ University of Pennsylvania
The 1st Sino-USA Summer School in Vision, Learning and Pattern Recognition (VLPR 2009) is the first NSF-sponsored summer school in China to bring together leading American and Chinese researchers and students in the fields of computer vision and machine learning for a week of educational and cultural exchange. The summer school will be held on the campus of Peking University in Beijing, China. This award will provide travel support for American researchers and students to attend. The summer school provides a forum not only for technical interaction but also for cultural exchange among researchers and students from the two countries.
2009 — 2011
Lee, Daniel (co-PI) [⬀]; Shi, Jianbo; Daniilidis, Kostas (co-PI) [⬀]; Likhachev, Maxim [⬀]; Kuchenbecker, Katherine
II-EN: Mobile Manipulation @ University of Pennsylvania
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
This project provides infrastructure for the University of Pennsylvania to extend its existing work into the area of mobile manipulation.
The GRASP Lab is an internationally recognized robotics research group at the University of Pennsylvania. It hosts 14 faculty members and approximately 60 Ph.D. students from the primary departments of Computer and Information Science (CIS), Electrical and Systems Engineering (ESE), and Mechanical Engineering and Applied Mechanics (MEAM).
GRASP recently launched a multidisciplinary Master's program in Robotics and involves these students, as well as many Penn undergraduates, in its research endeavors. The research conducted by the members of the GRASP laboratory includes such areas as vision, planning, control, multi-agent systems, locomotion, haptics, medical robotics, machine learning, and modular robotics. This proposal requests funding for instruments that would enable the group to broaden its research to include the important new area of mobile manipulation.
An increasing amount of research in robotics is being devoted to mobile manipulation because it holds great promise in assisting the elderly and the disabled at home, in helping workers with labor intensive tasks at factories, and in lessening the exposure of firemen, policemen and bomb squad members to dangers and hazards. As concluded by the NSF/NASA sponsored workshop on Autonomous Mobile Manipulation (AMM) in 2005, the technical challenges critically requiring the attention of researchers are:
- dexterous manipulation and physical interaction
- multi-sensor perception in unstructured environments
- control and safety near human beings
- technologies for human-robot interaction
- architectures that support fully integrated AMM systems
The researchers at GRASP fully support these recommendations and want to help lead the way in addressing them. Though GRASP conducts a great deal of research on similar challenges in related areas, this group has not worked on the unique and potentially transformative topic of mobile manipulation. The primary inhibitor to pursuing these challenges is that research in mobile manipulation critically depends on the availability of adequate experimental platforms. This group is requesting funding that would allow them to acquire such equipment.
In particular, this group is requesting funding for (a) one human-scale mobile manipulator consisting of a Segway base, Barrett 7-DOF arm, and 3-fingered BarrettHand equipped with tactile sensors and a visual sensing suite; and (b) four small Aldebaran Robotics NAO humanoid robots capable of locomotion and manipulation.
These instruments will allow the GRASP community to perform research in such areas as autonomous mobile manipulation, teleoperated mobile manipulation, navigation among people, bipedal locomotion, and coordination of multiple mobile manipulation platforms. Advances in this area are beneficial to society in a variety of ways. For example, mobile manipulation is one of the most critical areas of research in household robotics, which aims at helping people (especially the elderly and the disabled) with their chores. Mobile manipulators will also see great use in office and industrial settings, where they can flexibly automate a huge variety of manual tasks.
2016 — 2018
Shi, Jianbo
EAGER: Construction of Social Interactions in 3D Space From First-Person Videos @ University of Pennsylvania
Precision modeling tools for realistic and complex human social interaction are not available today. First-person videos provide a unique opportunity to capture social interaction with unprecedented precision. In contrast, current third-person surveillance video passively records only a few distant views of an interaction, at much reduced spatial resolution. This exploratory research project proposes to harness multiple first-person cameras as one collective instrument to capture, model, and predict social behaviors. The proposed research transforms the way realistic social interaction models are constructed, while also advancing first-person video recognition. If successful, the envisioned computational model can act as a coach that learns what constitutes successful interactions and failures, and is thus able to find solutions that mediate and prevent potential conflicts.
The proposed research will model dynamic social interactions in 3D space from multiple personal perspectives. Recognition and prediction of complex social group interactions are challenging because people in the group can carry out unexpected actions, intentionally or by mistake. In addition, owing to variation in individuals' preferences and abilities, the same activities can be carried out in different ways. First-person videos can also be highly jittery, resulting in fast and unpredictable object motions in the field of view. Building on the PI's recent work establishing computational foundations for modeling social (people-people) and personal (people-scene) interactions using first-person cameras, this research will explore the novel concept of a duality between social attention and social roles: social attention provides a cue for recognizing social roles, and social roles facilitate prediction of dynamic changes in social formation and the associated social attention. The formal foundation of the 3D model is a visual memory that stores first-person social experiences in three forms: (a) the geometric social formation, (b) the visual image of the first-person view, and (c) the first-person wearer as seen by nearby third-person views. As a proof of concept, the 3D space model capturing social interactions will be tested on collaborative social tasks such as assembling (Ikea) furniture or building a block house with a group of friends. This research will construct a labeled dataset capturing the interactions and analyze both accuracy in recognizing social roles and precision in predicting the spatial movements of the members of the social interaction. The results of this project, including papers and the dataset, will be disseminated to the public through the project website (http://www.seas.upenn.edu/~hypar/NSF_SocialMemory/nsf_social_visual_memory.html). The software created under this project will be made available to the public through GitHub, a web-based Git repository hosting service.
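The three-part visual memory can be pictured with a small data-structure sketch; all field names, array shapes, and the toy attention heuristic below are assumptions for illustration, not the project's design.

from dataclasses import dataclass, field
import numpy as np

@dataclass
class SocialMemoryEntry:
    timestamp: float
    formation: np.ndarray           # (a) (N, 3) positions of N people in 3D
    gaze: np.ndarray                # (N, 3) unit gaze vectors (social attention)
    first_person_frame: np.ndarray  # (b) (H, W, 3) the wearer's own view
    third_person_patches: list = field(default_factory=list)
                                    # (c) crops of the wearer seen by others

def attention_targets(entry: SocialMemoryEntry) -> np.ndarray:
    """Toy cue: each person's attention target is the group member whose
    direction best matches that person's gaze vector."""
    pos, gaze = entry.formation, entry.gaze
    n = pos.shape[0]
    targets = np.zeros(n, dtype=int)
    for i in range(n):
        rel = pos - pos[i]                      # offsets to everyone else
        norms = np.linalg.norm(rel, axis=1)
        norms[i] = 1.0                          # avoid dividing by zero
        scores = (rel / norms[:, None]) @ gaze[i]
        scores[i] = -np.inf                     # a person cannot target themself
        targets[i] = int(np.argmax(scores))
    return targets

In the duality the abstract describes, attention estimates of this kind would feed role recognition, and recognized roles would in turn constrain predicted changes in the social formation.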
2016 — 2018
Schmidt, Marc F. (co-PI) [⬀]; Bassett, Danielle (co-PI) [⬀]; Lee, Daniel (co-PI) [⬀]; Shi, Jianbo; Daniilidis, Kostas [⬀]
MRI: Development of An Observatory For Quantitative Analysis of Collective Behavior in Animals @ University of Pennsylvania
This project, developing a new instrument to enable accurate quantitative analysis of the movements and vocal expressions of animals in real-world scenes, aims to facilitate innovative research in the study of animal behavior and neuroscience in complex, realistic environments. While much progress has been made in investigating the brain mechanisms of behavior, such studies have been limited primarily to individual subjects in relatively simple settings. For many social species, including humans, understanding neurobiological processes within these more complex environments is critical because their brains have evolved to perceive and evaluate signals within a social context. Today's advances in video capture hardware and storage, and in algorithms from computer vision and network science, make such an instrument possible. Past work has relied on subjective and time-consuming observations of video streams, which suffer from imprecision, low dimensionality, and the limitations of the expert analyst's sensory discriminability. This instrument will not only automate the process of detecting behaviors but also provide an exact numeric characterization in time and space for each individual in the social group. While not explicitly part of the instrument, the quantitative description the system provides will make it possible to correlate social context with neural measurements, a task that can only be accomplished when sufficient spatiotemporal precision has been achieved.
The instrument enables research in the behavioral and neural sciences and the development of novel algorithms in computer vision and network theory. In the behavioral sciences, the instrumentation allows the generation of network models of social behavior in small groups of animals or humans, which can be used to ask questions ranging from how the dynamics of the networks influence sexual selection, reproductive success, and even health messaging, to how vocal decision making by individuals gives rise to social dominance hierarchies. In the neural sciences, the precise spatio-temporal information the system provides can be used to evaluate the neural bases of sensory processing and behavioral decisions under precisely defined social contexts. Sensory responses to a given vocal stimulus, for example, can be evaluated in terms of the context in which the animal heard the stimulus and both its own and the sender's prior behavioral history in the group. In computer vision, the project proposes novel approaches for the calibration of multiple cameras "in the wild", the combination of appearance and geometry for the extraction of exact 3D pose and body parts from video, the learning of attentional focus among animals in a group, and the estimation of sound sources and the classification of vocalizations. New approaches will be used for hierarchical discovery of behaviors in graphs, the incorporation of interactions beyond the pairwise level using simplicial complexes, and a novel theory of graph dynamics for the temporal evolution of social behavior. The instrumentation benefits behavioral and neural scientists; therefore, the code and algorithms developed will be open-source so that the scientific community can extend them for their applications. The proposed work also impacts computer vision and network science because the fundamental algorithms designed should advance the state of the art. For performance evaluation against other computer vision algorithms, established datasets will be employed.
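As one concrete (and heavily simplified) illustration of the graph-dynamics theme, the sketch below converts per-frame 3D positions of tracked animals into a proximity graph; the distance threshold and the use of networkx are assumptions, not the instrument's actual algorithms.

import numpy as np
import networkx as nx

def interaction_graph(positions: np.ndarray, radius: float = 0.5) -> nx.Graph:
    """positions: (N, 3) locations of N animals in one frame.
    Adds edge (i, j) when two animals are within `radius` meters."""
    g = nx.Graph()
    n = positions.shape[0]
    g.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            d = float(np.linalg.norm(positions[i] - positions[j]))
            if d <= radius:
                g.add_edge(i, j, distance=d)
    return g

# One graph per video frame yields a time-indexed sequence whose evolution
# (degree ranks, component changes, higher-order cliques) is the kind of
# signal a dominance-hierarchy or network-dynamics analysis would consume.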