1999 — 2004
Soatto, Stefano
CAREER: Controllable Visual Cues: Analysis and Applications of Images as Sensory Signals in Complex Control Systems
IIS-9876145 Soatto, Stefano Washington University $51,710.00
This is the first year of funding of a four-year continuing award. This project is concerned with the study of vision as a sensor for engineering systems operating in complex, dynamic, and unknown environments. It entails the analysis of measurable properties of images that depend upon controllable parameters such as the geometry and optics of the imaging device. These properties are called "controllable cues" and include, for instance, stereo, motion, and accommodation. The project emphasizes the purposeful aspect of vision: knowledge and representation of the environment are functional only to the accomplishment of control tasks, such as vision-based navigation, docking, manipulation, endoscopic surgery, and human-machine interaction. In order to address the issue of modeling and representation, it is necessary to understand how the geometry and dynamics of the environment are related to the information coming from the imaging sensor. In the long term, these issues will become crucial in the study of complex systems, where low-level information needs to be organized to support effective communication between different levels of a control hierarchy, or between different agents involved in the control structure. The research material will be integrated into an educational plan that spans the graduate, undergraduate, and pre-college levels. In addition, to emphasize physical intuition, the PI plans to develop an experimental setup on the use of controllable cues, accessible to pre-college students through an Internet educational service.
2002 — 2006
Soatto, Stefano
A Variational Framework For Reconstructing 3d Shape and Photometry From Multiple Images @ University of California-Los Angeles
This project aims at designing algorithms to infer 3-D models of the geometry (shape) and photometry (material) of objects from collections of images. Such algorithms are based on representing visual scenes as dense surfaces, defined implicitly as functions of the measured images, and entail the numerical solution of partial differential equations. Our research aims at integrating into a unified analytical framework many ``shape from X'' algorithms for reconstructing spatial properties of a scene from images, including stereo, shape from shading, and shape from motion. Applications of the technology we plan to develop range from geology to medicine, manufacturing, security, and entertainment.
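The implicit-surface idea above can be made concrete with a minimal sketch: a shape is the zero level set of a function phi, and reconstruction proceeds by evolving phi under a PDE. The flow below (constant inward normal speed, with the image-driven data terms such as photo-consistency omitted) is an illustration of the machinery, not the project's actual algorithm.

```python
import numpy as np

def make_sdf_circle(n=128, radius=0.5):
    """Signed distance to a circle on [-1, 1]^2 (negative inside)."""
    x = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(x, x)
    return np.sqrt(X**2 + Y**2) - radius

def evolve(phi, dt=0.01, steps=20):
    """Evolve phi_t = |grad phi| (constant inward normal speed)."""
    h = 2.0 / (phi.shape[0] - 1)
    for _ in range(steps):
        gx, gy = np.gradient(phi, h)
        phi = phi + dt * np.sqrt(gx**2 + gy**2)
    return phi

phi0 = make_sdf_circle()
phi1 = evolve(phi0)
h = 2.0 / (phi0.shape[0] - 1)
area0 = (phi0 < 0).sum() * h * h   # area enclosed by the zero level set
area1 = (phi1 < 0).sum() * h * h
print(area0, area1)                # the implicit shape shrinks
```

In an actual reconstruction, the speed term would be replaced by a data-fidelity gradient measuring agreement between the surface and the input images.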
2002 — 2006
Soatto, Stefano
Modeling, Detection and Recognition of Spatio-Temporal Events From Video @ University of California-Los Angeles
This project addresses the design and analysis of algorithms for the identification of dynamical models of image sequences, for the purpose of detection, classification, and recognition of spatio-temporal events from video. In particular, we concentrate on (segments of) image sequences that satisfy certain statistical regularity conditions, such as second-order stationarity, or certain physical constraints, such as Lambertian reflection. While this does not cover the most general video sequences, generality will follow from compositionality, by segmenting each sequence into portions that do satisfy the assumptions. The purpose of our models is to enable the detection, classification, and recognition of dynamic events, such as the presence of smoke, moving foliage, fire, or walking humans, in live video.
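Models of this kind are often instantiated as linear dynamical systems identified directly from the frames. The sketch below (an assumption for illustration; the abstract does not commit to a specific model class) identifies such a model from synthetic "video" via SVD and least squares, in the spirit of dynamic-texture identification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "video": frames generated by a stable linear dynamical system
#   x[t+1] = A_true x[t],   y[t] = C_true x[t]
n_state, n_pix, T = 4, 50, 30
A_true = rng.standard_normal((n_state, n_state))
A_true *= 0.9 / max(abs(np.linalg.eigvals(A_true)))   # make it stable
C_true = rng.standard_normal((n_pix, n_state))
x = rng.standard_normal(n_state)
Y = np.empty((n_pix, T))
for t in range(T):
    Y[:, t] = C_true @ x
    x = A_true @ x

# Identification sketch: SVD yields the observation matrix C and a state
# trajectory X; least squares on consecutive states yields the dynamics A.
U, S, Vt = np.linalg.svd(Y, full_matrices=False)
C = U[:, :n_state]
X = np.diag(S[:n_state]) @ Vt[:n_state]
A, *_ = np.linalg.lstsq(X[:, :-1].T, X[:, 1:].T, rcond=None)
A = A.T

# One-step prediction error of the identified model
pred = C @ A @ X[:, :-1]
err = np.linalg.norm(pred - Y[:, 1:]) / np.linalg.norm(Y[:, 1:])
print(err)   # essentially zero on noiseless synthetic data
```

The identified (A, C) pair then serves as the "signature" of the segment, usable both for synthesis (running the system forward) and for recognition (comparing models).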
2005 — 2006
Soatto, Stefano
Frontiers of Vision: a Collaborative Workshop On Computer Vision and Its Role in Application With Broad Societal Impact @ University of California-Los Angeles
Abstract
IIS-0513521 Soatto, Stefano University of California-Los Angeles
Computer Vision is a relatively young but rapidly evolving field. The academic community is growing rapidly, as measured for instance by the number of submissions to the Computer Vision and Pattern Recognition conference, which has more than tripled since 1998 (453 submissions in 1998, 466 in 2000, 920 in 2001, 1100 in 2003, 1300 in 2004, and 1500 in 2005). Such growth is partly fueled by increased demand for the use of vision in security and monitoring applications, medical imaging, robotics, persistent ISR, etc. A key stepping-stone was reached in the mid-nineties with the availability of commercial off-the-shelf hardware to acquire and process imaging data in real time, enabling an entirely new host of applications where vision is used as a sensor in the loop of real-time control. Vision is coming to play a key role in problems of broad societal and scientific impact, from transportation to space, entertainment, environmental monitoring, medicine, and security. The increasing role that vision plays in applications is also documented by the increased investment in vision research by large industrial groups, from Microsoft to Siemens, General Electric, Mitsubishi, Honda, etc.
Despite such a booming trend, the academic and research community remains heavily fragmented. Researchers come to vision from different backgrounds (mathematics, physics, engineering, computer science, neuroscience) and speak different technical languages (from graph theory to partial differential equations to information theory to dynamical systems and control theory). The community is also heavily segmented into sub-areas with few successful attempts to vertically integrate different approaches. Unlike other disciplines that have a common set of analytical tools and speak a unified language or that have a common set of benchmark tasks, Computer Vision remains a collection of separate approaches to separate problems, with no unifying theory, or even rational comparison between different approaches or common experimental benchmarks.
The goal of this workshop is to provide a venue for leading researchers to share their views on the future directions in Computer Vision, draw connections from different areas, and propose ways to help improve the impact of the community in applications that have broad societal consequences.
2005 — 2010
Pau, Giovanni (co-PI); Gerla, Mario; Fitz, Michael; Soatto, Stefano
Nets-Prowin: Emergency Ad Hoc Networking Using Programmable Radios and Intelligent Swarms @ University of California-Los Angeles
The focus of this project is the networking of intelligent, autonomous swarms. The typical application scenario is a disaster area that requires the intervention of police, firefighters, paramedics, etc., but where the unfriendly environment bars direct access. The swarm establishes a communications network between the rescue teams and all critical fixed and mobile sensors and actuators in the disaster area. In the aftermath of a disaster, typically some disconnected "islands" of sensors, monitors, and actuators have survived the impact. A rapidly deployed swarm of air/ground agents reestablishes network connectivity, restores access to critical sensor probes, installs new probes as necessary, and helps the collection and filtering of relevant data. This goal is achieved through the concurrent interworking of several elements: agile, programmable radios that can work in adverse, highly heterogeneous conditions; flexible network protocols for swarm-to-swarm communications and for "mobile backbone" deployment; adaptive video streaming; and advanced vision processing for location and motion estimation. Scientific contributions and broader impacts will include: (1) a robust, reconfigurable Mobile Backbone design methodology for emergency networking; (2) modular, flexible, programmable MAR radio technology for unfriendly/hostile scenarios; (3) real-time video streaming and "delayed" forwarding; (4) in-swarm processing of video data for image registration and mosaicing, including partial 3-D reconstruction and matching to existing blueprint and mapping data; and (5) region-of-interest computation for visual odometry and swarm configuration refinement.
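One ingredient named in item (4), image registration for mosaicing, can be illustrated with phase correlation, a standard technique for recovering a translation between two frames (the project does not specify this particular method; it is shown here for concreteness).

```python
import numpy as np

def phase_correlation(a, b):
    """Recover the cyclic shift (dy, dx) that maps frame a onto frame b."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = np.conj(Fa) * Fb
    cross /= np.abs(cross) + 1e-12          # keep only the phase
    corr = np.fft.ifft2(cross).real         # impulse at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx

rng = np.random.default_rng(1)
a = rng.standard_normal((64, 64))           # a synthetic "frame"
b = np.roll(a, (5, 12), axis=(0, 1))        # the same frame, translated
dy, dx = phase_correlation(a, b)
print(dy, dx)                               # recovers the shift (5, 12)
```

Real mosaicing would extend this with rotation/scale handling and blending, but the correlation peak above is the core alignment step.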
2006 — 2010
Soatto, Stefano
Adaptive and Intelligent Systems: Models of Photometric, Geometric and Dynamic Characteristics of Video Imagery For Segmentation, Classification and Synthesis, Including Layers @ University of California-Los Angeles
Intellectual Merit:
The objective of this research is to develop stochastic dynamical models of video imagery for the purpose of synthesis and classification of spatio-temporal events in video. The approach is based on exploiting tools from dynamical systems theory, statistical signal processing, differential geometry and functional analysis in order to infer the spatio-temporal statistics of a video segment and learn (identify) a dynamical model that represents its "signature". This allows generating novel portions of a video segment, or recognizing it in previously unseen video. The investigators will develop identification algorithms for such models, segmentation schemes to partition their spatio-temporal domain into statistically coherent regions, and endow them with a metric structure to enable classification and recognition.
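The "metric structure" on dynamical models mentioned above admits a concrete sketch using one standard choice (not dictated by the abstract): a subspace-angle distance between linear models (A, C), computed from their extended observability matrices. Models that differ only by a change of state basis should be at distance near zero.

```python
import numpy as np

def observability(A, C, k=20):
    """Stack C, CA, CA^2, ... into an extended observability matrix."""
    rows, M = [], C.copy()
    for _ in range(k):
        rows.append(M)
        M = M @ A
    return np.vstack(rows)

def subspace_distance(m1, m2, k=20):
    """Distance from principal angles between observability subspaces."""
    Q1, _ = np.linalg.qr(observability(*m1, k))
    Q2, _ = np.linalg.qr(observability(*m2, k))
    s = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    s = np.clip(s, 0.0, 1.0)                # cosines of principal angles
    return np.sqrt(np.sum(np.arccos(s) ** 2))

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)); A *= 0.8 / max(abs(np.linalg.eigvals(A)))
C = rng.standard_normal((10, 3))
T = rng.standard_normal((3, 3))             # a change of state basis
A2, C2 = np.linalg.inv(T) @ A @ T, C @ T    # the same model, re-coordinatized
B = rng.standard_normal((3, 3)); B *= 0.8 / max(abs(np.linalg.eigvals(B)))

d_same = subspace_distance((A, C), (A2, C2))   # basis change: distance ~ 0
d_diff = subspace_distance((A, C), (B, C))     # different dynamics: > 0
print(d_same, d_diff)
```

Invariance to the state basis is the key property: identification only recovers models up to such a change of coordinates, so any usable metric must ignore it.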
Broader Impacts:
The models developed will allow the generation of synthetic portions of video, and the manipulation of their spatio-temporal statistics, which is relevant for compression and transmission, and post-production editing and development of interactive games. Furthermore, these models support classification tasks, including detection of events of interest in video and segmentation into spatio-temporal segments. This is important for video-based recognition in security, surveillance, video coding, and environmental monitoring (remote detection of fire, smoke, steam). A particular class of spatio-temporal processes studied includes human motion. The investigators will develop analytical and computational tools to enable the detection and recognition of individuals and their gait from video data. Training students in such a diverse set of analytical and computational tools is a challenge, but one that must be tackled in a modern engineering academic environment.
2010 — 2011
Soatto, Stefano
Frontiers of Activity Recognition @ University of California-Los Angeles
This award is made in support of a collaborative project called "Frontiers in Activity Recognition," whereby a group of experts from different fields of computer science, engineering, mathematics, and statistics convene in a workshop to be held in the vicinity of UCLA. One component of the workshop consists of interactive break-out sessions where different approaches to activity representation (descriptors) and recognition will be analyzed. A second component consists of a competition, announced to the broad public ahead of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), whereby an extensive dataset provided by a third party will be released, with benchmarks, and contestants will be invited to submit their best results in the detection of a number of action categories. The proposers of high-ranking approaches will be invited to the workshop to present their results and discuss them in the context of the analysis of the state of the art to be performed as part of the field assessment. The workshop can have broad impact on many applications, ranging from security (surveillance, monitoring) to environmental science (habitat monitoring, global warming), industrial operations (factory floor optimization), multi-media and information retrieval (content-based video meta-data extraction), entertainment (input devices for games), and transportation (driver assistance).
2010 — 2013
Soatto, Stefano; Estrin, Deborah (co-PI)
Remote Sensing For Early Detection of Wildfires @ University of California-Los Angeles
Early detection of wildfires is critical to mounting a successful response, and a growing need given trends in both urbanization and climate change. As recently as a few months ago, large wildfires started in inaccessible, unmonitored areas and threatened large urban areas for weeks (witness the Station Fire in the Angeles National Forest). Manned observation towers, the method of choice in decades past, have become unsustainable as the boundaries of urban sprawl grow and the budgets of local governments come under strain.
This project tackles the problem head-on by developing algorithmic and engineering tools for remote detection of incipient fires using networked remote optical sensors in the visible and infra-red spectra. While previous efforts suggested blanketing the target area with networked temperature and smoke sensors, this approach does not scale well because it requires sensors to be close to the source in order to trigger an alarm. Remote sensors can detect events at a distance and are only limited by the topography of the environment. Thus one strategically placed camera can monitor an entire valley and would ultimately be suitable for co-deployment with other infrastructure such as cell towers. However, processing these video streams is not trivial since events of interest, such as the inception of a fire, can manifest themselves in a large number of ways depending on time of the day, season, weather, distance from the sensor, fuel etc. The challenge is to tease apart these "nuisance effects" and only detect events of interest. The team will focus on the algorithmic challenge of inferring spatio-temporal events in video streams, and on the systems trade-offs between computation, communication and energy resources.
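The separation of events of interest from nuisance variability described above is often bootstrapped with a simple temporal background model. The sketch below is an illustration only (not the project's detector): a per-pixel running-average background, with deviating pixels flagged, standing in for the appearance of an incipient plume in an otherwise static scene.

```python
import numpy as np

def detect_changes(frames, alpha=0.05, thresh=0.5):
    """Return a per-pixel change mask for the last frame in the sequence."""
    bg = frames[0].astype(float)
    for f in frames[1:]:
        mask = np.abs(f - bg) > thresh       # pixels deviating from background
        bg = (1 - alpha) * bg + alpha * f    # slow background update
    return mask

rng = np.random.default_rng(3)
scene = rng.uniform(0.0, 0.2, size=(32, 32))                  # static scene
frames = [scene + 0.01 * rng.standard_normal(scene.shape)     # sensor noise
          for _ in range(20)]
event = scene.copy()
event[10:15, 10:15] += 1.0                   # a bright 5x5 "plume" appears
frames.append(event)

mask = detect_changes(frames)
print(mask.sum())                            # the 25 plume pixels are flagged
```

Handling the nuisances listed in the abstract (time of day, season, weather, distance, fuel) is precisely what makes the real problem hard; this baseline only separates an abrupt event from a stationary background.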
2010 — 2012
Pearl, Judea (co-PI); Soatto, Stefano
RI: Small: Modeling and Parsing Time Series For Causal Analysis With Application to Action Interpretation in Video of Natural and Man-Made Environments @ University of California-Los Angeles
This project tackles the development of new tools for the semantic analysis of temporal signals, in particular (but not restricted to) video sequences. While most of the emphasis in video analysis so far has been at the low level, the investigators plan to explore the use of Causal Analysis to perform inference and decisions in analyzing video signals. The challenge in this project is to bridge the gap between basic descriptors at the signal level and Causal Calculus, which acts on semantically meaningful representations. In particular, long-range prediction, not just short-range continuous extrapolation, requires the development of new tools that allow "interventions" into the model: how would the state "X" evolve if event "Y" were to occur? To attain the goals set forth in the proposal, the investigators must tackle fundamental problems in the analysis of time series at the low level (defining a proper notion of "distance" between time series that respects their intrinsic dynamics), at the mid-level (defining clustering schemes for action segments), and at the high level (developing action semantics in an abductive framework).
During this one-year pilot project, the investigators plan to explore the feasibility of using causal analysis for long-range temporal prediction of events and actions from visual data. In case of success, the applications impacted range broadly from surveillance to environmental monitoring to driver assistance in transportation, with significant societal impact in reducing traffic accidents.
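The "interventions" discussed above can be made concrete on a toy discrete model (hypothetical numbers, for illustration only): a confounder Z influences both an action X and an outcome Y, so the interventional quantity P(Y=1 | do(X=1)), computed by the adjustment formula of Causal Calculus, differs from the observational P(Y=1 | X=1).

```python
# Toy model: Z -> X, Z -> Y, X -> Y, with made-up probabilities.
p_z = {0: 0.5, 1: 0.5}                       # P(Z = z)
p_x_given_z = {0: 0.2, 1: 0.8}               # P(X = 1 | Z = z)
p_y_given_xz = {(0, 0): 0.1, (0, 1): 0.5,    # P(Y = 1 | X = x, Z = z)
                (1, 0): 0.3, (1, 1): 0.9}

# Adjustment formula: P(Y=1 | do(X=1)) = sum_z P(Y=1 | X=1, z) P(z)
p_do = sum(p_y_given_xz[(1, z)] * p_z[z] for z in (0, 1))

# Observational conditioning weights z by P(z | X=1) instead of P(z)
joint_x1 = {z: p_x_given_z[z] * p_z[z] for z in (0, 1)}
norm = sum(joint_x1.values())
p_obs = sum(p_y_given_xz[(1, z)] * joint_x1[z] / norm for z in (0, 1))

print(p_do, p_obs)   # 0.6 vs 0.78: seeing X=1 is not the same as doing X=1
```

The gap between the two numbers is the confounding that an interventional model must remove before long-range "what if" prediction on video events becomes meaningful.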
2014 — 2017
Soatto, Stefano
RI: Small: Engineering and Learning Visual Representations @ University of California-Los Angeles
Visual data, including video imagery, convey "information" about objects of interest within the scene: shape, material, identity, relations, etc. However, such data are also highly redundant and subject to variability that has little to do with the properties of the scene of interest, depending instead on the sensor, the vantage point, the nature of the illuminant, etc. This project addresses the question of determining what function of imaging data should be inferred and stored that is as "informative" as possible for a class of tasks, such as object or scene detection, localization, recognition, and categorization, and at the same time as "compressed" as possible and insensitive to nuisance variability. Such a function is called a Representation. This research has pedagogical value, framing seemingly unrelated methods as different approximations of an ideal Representation and thus facilitating the educational process in Computer Vision. It is further expected to facilitate the design of better Representations, and therefore improved algorithms for visual recognition (detection, localization, recognition, and categorization), with impact in a range of applications from autonomy (e.g., robotic navigation and surveillance) to interaction (e.g., assisted surgery and augmented reality).
The project frames the problem of inferring optimal task-specific Representations in terms of the Information Bottleneck Principle, and addresses issues of computability, approximation, and dimensionality reduction within this framework. It also addresses questions of "learnability," to determine whether a generic learning architecture can approximate an optimal representation. The Information Bottleneck is a generalization and relaxation of the notion of minimal sufficient statistic, where complexity constraints and task relevance are explicitly taken into account. The challenge is that modeling the generative process for visual data entails complex geometry (surface shape), topology (occlusions), photometry (material reflection, illumination), and dynamics (motion) with the object of interest living in infinite-dimensional spaces. Thus, the Information Bottleneck is difficult to even formalize, let alone instantiate, compute, and optimize. The project focuses on developing approximations of the Information Bottleneck that are tractable and yet enjoy performance guarantees.
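On a small discrete joint distribution, the Information Bottleneck is directly computable, which makes the principle above concrete (and highlights the project's point: visual data do not come with such a tractable joint). The sketch below iterates the standard self-consistent updates for the encoder p(t|x) on a random toy joint; it is an illustration, not the project's method.

```python
import numpy as np

rng = np.random.default_rng(4)
p_xy = rng.random((6, 4)); p_xy /= p_xy.sum()      # toy joint P(X, Y)
p_x = p_xy.sum(1)
p_y_x = p_xy / p_x[:, None]                        # P(Y | X)
n_t, beta = 3, 3.0                                 # bottleneck size, trade-off

# Random initial encoder P(T | X), rows normalized
q_t_x = rng.random((6, n_t)); q_t_x /= q_t_x.sum(1, keepdims=True)

for _ in range(100):
    q_t = np.maximum(p_x @ q_t_x, 1e-12)                      # P(T)
    q_y_t = (q_t_x * p_x[:, None]).T @ p_y_x / q_t[:, None]   # P(Y | T)
    q_y_t = np.maximum(q_y_t, 1e-12)
    # KL(p(y|x) || q(y|t)) for every (x, t) pair drives the reassignment
    kl = np.array([[np.sum(p_y_x[x] * np.log(p_y_x[x] / q_y_t[t]))
                    for t in range(n_t)] for x in range(6)])
    q_t_x = q_t[None, :] * np.exp(-beta * kl)
    q_t_x /= q_t_x.sum(1, keepdims=True)

print(q_t_x.round(3))   # learned soft assignment of each x to a cluster t
```

Larger beta pushes the encoder toward hard, task-relevant clusters; the project's challenge is to approximate this optimization when X is an image stream and the joint is only accessible through a generative model.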