2005 — 2009
Su, Zhendong; Chen, Hao (co-PI); Chuah, Chen-Nee
NeTS-NBD: Automatic Validation, Optimization, and Adaptation of Distributed Firewalls For Network Performance and Security @ University of California-Davis
As the Internet becomes an essential part of our everyday computing and communication infrastructure, it has also grown to be a complex distributed system that is hard to characterize. There have been numerous studies on network topology, IP reachability, and routing dynamics to analyze end-to-end packet forwarding performance. However, there is very little systematic investigation into the influence of other packet transformations that happen along the path, e.g., firewalls, packet filtering, and quality-of-service mapping. Among these, firewalls are ubiquitous, as they have become indispensable security defense mechanisms in business and enterprise networks. Just as router misconfigurations can lead to unpredictable routing problems, misconfigured firewalls may fail to enforce the intended security policies or may incur high packet processing delay. Unfortunately, firewall configuration for a large, complex enterprise network is a demanding and error-prone task, even for experienced administrators. Firewalls can be distributed in many parts of the network or across layers (IP-layer filtering versus application-layer solutions) to cooperatively achieve a global, network-wide policy. As distributed firewall rules are concatenated, it becomes extremely difficult to predict the resulting end-to-end behavior and whether it meets the higher-level security policy.
Intellectual merit: In this project, the principal investigators (PIs) propose to develop a unified framework for policy checking, optimization, and auto-reconfiguration of distributed firewalls. This research will provide novel analysis and design techniques and tools to better protect our critical information infrastructures from attacks. The PIs will explore providing consistent and efficient security protection for an enterprise that may have geographically distributed business networks served by different local Internet Service Providers. They adopt an interdisciplinary technical approach that leverages multi-way communication among the three PIs, whose expertise spans networking, security, and programming languages and compilers, to design an integrated solution. In particular, the PIs propose a systematic treatment of the problem by casting it as a static program analysis question, exploiting well-established and rigorous techniques from the area of programming languages and compilers. The PIs will pursue the following closely related tasks:
Policy Validation for Security: The PIs first classify all possible policy anomalies (including both inconsistencies and inefficiencies) in firewall configurations. They will model firewalls as finite-state transition systems and apply symbolic model checking techniques on these finite-state representations to detect both intra-firewall and inter-firewall policy anomalies. The policy validation method consists of two phases. First, they perform control-flow analysis to identify all possible flow paths. Second, they perform data-flow analysis to check for anomalies on every path. Most intra-firewall and inter-firewall anomalies can be identified in one traversal; the per-path processing results are then used to identify inter-path misconfigurations.
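To make the anomaly classification concrete, the following is a minimal Python sketch of one such check: detecting shadowed rules, where an earlier rule with a different action matches every packet a later rule matches. The two-field rule format and the `find_shadowed` helper are illustrative assumptions, not the project's actual tooling, which also handles ports, protocols, and inter-firewall paths.

```python
# Minimal sketch: detecting one intra-firewall anomaly (rule shadowing).
# A rule here is a (src_range, dst_range, action) triple over integer
# address ranges; real rules also cover ports and protocols.

def covers(outer, inner):
    """True if range `outer` fully contains range `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def find_shadowed(rules):
    """A rule is shadowed if an earlier rule matches every packet it
    matches but takes a different action, so it can never fire."""
    anomalies = []
    for i, (src_i, dst_i, act_i) in enumerate(rules):
        for j in range(i):
            src_j, dst_j, act_j = rules[j]
            if covers(src_j, src_i) and covers(dst_j, dst_i) and act_j != act_i:
                anomalies.append((j, i))
                break
    return anomalies

rules = [
    ((0, 255), (0, 65535), "deny"),    # rule 0: deny the whole block
    ((10, 20), (80, 80),   "accept"),  # rule 1: shadowed by rule 0
]
print(find_shadowed(rules))  # [(0, 1)]
```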
Policy Optimization for Performance: In a typical firewall setting, a packet is compared against a list of rules sequentially until the packet matches a rule. Firewalls with complex rule sets can cause significant delays on network traffic and therefore become a bottleneck (especially in high-speed networks) and an attractive target for DoS attacks. Therefore, it is important to optimize packet filtering to meet network Quality of Service (QoS) requirements. In addition, the total number of rules configured and the order of rules also play major roles in the load and efficiency of a firewall. The PIs approach this problem by representing filtering rules as binary decision diagrams (BDDs) and generating "optimal filter rule sets" from the internal BDD representation. They also apply dataflow analysis to hoist identical or similar rules from different paths to a common location to reduce traffic. They will leverage the underlying network topology, routing, and traffic distribution information in the optimization step to improve the efficiency of firewall checking, which enhances packet-forwarding performance. The key advantage of this approach is the ability to proactively prevent vulnerabilities in firewalls, since static analysis can be applied before the actual deployment of firewalls.
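The following is a minimal sketch of the BDD idea behind this optimization, assuming a toy policy over a handful of packet bits: hash-consing shares equivalent sub-policies and collapses redundant tests, which is what makes generating compact "optimal" rule sets from the diagram possible. The `mk` and `build` helpers are hypothetical names; production work would build on a mature BDD library.

```python
# Minimal BDD sketch: nodes are (var, low, high) triples, hash-consed so
# equal sub-policies are shared and redundant tests collapse away.
_table = {}

def mk(var, low, high):
    if low == high:               # both branches agree: the test is redundant
        return low
    return _table.setdefault((var, low, high), (var, low, high))

def build(policy, nbits, assignment=(), bit=0):
    """Shannon-expand `policy` (a predicate over packet bits) into a BDD."""
    if bit == nbits:
        return "ACCEPT" if policy(assignment) else "DENY"
    lo = build(policy, nbits, assignment + (0,), bit + 1)
    hi = build(policy, nbits, assignment + (1,), bit + 1)
    return mk(bit, lo, hi)

# Toy policy over 3 packet bits: accept exactly when bit 0 is set.
print(build(lambda bits: bits[0] == 1, nbits=3))
# (0, 'DENY', 'ACCEPT'): the tests on bits 1 and 2 collapsed away entirely
```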
Broader Impacts: The proposed research efforts will help system and network administrators to configure networked systems more securely and efficiently. The educational component, which is directed at both undergraduate and graduate students, complements the research activities. Research results will be incorporated into new and existing courses. The PIs will actively participate in UC Davis' minority outreach programs to recruit students from underrepresented groups into science and engineering. In addition, firewall configuration tools developed in the project will be distributed for teaching, research, and experimental evaluation.
2006 — 2012
Su, Zhendong
CAREER: Reliability and Security of Database and Web Applications @ University of California-Davis
CCF-0546844
Database and web applications contain many critical faults, seriously undermining the security and reliability of our national information system infrastructures. Many such errors are introduced because of these applications' dynamic and complex interactions with outside environments, such as dynamically constructing queries to access a database. Thus, it is important to automatically enforce the correctness of these interactions, but no such technique or tool exists; consequently, these applications are susceptible to serious failures and security threats. This project proposes a novel, systematic analysis framework to address this problem. With static analysis and runtime checking as its foundation, this framework will offer a rigorous and comprehensive approach to validate and enforce large classes of high-level properties specific to database and web applications. The educational component includes the development of an interdisciplinary course integrating concepts from databases, programming languages, software engineering, and computer security to address the educational need of engineering robust database-intensive applications. This project is expected to impact both industry and academia. Analysis tools developed in the project will be distributed for teaching, research, and experimental evaluation at other institutions and by industry. The project will also produce publicly available instructional materials, including web sites and courseware, that help people write more reliable and secure database and web applications, and it will serve as a clearinghouse for development tools such as the ones from this project.
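One idea from this line of work is that a dynamically constructed query is safe only if user input cannot change the query's syntactic structure. A minimal sketch of such a runtime check follows; the regex tokenizer and the `checked_query` helper are simplified stand-ins for a real SQL grammar, so this illustrates the idea rather than the project's actual technique.

```python
# Sketch: user input may only instantiate a string literal; if it alters
# the token structure of the query, the check rejects it.
import re

TOKEN = re.compile(r"'(?:[^']|'')*'|\w+|[^\s\w]")

def shape(query):
    """Token sequence with string literals abstracted to LIT."""
    return ["LIT" if t.startswith("'") else t.upper()
            for t in TOKEN.findall(query)]

def checked_query(template, user_input):
    benign = template.format(v="'x'")                 # reference structure
    actual = template.format(v="'" + user_input + "'")
    if shape(actual) != shape(benign):
        raise ValueError("input altered query structure")
    return actual

print(checked_query("SELECT * FROM users WHERE name = {v}", "alice"))
try:
    checked_query("SELECT * FROM users WHERE name = {v}", "' OR '1'='1")
except ValueError as e:
    print(e)           # the injection changes the token shape: rejected
```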
2006 — 2011
Wu, Shyhtsun; Su, Zhendong
Collaborative Research: CT-T: A Vertical Systems Framework For Effective Defense Against Memory-Based Attacks @ University of California-Davis
The security of national information infrastructures is undermined by constant malicious attacks exploiting vulnerabilities in systems software. Most existing attacks exploit memory-based flaws, such as stack or heap overflows, and format string vulnerabilities. Current defense mechanisms, either network- or host-based, are not sufficient against many advanced attacks such as polymorphic or metamorphic worm exploits. This project aims to provide a comprehensive framework for detecting, analyzing, and exterminating such attacks. The PIs take an interdisciplinary approach, combining their expertise in computer architecture, computer and network security, programming languages, compilers, and software engineering to tackle this difficult problem. In particular, the PIs propose a layered defense and analysis framework that consists of: (1) an architecture layer for detecting and recovering from unknown attacks; (2) an analysis layer for diagnosing attacks and generating attack signatures; and (3) a testing layer for discovering and fixing unknown software vulnerabilities. The intellectual merit of this project lies in the advanced techniques it develops to defend against unknown, large-scale memory-based attacks.
This interdisciplinary project enables an effective approach to the problem and advances knowledge in each of the requisite disciplines, combining systems concepts with advanced programming language and analysis techniques. The broader impact of this project is the potential for a more reliable and secure information systems infrastructure. This will have tremendous economic impact on society because of our growing reliance on information technologies. Research results from this project (such as systems, simulators, and tools) will be widely disseminated so that they can be further evaluated, enhanced, and adopted to benefit other researchers and industry.
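As one concrete glimpse of the analysis layer's signature-generation task, the sketch below derives a byte-level signature as the longest substring shared by captured payloads; real generators must be far more robust against polymorphic payloads, and the `signature` helper here is purely illustrative.

```python
# Hedged sketch: an attack signature as the longest byte sequence shared
# by all observed malicious payloads.
from difflib import SequenceMatcher
from functools import reduce

def common_substring(a, b):
    m = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]

def signature(payloads):
    """Invariant byte sequence shared by all observed payloads."""
    return reduce(common_substring, payloads)

payloads = [b"GET /\x90\x90\x90\xeb\x1fEXPLOIT HTTP/1.0",
            b"POST /a\xeb\x1fEXPLOIT\x90 HTTP/1.1"]
print(signature(payloads))  # b'\xeb\x1fEXPLOIT'
```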
2007 — 2011
Su, Zhendong
Program Analysis For Reliable Numerical Software @ University of California-Davis
CCF-0702622
Many software systems involve numerical computation, and numerical errors in these systems can be disastrous. Well-known examples include the loss of the Mars Climate Orbiter (due to a misuse of measurement units) and the explosion of the Ariane 5 rocket (due to an overflow). Studies show that such errors often occur, even in well-tested code, because numerical software is difficult to test and few static tools exist to help detect such errors.
This project aims at developing practical program analysis techniques and tools to help avoid common classes of numerical errors. It focuses on three main aspects of the problem: (1) automatic dimensional analysis and unit checking, (2) static detection of uncaught exceptions such as overflows and underflows, and (3) static estimation of error propagation and numerical stability. The general approach is to cast these problems as constraint-based program analyses by modeling the formal semantics of the IEEE floating-point standards and designing approximate abstract semantics with appropriate constraint formalisms. For wide dissemination of the research results, analysis tools developed in the project will be distributed to the public domain for use in teaching, research, and experimental evaluation.
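A minimal sketch of the dimensional-analysis idea follows: carry base-unit exponents alongside values so that ill-typed additions, the Mars Climate Orbiter class of bug, are rejected. The `Quantity` class and its unit encoding are illustrative assumptions; the project's analyses work statically on source code rather than at runtime.

```python
# Track base-unit exponents with each value; addition demands equal units.
class Quantity:
    def __init__(self, value, units):
        self.value, self.units = value, dict(units)   # e.g. {"m": 1, "s": -2}

    def __mul__(self, other):
        u = dict(self.units)
        for dim, exp in other.units.items():
            u[dim] = u.get(dim, 0) + exp
            if u[dim] == 0:
                del u[dim]                            # dimension cancels out
        return Quantity(self.value * other.value, u)

    def __truediv__(self, other):
        inv = {d: -e for d, e in other.units.items()}
        return self * Quantity(1.0 / other.value, inv)

    def __add__(self, other):
        if self.units != other.units:
            raise TypeError(f"unit mismatch: {self.units} vs {other.units}")
        return Quantity(self.value + other.value, self.units)

dist, time = Quantity(3.0, {"m": 1}), Quantity(2.0, {"s": 1})
speed = dist / time
print(speed.value, speed.units)   # 1.5 {'m': 1, 's': -1}
# dist + time  would raise TypeError: the addition is dimensionally ill-typed
```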
2009 — 2014
Su, Zhendong
TC: Small: Runtime and Static Analysis For Web Application Security @ University of California-Davis
Web applications are prevalent and enable much of today's online business including banking, shopping, university admissions, and various governmental activities. However, the quality of such applications is usually low, and they are increasingly popular targets for attack. This project aims at developing practical testing and analysis mechanisms and tools to secure web applications. In particular, it focuses on developing novel, principled techniques to address the following research issues: (1) how to formalize security threats in web applications; (2) how to provide runtime security for deployed applications via dynamic monitoring; and (3) how to provide static security enforcement during application development and testing.
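As an illustration of the dynamic-monitoring direction, here is a minimal taint-tracking sketch: input from the client is marked tainted, taint survives string concatenation, and an HTML sink refuses tainted data unless it has passed through a sanitizer. The `Tainted`, `sanitize`, and `render` names are hypothetical; a deployable monitor would instrument the runtime far more thoroughly.

```python
# Taint-tracking sketch: taint propagates through concatenation and is
# cleared only by sanitization; the sink check enforces the policy.
import html

class Tainted(str):
    def __add__(self, other):                 # taint propagates rightward
        return Tainted(str(self) + str(other))
    def __radd__(self, other):                # ... and leftward
        return Tainted(str(other) + str(self))

def sanitize(s):
    return str(html.escape(s))                # escaping clears the taint

def render(page):
    if isinstance(page, Tainted):
        raise RuntimeError("tainted data reached an HTML sink")
    return page

user = Tainted("<script>alert(1)</script>")   # from an HTTP parameter
print(render("<p>" + sanitize(user) + "</p>"))  # sanitized: allowed
try:
    render("<p>" + user + "</p>")               # unsanitized: monitor fires
except RuntimeError as e:
    print(e)
```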
The project is interdisciplinary, touching upon a number of requisite areas including computer security, software engineering, and programming languages. It has the potential to advance knowledge in each of these disciplines with novel formulations of security requirements, systems concepts, and advanced testing and analysis techniques. The project also has the potential for significant industrial and societal impact. Through the proposed research, education, and outreach activities, the project will empower web application developers with the knowledge, methodologies, and development tools to build secure web applications. Testing and analysis tools developed in the project will also be distributed to other institutions and industry for teaching, research, and experimental evaluation.
2010 — 2015
Barr, Earl (co-PI); Su, Zhendong; Filkov, Vladimir (co-PI); Devanbu, Premkumar
SHF: Medium: How Do Static Analysis Tools Affect End-User Quality @ University of California-Davis
The perceived quality of a software product depends strongly on field failures, viz. defects experienced by users after the software is released to the field. Software managers work to constantly improve quality control processes, seeking to reduce the number of field failures. Static analysis is a powerful and elegant technique that finds defects without running code, by reasoning about what the program does when executed. It has been incubating in academia and is now emerging in industry. This research asks the question: how can the performance and practical use of static analysis tools be improved? The goal of the research is to find ways to improve the performance of static analysis tools, as well as the quality-control processes that use them. This will help commercial and open-source organizations make more effective use of static analysis tools, and substantially reduce field failures.
Using historical data from several open-source and commercial exemplars, the research will retrospectively evaluate the association of field failures with static analysis warnings. The research will evaluate the impact of factors such as the experience of the developer, the complexity of the code, and the type of static analysis warning on failure properties such as criticality and defect latency (the time until a defect becomes a failure). A wide variety of projects will be studied, including both commercial and open-source ones. The resulting data will be analyzed using statistical modeling to determine the factors that influence the success of static analysis tools in preventing field failures. Some field failures may have no associated static analysis warnings; this research will identify and characterize these failures, paving the way for new static analysis research. An integrated educational initiative in this proposal is the training of undergraduates using bug fixes as pedagogical material; undergraduates will also help annotate the corpus of field failures with information relevant to the analysis. An important byproduct of this research is a large, diverse, annotated corpus of field failures of use to other educators and researchers in empirical software engineering, testing, and static analysis.
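A tiny sketch of the kind of retrospective association the statistical modeling starts from: cross-tabulate files by whether they carried a warning at release and whether a field failure was later traced to them, then compute an odds ratio. The data and the file-level granularity are illustrative assumptions; the actual study models many more factors.

```python
# Hedged sketch: 2x2 association between warnings and later field failures.
def odds_ratio(files):
    """files: iterable of (had_warning: bool, had_field_failure: bool)."""
    table = {(w, f): 0 for w in (True, False) for f in (True, False)}
    for w, f in files:
        table[(w, f)] += 1
    a, b = table[(True, True)], table[(True, False)]
    c, d = table[(False, True)], table[(False, False)]
    return (a * d) / (b * c)   # > 1: warnings associate with failures

sample = ([(True, True)] * 12 + [(True, False)] * 38 +
          [(False, True)] * 9 + [(False, False)] * 141)   # made-up counts
print(round(odds_ratio(sample), 2))  # 4.95
```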
2011 — 2016
Barr, Earl (co-PI); Su, Zhendong
SHF: Small: Reusing Debugging Knowledge @ University of California-Davis
Developers spend most of their time debugging software. This effort results in perfective changes to their applications, but is otherwise lost. No central repository exists that stores all bug descriptions and fixes. The reason for this state of affairs is the belief that debugging is an idiosyncratic, context-specific task that does not generalize. In contrast, this project hypothesizes that applications decompose into smaller, similar problems and that limitations of the human mind imply that we are likely to make similar errors when confronted by similar problems. In practice, most developers agree. When fixing a bug, a developer often begins by searching for similar bugs that have been reported and resolved in the past, because a fix for a similar bug can help them understand, or even directly fix, their own bug. In short, the problem is that the knowledge gained during debugging is wasted: it is either not stored or not searchable.
This project seeks to revolutionize debugging by capturing and reusing debugging knowledge. Developers can leverage the knowledge of the community to speed up debugging. To this end, it proposes to create a universal bug repository capable of storing all bug information, indexed on bug traces. To speed debugging, this repository will support efficiently finding similar bugs and their fixes. It will be the basis of automatic debugging tools that match closed bugs to an open bug, then test the applicability of their fixes to that open bug. Programmers will consult the proposed bug repository as a matter of course during debugging. Monitors that compare the current execution against traces in the repository can also prevent bugs and improve software reliability. This project promises to significantly speed up debugging and reduce software production cost. The proposed educational innovations and outreach efforts can also help train more capable IT professionals for the workforce.
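A minimal sketch of trace-indexed search, the core retrieval operation such a repository must support: normalize stack frames and rank stored bugs by set similarity of their frames. The trace format, `normalize`, and Jaccard ranking are illustrative assumptions; a production repository would use order-aware matching and a scalable index.

```python
# Rank stored bugs by similarity of normalized stack traces.
def normalize(trace):
    """trace: list of 'module.function:line' strings -> set of functions."""
    return {frame.split(":")[0] for frame in trace}

def jaccard(a, b):
    return len(a & b) / len(a | b)

def most_similar(open_trace, repository):
    """repository: {bug_id: trace}; returns bug ids ranked by similarity."""
    query = normalize(open_trace)
    return sorted(repository,
                  key=lambda bug: jaccard(query, normalize(repository[bug])),
                  reverse=True)

repo = {"BUG-1": ["io.read:10", "parse.header:42", "main.run:7"],
        "BUG-2": ["net.send:88", "main.run:7"]}
print(most_similar(["io.read:55", "parse.header:9", "main.run:3"], repo))
# ['BUG-1', 'BUG-2']: line numbers differ, but the frames match BUG-1
```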
2012 — 2016
Barr, Earl (co-PI); Su, Zhendong; Devanbu, Premkumar
EAGER: Exploiting the Naturalness of Software @ University of California-Davis
A study of software code has revealed a surprising result: software code may be just as "natural" as natural language itself (e.g., English), if not more so, in that code is highly predictable and repetitive; statistical natural language techniques may therefore be applied quite competently to some software engineering tasks. For example, N-grams may be quite effective at suggestion and completion tasks in code. The evidence supports further exploration of the applicability of statistical NLP techniques and tools to software development activities and processes. The project explores the feasibility of establishing a scientific basis and tools for a variety of code-level software engineering functions to support software engineering -- including natural language summarization, code retrieval, software question answering, automated code completion, and assistive tools for disabled developers -- forming not only a new and important domain for further research in NLP, but also a totally new approach to software development.
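A minimal sketch of the N-gram suggestion idea mentioned above: train a trigram model over token streams and rank completions for a two-token context by frequency. The toy corpus and whitespace tokens are assumptions; real models train on large codebases with proper lexers and smoothing.

```python
# Trigram code suggestion: rank next-token candidates by corpus frequency.
from collections import Counter, defaultdict

def train(token_streams):
    model = defaultdict(Counter)
    for tokens in token_streams:
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            model[(a, b)][c] += 1
    return model

def suggest(model, a, b, k=3):
    return [tok for tok, _ in model[(a, b)].most_common(k)]

corpus = [["for", "(", "int", "i", "=", "0", ";", "i", "<", "n", ";"],
          ["for", "(", "int", "j", "=", "0", ";", "j", "<", "m", ";"]]
model = train(corpus)
print(suggest(model, "for", "("))   # ['int']
print(suggest(model, "(", "int"))   # ['i', 'j']
```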
2013 — 2016
Su, Zhendong; Bai, Zhaojun (co-PI); Devanbu, Premkumar (co-PI)
EAGER: Toward Numerically Robust Software @ University of California-Davis
Society increasingly depends on numerical software, which uses finite precision arithmetic to approximate the reals and necessarily introduces approximation and error. Anti-lock brakes and medical devices such as haptic control systems for remote surgery are two such examples. Numerical errors in these systems can be disastrous. Toyota suspects such errors contributed to its recent, costly unintended acceleration problem, and the Ariane 5 rocket exploded due to an overflow in its inertial reference system. This project explores practical techniques to test and analyze numerical software, which will advance the state-of-the-art in engineering robust numerical software to help avoid costly, dangerous errors.
In particular, the project focuses on the two most fundamental sources of numerical errors: uncaught exceptions, and numerical stability and accuracy. The proposed core framework is centered around symbolic execution, and domain insights will be used to develop principles and heuristics to make it practical. This project will complete several preliminary research tasks to validate and demonstrate the promise of the proposed general approach. It will explore new problem-modeling strategies for numerical accuracy and stability, examine realistic numerical constraints to build insights into constraint-solving strategies and algorithms, and improve the promising Ariadne symbolic analysis infrastructure.
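To give a flavor of the exception-detection task, the sketch below uses guided random search to drive an input toward a floating-point overflow; Ariadne itself works by symbolic execution and equation solving, so this hill-climbing loop is only an assumed, simplified stand-in for the general approach.

```python
# Guided search for an input that triggers floating-point overflow (inf).
import math, random

def find_overflow(f, x0=1.0, steps=200):
    x, rng = x0, random.Random(0)
    for _ in range(steps):
        y = f(x)
        if math.isinf(y):
            return x                    # exception-triggering input found
        candidate = x * rng.uniform(1.0, 10.0)    # push the input outward
        if abs(f(candidate)) >= abs(y):           # keep moves that grow |f|
            x = candidate
    return None

# x*x*x*x exceeds the double-precision maximum (~1.8e308) near |x| ~ 1e77.
print(find_overflow(lambda x: x * x * x * x))
# prints an input whose fourth power is inf (None if the budget runs out)
```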
2013 — 2017
Su, Zhendong
TWC: Small: Collaborative: Similarity-Based Program Analyses For Eliminating Vulnerabilities @ University of California-Davis
The security of critical information infrastructures depends upon effective techniques to detect vulnerabilities commonly exploited by malicious attacks. Due to poor coding practices or human error, a known vulnerability discovered and patched in one code location may often exist in many other unpatched code locations, either in the same code base or in other code bases. Furthermore, patches are often error-prone, resulting in new vulnerabilities. This project develops practical techniques for detecting code-level similarity to prevent such vulnerabilities. It has the potential to help build a more reliable and secure information system infrastructure, which will have tremendous economic impact on society because of our growing reliance on information technologies.
In particular, the project aims to develop practical techniques for similarity-based testing and analysis to detect unpatched vulnerable code and validate patches to the detected vulnerable code at both the source code and binary levels. To this end, it focuses on three main technical directions: (1) developing techniques for detecting source-level vulnerabilities by adapting and refining an industrial-strength tool, (2) developing capabilities of detecting binary-level vulnerabilities by extending preliminary work on detecting code clones in binaries, and (3) supporting patch validation and repair by developing methodologies and techniques to validate software patches and help produce correct, secure patches. This project helps discover new techniques for source- and binary-level vulnerability analysis and gain better understandings of the fundamental and practical challenges for building highly secure and reliable software.
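A minimal sketch of direction (1)'s core operation, finding near-copies of known-vulnerable code: normalize identifiers, fingerprint token k-grams, and compare fingerprint sets. The toy lexer and `fingerprints` helper are assumptions; industrial clone detectors add winnowing, robust parsing, and binary-level features for direction (2).

```python
# Clone search sketch: identifier-normalized k-gram fingerprints.
def normalize(tokens):
    keywords = {"char", "return", "strcpy", "int", "if"}  # toy lexer assumption
    return ["ID" if t.isidentifier() and t not in keywords else t
            for t in tokens]

def fingerprints(tokens, k=4):
    grams = zip(*(tokens[i:] for i in range(k)))   # all token k-grams
    return {hash(g) for g in grams}

def similarity(f1, f2):
    return len(f1 & f2) / max(1, len(f1 | f2))

vulnerable = "char buf [ 64 ] ; strcpy ( buf , input ) ; return".split()
candidate  = "char tmp [ 64 ] ; strcpy ( tmp , arg ) ; return".split()
fp_v = fingerprints(normalize(vulnerable))
fp_c = fingerprints(normalize(candidate))
print(similarity(fp_v, fp_c))   # 1.0: renaming alone cannot hide the clone
```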
2014 — 2018
Su, Zhendong; Filkov, Vladimir (co-PI); Devanbu, Premkumar
SHF: Large: Collaborative Research: Exploiting the Naturalness of Software @ University of California-Davis
This interdisciplinary project has its roots in Natural Language (NL) processing. Languages such as English allow intricate, lovely and complex constructions; yet, everyday, "natural" speech and writing is simple, prosaic, and repetitive, and thus amenable to statistical modeling. Once large NL corpora became available, computational muscle and algorithmic insight led to rapid advances in the statistical modeling of natural utterances, and revolutionized tasks such as translation, speech recognition, text summarization, etc. While programming languages, like NL, are flexible and powerful, in theory allowing a great variety of complex programs to be written, we find that "natural" programs that people actually write are regular, repetitive and predictable. This project will use statistical models to capture and exploit this regularity to create a new generation of software engineering tools to achieve transformative improvements in software quality and productivity.
The project will exploit language modeling techniques to capture the regularity in natural programs at the lexical, syntactic, and semantic levels. Statistical modeling will also be used to capture alignment regularities in "bilingual" corpora such as code with comments or explanatory text (e.g., Stackoverflow), and in systems developed on two platforms such as Java and C#. These statistical models will help drive novel, data-driven approaches for applications such as code suggestion and completion, and assistive devices for programmers with movement or visual challenges. These models will also be exploited to correct simple errors in programs. Models of bilingual data will be used to build code summarization and code retrieval tools, as well as tools for porting across platforms. Finally, this project will create a large, curated corpus of software and code analysis products, as well as a corpus of alignments within software bilingual corpora, to help create and nurture a research community in this area.
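As a sketch of how such models might flag and help correct simple errors, the snippet below scores each adjacent token pair by its bigram probability under a corpus model and reports the least likely ones. The corpus, tokenizer, and threshold are toy assumptions; the project's models operate at much larger scale and at syntactic and semantic levels as well.

```python
# Flag "unnatural" code: report token bigrams with low corpus probability.
from collections import Counter, defaultdict

def train_bigrams(streams):
    counts, follows = defaultdict(Counter), Counter()
    for toks in streams:
        for a, b in zip(toks, toks[1:]):
            counts[a][b] += 1
            follows[a] += 1
    return counts, follows

def improbable(tokens, counts, follows, threshold=0.1):
    flags = []
    for a, b in zip(tokens, tokens[1:]):
        p = counts[a][b] / follows[a] if follows[a] else 0.0
        if p < threshold:
            flags.append((a, b, p))
    return flags

corpus = [["if", "(", "x", "==", "0", ")"], ["if", "(", "y", "==", "1", ")"]]
counts, follows = train_bigrams(corpus)
print(improbable(["if", "(", "x", "=", "0", ")"], counts, follows))
# [('x', '=', 0.0), ('=', '0', 0.0)]: assignment where a test is usual
```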
2015 — 2018
Su, Zhendong; Sun, Chengnian
SHF: Small: Compiler Validation Via Equivalence Modulo Inputs @ University of California-Davis
Compilers are among the most widely-used and complex software systems ever written; they are the trusted foundation for building other software. Perhaps less known is that production compilers also frequently contain bugs, which frustrate programmers and may lead to mysterious program failures and disasters. Compiler validation is both scientifically and technically challenging. The intellectual merits of this project are novel methodologies and practical techniques for validating production compilers. The project's broader significance and importance are improved reliability and usability of modern compilers. It also indirectly improves the quality of every piece of software upon which society increasingly depends.
Technically, this project is centered around equivalence modulo inputs (EMI), a general concept and approach for validating compilers. A basic realization of EMI has already proved effective, discovering more than a hundred important bugs in widely-used compilers. This project builds on that success and focuses on three main directions: developing advanced strategies to realize EMI's full potential, testing a compiler's diagnostic support and performance, and generalizing the techniques toward testing C++ and OpenMP compilers. The project aims to significantly advance our knowledge and state-of-the-art practices in validating and engineering reliable and usable compilers.
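A minimal sketch of the EMI idea, written in Python for brevity even though the project targets C, C++, and OpenMP compilers: profile which lines a program executes on an input, mutate only code that input never reaches, and require that the variant behaves identically on that input; a compiler (here, the interpreter) that treats the two programs differently on that input is miscompiling one of them. The line-level mutation is an illustrative simplification.

```python
# EMI in miniature: profile line coverage on one input, mutate a line the
# input never executes, and require identical behavior on that input.
import sys

SRC = ["def f(x):",
       "    if x > 0:",
       "        return x * 2",
       "    return -x"]                    # dead for positive inputs

def run_with_coverage(fn, arg):
    hit = set()
    def tracer(frame, event, _arg):
        if event == "line":
            hit.add(frame.f_lineno)
        return tracer
    sys.settrace(tracer)
    result = fn(arg)
    sys.settrace(None)
    return result, hit

exec("\n".join(SRC), globals())
reference, hit = run_with_coverage(f, 5)   # covers lines 2 and 3 only
variant = "\n".join(
    "    pass" if i + 1 not in hit and "return" in line else line
    for i, line in enumerate(SRC))         # mutate only unexecuted returns
exec(variant, globals())                   # redefines f as the EMI variant
assert f(5) == reference                   # same behavior on this input
print("variant agrees on the profiled input")
```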
2016 — 2019
Fu, Zhoulai; Su, Zhendong
SHF: Small: Testing and Analysis For Reliable Numerical Software @ University of California-Davis
Society increasingly depends on numerical software, which uses finite precision arithmetic to approximate the reals and necessarily introduces approximation and error. Cyber-physical systems are ubiquitous and rely on numerical software. Anti-lock brakes and medical devices such as haptic control systems for remote surgery are two such examples. Numerical errors in these systems can be disastrous. Toyota suspects such errors contributed to its costly acceleration problem; the Ariane 5 rocket exploded due to an overflow in its inertial reference system; and scientists have had to retract papers from prestigious journals (e.g., Nature). Techniques and tools for improving numerical software reliability are critically needed.
This project explores practical techniques to test and analyze numerical software. It focuses on the two most fundamental sources of numerical errors: uncaught exceptions, and numerical stability and accuracy. The proposed techniques are centered around symbolic execution and guided testing, and domain insights are used to develop principles and heuristics to make the techniques practical. This project considers three main dimensions: (1) problem formulation and analysis strategies, (2) constraint solving, and (3) implementation techniques. It aims at advancing the state-of-the-art in engineering robust numerical software to help avoid costly, dangerous errors.
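The sketch below illustrates one simple accuracy oracle consistent with this direction: evaluate an expression in double precision and again at much higher precision, and flag large relative error. The naive quadratic formula's catastrophic cancellation is the classic example; the `Decimal`-based oracle is an assumed stand-in for the project's analyses.

```python
# Accuracy oracle sketch: compare double precision against 50-digit
# Decimal arithmetic on the cancellation-prone quadratic-root formula.
from decimal import Decimal, getcontext

def small_root_naive(a, b, c):
    return (-b + (b * b - 4 * a * c) ** 0.5) / (2 * a)

def small_root_precise(a, b, c, digits=50):
    getcontext().prec = digits
    a, b, c = Decimal(a), Decimal(b), Decimal(c)
    return (-b + (b * b - 4 * a * c).sqrt()) / (2 * a)

a, b, c = 1.0, 1e8, 1.0            # roots near -1e-8 and -1e8
naive = small_root_naive(a, b, c)
precise = small_root_precise(a, b, c)
rel_err = abs((Decimal(naive) - precise) / precise)
print(naive, precise, rel_err)     # cancellation destroys most digits
```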
2018 — 2021
Su, Zhendong; Zhang, Qirun (co-PI)
SHF: Small: Scalable and Precise Program Analyses Via Linear Conjunctive Language Reachability @ University of California-Davis
Static program analysis provides foundational and practical techniques to help build reliable and secure software. Context-free language (CFL) reachability has been widely adopted for specifying program analysis problems. However, little foundational progress exists on advancing the CFL-reachability framework itself. To support precise and scalable program analyses, an expressive, accessible class of formal language reachability is needed to bridge fundamental formal language research and practical analysis-based tool development. The project's novelties are twofold: a new, powerful formalism for specifying program analyses, and algorithms and techniques for realizing a practical framework based on this formalism. The project's impacts are deepened knowledge of, and improved capabilities for, building precise and scalable static analyses, as well as practical analyses for improving software reliability and security.
This project will explore linear conjunctive language (LCL) reachability as a new static analysis formalism. In contrast to CFLs, LCLs are closed under all set-theoretic operations and can also be efficiently recognized in quadratic time. A significant number of advanced program analyses need to match properties described by multiple CFLs simultaneously. LCLs can precisely express many such properties, while CFLs cannot because they are not closed under intersection. Thus, LCL reachability offers a novel perspective in specifying and realizing program analyses. The investigators' initial work on LCL-reachability has shown considerable promise, leading to both more precise and orders of magnitude more scalable alias and taint analysis, two widely-used analyses. This project aims to fully exploit LCL-reachability's potential by developing a unified solution for specifying program analysis problems in LCL and implementing novel data structures that support efficient LCL-reachability algorithms. It focuses on (1) theoretical development of the LCL-reachability formulation, (2) efficient algorithms for computing LCL-reachability, and (3) generalization to practical program analyses. If successful, this project will significantly advance the state-of-the-art in software analysis and verification.
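For context, the sketch below implements the classic CFL-reachability baseline that LCL-reachability generalizes: worklist-based Dyck reachability, where a path is valid iff its bracket-labeled edges match (grammar M -> M M | ( M ) | epsilon, with a single bracket kind for brevity). The worklist algorithm shown is the standard textbook one; the project's LCL algorithms and data structures go well beyond it.

```python
# Dyck (matched-parenthesis) reachability via a worklist over node pairs.
from collections import deque

def dyck_reachable(n, edges):
    """edges: (u, v, label) with label in {'(', ')', 'e'}; returns M-pairs."""
    M = {(v, v) for v in range(n)}                 # M -> epsilon
    opens = [(u, v) for u, v, l in edges if l == "("]
    closes = [(u, v) for u, v, l in edges if l == ")"]
    M |= {(u, v) for u, v, l in edges if l == "e"}
    work = deque(M)
    while work:
        x, y = work.popleft()
        new = set()
        # M -> ( M ): wrap a matched stretch with a bracket pair
        new |= {(a, b) for a, x2 in opens if x2 == x
                       for y2, b in closes if y2 == y}
        # M -> M M: concatenate with adjacent matched stretches
        new |= {(a, y) for a, b in M if b == x}
        new |= {(x, b) for a, b in M if a == y}
        for pair in new - M:
            M.add(pair)
            work.append(pair)
    return M

edges = [(0, 1, "("), (1, 2, "e"), (2, 3, ")")]
print((0, 3) in dyck_reachable(4, edges))   # True: the path "( e )" matches
```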
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.