2007
Liu, Han
SBIR Phase I: Dimensionally Stable Membrane For Chlor-Alkali Production @ Giner Electrochemical Systems, LLC
The Small Business Innovation Research (SBIR) Phase I project will develop a novel Dimensionally Stable Ionomer Membrane (DSM™) that offers high ionic conductivity, good ion exclusion capability, and excellent mechanical properties. The most advanced chlor-alkali electrolyzers utilize an ion-exchange membrane to separate the electrolysis solutions, but that membrane introduces inefficiencies due to the thickness required to survive and function within the electrolyzer. This project develops a new type of membrane utilizing a thin, laser-perforated support and incorporating carboxylic and sulfonic acid ionomers to simultaneously reduce the thickness of the membrane (and therefore the electrolyzer operating voltage) while maintaining functionality and improving durability.
Just a few years ago, chlor-alkali electrolysis consumed 1.4% of all electricity generated in the United States; thus any improvement in electrolysis efficiency would result in significant power savings. The proposed research is expected to yield an electrolyzer voltage reduction of at least 125 mV from the ~3.0 V currently consumed, an energy consumption reduction of about 4.2%.
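The quoted savings figure is just the ratio of the voltage reduction to the operating voltage (at fixed current, energy per unit of product scales with cell voltage):

    \[
      \frac{\Delta V}{V} = \frac{0.125\,\mathrm{V}}{3.0\,\mathrm{V}} \approx 0.042 = 4.2\%
    \]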
2011 — 2015
Lafferty, John (co-PI); Wasserman, Larry (co-PI); Liu, Han
III: Small: Nonparametric Structure Learning For Complex Scientific Datasets @ Johns Hopkins University
The project brings together an interdisciplinary team of researchers from Johns Hopkins University, Carnegie Mellon University, and the University of Chicago to develop methods, theory and algorithms for discovering hidden structure from complex scientific datasets, without making strong a priori assumptions. The outcomes include practical models and provably correct algorithms that can help scientists to conduct sophisticated data analysis. The application areas include genomics, cognitive neuroscience, climate science, astrophysics, and language processing.
The project has five aims:
(i) Nonparametric structure learning in high dimensions. In a standard structure learning problem, observations of a random vector X are available and the goal is to estimate the structure of the distribution of X. When the dimension is large, nonparametric structure learning becomes challenging; the project develops new methods and establishes theoretical guarantees for this problem.
(ii) Nonparametric conditional structure learning. In many applications, it is of interest to estimate the structure of a high-dimensional random vector X conditional on another random vector Z. Nonparametric methods for estimating the structure of X given Z are being developed, building on recent approaches to graph-valued and manifold-valued regression developed by the investigators.
(iii) Regularization parameter selection. Most structure learning algorithms have at least one tuning parameter that controls the bias-variance tradeoff. Classical methods for selecting tuning parameters are not suitable for complex nonparametric structure learning problems; the project explores stability-based approaches for regularization selection.
(iv) Parallel and online nonparametric learning. Handling large-scale data is a bottleneck for many nonparametric methods; the project develops parallel and online techniques to extend nonparametric learning algorithms to large-scale problems.
(v) Minimax theory for nonparametric structure learning. Minimax theory characterizes the performance limits of learning algorithms, yet few theoretical results are known for complex, high-dimensional nonparametric structure learning; the project develops new minimax theory in this setting.
The results of this project will be disseminated through publications in scientific journals and major conferences, and through free distribution of software implementing the nonparametric structure learning algorithms resulting from this research.
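As a concrete illustration of aim (i), the following is a minimal sketch of one standard nonparametric approach: a rank-based (Spearman) correlation matrix is mapped to a latent correlation estimate and then sparsified with the graphical lasso. The abstract does not commit to a particular estimator, so the rank-correlation transform and the scipy/scikit-learn calls below are illustrative assumptions, not the project's method.

    # Nonparametric (rank-based) undirected graph estimation: a minimal sketch.
    # Assumption: a Gaussian-copula-style estimator (Spearman correlation mapped
    # to a latent correlation, then sparsified via the graphical lasso). This is
    # one standard approach, not necessarily the one developed by the project.
    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.covariance import graphical_lasso

    rng = np.random.default_rng(0)
    n, d = 500, 10
    X = rng.standard_normal((n, d))
    X[:, 1] += 0.8 * X[:, 0]           # induce one edge: variables 0 and 1
    X = np.exp(X)                      # monotone distortion; ranks are invariant

    rho, _ = spearmanr(X)              # d x d Spearman rank correlation
    S = 2.0 * np.sin(np.pi * rho / 6)  # map to a latent Pearson correlation
    np.fill_diagonal(S, 1.0)           # (in general S may need a PSD projection)

    _, precision = graphical_lasso(S, alpha=0.2)   # sparse inverse correlation
    edges = (np.abs(precision) > 1e-4) & ~np.eye(d, dtype=bool)
    print("estimated edges:", np.argwhere(np.triu(edges)))

Because only ranks enter the estimate, the monotone distortion applied to X leaves the recovered graph unchanged, which is the point of going nonparametric here.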
The broader impacts of the project include: creation of powerful data analysis techniques and software that enable a wide range of scientists and engineers to analyze and understand complex scientific data; increased collaboration and interdisciplinary interaction between researchers at multiple institutions (Johns Hopkins University, Carnegie Mellon University, and the University of Chicago); and broad dissemination of the results of this research in different scientific communities. Additional information about the project can be found at: http://www.cs.jhu.edu/~hanliu/nsf116730.html.
2014 — 2017
Liu, Han
R01 (Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.)
Large-Scale Semiparametric Graphical Models With Applications to Neuroscience
DESCRIPTION: The objective of this proposal is to develop and theoretically evaluate a unified set of statistical, computational, and software tools to address data mining and discovery science challenges in the analysis of the existing vast amounts of publicly available neuroimaging data. In particular, we propose to develop scalable and robust semiparametric solutions for high-throughput estimation of resting-state brain connectivity networks, both at the individual and population levels, with the flexibility of incorporating covariate information. The work will contribute meaningfully to the theory and methods for large-scale semiparametric graphical models and will apply these methods to the largest collections of resting-state fMRI data available. The proposed methods and theory address key directions of research for brain network estimation and mining. First, we propose novel methods for subject-specific network estimation, such as would be needed for biomarker development in functional brain imaging. Second, we define and propose to evaluate and implement methods for studying population-level graphs, that is, collections of graphs across subjects. Third, we propose the use of estimated graphs in predictive modeling. Finally, all of these methods will be accompanied by complementary software and web-service development. Most notably, the idea of population graphs allows for the creation of functional brain network atlases. In summary, the work of this proposal will result in a unified framework for the analysis of modern neuroimaging data via graphical models. Our methods will further be agnostic to the intricacies of the imaging technology, making them portable across settings and applicable outside the field of functional brain imaging. The methods will be carefully evaluated via theory, simulations, and data-based applications.
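To make the subject-level and population-level estimation concrete, the sketch below fits one sparse inverse-covariance graph per subject and then aggregates edge frequencies into a population-level graph, the building block of the functional brain network atlases mentioned above. GraphicalLassoCV and the synthetic "time series" are stand-ins for the proposal's semiparametric estimators and for real fMRI data; treat this as a schematic, not the proposed method.

    # Subject-level graphs -> population-level graph: a schematic sketch.
    # GraphicalLassoCV is an off-the-shelf stand-in for the proposal's
    # semiparametric estimators; the data here are synthetic, not fMRI.
    import numpy as np
    from sklearn.covariance import GraphicalLassoCV

    rng = np.random.default_rng(1)
    n_subjects, T, d = 8, 300, 6      # subjects, time points, brain regions

    def subject_graph(ts):
        """Adjacency of the sparse inverse-covariance graph for one subject."""
        prec = GraphicalLassoCV().fit(ts).precision_
        A = np.abs(prec) > 1e-4
        np.fill_diagonal(A, False)
        return A

    subject_adjacencies = []
    for _ in range(n_subjects):
        ts = rng.standard_normal((T, d))
        ts[:, 2] += 0.7 * ts[:, 0]    # a shared edge (regions 0-2) across subjects
        subject_adjacencies.append(subject_graph(ts))

    # Population-level graph: keep edges present in a majority of subjects.
    edge_freq = np.mean(subject_adjacencies, axis=0)
    population_graph = edge_freq > 0.5
    print("population edges:", np.argwhere(np.triu(population_graph)))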
2014 — 2018
Liu, Han
RI: Medium: Collaborative Research: Next-Generation Statistical Optimization Methods For Big Data Computing
This project develops a new generation of optimization methods to address data mining and knowledge discovery challenges in large-scale scientific data analysis. It is motivated by the fact that modern computing architectures enable fitting complex statistical models (Big Models) on large and complex datasets (Big Data). However, despite significant progress in each subfield (Big Data, Big Models, and modern computing architectures), powerful optimization techniques that effectively integrate these key components are still lacking.
One important bottleneck is that many general-purpose optimization methods are not specifically designed for statistical learning problems. Even when such methods are tailored to exploit specific problem structures, they have not incorporated sophisticated statistical thinking into algorithm design and analysis. To tackle this bottleneck, the project extends traditional theory to open new possibilities for nontraditional optimization problems, such as nonconvex and infinite-dimensional ones. The project develops a deeper theoretical understanding of several challenging issues in optimization (such as nonconvexity), develops new algorithms that lead to better practical methods in the big data era, and demonstrates the new methods on challenging bioinformatics problems.
The project is closely related to NSF's mission to promote Big Data research and will have broad impacts. In the Big Data era, there is an urgent need for powerful optimization methods to handle the increasing complexity of modern datasets, yet adequate methods, theory, and computational techniques are still lacking. By simultaneously addressing these aspects, this project will deliver novel and useful statistical optimization methods that benefit all relevant scientific areas. The project will deliver easy-to-use software packages that directly help scientists to explore and analyze complex datasets. Both PIs will also design and develop new classes to teach modern techniques for handling big data optimization problems. All the course materials - including lecture notes, problem sets, source code, solutions, and working examples - will be freely accessible online. Moreover, both PIs will write tutorial papers and disseminate the results of this research through the internet, academic conferences, workshops, and journals. Through senior theses and potentially the REU (Research Experiences for Undergraduates) program, the project will also actively include undergraduates and engage under-represented minority groups.
To achieve these goals, this project develops (i) a new research area named statistical optimization, which incorporates sophisticated statistical thinking into modern optimization and will effectively bridge machine learning, statistics, optimization, and stochastic analysis; (ii) new theoretical frameworks and computational methods for nonconvex and infinite-dimensional optimization, which will motivate effective optimization methods with theoretical guarantees that are applicable to a wide variety of prominent statistical models; and (iii) new scalable optimization methods, which aim at fully harnessing the horsepower of modern large-scale distributed computing infrastructure. The project will shed new theoretical light on large-scale optimization, advance practice through novel algorithms and software, and demonstrate the methods on challenging bioinformatics problems.
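As a toy instance of the nonconvex problems in scope, the sketch below runs plain gradient descent on low-rank matrix factorization, a canonical nonconvex statistical optimization problem where simple first-order methods nonetheless succeed, the kind of phenomenon such theory aims to explain. The model, step size, and initialization are illustrative choices, not the project's algorithms.

    # Nonconvex statistical optimization, toy instance: gradient descent on
    # low-rank matrix factorization  min_{U,V} 0.5 * ||M - U V^T||_F^2.
    # Nonconvex in (U, V) jointly, yet gradient descent typically recovers
    # a good factorization. Illustrative only, not the project's method.
    import numpy as np

    rng = np.random.default_rng(2)
    n, m, r = 50, 40, 3
    M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))  # rank-r target

    U = 0.1 * rng.standard_normal((n, r))   # small random initialization
    V = 0.1 * rng.standard_normal((m, r))
    step = 1e-3
    for t in range(2000):
        R = U @ V.T - M                     # residual
        U, V = U - step * (R @ V), V - step * (R.T @ U)   # simultaneous step

    print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))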
2015 — 2019
Liu, Han
BIGDATA: Collaborative Research: F: Stochastic Approximation For Subspace and Multiview Representation Learning @ Northwestern University
Unsupervised learning of useful features, or representations, is one of the most basic challenges of machine learning. Unsupervised representation learning techniques capitalize on unlabeled data, which is often cheap and abundant, and sometimes virtually unlimited. The goal of these ubiquitous techniques is to learn a representation that reveals intrinsic low-dimensional structure in data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains.
This project aims to develop new theory and methods for representation learning that can easily scale to large datasets. In particular, this project is concerned with methods for large-scale unsupervised feature learning, including Principal Component Analysis (PCA) and Partial Least Squares (PLS). To capitalize on massive amounts of unlabeled data, this project will develop appropriate computational approaches and study them in the "data-laden" regime. Therefore, instead of viewing representation learning as a dimensionality reduction technique and focusing on an empirical objective over finite data, these methods are studied with the goal of optimizing a population objective based on samples drawn from the population. This view suggests using stochastic approximation approaches, such as Stochastic Gradient Descent (SGD) and Stochastic Mirror Descent, that are incremental in nature and process each new sample with a computationally cheap update. Furthermore, this view enables a rigorous analysis of the benefits of stochastic approximation algorithms over traditional finite-data methods. The project aims to develop stochastic approximation approaches to PCA, PLS, and related problems and extensions, including deep and sparse variants, and to analyze these problems in the data-laden regime.
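One classical stochastic approximation method of exactly this flavor is Oja's rule for streaming PCA, sketched below: each fresh sample triggers a cheap O(d) update aimed at the population principal component rather than a finite-data objective. The abstract does not name a specific update rule, so this is an illustrative instance rather than the project's algorithm.

    # Streaming PCA via Oja's rule: one cheap update per fresh sample,
    # targeting the population principal component. Illustrative instance
    # of stochastic approximation, not necessarily the project's update.
    import numpy as np

    rng = np.random.default_rng(3)
    d = 20
    u_true = rng.standard_normal(d)
    u_true /= np.linalg.norm(u_true)         # population top eigenvector

    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for t in range(1, 100_001):
        # one fresh sample from a spiked covariance model: x = s*u + noise
        x = 3.0 * rng.standard_normal() * u_true + rng.standard_normal(d)
        eta = 1.0 / (100 + t)                # decaying step size
        w += eta * x * (x @ w)               # Oja update: w <- w + eta * x x^T w
        w /= np.linalg.norm(w)               # renormalize to the unit sphere

    print("alignment |<w, u>|:", abs(w @ u_true))  # close to 1 when converged

The decaying step size eta_t ~ 1/t is the standard stochastic approximation schedule; a constant step trades asymptotic accuracy for faster initial progress.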
2015 — 2020
Liu, Han
CAREER: An Integrated Inferential Framework For Big Data Research and Education @ Northwestern University
This project addresses several fundamental challenges in modern data analysis and aims to create a new research area named Big Data Inference. The currently available literature on Big Data research mainly focuses on developing new estimators for complex data. However, most of these estimators still lack systematic inferential methods for uncertainty assessment. This project aims to bridge that gap by developing new inferential theory for modern estimators unique to Big Data analysis. The deliverables of this project include easy-to-use software packages that directly help scientists to explore and analyze complex datasets. The principal investigator is also actively collaborating with many scientists to ensure that the project has a direct impact on the targeted scientific communities.
This project aims to develop novel inferential methods for assessing uncertainty (e.g., constructing confidence intervals or testing hypotheses) of modern statistical procedures unique to Big Data analysis. In particular, it develops innovative statistical inferential tools for a variety of machine learning methods which have not yet been equipped with inferential power. It also provides necessary inferential tools for the next generation of scientists to be competitive in modern data analysis.
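As a schematic of what equipping an estimator with inferential power means, the sketch below wraps a lasso fit in a pairs bootstrap to produce a percentile confidence interval for one coefficient. The naive bootstrap is known to be imperfect for sparse estimators, and the project targets more principled inferential theory, so this is illustration only.

    # Schematic: a percentile-bootstrap confidence interval for one lasso
    # coefficient. The naive bootstrap has known deficiencies for sparse
    # estimators; the project develops more principled inference.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(4)
    n, d = 200, 50
    X = rng.standard_normal((n, d))
    beta = np.zeros(d)
    beta[0] = 1.5                               # one true signal
    y = X @ beta + rng.standard_normal(n)

    def lasso_coef0(Xb, yb):
        return Lasso(alpha=0.1).fit(Xb, yb).coef_[0]

    boot = []
    for _ in range(500):
        idx = rng.integers(0, n, size=n)        # resample (x_i, y_i) pairs
        boot.append(lasso_coef0(X[idx], y[idx]))

    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"95% percentile CI for beta_0: [{lo:.2f}, {hi:.2f}]")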
2018 — 2020
Liu, Han
Collaborative Research: TRIPODS Institute For Optimization and Learning @ Northwestern University
This Phase I project forms an NSF TRIPODS Institute, based at Lehigh University and in collaboration with Stony Brook and Northwestern Universities, with a focus on new advances in tools for machine learning applications. A critical component for machine learning is mathematical optimization, where one uses historical data to train tools for making future predictions and decisions. Traditionally, optimization techniques for machine learning have focused on simplified models and algorithms. However, recent revolutionary leaps in the successes of machine learning tools---e.g., for image and speech recognition---have in many cases been made possible by a shift toward using more complicated techniques, often involving deep neural networks. Continued advances in the use of such techniques require combined efforts between statisticians, computer scientists, and applied mathematicians to develop more sophisticated models and algorithms along with more comprehensive theoretical guarantees that support their use. In addition to its research goals, the institute trains Ph.D. students and postdoctoral fellows in statistics, computer science, and applied mathematics, and hosts interdisciplinary workshops and Winter/Summer schools.
The research efforts in Phase I are on the analysis of nonconvex machine learning models, the design of optimization algorithms for training them, and the development of nonparametric models and associated algorithms. The focus is on deep neural networks (DNNs), mostly in general, but also with respect to specific architectures of interest. The institute's research efforts emphasize the need to develop connections between state-of-the-art approaches for training DNNs and statistical performance guarantees (e.g., on generalization errors), which are currently not well understood. Optimization algorithm development centers on second-order-derivative-type techniques, including (Hessian-free) Newton, quasi-Newton, Gauss-Newton, and their limited-memory variants. Recent advances have been made in the design of such methods; the PIs' work builds upon these efforts with their broad expertise in the design and implementation (including in parallel and distributed computing environments) of such methods. The development of nonparametric models promises to free machine learning approaches from restrictions imposed by large numbers of user-defined parameters (e.g., defining a network structure or the learning rate of an optimization algorithm). Such models could lead to great advances in machine learning, and the institute's work in this area also draws on the PIs' expertise in derivative-free optimization methods, which are needed for training in nonparametric settings.
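To ground the limited-memory quasi-Newton thread, here is a minimal sketch that trains logistic regression with SciPy's off-the-shelf L-BFGS implementation. The toy convex model is a stand-in for the institute's deep-network setting, not its method.

    # Limited-memory quasi-Newton (L-BFGS) on a tiny convex model: logistic
    # regression. Stand-in for the second-order methods discussed above; the
    # institute's work targets nonconvex deep networks, not this toy problem.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(5)
    n, d = 400, 10
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = (X @ w_true + 0.5 * rng.standard_normal(n) > 0).astype(float)  # {0,1} labels

    def loss_and_grad(w):
        z = np.clip(X @ w, -30, 30)              # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))             # sigmoid probabilities
        loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        grad = X.T @ (p - y) / n
        return loss, grad

    res = minimize(loss_and_grad, np.zeros(d), jac=True, method="L-BFGS-B")
    print("converged:", res.success, "final loss:", res.fun)

L-BFGS stores only a short history of gradient differences to approximate curvature, which is what makes these "limited memory variants" viable at scale.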
In this TRIPODS institute, the PIs approach all of these research directions with a unified perspective in the three disciplines of statistics, computer science, and applied mathematics. Indeed, as machine learning draws so heavily from these areas, future progress requires close collaborations between optimization experts, learning theorists, and statisticians---communities of researchers that, as yet, have tended to operate separately with differing terminology and publication venues. With an emphasis on deep learning, this institute aims to foster intercollegiate and interdisciplinary collaborations that overcome these hindrances.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.