1992 — 1994 |
Dorr, Bonnie |
N/A |
Interlingual Machine Translation and the Lexicon @ University of Maryland College Park
This research is concerned with the applicability of a lexical-based framework to the problem of interlingual machine translation. There are two tasks relevant to this goal. The first task is the augmentation of an existing lexical-semantic representation to include temporal, aspectual, and spatial information, all of which are necessary for adequate machine translation. The second task is the construction of routines for automatic acquisition of lexical entries based on this representation. In general, the project aims to test hypotheses that support the view that a lexical-based parametric framework can be built to accommodate interlingual machine translation and provide an adequate basis for capturing temporal, aspectual, and spatial information.
|
1 |
1993 — 1998 |
Dorr, Bonnie |
N/A |
Nsf Young Investigator @ University of Maryland College Park
9357731 Stein NYI - Embodiment Informs Cognition (Supplement) This is a supplement to the above mentioned NYI award for the purpose of holding a small workshop to cover the cost of participation of a small number of college and university faculty members involved in teaching introductory AI courses. This supplement addresses the desire to provide improved educational and research opportunities to bridge the gap between the capacities of current artificially intelligent agents and human-like cognitive abilities, both issues to be discussed from a point of view of classical undergraduate or first year graduate AI education, indicating how a symbolic (linguistic) approach evolves from the interpretation of arbitrary signals in terms of the human experiences. The goals of the workshop consist of providing information exchange relating to the nature of the introductory course, and addressing new curricula issues and questions including the one related to the Stein award mentioned above, but others as well such as the use and selection of instructional programming tools, and other software repository issues related to Artificial Intelligence education. A report or publication is part of the results of the workshop.
|
1 |
1994 — 1999 |
Dorr, Bonnie |
N/A |
U.S.-France Cooperative Research: Development and Formalization of Lexical-Semantic Representations For Machine Translations @ University of Maryland College Park
9314583 Dorr This three-year award supports U.S.-France cooperative research in interactive systems (human/computer) between research groups at the University of Maryland and the University of Narbonne, Toulouse, France. The U.S. investigators are Bonnie Dorr and Martha Palmer, and the French investigator is Patrick Saint-Dizier. They will collaborate on topics concerning lexical representations for French and English. The objective of their research is to develop a conceptual description for thematic roles and argument structures. They will emphasize cross-linguistic distinctions and the meaning components which are necessary for lexical selection during machine translation. Current machine translation systems encounter problems when extended to new languages or domains. This joint project will not only advance our understanding of that problem, but is likely to lead to a new lexical-based framework that can be applied to machine translation, information retrieval, and automatic lexical acquisition. The U.S. investigators bring to this collaboration expertise in lexical-semantic representation. This is complemented by Dr. Saint-Dizier's theoretical work in computational linguistics.
|
1 |
1995 — 1997 |
Dorr, Bonnie; Weinberg, Amy (co-PI); Raschid, Louiqa (co-PI) |
N/A |
Cise Research Instrumentation: Hardware and Software For Large Scale Projects in Information Mediation, Language Translation and Text Filtering and Retrieval @ University of Maryland College Park
9422138 Dorr This award is to purchase equipment (cost-shared with the Institute for Advanced Computer Studies (UMIACS) and the Department of Linguistics at the University of Maryland) dedicated to supporting specific projects in the laboratory for Computational Linguistics and Information Processing (CLIP). These projects are in the areas of information mediation, language translation and tutoring, and text filtering and retrieval. The software requirements include large dictionaries, corpora, data models and databases, high-level database query languages, and a Prolog interpreter. Since many of these resources are available only on CD-ROMs, the hardware requirements include an optical disk controller and large disks for storage. There is also a need for two workstations with significant computing power for indexing and processing of text and installation of an object server. Finally, four X terminals for extensive prototype development and an eye-tracker for testing hypotheses related to the development of psycholinguistically grounded NLP models are needed.
|
1 |
2001 — 2004 |
Dorr, Bonnie; Weinberg, Amy (co-PI); Raschid, Louiqa; Doermann, David (co-PI); Oard, Douglas (co-PI) |
N/A |
Cise Research Resources: Infrastructure to Develop a Large Scale Experiment Testbed of Multi-Modal Resources @ University of Maryland College Park
EIA-0130422 Louiqa Raschid University of Maryland College Park
CISE Research Resources: Infrastructure to Develop a Large Scale Experiment Testbed of Multi-Modal Resources
Using the widely distributed collections of structured and unstructured information that the Internet provides in multiple languages and modalities requires scalable, robust algorithms for discovering replicated content, determining the delay or access latency of sources, and confronting the inherently dynamic nature of the Internet.
This project's objective is to establish a laboratory testbed providing a controlled environment that captures structural, content, and latency characteristics of the (publicly accessible) Web. This will stimulate collaboration between researchers whose interests range over natural language applications, language-independent processing of scanned documents, analysis of video information sources, information retrieval, and wide area applications and resource discovery across heterogeneous servers.
The testbed will support the development and testing of: (1) tools for broad-scale, cross-linguistic analysis and discovery of relevant information across languages and modalities; (2) cost models and access cost catalogs for wide area environments, reflecting the temporal variability in access latency; (3) distributed content-based indexing and association of media clips for resource discovery; and (4) transcoding and scheduling of multimedia resources for delivery any time and anywhere to disparate clients, from mobile wireless to high-speed optical links.
|
1 |
2001 — 2004 |
Dorr, Bonnie; Resnik, Philip |
N/A |
Collaborative Proposal-Using the Web as a Corpus For Empirical Linguistic Research @ University of Maryland College Park
This project will develop tools that make it possible to retrieve naturally occurring sentences from the World Wide Web on the basis of lexical content and syntactic structure, providing linguists with an immediate, easily accessible source of raw linguistic data. The PIs will investigate specific linguistic hypotheses at the lexical semantics/syntax interface as an illustrative application of these tools. At a high level, the planned work constitutes an important step toward a new paradigm for linguistic research. Rather than relying entirely on introspective data generated by the linguist who is trying to (dis)prove a particular hypothesis, Web-enabled linguistics research will draw on the methodology and the tools developed by the PIs to supply naturally occurring data on which theories can rest. With regard to specific linguistic questions, the goal is to provide an explanation of the rules and constraints that govern three transitivity alternations (Middle, Unaccusative, Unspecified Object Deletion), and the PIs expect data made available by their tools to shed light on the "grey" area between competence and performance, that is, the linguistic behavior that seems to fall outside of rule-governed behavior. Although naturally occurring data are not accorded great emphasis in generative syntax, the use of text corpora has a tradition in the greater linguistic enterprise. An explosive new phenomenon in the world of naturally occurring text, the World Wide Web is an essentially untapped resource that embodies the rich and dynamic nature of language, presenting a data resource of unparalleled size and diversity.
|
1 |
2003 — 2005 |
Dorr, Bonnie |
N/A |
Collaborative Research: Interlingual Annotation of Multilingual Text Corpora @ University of Maryland College Park
This multi-site research effort is aimed at developing a coherent, consistent, standardized interlingual representation along with a methodology and sharable tools for annotating large bilingual corpora of parallel texts. It has four central components. First, six corpora are being compiled, each consisting of a number of texts in a particular source language along with three translations of each text into English. Second, a standardized interlingual representation is being developed based on a comparative analysis of these parallel text corpora. Third, the bilingual corpora are being annotated using the standardized interlingua and following a predefined annotation procedure. Fourth, metrics are being developed for evaluating the accuracy and appropriateness of the interlingual representations in terms of the grain size of the representation given a particular task. The metrics are based on inter-coder reliability, the growth rate of the interlingual representation, and the quality of the target language text that is generated from the interlingua.
The resulting annotated, multilingual, parallel corpora will be useful as an empirical basis for developing a wide variety of interlingual NLP systems for tasks such as machine translation, question answering, web searching, summarization, or presentation generation, as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines.
The participants include CRL at NMSU, ISI at USC, UMIACS at the University of Maryland, LTI at CMU, Columbia University, and The MITRE Corporation. The source languages include Arabic, Chinese, French, Hindi, Japanese, Spanish and English.
|
1 |
2005 — 2008 |
Dorr, Bonnie |
N/A |
Semantic Annotation Planning Meeting @ University of Maryland College Park
This is funding to support approximately 40 attendees in a workshop to be held at the University of Maryland Inn and Conference Center on April 14-15, 2005. Participants will be a mixture of researchers who are developing representations of the semantics of a text, and those developing applications such as machine translation and summarization systems, as well as knowledge-based inferencing and analysis applications such as Social Network Analysis. The goal of the meeting is to create a representation that would serve as input to these applications. Workshop participants will model the problem as an extended exercise in extracting information elements from a "document" (that is, from language communication which may have been written or spoken). There are two broad questions that attendees hope to answer: (a) what are the elements of knowledge that can be derived from a document, and (b) can the representation, and hence, the annotation, be laid out in terms of iterative layers, the accumulation of which would represent the sum of the knowledge? Thus, the workshop outcome will be creation of a "road map" for building up representations over the long term, which will get richer as the community learns to exploit more of the information content.
Broader Impacts: The government has an interest in using a unified representation rather than a set of individual representations. The Intelligence Community, DARPA, and NSF have funded a number of efforts towards the annotation of meaning representations in the past few years; this workshop is an attempt to pull this expertise together to produce a unified approach.
|
1 |
2007 — 2011 |
Dorr, Bonnie; Shneiderman, Ben (co-PI); Klavans, Judith; Lin, Jimmy (co-PI); Radev, Dragomir |
N/A |
Iii-Cor: Iopener - a Flexible Framework to Support Rapid Learning in Unfamiliar Research Domains @ University of Maryland College Park
In today's rapidly expanding disciplines, scientists and scholars are constantly faced with the daunting task of keeping up with knowledge in their field. In addition, the increasingly interconnected nature of real-world tasks often requires experts in one discipline to learn about other areas in a short amount of time. Cross-disciplinary research requires scientists in such areas as linguistics, biology, and sociology to learn about computational approaches and applications. Both students and educators must have access to accurate surveys of previous work, ranging from short summaries to in-depth historical notes. Government decision-makers must learn about different scientific fields to determine funding priorities.
The goal of iOPENER (Information Organization for PENning Expositions on Research) is to generate readily-consumable surveys of different scientific domains and topics, targeted to different audiences and levels, e.g., expert specialists, scientists from related disciplines, educators, students, government decision makers, and citizens including minorities and underrepresented groups. Surveyed material is presented in different modalities, e.g., an enumerated list of articles, a bulleted list of key facts, a textual summary, or a visual presentation with zoom and filter capabilities. The original contributions of this research are in the creation of an infrastructure for automatically summarizing entire areas of scientific endeavor by linking three available technologies: (1) bibliometric lexical link mining; (2) summarization techniques; and (3) visualization tools for displaying both structure and content.
The iOPENER software and resulting surveys will be made publicly available via the project Web site (http://tangra.si.umich.edu/clair/iopener/) and research results will be presented at conferences such as the ACL, SIGIR, and ASIST, as well as to broader audiences, e.g., expert specialists, students, educators, and government decision makers. Application areas include digital government, emergency response, and public health issues.
|
1 |