Deep Linguistic Processing with HPSG - INitiative (DELPH-IN) is a collaboration in which computational linguists worldwide develop natural language processing tools for deep linguistic processing of human language, built on Head-driven Phrase Structure Grammar (HPSG).[1] The goal of DELPH-IN is to combine linguistic and statistical processing methods in order to computationally understand the meaning of texts and utterances.
Since 2005, DELPH-IN has held an annual summit. This is a loosely structured unconference where people update each other about the work they are doing, seek feedback on current work, and occasionally hammer out agreement on standards and best practice.
DELPH-IN technologies and resources
The DELPH-IN collaboration has been progressively building computational tools for deep linguistic analysis, such as:
PET parser (Platform for Experimentation with efficient HPSG processing Techniques): an open-source parser that produces HPSG parse trees with Minimal Recursion Semantics (MRS) output.[3]
ACE processor (Answer Constraint Engine): an efficient processor for DELPH-IN grammars that produces HPSG syntactic parses with MRS output. The latest version of ACE can also generate natural language sentences.[4]
LOGON infrastructure: a collection of software and DELPH-IN grammars for transfer-based machine translation. The LOGON approach to machine translation has been shown to yield quality-oriented hybrid (rule-based and stochastic) translations.[5]
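The parsers listed above produce Minimal Recursion Semantics (MRS) output. As a rough illustration of what such a representation contains, the following is a minimal sketch in Python. It is not the output format or API of PET or ACE: the class names EP and MRS, the helper functions, and the simplified predicate names are hypothetical and chosen only for this example. The sketch assumes the usual description of an MRS as a flat collection of labelled elementary predications plus handle constraints that leave quantifier scope underspecified.

```python
from dataclasses import dataclass, field


@dataclass
class EP:
    """One elementary predication: a labelled predicate with role arguments."""
    label: str       # handle that labels this predication, e.g. "h7"
    predicate: str   # simplified, illustrative predicate name
    args: dict       # role name -> variable or handle


@dataclass
class MRS:
    """A toy MRS: a top handle, a flat list of EPs, and handle constraints."""
    top: str
    eps: list
    # hole handle -> label it is constrained to ("qeq").
    # Simplification: a real qeq constraint also permits quantifiers to
    # intervene between the hole and the label; here it is a plain mapping.
    hcons: dict = field(default_factory=dict)

    def qeq_target(self, handle):
        """Return the label a hole is constrained to, or the handle itself."""
        return self.hcons.get(handle, handle)

    def eps_with_label(self, label):
        """Return all EPs that share the given label."""
        return [ep for ep in self.eps if ep.label == label]


def predicates_sharing(mrs, variable):
    """List predicates that take the given variable as one of their arguments."""
    return [ep.predicate for ep in mrs.eps if variable in ep.args.values()]


# Hand-written, simplified analysis of "Every dog barks" (illustrative only).
every_dog_barks = MRS(
    top="h0",
    eps=[
        EP("h4", "_every_q", {"ARG0": "x3", "RSTR": "h5", "BODY": "h6"}),
        EP("h7", "_dog_n_1", {"ARG0": "x3"}),
        EP("h1", "_bark_v_1", {"ARG0": "e2", "ARG1": "x3"}),
    ],
    hcons={"h0": "h1", "h5": "h7"},
)

if __name__ == "__main__":
    # The quantifier's restriction is constrained to the noun's label.
    restriction = every_dog_barks.qeq_target("h5")
    print([ep.predicate for ep in every_dog_barks.eps_with_label(restriction)])
    # All three predications are linked through the shared variable x3.
    print(predicates_sharing(every_dog_barks, "x3"))
```

In this toy structure the quantifier's BODY handle (h6) has no constraint, which is how the representation leaves scope open. For real work, an existing DELPH-IN toolkit should be preferred over a hand-rolled structure like this one.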
Besides deep linguistic processing tools, the DELPH-IN collaboration supplies computational resources for natural language processing, such as computational HPSG grammars and language prototypes:
DELPH-IN grammars: a catalogue of computational HPSG grammars, hand-crafted to capture deep linguistic analyses specific to their respective languages.[6]
LinGO Grammar Matrix: an open-source starter kit for rapid prototyping of precision broad-coverage grammars compatible with the LKB (Linguistic Knowledge Builder) grammar development environment. It contains a library of common language phenomena that computational grammarians can inherit for their HPSG grammars.[7]
CLIMB libraries (Comparative Libraries of Implementations with Matrix Basis): an extended language library built on the Grammar Matrix. The objective of the CLIMB library is to maintain alternative analyses of the same phenomenon across different languages to test their impact on long-term grammar development.[8]
MRS Test Suite: a short but representative set of sentences designed to capture some Minimal Recursion Semantics phenomena. The test suite is available in Bulgarian, English, French, German, Greek, Japanese, Mandarin, Norwegian, Portuguese, Russian and Spanish.[9]
WikiWoods: a parsed corpus that provides rich syntacto-semantic annotations for the English Wikipedia.[10]
DeepBank: an ongoing project to annotate the one million words of 1989 Wall Street Journal text (the same set of sentences annotated in the original Penn Treebank project) with the English Resource Grammar, augmented with a robust approximating PCFG for complete coverage.[11][12]
The Cathedral and the Bazaar: a compilation of an early essay on open source by Eric Raymond with translations into multiple languages. It was proposed as a multilingual shared test suite for comparing parses across different grammars.[13][14]
The open-source culture of the DELPH-IN collaboration provides the natural language processing community with an array of deep linguistic processing tools and resources. However, the usability of DELPH-IN tools has been an issue for users and application developers new to the DELPH-IN ecosystem.[citation needed] The DELPH-IN developers are aware of these usability issues, and there are ongoing attempts to improve the documentation and tutorials for DELPH-IN technologies.[15]