A structure editor, also structured editor or projectional editor, is any document editor that is cognizant of the document's underlying structure. Structure editors can be used to edit hierarchical or marked up text, computer programs, diagrams, chemical formulas, and any other type of content with clear and well-defined structure. In contrast, a text editor is any document editor used for editing plain text files.[ clarification needed ]
Typically, the benefits of text and structure editing are combined in the user interface of a single hybrid tool. For example, Emacs is fundamentally a text editor, but supports the manipulation of words, sentences, and paragraphs as structures that are inferred from the text. Conversely, Dreamweaver is fundamentally a structure editor for marked up web documents, but supports the display and manipulation of raw HTML text as well. Similarly, molecule editors typically support both graphical and textual input. Structure editing predominates when content is graphical and textual representations are awkward, e.g., CAD systems and PowerPoint. Text editing predominates when content is largely devoid of structure, e.g., text fields in web forms. WYSIWYG word processing systems such as Word, which appear to edit formatted text directly, are essentially structure editors for the underlying marked-up text.
In linguistics, syntax is the study of the structure of grammatical utterances, and accordingly syntax-directed editor is a synonym for structure editor. Language-based editor and language-sensitive editor are also synonyms. A language-based editor's features may be implemented by ad hoc code or by a formal grammar. For example, language sensitivity in Emacs is implemented in the Lisp definition of the edit mode for the given language. In contrast, language sensitivity in an XML editor is driven by a formal DTD schema for the given language.
Although structured editors allow the viewing and manipulation of the underlying document in a structured manner, the file format in which the document is stored on disk may or may not be heavily structured and may or may not be open or standardized (e.g., plain text versus Microsoft Word documents).
Structure editing has often been employed in source code editors, as source code is naturally structured by the syntax of the computer language. However, most source code editors are instead text editors with additional features such as syntax highlighting and code folding, rather than structure editors. The editors in some integrated development environments parse the source code and generate a parse tree, allowing the same analysis as by a structure editor, but the actual editing of the source code is generally done as raw text.
Each programming language typically has a well-defined syntax given by a context-free grammar, and accordingly the meaningful structural elements in source code written in the language correspond to the grammatical phrases in the text. Early syntax-directed source code editors included Interlisp-D (for Lisp’s limited syntax) and Emily [1] (for PL/I’s rich syntax).
A syntax-directed editor may treat grammar rules as generative (e.g., offering the user templates that correspond to one or more steps in a formal derivation of program text) or proscriptive (e.g., preventing a phrase of a given part of speech from being moved to a context where another part of speech is required) or analytic (e.g., parsing textual edits to create a structured representation). Structure editing features in source code editors make it harder to write programs with invalid syntax. Language-sensitive editors may impose syntactic correctness as an absolute requirement (e.g., as did Mentor [2] ), or may tolerate syntax errors after issuing a warning (e.g., as did the Cornell Program Synthesizer [3] ). Strict structured editors often make it difficult to perform edits that are easy to perform with plain text editors, which is one of the factors contributing to the lack of adoption of structured editing in some domains, such as source code editing.
Some syntax-directed editors monitor compliance with the context-sensitive constraints of a language such as type correctness. Such static-semantic constraints may be specified imperatively by actions (e.g., as in Gandalf [4] [5] [6] ), or declaratively by an attribute grammar (e.g., as in the Synthesizer Generator [7] [8] ) or by unification in a many-sorted algebra (e.g., as in PSG [9] ) or a logic program (e.g., as in Centaur [10] and Pan [11] ), with compliance checked by the underlying editing machinery. Structured editors vary in the degree to which they allow their users to perform edits that cause the document to become syntactically or semantically incorrect.
It is common for a language sensitive editor to represent a document as a parse tree with respect to language's grammar, or as an abstract syntax tree (AST). For example, a DOM tree is essentially an AST with respect to a given DTD. Frequently, the textual view of that underlying tree is generated by prettyprinting the underlying tree. Editors associated with intentional programming [12] and language-oriented programming for general-purpose languages and domain-specific languages share many of the features of language-sensitive editors, but aim for greater separation between the underlying representation (the intention) and the surface representation (text in a programming language).
In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a low-level programming language to create an executable program.
Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.
A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language.
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.
In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine.
In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text written in a formal language. Each node of the tree denotes a construct occurring in the text.
Syntax highlighting is a feature of text editors that is used for programming, scripting, or markup languages, such as HTML. The feature displays text, especially source code, in different colours and fonts according to the category of terms. This feature facilitates writing in a structured language such as a programming language or a markup language as both structures and syntax errors are visually distinct. This feature is also employed in many programming related contexts, either in the form of colorful books or online websites to make understanding code snippets easier for readers. Highlighting does not affect the meaning of the text itself; it is intended only for human readers.
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part.
In computing, RELAX NG is a schema language for XML—a RELAX NG schema specifies a pattern for the structure and content of an XML document. A RELAX NG schema is itself an XML document but RELAX NG also offers a popular compact, non-XML syntax. Compared to other XML schema languages RELAX NG is considered relatively simple.
In computing, a visual programming language or block coding is a programming language that lets users create programs by manipulating program elements graphically rather than by specifying them textually. A VPL allows programming with visual expressions, spatial arrangements of text and graphic symbols, used either as elements of syntax or secondary notation. For example, many VPLs are based on the idea of "boxes and arrows", where boxes or other screen objects are treated as entities, connected by arrows, lines or arcs which represent relations.
A lightweight markup language (LML), also termed a simple or humane markup language, is a markup language with simple, unobtrusive syntax. It is designed to be easy to write using any generic text editor and easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output.
A source-code editor is a text editor program designed specifically for editing source code of computer programs. It may be a standalone application or it may be built into an integrated development environment (IDE).
A program transformation is any operation that takes a computer program and generates another program. In many cases the transformed program is required to be semantically equivalent to the original, relative to a particular formal semantics and in fewer cases the transformations result in programs that semantically differ from the original in predictable ways.
In computer science, the syntax of a computer language is the rules that define the combinations of symbols that are considered to be correctly structured statements or expressions in that language. This applies both to programming languages, where the document represents source code, and to markup languages, where the document represents data.
The DMS Software Reengineering Toolkit is a proprietary set of program transformation tools available for automating custom source program analysis, modification, translation or generation of software systems for arbitrary mixtures of source languages for large scale software systems. DMS was originally motivated by a theory for maintaining designs of software called Design Maintenance Systems. DMS and "Design Maintenance System" are registered trademarks of Semantic Designs.
GrammaTech is a software-development tools vendor based in Bethesda, Maryland with a research center based in Ithaca, New York. The company was founded in 1988 as a technology spin-off of Cornell University. GrammaTech is a provider of application security testing products and software research services.
In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.
JetBrains MPS is a language workbench developed by JetBrains. MPS is a tool to design domain-specific languages (DSL). It uses projectional editing which allows users to overcome the limits of language parsers, and build DSL editors, such as ones with tables and diagrams.
It implements language-oriented programming. MPS is an environment for language definition, a language workbench, and integrated development environment (IDE) for such languages.
The Sweble Wikitext parser is an open-source tool to parse the Wikitext markup language used by MediaWiki, the software behind Wikipedia. The initial development was done by Hannes Dohrn as a Ph.D. thesis project at the Open Source Research Group of professor Dirk Riehle at the University of Erlangen-Nuremberg from 2009 until 2011. The results were presented to the public for the first time at the WikiSym conference in 2011. Before that, the dissertation was inspected and approved by an independent scientific peer-review and was published at ACM Press.
(Ray) Tim Teitelbaum is an American computer scientist known for his early work on integrated development environments (IDEs), syntax-directed editing, and incremental computation. He is Professor Emeritus at Cornell University. As an educator and faculty member of the Cornell University Computer Science Department since 1973, he was recognized for his large-scale teaching of introductory programming, and for his mentoring of highly successful graduate students. As a businessman, he is known for having co-founded GrammaTech, Inc. and for having been its sole CEO from 1988 to 2019.