Transclusion

In this example, the data of file B is transcluded into the document A.

In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by reference via hypertext. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user. [1] The result of transclusion is a single integrated document made of parts assembled dynamically from separate sources, possibly stored on different computers in disparate places.

Transclusion facilitates modular design (using the "single source of truth" model, whether in data, code, or content): a resource is stored once and distributed for reuse in multiple documents. Updates or corrections to a resource are then reflected in any referencing documents.

In systems where transclusion is not available, and in some situations where it is available but not desirable, substitution is often the complementary option, whereby a static copy of the "single source of truth" is integrated into the relevant document. Wikipedia uses both techniques in creating its content (see Wikipedia:Transclusion and Wikipedia:Substitution for more information). Substituted static copies introduce a different set of considerations for version control than transclusion does, but they are sometimes necessary.

Ted Nelson coined the term for his 1980 nonlinear book Literary Machines, but the idea of a master copy and occurrences was applied 17 years earlier, in Sketchpad. It is now a common technique among textbook writers, where a single topic needs to be discussed in multiple chapters. An advantage of this system in textbooks is that it reduces data redundancy and keeps the book to a manageable size.

Technical considerations

Context neutrality

Transclusion works better when transcluded sections of text are self-contained, so that the meaning and validity of the text are independent of context. For example, formulations like "as explained in the previous section" are problematic, because the transcluded section may appear in a different context, causing confusion. What constitutes "context-neutral" text varies, but often includes things like company information or boilerplate. To help overcome such context sensitivity issues, systems capable of transclusion are often also capable of suppressing particular elements within the transcluded content. For example, Wikipedia can use tags such as "noinclude", "onlyinclude", and "includeonly" for this purpose. Typical examples of elements that often require such exceptions are document titles, footnotes, and cross-references; in this way, they can be automatically suppressed upon transclusion, without manual reworking for each instance.
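The suppression tags mentioned above can be illustrated with a small sketch. It is a simplified model of how such tags behave, not MediaWiki's actual parser: the function names and the sample page are hypothetical, and nesting and malformed markup are ignored.

```python
import re

def render_direct(source: str) -> str:
    """Text shown when the page is viewed on its own."""
    # Content inside includeonly is hidden on the page itself.
    text = re.sub(r"<includeonly>.*?</includeonly>", "", source, flags=re.DOTALL)
    # noinclude and onlyinclude markers are dropped; their content is kept.
    return re.sub(r"</?(?:noinclude|onlyinclude)>", "", text)

def render_transcluded(source: str) -> str:
    """Text that appears when the page is transcluded into another page."""
    # If onlyinclude sections exist, nothing outside them is transcluded.
    only = re.findall(r"<onlyinclude>(.*?)</onlyinclude>", source, flags=re.DOTALL)
    if only:
        source = "".join(only)
    # Content inside noinclude is suppressed on transclusion.
    text = re.sub(r"<noinclude>.*?</noinclude>", "", source, flags=re.DOTALL)
    # includeonly markers are dropped; their content is kept.
    return re.sub(r"</?includeonly>", "", text)

# Hypothetical page: the title is context-specific, the address is not.
page = (
    "<noinclude>Title: Contact\n</noinclude>"
    "Acme Corp, 1 Main St."
    "<includeonly> (shared)</includeonly>"
)

print(render_direct(page))
print(render_transcluded(page))
```

Here the document title appears only when the page is read directly, while the transcluding document receives just the context-neutral address line.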

Parameterization

Under some circumstances, and in some technical contexts, transcluded sections of text may not require strict adherence to the "context neutrality" principle, because the transcluded sections are capable of parameterization. Parameterization implies the ability to modify certain portions or subsections of a transcluded text depending on exogenous variables that can be changed independently. This is customarily done by supplying a transcluded text with one or more substitution placeholders. These placeholders are then replaced with the corresponding variable values prior to rendering the final transcluded output in context.
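A minimal sketch of placeholder substitution, using Python's standard string.Template. The fragment text, the placeholder names, and the transclude helper are assumptions made for illustration only.

```python
from string import Template

# Hypothetical shared fragment, stored once, with two placeholders.
NOTICE = Template("This $product is maintained by the $team team.")

def transclude(fragment: Template, **params: str) -> str:
    """Fill the fragment's placeholders before it is placed in a document."""
    # substitute() raises KeyError if a placeholder has no value supplied.
    return fragment.substitute(**params)

# Two documents reuse the same fragment with different parameter values.
doc_a = "Manual. " + transclude(NOTICE, product="router", team="Networking")
doc_b = "FAQ. " + transclude(NOTICE, product="modem", team="Support")

print(doc_a)
print(doc_b)
```

Because the variable parts are supplied by each referencing document, the shared fragment itself does not have to be strictly context-neutral.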

Origins

The concept of reusing file content began with computer programming languages: COBOL in 1960, [2] followed by BCPL, PL/I, C, [3] and by 1978, even FORTRAN. An include directive allows common source code to be reused while avoiding the pitfalls of copy-and-paste programming and hard coding of constants. Include directives brought a problem of their own: when several directives pull in the same content, the same source code can be repeated in the final result, causing an error. Include guards solve this by omitting the content on every inclusion after the first. [4]
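The duplicate-inclusion problem and its remedy can be sketched with a toy include expander. The in-memory files, the "#include name" syntax, and the expand function are hypothetical; a real preprocessor is far more involved, and this sketch only mimics the effect of an include guard by remembering which files have already been included.

```python
import re

# Hypothetical in-memory "files"; both main and util include consts.
FILES = {
    "main": "#include consts\n#include util\nrun()",
    "util": "#include consts\ndef util(): ...",
    "consts": "LIMIT = 10",
}

INCLUDE = re.compile(r"^#include (\w+)$")

def expand(name, files, seen=None):
    """Return the lines of `name` with includes expanded, each file at most once."""
    seen = set() if seen is None else seen
    if name in seen:
        return []  # guard: content was already included, omit the duplicate
    seen.add(name)
    out = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            # Replace the directive with the (expanded) content it names.
            out.extend(expand(match.group(1), files, seen))
        else:
            out.append(line)
    return out

result = expand("main", FILES)
print("\n".join(result))
```

Without the `seen` check, the line defining LIMIT would appear twice in the output, once via main and once via util.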

The idea of a single, reusable source for information led to concepts such as "don't repeat yourself" and the abstraction principle. A further use was found in making programs more portable. Portable source code uses an include directive to specify a standard library, which contains system-specific source code that varies with each computer environment. [5]

History and implementation by Project Xanadu

Ted Nelson, who originated the words hypertext and hypermedia, also coined the term transclusion in his 1980 book Literary Machines. Part of his proposal was the idea that micropayments could be automatically exacted from the reader for all the text, no matter how many snippets of content are taken from various places.

However, according to Nelson, the concept of transclusion had already formed part of his 1965 description of hypertext. [6] Nelson defines transclusion as "...the same content knowably in more than one place," setting it apart from more special cases, such as the inclusion of content from a different location (which he calls transdelivery) or an explicit quotation that remains connected to its origins (which he calls transquotation).

Some hypertext systems, including Ted Nelson's own Xanadu Project, support transclusion. [7]

Nelson has delivered a demonstration of Web transclusion, the Little Transquoter (programmed to Nelson's specification by Andrew Pam in 2004–2005). [8] It creates a new format built on portion addresses from Web pages; when dereferenced, each portion on the resulting page remains click-connected to its original context.

Implementation on the Web

HTTP, as a transmission protocol, has rudimentary support for transclusion via byte serving: specifying a byte range in an HTTP request message.
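As a rough illustration of byte serving, the sketch below builds a Range request header and simulates the portion a server would return. No network request is made: the helper names and the sample body are assumptions, and only the simple "bytes=first-last" form with inclusive positions is handled.

```python
def range_header(first: int, last: int) -> dict:
    """Build a Range request header for bytes first..last (both inclusive)."""
    if first < 0 or last < first:
        raise ValueError("need 0 <= first <= last")
    return {"Range": f"bytes={first}-{last}"}

def serve_range(body: bytes, header: dict) -> bytes:
    """Simulate the part of `body` a server would send for a single simple range."""
    spec = header["Range"].split("=", 1)[1]
    first, last = (int(part) for part in spec.split("-"))
    # HTTP byte positions are inclusive, Python slices are end-exclusive.
    return body[first:last + 1]

# Hypothetical resource from which one word is transcluded.
body = b"The quick brown fox jumps over the lazy dog"
header = range_header(4, 8)
portion = serve_range(body, header)

print(header)
print(portion)
```

The requesting document needs to know only the resource and the byte positions; the rest of the resource is never transferred.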

Transclusion can occur either before (server-side) or after (client-side) transmission.

Publishers of web content may object to the transclusion of material from their own web sites into other web sites, or they may require an agreement to do so. Critics of the practice may refer to various forms of inline linking as bandwidth theft or leeching.

Other publishers may seek specifically to have their materials transcluded into other web sites, as in the form of web advertising, or as widgets like a hit counter or web bug.

Mashups make use of transclusion to assemble resources or data into a new application, as by placing geo-tagged photos on an interactive map, or by displaying business metrics in an interactive dashboard.

Client-side HTML

HTML defines elements for client-side transclusion of images, scripts, stylesheets, other documents, and other types of media. HTML has relied heavily on client-side transclusion from the earliest days of the Web (so web pages could be displayed more quickly before multimedia elements finished loading), rather than embedding the raw data for such objects inline into a web page's markup.

Through techniques such as Ajax, scripts associated with an HTML document can instruct a web browser to modify the document in-place, as opposed to the earlier technique of having to pull an entirely new version of the page from the web server. Such scripts may transclude elements or documents from a server after the web browser has rendered the page, in response to user input or changing conditions, for example.

Future versions of HTML may support deeper transclusion of portions of documents using XML technologies such as entities, XPointer document referencing, and XSLT manipulations.

Proxy servers may employ transclusion to reduce redundant transmissions of commonly requested resources.

AngularJS, a popular front-end framework developed and maintained by Google, has a directive called ng-transclude that marks the insertion point for the transcluded DOM of the nearest parent directive that uses transclusion. [9]

Server-side transclusion

Transclusion can be accomplished on the server side, as through Server Side Includes and markup entity references resolved by the server software. It is a feature of substitution templates.
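A server-side include can be sketched as a substitution step that runs before the page is sent. The example below recognizes only the include directive in its `virtual` form and reads fragments from a hypothetical in-memory table; it illustrates the idea and is not how any particular web server implements it.

```python
import re

# Hypothetical fragments, keyed by the path named in the directive.
FRAGMENTS = {"/footer.html": "<footer>Example Org</footer>"}

SSI_INCLUDE = re.compile(r'<!--#include virtual="([^"]+)" -->')

def resolve_includes(page: str, fragments: dict) -> str:
    """Replace each include directive with the fragment it names."""
    def replace(match):
        path = match.group(1)
        # Unknown paths become a visible marker rather than a guess.
        return fragments.get(path, f"[missing: {path}]")
    return SSI_INCLUDE.sub(replace, page)

page = '<body><p>Hello</p><!--#include virtual="/footer.html" --></body>'
served = resolve_includes(page, FRAGMENTS)

print(served)
```

The client receives a single assembled document and has no indication that the footer was stored separately.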

Transclusion of source code

Transclusion of source code into software design or reference materials lets source code be presented within the document, but not interpreted as part of the document, preserving the semantic consistency of the inserted code in relation to its source codebase.

Transclusion in content management

In content management for single-source publishing, high-end content management systems increasingly provide for transclusion and substitution. Component content management systems, especially, aim to take the modular design principle to its optimal degree. MediaWiki provides transclusion and substitution and is a good off-the-shelf option for many smaller organizations (such as smaller nonprofits and SMEs) that may not have the budget for commercial options; for details, see Component content management system.

Implementation in software development

A common feature in programming languages is the ability of one source code file to transclude, in whole or part, another source code file. The part transcluded is interpreted as if it were part of the transcluding file.

References

  1. Glushko, Robert J., ed. (2013). The Discipline of Organizing. Cambridge, Massachusetts: MIT Press. p. 231. ISBN 9780262518505.
  2. Initial Specifications for a COMMON BUSINESS ORIENTED LANGUAGE (COBOL) for Programming Electronic Digital Computers (PDF). Washington: Department of Defense. April 1960. pp. V-27. INCLUDE: Function: To save the programmer effort by automatically incorporating library subroutines into the source program.
  3. Ritchie, Dennis M. (1993-03-01). "The development of the C language". ACM SIGPLAN Notices. 28 (3): 201–208. doi:10.1145/155360.155580. Archived from the original on 27 February 2020. Many other changes occurred around 1972-3, but the most important was the introduction of the preprocessor, partly at the urging of Alan Snyder [Snyder 74], but also in recognition of the utility of the the[sic] file-inclusion mechanisms available in BCPL and PL/I. Its original version was exceedingly simple, and provided only included files and simple string replacements: #include and #define of parameterless macros. Soon thereafter, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation. The preprocessor was originally considered an optional adjunct to the language itself. Alt URL Archived 2020-02-04 at the Wayback Machine
  4. Stallman, Richard M.; Weinberg, Zachary. "Header Files" (PDF). The C Preprocessor: For gcc version 6.3.0 (GCC). pp. 10–11. Alternatives to Wrapper #ifndef : CPP supports two more ways of indicating that a header file should be read only once. Neither one is as portable as a wrapper '#ifndef' and we recommend you do not use them in new programs, with the caveat that '#import' is standard practice in Objective-C. [...] Another way to prevent a header file from being included more than once is with the '#pragma once' directive. If '#pragma once' is seen when scanning a header file, that file will never be read again, no matter what.
  5. Johnson, S. C.; Ritchie, D. M. (July–August 1978). "UNIX time-sharing system: Portability of C programs and the UNIX system". The Bell System Technical Journal. 57 (6): 2021–2048. doi:10.1002/j.1538-7305.1978.tb02141.x. ISSN 0005-8580. S2CID 17510065. Retrieved 27 February 2020. Even before the advent of the Interdata machine, it as realized, as mentioned above, that many programs depended to an undesirable degree not only on UNIX I/O conventions but on details of particularly favorable buffering strategies for the PDP-11. A package of routines, called the "portable I/O library," was written by M. E. Lesk and implemented on the Honeywell and IBM machines as well as the PDP-11 in a generally successful effort to overcome the deficiencies of earlier packages
  6. Theodor H. Nelson, "A File Structure for the Complex, the Changing and the Indeterminate." Proceedings of the ACM 20th National Conference (1965), pp. 84-100
  7. Kolbitsch, Josef; Maurer, Hermann (January 27, 2017). "Transclusions in an HTML-Based Environment" (PDF). Archived from the original (PDF) on July 1, 2017. Retrieved January 27, 2017.
  8. The Little Transquoter Xanadu.com.au
  9. "AngularJS". docs.angularjs.org. Retrieved 2016-08-11.

Further reading