Transclusion

In this example, the data of file B is transcluded into the document A.

In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by reference via hypertext. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user. [1] The result of transclusion is a single integrated document made of parts assembled dynamically from separate sources, possibly stored on different computers in disparate places.

Transclusion facilitates modular design (using the "single source of truth" model, whether in data, code, or content): a resource is stored once and distributed for reuse in multiple documents. Updates or corrections to a resource are then reflected in any referencing documents.

In systems where transclusion is not available, and in some situations where it is available but not desirable, substitution is often the complementary option, whereby a static copy of the "single source of truth" is integrated into the relevant document. Wikipedia uses both techniques in creating its content (see Wikipedia:Transclusion and Wikipedia:Substitution for more information). Substituted static copies introduce a different set of considerations for version control than transclusion does, but they are sometimes necessary.

Ted Nelson coined the term for his 1980 nonlinear book Literary Machines, but the idea of a master copy and occurrences was applied 17 years earlier, in Sketchpad.

Technical considerations

Context neutrality

Transclusion works better when transcluded sections of text are self-contained, so that the meaning and validity of the text are independent of context. For example, formulations like "as explained in the previous section" are problematic, because the transcluded section may appear in a different context, causing confusion. What constitutes "context-neutral" text varies, but often includes things like company information or boilerplate. To help overcome such context sensitivity issues, systems capable of transclusion are often also capable of suppressing particular elements within the transcluded content. For example, Wikipedia can use tags such as "noinclude", "onlyinclude", and "includeonly" for this purpose. Typical examples of elements that often require such exceptions are document titles, footnotes, and cross-references; in this way, they can be automatically suppressed upon transclusion, without manual reworking for each instance.
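The effect of such suppression tags can be illustrated with a minimal Python sketch. It is a simplified approximation of the tag behaviour described above, not the parser that Wikipedia actually runs; the function name view and the sample page text are assumptions made for illustration.

```python
import re

# Simplified patterns for the three tags; nesting is not handled.
NOINCLUDE = re.compile(r"<noinclude>(.*?)</noinclude>", re.DOTALL)
INCLUDEONLY = re.compile(r"<includeonly>(.*?)</includeonly>", re.DOTALL)
ONLYINCLUDE = re.compile(r"<onlyinclude>(.*?)</onlyinclude>", re.DOTALL)


def view(source: str, transcluded: bool) -> str:
    """Return the text shown on the page itself or when it is transcluded."""
    if transcluded:
        # If <onlyinclude> sections exist, only their content is carried over.
        parts = ONLYINCLUDE.findall(source)
        if parts:
            source = "".join(parts)
        source = NOINCLUDE.sub("", source)        # drop page-only material
        source = INCLUDEONLY.sub(r"\1", source)   # keep content, drop the tags
    else:
        source = INCLUDEONLY.sub("", source)      # hide transclusion-only material
        source = NOINCLUDE.sub(r"\1", source)     # keep content, drop the tags
        source = ONLYINCLUDE.sub(r"\1", source)
    return source


# Hypothetical page: the title is suppressed whenever the page is transcluded.
page = (
    "<noinclude>Company profile\n</noinclude>"
    "Acme Corp. was founded in 1990."
    "<includeonly> (from the company page)</includeonly>"
)
print(view(page, transcluded=True))
print(view(page, transcluded=False))
```

In this sketch the document title appears only when the page is read directly, while the attribution note appears only in documents that transclude it.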

Parameterization

Under some circumstances, and in some technical contexts, transcluded sections of text may not require strict adherence to the "context neutrality" principle, because the transcluded sections are capable of parameterization. Parameterization implies the ability to modify certain portions or subsections of a transcluded text depending on exogenous variables that can be changed independently. This is customarily done by supplying a transcluded text with one or more substitution placeholders. These placeholders are then replaced with the corresponding variable values prior to rendering the final transcluded output in context.
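As a rough illustration of placeholder substitution, the following Python sketch uses the standard library's string.Template. The boilerplate text, the placeholder names, and the transclude helper are assumptions chosen for the example rather than part of any particular transclusion system.

```python
from string import Template

# A single stored resource containing substitution placeholders.
BOILERPLATE = Template("$company was founded in $year and is based in $city.")


def transclude(template: Template, **params) -> str:
    """Replace every placeholder with the supplied value before rendering.

    Template.substitute raises KeyError if a placeholder has no value,
    so an incomplete parameter set is detected instead of silently rendered.
    """
    return template.substitute(**params)


# The same stored text is reused in two documents with different parameters.
print(transclude(BOILERPLATE, company="Acme", year=1990, city="Oslo"))
print(transclude(BOILERPLATE, company="Globex", year=2004, city="Lyon"))
```

Each referencing document supplies its own values, so the stored text does not have to be context-neutral with respect to those variables.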

Origins

The concept of reusing file content began with computer programming languages: COBOL in 1960, [2] followed by BCPL, PL/I, C, [3] and by 1978, even FORTRAN. An include directive allows common source code to be reused while avoiding the pitfalls of copy-and-paste programming and hard coding of constants. The mechanism introduced a problem of its own: when several include directives pull in the same content, directly or through other included files, that source code is inadvertently repeated in the final result, which can cause an error. Include guards help solve this by omitting the content after its first inclusion. [4]
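The duplicate-inclusion problem and its remedy can be sketched with a toy include resolver in Python. This is a model of the idea only, not how any real preprocessor is implemented: the in-memory files dictionary, the file names, and the expand function are assumptions, and the "seen" set plays the role that an include guard plays in C.

```python
import re

# Matches a line of the form: #include "name"
INCLUDE = re.compile(r'^#include "(.+)"$')


def expand(name, files, seen=None):
    """Return the lines of `name` with includes expanded, each file at most once."""
    if seen is None:
        seen = set()
    if name in seen:
        return []            # guard: content already included, omit the repeat
    seen.add(name)
    out = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line.strip())
        if match:
            out.extend(expand(match.group(1), files, seen))
        else:
            out.append(line)
    return out


# Hypothetical sources: both headers include common.h.
files = {
    "main.c": '#include "a.h"\n#include "b.h"\nint main(void) { return 0; }',
    "a.h": '#include "common.h"\nint a;',
    "b.h": '#include "common.h"\nint b;',
    "common.h": "typedef int T;",
}
print("\n".join(expand("main.c", files)))
```

Without the seen check, the line from common.h would appear twice in the expanded output, once through each header.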

The idea of a single, reusable source for information led to concepts such as "don't repeat yourself" and the abstraction principle. A further use was found in making programs more portable. Portable source code uses an include directive to specify a standard library, which contains system-specific source code that varies with each computer environment. [5]

History and implementation by Project Xanadu

Ted Nelson, who originated the words "hypertext" and "hypermedia", also coined the term "transclusion", in his 1980 book Literary Machines. Part of his proposal was the idea that micropayments could be automatically exacted from the reader for all the text, no matter how many snippets of content are taken from various places.

However, according to Nelson, the concept of transclusion had already formed part of his 1965 description of hypertext. [6] Nelson defines transclusion as, "...the same content knowably in more than one place," setting it apart from more special cases, such as the inclusion of content from a different location (which he calls transdelivery) or an explicit quotation that remains connected to its origins (which he calls transquotation).

Some hypertext systems, including Ted Nelson's own Xanadu Project, support transclusion. [7]

Nelson has delivered a demonstration of Web transclusion, the Little Transquoter (programmed to Nelson's specification by Andrew Pam in 2004–2005). [8] It creates a new format built on portion addresses from Web pages; when dereferenced, each portion on the resulting page remains click-connected to its original context.

Implementation on the Web

HTTP, as a transmission protocol, has rudimentary support for transclusion via byte serving: specifying a byte range in an HTTP request message.
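A byte-range request can be sketched in Python without any network access. The helper names range_header and serve_range are assumptions for illustration, and the server side is reduced to slicing an in-memory body. Only the single "first-last" form of a byte range is handled, in which both positions are inclusive.

```python
def range_header(first: int, last: int) -> dict:
    """Build the request header asking for bytes first..last (inclusive)."""
    return {"Range": f"bytes={first}-{last}"}


def serve_range(body: bytes, header_value: str) -> bytes:
    """Return the portion of `body` selected by a 'bytes=first-last' value."""
    unit, _, spec = header_value.partition("=")
    if unit != "bytes":
        raise ValueError("only the 'bytes' range unit is supported in this sketch")
    first, _, last = spec.partition("-")
    # Range positions are inclusive, Python slices are end-exclusive.
    return body[int(first): int(last) + 1]


document = b"Hello, transclusion!"
headers = range_header(7, 18)
print(headers)                                  # {'Range': 'bytes=7-18'}
print(serve_range(document, headers["Range"]))  # b'transclusion'
```

A referencing document could use such a request to obtain only the portion of a remote resource that it needs to display.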

Transclusion can occur either before (server-side) or after (client-side) transmission.

Publishers of web content may object to the transclusion of material from their own web sites into other web sites, or they may require an agreement to do so. Critics of the practice may refer to various forms of inline linking as bandwidth theft or leeching.

Other publishers may seek specifically to have their materials transcluded into other web sites, as in the form of web advertising, or as widgets like a hit counter or web bug.

Mashups make use of transclusion to assemble resources or data into a new application, as by placing geo-tagged photos on an interactive map, or by displaying business metrics in an interactive dashboard.

Client-side HTML

HTML defines elements for client-side transclusion of images, scripts, stylesheets, other documents, and other types of media. HTML has relied heavily on client-side transclusion from the earliest days of the Web (so web pages could be displayed more quickly before multimedia elements finished loading), rather than embedding the raw data for such objects inline into a web page's markup.

Through techniques such as Ajax, scripts associated with an HTML document can instruct a web browser to modify the document in-place, as opposed to the earlier technique of having to pull an entirely new version of the page from the web server. Such scripts may transclude elements or documents from a server after the web browser has rendered the page, in response to user input or changing conditions, for example.

Future versions of HTML may support deeper transclusion of portions of documents using XML technologies such as entities, XPointer document referencing, and XSLT manipulations.

Proxy servers may employ transclusion to reduce redundant transmissions of commonly requested resources.

AngularJS, a popular front-end framework developed and maintained by Google, has a directive called ng-transclude that marks the insertion point for the transcluded DOM of the nearest parent directive that uses transclusion.

Server-side transclusion

Transclusion can be accomplished on the server side, for example through Server Side Includes and markup entity references resolved by the server software. It is a feature of substitution templates.
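Server-side resolution can be sketched as follows in Python. The directive form matched here follows the common Server Side Includes syntax for an include with a virtual path. The documents dictionary, the paths, and the render_ssi function are assumptions standing in for a real web server, and the depth limit is a simple safeguard added for the sketch.

```python
import re

# Matches directives such as: <!--#include virtual="/footer.html" -->
SSI_INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def render_ssi(path: str, documents: dict, depth: int = 0) -> str:
    """Return the document at `path` with include directives replaced.

    Included documents are resolved recursively; the depth limit guards
    against documents that include each other in a cycle.
    """
    if depth > 10:
        raise RecursionError(f"include depth exceeded at {path}")

    def replace(match):
        return render_ssi(match.group(1), documents, depth + 1)

    return SSI_INCLUDE.sub(replace, documents[path])


# Hypothetical site: the footer is stored once and reused by every page.
documents = {
    "/index.html": '<h1>Home</h1>\n<!--#include virtual="/footer.html" -->',
    "/about.html": '<h1>About</h1>\n<!--#include virtual="/footer.html" -->',
    "/footer.html": "<p>Contact: info@example.org</p>",
}
print(render_ssi("/index.html", documents))
```

Because the directive is resolved before transmission, the client receives a single assembled document and never sees the directive itself.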

Transclusion of source code

Transclusion of source code into software design or reference materials lets source code be presented within the document, but not interpreted as part of the document, preserving the semantic consistency of the inserted code in relation to its source codebase.

Transclusion in content management

In content management for single-source publishing, content management systems increasingly provide for transclusion and substitution. Component content management systems, especially, aim to take the modular design principle to its optimal degree. MediaWiki provides transclusion and substitution and is a good off-the-shelf option for many smaller organizations (such as smaller nonprofits and SMEs) that may not have the budget for commercial options; for details, see Component content management system.

Implementation in software development

A common feature in programming languages is the ability of one source code file to transclude, in whole or part, another source code file. The part transcluded is interpreted as if it were part of the transcluding file.

References

  1. Glushko, Robert J., ed. (2013). The Discipline of Organizing. Cambridge, Massachusetts: MIT Press. p. 231. ISBN 9780262518505.
  2. Initial Specifications for a COMMON BUSINESS ORIENTED LANGUAGE (COBOL) for Programming Electronic Digital Computers (PDF). Washington: Department of Defense. April 1960. pp. V-27. INCLUDE: Function: To save the programmer effort by automatically incorporating library subroutines into the source program.
  3. Ritchie, Dennis M. (1993-03-01). "The development of the C language". ACM SIGPLAN Notices. 28 (3): 201–208. doi:10.1145/155360.155580. Archived from the original on 27 February 2020. Many other changes occurred around 1972-3, but the most important was the introduction of the preprocessor, partly at the urging of Alan Snyder [Snyder 74], but also in recognition of the utility of the the [sic] file-inclusion mechanisms available in BCPL and PL/I. Its original version was exceedingly simple, and provided only included files and simple string replacements: #include and #define of parameterless macros. Soon thereafter, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation. The preprocessor was originally considered an optional adjunct to the language itself.
  4. Stallman, Richard M.; Weinberg, Zachary. "Header Files" (PDF). The C Preprocessor: For gcc version 6.3.0 (GCC). pp. 10–11. Alternatives to Wrapper #ifndef: CPP supports two more ways of indicating that a header file should be read only once. Neither one is as portable as a wrapper '#ifndef' and we recommend you do not use them in new programs, with the caveat that '#import' is standard practice in Objective-C. [...] Another way to prevent a header file from being included more than once is with the '#pragma once' directive. If '#pragma once' is seen when scanning a header file, that file will never be read again, no matter what.
  5. Johnson, S. C.; Ritchie, D. M. (July–August 1978). "UNIX time-sharing system: Portability of C programs and the UNIX system". The Bell System Technical Journal. 57 (6): 2021–2048. doi:10.1002/j.1538-7305.1978.tb02141.x. ISSN 0005-8580. S2CID 17510065. Retrieved 27 February 2020. Even before the advent of the Interdata machine, it was realized, as mentioned above, that many programs depended to an undesirable degree not only on UNIX I/O conventions but on details of particularly favorable buffering strategies for the PDP-11. A package of routines, called the "portable I/O library," was written by M. E. Lesk and implemented on the Honeywell and IBM machines as well as the PDP-11 in a generally successful effort to overcome the deficiencies of earlier packages.
  6. Theodor H. Nelson, "A File Structure for the Complex, the Changing and the Indeterminate." Proceedings of the ACM 20th National Conference (1965), pp. 84–100.
  7. Kolbitsch, Josef; Maurer, Hermann (January 27, 2017). "Transclusions in an HTML-Based Environment" (PDF). Archived from the original (PDF) on July 1, 2017. Retrieved January 27, 2017.
  8. The Little Transquoter. Xanadu.com.au.
  9. "AngularJS". docs.angularjs.org. Retrieved 2016-08-11.

Further reading