XS (Perl)

XS is a Perl foreign function interface through which a program can call a C or C++ subroutine. The name XS, like the related term xsub, derives from "eXternal Subroutine".

XS also refers to the glue language used to specify such calling interfaces (see below).

Background

Subroutine libraries in Perl are called modules, and modules that contain xsubs are called XS modules. Perl provides a framework for developing, packaging, distributing, and installing modules.

It may be desirable for a Perl program to invoke a C subroutine in order to handle CPU-intensive or memory-intensive tasks, to interface with hardware or low-level system facilities, or to make use of existing C subroutine libraries.

Perl interpreter

The Perl interpreter is a C program, so in principle there is no obstacle to calling from Perl to C. However, the XS interface is complex and highly technical, and using it requires some understanding of the interpreter. The earliest reference on the subject was the perlguts POD.

Wrappers

It is possible to write XS modules that wrap C++ code. Doing so is mostly a matter of configuring the module build system. [1]
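As an illustration of what such build configuration can look like, the following is a minimal sketch of a Makefile.PL for ExtUtils::MakeMaker. The module name, the version, and the compiler name c++ are placeholders chosen for this example, and the attribute values are assumptions rather than settings taken from the cited article; a real project may need different ones.

```perl
# Makefile.PL: hypothetical build script for an XS module that wraps C++ code.
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'Demo::XSModule',   # placeholder module name
    VERSION => '0.01',             # placeholder version
    CC      => 'c++',              # assumed name of the C++ compiler on this system
    LD      => '$(CC)',            # link with the same driver so the C++ runtime is included
    XSOPT   => '-C++',             # have xsubpp add extern "C" to the generated code
);
```

Running perl Makefile.PL with such a file generates a Makefile; the XS file itself is picked up from the module directory by MakeMaker.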

Example code

The following shows an XS module that exposes a function concat() to concatenate two strings (i.e., the equivalent of Perl’s . operator).

#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

SV* _do_sv_catsv(pTHX_ SV* one_sv, SV* two_sv) {
    SV* one_copy = newSVsv(one_sv);
    sv_catsv(one_copy, two_sv);

    return one_copy;
}

MODULE = Demo::XSModule    PACKAGE = Demo::XSModule

SV*
concat(SV* one_sv, SV* two_sv)
    CODE:
        SV* to_return = _do_sv_catsv(aTHX_ one_sv, two_sv);
        RETVAL = to_return;
    OUTPUT:
        RETVAL

The first four lines (#define and #include statements) are standard boilerplate.

These are followed by any number of plain C functions that are callable locally.

The section that starts with MODULE = Demo::XSModule defines the Perl interface to this code using the actual XS macro language. Note that the C code under the CODE: section calls the _do_sv_catsv() pure-C function that was defined in the prior section.

Perl’s documentation explains the meaning and purpose of all of the “special” symbols (e.g., aTHX_ and RETVAL) shown above.

To make this module available to Perl it must be compiled. Build tools like ExtUtils::MakeMaker can do this automatically. (To build manually: the xsubpp tool parses an XS module and outputs C source code; that source code is then compiled to a shared library and placed in a directory where Perl can find it.) Perl code then uses a module like XSLoader to load the compiled XS module. At this point Perl can call Demo::XSModule::concat('foo', 'bar') and receive back a string foobar, as if concat() were itself written in Perl.
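The Perl side of the loading step can be sketched as follows. This is a single-file illustration, not the layout of a real distribution: the version number is a placeholder, and the pure-Perl stand-in for concat() is a hypothetical addition, included only so that the script also runs where the compiled code has not been built.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Demo::XSModule;

our $VERSION = '0.01';    # placeholder version

require XSLoader;

# XSLoader::load() dies when no compiled object for the package can be found,
# so the call is wrapped in eval to keep this sketch runnable without a build.
my $have_xs = eval { XSLoader::load(__PACKAGE__, $VERSION); 1 };

# Hypothetical pure-Perl stand-in, installed only if the XS code did not load.
unless ($have_xs) {
    *concat = sub { my ($one, $two) = @_; return $one . $two };
}

package main;

# The caller cannot tell whether concat() is implemented in C or in Perl.
print Demo::XSModule::concat('foo', 'bar'), "\n";
```

In an actual module the package portion would normally live in its own .pm file and call XSLoader::load() directly, without the eval or the stand-in.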

Note that, for building Perl interfaces to preexisting C libraries, the h2xs utility can automate much of the creation of the XS file itself.

Difficulties

Creating and maintaining XS modules requires expertise in C itself as well as in Perl’s extensive C API. XS modules can only be installed if a C compiler and the header files that the Perl interpreter was compiled against are available. In addition, new versions of Perl may break binary compatibility, requiring XS modules to be recompiled.

References

  1. "Gluing C++ And Perl Together". johnkeiser.com. August 27, 2001.