|Designed by||Neil Pappalardo|
ANSI X11.1-1995 / December 8, 1995
|PSL, Caché ObjectScript|
MUMPS ("Massachusetts General Hospital Utility Multi-Programming System"), or M, is a general-purpose computer programming language originally designed in 1966 for the healthcare industry. Its differentiating feature is its "built-in" database, enabling high-level access to disk storage using simple symbolic program variables and subscripted arrays; similar to the variables used by most languages to access main memory.
It continues to be used today by many large hospitals and banks to provide high-throughput transaction data processing.
MUMPS was developed by Neil Pappalardo, Robert Greenes, and Curt Marble in Dr. Octo Barnett's animal lab at the Massachusetts General Hospital (MGH) in Boston during 1966 and 1967. It was later rewritten by technical leaders Dennis "Dan" Brevik and Paul Stylosof DEC in 1970 and 1971. Evelyn Dow was the original Marketing representative.
The original MUMPS system was, like Unix a few years later, built on a spare DEC PDP-7. Octo Barnett and Neil Pappalardo were also involved with MGH's planning for a Hospital Information System, obtained a backward compatible PDP-9, and began using MUMPS in the admissions cycle and laboratory test reporting. MUMPS was then an interpreted language, yet even then, incorporated a hierarchical database file system to standardize interaction with the data and abstract disk operations so they were only done by the MUMPS language itself.
Some aspects of MUMPS can be traced from Rand Corporation's JOSS through BBN's TELCOMP and STRINGCOMP.
The MUMPS team deliberately chose to include portability between machines as a design goal.
Another feature, not widely supported for machines of the era, in operating systems or in computer hardware, was multitasking, which was also built into the language itself as evidenced by the "-MPS" in the language name of "MUMPS" stands for Multi-Programming System. Although timesharing on Mainframe computers was increasingly common in systems such as Multics, most mini-computers (and they were the smallest available at that time) did not run parallel programs and threading was not available at all. Even on Mainframes, the variant of batch processing where a program was run to completion was the most common implementation for an operating system of multi-programming.
It was a few years until Unix was developed. The lack of memory management hardware also meant that all multi-processing was fraught with the possibility that a memory pointer could change some other process. MUMPS programs do not have a standard way to refer to memory directly at all, in contrast to C language, so since the multitasking was enforced by the language, not by any program written in the language it was impossible to have the risk that existed for other systems.
Dan Brevik'sDEC MUMPS-15 system was adapted to a DEC PDP-15, where it lived for some time. It was first installed at Health Data Management Systems of Denver in May of 1971. The portability proved to be useful and MUMPS was awarded a government research grant, and so MUMPS was released to the public domain which was the a requirement for grants. MUMPS was soon ported to a number of other systems including the popular DEC PDP-8, the Data General Nova and on DEC PDP-11 and the Artronix PC12 minicomputer. Word about MUMPS spread mostly through the medical community, and was in widespread use, often being locally modified for their own needs.
By the early 1970's, there were many and varied implementations of MUMPS on a range of hardware platforms. Another noteworthy platform was Paul Stylos'DEC MUMPS-11 on the PDP-11, and MEDITECH's MIIS. In the Fall of 1972, many MUMPS users attended a conference in Boston which standardized the then-fractured language, and created the MUMPS Users Group and MUMPS Development Committee (MDC) to do so. These efforts proved successful; a standard was complete by 1974, and was approved, on September 15, 1977, as ANSI standard, X11.1-1977. At about the same time DEC launched DSM-11 (Digital Standard MUMPS) for the PDP-11. This quickly dominated the market, and became the reference implementation of the time. Also, InterSystems sold ISM-11 for the PDP-11 (which was identical to DSM-11).
During the early 1980s several vendors brought MUMPS-based platforms that met the ANSI standard to market. The most significant were:
Other companies developed important MUMPS implementations:
This period also saw considerable MDC activity. The second revision of the ANSI standard for MUMPS (X11.1-1984) was approved on November 15, 1984.
The US Department of Veterans Affairs (formerly the Veterans Administration) was one of the earliest major adopters of the MUMPS language. Their development work (and subsequent contributions to the free MUMPS application codebase) was an influence on many medical users worldwide. In 1995, the Veterans Affairs' patient Admission/Tracking/Discharge system, Decentralized Hospital Computer Program (DHCP) was the recipient of the Computerworld Smithsonian Award for best use of Information Technology in Medicine. In July 2006, the Department of Veterans Affairs (VA) / Veterans Health Administration (VHA) was the recipient of the Innovations in American Government Award presented by the Ash Institute of the John F. Kennedy School of Government at Harvard University for its extension of DHCP into the Veterans Health Information Systems and Technology Architecture (VistA). Nearly the entire VA hospital system in the United States, the Indian Health Service, and major parts of the Department of Defense CHCS hospital system use MUMPS databases for clinical data tracking. In 2015 the Department of Defense awarded a 10 year contract to Leidos, Cerner, and Accenture to replace CHCS.In 2017 the Veterans Health Administration (VHA) announced that it would replace VistA with Cerner by 2024 or 2025. In 2019 the VHA announced that it would also replace its Epic Systems online appointment scheduling system with one from Cerner by 2024.
Other healthcare IT companies using MUMPS include Epic, MEDITECH, GE Healthcare (formerly IDX Systems and Centricity), AmeriPath (part of Quest Diagnostics), Care Centric, Allscripts, Coventry Healthcare, EMIS, and Sunquest Information Systems (formerly Misys Healthcare). Many reference laboratories, such as DASA, Quest Diagnostics, and Dynacare, use MUMPS software written by or based on Antrim Corporation code. Antrim was purchased by Misys Healthcare (now Sunquest Information Systems) in 2001.
MUMPS is also widely used in financial applications. MUMPS gained an early following in the financial sector and is in use at many banks and credit unions. It is used by Ameritrade, the largest online trading service in the US, with over 12 billion transactions per day, as well as by the Bank of England and Barclays Bank, among others.
Since 2005, the use of MUMPS has been either in the form of GT.M or InterSystems Caché. The latter is being aggressively marketed by InterSystems and has had success in penetrating new markets, such as telecommunications, in addition to existing markets. The European Space Agency announced on May 13, 2010 that it will use the InterSystems Caché database to support the Gaia mission. This mission aims to map the Milky Way with unprecedented precision.
MUMPS is a language intended for and designed to build database applications. Secondary language features were included to help programmers make applications using minimal computing resources. The original implementations were interpreted, though modern implementations may be fully or partially compiled. Individual "programs" run in memory "partitions". Early MUMPS memory partitions were limited to 2048 bytes so aggressive abbreviation greatly aided multi-programming on severely resource limited hardware, because more than one MUMPS job could fit into the very small memories extant in hardware at the time. The ability to provide multi-user systems was another language design feature. The word "Multi-Programming" in the acronym points to this. Even the earliest machines running MUMPS supported multiple jobs running at the same time. With the change from mini-computers to micro-computers a few years later, even a "single user PC" with a single 8-bit CPU and 16K or 64K of memory could support multiple users, who could connect to it from (non-graphical) video display terminals.
Since memory was tight originally, the language design for MUMPS valued very terse code. Thus, every MUMPS command or function name could be abbreviated from one to three letters in length, e.g. Quit (exit program) as Q, $P = $Piece function, R = Read command, $TR = $Translate function. Spaces and end-of-line markers are significant in MUMPS because line scope promoted the same terse language design. Thus, a single line of program code could express, with few characters, an idea for which other programming languages could require 5 to 10 times as many characters. Abbreviation was a common feature of languages designed in this period (e.g., FOCAL-69, early BASICs such as Tiny BASIC, etc.). An unfortunate side effect of this, coupled with the early need to write minimalist code, was that MUMPS programmers routinely did not comment code and used extensive abbreviations. This meant that even an expert MUMPS programmer could not just skim through a page of code to see its function but would have to analyze it line by line.
Database interaction is transparently built into the language. The MUMPS language provides a hierarchical database made up of persistent sparse arrays, which is implicitly "opened" for every MUMPS application. All variable names prefixed with the caret character ("^") use permanent (instead of RAM) storage, will maintain their values after the application exits, and will be visible to (and modifiable by) other running applications. Variables using this shared and permanent storage are called Globals in MUMPS, because the scoping of these variables is "globally available" to all jobs on the system. The more recent and more common use of the name "global variables" in other languages is a more limited scoping of names, coming from the fact that unscoped variables are "globally" available to any programs running in the same process, but not shared among multiple processes. The MUMPS Storage mode (i.e. Globals stored as persistent sparse arrays), gives the MUMPS database the characteristics of a document-oriented database.
to modify a nested child node of ^Car. In MUMPS terms, "Color" is the 2nd subscript of the variable ^Car (both the names of the child-nodes and the child-nodes themselves are likewise called subscripts). Hierarchical variables are similar to objects with properties in many object oriented languages. Additionally, the MUMPS language design requires that all subscripts of variables are automatically kept in sorted order. Numeric subscripts (including floating-point numbers) are stored from lowest to highest. All non-numeric subscripts are stored in alphabetical order following the numbers. In MUMPS terminology, this is canonical order. By using only non-negative integer subscripts, the MUMPS programmer can emulate the arrays data type from other languages. Although MUMPS does not natively offer a full set of DBMS features such as mandatory schemas, several DBMS systems have been built on top of it that provide application developers with flat-file, relational and network database features.
Additionally, there are built-in operators which treat a delimited string (e.g., comma-separated values) as an array. Early MUMPS programmers would often store a structure of related information as a delimited string, parsing it after it was read in; this saved disk access time and offered considerable speed advantages on some hardware.
MUMPS has no data types. Numbers can be treated as strings of digits, or strings can be treated as numbers by numeric operators (coerced, in MUMPS terminology). Coercion can have some odd side effects, however. For example, when a string is coerced, the parser turns as much of the string (starting from the left) into a number as it can, then discards the rest. Thus the statement
IF 20<"30 DUCKS" is evaluated as
TRUE in MUMPS.
Other features of the language are intended to help MUMPS applications interact with each other in a multi-user environment. Database locks, process identifiers, and atomicity of database update transactions are all required of standard MUMPS implementations.
In contrast to languages in the C or Wirth traditions, some space characters between MUMPS statements are significant. A single space separates a command from its argument, and a space, or newline, separates each argument from the next MUMPS token. Commands which take no arguments (e.g.,
ELSE) require two following spaces. The concept is that one space separates the command from the (nonexistent) argument, the next separates the "argument" from the next command. Newlines are also significant; an
FOR command processes (or skips) everything else till the end-of-line. To make those statements control multiple lines, you must use the
DO command to create a code block.
A simple Hello world program in MUMPS might be:
hello() write "Hello, World!",! quit
and would be run from the MUMPS command line with the command
do ^hello. Since MUMPS allows commands to be strung together on the same line, and since commands can be abbreviated to a single letter, this routine could be made more compact:
hello() w "Hello, World!",! q
,!' after the text generates a newline.
ANSI X11.1-1995 gives a complete, formal description of the language; an annotated version of this standard is available online.
Data types: There is one universal datatype, which is implicitly coerced to string, integer, or floating-point datatypes as context requires.
Booleans (called truthvalues in MUMPS): In IF commands and other syntax that has expressions evaluated as conditions, any string value is evaluated as a numeric value, and if that is a nonzero value, then it is interpreted as True.
a<b yields 1 if a is less than b, 0 otherwise.
Declarations: None. All variables are dynamically created at the first time a value is assigned.
Lines: are important syntactic entities, unlike their status in languages patterned on C or Pascal. Multiple statements per line are allowed and are common. The scope of any IF, ELSE, and FOR command is "the remainder of current line."
Case sensitivity: Commands and intrinsic functions are case-insensitive. In contrast, variable names and labels are case-sensitive. There is no special meaning for upper vs. lower-case and few widely followed conventions. The percent sign (%) is legal as first character of variables and labels.
Postconditionals: execution of almost any command can be controlled by following it with a colon and a truthvalue expression.
SET:N<10 A="FOO" sets A to "FOO" if N is less than 10;
DO:N>100 PRINTERR, performs PRINTERR if N is greater than 100. This construct provides a conditional whose scope is less than a full line.
Abbreviation: You can abbreviate nearly all commands and native functions to one, two, or three characters.
Reserved words: None. Since MUMPS interprets source code by context, there is no need for reserved words. You may use the names of language commands as variables, so the following is perfectly legal MUMPS code:
GREPTHIS() NEW SET,NEW,THEN,IF,KILL,QUIT SET IF="KILL",SET="11",KILL="l1",QUIT="RETURN",THEN="KILL" IF IF=THEN DO THEN QUIT:$QUIT QUIT QUIT ; (quit) THEN IF IF,SET&KILL SET SET=SET+KILL QUIT
MUMPS can be made more obfuscated by using the contracted operator syntax, as shown in this terse example derived from the example above:
GREPTHIS() N S,N,T,I,K,Q S I="K",S="11",K="l1",Q="R",T="K" I I=T D T Q:$Q Q Q T I I,S&K S S=S+K Q
Arrays: are created dynamically, stored as B-trees, are sparse (i.e. use almost no space for missing nodes), can use any number of subscripts, and subscripts can be strings or numeric (including floating point). Arrays are always automatically stored in sorted order, so there is never any occasion to sort, pack, reorder, or otherwise reorganize the database. Built in functions such as $DATA, $ORDER, $NEXT(deprecated) and $QUERY functions provide efficient examination and traversal of the fundamental array structure, on disk or in memory.
for i=10000:1:12345 set sqtable(i)=i*i set address("Smith","Daniel")="email@example.com"
Local arrays: variable names not beginning with caret (i.e. "^") are stored in memory by process, are private to the creating process, and expire when the creating process terminates. The available storage depends on implementation. For those implementations using partitions, it is limited to the partition size (a small partition might be 32K). For other implementations, it may be several megabytes.
^abc, ^def. These are stored on disk, are available to all processes, and are persistent when the creating process terminates. Very large globals (for example, hundreds of gigabytes) are practical and efficient in most implementations. This is MUMPS' main "database" mechanism. It is used instead of calling on the operating system to create, write, and read files.
Indirection: in many contexts,
@VBL can be used, and effectively substitutes the contents of VBL into another MUMPS statement.
SET XYZ="ABC" SET @XYZ=123 sets the variable ABC to 123.
SET SUBROU="REPORT" DO @SUBROU performs the subroutine named REPORT. This substitution allows for lazy evaluation and late binding as well as effectively the operational equivalent of "pointers" in other languages.
Piece function: This breaks variables into segmented pieces guided by a user specified separator string (sometimes called a "delimiter"). Those who know awk will find this familiar.
$PIECE(STRINGVAR,"^",3) means the "third caret-separated piece of STRINGVAR." The piece function can also appear as an assignment (SET command) target.
$PIECE("world.std.com",".",2) yields "std".
SET $P(X,"@",1)="office" causes X to become "firstname.lastname@example.org" (note that $P is equivalent to $PIECE and could be written as such).
Order function: This function treats its input as a structure, and finds the next index that exists which has the same structure except for the last subscript. It returns the sorted value that is ordered after the one given as input. (This treats the array reference as a content-addressable data rather than an address of a value.)
$Order(stuff("")) yields 6,
$Order(stuff(6)) yields 10,
$Order(stuff(8)) yields 10,
$Order(stuff(10)) yields 15,
$Order(stuff(15)) yields "".
Set i="" For Set i=$O(stuff(i)) Quit:i="" Write !,i,10,stuff(i)
Here, the argument-less For repeats until stopped by a terminating Quit. This line prints a table of i and stuff(i) where i is successively 6, 10, and 15.
For iterating the database, the Order function returns the next key to use.
GTM>S n="" GTM>S n=$order(^nodex(n)) GTM>zwr n n=" building" GTM>S n=$order(^nodex(n)) GTM>zwr n n=" name:gd" GTM>S n=$order(^nodex(n)) GTM>zwr n n="%kml:guid"
Multi-User/Multi-Tasking/Multi-Processor: MUMPS supports multiple simultaneous users and processes even when the underlying operating system does not (e.g., MS-DOS). Additionally, there is the ability to specify an environment for a variable, such as by specifying a machine name in a variable (as in
SET ^|"DENVER"|A(1000)="Foo"), which can allow you to access data on remote machines.
Some aspects of MUMPS syntax differ strongly from that of more modern languages, which can cause confusion. Whitespace is not allowed within expressions, as it ends a statement:
2 + 3 is an error, and must be written
2+3. All operators have the same precedence and are left-associative (
2+3*10 evaluates to 50). The operators for "less than or equal to" and "greater than or equal to" are
'< (that is, the boolean negation operator
' plus a strict comparison operator). Periods (
.) are used to indent the lines in a DO block, not whitespace. The ELSE command does not need a corresponding IF, as it operates by inspecting the value in the builtin system variable
MUMPS scoping rules are more permissive than other modern languages. Declared local variables are scoped using the stack. A routine can normally see all declared locals of the routines below it on the call stack, and routines cannot prevent routines they call from modifying their declared locals, unless the caller manually creates a new stack level (
do) and aliases each of the variables they wish to protect (
. new x,y) before calling any child routines. By contrast, undeclared variables (variables created by using them, rather than declaration) are in scope for all routines running in the same process, and remain in scope until the program exits.
Because MUMPS database references differ from to internal variable references only in the caret prefix, it is dangerously easy to unintentionally edit the database, or even to delete a database "table".
This section needs additional citations for verification . (October 2018) (Learn how and when to remove this template message)
All of the following positions can be, and have been, supported by knowledgeable people at various times:
Some of the contention arose in response to strong M advocacy on the part of one commercial interest, InterSystems, whose chief executive disliked the name MUMPS and felt that it represented a serious marketing obstacle. Thus, favoring M to some extent became identified as alignment with InterSystems. The dispute also reflected rivalry between organizations (the M Technology Association, the MUMPS Development Committee, the ANSI and ISO Standards Committees) as to who determines the "official" name of the language. Some writers have attempted to defuse the issue by referring to the language as M[UMPS], square brackets being the customary notation for optional syntax elements. A leading authority, and the author of an open source MUMPS implementation, Professor Kevin O'Kane, uses only 'MUMPS'.
The most recent standard (ISO/IEC 11756:1999, re-affirmed on 25 June 2010), still mentions both M and MUMPS as officially accepted names.
Massachusetts General Hospital registered "MUMPS" as a trademark with the USPTO on November 28, 1971, and renewed it on November 16, 1992 - but let it expire on August 30, 2003.
MUMPS invites comparison with the Pick operating system.Similarities include:
In computer science, an array data structure, or simply an array, is a data structure consisting of a collection of elements, each identified by at least one array index or key. An array is stored such that the position of each element can be computed from its index tuple by a mathematical formula. The simplest type of data structure is a linear array, also called one-dimensional array.
C is a general-purpose, procedural computer programming language supporting structured programming, lexical variable scope, and recursion, with a static type system. By design, C provides constructs that map efficiently to typical machine instructions. It has found lasting use in applications previously coded in assembly language. Such applications include operating systems and various application software for computers architectures that range from supercomputers to PLCs and embedded systems.
Digital Equipment Corporation, using the trademark Digital, was a major American company in the computer industry from the 1960s to the 1990s. The company was co-founded by Ken Olsen and Harlan Anderson in 1957. Olsen was president until forced to resign in 1992, after the company had gone into precipitous decline.
Lisp is a family of programming languages with a long history and a distinctive, fully parenthesized prefix notation. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today. Only Fortran is older, by one year. Lisp has changed since its early days, and many dialects have existed over its history. Today, the best-known general-purpose Lisp dialects are Racket, Common Lisp, Scheme and Clojure.
PL/I is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. It has been used by academic, commercial and industrial organizations since it was introduced in the 1960s, and is still used.
SQL is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful in handling structured data, i.e. data incorporating relations among entities and variables.
OpenVMS is a multi-user, multiprocessing virtual memory-based operating system (OS) designed for use in time-sharing, batch processing, and transaction processing. It was first released by Digital Equipment Corporation in 1977 as VAX/VMS for its series of VAX minicomputers. OpenVMS also runs on DEC Alpha systems, the Itanium-based HPE Integrity family of computers, and it can run natively on select x86-64 systems as of 2020. OpenVMS is a proprietary operating system, but source code listings are available for purchase.
The VA Kernel is a set of programs, developed by the Department of Veterans Affairs of the United States Government, which provide an operating system and MUMPS implementation independent abstraction to the VistA Hospital Information System. These programs are the only programs which are expected to not be written in ANSI Standard MUMPS.
IDL, short for Interactive Data Language, is a programming language used for data analysis. It is popular in particular areas of science, such as astronomy, atmospheric physics and medical imaging. IDL shares a common syntax with PV-Wave and originated from the same codebase, though the languages have subsequently diverged in detail. There are also free or costless implementations, such as GNU Data Language (GDL) and Fawlty Language (FL).
Dartmouth BASIC is the original version of the BASIC programming language. It was designed by two professors at Dartmouth College, John G. Kemény and Thomas E. Kurtz. With the underlying Dartmouth Time Sharing System (DTSS), it offered an interactive programming environment to all undergraduates as well as the larger university community.
DIGITAL Command Language (DCL) is the standard command language adopted by most of the operating systems (OSs) that were sold by the former Digital Equipment Corporation. DCL had its roots in the IAS, TOPS-20, and RT-11 OSs and was implemented as a standard across most of Digital's OSs, notably RSX-11, but took its most powerful form in the OpenVMS OS.
InterSystems Caché is a commercial operational database management system from InterSystems, used to develop software applications for healthcare management, banking and financial services, government, and other sectors. Customer software can use the database with object and SQL code. Caché also allows developers to directly manipulate its underlying data structures: hierarchical arrays known as M technology.
GT.M is a high-throughput key-value database engine optimized for transaction processing. GT.M is also an application development platform and a compiler for the ISO standard M language, also known as MUMPS.
Caché ObjectScript is a part of the Caché database system sold by InterSystems. The language is a functional superset of the ANSI-standard MUMPS programming language. Since Caché is at its core a MUMPS implementation, it can run ANSI MUMPS routines with no change. To appeal as a commercial product, Caché implements support for object-oriented programming, a macro preprocessing language, embedded SQL for ANSI-standard SQL access to M's built-in database, procedure and control blocks using C-like brace syntax, procedure-scoped variables, and relaxed whitespace syntax limitations.
MIIS is a MUMPS-like programming language that was created by A.Neil Pappalardo and Curt W. Marble, on a DEC PDP at Mass General Hospital from 1964 to 1968. MUMPS evolution took two major directions: MUMPS proper and MIIS. MUMPS became an ANSI and ISO-standard language. When many MUMPS implementations standardized to be compatible, MIIS did not standardize, but became a proprietary system instead.
MUMPS syntax allows multiple commands to appear on a line, grouped into procedures (subroutines) in a fashion similar to most structured programming systems. Storing variables in the database is designed to be simple, requiring no libraries and using the same commands and operators used for working with variables in RAM as with data in persistent storage.
InterSystems Corporation is a privately held vendor of software systems and technology for high-performance database management, rapid application development, integration, and healthcare information systems. The vendor's products include InterSystems IRIS Data Platform, Caché Database Management System, the InterSystems Ensemble integration platform, the HealthShare healthcare informatics platform and TrakCare healthcare information system, which is sold outside the United States.
Rexx is an interpreted programming language developed at IBM by Mike Cowlishaw. It is a structured, high-level programming language designed for ease of learning and reading. Proprietary and open source Rexx interpreters exist for a wide range of computing platforms; compilers exist for IBM mainframe computers.
RTL/2 is a discontinued high-level programming language developed at Imperial Chemical Industries Ltd by J.G.P. Barnes. It was originally used internally within ICI but was distributed by SPL International in 1974 It was designed for use in real-time computing. Based on concepts from Algol 68, it was intended to be a small, simple language. RTL/2 was standardised in 1980 by the British Standards Institution.
Description ... The date a product was classified as ACTIVE MATURE LIMITED ... MUMPS Oct-80 Dec-94 Dec-94
This article's use of external links may not follow Wikipedia's policies or guidelines. (August 2013) (Learn how and when to remove this template message)