Intentional programming

In computer programming, Intentional Programming is a programming paradigm developed by Charles Simonyi that encodes in software source code the precise intention which programmers (or users) have in mind when conceiving their work. Working at the level of abstraction at which the programmer is thinking makes computer programs easier to create and maintain. Separating the intentions from the way they are operated upon makes the software more modular and its code more reusable.

Intentional Programming was developed by former Microsoft chief architect Charles Simonyi, who led a team in Microsoft Research that developed the paradigm and built an integrated development environment (IDE) called IP (for Intentional Programming) to demonstrate it. Microsoft decided not to productize the Intentional Programming paradigm, as in the early 2000s it was rolling out C# and .NET to counter Java adoption. [1] With Microsoft's approval, Simonyi took his idea out of Microsoft to commercialize it himself, and founded the company Intentional Software to pursue this. Microsoft licensed to Intentional Software the Intentional Programming patents Simonyi had acquired while at Microsoft, but no source code.

An overview of Intentional Programming as it was developed at Microsoft Research is given in Chapter 11 of the book Generative Programming: Methods, Tools, and Applications. [2]

Development cycle

As envisioned by Simonyi, developing a new application via the Intentional Programming paradigm proceeds as follows. A programmer builds a WYSIWYG-like environment supporting the schema and notation of business knowledge for a given problem domain (such as productivity applications or life insurance). Users then use this environment to capture their intentions, which are recorded at a high level of abstraction. The environment can operate on these intentions and assist the user in creating semantically richer documents that can be processed and executed, similar to a spreadsheet. The recorded knowledge is executed by an evaluator or is compiled to generate the final program. Subsequent changes are made at the WYSIWYG level only. Compared with word processors, spreadsheets or presentation software, an Intentional environment has more support for the structure and semantics of the intentions to be expressed, and can create interactive documents that capture more richly what the user is trying to accomplish. A special case is when the content is program code, and the environment becomes an intelligent IDE. [3]

Separating source code storage and presentation

Key to the benefits of Intentional Programming is that the domain code which captures the intentions is not stored in source code text files, but in tree-based storage (which could be binary or XML). Tight integration of the environment with the storage format brings some of the nicer features of database normalization to source code. Redundancy is eliminated by giving each definition a unique identity and storing the names of variables and operators in exactly one place. This makes it easy to distinguish declarations from references intrinsically, and the environment can show them differently.
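
The normalization idea can be illustrated with a minimal Java sketch. It is not the storage format of the IP system, which the text does not specify; the class IdentityStore, its Node type, and the methods declare, refer and render are all hypothetical names chosen for illustration. Each definition receives an integer identity, its name is stored once in a table, and tree nodes carry only the identity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names are stored once, keyed by identity;
// tree nodes hold only the identity of the definition they concern.
class IdentityStore {
    // The single place where each definition's name is stored.
    static final Map<Integer, String> names = new HashMap<>();
    static int nextId = 1;

    // A node is either a declaration or a reference. Neither stores a name.
    static class Node {
        final boolean isDeclaration;
        final int definitionId;

        Node(boolean isDeclaration, int definitionId) {
            this.isDeclaration = isDeclaration;
            this.definitionId = definitionId;
        }
    }

    // Create a new definition with a fresh identity.
    static Node declare(String name) {
        int id = nextId++;
        names.put(id, name);
        return new Node(true, id);
    }

    // A reference is a link to an existing definition.
    static Node refer(Node declaration) {
        return new Node(false, declaration.definitionId);
    }

    // The environment can display declarations and references differently.
    static String render(Node n) {
        String name = names.get(n.definitionId);
        return n.isDeclaration ? "[decl " + name + "]" : name;
    }

    public static void main(String[] args) {
        Node total = declare("total");
        Node use = refer(total);
        System.out.println(render(total) + " " + render(use)); // [decl total] total
    }
}
```

Because a reference holds an identity rather than text, telling a declaration from a reference needs no parsing or name lookup rules.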

Whitespace in a program is also not stored as part of the source code, and each programmer working on a project can choose an indentation display of the source. More radical visualizations include showing statement lists as nested boxes, editing conditional expressions as logic gates, or re-rendering names in Chinese.
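
As a rough illustration of layout being a property of the view rather than of the stored program, the following sketch renders the same tree with two different indentation settings. The class LayoutFreeTree and its Block type are assumptions made for this example, not part of any IP tool.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the stored tree contains no whitespace;
// indentation is chosen when the tree is displayed.
class LayoutFreeTree {
    static class Block {
        final String header;     // e.g. "while (running)" or a simple statement
        final List<Block> body;  // empty for a simple statement

        Block(String header, Block... body) {
            this.header = header;
            this.body = Arrays.asList(body);
        }
    }

    // Render the tree using the viewer's preferred indentation unit.
    static String render(Block b, String indentUnit, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append(indentUnit);
        if (b.body.isEmpty()) {
            sb.append(b.header).append(";\n");
        } else {
            sb.append(b.header).append(" {\n");
            for (Block child : b.body) sb.append(render(child, indentUnit, depth + 1));
            for (int i = 0; i < depth; i++) sb.append(indentUnit);
            sb.append("}\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Block loop = new Block("while (running)", new Block("step()"));
        System.out.print(render(loop, "  ", 0));   // one programmer prefers two spaces
        System.out.print(render(loop, "\t", 0));   // another prefers tabs
    }
}
```

The same approach extends, in principle, to the more radical projections mentioned above, since each is just another rendering function over the same tree.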

The system uses a normalized language for popular languages like C++ and Java, while letting users of the environment mix and match these with ideas from Eiffel and other languages. Often mentioned in the same context as language-oriented programming via domain-specific languages, and aspect-oriented programming, IP purports to provide some breakthroughs in generative programming. These techniques allow developers to extend the language environment to capture domain-specific constructs without investing in writing a full compiler and editor for any new languages.

Programming example

A Java program that writes out the numbers from 1 to 10, using a curly bracket syntax, might look like this:

for (int i = 1; i <= 10; i++) {
    System.out.println("the number is " + i);
}

The code above contains a common construct of most programming languages, the bounded loop, in this case represented by the for construct. The code, when compiled and run, will loop 10 times, printing the value of i and then incrementing it on each pass.

But this code does not capture the intentions of the programmer, namely to "print the numbers 1 to 10". In this simple case, a programmer asked to maintain the code could likely figure out what it is intended to do, but it is not always so easy. Loops that extend across many lines, or pages, can become very difficult to understand, notably if the original programmer uses unclear labels. Traditionally the only way to indicate the intention of the code was to add source code comments, but often comments are not added, or are unclear, or drift out of sync with the source code they originally described.

In intentional programming systems the above loop could be represented, at some level, as something as obvious as "print the numbers 1 to 10". The system would then use the intentions to generate source code, likely something very similar to the code above. The key difference is that the intentional programming systems maintain the semantic level, which the source code lacks, and which can dramatically ease readability in larger programs.
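
One way to picture this, purely as a sketch, is an object that records the intention and can emit conventional source code from it. The class PrintRangeIntention and its methods describe and generateJava are invented for this example and do not correspond to any published IP interface.

```java
// Hypothetical sketch: the intention is the stored artifact;
// Java source is generated from it on demand.
class PrintRangeIntention {
    final int from;
    final int to;

    PrintRangeIntention(int from, int to) {
        this.from = from;
        this.to = to;
    }

    // The semantic level that the system keeps.
    String describe() {
        return "print the numbers " + from + " to " + to;
    }

    // One possible lowering of the intention to Java source text.
    String generateJava() {
        return "for (int i = " + from + "; i <= " + to + "; i++) {\n"
             + "    System.out.println(\"the number is \" + i);\n"
             + "}\n";
    }

    public static void main(String[] args) {
        PrintRangeIntention intention = new PrintRangeIntention(1, 10);
        System.out.println(intention.describe());
        System.out.print(intention.generateJava());
    }
}
```

Here the loop is derived data: changing the range means editing the intention, after which the generated code is simply produced again.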

Although most languages contain mechanisms for capturing certain kinds of abstraction, IP, like the Lisp family of languages, allows for the addition of entirely new mechanisms. Thus, if a developer started with a language like C, they would be able to extend the language with features such as those in C++ without waiting for the compiler developers to add them. By analogy, many more powerful expression mechanisms could be used by programmers than mere classes and procedures.

Identity

IP focuses on the concept of identity. Since most programming languages represent the source code as plain text, objects are defined by names, and their uniqueness has to be inferred by the compiler. For example, the same symbolic name may be used to name different variables, procedures, or even types. In code that spans several pages (or, for globally visible names, multiple files), it can become very difficult to tell what symbol refers to what actual object. If a name is changed, the code where it is used must be examined carefully.

By contrast, in an IP system, all definitions not only assign symbolic names, but also unique private identifiers to objects. This means that in the IP development environment, every reference to a variable or procedure is not just a name; it is a link to the original entity.

The major advantage of this is that if an entity is renamed, all of the references to it in the program remain valid (known as referential integrity). This also means that if the same name is used for distinct definitions in different namespaces (such as ".to_string()"), references with the same name but a different identity will not be renamed, as sometimes happens with search/replace in current editors. This feature also makes it easy to have multi-language versions of the program; it can have a set of English-language names for all the definitions as well as a set of Japanese-language names which can be swapped in at will.
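
The renaming behaviour described above can be sketched as follows. The class RenameByIdentity, the locale keys "en" and "ja", and the numeric identities are assumptions made for illustration only. Call sites store the identity of their target, so renaming one definition cannot touch another that happens to share its name, and a second name table provides a localized view.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: call sites store the target's identity, not its name.
class RenameByIdentity {
    // locale -> (definition id -> display name)
    static final Map<String, Map<Integer, String>> namesByLocale = new HashMap<>();

    // Set or change the display name of one definition in one locale.
    static void setName(String locale, int id, String name) {
        namesByLocale.computeIfAbsent(locale, k -> new HashMap<>()).put(id, name);
    }

    // Render a call site by looking up the current name of its target.
    static String renderCall(String locale, int targetId) {
        return namesByLocale.get(locale).get(targetId) + "()";
    }

    public static void main(String[] args) {
        // Two distinct definitions that happen to share a name.
        setName("en", 1, "to_string");
        setName("en", 2, "to_string");

        // Rename only definition 1; every reference to it follows automatically.
        setName("en", 1, "format");
        System.out.println(renderCall("en", 1)); // format()
        System.out.println(renderCall("en", 2)); // to_string()

        // A Japanese name set for the same definition (the characters mean "display").
        setName("ja", 1, "\u8868\u793a");
        System.out.println(renderCall("ja", 1));
    }
}
```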

Having a unique identity for every defined object in the program also makes it easy to perform automated refactoring tasks, as well as simplifying code check-ins in versioning systems. For example, in many current code collaboration systems (e.g. Git), when two programmers commit changes that conflict (e.g. if one programmer renames a function while another changes one of the lines in that function), the versioning system will think that one programmer created a new function while another modified an old function. An IP versioning system, by contrast, knows that one programmer merely changed a name while another changed the code.
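
A minimal sketch of such an identity-aware merge is shown below. It is an assumption about how the idea could work, not a description of an actual IP versioning system; the class IdentityMerge and its Function type are hypothetical. Each function version carries the same identity, and the name and body are merged as independent fields.

```java
// Hypothetical sketch: a three-way merge keyed by definition identity,
// so a rename and a body edit to the same function do not conflict.
class IdentityMerge {
    static class Function {
        final int id;
        final String name;
        final String body;

        Function(int id, String name, String body) {
            this.id = id;
            this.name = name;
            this.body = body;
        }
    }

    // Three-way merge of one field: keep whichever side changed it.
    static String mergeField(String base, String ours, String theirs) {
        if (ours.equals(theirs)) return ours;
        if (ours.equals(base)) return theirs;
        if (theirs.equals(base)) return ours;
        throw new IllegalStateException("conflict: both sides changed the same field");
    }

    // The same id on all three versions means "the same function", whatever it is called.
    static Function merge(Function base, Function ours, Function theirs) {
        if (base.id != ours.id || base.id != theirs.id) {
            throw new IllegalArgumentException("versions refer to different definitions");
        }
        return new Function(base.id,
                mergeField(base.name, ours.name, theirs.name),
                mergeField(base.body, ours.body, theirs.body));
    }

    public static void main(String[] args) {
        Function base = new Function(7, "calc", "return a + b;");
        Function renamed = new Function(7, "computeTotal", "return a + b;");
        Function edited = new Function(7, "calc", "return a + b + tax;");
        Function merged = merge(base, renamed, edited);
        System.out.println(merged.name + ": " + merged.body); // computeTotal: return a + b + tax;
    }
}
```

A real conflict still arises when both programmers change the same field of the same definition, which the sketch reports by throwing an exception.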

Levels of detail

IP systems also offer several levels of detail, allowing the programmer to "zoom in" or out. In the example above, the programmer could zoom out to get a level that would say something like:

<<print the numbers 1 to 10>>

Thus IP systems are self-documenting to a large degree, allowing the programmer to keep a good high-level picture of the program as a whole.
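
Zooming can be pictured as rendering a tree of summaries to a chosen depth. The following sketch assumes a hypothetical ZoomView class in which every node carries a summary of its subtree; how a real IP system derives or stores such summaries is not stated in the text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: each node has a summary; the viewer chooses
// how many levels of detail below the top node are expanded.
class ZoomView {
    static class Node {
        final String summary;      // high-level description of this subtree
        final List<Node> details;  // finer-grained steps; empty for a leaf

        Node(String summary, Node... details) {
            this.summary = summary;
            this.details = Arrays.asList(details);
        }
    }

    // Show this node and up to `levels` levels beneath it; deeper detail stays collapsed.
    static String render(Node n, int levels, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(n.summary).append("\n");
        if (levels > 0) {
            for (Node child : n.details) sb.append(render(child, levels - 1, depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node program = new Node("print the numbers 1 to 10",
                new Node("for i from 1 to 10",
                        new Node("print i")));
        System.out.print(render(program, 0, 0)); // zoomed out: the summary line only
        System.out.print(render(program, 2, 0)); // zoomed in: all three levels
    }
}
```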

Similar works

There are projects that exploit similar ideas to create code at a higher level of abstraction.

References

  1. "Simonyi explains, 'It was impractical, when Microsoft was making tremendous strides with .Net in the near term, to somehow send somebody out from the same organization who says, "This is not how you should do things--what if you did things in this other, more disruptive way?"'" (Quote from "Anything You Can Do, I Can Do Meta", Tuesday, January 9, 2007, Scott Rosenberg, Technology Review. Archived 20 September 2020 at archive.today.)
  2. Generative Programming: Methods, Tools, and Applications, by Krzysztof Czarnecki and Ulrich Eisenecker, Addison-Wesley, Reading, MA, USA, June 2000.
  3. Scott Rosenberg: "Anything You Can Do, I Can Do Meta." Technology Review, January 8, 2007. Archived 20 September 2020 at archive.today.