Jackson structured programming

Last updated
Example of a JSP diagram. JSP RLE output1.png
Example of a JSP diagram.

Jackson structured programming (JSP) is a method for structured programming developed by British software consultant Michael A. Jackson and described in his 1975 book Principles of Program Design. [1] The technique of JSP is to analyze the data structures of the files that a program must read as input and produce as output, and then produce a program design based on those data structures, so that the program control structure handles those data structures in a natural and intuitive way.

Contents

JSP describes structures (of both data and programs) using three basic structures – sequence, iteration, and selection (or alternatives). These structures are diagrammed as (in effect) a visual representation of a regular expression.

Introduction

Michael A. Jackson originally developed JSP in the 1970s. He documented the system in his 1975 book Principles of Program Design. [1] In a 2001 conference talk, [2] he provided a retrospective analysis of the original driving forces behind the method, and related it to subsequent software engineering developments. Jackson's aim was to make COBOL batch file processing programs easier to modify and maintain, but the method can be used to design programs for any programming language that has structured control constructs sequence, iteration, and selection ("if/then/else").

Jackson Structured Programming was similar to Warnier/Orr structured programming [3] [4] although JSP considered both input and output data structures while the Warnier/Orr method focused almost exclusively on the structure of the output stream.

Motivation for the method

At the time that JSP was developed, most programs were batch COBOL programs that processed sequential files stored on tape. A typical program read through its input file as a sequence of records, so that all programs had the same structure a single main loop that processed all of the records in the file, one at a time. Jackson asserted that this program structure was almost always wrong, and encouraged programmers to look for more complex data structures. In Chapter 3 of Principles of Program Design [1] Jackson presents two versions of a program, one designed using JSP, the other using the traditional single-loop structure. Here is his example, translated from COBOL into Java. The purpose of these two programs is to recognize groups of repeated records (lines) in a sorted file, and to produce an output file listing each record and the number of times that it occurs in the file.

Here is the traditional, single-loop version of the program.

Stringline;intcount=0;StringfirstLineOfGroup=null;// begin single main loopwhile((line=in.readLine())!=null){if(firstLineOfGroup==null||!line.equals(firstLineOfGroup)){if(firstLineOfGroup!=null){System.out.println(firstLineOfGroup+" "+count);}count=0;firstLineOfGroup=line;}count++;}if(firstLineOfGroup!=null){System.out.println(firstLineOfGroup+" "+count);}

Here is a JSP-style version of the same program. Note that (unlike the traditional program) it has two loops, one nested inside the other. The outer loop processes groups of repeating records, while the inner loop processes the individual records in a group.

Stringline;intnumberOfLinesInGroup;line=in.readLine();// begin outer loop: process 1 groupwhile(line!=null){numberOfLinesInGroup=0;StringfirstLineOfGroup=line;// begin inner loop: process 1 record in the groupwhile(line!=null&&line.equals(firstLineOfGroup)){numberOfLinesInGroup++;line=in.readLine();}System.out.println(firstLineOfGroup+" "+numberOfLinesInGroup);}

Jackson criticises the traditional single-loop version for failing to process the structure of the input file (repeating groups of records containing repeating individual records) in a natural way. One sign of its unnatural design is that, in order to work properly, it is forced to include special code for handling the first and last record of the file.

The basic method

JSP uses semi-formal steps to capture the existing structure of a program's inputs and outputs in the structure of the program itself.

The intent is to create programs which are easy to modify over their lifetime. Jackson's major insight was that requirement changes are usually minor tweaks to the existing structures. For a program constructed using JSP, the inputs, the outputs, and the internal structures of the program all match, so small changes to the inputs and outputs should translate into small changes to the program.

JSP structures programs in terms of four component types:

The method begins by describing a program's inputs in terms of the four fundamental component types. It then goes on to describe the program's outputs in the same way. Each input and output is modelled as a separate Data Structure Diagram (DSD). To make JSP work for compute-intensive applications, such as digital signal processing (DSP) it is also necessary to draw algorithm structure diagrams, which focus on internal data structures rather than input and output ones.

The input and output structures are then unified or merged into a final program structure, known as a Program Structure Diagram (PSD). This step may involve the addition of a small amount of high level control structure to marry up the inputs and outputs. Some programs process all the input before doing any output, whilst others read in one record, write one record and iterate. Such approaches have to be captured in the PSD.

The PSD, which is language neutral, is then implemented in a programming language. JSP is geared towards programming at the level of control structures, so the implemented designs use just primitive operations, sequences, iterations and selections. JSP is not used to structure programs at the level of classes and objects, although it can helpfully structure control flow within a class's methods.

JSP uses a diagramming notation to describe the structure of inputs, outputs and programs, with diagram elements for each of the fundamental component types.

A simple operation is drawn as a box.

Element Jackson.png
An operation

A sequence of operations is represented by boxes connected with lines. In the example below, A is a sequence consisting of operations B, C and D.

JSP Sequence.png
A sequence

An iteration is again represented with joined boxes. In addition the iterated operation has a star in the top right corner of its box. In the example below, A is an iteration of zero or more invocations of operation B.

JSP Iteration.png
An iteration

Selection is similar to a sequence, but with a circle drawn in the top right hand corner of each optional operation. In the example, A is a selection of one and only one of operations B, C or D.

JSP Selection.png
A selection

Note that it in the above diagrams, it is element A that is the sequence or iteration, not the elements B, C or D (which in the above diagrams are all elementary). Jackson gives the 'Look-down rule' to determine what an element is, i.e. look at the elements below an element to find out what it is.

A worked example

As an example, here is how a JSP programmer would design and code a run length encoder. A run length encoder is a program whose input is a stream of bytes which can be viewed as occurring in runs, where a run consists of one or more occurrences of bytes of the same value. The output of the program is a stream of byte pairs, where each byte pair is a compressed description of a run. In each pair, the first byte is the value of the repeated byte in a run and the second byte is a number indicating the number of times that that value was repeated in the run. For example, a run of eight occurrences of the letter "A" in the input stream ("AAAAAAAA") would produce "A8" as a byte pair in the output stream. Run length encoders are often used for crudely compressing bitmaps.

With JSP, the first step is to describe the data structure(s) of a program's input stream(s). The program has only one input stream, consisting of zero or more runs of the same byte value. Here is the JSP data structure diagram for the input stream.

JSP RLE input.png

The second step is to describe the output data structure, which in this case consists of zero or more iterations of byte pairs.

JSP RLE output1.png

The next step is to describe the correspondences between the components of the input and output structures.

JSP RLE correspondence.png

The next step is to use the correspondences between the two data structures to create a program structure that is capable of processing the input data structure and producing the output data structure. (Sometimes this isn't possible. See the discussion of structure clashes, below.)

JSP RLE program.png

Once the program structure is finished, the programmer creates a list of the computational operations that the program must perform, and the program structure diagram is fleshed out by hanging those operations off of the appropriate structural components.

  1. read a byte
  2. remember byte
  3. set counter to zero
  4. increment counter
  5. output remembered byte
  6. output counter

Also, at this stage conditions on iterations (loops) and selections (if-then-else or case statements) are listed and added to the program structure diagram.

  1. while there are more bytes
  2. while there are more bytes and this byte is the same as the run's first byte and the count will still fit in a byte

Once the diagram is finished, it can be translated into whatever programming language is being used. Here is a translation into C.

#include<stdio.h>#include<stdlib.h>intmain(intargc,char*argv[]){intc;intfirst_byte;intcount;c=getchar();/* get first byte */while(c!=EOF){/* process the first byte in the run */first_byte=c;count=1;c=getchar();/* get next byte *//* process the succeeding bytes in the run */while(c!=EOF&&c==first_byte&&count<255){/* process one byte of the same value */count++;c=getchar();/* get next byte */}putchar(first_byte);putchar(count);}returnEXIT_SUCCESS;}

Techniques for handling difficult design problems

In Principles of Program Design Jackson recognized situations that posed specific kinds of design problems, and provided techniques for handling them.

One of these situations is a case in which a program processes two input files, rather than one. In 1975, one of the standard "wicked problems" was how to design a transaction-processing program. In such a program, a sequential file of update records is run against a sequential master file, producing an updated master file as output. (For example, at night a bank would run a batch program that would update the balances in its customers' accounts based on records of the deposits and withdrawals that they had made that day.) Principles of Program Design provided a standard solution for that problem, along with an explanation of the logic behind the design.

Another kind of problem involved what Jackson called "recognition difficulties" and today we would call parsing problems. The basic JSP design technique was supplemented by POSIT and QUIT operations to allow the design of what we would now call a backtracking parser.

JSP also recognized three situations that are called "structure clashes" a boundary clash, an ordering clash, and an interleaving clash and provided techniques for dealing with them. In structure clash situations the input and output data structures are so incompatible that it is not possible to produce the output file from the input file. It is necessary, in effect, to write two programs the first processes the input stream, breaks it down into smaller chunks, and writes those chunks to an intermediate file. The second program reads the intermediate file and produces the desired output.

JSP and object-oriented design

JSP was developed long before object-oriented technologies became available. It and its successor method JSD do not treat what now would be called "objects" as collections of more or less independent methods. Instead, following the work of C. A. R. Hoare, JSP and JSD describe software objects as co-routines. [5] [6]

See also

Related Research Articles

Java Platform, Standard Edition is a computing platform for development and deployment of portable code for desktop and server environments. Java SE was formerly known as Java 2 Platform, Standard Edition (J2SE).

In computer programming, an infinite loop is a sequence of instructions that, as written, will continue endlessly, unless an external intervention occurs, such as turning off power via a switch or pulling a plug. It may be intentional.

Iteration is the repetition of a process in order to generate a sequence of outcomes. Each repetition of the process is a single iteration, and the outcome of each iteration is then the starting point of the next iteration.

In computer programming, an iterator is an object that enables a programmer to traverse a container, particularly lists. Various types of iterators are often provided via a container's interface. Though the interface and semantics of a given iterator are fixed, iterators are often implemented in terms of the structures underlying a container implementation and are often tightly coupled to the container to enable the operational semantics of the iterator. An iterator performs traversal and also gives access to data elements in a container, but does not itself perform iteration.

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>. The functionality descends from a "portable I/O package" written by Mike Lesk at Bell Labs in the early 1970s, and officially became part of the Unix operating system in Version 7.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

Jackson System Development (JSD) is a linear software development methodology developed by Michael A. Jackson and John Cameron in the 1980s.

Loop unrolling, also known as loop unwinding, is a loop transformation technique that attempts to optimize a program's execution speed at the expense of its binary size, which is an approach known as space–time tradeoff. The transformation can be undertaken manually by the programmer or by an optimizing compiler. On modern processors, loop unrolling is often counterproductive, as the increased code size can cause more cache misses; cf. Duff's device.

<span class="mw-page-title-main">Java syntax</span> Set of rules defining correctly structured program

The syntax of Java is the set of rules defining how a Java program is written and interpreted.

<span class="mw-page-title-main">Pipeline (Unix)</span> Mechanism for inter-process communication using message passing

In Unix-like computer operating systems, a pipeline is a mechanism for inter-process communication using message passing. A pipeline is a set of processes chained together by their standard streams, so that the output text of each process (stdout) is passed directly as input (stdin) to the next one. The second process is started as the first process is still executing, and they are executed concurrently. The concept of pipelines was championed by Douglas McIlroy at Unix's ancestral home of Bell Labs, during the development of Unix, shaping its toolbox philosophy. It is named by analogy to a physical pipeline. A key feature of these pipelines is their "hiding of internals". This in turn allows for more clarity and simplicity in the system.

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

In computer programming, flow-based programming (FBP) is a programming paradigm that defines applications as networks of black box processes, which exchange data across predefined connections by message passing, where the connections are specified externally to the processes. These black box processes can be reconnected endlessly to form different applications without having to be changed internally. FBP is thus naturally component-oriented.

A Warnier/Orr diagram is a kind of hierarchical flowchart that allows the description of the organization of data and procedures. They were initially developed 1976, in France by Jean-Dominique Warnier and in the United States by Kenneth Orr on the foundation of Boolean algebra. This method aids the design of program structures by identifying the output and processing results and then working backwards to determine the steps and combinations of input needed to produce them. The simple graphic method used in Warnier/Orr diagrams makes the levels in the system evident and the movement of the data between them vivid.

<span class="mw-page-title-main">Structured analysis</span>

In software engineering, structured analysis (SA) and structured design (SD) are methods for analyzing business requirements and developing specifications for converting practices into computer programs, hardware configurations, and related manual procedures.

PROMELA is a verification modeling language introduced by Gerard J. Holzmann. The language allows for the dynamic creation of concurrent processes to model, for example, distributed systems. In PROMELA models, communication via message channels can be defined to be synchronous, or asynchronous. PROMELA models can be analyzed with the SPIN model checker, to verify that the modeled system produces the desired behavior. An implementation verified with Isabelle/HOL is also available, as part of the Computer Aided Verification of Automata (CAVA) project. Files written in Promela traditionally have a .pml file extension.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

PL/SQL is Oracle Corporation's procedural extension for SQL and the Oracle relational database. PL/SQL is available in Oracle Database, Times Ten in-memory database, and IBM Db2. Oracle Corporation usually extends PL/SQL functionality with each successive release of the Oracle Database.

In software engineering, the module pattern is a design pattern used to implement the concept of software modules, defined by modular programming, in a programming language with incomplete direct support for the concept.

Although C++ is one of the most widespread programming languages, many prominent software engineers criticize C++ for being overly complex and fundamentally flawed. Among the critics have been: Robert Pike, Joshua Bloch, Linus Torvalds, Donald Knuth, Richard Stallman, and Ken Thompson. C++ has been widely adopted and implemented as a systems language through most of its existence. It has been used to build many pieces of very important software.

References

  1. 1 2 3 Jackson, MA (1975), Principles of Program Design, Academic.
  2. Jackson, MA (2001), JSP in Perspective (PDF), sd&m Pioneers’ Conference, Bonn, June 2001, retrieved 2017-01-26{{citation}}: CS1 maint: location (link) CS1 maint: location missing publisher (link)
  3. Warnier, JD (1974), Logical Construction of Programs, NY: Van Nostrand Reinhold
  4. Orr, KT (1980), "Structured programming in the 1980s", Proceedings of the ACM 1980 Annual Conference, New York, NY: ACM Press, pp. 323–26, doi:10.1145/800176.809987, ISBN   978-0897910286, S2CID   26834496
  5. Wieringa, R (Dec 1998), "A survey of structured and object-oriented software specification methods and techniques", Comput Surv, 30 (4): 459–527, CiteSeerX   10.1.1.107.5410 , doi:10.1145/299917.299919, S2CID   14967319 .
  6. Henderson-Sellers, Brian; Edwards, JM (Sep 1990), "The object-oriented systems life cycle", Communications of the ACM, 33 (9): 142–59, doi: 10.1145/83880.84529 , S2CID   14680399 .