Byte Sieve

Last updated

The Byte Sieve is a computer-based implementation of the Sieve of Eratosthenes published by Byte as a programming language performance benchmark. It first appeared in the September 1981 edition of the magazine and was revisited on occasion. Although intended to compare the performance of different languages on the same computers, it quickly became a widely used machine benchmark.

Contents

The Sieve was one of the more popular benchmarks of the home computer era, another being the Creative Computing Benchmark of 1983, and the Rugg/Feldman benchmarks, mostly seen in the UK in this era. Byte later published the more thorough NBench in 1995 to replace it.

History

Origins

Jim Gilbreath of the Naval Ocean System Center had been considering the concept of writing a small language benchmarking program for some time, desiring one that would be portable across languages, small enough that the program code would fit on a single printed page, and did not rely on specific features like hardware multiplication or division. The solution was inspired by a meeting with Chuck Forsberg at the January 1980 USENIX meeting in Boulder, CO, where Forsberg mentioned an implementation of the sieve written by Donald Knuth. [1] [2]

Gilbreath felt the sieve would be an ideal benchmark as it avoided indirect tests on arithmetic performance, which varied widely between systems. The algorithm mostly stresses array lookup performance and basic logic and branching capabilities. Nor does it require any advanced language features like recursion or advanced collection types. The only modification from Knuth’s original version was to remove a multiplication by two and replace it with an addition instead. Machines with hardware multipliers would otherwise run so much faster that the rest of the performance would be hidden. [1]

After six months of effort porting it to as many platforms as he had access to, the first results were introduced in the September 1981 edition of Byte in an article entitled "A High-Level Language Benchmark". [1] Gilbreath was quick to point out that:

I should emphasize that this benchmark is not the only criterion by which to judge a language or compiler. [1]

The article provided reference implementations in ten languages, including more popular selections like BASIC, C, Pascal, COBOL, and FORTRAN, and some less well-known examples like Forth, ZSPL, Ratfor, PL/1 and PLMX. [3]

Example runs were provided for a variety of machines, mostly Zilog Z80 or MOS 6502-based. The best time was initially 16.5 seconds, turned in by Ratfor on a 4 MHz Z80 machine, but Gary Kildall personally provided a version in Digital Research's prototype version of PL/1 [4] that ran in 14 seconds and set the mark for this first collection. The slowest was Microsoft COBOL on the same machine, which took a whopping 5115 seconds (almost one and a half hours), longer even than interpreted languages like BASIC. [5] A notable feature of this first run was that C, Pascal and PL/1 all turned in a roughly similar performance that easily beat the various interpreters. [4]

A second set of tests was carried out on more powerful machines, with Motorola 68000 assembly language turning in the fastest times at 1.12 seconds, slightly besting C on a PDP-11/70 and almost twice as fast as 8086 assembler. Most PDP-11 and HP-3000 times were much slower, on the order of 10 to 50 seconds. [6] Tests on these machines using only high-level languages was led by NBS Pascal on the PDP-11, at 2.6 seconds. [7]

UCSD Pascal provided another interesting set of results as the same program can be run on multiple machines. Running on the dedicated Ithaca Intersystems Pascal-100 machine, a Pascal MicroEngine based computer, it ran in 54 seconds, while on the Z80 it was 239, and 516 on the Apple II. [7]

Spread

Gilbreath, this time along with his brother Gary, revisited the code in the January 1983 edition of Byte. This version removed most of the less popular languages, leaving Pascal, C, FORTRAN IV, and COBOL, while adding Ada and Modula-2. Thanks to readers providing additional samples, the number of machines, operating systems and languages compared in the resulting tables was greatly expanded. [8]

Motorola 68000 (68k) assembly remained the fastest, almost three times the speed of the Intel 8086 running at the same 8 MHz clock. Using high-level languages the two were closer in performance, with the 8086 generally better than half the speed of the 68k and often much closer. [9] A wider variety of minicomputers and mainframes was also included, with times that the 68k generally beat except for the very fastest machines like the IBM 3033 and high-end models of the VAX. Older machines like the Data General Nova, PDP-11 and HP-1000 were nowhere near as fast as the 68k. [8]

Gilbreath's second article appeared as the benchmark was becoming quite common as a way to compare the performance of various machines, let alone languages. In spite of his original warning not to do so, it soon began appearing in magazine advertisements as a way to compare performance against the competition, [10] [11] and as a general benchmark. [12]

Byte once again revisited the sieve later in August 1983 as part of a whole-magazine series of articles on the C language. In this case the use was more in keeping with the original intent, using a single source code and running it on a single machine to compare the performance of C compilers on the CP/M-86 operating system, [13] on CP/M-80, [14] and for the IBM PC. [15]

In spite of Gilbreath's concern in the original article, by this time the code had become almost universal for testing, and one of the articles remarked that "The Sieve of Eratosthenes is a mandatory benchmark". [13] It was included in the Byte UNIX Benchmark Suite introduced in August 1984. [16]

Today

New versions of the code continue to appear for new languages, [17] eg Rosetta Code and GitHub has many versions available. [18] It is often used as an example of functional programming in spite of the common version not actually using the sieve algorithm. [19]

Implementation

The provided implementation calculated odd primes only, so the 8191 element array actually represented primes less than 16385. As shown in a sidebar table, the 0th element represented 3, 1st element 5, 2nd element 7, and so on.

This is the original BASIC version of the code presented in 1981. [20] [lower-alpha 1] The dialect is not specified, but a number of details mean it does not run under early versions of Microsoft BASIC (4.x and earlier), among these the use of long variable names like SIZE and FLAGS. The lack of line numbers may suggest a minicomputer variety that reads source from a text file, but may have also been a printing error.

REM Eratosthenes Sieve Prime Number Program in BASIC 1SIZE=81902DIMFLAGS(8191)3PRINT"Only 1 iteration"5COUNT=06FORI=0TOSIZE7FLAGS(I)=18NEXTI9FORI=0TOSIZE10IFFLAGS(I)=0THEN1811PRIME=I+I+312K=I+PRIME13IFK>SIZETHEN1714FLAGS(K)=015K=K+PRIME16GOTO1317COUNT=COUNT+118NEXTI19PRINTCOUNT," PRIMES"

And in C, with some whitespace adjustments from the original: [21]

#define true 1#define false 0#define size 8190#define sizepl 8191charflags[sizepl];main(){inti,prime,k,count,iter;printf("10 iterations\n");for(iter=1;iter<=10;iter++){count=0;for(i=0;i<=size;i++)flags[i]=true;for(i=0;i<=size;i++){if(flags[i]){prime=i+i+3;k=i+prime;while(k<=size){flags[k]=false;k+=prime;}count=count+1;}}}printf("\n%d primes",count);}

Notes

  1. Note that the line number for the first line is missing in the original source listing.

Related Research Articles

The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit of memory in many computer architectures. To disambiguate arbitrarily sized bytes from the common 8-bit definition, network protocol documents such as The Internet Protocol refer to an 8-bit byte as an octet. Those bits in an octet are usually counted with numbering from 0 to 7 or 7 to 0 depending on the bit endianness. The first bit is number 0, making the eighth bit number 7.

Intel 8086 16-bit microprocessor

The 8086 is a 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it was released. The Intel 8088, released July 1, 1979, is a slightly modified chip with an external 8-bit data bus, and is notable as the processor used in the original IBM PC design.

Pascal (programming language) Programming language

Pascal is an imperative and procedural programming language, designed by Niklaus Wirth as a small, efficient language intended to encourage good programming practices using structured programming and data structuring. It is named in honour of the French mathematician, philosopher and physicist Blaise Pascal.

UCSD Pascal Programming language released in 1977

UCSD Pascal is a Pascal programming language system that runs on the UCSD p-System, a portable, highly machine-independent operating system. UCSD Pascal was first released in 1977. It was developed at the University of California, San Diego (UCSD).

Turbo Pascal is a software development system that includes a compiler and an integrated development environment (IDE) for the Pascal programming language running on CP/M, CP/M-86, and DOS. It was originally developed by Anders Hejlsberg at Borland, and was notable for its extremely fast compiling times. Turbo Pascal, and the later but similar Turbo C, made Borland a leader in PC-based development.

BBC BASIC Version of the BASIC programming language

BBC BASIC is a version of the BASIC programming language released in 1981 as the native programming language for the BBC Micro home/personal computer, providing a standardized language for a UK computer literacy project of the BBC. It was written mainly by Sophie Wilson.

Sieve of Eratosthenes Ancient algorithm for generating prime numbers

In mathematics, the sieve of Eratosthenes is an ancient algorithm for finding all prime numbers up to any given limit.

Atari BASIC

Atari BASIC is an interpreter for the BASIC programming language that shipped with the Atari 8-bit family of 6502-based home computers. Unlike most BASICs of the home computer era, Atari BASIC is not a derivative of Microsoft BASIC and differs in significant ways. It includes keywords for Atari-specific features and lacks support for string arrays, for example.

Interlisp is a programming environment built around a version of the programming language Lisp. Interlisp development began in 1966 at Bolt, Beranek and Newman in Cambridge, Massachusetts with Lisp implemented for the Digital Equipment Corporation (DEC) PDP-1 computer by Danny Bobrow and D. L. Murphy. In 1970, Alice K. Hartley implemented BBN LISP, which ran on PDP-10 machines running the operating system TENEX. In 1973, when Danny Bobrow, Warren Teitelman and Ronald Kaplan moved from BBN to the Xerox Palo Alto Research Center (PARC), it was renamed Interlisp. Interlisp became a popular Lisp development tool for artificial intelligence (AI) researchers at Stanford University and elsewhere in the community of the Defense Advanced Research Projects Agency (DARPA). Interlisp was notable for integrating interactive development tools into an integrated development environment (IDE), such as a debugger, an automatic correction tool for simple errors (via do what I mean software design, and analysis tools.

Zilog Z8000

The Z8000 is a 16-bit microprocessor introduced by Zilog in early 1979. The architecture was designed by Bernard Peuto while the logic and physical implementation was done by Masatoshi Shima, assisted by a small group of people. In contrast to most designs of the era, the Z8000 did not use microcode which allowed it to be implemented in only 17,500 transistors.

Intel iAPX 432

The iAPX 432 is a discontinued computer architecture introduced in 1981. It was Intel's first 32-bit processor design. The main processor of the architecture, the general data processor, is implemented as a set of two separate integrated circuits, due to technical limitations at the time. Although some early 8086, 80186 and 80286-based systems and manuals also used the iAPX prefix for marketing reasons, the iAPX 432 and the 8086 processor lines are completely separate designs with completely different instruction sets.

TI BASIC (TI 99/4A) Programming language for TI-99 home computers

TI BASIC is an ANSI-compliant BASIC programming language interpreter built into the 1979 Texas Instruments TI-99/4 home computer and its improved 1981 version, the TI-99/4A.

History of programming languages Aspect of history

The history of programming languages spans from documentation of early mechanical computers to modern tools for software development. Early programming languages were highly specialized, relying on mathematical notation and similarly obscure syntax. Throughout the 20th century, research in compiler theory led to the creation of high-level programming languages, which use a more accessible syntax to communicate instructions.

PROMAL

PROMAL is a structured programming language from Systems Management Associates for MS-DOS, Commodore 64, and Apple II. PROMAL features simple syntax, no line numbers, long variable names, functions and procedures with argument passing, real number type, arrays, strings, pointer, and a built-in I/O library. Like ABC and Python, indentation is part of the language syntax.

Action! (programming language) Atari 8-bit family programming language

Action! is a procedural programming language and integrated development environment written by Clinton Parker for the Atari 8-bit family. The language, which is similar to ALGOL, compiled to high-performance code for the MOS Technologies 6502 of the Atari computers. Action! was distributed on ROM cartridge by Optimized Systems Software starting in 1983. It was one of the company's first bank-switched "Super Cartridges", with a total of 16 kB of code.

S-algol is a computer programming language derivative of ALGOL 60 developed at the University of St Andrews in 1979 by Ron Morrison and Tony Davie. The language is a modification of ALGOL to contain orthogonal data types that Morrison created for his PhD thesis. Morrison would go on to become professor at the university and head of the department of computer science. The S-algol language was used for teaching at the university at an undergraduate level until 1999. It was also the language taught for several years in the 1980s at a local school in St. Andrews, Madras College. The computer science text Recursive Descent Compiling describes a recursive descent compiler for S-algol, implemented in S-algol.

In mathematics, the sieve of Sundaram is a variant of the sieve of Eratosthenes, a simple deterministic algorithm for finding all the prime numbers up to a specified integer. It was discovered by Indian student S. P. Sundaram in 1934.

The Creative Computing Benchmark, also called Ahl's Simple Benchmark, is a computer benchmark that was used to compare the performance of the BASIC programming language on various machines. It was first introduced in the November 1983 issue of Creative Computing magazine with the measures from a number of 8-bit computers that were popular at the time. Over a period of a few months, the list was greatly expanded to include practically every contemporary machine, topped by the Cray-1 supercomputer, which ran it in 0.01 seconds.

The Rugg/Feldman benchmarks are a series of seven short BASIC programming language programs that are used to test the performance of BASIC implementations on various microcomputers. They were published by Tom Rugg and Phil Feldman in the June 1977 issue of the US computer magazine, Kilobaud.

BASIC interpreter Interpreter that enables users to enter and run programs in the BASIC language

A BASIC interpreter is an interpreter that enables users to enter and run programs in the BASIC language and was, for the first part of the microcomputer era, the default application that computers would launch. Users were expected to use the BASIC interpreter to type in programs or to load programs from storage.

References

Citations

  1. 1 2 3 4 Gilbreath 1981, p. 180.
  2. Knuth 1969, pp. 416, 658.
  3. Gilbreath 1981, pp. 181–190.
  4. 1 2 Gilbreath 1981, pp. 194.
  5. Gilbreath 1981, pp. 195.
  6. Gilbreath 1981, pp. 193.
  7. 1 2 Gilbreath 1981, pp. 196.
  8. 1 2 Gilbreath & Gilbreath 1983, p. 294.
  9. Gilbreath & Gilbreath 1983, p. 292.
  10. "HS/FORTH (advertisement)" (PDF). PC Tech Journal. October 1985. p. 132.
  11. "FORTH is now very.fast (advertisement)" (PDF). FORTH Dimensions. November–December 1985. p. 2.
  12. Ciarcia, Steve (1979). Ciarcia's Circuit Cellar, Volume 6. p. 133. ISBN   9780070109681.
  13. 1 2 Houston, Jerry; Brodrick, Jim; Kent, Les (August 1983). "Comparing C Compilers for CP/M-86". Byte. pp. 82–106.
  14. Kern, Christopher (August 1983). "Five C Compilers for CP/M-80". Byte. pp. 110–130.
  15. Phraner, Ralph (August 1983). "Nine C Compilers for the IBM PC". Byte. pp. 134–168.
  16. Hinnant, David (August 1984). "Benchmarking UNIX Systems: UNIX performance on microcomputers and minicomputers". Byte. pp. 132–135, 400–409.
  17. "Swift Sieve of Eratosthenes". 27 July 2015.
  18. "Sieve of Eratosthenes" . Retrieved 2 May 2019.
  19. O'Neill, Melissa (January 2009). "The Genuine Sieve of Eratosthenes". Journal of Functional Programming. 19 (1): 95–106. doi:10.1017/S0956796808007004. S2CID   1309380.
  20. Gilbreath 1981, p. 188.
  21. Gilbreath 1981, p. 186.

Bibliography