Paradigm | Imperative, Declarative |
---|---|
Family | dBASE |
Designed by | Arden Scott |
Developer | Richard Moore |
First appeared | 1971 |
Implementation language | FORTRAN |
Platform | SDS 940 |
License | Proprietary |
Influenced | |
dBASE, JPLDIS, RECALL, many others |
RETRIEVE is a database management system (DBMS) offered on Tymshare's systems starting in August 1971. It was written in Tymshare's own SUPER FORTRAN on the SDS 940. It offered basic single-file, non-relational database functionality using an interactive programming language. It is one of the earliest examples of software as a service (SaaS). [1]
RETRIEVE was highly influential and spawned a number of relatively direct clones. Wang Laboratories's RECALL on the Wang 2200 minicomputer was almost identical to RETRIEVE, to the point the differences were detailed in a single page. JPL made a version known as JPLDIS for the UNIVAC 1108 in 1973 that was also very similar.
Wayne Ratliff, a contractor at JPL for many years, was inspired by JPLDIS to port it to the IMSAI 8080 to manage his football pool, later releasing it commercially as Vulcan for CP/M in 1979. Ashton-Tate licensed Vulcan and re-released it as dBASE II in 1980, which sparked the microcomputer database market. Most of RETRIEVE's original syntax remains unchanged in dBASE and the many xBASE clones that survive into the 21st century.
In 1969, Jim Ryan of Tymshare designed an expanded version of IBM System/360 H-level (258 kB) FORTRAN, adding strings and other features that took advantage of their SDS 940 systems. Ryan hired Richard Moore and Franck Bracher to develop it as SUPER FORTRAN, and it went live on their systems in 1970. [lower-alpha 1] Soon after SUPER FORTRAN was released, Arden Scott asked Moore to use SUPER FORTRAN to develop his vision of a database management system (DBMS). The first version was ready in a few weeks, and immediately proved very popular with Tymshare customers. [1] [lower-alpha 2]
During the 1970s, Tymshare began moving their systems from SDS to the PDP-10 platform, eventually running TOPS-10. This led to an effort to build an entirely new database engine for this platform, known as MAGNUM. MAGNUM was a complete relational database engine, and in many references it is claimed to be the first such system offered commercially when it went live in October 1975. [3] Although most Tymshare customers, and internal users, switched to MAGNUM, by this time RETRIEVE had been ported to a number of platforms and these versions remained very popular outside the company.
In 1970, the Jet Propulsion Laboratory (JPL) installed three UNIVAC 1108 machines. Fred Thompson had been using RETRIEVE to manage a database of mechanical calculators at JPL, and decided to bring the system in-house when the 1108's arrived. In 1971, he collaborated with JPL programmer Jack Hatfield to produce JPLDIS. Hatfield left JPL in 1974, and the project was assigned to another JPL programmer, Jeb Long, who added a number of features. [4]
In 1973 the Wang 2200 was released, a desktop minicomputer with cassette tape storage. [5] RETRIEVE was ported to this platform under the name RECALL. A report for the US Army detailed the differences in a single page and concluded "Differences between the two implementations are very minor." [6]
While working at JPL as a contractor, Wayne Ratliff entered the office football pool. He had no interest in the game, but felt he could win the pool by processing the post-game statistics found in newspapers. In order to do this, he started looking for a database system and, by chance, came across the documentation for JPLDIS. He used this as the basis for a port to PTDOS on his kit-built IMSAI 8080 microcomputer, and called the resulting system Vulcan (after Mr. Spock on Star Trek ). It was later ported to CP/M when that system became almost universal in the S-100 bus market. [7]
In 1980, George Tate saw an advertisement for Vulcan being sold for $49. [8] He arranged a licensing deal with Ratliff, renamed it to dBASE II to make it sound like the second version, and put it on the market for $695. The system was a success, and in 1981, IBM commissioned a port to the still-unreleased PC DOS. [9] This was a runaway success, one of the big three packages, along with Word Perfect and Lotus 1-2-3 that made up the equivalent of an office suite in the early DOS market. [10]
RETRIEVE was a non-relational database, that concept not being introduced until 1970. RETRIEVE databases contained a single table stored in a single file, which typically used a short-form of the database name as the filename. The table structure is defined when the database is created, allowing fields of character, integer or free-form numeric inputs. [11] A database could have a maximum of 98 fields, with a maximum of 72 characters per field, and the total for any one row was 768 characters, or 256 words. Field names had to begin with a letter and could include additional letters, digits, a period and the @ character, with a maximum of 31 characters in total. [12]
The system was interactive, using a command prompt within the system for user interaction. [13] For instance, to start a new database the user would type CREATE
at the command prompt, which would then respond by asking the user to type in the name for the database and then prompt for the field definitions. An empty line stops this process and sends it into data-entry mode, allowing rows to be entered. [14] Any step of this operation could be removed by providing the data in the original command, for instance, if one typed CREATE EMPLOYEES
instead of CREATE
, the system no longer asked for the file name. [15]
There were three database file formats, which could be specified during CREATE
, the normal character-format SYMBOLIC
, BINARY
which saved numbers in their 24-bit-based internal formats, and SCRAMBLED
which encrypted the file with a user-supplied password. [16]
Existing databases were loaded using LOAD
or BASE
(they were equivalent). [17] Data was saved as it was changed, but there was also a SAVE
command to write out data (or selections of it) to flat files. [18] QUIT
exited the system and returned to the underlying operating system. [17]
Once the database was defined and populated, or an existing one was loaded, data could be displayed using LIST
. Without any parameters, it printed out all the records in the order they were entered, their "RECNO", which was printed at the front of the line. One could also provide a field list to select and reorder the fields in the printout, like LIST EMP.NUM,NAME,SALARY
. The PRINT
statement worked almost identically, differing only in that it did not print out the RECNO at the start of the lines. [19] FAST
was similar to PRINT
, but suppressed the field headers as well. [18]
Individual records could be selected using the "record number addressing system", which was prepended to LIST
or PRINT
. This was a flexible format allowing both single RECNOs separated by commas or ranges with colons to be specified, for instance, 1,7:20,50 PRINT
would print out the records 1, everything from 7 to 20, and then 50. It also included the pseudo-record $
to represent the last record, so 100:$ LIST
printed out all records from 100 on. [20]
Records could also be selected by fields meeting certain criteria based on their contents, using the FOR
syntax. To list all the employees with a salary greater than $40,000, one would PRINT FOR SALARY>40000
. The same syntax could be used to select single records based on their contents, not the RECNO, for instance LIST FOR NAME="Bob Smith"
. [21] [lower-alpha 3] RETRIEVE supported all basic comparisons, =
, <
, >
, <=
, >=
and #
for not-equals, which was found in some contemporary BASICs. Unusually, RETRIEVE also included English expansions of traditional comparisons, so one could use either SALARY>40000
or SALARY GREATER THAN 40000
. Such expressions could also include basic math, including +
, -
, *
for multiplication, /
for division, and ↑
for exponentiation. [22] These could be further combined with boolean expressions using AND
, OR
and NOT
. [21]
Additionally, the SUM
, COUNT
and AVERAGE
worked similar to LIST
or PRINT
, including the same record selection concepts. As the name implies, these output a single value with their associated values. For instance, COUNT FOR NAME='Bob Smith'
would likely return 1, while AVERAGE SALARY FOR SALARY>40000
might return 42500. [23]
In addition to the LIST
/PRINT
/FAST
, a further output statement is REPORT
. This works in a similar fashion, but has a number of options to pretty-print the output. It can be invoked alone or with qualifiers as above, but when it is used it enters an interactive mode that asks a number of questions about where to send the output (with T being the terminal), whether it should be single or double spaced, include headers and totals, etc. [24]
Records could be deleted using the DELETE
statement, using the same record selectors or field expressions as above. New records were inserted using APPEND
. APPEND FIELDS
entered an interactive mode allowing the user to type in additional records field-by-field rather than entering comma-separated values a row at a time. [25] APPEND FROM filename
read data from a comma-delimited text file into the current database already in memory. [26] MERGE
was used to update existing records; it worked similar to APPEND
, loading into the current database from another file, but in this case included an additional qualifier ON
. For instance, MERGE ON NAME FROM ADDRESSES
would read data from the file ADDRESSES and look in that file for a column where the first entry was "NAME". It would then process the file row-by-row, looking for entries in the database with that NAME and then updating the rest of the fields with the data from that row in ADDRESS. [27]
RETRIEVE supported two interactive methods to update existing records, CHANGE
and REPLACE
. REPLACE
worked similarly to the modern SQL equivalent, UPDATE
, taking a selector expression of some sort, one or more fields, and the new values. For instance, one might REPLACE SALARY=SALARY*1.05 WHERE YEARS.EMP>5
. [28] While CHANGE
ultimately did the same thing, it did so using an interactive mode. One invoked CHANGE
without the new value, for instance, CHANGE FOR 'Bob Smith' IN NAME
. The system printed out that row and then allowed the user to edit the record. [29] If there is more than one unique value, for instance, CHANGE FOR YEARS.EMP>5
, each value was printed on a separate line and changes were sent to all matching records. [30] MODIFY
was essentially identical to CHANGE
but did not print out the existing values first. [31]
Sorting was not performed at retrieval time, but was implemented by modifying the database and re-writing its contents in sorted order. This was accomplished with SORT BY
, followed by up to twenty field names. The original unsorted data is written to a backup file. [32]
The RESULTS TO
modifier could be used with any of the data modification statements to redirect the results to a new database. For instance, APPEND FROM FEBSALES RESULTS TO CURSALES
would append the data in the file FEBSALES to the current database and then save the results to CURSALES without updating the already-opened database. Or one might SORT BY NAME RESULTS TO SORTEMP
. [33]
Utility commands included STRUCTURE
which printed out the database schema, and SIZE
which returned the number of records. [34]
Although RETRIEVE was often used interactively, the system also included the ability to save lists of commands to files and then play them back. [35] Command files could also include a number of other "helper" statements, including TYPE 'a string'
to output any string, HUSH
to suppress the command prompt (a period), TALK
to turn the prompt back on, and ECHO ON
and ECHO OFF
to stop the playback from appearing on the terminal. [36]
These command files were run using the DO filename
statement from the internal command line [37] or COMMAND filename
from outside RETRIEVE, in EXECUTIVE. [38] If the script intended to leave the user in RETRIEVE at the end, one could put a COMMAND T
at the end, "running" the Terminal, which specified what should happen next. Scripts could be strung together with COMMAND
to form more complex workflows. [35]
When run, the commands inside the files operated just as they would if the user typed them in. This means that if a statement is provided that would normally require additional user input, for instance, a CHANGE
with no parameters, the interactive mode would be invoked as normal. This allowed the command files to invoke user-based input and then perform additional instructions. One could, for instance, use REPLACE ALL WEEK.SAL=SALARY/52
to catch any recent changes in salary, call CHANGE HOURS
to have the system present each employee record and ask for their weekly hours, then REPLACE ALL PAY=HOURS*WEEK.SAL
to calculate the weekly paycheck for all users, and finally REPORT
it send it all to a printer. [39]
Although separated by almost a decade, and having moved across four platforms in the process, dBASE on DOS remained very similar to RETRIEVE. Ratliff stated there was a "sort of progression from Retrieve to JPLDIS to my program, which I called Vulcan." [40] John Walker, better known for AutoCAD, also used JPLDIS and stated flatly that "DBase II, the Ashton-Tate database system, was a copy, a reimplementation of a package developed at the Jet Propulsion Laboratory called JPLDIS." [41]
LOAD
/BASE
became USE
and periods in field names were replaced by colons, but most other commands and features remained unchanged other than to support differences in the underlying platforms, like numeric formats. For instance, the original dBASE User Manual uses this example: [42]
use people list
Which is identical to the instructions in RETRIEVE:
LOAD people LIST
The overall operation of the statements is largely identical between the two systems. dBASE's primary differences are related to the programmability; dBASE added variables, could LIST
columns made of formulas like LIST SALARY*1.05
, and added a much wider variety of functions to manipulate data, including functions for returning the length of a string or the data type of a given field. [43]
JOSS was one of the first interactive, time-sharing programming languages. It pioneered many features that would become common in languages from the 1960s into the 1980s, including use of line numbers as both editing instructions and targets for branches, statements predicated by boolean decisions, and a built-in source-code editor that can perform instructions in direct or immediate mode, what they termed a conversational user interface.
dBase was one of the first database management systems for microcomputers and the most successful in its day. The dBase system included the core database engine, a query system, a forms engine, and a programming language that tied all of these components together.
In computer programming, standard streams are preconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin), standard output (stdout) and standard error (stderr). Originally I/O happened via a physically connected system console, but standard streams abstract this. When a command is executed via an interactive shell, the streams are typically connected to the text terminal on which the shell is running, but can be changed with redirection or a pipeline. More generally, a child process inherits the standard streams of its parent process.
Job Control Language (JCL) is a scripting language used on IBM mainframe operating systems to instruct the system on how to run a batch job or start a subsystem. The purpose of JCL is to say which programs to run, using which files or devices for input or output, and at times to also indicate under what conditions to skip a step. Parameters in the JCL can also provide accounting information for tracking the resources used by a job as well as which machine the job should run on.
Dartmouth BASIC is the original version of the BASIC programming language. It was designed by two professors at Dartmouth College, John G. Kemeny and Thomas E. Kurtz. With the underlying Dartmouth Time Sharing System (DTSS), it offered an interactive programming environment to all undergraduates as well as the larger university community.
Tymnet was an international data communications network headquartered in Cupertino, California that used virtual call packet-switched technology and X.25, SNA/SDLC, BSC and Async interfaces to connect host computers (servers) at thousands of large companies, educational institutions, and government agencies. Users typically connected via dial-up connections or dedicated asynchronous connections.
The IBM 1130 Computing System, introduced in 1965, was IBM's least expensive computer at that time. A binary 16-bit machine, it was marketed to price-sensitive, computing-intensive technical markets, like education and engineering, succeeding the decimal IBM 1620 in that market segment. Typical installations included a 1 megabyte disk drive that stored the operating system, compilers and object programs, with program source generated and maintained on punched cards. Fortran was the most common programming language used, but several others, including APL, were available.
In computing, a line number is a method used to specify a particular sequence of characters in a text file. The most common method of assigning numbers to lines is to assign every line a unique number, starting at 1 for the first line, and incrementing by 1 for each successive line.
CAL, short for Conversational Algebraic Language, was a programming language and system designed and developed by Butler Lampson at Berkeley in 1967 for the SDS 940 mainframe computer. CAL is a version of the seminal JOSS language with several cleanups and new features to take advantage of the SDS platform.
The Jet Propulsion Laboratory Display Information System is a file management program written in FORTRAN.
IBM System/3 BASIC was an interpreter for the BASIC programming language for the IBM System/34 midrange computer.
IBM System/36 BASIC was an interpreter for the IBM System/36 midrange computer.
In computing, ".bak" is a filename extension commonly used to signify a backup copy of a file.
The Universal Time-Sharing System (UTS) is a discontinued operating system for the XDS Sigma series of computers, succeeding Batch Processing Monitor (BPM)/Batch Time-Sharing Monitor (BTM). UTS was announced in 1966, but because of delays did not actually ship until 1971. It was designed to provide multi-programming services for online (interactive) user programs in addition to batch-mode production jobs, symbiont (spooled) I/O, and critical real-time processes. System daemons, called "ghost jobs" were used to run monitor code in user space. The final release, D00, shipped in January, 1973. It was succeeded by the CP-V operating system, which combined UTS with features of the heavily batch-oriented Xerox Operating System (XOS).
The .dbf file extension represents the dBase database file. The file type was introduced in 1983 with dBASE II. The file structure has evolved to include many features and capabilities. Several additional file types have been added, to support data storage and manipulation. The current .dbf file level is called Level 7. The .dbf format is supported by a number of database products.
Distributed Data Management Architecture (DDM) is IBM's open, published software architecture for creating, managing and accessing data on a remote computer. DDM was initially designed to support record-oriented files; it was extended to support hierarchical directories, stream-oriented files, queues, and system command processing; it was further extended to be the base of IBM's Distributed Relational Database Architecture (DRDA); and finally, it was extended to support data description and conversion. Defined in the period from 1980 to 1993, DDM specifies necessary components, messages, and protocols, all based on the principles of object-orientation. DDM is not, in itself, a piece of software; the implementation of DDM takes the form of client and server products. As an open architecture, products can implement subsets of DDM architecture and products can extend DDM to meet additional requirements. Taken together, DDM products implement a distributed file system.
SUPER BASIC, sometimes SBASIC for short, is an advanced dialect of the BASIC programming language offered on Tymshare's SDS 940 systems starting in 1968 and available well into the 1970s.
Data General Extended BASIC, also widely known as Nova Extended BASIC, was a BASIC programming language interpreter for the Data General Nova series minicomputers. It was based on the seminal Dartmouth BASIC, including the Fifth Edition's string variables and powerful MAT
commands for matrix manipulation. In contrast to the compile-and-go Dartmouth BASIC, Extended BASIC was an interpreter.
The TENET 210 was a mainframe computer designed for timesharing services. The machine was designed for high throughput and expandability, including 20 direct memory access (DMA) channels and eight slots for core memory, allowing up to 128k 32-bit words of RAM. The sales materials boasted that it guaranteed user responses within one second.
CALL/360:BASIC was an IBM dialect of the BASIC programming language for the System/360 and later platforms. It was based on mid-1960s versions of Dartmouth BASIC but added a number of extensions. Most of these were related to file handling, which, at that time, Dartmouth lacked. It also added support for the mathematical symbols found on some IBM terminals, so that <= could be entered directly as ≤. Differences are otherwise minor.