Proteus (programming language)

Proteus (PROcessor for TExt Easy to USe) is a fully functional, procedural programming language created in 1998 by Simone Zanella. It incorporates functions derived from several other languages, including C, BASIC, Assembly and Clipper/dBase. It is especially versatile in dealing with strings, with hundreds of dedicated functions, which makes it one of the richest languages for text manipulation.

Proteus owes its name to Proteus, a Greek god of the sea, who tended Neptune's herd and gave oracular responses; he was renowned for his ability to transform himself into different shapes. Transforming data from one form to another is the main use of this language.

Introduction

Proteus was initially created as a multiplatform (DOS, Windows, Unix) system utility, to manipulate text and binary files and to create CGI scripts. The language was later focused on Windows, by adding hundreds of specialized functions for: network and serial communication, database interrogation, system service creation, console applications, keyboard emulation, ISAPI scripting (for IIS). Most of these additional functions are only available in the Windows flavour of the interpreter, even though a Linux version is still available.

Proteus was designed to be practical (easy to use, efficient, complete), readable and consistent.

Its strongest points are:

The language can be extended by adding user functions written in Proteus or DLLs created in C/C++.

Language features

At first sight, Proteus may appear similar to BASIC because of its straightforward syntax, but the similarities are only superficial:

Data types supported by Proteus are only three: integer numbers, floating point numbers and strings. Access to advanced data structures (files, arrays, queues, stacks, AVL trees, sets and so on) takes place by using handles, i.e. integer numbers returned by item creation functions.

Type declaration is unnecessary: variable type is determined by the function applied – Proteus converts on the fly every variable when needed and holds previous data renderings, to avoid performance degradation caused by repeated conversions.
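One way to picture "holding previous data renderings" is a value that caches each converted form the first time it is requested. The sketch below is a Python illustration under that assumption; the `Value` class and its `conversions` counter are hypothetical and say nothing about how the Proteus interpreter is actually implemented.

```python
class Value:
    """Hypothetical variant value that remembers each rendering once computed."""

    def __init__(self, raw):
        # The original rendering is stored first, keyed by its type.
        self._renderings = {type(raw): raw}
        self.conversions = 0  # counts real conversions, for illustration only

    def as_type(self, target):
        """Return the value as `target`, converting only on the first request."""
        if target not in self._renderings:
            source = next(iter(self._renderings.values()))  # original rendering
            self._renderings[target] = target(source)
            self.conversions += 1
        return self._renderings[target]


v = Value("42")
# The string is converted to an integer once; the second call reuses the cache.
total = v.as_type(int) + v.as_type(int)
```

In this sketch, a variable used repeatedly as a number inside a loop pays the string-to-integer cost only once.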

There is no need to add parentheses in expressions to determine the evaluation order, because the language is fully functional (there are no operators).
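The effect of an operator-free notation can be shown with a Python analogue. The functions `add`, `mul` and `sub` come from Python's standard `operator` module and stand in for arithmetic functions; the corresponding Proteus function names are not given in this article, so none are assumed here.

```python
from operator import add, mul, sub

# With infix operators, 2 + 3 * 4 relies on precedence rules or parentheses.
# Written purely as nested function calls, the nesting alone fixes the order.
a = add(2, mul(3, 4))    # multiply first, then add
b = mul(add(2, 3), 4)    # add first, then multiply
c = sub(10, sub(4, 1))   # inner subtraction is evaluated first
```

The parentheses that remain only delimit argument lists; none are needed to override a precedence rule.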

Proteus includes hundreds of functions for:

Proteus supports associative arrays (called sets) and AVL trees, which are useful for sorting and looking up values quickly.

Two types of regular expressions are supported:

Both types of expressions can be used to parse and compare data.

The functional approach and the extensive library of built-in functions make it possible to write very short but powerful scripts; to keep them comprehensible, medium-length keywords were adopted.

The user, besides writing new high-level functions in Proteus, can add new functions in C/C++ by following the guidelines and using the templates available in the software development kit; the new functions can be invoked exactly the same way as the predefined ones, passing expressions by value or variables by reference.

Proteus is an interpreted language: programs are loaded into memory, pre-compiled and run; since the number of built-in functions is large, execution speed is usually very good and often comparable to that of compiled programs.

One of the most interesting features of Proteus is the possibility of running scripts as services or ISAPI scripts.

Running a Proteus script as a service, started as soon as the operating system has finished loading, gives many advantages:

This is very useful for protecting critical processes in industrial environments (data collection, device monitoring), or for preventing the operator from inadvertently closing a utility (keyboard emulation).

The ISAPI version of Proteus can be used to create scripts run through Internet Information Services and is equipped with specific functions to cooperate with the web server.

For intellectual property protection Proteus provides:

Proteus is appreciated because it is relatively easy to write short, powerful and comprehensible scripts; the large number of built-in functions, together with the examples in the manual, keeps the learning curve low.

The development environment includes a source code editor with syntax highlighting and a context-sensitive guide. Proteus does not need to be installed: the interpreter is a single executable (below 400 KB) that does not require additional DLLs to run on recent Windows systems.

Synopsis and licensing

The main features of this language are:

Proteus is available in a demo version (script execution limited to three minutes) and a registered version, protected by a USB dongle. At the moment, it is available as a Windows or Ubuntu package and is distributed by SZP.

Example programs

Hello World

The following example prints out "Hello World!".

CONSOLELN "Hello World!"

Extract two fields

The following example reads the standard input (CSV format, separator ";") and prints out the first two fields separated by "|":

CONSOLELN TOKEN(L, 1, ";") "|" TOKEN(L, 2, ";")

Proteus scripts by default work on an input file and write to an output file; the predefined identifier L gets the value of every line in input. The function TOKEN returns the requested item of the string; the third parameter represents the delimiter. String concatenation is implicit.

The same program can be written in this way:

H = TOKNEW(L, ";")
CONSOLELN TOKGET(H, 1) "|" TOKGET(H, 2)
TOKFREE(H)

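In this case, we used two other functions: TOKNEW, which builds the list of the tokens in the line and returns a handle to it, and TOKGET, which retrieves an item from that list. TOKFREE releases the list once it is no longer needed. This approach is more efficient if we need to access several items in the string.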
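For readers unfamiliar with Proteus, the two approaches can be compared with a rough Python analogue. This is a sketch only: `token` and `first_two_fields` are hypothetical helper names, and returning an empty string for a missing field is an assumption, not documented Proteus behaviour.

```python
def token(line, n, sep):
    """Return the n-th (1-based) field of line, or "" if it is missing.

    Each call splits the whole line again, like a repeated lookup.
    """
    parts = line.split(sep)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def first_two_fields(lines, sep=";", out_sep="|"):
    """Tokenise each line once, then join its first two fields."""
    result = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)   # one split per line
        fields += [""] * (2 - len(fields))      # pad lines with fewer than 2 fields
        result.append(fields[0] + out_sep + fields[1])
    return result


sample = ["alice;30;Rome", "bob;25;Turin"]
converted = first_two_fields(sample)
```

Splitting once per line and indexing into the result avoids rescanning the string for every field, which is the efficiency gain the token-list approach is meant to provide.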
