High Level Assembly

High Level Assembly (HLA)
Developer(s)	Randall Hyde
Stable release	2.16 / July 6, 2011
Repository	sourceforge.net/projects/hlav1
Written in	Assembly language
Operating system	Windows, Linux, FreeBSD, macOS
Platform	IA-32
Available in	English
Type	Assembler
License	Public domain
Website	plantation-productions.com/Webster/HighLevelAsm/index.html

Last updated September 14, 2024

High-Level Assembly (HLA) is a language developed by Randall Hyde that allows the use of higher-level language constructs to aid both beginners and advanced assembly developers. It supports advanced data types and object-oriented programming. It uses a syntax loosely based on several high-level programming languages (HLLs), such as Pascal, Ada, Modula-2, and C++, to allow the creation of readable assembly language programs, and to allow HLL programmers to learn HLA as fast as possible.

Origins and Goals

HLA was originally conceived as a tool to teach assembly language programming at the college-university level. The goal is to leverage students' existing programming knowledge when learning assembly language to get them up to speed as fast as possible. Most students taking an assembly language programming course have already been introduced to high-level control flow structures, such as IF, WHILE, FOR, etc. HLA allows students to immediately apply that programming knowledge to assembly language coding early in their course, allowing them to master other prerequisite subjects in assembly before learning how to code low-level forms of these control structures. The book The Art of Assembly Language Programming by Randall Hyde uses HLA for this purpose.^[1]

High vs. low-level assembler

The HLA v2.x assembler supports the same low-level machine instructions as a regular low-level assembler. The difference is that high-level assemblers, such as HLA, Microsoft Macro Assembler (MASM), and Turbo Assembler (TASM) on the Intel x86 processor family, also support high-level-language-like statements, such as IF, WHILE, and so on, and fancier data declaration directives, such as structures-records, unions, and even classes.
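As a minimal sketch of the record-style data declarations mentioned above, the following program declares a record type and accesses its fields with ordinary machine instructions. The names recDemo, point, and p are illustrative only and do not come from the HLA documentation.

```hla
program recDemo;
#include( "stdlib.hhf" )

type
    // A record groups named fields, much like a Pascal record or C struct.
    point: record
        x: int32;
        y: int32;
    endrecord;

static
    p: point;

begin recDemo;

    // Fields are addressed with dot notation inside normal instructions.
    mov( 3, p.x );
    mov( 4, p.y );

    mov( p.x, eax );
    add( p.y, eax );

    // Coerce EAX to int32 so it is printed as a signed decimal value.
    stdout.put( "x+y=", (type int32 eax), nl );

end recDemo;
```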

Unlike most other assembler tools, the HLA compiler includes a Standard Library with thousands of functions, procedures, and macros that can be used to create full applications with the ease of a high-level language. While assembly language libraries are not new, a language that includes a large standardized library encourages programmers to use the library code rather than simply writing their own library functions.

HLA supports all the same low-level machine instructions as other x86 assemblers. Furthermore, HLA's high-level control structures are based on the ones found in MASM and TASM, whose HLL-like features predated the arrival of HLA by several years. In HLA, low-level assembly code can be written as easily as with any other assembler by simply ignoring the HLL-control constructs. In contrast to HLLs like Pascal, C, and C++, HLA doesn't require inline asm statements. HLA's HLL-like features are meant as a learning aid for beginning assembly programmers, smoothing the learning curve, on the assumption that they will stop using those statements once they master the low-level instruction set. In practice, many experienced programmers continue to use HLL-like statements in HLA, MASM, and TASM long after mastering the low-level instruction set, usually to improve readability.
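The two styles can be compared side by side. The sketch below writes the same counting loop twice, once with HLA's HLL-like while statement and once with only compare and jump instructions. The program name and the labels loopTop and loopDone are illustrative, and the sketch assumes that Standard Library output routines preserve ECX across the call.

```hla
program loopForms;
#include( "stdlib.hhf" )

begin loopForms;

    // HLL-like form: count 0, 1, 2 in ECX.
    mov( 0, ecx );
    while( ecx < 3 ) do

        stdout.put( "high-level: ", (type uns32 ecx), nl );
        inc( ecx );

    endwhile;

    // Same loop with the control flow spelled out by hand.
    mov( 0, ecx );
    loopTop:
        cmp( ecx, 3 );
        jae loopDone;           // unsigned ECX >= 3: leave the loop
        stdout.put( "low-level:  ", (type uns32 ecx), nl );
        inc( ecx );
        jmp loopTop;
    loopDone:

end loopForms;
```

Both loops are expected to assemble to essentially the same compare-and-branch pattern; the second form simply makes that pattern visible in the source.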

It is also possible to write high-level programs using HLA, avoiding much of the tedium of low-level assembly language programming. Some assembly language programmers reject HLA out of hand because it allows programmers to do this [citation needed]. However, supporting both high-level and low-level programming gives any language an expanded range of applicability.

Distinguishing features

Two HLA features set it apart from other x86 assemblers: its powerful macro system (compile-time language) and the HLA Standard Library.

Macro system

HLA's compile-time language allows extending the language with ease, even creating small domain-specific languages to help easily solve common programming problems. The macro stdout.put is a good example of a sophisticated macro that can simplify programming. Consider the following invocation of that macro:

stdout.put( "I=", i, " s=", s, " u=", u, " r=", r:10:2, nl );

The stdout.put macro processes each of the arguments to determine the argument's type and then calls an appropriate procedure in the HLA Standard library to handle the output of each of these operands.

Most assemblers provide some sort of macro ability: the advantage that HLA offers over other assemblers is that it can process macro arguments like r:10:2 using HLA's extensive compile-time string functions, and HLA's macro facilities can infer the types of variables and use that information to direct macro expansion.
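For context, here is a minimal complete program around that invocation. The variable names match the invocation above; their types and initial values are assumptions chosen so that each argument exercises a different output routine.

```hla
program putDemo;
#include( "stdlib.hhf" )

static
    i: int32  := -5;        // signed integer
    s: string := "hello";   // HLA string
    u: uns32  := 42;        // unsigned integer
    r: real64 := 3.14159;   // floating point, printed with width 10 and 2 decimals

begin putDemo;

    // One macro call; the macro selects an output routine per argument type.
    stdout.put( "I=", i, " s=", s, " u=", u, " r=", r:10:2, nl );

end putDemo;
```

Without the macro, each operand would need its own explicit call to a type-specific Standard Library routine (for example, stdout.puts for the string literals and stdout.puti32 for i).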

HLA's macro language provides a special Context-Free macro facility. This feature allows easily writing macros that span other sections of code via a starting and terminating macro pair (along with optional intermediate macro invocations that are only available between the start–terminate macros). For example, one can write a fully recursive-nestable SWITCH–CASE–DEFAULT–ENDSWITCH statement using this macro facility.

Because of the context-free design of the HLA macro facilities, these switch..case..default..endswitch statements can be nested, and the nested statements' emitted code will not conflict with the outside statements.
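A nested use might look like the following sketch. It assumes the switch, case, default, and endswitch macros supplied by the Standard Library's HLL module (pulled in through stdlib.hhf), with a 32-bit register as the switch operand; the register values and messages are illustrative.

```hla
program nestedSwitch;
#include( "stdlib.hhf" )

begin nestedSwitch;

    mov( 1, eax );
    mov( 2, ebx );

    switch( eax )

        case( 1 )
            // Inner statement nested inside a case of the outer one.
            switch( ebx )

                case( 1 )
                    stdout.put( "eax=1, ebx=1", nl );

                case( 2 )
                    stdout.put( "eax=1, ebx=2", nl );

                default
                    stdout.put( "eax=1, ebx=other", nl );

            endswitch;

        case( 2 )
            stdout.put( "eax=2", nl );

        default
            stdout.put( "eax=other", nl );

    endswitch;

end nestedSwitch;
```

Each case and endswitch binds to the nearest enclosing switch, which is what keeps the labels and jump tables of the inner statement separate from those of the outer one.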

Compile-Time Language

The HLA macro system is actually a subset of a larger feature known as the HLA Compile-Time Language (CTL). The HLA CTL is an interpreted language that is available in an HLA program source file. An interpreter executes HLA CTL statements during the compiling of an HLA source file; hence the name compile-time language.

The HLA CTL includes many control statements such as #IF, #WHILE, #FOR, #PRINT, an assignment statement and so on. One can also create compile-time variables and constants (including structured data types such as records and unions). The HLA CTL also provides hundreds of built-in functions (including a very rich set of string and pattern-matching functions). The HLA CTL allows programmers to create CTL programs that scan and parse strings, allowing those programmers to create embedded domain-specific languages (EDSLs, also termed mini-languages). The stdout.put macro appearing earlier is an example of such an EDSL. The put macro (in the stdout namespace, hence the name stdout.put) parses its macro parameter list and emits the code that will print its operands.
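The following sketch, modeled on the style of examples in The Art of Assembly Language, uses a compile-time variable and a #while loop to unroll a run-time loop. The names ctWhile, ary, and i are illustrative. Note that i exists only during compilation; the emitted program contains five separate stdout.put expansions and no loop.

```hla
program ctWhile;
#include( "stdlib.hhf" )

static
    ary: uns32[5] := [ 2, 3, 5, 8, 13 ];

begin ctWhile;

    // "?" assigns to a compile-time variable.
    ?i := 0;
    #while( i < 5 )

        // Message shown by the compiler while it processes the source file.
        #print( "expanding iteration ", i )

        // Emitted once per iteration; i*4 is the byte offset of element i.
        stdout.put( "array[ ", i, " ] = ", ary[ i*4 ], nl );
        ?i := i + 1;

    #endwhile

end ctWhile;
```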

Standard library

The HLA Standard Library is an extensive set of pre-made routines and macros (like the stdout.put macro described above) that make life easier for programmers, saving them from starting from scratch every time they write a new application. Perhaps just as important, the HLA Standard Library allows programmers to write portable applications that run under Windows or Linux with nothing more than recompiling the source code. Like the C standard library for the programming language C, the HLA Standard Library allows users to abstract away low-level operating system (OS) calls, so the same set of OS application programming interfaces (APIs) can serve for all operating systems that HLA supports. Although assembly language allows making any needed OS call directly, programs that restrict themselves to the HLA Standard Library API set are easy to port between operating systems.
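As an illustration of OS-neutral library code, the sketch below reads an integer and handles bad input using only Standard Library input, output, and exception-handling facilities, with no direct OS calls. The program and variable names are illustrative.

```hla
program readInt;
#include( "stdlib.hhf" )

static
    value: int32;

begin readInt;

    try

        stdout.put( "Enter an integer: " );
        stdin.get( value );
        stdout.put( "You entered ", value, nl );

      // Raised when the input contains non-numeric characters.
      exception( ex.ConversionError )

        stdout.put( "That was not a valid number", nl );

      // Raised when the number does not fit in the destination type.
      exception( ex.ValueOutOfRange )

        stdout.put( "That number is out of range", nl );

    endtry;

end readInt;
```

Because the program touches the console only through the library, the same source file should build unchanged on any operating system that HLA supports.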

The HLA Standard Library provides thousands of functions, procedures, and macros. While the list changes over time, as of mid-2010 for HLA v2.12, it included functions in these categories:

Command-line argument processing
Array (dynamic) declaration and manipulation
Bit manipulation
Blob (binary large object) manipulation
Character manipulation
Conversions
Character set manipulation
Date and time functions
Object-oriented file I/O
Standard file I/O
File system manipulation functions, e.g., delete, rename, change directory
HLA-related declarations and functions
The HLA Object Windows Library: object-oriented framework for Win32 programming
Linked list manipulation
Mathematical functions
Memory allocation and management
FreeBSD-specific APIs
Linux-specific APIs
MacOS-specific APIs
Win32-specific APIs
Text console functions
Coroutine support
Environment variable support
Exception handling support
Memory-mapped file support
Sockets and client–server object support
Thread and synchronization support
Timer functions
Pattern matching support for regular expressions and context-free languages
Random number generators
Remote procedure call support
Standard error output functions
Standard output functions
Standard input functions
String functions
Table (associative) support
Zero-terminated string functions

Design

The HLA v2.x language system is a command-line driven tool that consists of several components, including a shell program (e.g., hla.exe under Windows), the HLA language compiler (e.g., hlaparse.exe), a low-level translator (e.g., the HLABE, or HLA Back Engine), a linker (link.exe under Windows, ld under Linux), and other tools such as a resource compiler for Windows. Versions before 2.0 relied on an external assembler back end; versions 2.x and later of HLA use the built-in HLABE as the back-end object code formatter.

The HLA shell application processes command line parameters and routes appropriate files to each of the programs that make up the HLA system. It accepts as input .hla files (HLA source files), .asm files (source files for MASM, TASM, FASM, NASM, or Gas assemblers), .obj files for input to the linker, and .rc files (for use by a resource compiler).

Source code translation

Originally, the HLA v1.x tool compiled its source code into an intermediate source file that a back-end assembler such as MASM, TASM, flat assembler (FASM), Netwide Assembler (NASM), or GNU Assembler (Gas) would translate into the low-level object code file. As of HLA v2.0, HLA included its own HLA Back Engine (HLABE) that provided the low-level object code translation. However, via various command-line parameters, HLA v2.x still has the ability to translate an HLA source file into a source file that is compatible with one of these other assemblers.

HLA Back Engine

The HLA Back Engine (HLABE) is a compiler back end that translates an internal intermediate language into low-level Portable Executable (PE), Common Object File Format (COFF), Executable and Linkable Format (ELF), or Mach-O object code. An HLABE program mostly consists of data (byte) emission statements, 32-bit relocatable address statements, x86 control-transfer instructions, and various directives. In addition to translating the byte and relocatable address statements into the low-level object code format, HLABE also handles branch-displacement optimization (picking the shortest possible form of a branch instruction).

Although the HLABE is incorporated into the HLA v2.x compiler, it is actually a separate product. It is public domain and open source (hosted on SourceForge.net).

Notes

[1] "The Art of Assembly Language Programming". Archived from the original on 2018-03-29. Retrieved 2010-02-12.

References

Richard Blum, Professional assembly language, Wiley, 2005, ISBN 0-7645-7901-0, p. 42
Randall Hyde, Write Great Code: Understanding the machine, No Starch Press, 2004, ISBN 1-59327-003-8, pp. 14–15 and used throughout the book
Randall Hyde, The Art of Assembly Language, 2nd Edition, No Starch Press, 2010, ISBN 1-59327-207-3, used throughout the book
