Netwide Assembler

Last updated

NASM
Original author(s) Simon Tatham, Julian Hall
Developer(s) H. Peter Anvin, Chang Seok Bae, Jim Kukunas, Frank B. Kotler, Cyrill Gorcunov
Initial releaseOctober 1996;27 years ago (1996-10)
Stable release
2.16.03 [1]   OOjs UI icon edit-ltr-progressive.svg / 17 April 2024;18 days ago (17 April 2024)
Repository
Written in Assembly, C [2]
Operating system Unix-like, Windows, OS/2, MS-DOS
Available in English
Type x86 assembler
License BSD 2-clause
Website www.nasm.us

The Netwide Assembler (NASM) is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32) and 64-bit (x86-64) programs. It is considered one of the most popular assemblers for Linux and x86 chips. [3]

Contents

It was originally written by Simon Tatham with assistance from Julian Hall. As of 2016, it is maintained by a small team led by H. Peter Anvin. [4] It is open-source software released under the terms of a simplified (2-clause) BSD license. [5]

Features

NASM can output several binary formats, including COFF, OMF, a.out, Executable and Linkable Format (ELF), Mach-O and binary file (.bin, binary disk image, used to compile operating systems), though position-independent code is supported only for ELF object files. It also has its own binary format called RDOFF. [6]

The variety of output formats allows retargeting programs to virtually any x86 operating system (OS). It can also create flat binary files, usable to write boot loaders, read-only memory (ROM) images, and in various facets of OS development. [6] It can run on non-x86 platforms as a cross assembler, such as PowerPC and SPARC, though it cannot generate programs usable by those machines.

NASM uses a variant of Intel assembly syntax instead of AT&T syntax. [7] It also avoids features such as automatic generation of segment overrides (and the related ASSUME directive) used by MASM and compatible assemblers. [6]

Sample programs

A collection of sample programs can be found at Wikibooks:Netwide Assembler sample programs.

Development

NASM version 0.90 was released in October 1996. [5]

Version 2.00 was released on 28 November 2007, adding support for x86-64 extensions. [4] The development versions are not uploaded to SourceForge.net, but are checked into GitHub with binary snapshots available from the project web page.

A search engine for NASM documentation is also available. [8]

In July 2009, as of version 2.07, NASM was released under the Simplified (2-clause) BSD license. Previously, because it was licensed under LGPL, it led to development of Yasm, a complete rewrite of under the New BSD License. Yasm offered support for x86-64 earlier than NASM. It also added support for GNU Assembler syntax.

RDOFF

Relocatable Dynamic Object File Format (RDOFF) is used by developers to test the integrity of NASM's object file output abilities. It is based heavily on the internal structure of NASM, [9] essentially consisting of a header containing a serialization of the output driver function calls followed by an array of sections containing executable code or data. Tools for using the format, including a linker and loader, are included in the NASM distribution.

Until version 0.90 was released in October 1996, NASM supported output of only flat-format executable files (e.g., DOS COM files). In version 0.90, Simon Tatham added support for an object-file output interface, and for DOS .OBJ files for 16-bit code only. [10]

NASM thus lacked a 32-bit object format. To address this lack, and as an exercise to learn the object-file interface, developer Julian Hall put together the first version of RDOFF, which was released in NASM version 0.91. [10]

Since this initial version, there has been one major update to the RDOFF format, which added a record-length indicator on each header record, [11] allowing programs to skip over records whose format they do not recognise, and support for multiple segments; RDOFF1 only supported three segments: text, data and bss (containing uninitialized data). [9]

The RDOFF format is strongly deprecated and has been disabled starting in NASM 2.15.04. [12]

See also

Related Research Articles

<span class="mw-page-title-main">Assembly language</span> Low-level programming language

In computer programming, assembly language, often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence between the instructions in the language and the architecture's machine code instructions. Assembly language usually has one statement per machine instruction (1:1), but constants, comments, assembler directives, symbolic labels of, e.g., memory locations, registers, and macros are generally also supported.

x86 Family of instruction set architectures

x86 is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel's 8-bit 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors. Colloquially, their names were "186", "286", "386" and "486".

Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software.

A low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture—commands or functions in the language map that are structurally similar to processor's instructions. Generally, this refers to either machine code or assembly language. Because of the low abstraction between the language and machine language, low-level languages are sometimes described as being "close to the hardware". Programs written in low-level languages tend to be relatively non-portable, due to being optimized for a certain type of system architecture.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

INT is an assembly language instruction for x86 processors that generates a software interrupt. It takes the interrupt number formatted as a byte value.

The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.

High Level Assembly (HLA) is a language developed by Randall Hyde that allows the use of higher-level language constructs to aid both beginners and advanced assembly developers. It fully supports advanced data types and object-oriented programming. It uses a syntax loosely based on several high-level programming languages (HLLs), such as Pascal, Ada, Modula-2, and C++, to allow the creation of readable assembly language programs, and to allow HLL programmers to learn HLA as fast as possible.

The GNU Assembler, commonly known as gas or as, is the assembler developed by the GNU Project. It is the default back-end of GCC. It is used to assemble the GNU operating system and the Linux kernel, and various other software. It is a part of the GNU Binutils package.

In computer programming, an inline assembler is a feature of some compilers that allows low-level code written in assembly language to be embedded within a program, among code that otherwise has been compiled from a higher-level language such as C or Ada.

Sign extension is the operation, in computer arithmetic, of increasing the number of bits of a binary number while preserving the number's sign (positive/negative) and value. This is done by appending digits to the most significant side of the number, following a procedure dependent on the particular signed number representation used.

In the x86 architecture, the CPUID instruction is a processor supplementary instruction allowing software to discover details of the processor. It was introduced by Intel in 1993 with the launch of the Pentium and SL-enhanced 486 processors.

On many computer operating systems, a computer process terminates its execution by making an exit system call. More generally, an exit in a multithreading environment means that a thread of execution has stopped running. For resource management, the operating system reclaims resources that were used by the process. The process is said to be a dead process after it terminates.

This article describes the calling conventions used when programming x86 architecture microprocessors.

<span class="mw-page-title-main">Cosmos (operating system)</span> Toolkit for building GUI and command-line based operating systems

C# Open Source Managed Operating System (Cosmos) is a toolkit for building GUI and command-line based operating systems, written mostly in the programming language C# and small amounts of a high-level assembly language named X#. Cosmos is a backronym, in that the acronym was chosen before the meaning. It is open-source software released under a BSD license.

Advanced Vector Extensions are SIMD extensions to the x86 instruction set architecture for microprocessors from Intel and Advanced Micro Devices (AMD). They were proposed by Intel in March 2008 and first supported by Intel with the Sandy Bridge processor shipping in Q1 2011 and later by AMD with the Bulldozer processor shipping in Q3 2011. AVX provides new features, new instructions, and a new coding scheme.

A decompiler is a computer program that translates an executable file to high-level source code. It does therefore the opposite of a typical compiler, which translates a high-level language to a low-level language. While disassemblers translate an executable into assembly language, decompilers go a step further and translate the code into a higher level language such as C or Java, requiring more sophisticated techniques. Decompilers are usually unable to perfectly reconstruct the original source code, thus will frequently produce obfuscated code. Nonetheless, they remain an important tool in the reverse engineering of computer software.

BSON is a computer data interchange format. The name "BSON" is based on the term JSON and stands for "Binary JSON". It is a binary form for representing simple or complex data structures including associative arrays, integer indexed arrays, and a suite of fundamental scalar types. BSON originated in 2009 at MongoDB. Several scalar data types are of specific interest to MongoDB and the format is used both as a data storage and network transfer format for the MongoDB database, but it can be used independently outside of MongoDB. Implementations are available in a variety of languages such as C, C++, C#, D, Delphi, Erlang, Go, Haskell, Java, JavaScript, Julia, Lua, OCaml, Perl, PHP, Python, Ruby, Rust, Scala, Smalltalk, and Swift.

The Alternate Instruction Set (AIS) is a second 32-bit instruction set architecture found in some x86 CPUs made by VIA Technologies. On these VIA C3 processors, the second hidden processor mode is accessed by executing the x86 instruction JMPAI. If AIS mode has been enabled, the processor will perform a JMP EAX and begin executing AIS instructions at the address of the EAX register. Using AIS allows native access to the Centaur Technology-designed RISC core inside the processor.

Data General Extended BASIC, also widely known as Nova Extended BASIC, was a BASIC programming language interpreter for the Data General Nova series minicomputers. It was based on the seminal Dartmouth BASIC, including the Fifth Edition's string variables and powerful MAT commands for matrix manipulation. In contrast to the compile-and-go Dartmouth BASIC, Extended BASIC was an interpreter.

References

  1. "Release 2.16.03". 17 April 2024. Retrieved 23 April 2024.
  2. "NASM, the Netwide Assembler". GitHub . 25 October 2021.
  3. Ram Narayan. "Linux assemblers: A comparison of GAS and NASM". IBM . Archived from the original on 3 October 2013. two of the most popular assemblers for Linux, GNU Assembler (GAS) and Netwide Assembler (NASM)
  4. 1 2 "The Netwide Assembler" . Retrieved 27 June 2008.
  5. 1 2 "NASM Version History" . Retrieved 3 August 2019.
  6. 1 2 3 "NASM Manual". Archived from the original on 23 February 2009. Retrieved 15 August 2009.
  7. Randall Hyde. "NASM: The Netwide Assembler". Archived from the original on 12 September 2010. Retrieved 27 June 2008.
  8. "NASM Doc Search Engine". Archived from the original on 23 January 2010. Retrieved 14 September 2009.
  9. 1 2 "NASM Manual Ch. 6" . Retrieved 27 June 2008.
  10. 1 2 "NASM CVS". 8 June 2008. Retrieved 27 June 2008.
  11. "V1-V2.txt". 4 December 2002. Retrieved 27 June 2008.
  12. "Relocatable Dynamic Object File Format (deprecated)".

Further reading