Original author(s) | Simon Tatham, Julian Hall |
---|---|
Developer(s) | H. Peter Anvin, Chang Seok Bae, Jim Kukunas, Frank B. Kotler, Cyrill Gorcunov |
Initial release | October 1996 |
Stable release | |
Repository | |
Written in | Assembly, C [2] |
Operating system | Unix-like, Windows, OS/2, MS-DOS |
Available in | English |
Type | x86 assembler |
License | BSD 2-clause |
Website | www |
The Netwide Assembler (NASM) is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32) and 64-bit (x86-64) programs. It is considered one of the most popular assemblers for Linux and x86 chips. [3]
It was originally written by Simon Tatham with assistance from Julian Hall. As of 2016, it is maintained by a small team led by H. Peter Anvin. [4] It is open-source software released under the terms of a simplified (2-clause) BSD license. [5]
NASM can output several binary formats, including COFF, OMF, a.out, Executable and Linkable Format (ELF), Mach-O and flat binary (.bin, a raw image such as a binary disk image, used when building operating systems), though position-independent code is supported only for ELF object files. It also has its own binary format called RDOFF. [6]
The variety of output formats allows retargeting programs to virtually any x86 operating system (OS). It can also create flat binary files, usable to write boot loaders, read-only memory (ROM) images, and in various facets of OS development. [6] NASM can run on non-x86 platforms, such as PowerPC and SPARC, as a cross assembler, though it cannot generate programs usable by those machines.
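As a rough illustration of what "flat binary" means for a boot loader, the Python sketch below builds a 512-byte boot sector image by hand: raw machine code, zero padding, and the 0x55 0xAA signature that a legacy PC BIOS expects in the last two bytes of the sector. The function name and the tiny payload (an infinite jump encoded as EB FE) are assumptions made for this example and do not come from NASM. In NASM source, the same layout is commonly produced with `times 510-($-$$) db 0` followed by `dw 0xAA55`.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # stored at offsets 510 and 511


def build_boot_sector(code: bytes) -> bytes:
    """Pad raw machine code to one sector and append the boot signature."""
    room = SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > room:
        raise ValueError(f"code is {len(code)} bytes; at most {room} fit")
    # No headers, no relocation data: just code, padding and the signature.
    return code + bytes(room - len(code)) + BOOT_SIGNATURE


# Hypothetical payload: "jmp $" (EB FE), a two-byte infinite loop.
image = build_boot_sector(b"\xeb\xfe")
```

The resulting `image` could be written to a file and used as a disk image; nothing in it describes sections or symbols, which is what distinguishes a flat binary from an object file.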
NASM uses a variant of Intel assembly syntax instead of AT&T syntax. [7] It also avoids features such as automatic generation of segment overrides (and the related ASSUME directive) used by MASM and compatible assemblers. [6]
A "Hello, world!" program for the DOS operating system:

    section .text
    org 0x100
        mov ah, 0x9
        mov dx, hello
        int 0x21

        mov ax, 0x4c00
        int 0x21

    section .data
    hello: db 'Hello, world!', 13, 10, '$'

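The role of `org 0x100` in the DOS program can be made concrete with a small Python sketch that hand-encodes the same five instructions. DOS loads a .COM image at offset 0x100 of its segment, so the address placed in `dx` must be 0x100 plus the string's position in the file. The function name is hypothetical, and the bytes are a hand-written equivalent under that assumption, not necessarily byte-identical to what NASM emits.

```python
ORG = 0x100  # DOS loads a .COM image at offset 0x100 of its segment


def build_com(message: bytes) -> bytes:
    """Hand-encode the five instructions, then append the '$'-terminated text."""
    code_len = 12                 # 2 + 3 + 2 + 3 + 2 bytes, see below
    hello = ORG + code_len        # run-time address of the string
    code = (
        b"\xb4\x09"                               # mov ah, 0x9
        + b"\xba" + hello.to_bytes(2, "little")   # mov dx, hello
        + b"\xcd\x21"                             # int 0x21
        + b"\xb8\x00\x4c"                         # mov ax, 0x4c00
        + b"\xcd\x21"                             # int 0x21
    )
    assert len(code) == code_len
    return code + message


com = build_com(b"Hello, world!\r\n$")
```

Without the 0x100 adjustment, `dx` would hold the file offset 0x0C instead of 0x10C and DOS would print whatever happens to lie at that address.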
An equivalent program for Linux:

    global _start

    section .text
    _start:
        mov eax, 4          ; write
        mov ebx, 1          ; stdout
        mov ecx, msg
        mov edx, msg.len
        int 0x80            ; write(stdout, msg, strlen(msg));

        xor eax, msg.len    ; invert return value from write()
        xchg eax, ebx       ; value for exit()
        mov eax, 1          ; exit
        int 0x80            ; exit(...)

    section .data
    msg: db "Hello, world!", 10
    .len: equ $ - msg

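The `xor`/`xchg` pair in the Linux program derives the exit status from the result of `write`: XOR-ing the number of bytes written with the expected length gives 0 only when the whole message was written. A minimal Python sketch of the same logic follows; the function name is hypothetical, and unlike the raw system call, `os.write` raises `OSError` on failure instead of returning a negative value.

```python
import os


def write_and_status(fd: int, msg: bytes) -> int:
    """Return 0 if write() reported exactly len(msg) bytes, nonzero otherwise."""
    written = os.write(fd, msg)   # corresponds to int 0x80 with eax = 4
    return written ^ len(msg)     # corresponds to xor eax, msg.len


# Usage sketch: write into a pipe so the result can be inspected.
read_end, write_end = os.pipe()
status = write_and_status(write_end, b"Hello, world!\n")
echoed = os.read(read_end, 64)
os.close(read_end)
os.close(write_end)
```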
An example of a similar program for Microsoft Windows:

    global _main
    extern _MessageBoxA@16
    extern _ExitProcess@4

    section code use32 class=code
    _main:
        push dword 0        ; UINT uType = MB_OK
        push dword title    ; LPCSTR lpCaption
        push dword banner   ; LPCSTR lpText
        push dword 0        ; HWND hWnd = NULL
        call _MessageBoxA@16

        push dword 0        ; UINT uExitCode
        call _ExitProcess@4

    section data use32 class=data
    banner: db 'Hello, world!', 0
    title:  db 'Hello', 0

A 64-bit program for Apple OS X that inputs a keystroke and shows it on the screen:

    global _start

    section .data
    query_string:     db "Enter a character: "
    query_string_len: equ $ - query_string
    out_string:       db "You have input: "
    out_string_len:   equ $ - out_string

    section .bss
    in_char: resw 4

    section .text
    _start:
        mov rax, 0x2000004          ; put the write-system-call-code into register rax
        mov rdi, 1                  ; tell kernel to use stdout
        mov rsi, query_string       ; rsi is where the kernel expects to find the address of the message
        mov rdx, query_string_len   ; and rdx is where the kernel expects to find the length of the message
        syscall

        ; read in the character
        mov rax, 0x2000003          ; read system call
        mov rdi, 0                  ; stdin
        mov rsi, in_char            ; address for storage, declared in section .bss
        mov rdx, 2                  ; get 2 bytes from the kernel's buffer (one for the carriage return)
        syscall

        ; show user the output
        mov rax, 0x2000004          ; write system call
        mov rdi, 1                  ; stdout
        mov rsi, out_string
        mov rdx, out_string_len
        syscall

        mov rax, 0x2000004          ; write system call
        mov rdi, 1                  ; stdout
        mov rsi, in_char
        mov rdx, 2                  ; the second byte is to apply the carriage return expected in the string
        syscall

        ; exit system call
        mov rax, 0x2000001          ; exit system call
        xor rdi, rdi
        syscall

NASM principally outputs object files, which are generally not executable by themselves. The only exception is flat binaries (e.g., .COM), [6] which are inherently limited in modern use. To translate the object files into executable programs, an appropriate linker must be used, such as the Visual Studio "LINK" utility for Windows or ld for Unix-like systems.
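The two-step workflow (assemble to an object file, then link) might look as follows for the 32-bit Linux program, sketched here as a small Python helper. The function names and the file name `hello.asm` are hypothetical; the sketch assumes `nasm` and GNU `ld` are installed, and uses `-f elf32` and `-m elf_i386` because that program is 32-bit code.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(source: str) -> list[list[str]]:
    """Return the assemble and link commands for a 32-bit Linux ELF program."""
    stem = Path(source).with_suffix("")
    obj = f"{stem}.o"
    return [
        ["nasm", "-f", "elf32", source, "-o", obj],       # source -> object file
        ["ld", "-m", "elf_i386", obj, "-o", str(stem)],   # object -> executable
    ]


def build(source: str) -> None:
    """Run both steps in order; fails early if a tool is missing."""
    for cmd in build_commands(source):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)
```

Calling `build("hello.asm")` would first produce `hello.o` and then the executable `hello`. A flat binary needs only the first step, with `-f bin` in place of `-f elf32`.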
NASM version 0.90 was released in October 1996. [5]
Version 2.00 was released on 28 November 2007, adding support for x86-64 extensions. [4] The development versions are not uploaded to SourceForge.net, but are checked into GitHub with binary snapshots available from the project web page.
A search engine for NASM documentation is also available. [8]
In July 2009, as of version 2.07, NASM was released under the Simplified (2-clause) BSD license. Previously, NASM was licensed under the LGPL, which led to the development of Yasm, a complete rewrite of NASM under the New BSD License. Yasm offered support for x86-64 earlier than NASM. It also added support for GNU Assembler syntax.
Relocatable Dynamic Object File Format (RDOFF) is used by developers to test the integrity of NASM's object file output abilities. It is based heavily on the internal structure of NASM, [9] essentially consisting of a header containing a serialization of the output driver function calls followed by an array of sections containing executable code or data. Tools for using the format, including a linker and loader, are included in the NASM distribution.
Until version 0.90 was released in October 1996, NASM supported output of only flat-format executable files (e.g., DOS COM files). In version 0.90, Simon Tatham added support for an object-file output interface, and for DOS .OBJ files for 16-bit code only. [10]
NASM thus lacked a 32-bit object format. To address this lack, and as an exercise to learn the object-file interface, developer Julian Hall put together the first version of RDOFF, which was released in NASM version 0.91. [10]
Since this initial version, there has been one major update to the RDOFF format. It added a record-length indicator on each header record, [11] allowing programs to skip over records whose format they do not recognise, and support for multiple segments; RDOFF1 supported only three segments: text, data and bss (containing uninitialized data). [9]
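The benefit of a per-record length can be sketched in Python with a generic type/length/payload reader. This is an illustration of the idea only: the one-byte type and one-byte length fields, the type codes and the sample bytes are assumptions made for the example, and the actual RDOFF2 layout should be checked against the NASM documentation.

```python
def iter_records(header: bytes):
    """Yield (type, payload) pairs from a type/length/payload byte stream."""
    pos = 0
    while pos < len(header):
        if pos + 2 > len(header):
            raise ValueError("truncated record header")
        rtype, length = header[pos], header[pos + 1]
        end = pos + 2 + length
        if end > len(header):
            raise ValueError("record runs past end of header")
        yield rtype, header[pos + 2:end]
        pos = end  # the length field says where the next record starts


KNOWN_TYPES = {1, 2}  # hypothetical type codes this reader understands


def known_records(header: bytes):
    """Keep recognised records; unknown ones are skipped via their length."""
    return [(t, p) for t, p in iter_records(header) if t in KNOWN_TYPES]


# Hypothetical header: two known records around one unknown record (type 99).
sample = bytes([1, 2, 0xAA, 0xBB, 99, 3, 0, 0, 0, 2, 1, 0xCC])
```

Without the length field, a reader that met type 99 would have no way to find where the following record begins; with it, the unknown record is simply stepped over.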
The RDOFF format is strongly deprecated and has been disabled starting in NASM 2.15.04. [12]