Zig (programming language)

Zig
Paradigms: Multi-paradigm: imperative, concurrent, procedural, functional
Designed by: Andrew Kelley
First appeared: 8 February 2016 [1]
Preview release: 0.11.0 [2] / 4 August 2023
Typing discipline: Static, strong, inferred, structural, generic
Memory management: Manual
Platform: x86-64, ARM64, WebAssembly. Tier 2: ARM, IA-32, RISC-V, MIPS64, POWERPC64, SPARC64; some tier-2 platforms have tier-1 support for standalone programs
OS: Cross-platform: Linux, FreeBSD, Windows
License: MIT
Filename extensions: .zig, .zir
Website: ziglang.org
Influenced by: C, C++, LLVM IR, Go, Rust, JavaScript [citation needed]

Zig is an imperative, general-purpose, statically typed, compiled system programming language designed by Andrew Kelley. [3] It is intended to be a successor to the C programming language, aiming to be even smaller and simpler to program in while also offering more functionality. [4]

The improvements in language simplicity relate to flow control, function calls, library imports, variable declaration and Unicode support. Additionally, the language does not make use of macros or preprocessor instructions. Features adopted from modern languages include the addition of compile-time generic types, allowing functions to work on a variety of data, along with a small set of new compiler directives to allow access to the information about those types using reflection.

Another set of additions to Zig is intended to improve code safety. Like C, Zig does not include garbage collection and memory handling is manual. To help eliminate the potential errors that arise in such systems, it includes option types and simple syntax for using them. A testing framework is also built into the language.

Description

Goals

The primary goal of Zig is to be a better solution to the sorts of tasks that are currently solved with C. A primary concern in that respect is readability; Zig attempts to use existing concepts and syntax wherever possible, avoiding the addition of different syntax for similar concepts. Additionally, it is designed for "robustness, optimality and maintainability", including a variety of features to improve safety, optimization and testing. The small and simple syntax is an important part of the maintenance, as it is a goal of the language to allow maintainers to debug the code without having to learn the intricacies of a language they might not be familiar with. [5] Even with these changes, Zig can compile into and against existing C code; C headers can be included in a Zig project and their functions called, and Zig code can be linked into C projects by including the compiler-built headers. [6]

In keeping with the overall design philosophy of making the code simple and easy to read, the Zig system as a whole also encompasses a number of stylistic changes compared to C and other C-like languages. For instance, the Rust language has operator overloading, which means a statement like a = b + c might actually be a function call to a type's overloaded version of the plus operator. Additionally, that function might panic, which might pre-empt any following code. In Zig, if something calls a function, it looks like a function call; if it doesn't, it doesn't look like a function call. If a call can raise an error, that is explicit in the syntax: [6] errors are represented through error types and can be handled with catch or try.
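The explicit error syntax can be illustrated with a minimal sketch. The names ParseError, parseDigit and sumDigits are hypothetical and exist only for this example; the sketch assumes a reasonably recent Zig release.

```zig
const std = @import("std");

// Hypothetical error set for this sketch.
const ParseError = error{NotADigit};

// The `!` in the return type marks that this function can fail.
fn parseDigit(c: u8) ParseError!u8 {
    if (c < '0' or c > '9') return ParseError.NotADigit;
    return c - '0';
}

// `try` returns early from sumDigits if parseDigit fails.
fn sumDigits(text: []const u8) ParseError!u32 {
    var total: u32 = 0;
    for (text) |c| {
        total += try parseDigit(c);
    }
    return total;
}

pub fn main() void {
    // `catch` supplies a fallback value at the call site.
    const ok = sumDigits("123") catch 0;
    const bad = sumDigits("12x") catch 0;
    std.debug.print("{d} {d}\n", .{ ok, bad });
}
```

Every place where control flow can leave a function because of an error is marked with try or catch, so a reader can see the failure points without knowing the called functions.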

The goals of Zig are in contrast to those of many other languages designed in the same time period, like Go, Rust, Carbon, and Nim. Generally, these languages are more complex with additional features like operator overloading, functions that masquerade as values (properties), and many other features intended to aid the construction of large programs. These sorts of features have more in common with C++'s approach, and these languages are more along the lines of that language. [6] Zig has a more conservative extension of the type system, supporting compile-time generics and accommodating a form of duck typing with the comptime directive.

Memory handling

One of the primary sources of bugs in C programs is the memory management system, based on malloc. malloc sets aside a block of memory for use in the code and returns a reference to that memory as a pointer. There is no system to ensure that memory is released when the program no longer needs it, which can lead to programs using up all available memory, a memory leak. More common is a dangling pointer that does not refer to a properly allocated memory object. [7]

A common solution to these problems is a garbage collector (GC), which examines the program for pointers to previously malloced memory and removes any blocks that no longer have anything pointing to them. Although this greatly reduces, or even eliminates, memory errors, GC systems are relatively slow compared to manual memory management, and have unpredictable performance that makes them unsuited to systems programming. Another solution is automatic reference counting (ARC), which implements the same basic concept of identifying blocks with no remaining pointers, but does so at malloc time by recording the number of pointers to that block. This means it does not need to perform an exhaustive search, but instead adds time to every malloc and release operation. [7]

Zig aims to provide performance similar to or better than C, so GC and ARC are not suitable solutions. Instead, it uses a modern, as of 2022, concept known as option types. Instead of a pointer being allowed to point to nothing, or nil, a separate type is used to indicate data that is optionally empty. This is similar to using a structure with a pointer and a boolean that indicates whether the pointer is valid, but the state of the boolean is invisibly managed by the language and does not need to be explicitly managed by the programmer. So, for instance, when the pointer is declared it is set to "unallocated", and when that pointer receives a value from a malloc, it is set to "allocated" if the malloc succeeded. [8]

The advantage to this model is that it has very low or zero overhead; the compiler has to create the code to pass along the optional type when pointers are manipulated, as opposed to a simple pointer, but this allows it to directly express possible memory problems at compile time with no runtime support. For instance, creating a pointer with a null value and then attempting to use it is perfectly acceptable in C, leading to null-pointer errors. In contrast, a language using optional types can check that all code paths only attempt to use pointers when they are valid. While this does not eliminate all potential problems, when issues do occur at runtime the error can be more precisely located and explained. [9]
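A minimal sketch of an optional pointer follows. The Node type and the find function are hypothetical, introduced only for illustration; the sketch assumes a reasonably recent Zig release.

```zig
const std = @import("std");

// Hypothetical record type for this sketch.
const Node = struct { value: i32 };

// `?*const Node` is an optional pointer: either a valid pointer or null.
fn find(nodes: []const Node, wanted: i32) ?*const Node {
    for (nodes) |*node| {
        if (node.value == wanted) return node;
    }
    return null;
}

pub fn main() void {
    const nodes = [_]Node{ .{ .value = 1 }, .{ .value = 2 } };

    // Writing `find(&nodes, 2).value` is rejected by the compiler:
    // the optional has to be unwrapped before the pointer can be used.
    if (find(&nodes, 2)) |hit| {
        std.debug.print("found {d}\n", .{hit.value});
    }

    // `orelse` supplies a fallback when the optional is null.
    const fallback = Node{ .value = -1 };
    const chosen = find(&nodes, 9) orelse &fallback;
    std.debug.print("value {d}\n", .{chosen.value});
}
```

An optional pointer occupies the same space as a plain pointer, which is one way the low overhead described above shows up in practice.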

Another change for memory management in Zig is that the actual allocation is handled through structs describing the action, as opposed to calling the memory management functions in libc. For instance, in C if one wants to write a function that makes a string containing multiple copies of another string, the function might look like this:

const char *repeat(const char *original, size_t times);

In the code, the function would examine the size of original and then malloc times that length to set aside memory for the string it will build. That malloc is invisible to the functions calling it; if they fail to later release the memory, a leak will occur. In Zig, this might be handled using a function like:

fn repeat(allocator: *std.mem.Allocator, original: []const u8, times: usize) std.mem.Allocator.Error![]const u8;

In this code, the allocator variable is passed a struct that describes what code should perform the allocation, and the repeat function returns either the resulting string or an Allocator.Error; the ! in the return type denotes an error union, which is distinct from an optional type. By directly expressing the allocator as an input, memory allocation is never "hidden" within another function; it is always exposed to the API by the function that is ultimately calling for the memory to be allocated. Functions in Zig's standard library that need memory likewise take an allocator as a parameter rather than allocating implicitly. Additionally, as the struct can point to anything, one can use alternative allocators, even ones written in the program. This can allow, for instance, small-object allocators that do not use the operating system functions that normally allocate an entire memory page. [10]
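Swapping in an alternative allocator can be sketched as follows. This version of repeat takes std.mem.Allocator by value rather than by pointer, which is an assumption about Zig 0.11 and later; the 32-byte storage buffer is an arbitrary choice for the example.

```zig
const std = @import("std");

// Same idea as the repeat function in the text; passing the allocator by
// value is an assumption about recent Zig versions.
fn repeat(
    allocator: std.mem.Allocator,
    original: []const u8,
    times: usize,
) std.mem.Allocator.Error![]u8 {
    const buffer = try allocator.alloc(u8, original.len * times);
    for (0..times) |i| {
        std.mem.copyForwards(u8, buffer[(original.len * i)..], original);
    }
    return buffer;
}

pub fn main() !void {
    // A fixed 32-byte buffer on the stack backs every allocation below,
    // so no operating system allocation function is involved.
    var storage: [32]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&storage);
    const allocator = fba.allocator();

    const text = try repeat(allocator, "ab", 3);
    defer allocator.free(text);
    std.debug.print("{s}\n", .{text});

    // Asking for more than the buffer holds fails with an explicit error.
    if (repeat(allocator, "ab", 100)) |too_big| {
        allocator.free(too_big);
    } else |err| {
        std.debug.print("allocation failed: {s}\n", .{@errorName(err)});
    }
}
```

Because repeat only sees the allocator interface, the caller decides where the memory comes from without any change to the function itself.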

Optional types are an example of a language feature that offers general functionality while still being simple and generic. They do not have to be used to solve null pointer problems; they are also useful for any type of value where "no value" is an appropriate answer. Consider a function countTheNumberOfUsers that returns an integer, and an integer variable, theCountedUsers, that holds the result. In many languages, a magic number would be placed in theCountedUsers to indicate that countTheNumberOfUsers has not yet been called, while many implementations would just set it to zero. In Zig, this could be implemented as var theCountedUsers: ?i32 = null, which sets the variable to a clear "not been called" value. [10]
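The counted-users case might look like the sketch below. The body of countTheNumberOfUsers and its return value of 42 are made up for the example.

```zig
const std = @import("std");

// Stand-in for the function named in the text; the value is made up.
fn countTheNumberOfUsers() i32 {
    return 42;
}

pub fn main() void {
    // null means "not counted yet", so 0 remains a legitimate count.
    var theCountedUsers: ?i32 = null;

    if (theCountedUsers == null) {
        std.debug.print("not counted yet\n", .{});
    }

    theCountedUsers = countTheNumberOfUsers();

    // The capture |count| is a plain i32; it only exists when a value is present.
    if (theCountedUsers) |count| {
        std.debug.print("users: {d}\n", .{count});
    }
}
```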

Another more general feature of Zig that also helps manage memory problems is the concept of defer, which marks some code to be performed when the enclosing block exits, no matter how it exits, including early returns caused by errors. If a particular function allocates some memory and then disposes of it when the operation is complete, one can add a line to defer a free to ensure it is released no matter what happens. [10]
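A minimal sketch of deferring a free is shown here. The checksum function and the EmptyInput error are hypothetical, and passing std.mem.Allocator by value assumes a recent Zig release.

```zig
const std = @import("std");

// Hypothetical helper: allocates a scratch buffer, uses it, and always frees it.
fn checksum(allocator: std.mem.Allocator, data: []const u8) !u32 {
    const scratch = try allocator.alloc(u8, data.len);
    // Runs when this function body exits, on the normal and the error path alike.
    defer allocator.free(scratch);

    if (data.len == 0) return error.EmptyInput;

    std.mem.copyForwards(u8, scratch, data);
    var sum: u32 = 0;
    for (scratch) |byte| sum += byte;
    return sum;
}

pub fn main() !void {
    const sum = try checksum(std.heap.page_allocator, "abc");
    std.debug.print("{d}\n", .{sum});
}
```

The free is written next to the allocation it pairs with, so the early return for empty input does not need its own cleanup code.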

Zig memory management avoids hidden allocations. Allocation is not managed in the language directly. Instead, heap access is done explicitly through the standard library. [11]

Direct interaction with C

Zig promotes an evolutionary approach to using the language that combines new Zig code with existing C code. To do this, it aims to make interaction with existing C libraries as seamless as possible. Zig imports its own libraries with the @import directive, typically in this fashion:

const std = @import("std");

Zig code within that file can now call functions inside std, for instance:

std.debug.print("Hello, world!\n", .{});

To work with C code, one simply replaces the @import with @cImport:

const c = @cImport(@cInclude("soundio/soundio.h"));

The Zig code can now call functions in the soundio library as if they were native Zig code. As Zig uses new data types that are explicitly defined, unlike C's more generic int and float, a small number of directives are used to move data between the C and Zig types, including @intCast and @ptrCast. [10]

Cross compiling

Zig treats cross-compiling as a first-class use-case of the language. This means any Zig compiler can compile runnable binaries for any of its target platforms, of which there are dozens. These include not only widely used modern systems like ARM and x86-64, but also PowerPC, SPARC, MIPS, RISC-V and even the IBM z/Architecture (S390). The toolchain can compile to any of these targets without installing additional software; all the needed support is in the basic system. [10]

Comptime

By using the comptime keyword, the programmer can explicitly have Zig evaluate sections of code at compile time, as opposed to runtime. Being able to run code at compile time allows Zig to have the functionality of macros and conditional compilation without the need for a separate preprocessor language. [12]
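Compile-time evaluation can be sketched as follows. The fibonacci function and the build_label constant are hypothetical names for this example; the check on builtin.mode stands in for the kind of conditional compilation that C handles with #ifdef.

```zig
const std = @import("std");
const builtin = @import("builtin");

// An ordinary function; nothing marks it as compile-time only.
fn fibonacci(n: u32) u32 {
    if (n < 2) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

// Checked during compilation; a wrong value would be a compile error.
comptime {
    std.debug.assert(fibonacci(10) == 55);
}

// Resolved at compile time from the build mode, in place of a preprocessor branch.
const build_label = if (builtin.mode == .Debug) "debug" else "release";

pub fn main() void {
    // `comptime` forces the call to be evaluated by the compiler,
    // so only the resulting constant reaches the binary.
    const fib10 = comptime fibonacci(10);
    std.debug.print("{d} ({s} build)\n", .{ fib10, build_label });
}
```

The same function can be called at compile time or at run time, which is why no separate macro language is needed.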

During compile time, types become first-class citizens. This enables compile-time duck typing, and is how Zig implements generic types. [13]

For instance, in Zig, a generic linked list type might be implemented using a function like:

fn LinkedList(comptime T: type) type;

This function takes in some type T, and returns a custom struct defining a linked list with that data type.
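The same pattern can be shown in a smaller form. Pair and firstOf are hypothetical names chosen for this sketch; firstOf uses anytype to illustrate the compile-time duck typing mentioned above.

```zig
const std = @import("std");

// A function from a type to a type: each distinct T yields a distinct struct.
fn Pair(comptime T: type) type {
    return struct {
        first: T,
        second: T,

        const Self = @This();

        pub fn swapped(self: Self) Self {
            return .{ .first = self.second, .second = self.first };
        }
    };
}

// Compile-time duck typing: any argument with a `first` field is accepted,
// and anything else is rejected when the call is compiled.
fn firstOf(pair: anytype) @TypeOf(pair.first) {
    return pair.first;
}

pub fn main() void {
    const p = Pair(u8){ .first = 1, .second = 2 };
    const q = p.swapped();
    std.debug.print("{d} {d}\n", .{ firstOf(q), q.second });
}
```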

Origin of the name

The name 'Zig' was reportedly chosen through a process involving a Python script that randomly combined letters, starting with the letter 'Z' and followed by a vowel or 'Y', in order to generate four-letter words. Despite the intended length, 'Zig', a three-letter word, was ultimately selected from the various combinations produced by the script. [14]

Other features

Zig supports compile time generics, reflection and evaluation, cross-compiling, and manual memory management. [15] A major goal of the language is to improve on the C language, [12] [16] while also taking inspiration from Rust, [17] [6] among others. Zig has many features for low-level programming, notably packed structs (structs without padding between fields), arbitrary-width integers [18] and multiple pointer types. [13]
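The low-level features listed above can be sketched briefly. The Header type and its field names are hypothetical and chosen only to make the bit widths visible.

```zig
const std = @import("std");

// Hypothetical 8-bit header: a 3-bit and a 5-bit field with no padding between them.
const Header = packed struct {
    kind: u3, // arbitrary-width integer holding 0..7
    length: u5, // arbitrary-width integer holding 0..31
};

pub fn main() void {
    const h = Header{ .kind = 5, .length = 17 };
    std.debug.print("bits: {d}, max kind: {d}, length: {d}\n", .{
        @bitSizeOf(Header),
        std.math.maxInt(u3),
        h.length,
    });

    // Three of the pointer types: single-item, many-item, and slice.
    const bytes = [_]u8{ 10, 20, 30 };
    const one: *const u8 = &bytes[0];
    const many: [*]const u8 = &bytes;
    const slice: []const u8 = &bytes;
    std.debug.print("{d} {d} {d}\n", .{ one.*, many[1], slice.len });
}
```

A slice carries its length, while a many-item pointer does not, so the choice of pointer type documents what the code is allowed to assume about the memory behind it.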

Zig is not just a new language: it also includes a C/C++ compiler, and can be used with either or both languages.

Versions

Since version 0.10, the new default Zig compiler is written in Zig itself, i.e., it is a self-hosting compiler; this is a major new feature of that release. The older legacy bootstrapping compiler, written in C++, remained an option in 0.10 but was slated for removal in version 0.11. The new compiler uses much less memory and compiles a bit faster; the legacy C++-based compiler uses 3.5x more memory.

Zig's default backend for optimization is still LLVM, [19] and LLVM is written in C++. The Zig compiler is 169 MiB with LLVM and 4.4 MiB without it. The new Zig-based compiler usually produces faster executable code, since its LLVM code generation is better, and it fixes many bugs; version 0.10 also includes improvements to the older legacy compiler. The self-hosted linker is tightly coupled with the self-hosted compiler. Version 0.10 also adds some experimental (tier-3) support for AMD GPUs, along with lesser support for Nvidia GPUs and for PlayStation 4 and 5.

The older bootstrapping ("stage1") compiler is written in Zig and C++, using LLVM 13 as a back-end, [20] [21] supporting many of its native targets. [22] The compiler is free and open-source software released under an MIT License. [23] The Zig compiler exposes the ability to compile C and C++ similarly to Clang with the commands zig cc and zig c++, [24] providing many headers including the C standard library (libc) and C++ Standard Library (libcxx) for many different platforms, allowing Zig's cc and c++ sub-commands to act as cross compilers out of the box. [25] [26]

In addition to the officially supported and documented operating systems (mostly desktop ones), minimal applications can be, and have been, made for Android (with the Android NDK), and programming for iOS is also possible.

Zig does not have its own official package manager (unofficial ones exist), but a standard one is planned for the 0.12 milestone.

Zig development is funded by the Zig Software Foundation (ZSF), a non-profit corporation with Andrew Kelley as president, which accepts donations and hires multiple full-time employees. [27] [28] [29]

Examples

Hello World

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    try stdout.print("Hello, {s}!\n", .{"world"});
}

Generic linked list

const std = @import("std");
const stdout = std.io.getStdOut().writer();

fn LinkedList(comptime T: type) type {
    return struct {
        const Self = @This();

        pub const Node = struct {
            next: ?*Node = null,
            data: T,
        };

        first: ?*Node = null,

        pub fn prepend(
            list: *Self,
            new_node: *Node,
        ) void {
            new_node.next = list.first;
            list.first = new_node;
        }

        pub fn format(
            list: Self,
            comptime fmt: []const u8,
            options: std.fmt.FormatOptions,
            out_stream: anytype,
        ) !void {
            try out_stream.writeAll("( ");

            var it = list.first;
            while (it) |node| : (it = node.next) {
                try std.fmt.formatType(
                    node.data,
                    fmt,
                    options,
                    out_stream,
                    1,
                );

                try out_stream.writeAll(" ");
            }

            try out_stream.writeAll(")");
        }
    };
}

pub fn main() !void {
    const ListU32 = LinkedList(u32);
    var list = ListU32{};
    var node1 = ListU32.Node{ .data = 1 };
    var node2 = ListU32.Node{ .data = 2 };
    var node3 = ListU32.Node{ .data = 3 };
    list.prepend(&node1);
    list.prepend(&node2);
    list.prepend(&node3);
    try stdout.print("{}\n", .{list});
    try stdout.print("{b}\n", .{list});
}

output

( 3 2 1 )
( 11 10 1 )

String repetition with allocator

const std = @import("std");

fn repeat(
    allocator: *std.mem.Allocator,
    original: []const u8,
    times: usize,
) std.mem.Allocator.Error![]const u8 {
    var buffer = try allocator.alloc(
        u8,
        original.len * times,
    );

    for (0..times) |i| {
        std.mem.copyForwards(
            u8,
            buffer[(original.len * i)..],
            original,
        );
    }

    return buffer;
}

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    var arena = std.heap.ArenaAllocator.init(
        std.heap.page_allocator,
    );
    defer arena.deinit();

    var allocator = arena.allocator();

    const original = "Hello ";
    const repeated = try repeat(
        &allocator,
        original,
        3,
    );

    // Prints "Hello Hello Hello "
    try stdout.print("{s}\n", .{repeated});
}

output

Hello Hello Hello 

References

Citations

  1. Kelley, Andrew. "Introduction to the Zig Programming Language". andrewkelley.me. Retrieved 8 November 2020.
  2. "Release 0.11.0".
  3. "Taking the warts off C, with Andrew Kelley, creator of the Zig programming language". Sourcegraph. 2021-10-19. Retrieved 2024-04-18.
  4. "Zig has all the elegant simplicity of C, minus all the ways to shoot yourself in the foot". JAXenter. 2017-10-31. Archived from the original on 2017-11-01. Retrieved 2020-02-11.
  5. Elizabeth 2017.
  6. Yegulalp 2016.
  7. "ARC vs. GC". Elements.
  8. "Guide To Java 8 Optional". 28 November 2022.
  9. "Rust: Memory Management".
  10. "Allocators". 11 September 2023.
  11. Tyson, Matthew (9 March 2023). "Meet Zig: The modern alternative to C". InfoWorld.com.
  12. The Road to Zig 1.0 - Andrew Kelley. ChariotSolutions. 2019-05-09 via YouTube.
  13. "Documentation". Ziglang.org. Retrieved 2020-04-24.
  14. andrewrk (2024-03-13). "origin of the zig programming language name. by @andrewrk" . Retrieved 2024-03-13.
  15. "The Zig Programming Language". Ziglang.org. Retrieved 2020-02-11.
  16. "The Zig Programming Language". Ziglang.org. Retrieved 2020-02-11.
  17. Company, Sudo Null. "Sudo Null - IT News for you". SudoNull. Retrieved 2020-02-11.
  18. Anderson, Tim (24 April 2020). "Keen to go _ExtInt? LLVM Clang compiler adds support for custom width integers". www.theregister.co.uk. Retrieved 2020-04-24.
  19. New LLVM version 15, Zig legacy uses version 13
  20. "A Reply to _The Road to Zig 1.0_". www.gingerbill.org. 2019-05-13. Retrieved 2020-02-11.
  21. "ziglang/zig". GitHub. Zig Programming Language. 2020-02-11. Retrieved 2020-02-11.
  22. "The Zig Programming Language". Ziglang.org. Retrieved 2020-02-11.
  23. "ziglang/zig". GitHub. Retrieved 2020-02-11.
  24. "0.6.0 Release Notes". Ziglang.org. Retrieved 2020-04-19.
  25. "'zig cc': a Powerful Drop-In Replacement for GCC/Clang - Andrew Kelley". andrewkelley.me. Retrieved 2021-05-28.
  26. "Zig Makes Go Cross Compilation Just Work". DEV Community. 24 January 2021. Retrieved 2021-05-28.
  27. "Jakub Konka on Twitter". Twitter. Archived from the original on 2022-04-10. Retrieved 2021-05-28.
  28. "Announcing the Zig Software Foundation". Ziglang.org. Retrieved 2021-05-28.
  29. "Sponsor ZSF". Ziglang.org. Retrieved 2021-05-28.

Bibliography