Setjmp.h

setjmp.h is a header defined in the C standard library to provide "non-local jumps": control flow that deviates from the usual subroutine call and return sequence. The complementary functions setjmp and longjmp provide this functionality.

A typical use of setjmp/longjmp is implementation of an exception mechanism that exploits the ability of longjmp to reestablish program or thread state, even across multiple levels of function calls. A less common use of setjmp is to create syntax similar to coroutines.

Member functions

int setjmp(jmp_buf env)
Sets up the local jmp_buf buffer and initializes it for the jump. This routine [1] saves the program's calling environment in the environment buffer specified by the env argument for later use by longjmp. If the return is from a direct invocation, setjmp returns 0. If the return is from a call to longjmp, setjmp returns a nonzero value.
void longjmp(jmp_buf env, int value)
Restores the context of the environment buffer env that was saved by invocation of the setjmp routine [1] in the same invocation of the program. Invoking longjmp from a nested signal handler is undefined. The value specified by value is passed from longjmp to setjmp. After longjmp is completed, program execution continues as if the corresponding invocation of setjmp had just returned. If the value passed to longjmp is 0, setjmp will behave as if it had returned 1; otherwise, it will behave as if it had returned value.

setjmp saves the current environment (the program state), at some point of program execution, into a platform-specific data structure (jmp_buf) that longjmp can use at some later point to restore the program state to the one setjmp saved. This process can be imagined as a "jump" back to the point of program execution where setjmp saved the environment. The (apparent) return value from setjmp indicates whether control reached that point normally (zero) or from a call to longjmp (nonzero). This leads to a common idiom: if (setjmp(x)) { /* handle longjmp(x) */ }.

POSIX.1 does not specify whether setjmp and longjmp save and restore the current set of blocked signals; if a program employs signal handling it should use POSIX's sigsetjmp/siglongjmp.

Member types

jmp_buf
An array type, such as struct __jmp_buf_tag[1] [2], suitable for holding the information needed to restore a calling environment.

The C99 Rationale describes jmp_buf as being an array type for backward compatibility; existing code refers to jmp_buf storage locations by name (without the & address-of operator), which is only possible for array types. [3]

Caveats and limitations

When a "non-local goto" is executed via setjmp/longjmp in C++, normal "stack unwinding" does not occur. Therefore, any required cleanup actions will not occur either. This could include closing file descriptors, flushing buffers, or freeing heap-allocated memory.

If the function in which setjmp was called returns, it is no longer possible to safely use longjmp with the corresponding jmp_buf object. This is because the stack frame is invalidated when the function returns. Calling longjmp restores the stack pointer, which—because the function returned—would point to a non-existent and potentially overwritten or corrupted stack frame. [4] [5]

Similarly, C99 does not require that longjmp preserve the current stack frame. This means that jumping into a function which was exited via a call to longjmp is undefined. [6]

Example usage

Simple example

The example below shows the basic idea of setjmp. There, main() calls first(), which in turn calls second(). Then, second() jumps back into main(), skipping first()'s call of printf().

    #include <stdio.h>
    #include <setjmp.h>

    static jmp_buf buf;

    void second() {
        printf("second\n");     // prints
        longjmp(buf, 1);        // jumps back to where setjmp was called - making setjmp now return 1
    }

    void first() {
        second();
        printf("first\n");      // does not print
    }

    int main() {
        if (!setjmp(buf))
            first();            // when executed, setjmp returned 0
        else                    // when longjmp jumps back, setjmp returns 1
            printf("main\n");   // prints

        return 0;
    }

When executed, the above program will output:

    second
    main

Notice that although the first() subroutine gets called, "first" is never printed. "main" gets printed as the conditional statement if (!setjmp(buf)) is executed a second time.

Exception handling

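In this example, setjmp is used to bracket exception handling, like try in some other languages. The call to longjmp is analogous to a throw statement, allowing an exception to return an error status directly to the setjmp. The following code adheres to the 1999 ISO C standard and Single UNIX Specification by invoking setjmp only in a limited range of contexts: as the entire controlling expression of an if, switch, or iteration statement; as such a controlling expression combined with a single ! operator or a comparison against an integer constant; or as an expression statement whose return value is unused. [7]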

Following these rules can make it easier for the implementation to create the environment buffer, which can be a sensitive operation. [3] More general use of setjmp can cause undefined behaviour, such as corruption of local variables; conforming compilers and environments are not required to protect or even warn against such usage. However, slightly more sophisticated idioms such as switch ((exception_type = setjmp(env))) { } are common in literature and practice, and remain relatively portable. A simple conforming methodology is presented below, where an additional variable is maintained along with the state buffer. This variable could be elaborated into a structure incorporating the buffer itself.

In a more modern-looking example, the usual "try" block would be implemented as a setjmp (with some preparation code for multilevel jumps, as seen in first), the "throw" as longjmp with the optional parameter as the exception, and the "catch" as the "else" block under "try".

    #include <setjmp.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void first();
    static void second();

    /* Use a file scoped static variable for the exception stack so we can access
     * it anywhere within this translation unit. */
    static jmp_buf exception_env;
    static int exception_type;

    int main(void) {
        char *volatile mem_buffer = NULL;

        if (setjmp(exception_env)) {
            // if we get here there was an exception
            printf("first failed, exception type: %d\n", exception_type);
        } else {
            // Run code that may signal failure via longjmp.
            puts("calling first");
            first();

            mem_buffer = malloc(300); // allocate a resource
            printf("%s\n", strcpy(mem_buffer, "first succeeded")); // not reached
        }

        free(mem_buffer); // NULL can be passed to free, no operation is performed

        return 0;
    }

    static void first() {
        jmp_buf my_env;

        puts("entering first"); // reached

        memcpy(my_env, exception_env, sizeof my_env);

        switch (setjmp(exception_env)) {
            case 3: // if we get here there was an exception.
                puts("second failed, exception type: 3; remapping to type 1");
                exception_type = 1;

            default: // fall through
                memcpy(exception_env, my_env, sizeof exception_env); // restore exception stack
                longjmp(exception_env, exception_type); // continue handling the exception

            case 0: // normal, desired operation
                puts("calling second"); // reached
                second();
                puts("second succeeded"); // not reached
        }

        memcpy(exception_env, my_env, sizeof exception_env); // restore exception stack

        puts("leaving first"); // never reached
    }

    static void second() {
        puts("entering second"); // reached

        exception_type = 3;
        longjmp(exception_env, exception_type); // declare that the program has failed

        puts("leaving second"); // not reached
    }

This program's output is:

    calling first
    entering first
    calling second
    entering second
    second failed, exception type: 3; remapping to type 1
    first failed, exception type: 1

Cooperative multitasking

C99 provides that longjmp is guaranteed to work only when the destination is a calling function, i.e., that the destination scope is guaranteed to be intact. Jumping to a function that has already terminated by return or longjmp is undefined. [6] However, most implementations of longjmp do not specifically destroy local variables when performing the jump. Since the context survives until its local variables are overwritten, it can in practice still be restored by a later longjmp. In many environments (such as Really Simple Threads and TinyTimbers), idioms such as if (!setjmp(child_env)) longjmp(caller_env, 1); can allow a called function to effectively pause and resume at a setjmp.

This is exploited by thread libraries to provide cooperative multitasking facilities without using setcontext or other fiber facilities.

Considering that a longjmp into a child function will generally work unless sabotaged, and that setcontext, being part of POSIX rather than ISO C, is not required to be provided by C implementations, this mechanism may be portable where the setcontext alternative fails.

Since no exception will be generated upon overflow of one of the multiple stacks in such a mechanism, it is essential to overestimate the space required for each context, including the one containing main() and including space for any signal handlers that might interrupt regular execution. Exceeding the allocated space will corrupt the other contexts, usually with the outermost functions first. Unfortunately, systems requiring this kind of programming strategy are often also small ones with limited resources.

    #include <setjmp.h>
    #include <stdio.h>

    jmp_buf mainTask, childTask;

    void call_with_cushion();
    void child();

    int main() {
        if (!setjmp(mainTask)) {
            call_with_cushion(); // child never returns, yield
        } // execution resumes after this "}" after first time that child yields

        while (1) {
            printf("Parent\n");

            if (!setjmp(mainTask))
                longjmp(childTask, 1); // yield - note that this is undefined under C99
        }
    }

    void call_with_cushion() {
        char space[1000]; // Reserve enough space for main to run
        space[999] = 1;   // Do not optimize array out of existence
        child();
    }

    void child() {
        while (1) {
            printf("Child loop begin\n");

            if (!setjmp(childTask))
                longjmp(mainTask, 1); // yield - invalidates childTask in C99

            printf("Child loop end\n");

            if (!setjmp(childTask))
                longjmp(mainTask, 1); // yield - invalidates childTask in C99
        }

        /* Don't return. Instead we should set a flag to indicate that main()
           should stop yielding to us and then longjmp(mainTask, 1) */
    }

References

  1. ISO C states that setjmp must be implemented as a macro, but POSIX explicitly states that it is undefined whether setjmp is a macro or a function.
  2. This is the type used by the GNU C Library, version 2.7
  3. C99 Rationale, version 5.10, April 2003, section 7.13
  4. CS360 Lecture Notes — Setjmp and Longjmp
  5. setjmp(3) Archived 2009-07-26 at the Wayback Machine
  6. ISO/IEC 9899:1999, 2005, 7.13.2.1:2 and footnote 211
  7. setjmp: set jump point for a non-local goto. System Interfaces Reference, The Single UNIX Specification, Version 4, from The Open Group
