Getaddrinfo

In C programming, the functions getaddrinfo() and getnameinfo() convert domain names, hostnames, and IP addresses between human-readable text representations and structured binary formats for the operating system's networking API. Both functions are contained in the POSIX standard application programming interface (API). [1]

getaddrinfo and getnameinfo are inverse functions of each other. They are network-protocol agnostic and support both IPv4 and IPv6. They are the recommended interface for name resolution when building protocol-independent applications and when transitioning legacy IPv4 code to the IPv6 Internet.

Internally, the functions may resolve names through the Domain Name System (DNS); some implementations do so by calling other, lower-level functions, such as gethostbyname().

On February 16, 2016, a security bug (CVE-2015-7547) was announced in the glibc implementation of getaddrinfo(): a stack-based buffer overflow that may allow an attacker to execute arbitrary code. [2]

struct addrinfo

The C data structure used to represent addresses and hostnames within the networking API is the following:

struct addrinfo {
    int              ai_flags;
    int              ai_family;
    int              ai_socktype;
    int              ai_protocol;
    socklen_t        ai_addrlen;
    struct sockaddr* ai_addr;
    char*            ai_canonname; /* canonical name */
    struct addrinfo* ai_next;      /* this struct can form a linked list */
};

In some older systems the type of ai_addrlen is size_t instead of socklen_t. Most socket functions, such as accept() and getpeername(), require the parameter to have type socklen_t *, and programmers often pass the address of the ai_addrlen element of the addrinfo structure. If the types are incompatible, e.g., on a 64-bit Solaris 9 system where size_t is 8 bytes and socklen_t is 4 bytes, then run-time errors may result.

The structure contains an ai_family field, and the sockaddr structure that ai_addr points to has its own sa_family field. In some implementations, getaddrinfo sets both to the same value when it creates the structure.
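
As an illustration, the list can be walked through ai_next while comparing the two family fields of each node. This is only a sketch; count_consistent is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: walks an addrinfo list, stores the number of nodes
 * in *total, and returns how many nodes have ai_family equal to the
 * sa_family stored in the sockaddr that ai_addr points to. */
static int count_consistent(const struct addrinfo *list, int *total)
{
    assert(total != NULL);

    int consistent = 0;
    *total = 0;
    for (const struct addrinfo *p = list; p != NULL; p = p->ai_next) {
        (*total)++;
        if (p->ai_addr != NULL && p->ai_addr->sa_family == p->ai_family)
            consistent++;
    }
    return consistent;
}
```

A caller would pass the list returned by getaddrinfo() and compare the return value with *total to see whether every node is consistent on the platform at hand.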

getaddrinfo()

getaddrinfo() converts human-readable text strings representing hostnames or IP addresses into a dynamically allocated linked list of struct addrinfo structures. The function prototype for this function is specified as follows:

int getaddrinfo(const char* hostname,
                const char* service,
                const struct addrinfo* hints,
                struct addrinfo** res);
hostname
can be either a domain name, such as "example.com", an address string, such as "127.0.0.1", or NULL, in which case the address 0.0.0.0 or 127.0.0.1 is assigned depending on the hints flags.
service
can be a port number passed as string, such as "80", or a service name, e.g. "echo". In the latter case a typical implementation uses getservbyname() to query the file /etc/services to resolve the service to a port number.
hints
can be either NULL or a pointer to an addrinfo structure that specifies the type of service requested.
res
is the address of a pointer that, after successful completion of the function, points to a newly allocated addrinfo structure containing the requested information. [3]

The function returns 0 upon success and a non-zero error value if it fails. [1]
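
The parameters above can be put together as in the following sketch. It assumes an IPv4 numeric address and a numeric port so that no DNS lookup takes place; the helper name numeric_port is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: resolves a numeric IPv4 address string and a port
 * string, returning the port in host byte order, or -1 on failure. */
static int numeric_port(const char *host, const char *service)
{
    assert(service != NULL);

    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);      /* unused fields must be zero */
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_NUMERICHOST;   /* host must be an address string */

    int err = getaddrinfo(host, service, &hints, &res);
    if (err != 0) {
        /* non-zero return values are decoded with gai_strerror() */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return -1;
    }

    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    int port = ntohs(sin->sin_port);      /* stored in network byte order */

    freeaddrinfo(res);
    return port;
}
```

With these hints, a call such as numeric_port("127.0.0.1", "80") succeeds without touching the network, while a string that is not a numeric address makes getaddrinfo() fail and the helper return -1.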

Although implementations vary among platforms, the function typically first attempts to obtain a port number by branching on service. If the string is a number, the function converts it to an integer and calls htons(). If it is a service name, such as www, the service is looked up with getservbyname(), using the protocol derived from hints->ai_socktype as the second parameter to that function. Then, if hostname is given (not NULL), a call to gethostbyname() resolves it; otherwise the address 0.0.0.0 is used if hints->ai_flags includes AI_PASSIVE, and 127.0.0.1 if it does not. In either case the function allocates a new addrinfo structure, fills it with the appropriate sockaddr_in, and adds the port obtained at the beginning. Finally, the res parameter is dereferenced to make it point to the newly allocated addrinfo structure. [4] In some implementations, such as the Unix version for Mac OS, hints->ai_protocol overrides the hints->ai_socktype value, while in others the opposite holds, so both need to be set to equivalent values for the code to work across multiple platforms.
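
The NULL-hostname behaviour and the advice to set ai_socktype and ai_protocol consistently can be sketched as follows. The helper name null_host_address is hypothetical, and the sketch is restricted to IPv4 for brevity.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: calls getaddrinfo() with a NULL hostname and writes
 * the resulting IPv4 address as text into buf. Returns 0 on success or a
 * getaddrinfo() error code on failure. */
static int null_host_address(int flags, char *buf, size_t buflen)
{
    assert(buf != NULL);

    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_protocol = IPPROTO_TCP;  /* kept equivalent to ai_socktype */
    hints.ai_flags    = flags;        /* AI_PASSIVE or 0 */

    int err = getaddrinfo(NULL, "80", &hints, &res);
    if (err != 0)
        return err;

    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    const char *text = inet_ntop(AF_INET, &sin->sin_addr, buf,
                                 (socklen_t)buflen);
    freeaddrinfo(res);
    return text != NULL ? 0 : EAI_SYSTEM;
}
```

Called with AI_PASSIVE, the helper is expected to produce the wildcard address 0.0.0.0, which suits a socket passed to bind(); called with 0, it is expected to produce the loopback address 127.0.0.1.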

freeaddrinfo()

This function frees the memory allocated by function getaddrinfo(). As the result of the latter is a linked list of addrinfo structures starting at the address ai, freeaddrinfo() loops through the list and frees each one in turn.

void freeaddrinfo(struct addrinfo *ai); 
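
The loop described above might look roughly like the sketch below. It is not the code of any particular C library: sketch_free_list and make_node are hypothetical names, and the sketch must only be applied to lists built by make_node(), never to a list returned by getaddrinfo(), which has to be released with freeaddrinfo().

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical allocator for the sketch: one zeroed node linked to next. */
static struct addrinfo *make_node(struct addrinfo *next)
{
    struct addrinfo *node = calloc(1, sizeof *node);
    if (node != NULL)
        node->ai_next = next;
    return node;
}

/* Hypothetical stand-in for the internal loop: frees every node of a list
 * built with make_node() and returns the number of nodes released. */
static int sketch_free_list(struct addrinfo *ai)
{
    int freed = 0;
    while (ai != NULL) {
        struct addrinfo *next = ai->ai_next;  /* read before freeing */
        free(ai->ai_canonname);               /* free(NULL) is a no-op */
        free(ai->ai_addr);
        free(ai);
        ai = next;
        freed++;
    }
    return freed;
}
```

The important detail is that ai_next is saved before the node is freed; reading it afterwards would access released memory.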

getnameinfo()

The function getnameinfo() converts the internal binary representation of an IP address in the form of a pointer to a struct sockaddr into text strings consisting of the hostname or, if the address cannot be resolved into a name, a textual IP address representation, as well as the service port name or number. The function prototype is specified as follows:

int getnameinfo(const struct sockaddr* sa, socklen_t salen,
                char* host, socklen_t hostlen,
                char* serv, socklen_t servlen,
                int flags);
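
A minimal sketch of a call follows. It passes NI_NUMERICHOST and NI_NUMERICSERV so that the address and port are formatted as numbers without any lookup; the helper name format_endpoint and the buffer sizes are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: formats a socket address as "host:port" using
 * numeric forms only. Returns 0 on success or a getnameinfo() error code. */
static int format_endpoint(const struct sockaddr *sa, socklen_t salen,
                           char *out, size_t outlen)
{
    assert(out != NULL);

    char host[64];   /* large enough for a textual IPv4 or IPv6 address */
    char serv[16];   /* large enough for a decimal port number */

    int err = getnameinfo(sa, salen, host, sizeof host, serv, sizeof serv,
                          NI_NUMERICHOST | NI_NUMERICSERV);
    if (err != 0)
        return err;

    snprintf(out, outlen, "%s:%s", host, serv);
    return 0;
}
```

Dropping the two flags makes getnameinfo() attempt a reverse lookup of the address and a service-name lookup of the port instead.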

Example

The following example uses getaddrinfo() to resolve the domain name www.example.com into its list of addresses and then calls getnameinfo() on each result to return the canonical name for the address. In general, this produces the original hostname, unless the particular address has multiple names, in which case the canonical name is returned. In this example, the domain name is printed three times, once for each of the three results obtained; the number of results can differ between systems and over time.

#include <stdio.h>
#include <stdlib.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef   NI_MAXHOST
#define   NI_MAXHOST 1025
#endif

int main(void)
{
    struct addrinfo* result;
    struct addrinfo* res;
    int error;

    /* resolve the domain name into a list of addresses */
    error = getaddrinfo("www.example.com", NULL, NULL, &result);
    if (error != 0) {
        if (error == EAI_SYSTEM) {
            perror("getaddrinfo");
        } else {
            fprintf(stderr, "error in getaddrinfo: %s\n", gai_strerror(error));
        }
        exit(EXIT_FAILURE);
    }

    /* loop over all returned results and do inverse lookup */
    for (res = result; res != NULL; res = res->ai_next) {
        char hostname[NI_MAXHOST];
        error = getnameinfo(res->ai_addr, res->ai_addrlen, hostname, NI_MAXHOST, NULL, 0, 0);
        if (error != 0) {
            fprintf(stderr, "error in getnameinfo: %s\n", gai_strerror(error));
            continue;
        }
        if (*hostname != '\0')
            printf("hostname: %s\n", hostname);
    }

    freeaddrinfo(result);
    return 0;
}

References

  1. "freeaddrinfo, getaddrinfo - get address information". The Open Group Base Specifications Issue 7, 2018 edition (POSIX.1-2017). The Open Group. Retrieved 2022-03-05.
  2. "CVE-2015-7547: Glibc getaddrinfo stack-based buffer overflow".
  3. Stevens, R.; Fenner; Rudoff (2003). UNIX® Network Programming Volume 1, Third Edition: The Sockets Networking API. Addison-Wesley Professional. November 14, 2003. p. 256.
  4. Umemoto, Hajimu (2000). getaddrinfo.c. Accessed from: https://opensource.apple.com/source/passwordserver_sasl/passwordserver_sasl-14/cyrus_sasl/lib/getaddrinfo.c