Stride of an array

Last updated

In computer programming, the stride of an array (also referred to as increment, pitch or step size) is the number of locations in memory between beginnings of successive array elements, measured in bytes or in units of the size of the array's elements. The stride cannot be smaller than the element size but can be larger, indicating extra space between elements.

Computer programming Process that leads from an original formulation of a computing problem to executable computer programs

Computer programming is the process of designing and building an executable computer program for accomplishing a specific computing task. Programming involves tasks such as: analysis, generating algorithms, profiling algorithms' accuracy and resource consumption, and the implementation of algorithms in a chosen programming language. The source code of a program is written in one or more languages that are intelligible to programmers, rather than machine code, which is directly executed by the central processing unit. The purpose of programming is to find a sequence of instructions that will automate the performance of a task on a computer, often for solving a given problem. The process of programming thus often requires expertise in several different subjects, including knowledge of the application domain, specialized algorithms, and formal logic.

In computer science, an array data structure, or simply an array, is a data structure consisting of a collection of elements, each identified by at least one array index or key. An array is stored such that the position of each element can be computed from its index tuple by a mathematical formula. The simplest type of data structure is a linear array, also called one-dimensional array.

The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit of memory in many computer architectures.

Contents

An array with stride of exactly the same size as the size of each of its elements is contiguous in memory. Such arrays are sometimes said to have unit stride. Unit stride arrays are sometimes more efficient than non-unit stride arrays, but non-unit stride arrays can be more efficient for 2D or multi-dimensional arrays, depending on the effects of caching and the access patterns used [ citation needed ]. This can be attributed to the principle of locality, specifically spatial locality.

A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost to access data from the main memory. A cache is a smaller, faster memory, closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have different independent caches, including instruction and data caches, where the data cache is usually organized as a hierarchy of more cache levels.

In computing, a memory access pattern or IO access pattern is the pattern with which a system or program reads and writes memory on secondary storage. These patterns differ in the level of locality of reference and drastically affect cache performance, and also have implications for the approach to parallelism and distribution of workload in shared memory systems. Further, cache coherency issues can affect multiprocessor performance, which means that certain memory access patterns place a ceiling on parallelism.

Reasons for non-unit stride

Arrays may have a stride larger than their elements' width in bytes in at least three cases:

Padding

Many languages (including C and C++) allow structures to be padded to better take advantage either of the word length and/or cache line size of the machine. For example:

C (programming language) general-purpose programming language

C is a general-purpose, procedural computer programming language supporting structured programming, lexical variable scope, and recursion, while a static type system prevents unintended operations. By design, C provides constructs that map efficiently to typical machine instructions and has found lasting use in applications previously coded in assembly language. Such applications include operating systems and various application software for computers, from supercomputers to embedded systems.

C++ General-purpose programming language

C++ is a general-purpose programming language created by Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significantly over time, and modern C++ has object-oriented, generic, and functional features in addition to facilities for low-level memory manipulation. It is almost always implemented as a compiled language, and many vendors provide C++ compilers, including the Free Software Foundation, LLVM, Microsoft, Intel, and IBM, so it is available on many platforms.

Data structure alignment refers to the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing.

structA{inta;charb;};structAmyArray[100];

In the above code snippet, myArray might well turn out to have a stride of eight bytes, rather than five (4 bytes for the int plus one for the char), if the C code were compiled for a 32-bit architecture, and the compiler had optimized (as is usually the case) for minimum processing time rather than minimum memory usage.

In computer architecture, 32-bit integers, memory addresses, or other data units are those that are 32 bits wide. Also, 32-bit CPU and ALU architectures are those that are based on registers, address buses, or data buses of that size. 32-bit microcomputers are computers in which 32-bit microprocessors are the norm.

Overlapping parallel arrays

Some languages allow arrays of structures to be treated as overlapping parallel arrays with non-unit stride:

In computing, a group of parallel arrays is a form of implicit data structure that uses multiple arrays to represent a singular array of records. It keeps a separate, homogeneous data array for each field of the record, each having the same number of elements. Then, objects located at the same index in each array are implicitly the fields of a single record. Pointers from one object to another are replaced by array indices. This contrasts with the normal approach of storing all fields of each record together in memory. For example, one might declare an array of 100 names, each a string, and 100 ages, each an integer, associating each name with the age that has the same index.

#include<stdio.h>structMyRecord{intvalue;char*text;};/*    Print the contents of an array of ints with the given stride.    Note that size_t is the correct type, as int can overflow.*/voidprint_some_ints(constint*arr,intlength,size_tstride){inti;printf("Address\t\tValue\n");for(i=0;i<length;++i){printf("%p\t%d\n",arr,arr[0]);arr=(int*)((unsignedchar*)arr+stride);}}intmain(void){intints[100]={0};structMyRecordrecords[100]={0};print_some_ints(&ints[0],100,sizeofints[0]);print_some_ints(&records[0].value,100,sizeofrecords[0]);return0;}

This idiom is a form of type punning.

Array cross-section

Some languages like PL/I allow what is known as an array cross-section, which select certain columns or rows from a larger array. [1] :p.262 For example, if a two-dimensional array is declared as

PL/I is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. It has been used by academic, commercial and industrial organizations since it was introduced in the 1960s, and is still used.

  declare some_array (12,2)fixed; 

an array of one dimension consisting only of the second column may be referenced as

  some_array(*,2) 

Example of multidimensional array with non-unit stride

Non-unit stride is particularly useful for images. It allows for creating subimages without copying the pixel data. Java example:

publicclassGrayscaleImage{privatefinalintwidth,height,widthStride;/** Pixel data. Pixel in single row are always considered contiguous in this example. */privatefinalbyte[]pixels;/** Offset of the first pixel within pixels */privatefinalintoffset;/** Constructor for contiguous data */publicImage(intwidth,intheight,byte[]pixels){this.width=width;this.height=height;this.pixels=pixels;this.offset=0;this.widthStride=width;}/** Subsection constructor */publicImage(intwidth,intheight,byte[]pixels,intoffset,intwidthStride){this.width=width;this.height=height;this.pixels=pixels;this.offset=offset;this.widthStride=widthStride;}/** Returns a subregion of this Image as a new Image. This and the new image share        the pixels, so changes to the returned image will be reflected in this image. */publicImagecrop(intx1,inty1,intx2,inty2){returnnewImage(x2-x1,y2-y1,pixels,offset+y1*widthStride+x1,widthStride);}/** Returns pixel value at specified coordinate */publicbytegetPixelAt(intx,inty){returnpixels[offset+y*widthStride+x];}}

Related Research Articles

In computer programming, lazy initialization is the tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed. It is a kind of lazy evaluation that refers specifically to the instantiation of objects or other resources.

The BMP file format, also known as bitmap image file or device independent bitmap (DIB) file format or simply a bitmap, is a raster graphics image file format used to store bitmap digital images, independently of the display device, especially on Microsoft Windows and OS/2 operating systems.

The syntax of the C programming language, the rules governing writing of software in the language, is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

Pointer (computer programming) programming language data type

In computer science, a pointer is a programming language object that stores the memory address of another value located in computer memory. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

Foreach loop

For each loop is a control flow statement for traversing items in a collection. Foreach is usually used in place of a standard for loop statement. Unlike other for loop constructs, however, foreach loops usually maintain no explicit counter: they essentially say "do this to everything in this set", rather than "do this x times". This avoids potential off-by-one errors and makes code simpler to read. In object-oriented languages an iterator, even if implicit, is often used as the means of traversal.

In computing, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated with all aliased names, which may not be expected by the programmer. As a result, aliasing makes it particularly difficult to understand, analyze and optimize programs. Aliasing analysers intend to make and compute useful information for understanding aliasing in programs.

A struct in the C programming language is a composite data type declaration that defines a physically grouped list of variables under one name in a block of memory, allowing the different variables to be accessed via a single pointer or by the struct declared name which returns the same address. The struct data type can contain other data types so is used for mixed-data-type records such as a hard-drive directory entry, or other mixed-type records.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

In the C programming language, data types constitute the semantics and characteristics of storage of data elements. They are expressed in the language syntax as declarations for memory locations or variables. Data types also determine the types of operations or methods of processing of data elements.

In the programming languages C and C++, the unary operator sizeof generates the size of a variable or datatype, measured in the number of char-sized storage units required for the type. Consequently, the construct sizeof (char) is guaranteed to be 1. The actual number of bits of type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern systems this is eight bits. The result of sizeof has an unsigned integral type that is usually denoted by size_t.

XImage is the X client side storage mechanism for a X Window System pixel map. The structure of an XImage as defined by the X Window core protocol is the following:

The C and C++ programming languages are closely related but have many significant differences. C++ began as a fork of an early, pre-standardized C, and was designed to be mostly source-and-link compatible with C compilers of the time. Due to this, development tools for the two languages are often integrated into a single product, with the programmer able to specify C or C++ as their source language.

In computer science, type punning is a common term for any programming technique that subverts or circumvents the type system of a programming language in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal language.

BSAVE image file format

BSAVE and BLOAD are commands in many varieties of the BASIC programming language. BSAVE copies RAM to a binary file, and BLOAD copies the contents of the file to RAM. The term "BSAVE image" could mean any of various raw image formats of video display controllers, or more generally any file containing the raw contents of a section of memory.

ALGOL 68RS is the second ALGOL 68 compiler written by I.F. Currie and J.D. Morrison at the Royal Signals and Radar Establishment. Unlike the earlier ALGOL 68R it was designed as a portable compiler and implemented the language of the revised Report.

The computer programming languages C and Object Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes.

In the C programming language, operations can be performed on a bit level using bitwise operators.

XvImage is the X Window System X video extension client side storage mechanism for a X video extension pixel map. The structure of an XvImage as defined by the Addendum to the Xv Client library documentation is the following:

Flexible array members were introduced in the C99 standard of the C programming language. It is a member of a struct, which is an array without a given dimension. It must be the last member of such a struct and it must be accompanied by at least one other member, as in the following example:

References

  1. Hughes, Joan K (1979). PL/I Structured Programming (second ed.) . New York: John Wiley and Sons. ISBN   0-471-01908-9.