Natural sort order

Last updated

In computing, natural sort order (or natural sorting) is the ordering of strings in alphabetical order, except that multi-digit numbers are treated atomically, i.e., as if they were a single character. Natural sort order has been promoted as being more human-friendly ("natural") than machine-oriented, pure alphabetical sort order. [1]

For example, in alphabetical sorting, "z11" would be sorted before "z2" because the "1" in the first string is sorted as smaller than "2", while in natural sorting "z2" is sorted before "z11" because "2" is treated as smaller than "11".

Alphabetical sorting:

  1. z11
  2. z2

Natural sorting:

  1. z2
  2. z11

Functionality to sort by natural sort order is now widely available in software libraries for many programming languages. [2] [3] [4] [5] [6] [7] During the 1996 MacHack conference, the Natural Order Mac OS System Extension was conceived and implemented overnight on-site as an entry for the Best Hack contest. [8] [9] Dave Koelle wrote the Alphanum Algorithm in 1997 [10] and Martin Pool published Natural Order String Comparison in 2000. [11]

Related Research Articles

Collation is the assembly of written information into a standard order. Many systems of collation are based on numerical order or alphabetical order, or extensions and combinations thereof. Collation is a fundamental element of most office filing systems, library catalogs, and reference books.

<span class="mw-page-title-main">PHP</span> Scripting language created in 1994

PHP is a general-purpose scripting language geared toward web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementation is now produced by The PHP Group. PHP was originally an abbreviation of Personal Home Page, but it now stands for the recursive initialism PHP: Hypertext Preprocessor.

<span class="mw-page-title-main">Regular expression</span> Sequence of characters that forms a search pattern

A regular expression is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory.

<span class="mw-page-title-main">String (computer science)</span> Sequence of characters, data type

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed. A string is generally considered as a data type and is often implemented as an array data structure of bytes that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence data types and structures.

<span class="mw-page-title-main">Sorting algorithm</span> Algorithm that arranges lists in order

In computer science, a sorting algorithm is an algorithm that puts elements of a list into an order. The most frequently used orders are numerical order and lexicographical order, and either ascending or descending. Efficient sorting is important for optimizing the efficiency of other algorithms that require input data to be in sorted lists. Sorting is also often useful for canonicalizing data and for producing human-readable output.

The Burrows–Wheeler transform rearranges a character string into runs of similar characters. This is useful for compression, since it tends to be easy to compress a string that has runs of repeated characters by techniques such as move-to-front transform and run-length encoding. More importantly, the transformation is reversible, without needing to store any additional data except the position of the first original character. The BWT is thus a "free" method of improving the efficiency of text compression algorithms, costing only some extra computation. The Burrows–Wheeler transform is an algorithm used to prepare data for use with data compression techniques such as bzip2. It was invented by Michael Burrows and David Wheeler in 1994 while Burrows was working at DEC Systems Research Center in Palo Alto, California. It is based on a previously unpublished transformation discovered by Wheeler in 1983. The algorithm can be implemented efficiently using a suffix array thus reaching linear time complexity.

<span class="mw-page-title-main">D (programming language)</span> Multi-paradigm system programming language

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is a profoundly different language —features of D can be considered streamlined and expanded-upon ideas from C++, however D also draws inspiration from other high-level programming languages, notably Java, Python, Ruby, C#, and Eiffel.

<span class="mw-page-title-main">R (programming language)</span> Programming language for statistics

The R language is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinformaticians and statisticians for data analysis and developing statistical software. The core R language is augmented by a large number of extension packages containing reusable code and documentation.

Alphabetical order is a system whereby character strings are placed in order based on the position of the characters in the conventional ordering of an alphabet. It is one of the methods of collation. In mathematics, a lexicographical order is the generalization of the alphabetical order to other data types, such as sequences of numbers or other ordered mathematical objects.

In mathematics, the lexicographic or lexicographical order is a generalization of the alphabetical order of the dictionaries to sequences of ordered symbols or, more generally, of elements of a totally ordered set.

In computer programming, the Schwartzian transform is a technique used to improve the efficiency of sorting a list of items. This idiom is appropriate for comparison-based sorting when the ordering is actually based on the ordering of a certain property of the elements, where computing that property is an intensive operation that should be performed a minimal number of times. The Schwartzian transform is notable in that it does not use named temporary arrays.

sort (Unix) Standard UNIX utility

In computing, sort is a standard command line program of Unix and Unix-like operating systems, that prints the lines of its input or concatenation of all files listed in its argument list in sorted order. Sorting is done based on one or more sort keys extracted from each line of input. By default, the entire input is taken as sort key. Blank space is the default field separator. The command supports a number of command-line options that can vary by implementation. For instance the "-r" flag will reverse the sort order.

The following outline is provided as an overview of and topical guide to computer programming:

This is a comparison of notable web frameworks, software used to build and deploy web applications.

Timsort is a hybrid, stable sorting algorithm, derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data. It was implemented by Tim Peters in 2002 for use in the Python programming language. The algorithm finds subsequences of the data that are already ordered (runs) and uses them to sort the remainder more efficiently. This is done by merging runs until certain criteria are fulfilled. Timsort has been Python's standard sorting algorithm since version 2.3. It is also used to sort arrays of non-primitive type in Java SE 7, on the Android platform, in GNU Octave, on V8, Swift, and Rust.

CoffeeScript is a programming language that compiles to JavaScript. It adds syntactic sugar inspired by Ruby, Python, and Haskell in an effort to enhance JavaScript's brevity and readability. Specific additional features include list comprehension and destructuring assignment.

HipHop Virtual Machine (HHVM) is an open-source virtual machine based on just-in-time (JIT) compilation that serves as an execution engine for the Hack programming language. By using the principle of JIT compilation, Hack code is first transformed into intermediate HipHop bytecode (HHBC), which is then dynamically translated into x86-64 machine code, optimized, and natively executed. This contrasts with PHP's usual interpreted execution, in which the Zend Engine transforms PHP source code into opcodes that serve as a form of bytecode, and executes the opcodes directly on the Zend Engine's virtual CPU.

Phalcon is a PHP web framework based on the model–view–controller (MVC) pattern. Originally released in 2012, it is an open-source framework licensed under the terms of the BSD License.

<span class="mw-page-title-main">PeachPie</span>

PeachPie is an open-source PHP language compiler and runtime for the .NET Framework and .NET. It is built on top of the Microsoft Roslyn compiler platform and is based on the first-generation Phalanger project. PeachPie compiles source code written in PHP to CIL byte-code. PeachPie takes advantage of the JIT compiler component of the .NET Framework in order to handle the beginning of the compilation process. Its purpose is not to generate or optimize native code, but rather to compile PHP scripts into .NET assemblies containing CIL code and meta-data. In July 2017, the project became a member of the .NET Foundation.

References

  1. "Sorting for Humans : Natural Sort Order". blog.codinghorror.com. 12 December 2007.
  2. "PHP: natsort - Manual". php.net.
  3. "Sort::Naturally - metacpan.org". metacpan.org.
  4. Morton, Seth M. (23 December 2021). "natsort: Simple yet flexible natural sorting in Python" via PyPI.
  5. "Customizable Natural-Order Sort - File Exchange - MATLAB Central".
  6. Kornblith, Simon (25 December 2021). "NaturalSort: Natural Sort Order in Julia". github.com.
  7. Pažourek, Tomáš (1 April 2022). "NaturalSort.Extension: Support for natural sorting in .NET/C#". github.com.
  8. "Natural Order Numerical Sorting".
  9. "TidBITS: The Natural Order of Things". 3 February 1997.
  10. "Dave Koelle's Alphanum Algorithm".
  11. "Martin Pool's Natural Order String Comparison".