Wildcard character

Last updated

In software, a wildcard character is a kind of placeholder represented by a single character, such as an asterisk (*), which can be interpreted as a number of literal characters or an empty string. It is often used in file searches so the full name need not be typed. [1]

Contents

Telecommunication

In telecommunications, a wildcard is a character that may be substituted for any of a defined subset of all possible characters.

Computing

In computer (software) technology, a wildcard is a symbol used to replace or represent zero or more characters. [2] Algorithms for matching wildcards have been developed in a number of recursive and non-recursive varieties. [3]

File and directory patterns

When specifying file names (or paths) in CP/M, DOS, Microsoft Windows, and Unix-like operating systems, the asterisk character (*, also called "star") matches zero or more characters. For example, doc* matches doc and document but not dodo. If files are named with a date stamp, wildcards can be used to match date ranges, such as 202501*.mp4 to select video recordings from January2025, to facilitate file operations such as copying and moving.

In Unix-like and DOS operating systems, the question mark ? matches exactly one character. In DOS, if the question mark is placed at the end of the word, it will also match missing (zero) trailing characters; for example, the pattern 123? will match 123 and 1234, but not 12345.

In Unix shells and Windows PowerShell, ranges of characters enclosed in square brackets ([ and ]) match a single character within the set; for example, [A-Za-z] matches any single uppercase or lowercase letter. In Unix shells, a leading exclamation mark ! negates the set and matches only a character not within the list. In shells that interpret ! as a history substitution, a leading caret ^ can be used instead.

The operation of matching of wildcard patterns to multiple file or path names is referred to as globbing.

Databases

In SQL, wildcard characters can be used in LIKE expressions; the percent sign % matches zero or more characters, and underscore _ a single character. Transact-SQL also supports square brackets ([ and ]) to list sets and ranges of characters to match, a leading caret ^ negates the set and matches only a character not within the list. In Microsoft Access, the asterisk sign * matches zero or more characters, the question mark ? matches a single character, the number sign # matches a single digit (0–9), and square brackets can be used for sets or ranges of characters to match.

Regular expressions

In regular expressions, the period (., also called "dot") is the wildcard pattern which matches any single character. Followed by the Kleene star operator, which is denoted as an asterisk (*), we obtain .*, which will match zero or more arbitrary characters.

See also

Related Research Articles

<span class="mw-page-title-main">Regular expression</span> Sequence of characters that forms a search pattern

A regular expression, sometimes referred to as rational expression, is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory.

sed Standard UNIX utility for editing streams of data

sed is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed was based on the scripting features of the interactive editor ed and the earlier qed. It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.

An 8.3 filename is one that obeys the filename convention used by CP/M and old versions of DOS and versions of Microsoft Windows prior to Windows 95 and Windows NT 3.5. It is also used in modern Microsoft operating systems as an alternate filename to the long filename, to provide compatibility with legacy programs. The filename convention is limited by the FAT file system. Similar 8.3 file naming schemes have also existed on earlier CP/M, TRS-80, Atari, and some Data General and Digital Equipment Corporation minicomputer operating systems.

A metacharacter is a character that has a special meaning to a computer program, such as a shell interpreter or a regular expression (regex) engine.

<span class="mw-page-title-main">C shell</span> Unix shell

The C shell is a Unix shell created by Bill Joy while he was a graduate student at University of California, Berkeley in the late 1970s. It has been widely distributed, beginning with the 2BSD release of the Berkeley Software Distribution (BSD) which Joy first distributed in 1978. Other early contributors to the ideas or the code were Michael Ubell, Eric Allman, Mike O'Brien and Jim Kulp.

The tilde is a grapheme ˜ or ~ with a number of uses. The name of the character came into English from Spanish tilde, which in turn came from the Latin titulus, meaning 'title' or 'superscription'. Its primary use is as a diacritic (accent) in combination with a base letter. Its freestanding form is used in modern texts mainly to indicate approximation.

A path is a string of characters used to uniquely identify a location in a directory structure. It is composed by following the directory tree hierarchy in which components, separated by a delimiting character, represent each directory. The delimiting character is most commonly the slash ("/"), the backslash character ("\"), or colon (":"), though some operating systems may use a different delimiter. Paths are used extensively in computer science to represent the directory/file relationships common in modern operating systems and are essential in the construction of Uniform Resource Locators (URLs). Resources can be represented by either absolute or relative paths.

glob (programming) Patterns used in computer programming

glob is a libc function for globbing, which is the archetypal use of pattern matching against the names in a filesystem directory such that a name pattern is expanded into a list of names matching that pattern. Although globbing may now refer to glob -style pattern matching of any string, not just expansion into a list of filesystem names, the original meaning of the term is still widespread.

join is a command in Unix and Unix-like operating systems that merges the lines of two sorted text files based on the presence of a common field. It is similar to the join operator used in relational databases but operating on text files.

wildmat is a pattern matching library developed by Rich Salz. Based on the wildcard syntax already used in the Bourne shell, wildmat provides a uniform mechanism for matching patterns across applications with simpler syntax than that typically offered by regular expressions. Patterns are implicitly anchored at the beginning and end of each string when testing for a match.

cmd.exe Command prompt program

Command Prompt, also known as cmd.exe or cmd, is the default command-line interpreter for the OS/2, eComStation, ArcaOS, Microsoft Windows, and ReactOS operating systems. On Windows CE .NET 4.2, Windows CE 5.0 and Windows Embedded CE 6.0 it is referred to as the Command Processor Shell. Its implementations differ between operating systems, but the behavior and basic set of commands are consistent. cmd.exe is the counterpart of COMMAND.COM in DOS and Windows 9x systems, and analogous to the Unix shells used on Unix-like systems. The initial version of cmd.exe for Windows NT was developed by Therese Stowell. Windows CE 2.11 was the first embedded Windows release to support a console and a Windows CE version of cmd.exe. The ReactOS implementation of cmd.exe is derived from FreeCOM, the FreeDOS command line interpreter.

In Unix-like operating systems, find is a command-line utility that locates files based on some user-specified criteria and either prints the pathname of each matched object or, if another action is requested, performs that action on each matched object.

<span class="mw-page-title-main">Comparison of command shells</span>

A command shell is a command-line interface to interact with and manipulate a computer's operating system.

alias (command) Command in various command line interpreters

In computing, alias is a command in various command-line interpreters (shells), which enables a replacement of a word by another string. It is mainly used for abbreviating a system command, or for adding default arguments to a regularly used command. alias is available in Unix shells, AmigaDOS, 4DOS/4NT, FreeDOS, KolibriOS, Windows PowerShell, ReactOS, and the EFI shell. Aliasing functionality in the MS-DOS and Microsoft Windows operating systems is provided by the DOSKey command-line utility.

expr is a command line utility on Unix and Unix-like operating systems which evaluates an expression and outputs the corresponding value. It first appeared in Unix v7. The command is available for Microsoft Windows as part of the UnxUtils collection of native Win32 ports of common GNU Unix-like utilities. The expr command has also been ported to the IBM i operating system.

A batch file is a script file in DOS, OS/2 and Microsoft Windows. It consists of a series of commands to be executed by the command-line interpreter, stored in a plain text file. A batch file may contain any command the interpreter accepts interactively and use constructs that enable conditional branching and looping within the batch file, such as IF, FOR, and GOTO labels. The term "batch" is from batch processing, meaning "non-interactive execution", though a batch file might not process a batch of multiple data.

<span class="mw-page-title-main">Command-line interface</span> Computer interface that uses text

A command-line interface (CLI) is a means of interacting with a computer program by inputting lines of text called command lines. Command-line interfaces emerged in the mid-1960s, on computer terminals, as an interactive and more user-friendly alternative to the non-interactive mode available with punched cards.

<span class="mw-page-title-main">Hamilton C shell</span>

Hamilton C shell is a clone of the Unix C shell and utilities for Microsoft Windows created by Nicole Hamilton at Hamilton Laboratories as a completely original work, not based on any prior code. It was first released on OS/2 on December 12, 1988 and on Windows NT in July 1992. The OS/2 version was discontinued in 2003 but the Windows version continues to be actively supported.

In computer science, the Krauss wildcard-matching algorithm is a pattern matching algorithm. Based on the wildcard syntax in common use, e.g. in the Microsoft Windows command-line interface, the algorithm provides a non-recursive mechanism for matching patterns in software applications, based on syntax simpler than that typically offered by regular expressions.

In computer science, an algorithm for matching wildcards is useful in comparing text strings that may contain wildcard syntax. Common uses of these algorithms include command-line interfaces, e.g. the Bourne shell or Microsoft Windows command-line or text editor or file manager, as well as the interfaces for some search engines and databases. Wildcard matching is a subset of the problem of matching regular expressions and string matching in general.

References

  1. "Using wildcard characters". Microsoft. Archived from the original on 2017-03-24. Retrieved 2018-01-23.
  2. "What is a wildcard?". Computer Hope. Archived from the original on 2016-11-21. Retrieved 2016-11-21.
  3. Cantatore, Alessandro (Apr 25, 2003). "Wildcard matching algorithms". Archived from the original on Oct 14, 2023.