Case sensitivity

The lowercase "a" and uppercase "A" are the two case variants of the first letter in the English alphabet.

In computing, case sensitivity determines whether uppercase and lowercase letters are treated as distinct (case-sensitive) or equivalent (case-insensitive). For instance, users searching an e-book for information about dogs attach the same significance to "dog" and "Dog", so they want a case-insensitive search. But when they search an online encyclopedia for the United Nations, or for any other term whose capitalization is unambiguous and distinguishes it from otherwise identical terms, they may prefer a case-sensitive search.

Areas of significance

Case sensitivity may differ depending on the situation:

In programming languages

Some programming languages are case-sensitive for their identifiers (C, C++, Java, C#, Verilog, [2] Ruby, [3] Python and Swift). Others are case-insensitive, such as ABAP, Ada, most BASICs (an exception being BBC BASIC), Common Lisp, Fortran, SQL (for the syntax and, in some vendor implementations such as Microsoft SQL Server, for the data itself) [NB 2] and Pascal. There are also languages, such as Haskell, Prolog, and Go, in which the capitalisation of an identifier encodes information about its semantics. Some other languages mix the two behaviours: in PHP, for example, variable names are case-sensitive but function names are not, so a function defined in lowercase can be called in uppercase, whereas a variable defined in lowercase cannot be referred to in uppercase. In Nim, identifiers are compared case-insensitively and with underscores ignored, except that the first character must match exactly. [4]
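The difference between these identifier rules can be illustrated with a short Python sketch. The first part relies on Python's own case-sensitive identifiers; the function same_nim_identifier is a hypothetical helper, written here only to approximate the Nim comparison rule described above, and is not part of any Nim or Python library.

```python
# Python identifiers are case-sensitive: these are two distinct variables.
total = 1
Total = 2


def same_nim_identifier(a: str, b: str) -> bool:
    """Approximate Nim-style identifier equality (illustrative helper).

    The first character is compared exactly; the remaining characters are
    compared with case differences and underscores ignored.
    """
    if not a or not b:
        return a == b

    def normalize(name: str) -> str:
        return name.replace("_", "").lower()

    return a[0] == b[0] and normalize(a) == normalize(b)


if __name__ == "__main__":
    print(total, Total)                              # two separate values
    print(same_nim_identifier("fooBar", "foo_bar"))  # same identifier
    print(same_nim_identifier("fooBar", "FooBar"))   # first character differs
```

Under this rule, fooBar and foo_bar name the same thing, while fooBar and FooBar do not, because only the first character is compared case-sensitively.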

In text search

A text search operation can be case-sensitive or case-insensitive, depending on the system, application, or context. In many cases the user can specify whether a search is sensitive to case, e.g. in most text editors, word processors, and Web browsers. A case-insensitive search is more comprehensive, finding "Language" (at the beginning of a sentence), "language", and "LANGUAGE" (in a title in capitals); a case-sensitive search for the computer language "BASIC" finds it while excluding most of the many unwanted instances of the ordinary word "basic". The Google Search engine, for example, is essentially case-insensitive, with no option for case-sensitive search. [5] In Oracle SQL most operations and searches are case-sensitive by default, [6] while in most other DBMSs SQL searches are case-insensitive by default. [7]
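The trade-off between the two search modes can be sketched with Python's standard re module. The sample text and the helper find_all are assumptions made for this illustration; they do not describe how any particular editor or search engine works.

```python
import re

# Sample text invented for this illustration.
TEXT = "BASIC is a language. The basic idea of a Language is simple; LANGUAGE matters."


def find_all(term: str, text: str, case_sensitive: bool) -> list[str]:
    """Return every occurrence of a literal term in text.

    re.IGNORECASE makes the match case-insensitive; re.escape ensures the
    term is treated as plain text rather than as a regular expression.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.findall(re.escape(term), text, flags)


if __name__ == "__main__":
    # Case-insensitive: finds all three spellings of the word.
    print(find_all("language", TEXT, case_sensitive=False))
    # Case-sensitive: finds only the name of the programming language.
    print(find_all("BASIC", TEXT, case_sensitive=True))
```

The case-insensitive call returns all three capitalisations of "language", while the case-sensitive call for "BASIC" skips the ordinary word "basic".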

Case-insensitive operations are sometimes said to fold case, from the idea of folding the character code table so that upper- and lowercase letters coincide.
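As a minimal sketch of case folding, Python's built-in str.casefold method maps a string to a form intended for caseless comparison. The helper equal_ignoring_case is a name chosen for this example only.

```python
def equal_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after folding case with str.casefold()."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # Upper- and lowercase variants coincide after folding.
    print(equal_ignoring_case("Language", "LANGUAGE"))
    # Folding is more aggressive than lowercasing: the German sharp s
    # folds to "ss", so these two spellings also compare equal.
    print(equal_ignoring_case("Straße", "STRASSE"))
    # Plain lowercasing keeps the sharp s, so the same pair differs.
    print("Straße".lower() == "STRASSE".lower())
```

Folding rather than simply lowercasing matters mainly for text outside the basic Latin alphabet, where one letter may correspond to several letters in the other case.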

In filesystems

In Unix-like systems, filenames are usually case-sensitive (there can be separate readme.txt and Readme.txt files in the same directory). macOS is somewhat unusual in that, by default, it uses HFS+ and APFS in a case-insensitive mode (so that there cannot be both a readme.txt and a Readme.txt in the same directory) that is nevertheless case-preserving (so that a file created as readme.txt is shown as readme.txt and a file created as Readme.txt is shown as Readme.txt). This causes some issues for developers and power users, because most file systems in other Unix-like environments are case-sensitive, and, for example, a source code tree for software for Unix-like systems might have both a file named Makefile and a file named makefile in the same directory. In addition, some Mac installers assume case insensitivity and fail on case-sensitive file systems.
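Software that must run on both kinds of file system can probe the behaviour at run time rather than assume it. The following is a minimal sketch using only Python's standard library; the function name is_case_sensitive and the "CaseProbe" prefix are choices made for this example.

```python
import os
import tempfile


def is_case_sensitive(directory: str) -> bool:
    """Report whether names in `directory` are treated case-sensitively.

    A temporary file with a mixed-case name is created, then looked up
    under the same name with every letter's case swapped. If that lookup
    succeeds, the file system treats both spellings as the same file.
    """
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=directory) as probe:
        folder, name = os.path.split(probe.name)
        swapped = os.path.join(folder, name.swapcase())
        return not os.path.exists(swapped)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(is_case_sensitive(scratch))  # result depends on the file system
```

The result describes only the directory that was probed, since different volumes mounted on the same machine can behave differently.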

The older MS-DOS filesystems FAT12 and FAT16 were case-insensitive and not case-preserving, so that a file whose name is entered as readme.txt or ReadMe.txt is saved as README.TXT. Later, with VFAT in Windows 95 the FAT file systems became case-preserving as an extension of supporting long filenames. [8] Later Windows file systems such as NTFS are internally case-sensitive, and a readme.txt and a Readme.txt can coexist in the same directory. However, for practical purposes filenames behave as case-insensitive as far as users and most software are concerned. [9] This can cause problems for developers or software coming from Unix-like environments, similar to the problems with macOS case-insensitive file systems.
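The three naming behaviours described in this section (case-sensitive, case-insensitive but case-preserving, and case-insensitive without case preservation) can be modelled with a toy in-memory directory. The Directory class below is purely illustrative: it uses str.upper() as a stand-in for the folding rules that real file systems implement with their own tables.

```python
class Directory:
    """Toy model of how a directory stores and looks up file names."""

    def __init__(self, case_sensitive: bool, case_preserving: bool) -> None:
        self.case_sensitive = case_sensitive
        self.case_preserving = case_preserving
        self._entries: dict[str, str] = {}  # lookup key -> name as stored

    def _key(self, name: str) -> str:
        # Case-insensitive directories fold names before comparing them.
        return name if self.case_sensitive else name.upper()

    def create(self, name: str) -> str:
        """Create a file and return the name under which it is stored."""
        key = self._key(name)
        if key in self._entries:
            raise FileExistsError(name)
        stored = name if self.case_preserving else name.upper()
        self._entries[key] = stored
        return stored

    def listing(self) -> list[str]:
        return sorted(self._entries.values())


if __name__ == "__main__":
    unix_like = Directory(case_sensitive=True, case_preserving=True)
    unix_like.create("readme.txt")
    unix_like.create("Readme.txt")         # allowed: a distinct name
    print(unix_like.listing())

    old_fat = Directory(case_sensitive=False, case_preserving=False)
    print(old_fat.create("ReadMe.txt"))    # stored in uppercase
```

A directory built with case_sensitive=False and case_preserving=True corresponds to the default macOS behaviour and to how Windows presents NTFS to most software: the name is stored as typed, but a second file differing only in case is rejected.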

Notes

  2. A single database or column collation can, however, be explicitly set to be case-sensitive.

References

  1. See WP:DIFFCAPS
  2. Kernighan, Brian W.; Ritchie, Dennis M. (February 1978). "Chapter 2: Types, Operators and Expressions". The C Programming Language (1st ed.). Englewood Cliffs, NJ: Prentice Hall. p. 33. ISBN 0-13-110163-3.
  3. Matsumoto, Yukihiro (January 2002). "Chapter 2: Language Basics". Ruby in a Nutshell (1st ed.). O'Reilly Media. p. 9. ISBN 0-596-00214-9.
  4. "Nim Manual: Identifier Equality". nim-lang.github.io. Retrieved 2019-04-27.
  5. "case-sensitive-search - case sensitive google search - Google Project Hosting". code.google.com. Retrieved 2013-05-20.
  6. "2.10 Making Queries Case Insensitive". Oracle SQL Developer User's Guide, Release 1.5 (PDF). Oracle Corporation. August 2013.
  7. "C.5.5.1 Case Sensitivity in String Searches". MySQL 5.0 Reference Manual. MySQL. Retrieved 2013-05-20.
  8. "Case Sensitivity in Subsystem for UNIX-based Applications". Microsoft Learn. 2005-08-22. Retrieved 2013-05-20.
  9. "Filenames are Case Sensitive on NTFS Volumes". Microsoft Support. 2006-11-01. Archived from the original on Jul 23, 2013. Retrieved 2013-05-20.