Hungarian notation

Hungarian notation is an identifier naming convention in computer programming in which the name of a variable or function indicates its intention or kind, or in some dialects, its type. The original Hungarian notation uses intention or kind in its naming convention and is sometimes called Apps Hungarian, as it became popular in the Microsoft Apps division during the development of Word, Excel and other applications. When the Microsoft Windows division adopted the naming convention, they based it on the actual data type, and this convention spread widely through the Windows API; this is sometimes called Systems Hungarian notation.

Simonyi: ...BCPL [had] a single type which was a 16-bit word... not that it matters.

Booch: Unless you continue the Hungarian notation.

Simonyi: Absolutely... we went over to the typed languages too later ... But ... we would look at one name and I would tell you exactly a lot about that... [1]

Hungarian notation was designed to be language-independent, and found its first major use with the BCPL programming language. Because BCPL has no data types other than the machine word, nothing in the language itself helps a programmer remember variables' types. Hungarian notation aims to remedy this by providing the programmer with explicit knowledge of each variable's data type.

In Hungarian notation, a variable name starts with a group of lower-case letters which are mnemonics for the type or purpose of that variable, followed by whatever name the programmer has chosen; this last part is sometimes distinguished as the given name. The first character of the given name can be capitalized to separate it from the type indicators (see also CamelCase). Otherwise the case of this character denotes scope.

History

The original Hungarian notation was invented by Charles Simonyi, a programmer who worked at Xerox PARC circa 1972–1981, and who later became Chief Architect at Microsoft. The name of the notation is a reference to Simonyi's nation of origin, and also, according to Andy Hertzfeld, because it made programs "look like they were written in some inscrutable foreign language". [2] Hungarian people's names are "reversed" compared to most other European names; the family name precedes the given name. For example, the anglicized name "Charles Simonyi" in Hungarian was originally "Simonyi Károly". In the same way, the type name precedes the "given name" in Hungarian notation. The similar Smalltalk "type last" naming style (e.g. aPoint and lastPoint) was common at Xerox PARC during Simonyi's tenure there. [citation needed]

Simonyi's paper on the notation referred to prefixes used to indicate the "type" of information being stored. [3] [4] His proposal was largely concerned with decorating identifier names based upon the semantic information of what they store (in other words, the variable's purpose). Simonyi's notation came to be called Apps Hungarian, since the convention was used in the applications division of Microsoft. Systems Hungarian developed later in the Microsoft Windows development team. Apps Hungarian is not entirely distinct from what became known as Systems Hungarian, as some of Simonyi's suggested prefixes contain little or no semantic information (see below for examples). [4]

Systems Hungarian vs. Apps Hungarian

Where Systems notation and Apps notation differ is in the purpose of the prefixes.

In Systems Hungarian notation, the prefix encodes the actual data type of the variable. For example:
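As a rough illustration of Systems Hungarian, the following C++ sketch declares a few values whose prefixes simply restate the declared type. The prefixes (l, sz, b, arru8) follow commonly seen conventions; the identifiers themselves are hypothetical and are not taken from any particular code base.

```cpp
#include <cstdint>

// Systems Hungarian: each prefix repeats the physical type of the declaration.
constexpr long         lAccountNum        = 1234567L;     // l     -> long integer
constexpr char         szName[]           = "Simonyi";    // sz    -> zero-terminated string
constexpr bool         bReadLine          = true;         // b     -> Boolean
constexpr std::uint8_t arru8NumberList[4] = {1, 2, 3, 4}; // arru8 -> array of unsigned 8-bit integers

// The "z" in "sz" stands for the terminating zero that the array carries.
static_assert(sizeof(szName) == 8, "seven characters plus the terminating zero");
```

In each case the prefix tells the reader nothing that the declaration does not already state, which is the property that distinguishes this dialect from Apps Hungarian.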

Apps Hungarian notation strives to encode the logical data type rather than the physical data type; in this way, it gives a hint as to what the variable's purpose is, or what it represents.
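A minimal C++ sketch of the Apps Hungarian idea follows. It assumes the prefixes rw (row), col (column), c (count) and d (difference); the function and variable names are hypothetical and chosen only for illustration. Every value is a plain int to the compiler, so the prefixes carry information the type system does not.

```cpp
// Apps Hungarian: the prefix names the kind of value, not its C++ type.
//   rw -> row index     col -> column index
//   c  -> count         d   -> difference between two values
constexpr int ccolSheet = 8;  // count of columns in a hypothetical sheet

// Convert a (row, column) pair into a flat cell index.
constexpr int IcellFromRwCol(int rw, int col) {
    return rw * ccolSheet + col;
}

// Distance between two rows; "drw" marks a difference, not a position.
constexpr int DrwBetween(int rwFirst, int rwLast) {
    return rwLast - rwFirst;
}

static_assert(IcellFromRwCol(2, 3) == 19, "row 2, column 3 of an 8-column sheet");
```

A call such as IcellFromRwCol(colCur, rwCur) would still compile, because both arguments are ints, but the mismatched prefixes make the slip visible to a reader.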

Most, but not all, of the prefixes Simonyi suggested are semantic in nature. To modern eyes, some prefixes seem to represent physical data types, such as sz for strings. However, such prefixes were still semantic, as Simonyi intended Hungarian notation for languages whose type systems could not distinguish some data types that modern languages take for granted.

The following are examples from the original paper: [3]

While the notation always uses initial lower-case letters as mnemonics, it does not prescribe the mnemonics themselves. There are several widely used conventions (see examples below), but any set of letters can be used, as long as they are consistent within a given body of code.

Code written in Apps Hungarian may still contain Systems Hungarian prefixes for variables that are defined solely in terms of their type.

Relation to sigils

In some programming languages, a similar notation now called sigils is built into the language and enforced by the compiler. For example, in some forms of BASIC, name$ names a string and count% names an integer. The major difference between Hungarian notation and sigils is that sigils declare the type of the variable in the language, whereas Hungarian notation is purely a naming scheme with no effect on the machine interpretation of the program text.
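The point that a Hungarian prefix has no effect on how the program is interpreted can be sketched in C++. In the hypothetical declaration below, the prefix contradicts the declared type, yet the compiler accepts it, because only the declaration determines the type.

```cpp
#include <type_traits>

// The "sz" prefix claims a zero-terminated string, but the declaration says int.
// The compiler sees only the declaration, so the misleading name is accepted.
int szCount = 3;

static_assert(std::is_same<decltype(szCount), int>::value,
              "the declared type, not the prefix, determines the type");
```

With a sigil, by contrast, the marker is part of the language, so a name and its type cannot disagree in this way.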

Examples

The mnemonics for pointers and arrays, which are not actual data types, are usually followed by the type of the data element itself:
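As an illustrative C++ sketch (identifiers hypothetical), the pointer mnemonic p and the array mnemonic rg can be combined with an element-type prefix such as sz or fp:

```cpp
#include <cstddef>

constexpr char        szOwner[]       = "Ada";             // sz   -> zero-terminated string
constexpr const char* pszOwner        = szOwner;           // psz  -> pointer to a zero-terminated string
constexpr double      rgfpBalances[3] = {10.5, 20.0, 4.5}; // rgfp -> array of floating-point values

// cfp -> count of floating-point values in the array above.
constexpr std::size_t cfpBalances = sizeof(rgfpBalances) / sizeof(rgfpBalances[0]);

static_assert(cfpBalances == 3, "three balances are stored");
```

Reading a prefix from left to right gives the description: pszOwner is a pointer (p) to a zero-terminated string (sz).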

While Hungarian notation can be applied to any programming language and environment, it was widely adopted by Microsoft for use with the C language, in particular for Microsoft Windows, and its use remains largely confined to that area. Its use was widely evangelized by Charles Petzold's "Programming Windows", the original (and for many readers, the definitive) book on Windows API programming. Thus, many commonly seen constructs of Hungarian notation are specific to Windows:
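The following C++ sketch shows how some of these Windows-style names typically look in code. So that it compiles on any platform, it declares stand-in type aliases instead of including <windows.h>; those aliases, the WindowStub type, the message number and the IsExampleMessage function are all assumptions made for illustration and only approximate the real Windows definitions.

```cpp
#include <cstdint>

// Stand-in declarations (hypothetical); the real ones live in <windows.h>.
struct WindowStub {};            // placeholder for a window object
using HWND   = WindowStub*;      // hwnd   -> handle to a window
using UINT   = unsigned int;
using WPARAM = std::uintptr_t;   // wParam -> historically a "word" parameter
using LPARAM = std::intptr_t;    // lParam -> historically a "long" parameter
using LPCSTR = const char*;      // lpsz   -> long pointer to a zero-terminated string

constexpr UINT   uMsgExample = 0x8001;     // hypothetical message number
constexpr LPCSTR lpszTitle   = "Example";  // hypothetical window title

WindowStub windowStub;  // gives the sketch something for a handle to refer to

// Hypothetical check shaped like a window procedure's parameter list.
constexpr bool IsExampleMessage(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam) {
    return hwnd != nullptr && uMsg == uMsgExample && wParam == 0 && lParam == 0;
}
```

The parameter names hwnd, uMsg, wParam and lParam are the ones conventionally used for window procedures; the w and l prefixes date from a time when the two parameters were a 16-bit word and a 32-bit long.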

The notation is sometimes extended in C++ to include the scope of a variable, optionally separated by an underscore. [5] [6] This extension is often also used without the Hungarian type-specification:
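A short C++ sketch of the scope extension is given below, assuming the common prefixes g_ (global), m_ (member) and s_ (static member). The Wheel class and its members are hypothetical. One member combines the scope prefix with a Hungarian type prefix (m_nSpokes), while another uses the scope prefix alone (m_radius).

```cpp
constexpr int g_nWheels = 4;  // g_ -> global scope, n -> integer count

class Wheel {
public:
    constexpr Wheel(int nSpokes, double radius)
        : m_nSpokes(nSpokes), m_radius(radius) {}

    constexpr int    Spokes() const { return m_nSpokes; }
    constexpr double Radius() const { return m_radius; }
    static constexpr int MaxSpokes() { return s_nMaxSpokes; }

private:
    int    m_nSpokes;  // m_ -> member, with Hungarian type prefix n
    double m_radius;   // m_ -> member, scope prefix only
    static constexpr int s_nMaxSpokes = 48;  // s_ -> static member
};

static_assert(Wheel(32, 0.35).Spokes() == 32, "member initialised from the constructor argument");
```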

In JavaScript code using jQuery, a $ prefix is often used to indicate that a variable holds a jQuery object (versus a plain DOM object or some other value). [7]

Advantages

(Some of these apply to Systems Hungarian only.)

Supporters argue that the benefits of Hungarian Notation include: [3]

Disadvantages

Most arguments against Hungarian notation are against Systems Hungarian notation, not Apps Hungarian notation. Some potential issues are:

Notable opinions

References

  1. "Oral History of Charles Simonyi" (PDF). Archive.computerhistory.org. Archived (PDF) from the original on 2015-09-10. Retrieved 5 August 2018.
  2. Rosenberg, Scott (1 January 2007). "Anything You Can Do, I Can Do Meta". MIT Technology Review. Retrieved 21 July 2022.
  3. Simonyi, Charles (November 1999). "Hungarian Notation". MSDN Library. Microsoft Corp.
  4. Spolsky, Joel (2005-05-11). "Making Wrong Code Look Wrong". Joel on Software. Retrieved 2005-12-13.
  5. "Mozilla Coding Style". Developer.mozilla.org. Retrieved 17 March 2015.
  6. "Webkit Coding Style Guidelines". Webkit.org. Retrieved 17 March 2015.
  7. "Why would a JavaScript variable start with a dollar sign?". Stack Overflow. Retrieved 12 February 2016.
  8. Jones, Derek M. (2009). The New C Standard: A Cultural and Economic Commentary (PDF). Addison-Wesley. p. 727. ISBN 978-0-201-70917-9. Archived (PDF) from the original on 2011-05-01.
  9. "Make an app for any task - FileMaker — An Apple Subsidiary". Filemaker.com. Retrieved 5 August 2018.
  10. Martin, Robert Cecil (2008). Clean Code: A Handbook of Agile Software Craftsmanship. Redmond, WA: Prentice Hall PTR. ISBN 978-0-13-235088-4.
  11. "Linux kernel coding style". Linux kernel documentation. Retrieved 9 March 2018.
  12. McConnell, Steve (2004). Code Complete (2nd ed.). Redmond, WA: Microsoft Press. ISBN 0-7356-1967-0.
  13. Stroustrup, Bjarne (2007). "Bjarne Stroustrup's C++ Style and Technique FAQ". Retrieved 15 February 2015.
  14. "Design Guidelines for Developing Class Libraries: General Naming Conventions". Retrieved 2008-01-03.