Abstraction principle (computer programming)

Last updated

In software engineering and programming language theory, the abstraction principle (or the principle of abstraction) is a basic dictum that aims to reduce duplication of information in a program (usually with emphasis on code duplication) whenever practical by making use of abstractions provided by the programming language or software libraries. [1] The principle is sometimes stated as a recommendation to the programmer, but sometimes stated as a requirement of the programming language, assuming it is self-understood why abstractions are desirable to use. The origins of the principle are uncertain; it has been reinvented a number of times, sometimes under a different name, with slight variations.

Contents

When read as recommendations to the programmer, the abstraction principle can be generalized as the "don't repeat yourself" (DRY) principle, which recommends avoiding the duplication of information in general, and also avoiding the duplication of human effort involved in the software development process.

The principle

As a recommendation to the programmer, in its formulation by Benjamin C. Pierce in Types and Programming Languages (2002), the abstraction principle reads (emphasis in original): [2]

Each significant piece of functionality in a program should be implemented in just one place in the source code. Where similar functions are carried out by distinct pieces of code, it is generally beneficial to combine them into one by abstracting out the varying parts.

As a requirement of the programming language, in its formulation by David A. Schmidt in The structure of typed programming languages (1994), the abstraction principle reads:. [3]

The phrases of any semantically meaningful syntactic class may be named.

History and variations

The abstraction principle is mentioned in several books. Some of these, together with the formulation if it is succinct, are listed below.

The principle plays a central role in design patterns in object-oriented programming, although most writings on that topic do not give a name to the principle. The Design Patterns book by the Gang of Four, states: "The focus here is encapsulating the concept that varies, a theme of many design patterns." This statement has been rephrased by other authors as "Find what varies and encapsulate it." [7]

In this century, the principle has been reinvented in extreme programming under the slogan "Once and Only Once". The definition of this principle was rather succinct in its first appearance: "no duplicate code". [8] It has later been elaborated as applicable to other issues in software development: "Automate every process that's worth automating. If you find yourself performing a task many times, script it." [9]

Implications

The abstraction principle is often stated in the context of some mechanism intended to facilitate abstraction. The basic mechanism of control abstraction is a function or subroutine. Data abstractions include various forms of type polymorphism. More elaborate mechanisms that may combine data and control abstractions include: abstract data types, including classes, polytypism etc. The quest for richer abstractions that allow less duplication in complex scenarios is one of the driving forces in programming language research and design.

Inexperienced programmers may be tempted to introduce too much abstraction in their programabstraction that won't be used more than once. [ citation needed ] A complementary principle that emphasizes this issue is "You Ain't Gonna Need It" and, more generally, the KISS principle.

Since code is usually subject to revisions, following the abstraction principle may entail refactoring code.[ citation needed ] The effort of rewriting a piece of code generically needs to be amortized against the estimated future benefits of an abstraction. A rule of thumb governing this was devised by Martin Fowler, and popularized as the rule of three. It states that if a piece of code is copied more than twice, i.e. it would end up having three or more copies, then it needs to be abstracted out.

Generalizations

"Don't repeat yourself", or the "DRY principle", is a generalization developed in the context of multi-tier architectures, where related code is by necessity duplicated to some extent across tiers, usually in different languages. In practical terms, the recommendation here is to rely on automated tools, like code generators and data transformations to avoid repetition.[ citation needed ]

Hardware programming interfaces

In addition to optimizing code, a hierarchical/recursive meaning of Abstraction level in programming also refers to the interfaces between hardware communication layers, also called "abstraction levels" and "abstraction layers." In this case, level of abstraction often is synonymous with interface. For example, in examining shellcode and the interface between higher and lower level languages, the level of abstraction changes from operating system commands (for example, in C) to register and circuit level calls and commands (for example, in assembly and binary). In the case of that example, the boundary or interface between the abstraction levels is the stack. [10]

Related Research Articles

<span class="mw-page-title-main">Programming language</span> Language for communicating instructions to a machine

A programming language is a system of notation for writing computer programs.

In computer programming and software design, code refactoring is the process of restructuring existing computer code—changing the factoring—without changing its external behavior. Refactoring is intended to improve the design, structure, and/or implementation of the software, while preserving its functionality. Potential advantages of refactoring may include improved code readability and reduced complexity; these can improve the source code's maintainability and create a simpler, cleaner, or more expressive internal architecture or object model to improve extensibility. Another potential goal for refactoring is improved performance; software engineers face an ongoing challenge to write programs that perform faster or use less memory.

In software engineering and computer science, abstraction is the process of generalizing concrete details, such as attributes, away from the study of objects and systems to focus attention on details of greater importance. Abstraction is a fundamental concept in computer science and software engineering, especially within the object-oriented programming paradigm. Examples of this include:

In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate significant areas of computing systems, making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.

Software design is the process by which an agent creates a specification of a software artifact intended to accomplish goals, using a set of primitive components and subject to constraints. The term is sometimes used broadly to refer to "all the activity involved in conceptualizing, framing, implementing, commissioning, and ultimately modifying" the software, or more specifically "the activity following requirements specification and before programming, as ... [in] a stylized software engineering process."

The Law of Demeter (LoD) or principle of least knowledge is a design guideline for developing software, particularly object-oriented programs. In its general form, the LoD is a specific case of loose coupling. The guideline was proposed by Ian Holland at Northeastern University towards the end of 1987, and the following three recommendations serve as a succinct summary:

In computer science, separation of concerns is a design principle for separating a computer program into distinct sections. Each section addresses a separate concern, a set of information that affects the code of a computer program. A concern can be as general as "the details of the hardware for an application", or as specific as "the name of which class to instantiate". A program that embodies SoC well is called a modular program. Modularity, and hence separation of concerns, is achieved by encapsulating information inside a section of code that has a well-defined interface. Encapsulation is a means of information hiding. Layered designs in information systems are another embodiment of separation of concerns.

A programming tool or software development tool is a computer program that software developers use to create, debug, maintain, or otherwise support other programs and applications. The term usually refers to relatively simple programs, that can be combined to accomplish a task, much as one might use multiple hands to fix a physical object. The most basic tools are a source code editor and a compiler or interpreter, which are used ubiquitously and continuously. Other tools are used more or less depending on the language, development methodology, and individual engineer, often used for a discrete task, like a debugger or profiler. Tools may be discrete programs, executed separately – often from the command line – or may be parts of a single large program, called an integrated development environment (IDE). In many cases, particularly for simpler use, simple ad hoc techniques are used instead of a tool, such as print debugging instead of using a debugger, manual timing instead of a profiler, or tracking bugs in a text file or spreadsheet instead of a bug tracking system.

Hardware abstractions are sets of routines in software that provide programs with access to hardware resources through programming interfaces. The programming interface allows all devices in a particular class C of hardware devices to be accessed through identical interfaces even though C may contain different subclasses of devices that each provide a different hardware interface.

In computing, an abstraction layer or abstraction level is a way of hiding the working details of a subsystem. Examples of software models that use layers of abstraction include the OSI model for network protocols, OpenGL, and other graphics libraries, which allow the separation of concerns to facilitate interoperability and platform independence. Another example is Media Transfer Protocol.

In software, a data access object (DAO) is a pattern that provides an abstract interface to some type of database or other persistence mechanism. By mapping application calls to the persistence layer, the DAO provides data operations without exposing database details. This isolation supports the single responsibility principle. It separates the data access the application needs, in terms of domain-specific objects and data types, from how these needs can be satisfied with a specific DBMS.

In computer science, automatic programming is a type of computer programming in which some mechanism generates a computer program to allow human programmers to write the code at a higher abstraction level.

Extensibility is a software engineering and systems design principle that provides for future growth. Extensibility is a measure of the ability to extend a system and the level of effort required to implement the extension. Extensions can be through the addition of new functionality or through modification of existing functionality. The principle provides for enhancements without impairing existing system functions.

In object-oriented design, the dependency inversion principle is a specific methodology for loosely coupled software modules. When following this principle, the conventional dependency relationships established from high-level, policy-setting modules to low-level, dependency modules are reversed, thus rendering high-level modules independent of the low-level module implementation details. The principle states:

"Don't repeat yourself" (DRY) is a principle of software development aimed at reducing repetition of information which is likely to change, replacing it with abstractions that are less likely to change, or using data normalization which avoids redundancy in the first place.

In computer programming, duplicate code is a sequence of source code that occurs more than once, either within a program or across different programs owned or maintained by the same entity. Duplicate code is generally considered undesirable for a number of reasons. A minimum requirement is usually applied to the quantity of code that must appear in a sequence for it to be considered duplicate rather than coincidentally similar. Sequences of duplicate code are sometimes known as code clones or just clones, the automated process of finding duplications in source code is called clone detection.

S-algol is a computer programming language derivative of ALGOL 60 developed at the University of St Andrews in 1979 by Ron Morrison and Tony Davie. The language is a modification of ALGOL to contain orthogonal data types that Morrison created for his PhD thesis. Morrison would go on to become professor at the university and head of the department of computer science. The S-algol language was used for teaching at the university at an undergraduate level until 1999. It was also the language taught for several years in the 1980s at a local school in St. Andrews, Madras College. The computer science text Recursive Descent Compiling describes a recursive descent compiler for S-algol, implemented in S-algol.

The following outline is provided as an overview of and topical guide to computer programming:

Object-oriented programming (OOP) is a programming paradigm based on the concept of objects, which can contain data and code: data in the form of fields, and code in the form of procedures.

In computer programming, design smells are "structures in the design that indicate violation of fundamental design principles and negatively impact design quality". The origin of the term "design smell" can be traced to the term "code smell" which was featured in the book Refactoring: Improving the Design of Existing Code by Martin Fowler.

References

  1. Mishra, Jibitesh (2011). Software Engineering. Pearson Education India. ISBN   978-81-317-5869-4.
  2. Pierce, Benjamin (2002). Types and Programming Languages. MIT Press. p. 339. ISBN   0-262-16209-1.
  3. David A. Schmidt, The structure of typed programming languages, MIT Press, 1994, ISBN   0-262-19349-3, p. 32
  4. Alfred John Cole, Ronald Morrison, An introduction to programming with S-algol, CUP Archive, 1982, ISBN   0-521-25001-3, p. 150
  5. Bruce J. MacLennan, Principles of programming languages: design, evaluation, and implementation, Holt, Rinehart, and Winston, 1983, p. 53
  6. Jon Pearce, Programming and meta-programming in scheme, Birkhäuser, 1998, ISBN   0-387-98320-1, p. 40
  7. Alan Shalloway, James Trott, Design patterns explained: a new perspective on object-oriented design, Addison-Wesley, 2002, ISBN   0-201-71594-5, p. 115
  8. Kent Beck, Extreme programming explained: embrace change, 2nd edition, Addison-Wesley, 2000, ISBN   0-201-61641-6, p. 61
  9. Chromatic, Extreme programming pocket guide, O'Reilly, 2003, ISBN   0-596-00485-0
  10. Koziol, The Shellcoders Handbook", Wiley, 2004, p. 10, ISBN   0-7645-4468-3