Program execution |
---|
General concepts |
Types of code |
Compilation strategies |
Notable runtimes |
|
Notable compilers & toolchains |
|
In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer.
Since a computer, at base, only understands machine code, source code must be translated before a computer can execute it. The translation process can be implemented three ways. Source code can be converted into machine code by a compiler or an assembler. The resulting executable is machine code ready for the computer. Alternatively, source code can be executed without conversion via an interpreter. An interpreter loads the source code into memory. It simultaneously translates and executes each statement. A method that combines compilation and interpretation is to first produce bytecode. Bytecode is an intermediate representation of source code that is quickly interpreted.
The first programmable computers, which appeared at the end of the 1940s, [2] were programmed in machine language (simple instructions that could be directly executed by the processor). Machine language was difficult to debug and was not portable between different computer systems. [3] Initially, hardware resources were scarce and expensive, while human resources were cheaper. [4] As programs grew more complex, programmer productivity became a bottleneck. This led to the introduction of high-level programming languages such as Fortran in the mid-1950s. These languages abstracted away the details of the hardware, instead being designed to express algorithms that could be understood more easily by humans. [5] [6] As instructions distinct from the underlying computer hardware, software is therefore relatively recent, dating to these early high-level programming languages such as Fortran, Lisp, and Cobol. [6] The invention of high-level programming languages was simultaneous with the compilers needed to translate the source code automatically into machine code that can be directly executed on the computer hardware. [7]
Source code is the form of code that is modified directly by humans, typically in a high-level programming language. Object code can be directly executed by the machine and is generated automatically from the source code, often via an intermediate step, assembly language. While object code will only work on a specific platform, source code can be ported to a different machine and recompiled there. For the same source code, object code can vary significantly—not only based on the machine for which it is compiled, but also based on performance optimization from the compiler. [8] [9]
Most programs do not contain all the resources needed to run them and rely on external libraries. Part of the compiler's function is to link these files in such a way that the program can be executed by the hardware. [10]
Software developers often use configuration management to track changes to source code files (version control). The configuration management system also keeps track of which object code file corresponds to which version of the source code file. [11]
The number of lines of source code is often used as a metric when evaluating the productivity of computer programmers, the economic value of a code base, effort estimation for projects in development, and the ongoing cost of software maintenance after release. [12]
Source code is also used to communicate algorithms between people – e.g., code snippets online or in books. [13]
Computer programmers may find it helpful to review existing source code to learn about programming techniques. [13] The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills. [13] Some people consider source code an expressive artistic medium. [14]
Source code often contains comments—blocks of text marked for the compiler to ignore. This content is not part of the program logic, but is instead intended to help readers understand the program. [15]
Companies often keep the source code confidential in order to hide algorithms considered a trade secret. Proprietary, secret source code and algorithms are widely used for sensitive government applications such as criminal justice, which results in black box behavior with a lack of transparency into the algorithm's methodology. The result is avoidance of public scrutiny of issues such as bias. [16]
Access to the source code (not just the object code) is essential to modifying it. [17] Understanding existing code is necessary to understand how it works [17] and before modifying it. [18] The rate of understanding depends both on the code base as well as the skill of the programmer. [19] Experienced programmers have an easier time understanding what the code does at a high level. [20] Software visualization is sometimes used to speed up this process. [21]
Many software programmers use an integrated development environment (IDE) to improve their productivity. IDEs typically have several features built in, including a source-code editor that can alert the programmer to common errors. [22] Modification often includes code refactoring (improving the structure without changing functionality) and restructuring (improving structure and functionality at the same time). [23] Nearly every change to code will introduce new bugs or unexpected ripple effects, which require another round of fixes. [18]
Code reviews by other developers are often used to scrutinize new code added to a project. [24] The purpose of this phase is often to verify that the code meets style and maintainability standards and that it is a correct implementation of the software design. [25] According to some estimates, code review dramatically reduce the number of bugs persisting after software testing is complete. [24] Along with software testing that works by executing the code, static program analysis uses automated tools to detect problems with the source code. Many IDEs support code analysis tools, which might provide metrics on the clarity and maintainability of the code. [26] Debuggers are tools that often enable programmers to step through execution while keeping track of which source code corresponds to each change of state. [27]
Source code files in a high-level programming language must go through a stage of preprocessing into machine code before the instructions can be carried out. [7] After being compiled, the program can be saved as an object file and the loader (part of the operating system) can take this saved file and execute it as a process on the computer hardware. [10] Some programming languages use an interpreter instead of a compiler. An interpreter converts the program into machine code at run time, which makes them 10 to 100 times slower than compiled programming languages. [22] [28]
Software quality is an overarching term that can refer to a code's correct and efficient behavior, its reusability and portability, or the ease of modification. [29] It is usually more cost-effective to build quality into the product from the beginning rather than try to add it later in the development process. [30] Higher quality code will reduce lifetime cost to both suppliers and customers as it is more reliable and easier to maintain. [31] [32]
Maintainability is the quality of software enabling it to be easily modified without breaking existing functionality. [33] Following coding conventions such as using clear function and variable names that correspond to their purpose makes maintenance easier. [34] Use of conditional loop statements only if the code could execute more than once, and eliminating code that will never execute can also increase understandability. [35] Many software development organizations neglect maintainability during the development phase, even though it will increase long-term costs. [32] Technical debt is incurred when programmers, often out of laziness or urgency to meet a deadline, choose quick and dirty solutions rather than build maintainability into their code. [36] A common cause is underestimates in software development effort estimation, leading to insufficient resources allocated to development. [37] A challenge with maintainability is that many software engineering courses do not emphasize it. [38] Development engineers who know that they will not be responsible for maintaining the software do not have an incentive to build in maintainability. [18]
The situation varies worldwide, but in the United States before 1974, software and its source code was not copyrightable and therefore always public domain software. [39] In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright". [40] [41]
Proprietary software is rarely distributed as source code. [42] Although the term open-source software literally refers to public access to the source code, [43] open-source software has additional requirements: free redistribution, permission to modify the source code and release derivative works under the same license, and nondiscrimination between different uses—including commercial use. [44] [45] The free reusability of open-source software can speed up development. [46]
Software consists of computer programs that instruct the execution of a computer. Software also includes design documents and specifications.
Computer programming or coding is the composition of sequences of instructions, called programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of procedures, by writing code in one or more programming languages. Programmers typically use high-level programming languages that are more easily intelligible to humans than machine code, which is directly executed by the central processing unit. Proficient programming usually requires expertise in several different subjects, including knowledge of the application domain, details of programming languages and generic code libraries, specialized algorithms, and formal logic.
In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a low-level programming language to create an executable program.
A computer program is a sequence or set of instructions in a programming language for a computer to execute. It is one component of software, which also includes documentation and other intangible components.
A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their syntax (form) and semantics (meaning), usually defined by a formal language. Languages usually provide features such as a type system, variables, and mechanisms for error handling. An implementation of a programming language is required in order to execute programs, namely an interpreter or a compiler. An interpreter directly executes the source code, while a compiler produces an executable program.
In computing, a virtual machine (VM) is the virtualization or emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve specialized hardware, software, or a combination of the two. Virtual machines differ and are organized by their function, shown here:
In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:
In computer science, a library is a collection of resources that is leveraged during software development to implement a computer program.
In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate significant areas of computing systems, making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.
A low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture; commands or functions in the language are structurally similar to a processor's instructions. Generally, this refers to either machine code or assembly language. Because of the low abstraction between the language and machine language, low-level languages are sometimes described as being "close to the hardware". Programs written in low-level languages tend to be relatively non-portable, due to being optimized for a certain type of system architecture.
A programming paradigm is a relatively high-level way to conceptualize and structure the implementation of a computer program. A programming language can be classified as supporting one or more paradigms.
In software systems, encapsulation refers to the bundling of data with the mechanisms or methods that operate on the data. It may also refer to the limiting of direct access to some of that data, such as an object's components. Essentially, encapsulation prevents external code from being concerned with the internal workings of an object.
Software development is the process of designing and implementing a software solution to satisfy a user. The process is more encompassing than programming, writing code, in that it includes conceiving the goal, evaluating feasibility, analyzing requirements, design, testing and release. The process is part of software engineering which also includes organizational management, project management, configuration management and other aspects.
An object file is a file that contains machine code or bytecode, as well as other data and metadata, generated by a compiler or assembler from source code during the compilation or assembly process. The machine code that is generated is known as object code.
A programming tool or software development tool is a computer program that is used to develop another computer program, usually by helping the developer manage computer files. For example, a programmer may use a tool called a source code editor to edit source code files, and then a compiler to convert the source code into machine code files. They may also use build tools that automatically package executable program and data files into shareable packages or install kits.
Hardware abstractions are sets of routines in software that provide programs with access to hardware resources through programming interfaces. The programming interface allows all devices in a particular class C of hardware devices to be accessed through identical interfaces even though C may contain different subclasses of devices that each provide a different hardware interface.
Software maintenance is the modification of software after delivery.
A system architecture is the conceptual model that defines the structure, behavior, and views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures and behaviors of the system.
An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build such a connection or interface is called an API specification. A computer system that meets this standard is said to implement or expose an API. The term API may refer either to the specification or to the implementation.
PWCT is a free open source visual programming language for software development. The project was founded in December 2005 as a free open-source project that supports designing applications through visual programming then generating the source code. The software supports code generation in many textual programming languages.