Source code

A simple source code example in C, a procedural programming language. The resulting program prints "hello, world" on the computer screen. This first known "Hello world" snippet, from the seminal book The C Programming Language, originates from Brian Kernighan at Bell Laboratories in 1974.

In computing, source code, or simply code, is text (usually plain text) that conforms to a human-readable programming language and specifies the behavior of a computer. A programmer writes code to produce a program that runs on a computer.

Because a computer ultimately executes only machine code, source code must be translated before the computer can use it, and this translation can be implemented in several ways depending on the available technology. A compiler or an assembler can convert source code into machine code that can be executed directly. Alternatively, an interpreter can process source code without converting it to machine code: the interpreter's own machine code performs the actions that the source code prescribes. Other technology (e.g., bytecode) combines both mechanisms: the source code is converted to an intermediate form that is often not human-readable but is also not machine code, and an interpreter executes that intermediate form.
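
The bytecode route can be illustrated with CPython, whose built-in compile function turns source text into a code object that the interpreter then executes. The following is a minimal sketch only: the sample program in SOURCE and the helper name run_source are assumptions made for illustration, and the exact bytecode listing differs between Python versions.

```python
import dis

SOURCE = "result = 6 * 7\n"  # hypothetical one-line program used for illustration


def run_source(source: str) -> dict:
    """Translate source text to bytecode, then let the interpreter execute it."""
    # Step 1: source text -> intermediate form (a code object holding bytecode).
    code = compile(source, "<example>", "exec")
    # Step 2: the interpreter executes the intermediate form.
    namespace: dict = {}
    exec(code, namespace)
    return namespace


if __name__ == "__main__":
    code = compile(SOURCE, "<example>", "exec")
    print(type(code.co_code))  # the raw bytecode is a bytes object, not text
    dis.dis(code)              # human-readable listing of that bytecode
    print(run_source(SOURCE)["result"])
```

The code object is neither the original text nor machine code for the processor, which is the intermediate position the paragraph above describes.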

Most languages allow comments. A programmer can add comments to document the source code for themselves and for other programmers reading the code. Comments do not affect what the program does, so compilers, interpreters and the like ignore them.
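
One way to observe that comments are ignored is to parse two versions of the same program, with and without comments, and compare the resulting syntax trees. The sketch below uses Python's standard ast module; the two sample programs and the helper names structure and same_program are assumptions introduced for illustration.

```python
import ast

WITH_COMMENTS = """\
# Compute the area of a rectangle.
width = 3   # in metres
height = 4  # in metres
area = width * height
"""

WITHOUT_COMMENTS = """\
width = 3
height = 4
area = width * height
"""


def structure(source: str) -> str:
    """Return a textual dump of the parsed program, without line/column data."""
    return ast.dump(ast.parse(source))


def same_program(a: str, b: str) -> bool:
    """True when both sources parse to the same syntax tree."""
    return structure(a) == structure(b)


if __name__ == "__main__":
    print(same_program(WITH_COMMENTS, WITHOUT_COMMENTS))
```

Because the parser discards comments before building the tree, the later translation stages never see them.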

Often, the source code of application software is not distributed or publicly available because the producer wants to protect its intellectual property (IP). If the source code is available (open source), however, it can be useful to a user, programmer or system administrator, any of whom might wish to study or modify the program.

Definitions

Richard Stallman's definition, formulated in his 1989 seminal license, proposed source code as whatever form in which software is modified:

The "source code" for a work means the preferred form of the work for making modifications to it. [2]

Some classical sources define source code as the text form of programming languages, for example:

Source code (also referred to as source or code) is the version of software as it is originally written (i.e., typed into a computer) by a human in plain text (i.e., human readable alphanumeric characters). [3]

This reflects the fact that, when program translation first appeared, the contemporary form of software production was textual programming languages; source code was thus text code, while machine code was target code. However, as programming pipelines started to incorporate more intermediate forms, some in languages like JavaScript that could be either source or target, text code stopped being synonymous with source code.

Stallman's definition thus accommodates the source-target ambivalence of JavaScript and HTML, as well as possible future forms of software production, like visual programming languages or datasets in machine learning. [4] [5]

Other, broader interpretations, however, consider source code to include machine code along with all the high-level languages that produce it. This definition undoes the original machine/text distinction by considering each step in the program translation to be source code.

For the purpose of clarity "source code" is taken to mean any fully executable description of a software system. It is therefore so construed as to include machine code, very high level languages and executable graphical representations of systems. [6] [7]

This approach allows for much more flexible system analysis, dispensing with the requirement for the designer to collaborate by publishing a form convenient for understanding and modification. It can also be applied to scenarios where no designer is involved, like DNA. However, this form of analysis does not take into account that machine-to-machine code analysis is costlier than human-to-machine code analysis.

The earliest programs for stored-program computers were entered in binary through the front panel switches of the computer. This first-generation programming language had no distinction between source code and machine code.

When IBM first offered software to work with its machines, the source code was provided at no additional charge. At that time, the cost of developing and supporting software was included in the price of the hardware. IBM distributed source code with its software product licenses for decades, until 1983. [8]

Most early computer magazines published source code as type-in programs.

Occasionally the entire source code to a large program is published as a hardback book, such as Computers and Typesetting, vol. B: TeX, The Program by Donald Knuth, PGP Source Code and Internals by Philip Zimmermann, PC SpeedScript by Randy Thompson, and µC/OS, The Real-Time Kernel by Jean Labrosse.

Organization

The source code that constitutes a program is usually held in one or more text files stored on a computer file system. A larger codebase may be organized in a directory tree known as a source tree. Source code can also be stored in a database, as is common for stored procedures, or elsewhere.
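
As a rough illustration of treating a source tree as a directory hierarchy, the sketch below walks a root directory and counts files per extension. The function name count_by_extension and the "(none)" label for files without an extension are assumptions made for this example.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root: str) -> Counter:
    """Walk a source tree and count regular files per file extension."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):  # visits every entry below root
        if path.is_file():
            counts[path.suffix or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Summarize the tree rooted at the current directory.
    for extension, total in count_by_extension(".").most_common():
        print(extension, total)
```

Run against a C project, such a summary would typically show .c and .h files alongside extensionless build files such as a Makefile.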

A more complex source code example in Java. Written in an object-oriented programming style, it demonstrates boilerplate code, with prologue comments indicated in red, inline comments indicated in green, and program statements indicated in blue.

A program's source code can be written in multiple programming languages. [9] For example, it is not uncommon for a program written primarily in C to have portions written in assembly language for optimization purposes.

Some languages allow multiple languages in the same file, for example, a block of assembly embedded in a C file.

Library linking allows components to be written and compiled separately, sometimes in multiple languages, and later integrated into a program. For example, with Java, classes are compiled into separate files that are linked together by the interpreter at runtime. Microsoft Windows supports programs built from DLLs, each of which can be written in any language that can be compiled to a DLL. Similarly, Microsoft .NET supports programs built from .NET assemblies, each of which can be written in any .NET language.

A program's entry point can be an interpreter. The interpreter could be designed for an application-specific, custom language or for a general-purpose language, so that the interpreter can be used for multiple applications. [10]
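
To make the idea of an application-specific interpreter concrete, here is a minimal sketch of one for a made-up stack language. The language itself (whitespace-separated numbers and the operators + - * /) and the function name interpret are hypothetical and exist only for this example.

```python
from typing import List


def interpret(program: str) -> List[float]:
    """Run a program in a hypothetical stack language; return the final stack.

    Tokens are separated by whitespace. Numbers are pushed onto the stack;
    the operators + - * / pop two operands and push the result.
    """
    operations = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack: List[float] = []
    for token in program.split():
        if token in operations:
            if len(stack) < 2:
                raise ValueError(f"operator {token!r} needs two operands")
            right = stack.pop()
            left = stack.pop()
            stack.append(operations[token](left, right))
        else:
            stack.append(float(token))  # raises ValueError on unknown tokens
    return stack


if __name__ == "__main__":
    print(interpret("3 4 + 2 *"))
```

The program text is never converted to machine code; the interpreter's own code performs each action as it reads the tokens.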

Typically, source code is stored in a version control system.

To produce runnable software, a complex codebase often requires building (compiling, assembling, ...) many source code files: hundreds, thousands or more. Instructions for building, such as a Makefile, are often kept under version control with the source code in the same repository. These build instruction files describe the relationships among the source code files and contain information about how they are to be built separately and then combined.
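
The relationships that build instructions describe can be modeled as a dependency graph, from which a valid build order follows. The sketch below is not how Make itself is implemented; it only illustrates the idea using the standard graphlib module (Python 3.9 or later). The file names in BUILD_RULES and the function name build_order are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical build description: each target maps to the files it is built from.
BUILD_RULES = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}


def build_order(rules: dict) -> list:
    """Return targets and sources ordered so that every dependency comes first."""
    return list(TopologicalSorter(rules).static_order())


if __name__ == "__main__":
    for step in build_order(BUILD_RULES):
        print(step)
```

Each object file appears after the sources it depends on, and the final program appears last, mirroring the "built separately and then combined" sequence.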

Purposes

Source code is primarily input to a computer process that ultimately results in controlling computer behavior. In other words, it is input to a compiler, interpreter or the like.

It is also used to communicate algorithms between people, e.g., as code snippets online or in books. [11]

Computer programmers may find it helpful to review existing source code to learn about programming techniques. [11] The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills. [11] Some people consider source code an expressive artistic medium. [12]

Porting software to another computer platform is usually prohibitively difficult and expensive without source code. One possible porting option without source code is binary translation. Another is emulation of the original platform, although this is often too computationally expensive and runs slowly. [13]

Decompilation is the process of converting machine code to a more usable form, often assembly code or high-level language source code.

Software reusability describes the practice of using existing software in another software system via a software library. Some may consider reuse to include adapting source code from one piece of software to another.

The situation varies worldwide, but in the United States before 1974, software and its source code were not copyrightable and were therefore always in the public domain. [14]

In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright". [15] [16]

In 1983, the United States court case Apple v. Franklin ruled that the same applied to object code, and that the Copyright Act gave computer programs the copyright status of literary works.

In 1999, in the United States court case Bernstein v. United States it was further ruled that source code could be considered a constitutionally protected form of free speech. Proponents of free speech argued that because source code conveys information to programmers, is written in a language, and can be used to share humor and other artistic pursuits, it is a protected form of communication. [17] [18] [19]

Licensing

Copyright notice example: [20]

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

An author of a non-trivial work like software [16] has several exclusive rights, among them the copyright for the source code and object code. [21] The author has the right and the ability to grant customers and users of the software some of these exclusive rights in the form of software licensing. Software, and its accompanying source code, can be associated with several licensing paradigms; the most important distinction is free software vs proprietary software. This is done by including a copyright notice that declares licensing terms. If no notice is found, then the default of "All rights reserved" is implied.

Generally speaking, software is free software if its users are free to use it for any purpose, study and change its source code, give or sell exact copies of it, and give or sell modified copies of it. Software is proprietary if it is distributed while the source code is kept secret, or is privately owned and restricted. One of the first software licenses to be published and to explicitly grant these freedoms was the GNU General Public License in 1989; the BSD license is another early example from 1990.

For proprietary software, the provisions of the various copyright laws, trade secrecy and patents are used to keep the source code closed. Additionally, many pieces of retail software come with an end-user license agreement (EULA) which typically prohibits decompilation, reverse engineering, analysis, modification, or circumventing of copy protection. Types of source code protection—beyond traditional compilation to object code—include code encryption, code obfuscation or code morphing.

Quality

The way a program is written can have important consequences for its maintainers. Coding conventions, which stress readability and some language-specific conventions, are aimed at the maintenance of the software source code, which involves debugging and updating. Other priorities, such as the speed of the program's execution, or the ability to compile the program for multiple architectures, often make code readability a less important consideration, since code quality generally depends on its purpose.

References

  1. Kernighan, Brian W. "Programming in C: A Tutorial" (PDF). Bell Laboratories, Murray Hill, N. J. Archived from the original (PDF) on 23 February 2015.
  2. "The GNU General Public License v3.0". GNU Project. Free Software Foundation. 29 June 2007. Archived from the original on 15 January 2024.
  3. "Source Code Definition". The Linux Information Project. 14 February 2006 [May 23, 2004]. Archived from the original on 3 October 2017.
  4. "What is Free Software?". GNU. Archived from the original on 3 July 2017. Retrieved 12 December 2015.
  5. Stallman, Richard (15 November 2017). "The JavaScript Trap". GNU Project. Retrieved 20 July 2022.
  6. "Why Source Code Analysis and Manipulation Will Always Be Important" by Mark Harman, 10th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2010). Timișoara, Romania, 12–13 September 2010.
  7. "SCAM Working Conference". Archived 29 September 2017 at the Wayback Machine.
  8. Martin Goetz (8 February 1988). "Object-code only: Is IBM playing fair?". Computerworld. Vol. 22, no. 6. p. 59. It was in 1983 that IBM reversed its 20-year-old policy of distributing source code with its software product licenses.
  9. "Extending and Embedding the Python Interpreter". Python documentation. Archived from the original on 3 October 2012. Retrieved 17 August 2014.
  10. Rouse, Margaret (12 August 2020). "Interpreter Method". Techopedia. Retrieved 4 August 2022.
  11. Spinellis, D: Code Reading: The Open Source Perspective. Addison-Wesley Professional, 2003. ISBN 0-201-79940-5
  12. "Art and Computer Programming". ONLamp.com (2005). Archived 20 February 2018 at the Wayback Machine.
  13. "Software Portability - CodeProject". www.codeproject.com. Retrieved 4 August 2022.
  14. Liu, Joseph P.; Dogan, Stacey L. (2005). "Copyright Law and Subject Matter Specificity: The Case of Computer Software". New York University Annual Survey of American Law. 61 (2). Archived from the original on 25 June 2021.
  15. Apple Computer, Inc. v. Franklin Computer Corporation Puts the Byte Back into Copyright Protection for Computer Programs Archived 7 May 2017 at the Wayback Machine in Golden Gate University Law Review Volume 14, Issue 2, Article 3 by Jan L. Nussbaum (January 1984)
  16. Lemley, Menell, Merges and Samuelson. Software and Internet Law, p. 34.
  17. "Info" (PDF). cr.yp.to. Archived (PDF) from the original on 7 June 2011. Retrieved 27 December 2019.
  18. Bernstein v. US Department of Justice Archived 4 April 2018 at the Wayback Machine on eff.org
  19. EFF at 25: Remembering the Case that established Code as Speech Archived 5 January 2018 at the Wayback Machine on EFF.org by Alison Dame-Boyle (16 April 2015)
  20. "License". www.apache.org. Archived from the original on 23 September 2015. Retrieved 27 December 2019.
  21. Hancock, Terry (29 August 2008). "What if copyright didn't apply to binary executables?". Free Software Magazine. Archived from the original on 25 January 2016. Retrieved 25 January 2016.
