Self (programming language)

Paradigms: object-oriented (prototype-based)
Family: Smalltalk
Designed by: David Ungar, Randall Smith
Developers: David Ungar, Randall Smith, Stanford University, Sun Microsystems
First appeared: 1987
Stable release: 2024.1 / August 28, 2024
Typing discipline: dynamic, strong
OS: Cross-platform: Unix-like, macOS, Windows
License: BSD-like
Website: www.selflanguage.org
Major implementations: Self
Influenced by: Smalltalk, APL [1]
Influenced: NewtonScript, JavaScript, Io, Agora, Squeak, Lua, Factor, Rebol

Self is a general-purpose, high-level, object-oriented programming language based on the concept of prototypes. Self began as a dialect of Smalltalk: it is dynamically typed and uses just-in-time compilation (JIT), but takes a prototype-based approach to objects. It was first used as an experimental test system for language design in the 1980s and 1990s. In 2006, Self was still being developed as part of the Klein project, a Self virtual machine written fully in Self. The latest version, 2024.1, was released in August 2024. [2]

Several just-in-time compilation techniques were pioneered and improved in Self research, as they were required to allow a very high-level object-oriented language to perform at up to half the speed of optimized C. Much of the development of Self took place at Sun Microsystems, and the techniques developed there were later deployed for Java's HotSpot virtual machine.

At one point a version of Smalltalk was implemented in Self. Because it was able to use the JIT, this also gave extremely good performance. [3]

History

Self was designed mostly by David Ungar and Randall Smith in 1986 while working at Xerox PARC. Their objective was to advance the state of the art in object-oriented programming language research, once Smalltalk-80 had been released by the labs and had begun to be taken seriously by the industry. They moved to Stanford University and continued work on the language, building the first working Self compiler in 1987. The focus then shifted to building a full system for Self, rather than only the language.

The first public release was in 1990, and the next year the team moved to Sun Microsystems, where they continued work on the language. Several new releases followed until the project fell largely dormant in 1995 with version 4.0. In 2006, version 4.3 was released for Mac OS X and Solaris. In 2010, a new release, version 4.4, [4] was developed by a group comprising some of the original team and independent programmers, for Mac OS X and Linux, as are all later versions. In January 2014, a follow-up, version 4.5, was released, [5] and three years later, in May 2017, version 2017.1 was released.

The Morphic user interface construction environment was originally developed by Randy Smith and John Maloney for the Self programming language. [6] Morphic has been ported to other notable programming languages including Squeak, JavaScript, Python, and Objective-C.

Self also inspired a number of languages based on its concepts. Most notable, perhaps, were NewtonScript for the Apple Newton and JavaScript, used in all modern browsers. Other examples include Io, Lisaac and Agora. The IBM Tivoli Framework's distributed object system, developed in 1990, was, at the lowest level, a prototype-based object system inspired by Self.

Prototype-based programming languages

Traditional class-based OO languages are based on a deep-rooted duality:

  1. Classes define the basic qualities and behaviours of objects.
  2. Object instances are particular manifestations of a class.

For example, suppose objects of the Vehicle class have a name and the ability to perform various actions, such as drive to work and deliver construction materials. Bob's car is a particular object (instance) of the class Vehicle, with the name "Bob's car". In theory one can then send a message to Bob's car, telling it to deliver construction materials.

This example shows one of the problems with this approach: Bob's car, which happens to be a sports car, is not able to carry and deliver construction materials (in any meaningful sense), but this is a capability that Vehicles are modelled to have. A more useful model arises from the use of subclassing to create specializations of Vehicle; for example Sports Car and Flatbed Truck. Only objects of the class Flatbed Truck need provide a mechanism to deliver construction materials; sports cars, which are ill-suited to that sort of work, need only drive fast. However, this deeper model requires more insight during design, insight that may only come to light as problems arise.

This issue is one of the motivating factors behind prototypes. Unless one can predict with certainty what qualities a set of objects and classes will have in the distant future, one cannot design a class hierarchy properly. All too often the program would eventually need added behaviours, and sections of the system would need to be re-designed (or refactored) to break out the objects in a different way. [citation needed] Experience with early OO languages like Smalltalk showed that this sort of issue came up again and again. Systems would tend to grow to a point and then become very rigid, as the basic classes deep below the programmer's code grew to be simply "wrong". Without some way to easily change the original class, serious problems could arise. [citation needed]

Dynamic languages such as Smalltalk allowed for this sort of change via well-known methods in the classes; by changing the class, the objects based on it would change their behaviour. However, such changes had to be done very carefully, as other objects based on the same class might be expecting this "wrong" behavior: "wrong" is often dependent on the context. (This is one form of the fragile base class problem.) Further, in languages like C++, where subclasses can be compiled separately from superclasses, a change to a superclass can actually break precompiled subclass methods. (This is another form of the fragile base class problem, and also one form of the fragile binary interface problem.)

In Self, and other prototype-based languages, the duality between classes and object instances is eliminated.

Instead of having an "instance" of an object that is based on some "class", in Self one makes a copy of an existing object, and changes it. So Bob's car would be created by making a copy of an existing "Vehicle" object, and then adding the drive fast method, modelling the fact that it happens to be a Porsche 911. Basic objects that are used primarily to make copies are known as prototypes. This technique is claimed to greatly simplify dynamism. If an existing object (or set of objects) proves to be an inadequate model, a programmer may simply create a modified object with the correct behavior, and use that instead. Code which uses the existing objects is not changed.
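The copy-and-modify idea can be pictured with a short Python analogy. This is only a sketch and not Self code: the Vehicle class, the flatbed object and the method names are hypothetical stand-ins for the article's example, and Python's copy.copy plays the role of Self's copy message.

```python
import copy


class Vehicle:
    """Hypothetical stand-in for the article's Vehicle example."""

    def __init__(self, name):
        self.name = name

    def deliver(self):
        return f"{self.name} delivers construction materials"


# An existing, fully working object that will serve as the prototype.
flatbed = Vehicle("flatbed truck")

# Prototype style: copy an existing object, then change only the copy.
bobs_car = copy.copy(flatbed)
bobs_car.name = "Bob's car"
bobs_car.deliver = lambda: "Bob's car cannot carry construction materials"
bobs_car.drive_fast = lambda: "Bob's car drives fast"

print(flatbed.deliver())               # the original object is unchanged
print(bobs_car.deliver())              # only the copy behaves differently
print(hasattr(flatbed, "drive_fast"))  # the new behaviour exists only on the copy
```

Code that already uses flatbed keeps working, because nothing it depends on was edited.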

Description

A Self object is a collection of "slots". Slots are accessor methods that return values, and placing a colon after the name of a slot sets the value. For example, for a slot called "name",

myPerson name

returns the value in name, and

myPerson name: 'foo'

sets it.

Self, like Smalltalk, uses blocks for flow control and other duties. Methods are objects containing code in addition to slots (which they use for arguments and temporary values), and can be placed in a Self slot just like any other object: a number for example. The syntax remains the same in either case.

Note that there is no distinction in Self between fields and methods: everything is a slot. Since accessing slots via messages forms the majority of the syntax in Self, many messages are sent to "self", and the "self" can be left off (hence the name).
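The uniform treatment of data and methods can be sketched in Python as follows. This is an illustrative model only, assuming a hypothetical SlotObject class with a send method; it does not reflect how the Self virtual machine stores slots.

```python
class SlotObject:
    """Hypothetical stand-in for a Self object: just a dictionary of slots."""

    def __init__(self, **slots):
        self.slots = dict(slots)

    def send(self, selector, *args):
        # "name:" with one argument assigns to the data slot "name".
        if selector.endswith(":") and selector[:-1] in self.slots:
            self.slots[selector[:-1]] = args[0]
            return self
        value = self.slots[selector]
        # A method slot runs its code; a data slot simply returns its value.
        return value(self, *args) if callable(value) else value


my_person = SlotObject(
    name="nobody",
    greeting=lambda me: "Hello, " + me.send("name"),
)

my_person.send("name:", "foo")
print(my_person.send("name"))      # reads a data slot
print(my_person.send("greeting"))  # runs a method slot, with the same syntax
```

The caller cannot tell from the message whether "name" is stored or computed, which is the point the paragraph above makes.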

Basic syntax

The syntax for accessing slots is similar to that of Smalltalk. Three kinds of messages are available:

unary
receiver slot_name
binary
receiver + argument
keyword
receiver keyword: arg1 With: arg2

All messages return results, so the receiver (if present) and arguments can be themselves the result of other messages. Following a message by a period means Self will discard the returned value. For example:

'Hello, World!' print.

This is the Self version of the "Hello, World!" program. The ' syntax indicates a literal string object. Other literals include numbers, blocks and general objects.

Grouping can be forced by using parentheses. In the absence of explicit grouping, unary messages have the highest precedence, followed by binary messages (grouping left to right), with keyword messages having the lowest. The use of keywords for assignment would lead to some extra parentheses where expressions also had keyword messages, so to avoid that Self requires that the first part of a keyword message selector start with a lowercase letter, and subsequent parts start with an uppercase letter.

valid: base bottom between: ligature bottom + height And: base top / scale factor.

can be parsed unambiguously, and means the same as:

valid: ((base bottom)
            between: ((ligature bottom) + height)
            And: ((base top) / (scale factor))).

In Smalltalk-80, the same expression would be written as:

valid := self base bottom between: self ligature bottom + self height and: self base top / self scale factor.

assuming base, ligature, height and scale were not instance variables of self but were, in fact, methods.

Making new objects

Consider a slightly more complex example:

labelWidget copy label: 'Hello, World!'.

This makes a copy of the "labelWidget" object with the copy message (no shortcut this time), then sends it a message to put "Hello, World" into the slot called "label". Now to do something with it:

(desktop activeWindow) draw: (labelWidget copy label: 'Hello, World!').

In this case the (desktop activeWindow) is performed first, returning the active window from the list of windows that the desktop object knows about. Next (read inner to outer, left to right) the code we examined earlier returns the labelWidget. Finally the widget is sent into the draw slot of the active window.

Delegation

In theory, every Self object is a stand-alone entity. Self has neither classes nor meta-classes. Changes to a particular object do not affect any other, but in some cases it is desirable that they do. Normally an object can understand only messages corresponding to its local slots, but by having one or more slots indicating parent objects, an object can delegate any message it does not understand itself to the parent object. Any slot can be made a parent pointer by adding an asterisk as a suffix. In this way Self handles duties that would use inheritance in class-based languages. Delegation can also be used to implement features such as namespaces and lexical scoping.
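Delegation through parent slots can be sketched in Python as below. This is a simplified model under stated assumptions: SlotObject, lookup and send are hypothetical names, a slot whose name ends in an asterisk is treated as a parent, and the first parent that understands the message wins (the sketch does not try to detect ambiguity between several parents or cycles in the parent graph).

```python
class SlotObject:
    """Hypothetical model of a Self object: slots plus parent slots marked '*'."""

    def __init__(self, slots):
        self.slots = dict(slots)

    def lookup(self, selector):
        # Local slots are searched first.
        if selector in self.slots:
            return self.slots[selector]
        # Otherwise the message is delegated to each parent slot in turn.
        for name, value in self.slots.items():
            if name.endswith("*"):
                found = value.lookup(selector)
                if found is not None:
                    return found
        return None

    def send(self, selector, *args):
        value = self.lookup(selector)
        if value is None:
            raise AttributeError("message not understood: " + selector)
        # The method runs with the ORIGINAL receiver, not the parent.
        return value(self, *args) if callable(value) else value


def deposit(me, amount):
    me.slots["balance"] = me.send("balance") + amount
    return me.send("balance")


account_traits = SlotObject({"deposit:": deposit})
bobs_account = SlotObject({"parent*": account_traits, "balance": 100})

print(bobs_account.send("deposit:", 50))  # found in the parent, applied to the child
```

The key design point is that the parent supplies the code while the receiver supplies the data: "balance" is read from and written to bobs_account, even though "deposit:" lives in account_traits.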

For example, suppose an object called "bank account" is defined for use in a simple bookkeeping application. Usually, this object would be created with the methods inside, perhaps "deposit" and "withdraw", and any data slots needed by them. This is a prototype, which is only special in the way it is used, since it also happens to be a fully functional bank account.

Traits

Making a clone of this object for "Bob's account" will create a new object which starts out exactly like the prototype. In this case we have copied the slots including the methods and any data. However, a more common solution is to first make a simpler object called a traits object, which contains the items that one would normally associate with a class.

In this example the "bank account" object would not have the deposit and withdraw methods, but would have as a parent an object that did. In this way many copies of the bank account object can be made, but the behaviour of them all can still be changed by changing the slots in that parent object.

How is this any different from a traditional class? Consider the meaning of:

myObject parent: someOtherObject.

This excerpt changes the "class" of myObject at runtime by changing the value associated with the 'parent*' slot (the asterisk is part of the slot name, but not the corresponding messages). Unlike with inheritance or lexical scoping, the delegate object can be modified at runtime.
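A minimal Python sketch of re-parenting at run time follows. The SlotObject class and the two traits objects are hypothetical, and the model assumes a single parent slot named "parent*" for brevity.

```python
class SlotObject:
    """Hypothetical single-parent model: lookup walks the 'parent*' chain."""

    def __init__(self, slots):
        self.slots = dict(slots)

    def send(self, selector, *args):
        obj = self
        while obj is not None:
            if selector in obj.slots:
                value = obj.slots[selector]
                # Methods always run against the original receiver.
                return value(self, *args) if callable(value) else value
            obj = obj.slots.get("parent*")
        raise AttributeError("message not understood: " + selector)


car_traits = SlotObject({"describe": lambda me: me.send("name") + " drives fast"})
truck_traits = SlotObject({"describe": lambda me: me.send("name") + " hauls cargo"})

my_object = SlotObject({"parent*": car_traits, "name": "Bob's vehicle"})
before = my_object.send("describe")

# Analogue of `myObject parent: someOtherObject.`: swap the delegate in place.
my_object.slots["parent*"] = truck_traits
after = my_object.send("describe")

print(before)
print(after)
```

The object keeps its identity and its own data; only the place where unknown messages are forwarded has changed.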

Adding slots

Objects in Self can be modified to include additional slots. This can be done using the graphical programming environment, or with the primitive '_AddSlots:'. A primitive has the same syntax as a normal keyword message, but its name starts with the underscore character. The _AddSlots primitive should be avoided because it is a leftover from early implementations. However, we will show it in the example below because it makes the code shorter.

An earlier example was about refactoring a simple class called Vehicle in order to be able to differentiate the behaviour between cars and trucks. In Self one would accomplish this with something like this:

_AddSlots: (| vehicle <- (| parent* = traits clonable |) |).

Since the receiver of the '_AddSlots:' primitive isn't indicated, it is "self". In the case of expressions typed at the prompt, that is an object called the "lobby". The argument for '_AddSlots:' is the object whose slots will be copied over to the receiver. In this case it is a literal object with exactly one slot. The slot's name is 'vehicle' and its value is another literal object. The "<-" notation implies a second slot called 'vehicle:' which can be used to change the first slot's value.

The "=" indicates a constant slot, so there is no corresponding 'parent:'. The literal object that is the initial value of 'vehicle' includes a single slot so it can understand messages related to cloning. A truly empty object, indicated as (| |) or more simply as (), cannot receive any messages at all.

vehicle _AddSlots: (| name <- 'automobile' |).

Here the receiver is the previous object, which now will include 'name' and 'name:' slots in addition to 'parent*'.

_AddSlots: (| sportsCar <- vehicle copy |).
sportsCar _AddSlots: (| driveToWork = (''some code, this is a method'') |).

Though previously 'vehicle' and 'sportsCar' were exactly alike, now the latter includes a new slot with a method that the original doesn't have. Methods can only be included in constant slots.

_AddSlots: (| porsche911 <- sportsCar copy |).
porsche911 name: 'Bobs Porsche'.

The new object 'porsche911' started out exactly like 'sportsCar', but the last message changed the value of its 'name' slot. Note that both still have exactly the same slots even though one of them has a different value.
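The sequence of steps in this section can be mirrored in Python to make the resulting slot layout easy to inspect. This is a rough analogy, not a model of the real primitive: SlotObject, add_slots and clone are hypothetical names, and the parent slot and the distinction between constant and assignable slots are left out.

```python
class SlotObject:
    """Hypothetical model of a Self object as a dictionary of slots."""

    def __init__(self, slots=None):
        self.slots = dict(slots or {})

    def add_slots(self, new_slots):
        # Rough analogue of `_AddSlots:`: copy the argument's slots into the receiver.
        self.slots.update(new_slots)
        return self

    def clone(self):
        # Rough analogue of `copy`: a new object with the same slots.
        return SlotObject(self.slots)


vehicle = SlotObject().add_slots({"name": "automobile"})

sports_car = vehicle.clone()
sports_car.add_slots({"driveToWork": lambda me: "some code, this is a method"})

porsche911 = sports_car.clone()
porsche911.slots["name"] = "Bobs Porsche"

print(sorted(vehicle.slots))     # no driveToWork slot here
print(sorted(sports_car.slots))  # same slot names as porsche911
print(sorted(porsche911.slots))
print(sports_car.slots["name"], "/", porsche911.slots["name"])
```

Printing the slot names shows the two facts stated above: vehicle never gains driveToWork, while sports_car and porsche911 share the same slots and differ only in the value of name.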

Environment

One feature of Self is that it is based on the same sort of virtual machine system that earlier Smalltalk systems used. That is, programs are not stand-alone entities as they are in languages such as C, but need their entire memory environment in order to run. This requires that applications be shipped in chunks of saved memory known as snapshots or images. One disadvantage of this approach is that images are sometimes large and unwieldy; however, debugging an image is often simpler than debugging traditional programs because the runtime state is easier to inspect and modify. (The difference between source-based and image-based development is analogous to the difference between class-based and prototypical object-oriented programming.)

In addition, the environment is tailored to the rapid and continual change of the objects in the system. Refactoring a "class" design is as simple as dragging methods out of the existing ancestors into new ones. Simple tasks like test methods can be handled by making a copy, dragging the method into the copy, then changing it. Unlike traditional systems, only the changed object has the new code, and nothing has to be rebuilt in order to test it. If the method works, it can simply be dragged back into the ancestor.

Performance

Self VMs achieved performance of approximately half the speed of optimised C on some benchmarks. [7]

This was achieved by just-in-time compilation techniques which were pioneered and improved in Self research to make a high-level language perform this well.

Garbage collection

The garbage collector for Self uses generational garbage collection, which segregates objects by age. By using the memory management system to record page writes, a write barrier can be maintained. This technique gives excellent performance, although after running for some time a full garbage collection can occur, taking considerable time. [vague]

Optimizations

The run-time system selectively flattens call structures. This gives modest speedups in itself, but allows extensive caching of type information and multiple versions of code for different caller types. This removes the need to do many method lookups and permits conditional branch statements and hard-coded calls to be inserted, often giving C-like performance with no loss of generality at the language level, but on a fully garbage-collected system. [8]
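One way to picture how cached type information removes method lookups is a per-call-site cache keyed by receiver type. The Python sketch below illustrates that general idea only; it is not the Self compiler's implementation, and CallSite, Square and Rect are hypothetical names.

```python
class CallSite:
    """One call site with a small cache from receiver type to resolved method."""

    def __init__(self, selector):
        self.selector = selector
        self.cache = {}   # receiver type -> function found by the slow lookup
        self.lookups = 0  # number of slow-path lookups performed

    def call(self, receiver, *args):
        kind = type(receiver)
        target = self.cache.get(kind)
        if target is None:
            # Slow path: do the full method lookup once per receiver type.
            self.lookups += 1
            target = getattr(kind, self.selector)
            self.cache[kind] = target
        # Fast path: a type check followed by a direct call.
        return target(receiver, *args)


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rect:
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height


site = CallSite("area")
shapes = [Square(2), Rect(2, 3), Square(3), Rect(1, 1)]
total = sum(site.call(shape) for shape in shapes)

print(total, site.lookups)
```

Four calls need only two lookups here, one per receiver type; a compiler can go further and turn each cached entry into a type test followed by a hard-coded or inlined call.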

References

  1. Ungar, David; Smith, Randall B. (2007). "Self". Proceedings of the third ACM SIGPLAN conference on History of programming languages. doi:10.1145/1238844.1238853. ISBN 9781595937667. S2CID 220937663.
  2. "Self "Mandarin" 2017.1". GitHub. 24 May 2017. Retrieved 1 November 2024.
  3. Wolczko, Mario (1996). self includes: Smalltalk. Workshop on Prototype-Based Languages, ECOOP '96. Linz, Austria.
  4. "Self 4.4 released". 16 July 2010. Archived from the original on 5 December 2017. Retrieved 24 May 2017.
  5. "Self Mallard (4.5.0) released". 12 January 2014. Archived from the original on 6 December 2017. Retrieved 24 May 2017.
  6. Maloney, John H.; Smith, Randall B. (1995). "Directness and liveness in the morphic user interface construction environment". Proceedings of the 8th annual ACM symposium on User interface and software technology. pp. 21–28. doi:10.1145/215585.215636. ISBN 089791709X. S2CID 14479674. Retrieved 24 March 2020.
  7. Agesen, Ole (March 1997). "Design and Implementation of Pep, a Java Just-In-Time Translator". Theory and Practice of Object Systems. 3 (2): 127–155. doi:10.1002/(SICI)1096-9942(1997)3:2<127::AID-TAPO4>3.0.CO;2-S. Archived from the original on November 24, 2006.
  8. Chambers, Craig (March 13, 1992). The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages (PDF) (PhD thesis). Stanford University.

Further reading