Tree shaking

Last updated

In computing, tree shaking is a dead code elimination technique that is applied when optimizing code. [1] Often contrasted with traditional single-library dead code elimination techniques common to minifiers, tree shaking eliminates unused functions from across the bundle by starting at the entry point and only including functions that may be executed. [2] [3] It is succinctly described as "live code inclusion".

Contents

History

Dead code elimination in dynamic languages is a much harder problem than in static languages. The idea of a "treeshaker" originated in LISP [4] in the 1990s. The idea is that all possible execution flows of a program can be represented as a tree of function calls, so that functions that are never called can be eliminated.

The algorithm was applied to JavaScript in Google Closure Tools and then to Dart in the dart2js compiler also written by Google, presented by Bob Nystrom in 2012 [5] [3] and described by the book Dart in Action by author Chris Buckett in 2013:

When code is converted from Dart to JavaScript the compiler does 'tree shaking'. In JavaScript you have to add an entire library even if you only need it for one function, but thanks to tree shaking the Dart-derived JavaScript only includes the individual functions that you need from a library

Chris Buckett

The next wave of popularity of the term is attributed to Rich Harris's Rollup project [6] developed in 2015.

Relation to ECMAScript 6 modules

The popularity of tree shaking in JavaScript is based on the fact that in contrast to CommonJS modules, ECMAScript 6 module loading is static and thus the whole dependency tree can be deduced by statically parsing the syntax tree. Thus tree shaking becomes an easy problem. However, tree shaking does not only apply at the import/export level: it can also work at the statement level, depending on the implementation.[ citation needed ]

Related Research Articles

<span class="mw-page-title-main">Dylan (programming language)</span> Multi-paradigm programming language

Dylan is a multi-paradigm programming language that includes support for functional and object-oriented programming (OOP), and is dynamic and reflective while providing a programming model designed to support generating efficient machine code, including fine-grained control over dynamic and static behaviors. It was created in the early 1990s by a group led by Apple Computer.

OCaml is a general-purpose, high-level, multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez, and others.

<span class="mw-page-title-main">Interpreter (computing)</span> Program that executes source code without a separate compilation step

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation or object code and immediately execute that;
  3. Explicitly execute stored precompiled bytecode made by a compiler and matched with the interpreter's virtual machine.

Prototype-based programming is a style of object-oriented programming in which behavior reuse is performed via a process of reusing existing objects that serve as prototypes. This model can also be known as prototypal, prototype-oriented,classless, or instance-based programming.

In computer programming, the scope of a name binding is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts of the program, the name may refer to a different entity, or to nothing at all. Scope helps prevent name collisions by allowing the same name to refer to different objects – as long as the names have separate scopes. The scope of a name binding is also known as the visibility of an entity, particularly in older or more technical literature—this is in relation to the referenced entity, not the referencing name.

In computer science, a preprocessor is a program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which is often used by some subsequent programs like compilers. The amount and kind of processing done depends on the nature of the preprocessor; some preprocessors are only capable of performing relatively simple textual substitutions and macro expansions, while others have the power of full-fledged programming languages.

Bytecode is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references that encode the result of compiler parsing and performing semantic analysis of things like type, scope, and nesting depths of program objects.

In object-oriented programming languages, a mixin is a class that contains methods for use by other classes without having to be the parent class of those other classes. How those other classes gain access to the mixin's methods depends on the language. Mixins are sometimes described as being "included" rather than "inherited".

In some programming languages, eval, short for the English evaluate, is a function which evaluates a string as though it were an expression in the language, and returns a result; in others, it executes multiple lines of code as though they had been included instead of the line including the eval. The input to eval is not necessarily a string; it may be structured representation of code, such as an abstract syntax tree, or of special type such as code. The analog for a statement is exec, which executes a string as if it were a statement; in some languages, such as Python, both are present, while in other languages only one of either eval or exec is.

<span class="mw-page-title-main">History of programming languages</span>

The history of programming languages spans from documentation of early mechanical computers to modern tools for software development. Early programming languages were highly specialized, relying on mathematical notation and similarly obscure syntax. Throughout the 20th century, research in compiler theory led to the creation of high-level programming languages, which use a more accessible syntax to communicate instructions.

Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of the desired functionality.

In computer programming, an entry point is the place in a program where the execution of a program begins, and where the program has access to command line arguments.

A foreign function interface (FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written or compiled in another one. An FFI is often used in contexts where calls are made into a binary dynamic-link library.

Haxe is a high-level cross-platform programming language and compiler that can produce applications and source code for many different computing platforms from one code-base. It is free and open-source software, released under an MIT License. The compiler, written in OCaml, is released under the GNU General Public License (GPL) version 2.

TypeScript is a free and open-source high-level programming language developed by Microsoft that adds static typing with optional type annotations to JavaScript. It is designed for the development of large applications and transpiles to JavaScript. Because TypeScript is a superset of JavaScript, all JavaScript programs are syntactically valid TypeScript, but they can fail to type-check for safety reasons.

<span class="mw-page-title-main">Node.js</span> JavaScript runtime environment

Node.js is a cross-platform, open-source JavaScript runtime environment that can run on Windows, Linux, Unix, macOS, and more. Node.js runs on the V8 JavaScript engine, and executes JavaScript code outside a web browser.

Dart is a programming language designed by Lars Bak and Kasper Lund and developed by Google. It can be used to develop web and mobile apps as well as server and desktop applications.

<span class="mw-page-title-main">OpenLisp</span> Family of programming languages known for symbolic computation and its use of parentheses

OpenLisp is a programming language in the Lisp family developed by Christian Jullien from Eligis. It conforms to the international standard for ISLISP published jointly by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), ISO/IEC 13816:1997(E), revised to ISO/IEC 13816:2007(E).

<span class="mw-page-title-main">Elm (programming language)</span> Functional programming language

Elm is a domain-specific programming language for declaratively creating web browser-based graphical user interfaces. Elm is purely functional, and is developed with emphasis on usability, performance, and robustness. It advertises "no runtime exceptions in practice", made possible by the Elm compiler's static type checking.

References

  1. "Reduce JavaScript Payloads with Tree Shaking".
  2. Harris, Rich (15 January 2019). "Tree-shaking versus dead code elimination" . Retrieved 16 September 2020.
  3. 1 2 Ladd, Seth (29 January 2013). "Minification is not enough, you need tree shaking". Seth Ladd's Blog.
  4. comp.lang.lisp What's a treeshaker?
  5. Can Google Dart Solve JavaScript's Speed and Scale Problems?
  6. How To Clean Up Your JavaScript Build With Tree Shaking