MILEPOST GCC

Developer(s)	cTuning foundation / MILEPOST consortium
Initial release	2009
Stable release	4.4.x / May 21, 2010
Repository	github.com/ctuning/reproduce-milepost-project
Operating system	Cross-platform
Type	Compiler
License	GNU General Public License (version 3 or later)
Website	GitHub, online API, cTuning.org/ctuning-cc, cTuning.org/milepost-gcc

Last updated November 03, 2021

MILEPOST GCC is a free, community-driven, open-source, adaptive, self-tuning compiler. It combines the stable, production-quality GCC with the Interactive Compilation Interface and machine learning plugins to adapt automatically to any given architecture and program, and to predict profitable optimizations that improve program execution time, code size and compilation time.^[1]^[2] It is currently used and supported by academia and industry^[3] and is intended to open up research opportunities in automating compiler and architecture design and optimization.^[4]

MILEPOST GCC is currently part of the community-driven Collective Tuning Initiative (cTuning), which aims to enable self-tuning computing systems based on a collaborative open-source R&D infrastructure with unified interfaces, and to improve the quality and reproducibility of research on code and architecture optimization. MILEPOST GCC is connected to the Collective Optimization Database, which collects and reuses profitable optimization cases from the community and predicts high-quality optimizations based on statistical analysis of past optimization data.
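The idea of reusing past optimization cases to predict optimizations for a new program can be illustrated with a minimal sketch. This is not MILEPOST GCC's actual model or the Collective Optimization Database's schema: the feature vectors, the stored cases and the function names below are hypothetical, and a plain nearest-neighbour lookup stands in for whatever statistical model is really used. Only the GCC flags themselves are real.

```python
import math

# Hypothetical past optimization cases: each entry pairs a program feature
# vector (e.g. instruction count, loop count, branch ratio; all made up here)
# with the GCC flags that performed best for that program.
PAST_CASES = [
    ((120.0, 8.0, 0.30), ["-O3", "-funroll-loops"]),
    ((15.0, 2.0, 0.05), ["-Os"]),
    ((60.0, 20.0, 0.60), ["-O2", "-ftree-vectorize"]),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_flags(features, cases=PAST_CASES):
    """Return the flags recorded for the most similar past program."""
    nearest = min(cases, key=lambda case: euclidean(case[0], features))
    return list(nearest[1])


if __name__ == "__main__":
    # A new program whose (hypothetical) features resemble the first case.
    print(predict_flags((118.0, 9.0, 0.28)))
```

In this sketch the features are compared unscaled, so the largest-valued feature dominates the distance; a real system would normally normalize features before comparing them.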

In January 2018, the cTuning foundation and the Raspberry Pi Foundation published an interactive article featuring MILEPOST GCC and the Collective Knowledge framework "for collaborative research into multi-objective autotuning and machine learning techniques."^[5]

Versions

MILEPOST GCC 4.4.x ICI 2.0 - released in May 2010.
MILEPOST GCC 4.4.0 - released in May 2009.
MILEPOST GCC 4.2.2 - released in July 2008.

Current developments:

GitHub development website - this version is implemented as a Collective Knowledge package and uses optimization results from the open Collective Knowledge repository to train predictive models.
Online MILEPOST demo to predict GCC or LLVM compiler flags using machine learning and MILEPOST features.

Related Research Articles

In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language to create an executable program.

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

In computer science, program optimization, code optimization, or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be optimized so that it executes more rapidly, or to make it capable of operating with less memory storage or other resources, or draw less power.

In compiler design, static single assignment form is a property of an intermediate representation (IR), which requires that each variable be assigned exactly once, and every variable be defined before it is used. Existing variables in the original IR are split into versions, new variables typically indicated by the original name with a subscript in textbooks, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element.
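The renaming of variables into versions can be sketched for straight-line code. This is a simplified illustration only: the statement representation and the function name `to_ssa` are assumptions made for this example, and the sketch does not handle control flow, so it inserts no phi functions. Names that are used before any definition are left unchanged.

```python
def to_ssa(statements):
    """Rename variables in straight-line code so each is assigned exactly once.

    `statements` is a list of (target, operands) pairs, where `operands` is a
    list of tokens (variable names, literals or operators) as strings.
    """
    version = {}  # current version number of each variable defined so far
    result = []
    for target, operands in statements:
        # Uses refer to the latest version of each already-defined variable.
        renamed = [
            f"{tok}{version[tok]}" if tok in version else tok
            for tok in operands
        ]
        # Every definition creates a fresh version of its target.
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}{version[target]}", renamed))
    return result


if __name__ == "__main__":
    program = [
        ("x", ["1"]),
        ("x", ["x", "+", "2"]),
        ("y", ["x", "*", "x"]),
    ]
    for target, operands in to_ssa(program):
        print(target, "=", " ".join(operands))
```

Here the two assignments to `x` become `x1` and `x2`, and the final statement reads `x2`, so each use points to exactly one definition.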

In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior, for which the language specification does not prescribe a result, and implementation-defined behavior that defers to the documentation of another component of the platform.

Collaborative intelligence characterizes multi-agent, distributed systems where each agent, human or machine, is autonomously contributing to a problem solving network. Collaborative autonomy of organisms in their ecosystems makes evolution possible. Natural ecosystems, where each organism's unique signature is derived from its genetics, circumstances, behavior and position in its ecosystem, offer principles for design of next generation social networks to support collaborative intelligence, crowdsourcing individual expertise, preferences, and unique contributions in a problem solving process.

LLVM is a set of compiler and toolchain technologies, which can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes.

In software engineering, retargeting is an attribute of software development tools that have been specifically designed to generate code for more than one computing platform.

Interprocedural optimization (IPO) is a collection of compiler techniques used in computer programming to improve performance in programs containing many frequently used functions of small or medium length. IPO differs from other compiler optimization because it analyzes the entire program; other optimizations look at only a single function, or even a single block of code.

The Portable C Compiler is an early compiler for the C programming language written by Stephen C. Johnson of Bell Labs in the mid-1970s, based in part on ideas proposed by Alan Snyder in 1973, and "distributed as the C compiler by Bell Labs... with the blessing of Dennis Ritchie."

In control theory a self-tuning system is capable of optimizing its own internal running parameters in order to maximize or minimize the fulfilment of an objective function; typically the maximization of efficiency or error minimization.

Oracle Developer Studio, formerly named Oracle Solaris Studio, Sun Studio, Sun WorkShop, Forte Developer, and SunPro Compilers, is Oracle Corporation's flagship software development product for the Solaris and Linux operating systems. It includes optimizing C, C++, and Fortran compilers, libraries, and performance analysis and debugging tools, for Solaris on SPARC and x86 platforms, and Linux on x86/x64 platforms, including multi-core systems.

The Interactive Compilation Interface (ICI) is a plugin system with a high-level compiler-independent and low-level compiler-dependent API to transform current black-box compilers into collaborative modular interactive toolsets. It was developed by Grigori Fursin during the MILEPOST project. The ICI framework acts as a "middleware" interface between the compiler and the user-definable plugins. It opens up and reuses the production-quality compiler infrastructure to enable program analysis and instrumentation, fine-grain program optimizations, and simple prototyping of new development and research ideas, while avoiding building new compilation tools from scratch. For example, it is used in MILEPOST GCC to automate compiler and architecture design and program optimizations based on statistical analysis and machine learning, and to predict profitable optimizations to improve program execution time, code size and compilation time.

The Collective Tuning Initiative is a community-driven initiative started by Grigori Fursin to develop free collaborative open-source research tools with unified API for code and architecture characterization, optimization and co-design. This enables the sharing of benchmarks, data sets and optimization cases from the community in the open optimization repository through unified web services to predict better optimizations or architecture designs. Using common research-and-development tools should help to improve the quality and reproducibility of research into code, architecture design and optimization, encouraging innovation in this area. This approach helped establish Artifact Evaluation at several ACM-sponsored conferences to encourage sharing of artifacts and validation of experimental results from accepted papers.

The Collective Knowledge (CK) project is an open-source framework and repository to enable collaborative, reproducible and sustainable research and development of complex computational systems. CK is a small, portable, customizable and decentralized infrastructure helping researchers and practitioners:

The cTuning Foundation is a global non-profit organization developing open-source tools and a common methodology to enable sustainable, collaborative and reproducible research in computer science, perform collaborative optimization of realistic workloads across devices provided by volunteers, enable self-optimizing computer systems, and automate artifact evaluation at machine learning and systems conferences and journals.

Grigori Fursin is a British computer scientist and the president of the non-profit cTuning foundation. His research group created the open-source, machine-learning-based self-optimizing compiler MILEPOST GCC, considered to be the first in the world. At the end of the MILEPOST project he established the cTuning foundation to crowdsource program optimisation and machine learning across diverse devices provided by volunteers. His foundation also developed the Collective Knowledge framework to support open research. Since 2015 Fursin has led Artifact Evaluation at several ACM and IEEE computer systems conferences. He is also a founding member of the ACM taskforce on Data, Software, and Reproducibility in Publication.

Differentiable programming is a programming paradigm in which a numeric computer program can be differentiated throughout via automatic differentiation. This allows for gradient based optimization of parameters in the program, often via gradient descent. Differentiable programming has found use in a wide variety of areas, particularly scientific computing and artificial intelligence.

References

[1] Grigori Fursin, Yuriy Kashnikov, Abdul Wahid Memon, Zbigniew Chamski, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Bilha Mendelson, Ayal Zaks, Eric Courtois, Francois Bodin, Phil Barnard, Elton Ashton, Edwin Bonilla, John Thomson, Chris Williams, Michael O'Boyle. Milepost GCC: Machine learning enabled self-tuning compiler. International Journal of Parallel Programming, Volume 39, Issue 3, pp. 296-327, June 2011.
[2] Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Phil Barnard, Elton Ashton, Eric Courtois, Francois Bodin, Edwin Bonilla, John Thomson, Hugh Leather, Chris Williams, Michael O'Boyle. MILEPOST GCC: machine learning based research compiler. Proceedings of the GCC Developers' Summit, Ottawa, Canada, June 2008.
[3] IBM Releases Open Source Machine Learning Compiler. Slashdot, July 2009.
[4] Rethinking code optimization for mobile and multicore. InfoWorld, July 2009.
[5] Grigori Fursin, Anton Lokhmotov, Dmitry Savenko, Eben Upton. A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques. arXiv:1801.08024, January 2018.

External links

Official website

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
