ROSE (compiler framework)

ROSE
Original author(s): Daniel J. Quinlan, Chunhua (Leo) Liao, Robb P. Matzke, Markus Schordan, et al.
Developer(s): Lawrence Livermore National Laboratory
Stable release: 0.9.13.0 / 15 December 2019
Preview release: 0.9.6a / 16 March 2016
Written in: C++
Operating system: Linux, OS X
Platform: IA-32, x86-64
Available in: English
Type: Compiler
License: BSD modified
Website: rosecompiler.org

The ROSE compiler framework, developed at Lawrence Livermore National Laboratory (LLNL), is an open-source compiler infrastructure for building source-to-source analyzers and translators for multiple source languages, including C (C89, C99, Unified Parallel C (UPC)), C++ (C++98, C++11), Fortran (77, 95, 2003), OpenMP, Java, Python, and PHP.

ROSE also supports certain binary files, and it can be used to build auto-parallelizing compilers that generate source code annotated with OpenMP directives. Unlike most other research compilers, ROSE is aimed at enabling non-experts to leverage compiler technologies to build their own custom software analyzers and optimizers.
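As a rough illustration of what "source code annotated with OpenMP directives" means, the sketch below inserts a `#pragma omp parallel for` line before selected loops in a C source listing. It is not ROSE's implementation: the function name `annotate_parallel_loops` and its parameters are hypothetical, and the dependence analysis that decides which loops are safe to parallelize is assumed to have happened elsewhere.

```python
def annotate_parallel_loops(lines, parallel_loop_lines):
    """Return a copy of `lines` with an OpenMP pragma before each chosen loop.

    `parallel_loop_lines` holds 0-based indices of `for` statements that some
    earlier analysis (not shown, assumed) judged free of loop-carried
    dependences.
    """
    out = []
    for index, line in enumerate(lines):
        if index in parallel_loop_lines:
            # Reuse the loop's own indentation for the inserted directive.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(indent + "#pragma omp parallel for")
        out.append(line)
    return out


serial = [
    "void scale(double *a, int n) {",
    "    for (int i = 0; i < n; i++)",
    "        a[i] = 2.0 * a[i];",
    "}",
]
annotated = annotate_parallel_loops(serial, {1})
```

The annotated listing is still ordinary C source, so any OpenMP-capable compiler can build it afterwards.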

The infrastructure

ROSE consists of multiple front ends, a midend operating on its internal intermediate representation (IR), and back ends that regenerate (unparse) source code from the IR. Optionally, vendor compilers can be used to compile the unparsed source code into final executables.
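The front-end, midend, and back-end stages can be illustrated by analogy. The sketch below does not use ROSE's API. It assumes Python 3.9 or later and uses the standard `ast` module to parse source into a tree, rewrite the tree, and unparse it back to source. The transformation chosen here (folding the sum of two integer constants) and the names `FoldConstantAdds` and `translate` are hypothetical, picked only for illustration.

```python
import ast


class FoldConstantAdds(ast.NodeTransformer):
    """Midend pass: replace `<int> + <int>` with the computed constant."""

    def visit_BinOp(self, node):
        # Transform children first so nested sums fold bottom-up.
        self.generic_visit(node)
        if (
            isinstance(node.op, ast.Add)
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and type(node.left.value) is int
            and type(node.right.value) is int
        ):
            folded = ast.Constant(value=node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def translate(source: str) -> str:
    tree = ast.parse(source)               # front end: source -> tree
    tree = FoldConstantAdds().visit(tree)  # midend: tree -> tree
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)               # back end: tree -> source
```

The output of `translate` is again source code, which could then be handed to any ordinary compiler or interpreter, in the same way that unparsed ROSE output can be passed to a vendor compiler.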

To parse C and C++ applications, ROSE uses the Edison Design Group's C++ front-end. [1] Fortran support, covering Fortran 2003 and the earlier 77, 90, and 95 versions, is based on the Open Fortran Parser (OFP) developed at Los Alamos National Laboratory. [2]

The ROSE IR consists of an abstract syntax tree, symbol tables, control flow graph, etc. It is an object-oriented IR with several levels of interfaces for quickly building source-to-source translators. All information from the input source code is carefully preserved in the ROSE IR, including C preprocessor control structure, source comments, source position information, and C++ template information, e.g., template arguments.
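To make the idea of an IR that keeps comments and source positions more concrete, here is a minimal sketch using Python's standard `ast` and `tokenize` modules. It is an analogy only, not ROSE's IR: the function `build_ir` and the dictionary layout are assumptions made for this example, and the "symbol table" is a flat map of function and class names to their declaration positions.

```python
import ast
import io
import tokenize


def build_ir(source: str) -> dict:
    """Collect a syntax tree, a flat symbol table, and the source comments."""
    tree = ast.parse(source)

    # Symbol table: definition name -> (line, column) of its declaration.
    symbols = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            symbols[node.name] = (node.lineno, node.col_offset)

    # The parser drops comments, so recover them from the token stream
    # together with the line on which each one appears.
    comments = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comments.append((tok.start[0], tok.string))

    return {"tree": tree, "symbols": symbols, "comments": comments}


example = "# scale a value\ndef scale(x):\n    return 2 * x  # doubled\n"
ir = build_ir(example)
```

Keeping positions and comments alongside the tree is what allows a source-to-source tool to regenerate code that still resembles the original input.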

ROSE is released under a BSD-style license. It targets Linux and OS X on both IA-32 and x86-64 platforms. Its Edison Design Group (EDG) parts are proprietary and distributed in binary form. Source files of the EDG parts can be obtained if users have a commercial or research license from EDG.

Award

The ROSE compiler infrastructure received one of the 2009 R&D 100 Awards. [3] The R&D 100 Awards are presented annually by R&D Magazine to recognize the 100 most significant proven research and development advances introduced over the past year. An independent expert panel selects the winners.

See also

Related Research Articles

In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language to create an executable program.

GNU Compiler Collection Free and open-source compiler for various programming languages

The GNU Compiler Collection (GCC) is an optimizing compiler produced by the GNU Project supporting various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free software under the GNU General Public License. GCC is a key component of the GNU toolchain and the standard compiler for most projects related to GNU and the Linux kernel. With roughly 15 million lines of code in 2019, GCC is one of the biggest open source programs in existence. It has played an important role in the growth of free software, as both a tool and an example.

Yacc is a computer program for the Unix operating system developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right (LALR) parser generator, generating a LALR parser based on a formal grammar, written in a notation similar to Backus–Naur Form (BNF). Yacc is supplied as a standard utility on BSD and AT&T Unix. GNU-based Linux distributions include Bison, a forward-compatible Yacc replacement.

Interpreter (computing) Program that executes source code without a separate compilation step

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them to have been compiled into a machine language program beforehand. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation or object code and immediately execute this;
  3. Explicitly execute stored precompiled code made by a compiler which is part of the interpreter system.

In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine.

LLVM Compiler backend for multiple programming languages

LLVM is a set of compiler and toolchain technologies, which can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes.

XPL is a programming language based on PL/I, a portable one-pass compiler written in its own language, and a parser generator tool for easily implementing similar compilers for other languages. XPL was designed in 1967 as a way to teach compiler design principles and as a starting point for students to build compilers for their own languages.

Open64 is a free, open-source, optimizing compiler for the Itanium and x86-64 microprocessor architectures. It derives from the SGI compilers for the MIPS R10000 processor, called MIPSPro. It was initially released in 2000 as GNU GPL software under the name Pro64. The following year, the University of Delaware adopted the project and renamed the compiler to Open64. It now mostly serves as a research platform for compiler and computer architecture research groups. Open64 supports Fortran 77/95 and C/C++, as well as the shared memory programming model OpenMP. It can conduct high-quality interprocedural analysis, data-flow analysis, data dependence analysis, and array region analysis. Development has ceased, although other projects can use the project's source.

In computer science, the term automatic programming identifies a type of computer programming in which some mechanism generates a computer program to allow human programmers to write the code at a higher abstraction level.

JetPAG is an open-source LL(k) parser and lexical analyzer generator, licensed under the GNU General Public License. It is a personal work of Tareq H. Sharafy, and is currently in the final beta stages of development.

The DMS Software Reengineering Toolkit is a proprietary set of program transformation tools available for automating custom source program analysis, modification, translation or generation of software systems for arbitrary mixtures of source languages for large scale software systems.

Clang Compiler front-end

Clang is a compiler front end for the C, C++, Objective-C and Objective-C++ programming languages, as well as the OpenMP, OpenCL, RenderScript, CUDA and HIP frameworks. It uses the LLVM compiler infrastructure as its back end and has been part of the LLVM release cycle since LLVM 2.6.

Oracle Developer Studio, formerly named Oracle Solaris Studio, Sun Studio, Sun WorkShop, Forte Developer, and SunPro Compilers, is Oracle Corporation's flagship software development product for the Solaris and Linux operating systems. It includes optimizing C, C++, and Fortran compilers, libraries, and performance analysis and debugging tools, for Solaris on SPARC and x86 platforms, and Linux on x86/x64 platforms, including multi-core systems.

Babel is an open source middleware system serving the scientific computing community. As a language interoperability tool, Babel enables the arbitrary mixing of software libraries written in C/C++, Fortran, Python, and Java. As a distributed computing platform, Babel provides a language-neutral Remote Method Invocation (RMI) scheme similar to Java's RMI which allows third-party plug-ins to specify custom data encodings and network protocols.

History of compiler construction

In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.

The Edison Design Group (EDG) is a company that makes compiler front ends for C++ and formerly Java and Fortran. Their front ends are widely used in commercially available compilers and code analysis tools. Users include the Intel C++ compiler, Microsoft Visual C++ (IntelliSense), NVIDIA CUDA Compiler, SGI MIPSpro, The Portland Group, and Comeau C++. They are widely known for having the first, and likely only, front end to implement the now-deprecated export keyword of C++.

For many years, parallel hardware was available only for distributed computing, but it has since become available in low-end computers as well, so programmers increasingly need to write parallel applications. Programmers tend to think sequentially and are less familiar with writing multi-threaded or parallel code. Parallel programming requires handling issues such as synchronization and deadlock avoidance, which demands expertise beyond the application domain. As a result, programmers prefer to write sequential code, which most popular programming languages support, so that they can concentrate on the application. There is therefore a need for automated tools that convert sequential applications into parallel ones. The need is also significant because a large amount of legacy code written over the past few decades must be reused and parallelized.

References

  1. "Edison Design Group's C++ front-end". Edg.com. Archived from the original on 2008-12-24.
  2. "Open Fortran Parser (OFP)". Fortran-parser.sourceforge.net.
  3. "Free compiler aids novices, experts". R&D Magazine. Advantage Business Media. 30 July 2009. Retrieved 18 March 2016.