Profile-guided optimization

Last updated September 05, 2025

In computer programming, profile-guided optimization (PGO, sometimes pronounced as pogo^[1]), also known as profile-directed feedback (PDF)^[2] or feedback-directed optimization (FDO),^[3] is a compiler optimization technique that uses prior analyses of software artifacts or behaviors ("profiling") to improve the expected runtime performance of a program.

Method

Optimization techniques based on static program analysis of the source code consider performance improvements without executing the program; no dynamic program analysis is performed. For example, inferring or placing formal constraints on the number of iterations a loop is likely to execute is useful when deciding whether to unroll it, but such facts typically depend on complex runtime factors that are difficult to establish conclusively. Static analysis therefore usually has incomplete information and can only approximate the eventual runtime conditions.

The first high-level compiler, introduced as the Fortran Automatic Coding System in 1957, broke the code into blocks and built a table of how frequently each block is executed. It did so by simulating execution of the code in a Monte Carlo fashion, in which the outcome of each conditional transfer (as via IF-type statements) is determined by a random number generator suitably weighted by whatever FREQUENCY statements the programmer provided.^[4]
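The Monte Carlo estimate described above can be illustrated with a minimal sketch. It is not a reconstruction of the 1957 compiler: the control-flow graph, the block names, and the weights (which stand in for FREQUENCY hints) are all hypothetical.

```python
import random

# Hypothetical control-flow graph: each block maps to a list of
# (successor, weight) pairs. The weights play the role of
# programmer-supplied FREQUENCY hints; a block with no successors
# ends the simulated run.
CFG = {
    "entry": [("loop", 1)],
    "loop": [("body", 9), ("exit", 1)],    # loop continues 9 times out of 10
    "body": [("rare", 1), ("common", 4)],  # IF-type branch weighted 1:4
    "rare": [("loop", 1)],
    "common": [("loop", 1)],
    "exit": [],
}

def estimate_block_frequencies(cfg, start, runs, seed=0):
    """Simulate `runs` executions and return the average visits per block."""
    rng = random.Random(seed)
    counts = {block: 0 for block in cfg}
    for _ in range(runs):
        block = start
        while True:
            counts[block] += 1
            successors = cfg[block]
            if not successors:
                break
            targets = [target for target, _ in successors]
            weights = [weight for _, weight in successors]
            # Weighted random choice decides the outcome of the branch.
            block = rng.choices(targets, weights=weights)[0]
    return {block: n / runs for block, n in counts.items()}

frequencies = estimate_block_frequencies(CFG, "entry", runs=2000)
```

The resulting table gives an estimated execution frequency for each block, which a compiler could consult when deciding where to spend optimization effort.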

Rather than programmer-supplied frequency information, profile-guided optimization uses the results of profiling test runs of an instrumented build of the program to optimize the final generated code.^[5]^[6]^[7] The compiler accesses profile data from a sample run of the program across a representative input set. The results indicate which areas of the program are executed more frequently and which are executed less frequently. Optimizations benefit from profile-guided feedback because they rely less on heuristics when making compilation decisions. The caveat is that the sample of data fed to the program during the profiling stage must be statistically representative of typical usage scenarios; otherwise, profile-guided feedback can harm the overall performance of the final build instead of improving it.

Just-in-time compilation can make use of runtime information to dynamically recompile parts of the executed code and generate more efficient native code. If the dynamic profile changes during execution, the compiler can deoptimize the previously generated native code and generate new code optimized with the information from the new profile.
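A toy model may help to make the optimize, deoptimize, and reoptimize cycle concrete. It is a sketch under simplifying assumptions, not the mechanism of any real JIT compiler: the class AdaptiveCallSite, its threshold, and the "fast paths" are hypothetical stand-ins for specialized native code.

```python
class AdaptiveCallSite:
    """Toy call site that specializes on the observed argument type."""

    def __init__(self, generic, specializations, threshold=3):
        self.generic = generic                  # always-correct slow path
        self.specializations = specializations  # type -> hypothetical fast path
        self.threshold = threshold              # calls needed before specializing
        self.counts = {}                        # runtime profile: type -> call count
        self.speculated_type = None
        self.deopts = 0

    def __call__(self, arg):
        kind = type(arg)
        if self.speculated_type is not None:
            if kind is self.speculated_type:
                # Guard holds: run the specialized code.
                return self.specializations[kind](arg)
            # Guard failed: discard the specialization and the stale profile.
            self.speculated_type = None
            self.counts = {}
            self.deopts += 1
        # Generic path: keep profiling, and specialize once a type is hot.
        self.counts[kind] = self.counts.get(kind, 0) + 1
        if self.counts[kind] >= self.threshold and kind in self.specializations:
            self.speculated_type = kind
        return self.generic(arg)

def generic_double(x):
    return x + x

double = AdaptiveCallSite(
    generic_double,
    {int: lambda x: x << 1, str: lambda s: s * 2},
)

# The workload changes from integers to strings halfway through.
results = [double(v) for v in [1, 2, 3, 4, "ab", "cd", "ef", "gh"]]
```

After the switch to strings, the guard on the integer specialization fails once, the call site falls back to the generic path, and it later specializes again for the new dominant type.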

Adoption

Firefox can be built using PGO.^[8] Even though PGO is effective, it has not been widely adopted by software projects because of its tedious dual-compilation model.^[9] It is also possible to perform PGO without instrumentation by collecting a profile using hardware performance counters.^[9] This sampling-based approach has much lower overhead and does not require a special instrumented build.
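The trade-off between instrumentation and sampling can be illustrated with a simulation. The sketch below does not read hardware performance counters; it assumes a hypothetical execution trace and compares exact counting with inspecting only every 97th event, and all function names and weights are invented for illustration.

```python
import random
from collections import Counter

def make_trace(length, seed=0):
    """Hypothetical execution trace: the function running at each step."""
    rng = random.Random(seed)
    return rng.choices(["parse", "hash", "log"], weights=[70, 25, 5], k=length)

def instrumented_profile(trace):
    """Exact counts: every event pays for a counter update."""
    return Counter(trace)

def sampled_profile(trace, period):
    """Approximate counts: look at every `period`-th event, then scale up."""
    samples = Counter(trace[::period])
    return {name: n * period for name, n in samples.items()}

trace = make_trace(100_000)
exact = instrumented_profile(trace)
approx = sampled_profile(trace, period=97)

# Both profiles should agree on which function is hottest.
hottest_exact = max(exact, key=exact.get)
hottest_sampled = max(approx, key=approx.get)
```

The sampled profile touches roughly one event in a hundred, yet it identifies the same hot function, at the cost of some error in the individual counts.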

The HotSpot Java virtual machine (JVM) uses profile-guided optimization to dynamically generate native code. As a consequence, a software binary is optimized for the actual load it is receiving. If the load changes, adaptive optimization can dynamically recompile the running software to optimize it for the new load. This means that all software executed on the HotSpot JVM effectively makes use of profile-guided optimization.^[10]

PGO has been adopted in the Microsoft Windows version of Google Chrome. It was enabled starting with version 53 for the 64-bit edition of Chrome and version 54 for the 32-bit edition.^[11]

Google published a paper^[12] describing AutoFDO, a tool that uses profiles collected in production to guide builds, resulting in up to a 10% performance improvement.

Implementations

Examples of compilers that implement PGO are:

Intel C++ Compiler and Fortran compilers^[6]
GNU Compiler Collection compilers (commonly called GCC)
Oracle Solaris Studio (formerly called Sun Studio)
Microsoft Visual C++ compiler^[1]^[13]
Clang^[14]
IBM XL C/C++^[15]
GraalVM Enterprise Edition^[16]
.NET JIT compiler^[17]
Go^[18]

References

[1] "Microsoft Visual C++ Team Blog". 12 November 2008.
[2] "Profile-directed feedback (PDF)". XL C/C++ for AIX. Retrieved 23 November 2013.
[3] Baptiste Wicht; Roberto A. Vitillo; Dehao Chen; David Levinthal (24 November 2014). "Hardware Counted Profile-Guided Optimization". arXiv:1411.6361. Bibcode:2014arXiv1411.6361W.
[4] J. W. Backus, R. J. Beeber, et al., The Fortran Automatic Coding System, Proceedings of the Western Joint Computer Conference, February 1957, p. 195.
[5] K. Pettis, R. Hansen, "Profile Guided Code Positioning", ACM SIGPLAN Programming Language Design and Implementation Conference 1990.
[6] "Intel Fortran Compiler 10.1, Professional and Standard Editions, for Mac OS X". Archived from the original on 28 September 2013.
[7] "Profile-Guided Optimization (PGO) Quick Reference".
[8] Building with Profile-Guided Optimization, mozilla.org, 13 August 2013.
[9] Dehao Chen (2010), "Taming hardware event samples for fdo compilation", Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, pp. 42–52.
[10] Ivanov, Vladimir (25 July 2013). "JVM JIT compilation overview". Retrieved 10 September 2016.
[11] Marchand, Sébastien (31 October 2016). "Making Chrome on Windows faster with PGO". Archived from the original on 1 November 2016. Retrieved 1 November 2016.
[12] Chen, Dehao; Li, David Xinliang; Moseley, Tipp (2016). "AutoFDO: Automatic feedback-directed optimization for warehouse-scale applications". Proceedings of the 2016 International Symposium on Code Generation and Optimization. New York, NY, USA. pp. 12–23. doi:10.1145/2854038.2854044. ISBN 978-1-4503-3778-6. S2CID 17473127.
[13] "Profile-guided optimizations [VS 2019]". 18 October 2022.
[14] "Profile-guided optimization [Clang Compiler User's Manual]".
[15] Quintero, Dino; Chabrolles, Sebastien; Chen, Chi Hui; Dhandapani, Murali; Holloway, Talor; Jadhav, Chandrakant; Kim, Sae Kee; Kurian, Sijo; Raj, Bharath; Resende, Ronan; Roden, Bjorn; Srinivasan, Niranjan; Wale, Richard; Zanatta, William; Zhang, Zhi (1 May 2013). IBM Power Systems Performance Guide: Implementing and Optimizing. IBM Redbooks. ISBN 978-0-7384-3766-8 – via Google Books.
[16] "Optimize a Native Executable with Profile-Guided Optimizations [GraalVM How-to Guides]".
[17] "What's new in .NET 6: Profile-guided optimization". 26 May 2023.
[18] "Profile-guided optimization".

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.



