Intel® Compiler New Feature: Hardware-Based PGO

ID 687696
Updated 7/7/2017
Version Latest
Public

Author Xiaoping Duan

Intel® Compiler 18.0 introduces a new feature: hardware-based Profile Guided Optimization (PGO).

This feature delivers the benefits of Profile Guided Optimization (PGO) by using hardware-based event sampling data collected by Intel® VTune™ Amplifier (supported on Windows* and Linux*). Compared with traditional PGO, it adds no instrumentation overhead and does not increase the memory footprint of the application being profiled. This low-overhead model is aimed mainly at embedded or real-time systems, where the overhead of traditional PGO prevents its use.

Using hardware-based PGO is a three-step process:

  1. Prepare for data collection by building the application with the following compiler option:
    • Linux: -prof-gen-sampling
    • Windows: /Qprof-gen-sampling
    This generates an executable that contains an extended form of debug information but no instrumentation.
  2. Run the application under Intel® VTune™ Amplifier with a typical input data set:
    • amplxe-pgo-report.sh <app> <options>
    During the run, Intel® VTune™ Amplifier collects hardware-based event sampling data and saves it to a .pgo file. If multiple runs with different data sets produce multiple .pgo files, you can optionally merge those files into a single .db file.
  3. Perform the PGO optimization by rebuilding the application with the following compiler option:
    • Linux: -prof-use-sampling:<.pgo data files> or <.db file>
    • Windows: /Qprof-use-sampling=<.pgo data files> or <.db file>
    This generates the final executable, optimized by feedback compilation. Additional information about the hardware-based PGO optimization is available in the optimization report, requested with "-opt-report-phase:pgo". The report has the same format as the traditional PGO optimization report. Refer to the Intel® Parallel Studio XE 2018 Composer Edition product documentation for additional details.
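
The three steps above can be sketched as a small dry-run script for Linux. It only prints the command lines and does not invoke the compiler or VTune Amplifier. The option names and the amplxe-pgo-report.sh script come from the steps above; the icc driver, the file names app.c, app, and input.dat, and the name app.pgo for the sampling data file are assumptions made for illustration and should be adapted to your project.

```shell
#!/bin/sh
# Dry-run sketch of the three-step hardware-based PGO workflow on Linux.
# Assumptions (not from the article): the icc driver, the source and
# executable names, and the name of the .pgo file written in step 2.

CC=icc          # assumed compiler driver
SRC=app.c       # hypothetical source file
APP=app         # hypothetical executable name
PGO=app.pgo     # assumed name of the sampling data file

# Step 1: build with extended debug information, without instrumentation.
step1_cmd() { echo "$CC -prof-gen-sampling $SRC -o $APP"; }

# Step 2: run the application under Intel VTune Amplifier; the arguments
# passed to this function stand in for <options> in the article.
step2_cmd() { echo "amplxe-pgo-report.sh ./$APP $*"; }

# Step 3: rebuild with the collected sampling data as feedback.
step3_cmd() { echo "$CC -prof-use-sampling:$PGO $SRC -o $APP"; }

# Print the three command lines in order.
step1_cmd
step2_cmd input.dat
step3_cmd
```

Printing the commands first makes it easy to review them before running anything on the target system; once they look right for your environment, each printed line can be run by hand.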
