New Vectorization Diagnostics Starting from Intel® Fortran Compiler...

The following diagnostic messages are from the vectorization report produced by Intel® Fortran Compiler. To obtain a vectorization report, use Intel® Fortran Compiler options: -qopt-report -qopt-report-phase=vec (Linux* OS and OS X*) or /Qopt-report /Qopt-report-phase:vec (Windows* OS).

Diagnostic 7617: This host associated object appears in a 'defining' context in a PURE procedure or in an internal procedure contained in a PURE procedure.

In Fortran, a PURE procedure has restrictions on side effects that allow parallelization and better optimization. PURE procedures are not allowed to define or change the definition status of variables that are host or use associated, or in COMMON. In the following example, hostvar is host associated inside PURE subroutine puresub. When this source is compiled, the assignment to hostvar causes error 7617 to be reported.

program F7617
implicit none

integer hostvar

call puresub

contains

pure subroutine puresub
hostvar = 1
end subroutine puresub
end program F7617

Note that ELEMENTAL procedures are also PURE, unless they are also given the IMPURE prefix (a Fortran 2008 feature supported by Intel Fortran Compiler 16.0 and above).

To resolve this error, do not use host associated variables in a definition context within a PURE procedure.
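
One way to apply this resolution is to pass the variable to the PURE procedure as a dummy argument instead of reaching it through host association. The sketch below is illustrative only: the wrapper function f7617_fixed and the dummy argument name var are assumptions and are not part of the original example.

```fortran
! Sketch: the PURE subroutine defines its dummy argument, not a
! host-associated variable, so error 7617 is not triggered.
! f7617_fixed and var are hypothetical names used for illustration.
integer function f7617_fixed()
  implicit none
  integer :: hostvar

  hostvar = 0
  call puresub(hostvar)      ! hostvar is passed explicitly
  f7617_fixed = hostvar

contains

  pure subroutine puresub(var)
    integer, intent(out) :: var
    var = 1                  ! defining a dummy argument is allowed in a PURE procedure
  end subroutine puresub
end function f7617_fixed
```

The assignment now targets a dummy argument with INTENT(OUT), which a PURE subroutine is permitted to define.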

Diagnostic 15300: Loop was Vectorized

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options states that loop was vectorized:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

program f15300
  implicit none
  integer, parameter :: n=100
  integer :: i
  real, dimension(n) :: x = (/(i, i=1,n)/)

  do i = 1,n
    x(i) = x(i) * 10.
  enddo

end program f15300

ifort -c -O2 -qopt-report2 -qopenmp-simd -qopt-report-file=stderr -qopt-report-phase=vec f15300.f90

Begin optimization report for: F15300

Report from: Vector optimizations [vec]

LOOP BEGIN at f15300.f90(7,3)
remark #15300: LOOP WAS VECTORIZED
LOOP END

Diagnostic 15304: non-vectorizable loop instance

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine add(k, a)  
  integer :: k  
  real :: a(20)
  
   DO i = 1, 20 
    a(i) = a(i+k) * 2.0 
     end do  
end subroutine add

$ ifort -c -qopt-report-file=stderr -qopt-report-phase=vec f15304.f90

Begin optimization report for: ADD

Report from: Vector optimizations [vec].

LOOP BEGIN at f15304.f90(6,5)
<Multiversioned v3>
remark #15304: loop was not vectorized: non-vectorizable loop instance from multiversioning
LOOP END

Resolution:

The compiler generates three versions of the loop, for k=0, k>0, and k<0. The version for k<0 cannot be safely vectorized because each later iteration may depend on the result of earlier iterations. You can override the compiler by inserting the !DIR$ IVDEP directive. The IVDEP directive tells the compiler it can safely ignore potential dependencies, so it does not need to generate special code for the case of k<0. Use the directive only when you know that k is never negative at run time; otherwise the vectorized loop may produce incorrect results.

subroutine add(k, a)  
  integer :: k  
  real :: a(20)
!DIR$ IVDEP   
   DO i = 1, 20 
    a(i) = a(i+k) * 2.0 
     end do  
end subroutine add

$ ifort -c -qopt-report-file=stderr -qopt-report-phase=vec f15304.f90

Begin optimization report for: ADD
Report from: Vector optimizations [vec]

LOOP BEGIN at f15304.f90(6,5)

remark #15300: LOOP WAS VECTORIZED

LOOP END

Diagnostic 15310: xxxx was not vectorized: operation cannot be vectorized

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

This remark is issued when the loop contains an operation that is not directly vectorizable, such as an assignment to a derived data type. The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes a non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

module my_mod
   integer, parameter :: N=1000
   type :: my_type
      integer :: c1
      real(8) :: c2
   end type my_type
   type(my_type), dimension(N) :: my_inst
end module my_mod

subroutine f15310(init_data)
   use my_mod
   implicit none
   integer :: i
   type(my_type), intent(in) :: init_data

   do i=1,N
      my_inst(i) = init_data
   enddo
end subroutine f15310

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15310.f90

Begin optimization report for: F15310

Report from: Vector optimizations [vec].

LOOP BEGIN at f15310.f90(16,4)
remark #15310: loop was not vectorized: operation cannot be vectorized
LOOP END

Resolution:

The loop may be vectorized if an assignment operator is defined for the derived type.
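
As a rough illustration of this resolution, the sketch below adds a defined assignment to the module from the example. The procedure name assign_my_type and the choice to make it ELEMENTAL are assumptions; whether the loop is then actually vectorized depends on the compiler version and options, so check the optimization report.

```fortran
module my_mod
   implicit none
   integer, parameter :: N=1000
   type :: my_type
      integer :: c1
      real(8) :: c2
   end type my_type
   type(my_type), dimension(N) :: my_inst

   ! Defined assignment for my_type (assign_my_type is a hypothetical name)
   interface assignment(=)
      module procedure assign_my_type
   end interface

contains

   elemental subroutine assign_my_type(lhs, rhs)
      type(my_type), intent(out) :: lhs
      type(my_type), intent(in)  :: rhs
      ! Component-wise copies use intrinsic assignment
      lhs%c1 = rhs%c1
      lhs%c2 = rhs%c2
   end subroutine assign_my_type
end module my_mod

subroutine f15310(init_data)
   use my_mod
   implicit none
   integer :: i
   type(my_type), intent(in) :: init_data

   do i=1,N
      my_inst(i) = init_data   ! resolves to assign_my_type
   enddo
end subroutine f15310
```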

Diagnostic 15319: Loop Was Not Vectorized: Novector Directive Used

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

When the NOVECTOR directive is applied to a loop, the vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options reports the loop as not vectorized:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine foo(a, b, n)
	implicit none
	integer, intent(in) :: n
	real, intent(inout) :: a(n) 
	real, intent(in)    :: b(n)
	
    integer :: i

!DEC$ NOVECTOR
    
       do i=1,n 
		     a(i)= b(i)+1
       end do
       
end subroutine foo

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15319.f90

Begin optimization report for: FOO

Report from: Vector optimizations [vec]

LOOP BEGIN at f15319.f90(11,8)

remark #15319: loop was not vectorized: novector directive used

LOOP END

Resolution:

There may be cases where you want to explicitly avoid vectorization of a loop; for example, if vectorization would result in a performance regression rather than an improvement. In these cases, you can use the NOVECTOR directive to disable vectorization of the loop.

Diagnostic 15328: vectorization support: irregularly indexed load was emulated for the variable <a(index(i))>, part of index is read from memory

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

A vectorizable loop contains loads from memory locations that are not contiguous in memory (sometimes known as a “gather”). These may be indexed loads, as in the example below, or loads with non-unit stride. The compiler has emulated a hardware gather instruction by issuing individual loads for the different memory locations in software.

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options:

Windows* OS: /O2 /Qopt-report:4 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report=4 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine gathr(n, a, b, index)
   implicit none
   integer, parameter :: RT=8
   integer,                intent(in)  :: n
   integer,  dimension(n), intent(in)  :: index
   real(RT), dimension(n), intent(in)  :: a
   real(RT), dimension(n), intent(out) :: b
   integer                             :: i
   
   do i=1,n
       b(i) = 1.0_RT + 0.1_RT*a(index(i))
   enddo
end subroutine gathr

$ ifort -c -qopt-report=4 -qopt-report-file=stdout gathr.f90

When using Intel Fortran compiler version 16.0 the following remark is generated:

[...]

remark #15328: vectorization support: gather was emulated for the variable a: indirect access [ gathr.f90(11,31) ]

[...]

When using Intel Fortran compiler version 17.0 the following remark is generated:

[...]

remark #15328: vectorization support: irregularly indexed load was emulated for the variable <a(index(i))>, part of index is read from memory [ gathr.f90(11,31) ]

[...]

The compiler has vectorized the loop by emulating a “gather” instruction in software.

The assembly code contains no gather instructions.

Diagnostic 15328: vectorization support: gather was emulated for the variable a: indirect access

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine gathr(n, a, b, index)
   implicit none
   integer,                intent(in)  :: n
   integer,  dimension(n), intent(in)  :: index
   real(RT), dimension(n), intent(in)  :: a
   real(RT), dimension(n), intent(out) :: b
   integer                             :: i

   do i=1,n
       b(i) = 1.0_RT + 0.1_RT*a(index(i))
   enddo

end subroutine gathr

$ ifort -c -xcore-avx2 -qopt-report=4 -qopt-report-file=stdout gathr.F90 -DRT=8 -S | egrep 'gather|VECTORIZED'

remark #15328: vectorization support: gather was emulated for the variable a: indirect access [ gathr.F90(10,29) ]

remark #15300: LOOP WAS VECTORIZED

remark #15458: masked indexed (or gather) loads: 1

remark #15301: REMAINDER LOOP WAS VECTORIZED

The compiler has vectorized the loop by emulating a “gather” instruction in software.

The assembly code contains no gather instructions.

Compare to the behavior when compiling with -DRT=4 as described in the article for diagnostic #15415.

Diagnostic 15331: Using FP Model: Precise Prevents Vectorization

Product Version: Intel® Fortran Compiler 15.0 and above

Cause

When the Intel® Fortran Compiler option /fp:precise (Linux OS and OS X syntax: -fp-model precise) is used, the vectorization report generated with the optimization and vectorization report options (/O2 /Qopt-report:2 /Qopt-report-phase:vec) includes a non-vectorized loop instance.

Example

An example below will generate the following remark in optimization report:

subroutine foo(a, b, n)
	implicit none
    integer, intent(in) :: n
	real, intent(inout) :: a(n)
	real, intent(out) :: b
	real :: x = 0
	integer :: i
       
	do i=1,n
			x = x + a(i)			
	end do
	
	b = x
	
end subroutine foo

ifort -c /fp:precise /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15331.f90

Begin optimization report for: FOO

Report from: Vector optimizations [vec]

LOOP BEGIN at f15331.f90(9,2)

remark #15331: loop was not vectorized: precise FP model implied by the command line or a directive prevents vectorization. Consider using fast FP model [f15331.f90(10,4)]

LOOP END

Resolution:

Compiling with the fast FP model (Windows OS: /fp:fast; Linux OS and OS X: -fp-model fast), which allows more aggressive optimizations on floating-point data, will get this loop vectorized.
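
The reason the precise model blocks vectorization here is that a vectorized sum adds the elements in a different order than the scalar loop, and floating-point addition is not associative. The sketch below is only an illustration of that effect: the subroutine name sum_two_orders and the two-lane grouping are assumptions chosen to mimic a vector length of 2.

```fortran
! Sketch: the same four values summed in sequential order and in the
! order a two-lane vectorized reduction would use.
! sum_two_orders is a hypothetical name used for illustration.
subroutine sum_two_orders(a, seq, pair)
  implicit none
  real, intent(in)  :: a(4)
  real, intent(out) :: seq, pair

  ! Scalar loop order: x = x + a(i) for i = 1, 2, 3, 4
  seq  = ((a(1) + a(2)) + a(3)) + a(4)

  ! Two-lane order: lane 1 sums odd elements, lane 2 sums even
  ! elements, then the two partial sums are added
  pair = (a(1) + a(3)) + (a(2) + a(4))
end subroutine sum_two_orders
```

Called with single-precision values such as 1.0e8, 1.0, -1.0e8, 1.0, the two results can differ because 1.0 is lost when added to 1.0e8. The precise model forbids this reordering, while the fast model permits it.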

See also:

fp-model, fp

Diagnostic 15335: Vectorization Possible But Seems Inefficient

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

The vectorization report generated using the Intel® Fortran Compiler's optimization and vectorization report options /O2 /Qopt-report:2 /Qopt-report-phase:vec states that the loop was not vectorized: vectorization possible but seems inefficient.

Example:

An example below will generate the following remark in optimization report:

subroutine foo(a, b, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real, intent(out)   :: b
    real :: x = 0
    integer :: i

    do i=1,n/2
            x = x + a(2*i)
    end do
    
    b = x
    
end subroutine foo

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15335.f90

Report from: Vector optimizations [vec]

LOOP BEGIN at f15335.f90(9,5)

remark #15335: loop was not vectorized: vectorization possible but seems inefficient. Use vector always directive or /Qvec-threshold0 to override
LOOP END

Resolution:

Placing the !DEC$ VECTOR ALWAYS directive immediately before the loop overrides the efficiency heuristics of the vectorizer, so the loop is vectorized.
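
A minimal sketch of where the directive goes is shown below. It departs from the example in a few assumed ways: the loop bound n/2 keeps a(2*i) within the declared bounds of a, x is set by an assignment statement rather than an initializer, and a is declared INTENT(IN) because it is only read.

```fortran
subroutine foo(a, b, n)
    implicit none
    integer, intent(in) :: n
    real, intent(in)    :: a(n)
    real, intent(out)   :: b
    real :: x
    integer :: i

    x = 0.0
    ! Ask the vectorizer to ignore its efficiency heuristics for the next loop
!DEC$ VECTOR ALWAYS
    do i=1,n/2
        x = x + a(2*i)       ! stride-2 access, still within a(1:n)
    end do

    b = x

end subroutine foo
```

Because the directive overrides a cost estimate, it is worth measuring whether the vectorized loop is actually faster.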

Diagnostic 15336: simd loop was not vectorized: scalar assignment in simd loop is prohibited, consider private, lastprivate or reduction clauses

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

This remark is issued when a loop contains a conditional statement that controls the assignment of a scalar value and the scalar value is referenced after the loop exits. The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes a non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine f13379( a, b, n )  
  implicit none 
  integer, intent(in)               :: n
  integer, intent(in),  dimension(n) :: a
  integer, intent(out),dimension(n) :: b
   
  integer                           :: i, x=10  
   
!$omp simd  
  do i=1,n  
    if( a(i) > 0 ) then 
     x = i                    !...here is the scalar assignment 
    end if 
    b(i) = x  
  end do 
!... reference the scalar outside of the loop  
  write(*,*) "last value of x: ", x  
end subroutine f13379

$ ifort -c -O2 -qopt-report2 -qopenmp-simd -qopt-report-file=stderr -qopt-report-phase=vec f13379.f90

When using Intel Fortran Compiler version 16.0 the following remark is generated:

LOOP BEGIN at f13379.f90(8,3)

remark #15336: simd loop was not vectorized: conditional assignment to a scalar [ f13379.f90(10,8) ]

remark #13379: loop was not vectorized with "simd"

LOOP END

When using Intel Fortran Compiler version 17.0 the following remark is generated:

LOOP BEGIN at f15336.f90(12,3)
remark #15336: simd loop was not vectorized: scalar assignment in simd loop is prohibited, consider private, lastprivate or reduction clauses f15336.f90(10,6)
remark #15552: loop was not vectorized with "simd"
LOOP END

Resolution:

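Replace !$omp simd with !$omp simd lastprivate(x). The lastprivate clause makes x private to each SIMD lane and, after the loop, copies the value from the sequentially last iteration back to the original x, which allows the compiler to vectorize the loop. Because the private copy is not initialized from the original x, verify that every iteration assigns x before reading it.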

Example

subroutine f13379( a, b, n )  
  implicit none 
  integer, intent(in)               :: n
  integer, intent(in),  dimension(n) :: a
  integer, intent(out),dimension(n) :: b
   
  integer                           :: i, x=10  
   
!$omp simd lastprivate(x)  
  do i=1,n  
    if( a(i) > 0 ) then 
     x = i                    !...here is the scalar assignment 
    end if 
    b(i) = x  
  end do 
!... reference the scalar outside of the loop  
  write(*,*) "last value of x: ", x  
end subroutine f13379

$ ifort -c -O2 -qopt-report2 -qopenmp-simd -qopt-report-file=stderr -qopt-report-phase=vec f13379.f90

Begin optimization report for: F13379

Report from: Vector optimizations [vec]

LOOP BEGIN at f13379.f90(10,3)
remark #15301: OpenMP SIMD LOOP WAS VECTORIZED
LOOP END

Diagnostic 15340: Pragma Supersedes Previous Setting

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

This diagnostic message occurs when the parameters of the directive are contradictory. The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options states that pragma supersedes previous setting:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

In the example below, the directive !dir$ loop count has two clauses, avg() and max(). Notice the contradiction between these two clauses: the max() value is less than the avg() value. The following example will generate this remark in the optimization report:

subroutine f15340( A, n )
implicit none
integer :: n, i
real :: A(n)
!dir$ loop count avg(30), max(10)
do i=1,n
  A(i) = real(i)
end do
end subroutine f15340

 ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15340.f90

Begin optimization report for: F15340

Report from: Vector optimizations [vec]

LOOP BEGIN at f15340.f90(6,1)
remark #15340: pragma supersedes previous setting
remark #15300: LOOP WAS VECTORIZED
LOOP END

Resolution:

Make sure the value of the max() clause is not less than the value of the avg() clause. Note that the loop is still vectorized; the compiler ignores avg(30) and uses max(10).
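
For illustration, a consistent version of the directive might look like the sketch below. The values avg(10) and max(30) are arbitrary assumptions; use values that reflect the trip counts your loop really has.

```fortran
subroutine f15340( A, n )
implicit none
integer :: n, i
real :: A(n)
! avg() no longer exceeds max(), so the clauses do not contradict each other
!dir$ loop count avg(10), max(30)
do i=1,n
  A(i) = real(i)
end do
end subroutine f15340
```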

Diagnostic 15344: Loop Was Not Vectorized: Vector Dependence Prevents Vectorization

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

When using Intel® Fortran Compiler's optimization and vectorization report options /O2 /Qopt-report:2 /Qopt-report-phase:vec the vectorization report generated states that loop was not vectorized due to vector dependence which prevents vectorization.

Example:

An example below will generate the following remark in optimization report:

integer function foo(a, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real :: max 
    integer :: inx, i
    
    max = a(1)
    do i=1,n
        if (max < a(i)) then
            max = a(i)
            inx = i*i
        endif
    end do
    
    foo = inx
    
end function

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15344.f90

Report from: Vector optimizations [vec]

LOOP BEGIN at f15344.f90(9,5)
remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details [ f15344.f90(12,13) ]
remark #15346: vector dependence: assumed ANTI dependence between line 10 and line 11 [ f15344.f90(12,13) ]
LOOP END

Resolution:

Rewriting the code as in the following example resolves the vector dependence, and the loop will be vectorized:

integer function foo(a, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real :: max 
    integer :: inx, i
    
    max = a(1)
    do i=1,n
        if (max < a(i)) then
            max = a(i)
            inx = i
        endif
    end do
    
    foo = inx*inx
    
end function

Diagnostic 15346: Vector Dependence: Assumed xxx Dependence Between Line x And Line y

Product Version: Intel® Fortran Compiler 15.0 and above

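Cause:

This remark accompanies remark #15344 and names the two source lines between which the compiler assumed a dependence.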

Example:

An example below will generate the following remark in optimization report:

integer function foo(a, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real :: max 
    integer :: inx, i
    
    max = a(1)
    do i=1,n
        if (max < a(i)) then
            max = a(i)
            inx = i*i
        endif
    end do
    
    foo = inx
    
end function

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15344.f90

Report from: Vector optimizations [vec]

LOOP BEGIN at f15344.f90(9,5)
remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details [ f15344.f90(12,13) ]
remark #15346: vector dependence: assumed ANTI dependence between line 10 and line 11 [ f15344.f90(12,13) ]
LOOP END

Resolution:

Rewriting the code as in the following example resolves the vector dependence, and the loop will be vectorized:

integer function foo(a, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real :: max 
    integer :: inx, i
    
    max = a(1)
    do i=1,n
        if (max < a(i)) then
            max = a(i)
            inx = i
        endif
    end do
    
    foo = inx*inx
    
end function

Diagnostic 15378: xxxx was not vectorized: /Qfreestanding flag prevents vectorization of integer divide/remainder

This article has been retired.

Diagnostic 15398: Loop Was Not Vectorized: Loop Was Transformed To Memset Or Memcpy

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

When code contains a loop or array syntax performing a simple initialization or a copy, the compiler may replace the loop with a function call to either set memory (memset) or copy memory (memcpy). The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

program f15398
implicit none
integer, parameter :: N=32
integer :: i,a(N)
  !...initialize array using DO
  do i=1,N
    a(i) = 0
  end do
  a = 0   !...same, with array syntax 
  print*, a(1)

end program f15398

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15398.f90

Begin optimization report for: F15398

Report from: Vector optimizations [vec]

LOOP BEGIN at f15398.f90(6,3)
remark #15398: loop was not vectorized: loop was transformed to memset or memcpy
LOOP END

LOOP BEGIN at f15398.f90(9,3)
remark #15398: loop was not vectorized: loop was transformed to memset or memcpy
LOOP END

Diagnostic 15414: Loop was not vectorized: loop body became empty after optimizations

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

The vectorization report generated when using Intel® Fortran Compiler's optimization options (/O2 /Qopt-report:2) states that loop was not vectorized since loop body became empty after optimizations.

Example:

An example below will generate the following remark in optimization report:

integer function foo(a, b, n) 
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a
    real, intent (in)   :: b
    integer :: i
    
    do i=1,n
           a = b + 1
    end do
    
    foo = a
    
end function

ifort -c /O2 /Qopt-report:2 /Qopt-report-file:stdout f15414.f90

Report from: Interprocedural optimizations [ipo]

INLINING OPTION VALUES:

-Qinline-factor: 100

-Qinline-min-size: 30

-Qinline-max-size: 230

-Qinline-max-total-size: 2000

-Qinline-max-per-routine: 10000

-Qinline-max-per-compile: 500000

Begin optimization report for: FOO

Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (FOO) [1] f15414.f90(1,18)

Resolution:

In the example above, there is only one expression inside the loop, and it does not depend on the loop index. Once the compiler's optimizations move it outside the loop, nothing is left inside the loop to vectorize.
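
The effect of that optimization can be pictured with the hand-written equivalent below. This is only a sketch of the idea, not the code the compiler actually generates, and the name foo_hoisted is an assumption.

```fortran
! Sketch: what remains once the loop-invariant assignment is moved
! out of the loop. foo_hoisted is a hypothetical name.
integer function foo_hoisted(a, b, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a
    real, intent (in)   :: b

    ! The assignment gives the same result on every iteration, so it is
    ! executed once, and only if the original loop would have run
    if (n >= 1) a = b + 1

    foo_hoisted = a

end function foo_hoisted
```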

Diagnostic 15415: vectorization support: gather was generated for the variable a: indirect access

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

A vectorizable loop contains loads from memory locations that are not contiguous in memory (sometimes known as a “gather”). These may be indexed loads, as in the example below, or loads with non-unit stride. The compiler has issued a hardware gather instruction for these loads.

(Note that for compiler versions 16.0.1 and earlier, the compiler may also emit this message when gather operations are emulated in software).

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine gathr(n, a, b, index)
   implicit none
   integer,                intent(in)  :: n
   integer,  dimension(n), intent(in)  :: index
   real(RT), dimension(n), intent(in)  :: a
   real(RT), dimension(n), intent(out) :: b
   integer                             :: i

   do i=1,n
       b(i) = 1.0_RT + 0.1_RT*a(index(i))
   enddo

end subroutine gathr

$ ifort -c -xcore-avx2 -qopt-report=4 -qopt-report-file=stdout gathr.F90 -DRT=4 -S | egrep 'gather|VECTORIZED'

remark #15415: vectorization support: gather was generated for the variable a: indirect access [ gathr.F90(10,29) ]

remark #15300: LOOP WAS VECTORIZED

remark #15458: masked indexed (or gather) loads: 1

remark #15301: REMAINDER LOOP WAS VECTORIZED

$ egrep gather gathr.s

vgatherdps %ymm4, -4(%r8,%ymm3,4), %ymm5 #10.29

vgatherdps %ymm7, -4(%r8,%ymm6,4), %ymm8 #10.29

vgatherdps %ymm3, -4(%r8,%ymm2,4), %ymm4 #10.29

The compiler has vectorized the loop using a “gather” instruction from Intel® Advanced Vector Extensions 2 (Intel® AVX2).

Compare to the behavior when compiling with -DRT=8 as described in the article for diagnostic #15328.

Diagnostic 15423: loop has only one iteration

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

The Intel® Fortran Compiler will not vectorize a loop when it knows the loop has only one iteration. If the user requires vectorization by using a SIMD directive, the compiler emits a warning diagnostic.

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine f15423( a, b, n ) 
  implicit none
  real, dimension(*) :: a, b
  integer            :: i, n
  
  n=1
 
!$omp simd
  do i=1,n 
     b(i) = 1. - a(i)**2
  end do
   
end subroutine f15423

$ ifort -c -qopenmp-simd f15423.f90

f15423.f90(8): (col. 7) remark: simd loop has only one iteration

f15423.f90(8): (col. 7) warning #13379: was not vectorized with "simd"

Resolution:

If the loop really has only one iteration, don’t use a SIMD directive or don’t code a loop.

If the statement n=1 was inserted unintentionally, remove it and the loop will vectorize.

Diagnostic 15516: Loop Was Not Vectorized: Cost Model Has Chosen vectorlength Of 1 -- Maybe Possible To Override Via Pragma/directive With Vectorlength Clause

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine f15516(a,b,n)
   implicit none
   integer,                  intent(in ) :: n
   complex(8), dimension(n), intent(in ) :: a
   complex(8), dimension(n), intent(out) :: b
   integer                               :: i 
   
   do i=1,n
      b(i) = 1. / sqrt(1.+a(i)**2)
   enddo
   
end subroutine f15516

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15516.f90

Begin optimization report for: F15516

Report from: Vector optimizations [vec]

LOOP BEGIN at f15516.f90(8,4)
remark #15516: loop was not vectorized: cost model has chosen vectorlength of 1 -- maybe possible to override via pragma/directive with vectorlength clause
LOOP END

Resolution:

Diagnostic 15517: loops in this subroutine cannot be vectorized due to use of EBX/RBX register in inline ASM

This article has been retired.

Diagnostic 15520: loop was not vectorized: loop with early exits cannot be vectorized unless it meets search loop idiom criteria

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

The vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes a non-vectorized loop instance because of the inherent potential for an early exit from the loop:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

An example below will generate the following remark in optimization report:

subroutine f15520(a,b,c,n)
  implicit none
  integer, intent(in)             :: n
  real, intent(in ), dimension(n) :: a, b
  real, intent(out), dimension(n) :: c
  integer                         :: i

  do i=1,n
     if(a(i).lt.0.) exit
     c(i) = sqrt(a(i)) * b(i)
  enddo
end subroutine f15520

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15520.f90

Begin optimization report for: F15520

Report from: Vector optimizations [vec]

LOOP BEGIN at f15520.f90(8,3)
remark #15520: loop was not vectorized: loop with early exits cannot be vectorized unless it meets search loop idiom criteria
LOOP END

Resolution:

A loop with two exits can only be vectorized if it is a very simple search loop. Rewriting the above example as follows will get it vectorized:

subroutine f15520(a,b,c,n)
  implicit none
  integer, intent(in)             :: n
  real, intent(in ), dimension(n) :: a, b
  real, intent(out), dimension(n) :: c
  integer                         :: i, j
  
  do i=1,n 
     if(a(i).lt.0.) exit
  enddo
  
  do j=1,i-1 
     c(j) = sqrt(a(j)) * b(j) 
  enddo 

end subroutine f15520

ifort -c -qopt-report-file=stderr -qopt-report-phase=vec f15520b.f90
…
LOOP BEGIN at f15520b.f90(8,3)
   remark #15300: LOOP WAS VECTORIZED
LOOP END
…
LOOP BEGIN at f15520b.f90(12,3)
   remark #15300: LOOP WAS VECTORIZED
LOOP END

In this case, the first loop is recognized as a pure search loop (searching for the first value of a that is less than zero).

Diagnostic 15521: loop was not vectorized: explicitly compute the iteration count before executing the loop

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

When using Intel® Fortran Compiler's optimization options:

/O3 /Qopt-report:2 /Qopt-report-phase:vec

The vectorization report generated by the compiler states that the loop was not vectorized since the loop count could not be computed before executing the loop.

Example:

An example below will generate the following remark in optimization report:

integer function  foo
implicit none 
   
  foo = 1
  do while (foo < 10000)  
     foo = foo + foo**2
  end do 
  
end  function foo

ifort -c /O3 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15521.f90

Begin optimization report for: FOO

Report from: Vector optimizations [vec]

LOOP BEGIN at f15521.f90(5,3)
remark #15521: loop was not vectorized: explicitly compute the iteration count before executing the loop or try using canonical loop form
LOOP END

Diagnostic 15522: Loop Was Not Vectorized: Loop Control Flow Is Too Complex

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

The vectorization report generated when using Intel® Fortran Compiler's optimization options (/O3 /Qopt-report:2 /Qopt-report-phase:vec) states that loop was not vectorized since loop control flow is too complex.

Example:

An example below will generate the following remark in optimization report:

subroutine foo(a, n)
    implicit none
    integer, intent(in) :: n
    double precision, intent(inout) :: a(n,n)
       integer :: bar
       integer :: i
       integer :: j
       
200   CONTINUE
       i=0
100   CONTINUE
       
       a(i,j)= 0
       i=i+1
       if (i .lt. bar()) goto 100
       j=j+1
       goto 200
     	
end subroutine foo

ifort -c /O3 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15522.f90

Report from: Vector optimizations [vec]

LOOP BEGIN at f15522.f90(10,8)

remark #15522: loop was not vectorized: loop control flow is too complex. Try using canonical loop form.

LOOP END

Resolution:

GOTO statements prevent vectorization. Rewriting the code using canonical loops such as 'do - end do' loops will get this loop vectorized.
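
A minimal sketch of such a rewrite follows. It assumes the intent of the original was to zero the whole n-by-n array, so the GOTO-driven bounds (the call to bar() and the unbounded outer loop) are replaced by the array extents; that choice of bounds is an assumption, not something the original example states.

```fortran
subroutine foo(a, n)
    implicit none
    integer, intent(in) :: n
    double precision, intent(inout) :: a(n,n)
    integer :: i
    integer :: j

    ! Canonical nested loops: both trip counts are known on entry.
    ! The inner loop runs over the first (contiguous) index.
    do j = 1, n
        do i = 1, n
            a(i,j) = 0
        end do
    end do
end subroutine foo
```

Depending on the compiler version, a loop like this may be reported as vectorized or may be replaced by a library memset call.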

Diagnostic 15523: Loop Was Not Vectorized: Cannot Compute Loop Iteration Count Before Executing The Loop

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's optimization options (/O3 /Qopt-report:2 /Qopt-report-phase:vec) states that the loop was not vectorized because the loop iteration count cannot be computed before the loop is executed.

Example:

The example below generates the following remark in the optimization report:

subroutine foo(a, n)
    
       implicit none
       integer, intent(in) :: n
       double precision, intent(inout) :: a(n)
       integer :: bar
       integer :: i
       
       i=0
 100   CONTINUE
       a(i)=0
       i=i+1
       if (i .lt. bar()) goto 100
       
  end subroutine foo

ifort -c /O3 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15523.f90

Begin optimization report for: FOO

Report from: Vector optimizations [vec]

LOOP BEGIN at f15523.f90(9,8)
remark #15523: loop was not vectorized: loop control variable I was found, but loop iteration count cannot be computed before executing the loop.

LOOP END

Resolution:

GOTO statements prevent vectorization. Rewriting the code using 'do - end do' loops, where the iteration count is computed before the loop executes, will get this loop vectorized.
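
One way to do this is sketched below. It assumes that bar() returns the same value on every call and has no side effects, so that it can be evaluated once before the loop, and that the intended index range is 1 to at most n. The variable m is introduced here for illustration only.

```fortran
subroutine foo(a, n)
    implicit none
    integer, intent(in) :: n
    double precision, intent(inout) :: a(n)
    integer :: bar          ! external function, as in the original example
    integer :: i
    integer :: m            ! hypothetical helper: precomputed upper bound

    ! Evaluate the bound once, so the trip count is known on loop entry.
    ! Clamping to n keeps the accesses inside the declared bounds of a.
    m = min(bar(), n)

    do i = 1, m
        a(i) = 0
    end do
end subroutine foo
```

If bar() really must be re-evaluated on every iteration, this transformation changes the behavior of the program and is not applicable.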

Diagnostic 15524: Loop Was Not Vectorized: Search Loop Cannot Be Vectorized Unless All Memory References Can Be Aligned Vector Load

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's optimization options (/O2 /Qopt-report:2 /Qopt-report-phase:vec) states that the loop was not vectorized because not all memory references can be aligned vector loads.

Example:

The example below generates the following remark in the optimization report:

integer function foo(a, n) 

       implicit none
       integer, intent(in) :: n
       real, intent(inout) :: a(n) 
       integer :: i 
       
       do i=1,n   
         if (a(2*i) .eq. 0) goto 100 
       end do
       
100    CONTINUE 
       foo = i
       
end function foo

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15524.f90

Report from: Vector optimizations [vec]

LOOP BEGIN at f15524.f90(8,8)

remark #15524: loop was not vectorized: search loop cannot be vectorized unless all memory references can be aligned vector load.

LOOP END

Resolution:

In search loop vectorization, if a "vector load" fits entirely within a cache line and has one non-speculatively accessed element, it is safe for the compiler to speculatively load all the other elements of that vector load, because they lie within the same cache line.
Rewriting the code to avoid GOTO statements will get this loop vectorized.

Diagnostic 15527: Loop Was Not Vectorized: Function Call To xxx Cannot Be Vectorized

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's optimization options (/O2 /Qopt-report:2 /Qopt-report-phase:vec) states that the loop was not vectorized because it contains a function call that cannot be vectorized.

Example:

The example below generates the following remark in the optimization report:

subroutine bar(a) 
    implicit none
    include 'omp_lib.h' 
    
    integer (kind=omp_lock_kind) a 
    call omp_init_lock(a) 
end subroutine bar
 
subroutine foo(a,n) 
       include 'omp_lib.h' 
       integer (kind=omp_lock_kind) a(n)

       do i=1,n
        call bar (a(i))
       end do
end subroutine foo

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15527.f90

Report from: Vector optimizations [vec]

LOOP BEGIN at f15527.f90(14,8)

f15527.f90(6,10):remark #15527: simd loop was not vectorized: function call to omp_init_lock cannot be vectorized

LOOP END

Resolution:

In order for the loop to be vectorized there should be no special operators and no function or subroutine calls, unless these are inlined, either manually or automatically by the compiler, or they are SIMD (vectorized) functions.

Diagnostic 15529: Loop Was Not Vectorized: Volatile Assignment Was Not Vectorized. Try Using Non-Volatile Assignment

Product Version: Intel® Fortran Compiler 15.0 and a later version

Cause:

When the code contains a variable that is declared VOLATILE, the vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes a non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

The example below generates the following remark in the optimization report:

PROGRAM LOOP
  IMPLICIT NONE
  INTEGER, PARAMETER :: N=100000
  INTEGER            :: I
  REAL,DIMENSION(N)  :: E, B = (/(I, I=1,N)/)
  REAL, VOLATILE     :: D = 2.

  DO I=1,N 
     E(I) = D*SIN(B(I))
  ENDDO

  PRINT *, E(N)
END PROGRAM LOOP

ifort -c -O2 -qopt-report2 -qopenmp-simd -qopt-report-file=stderr -qopt-report-phase=vec f15529.f90

Begin optimization report for: LOOP

Report from: Vector optimizations [vec]

Non-optimizable loops:

LOOP BEGIN at f15529.f90(10,3)
remark #15529: loop was not vectorized: volatile assignment was not vectorized. Try using non-volatile assignment.[f15529.f90(9,6)]
LOOP END
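
The remark suggests using a non-volatile assignment. A minimal sketch of one way to do that is shown below: the volatile variable is read once into an ordinary local variable before the loop. The name D_LOCAL is hypothetical, and the transformation is only valid if D does not need to be re-read from memory on every iteration, which is an assumption about the program's intent.

```fortran
PROGRAM LOOP
  IMPLICIT NONE
  INTEGER, PARAMETER :: N=100000
  INTEGER            :: I
  REAL,DIMENSION(N)  :: E, B = (/(I, I=1,N)/)
  REAL, VOLATILE     :: D = 2.
  REAL               :: D_LOCAL   ! hypothetical non-volatile copy of D

  ! Read the volatile variable once, outside the loop.
  D_LOCAL = D

  ! The loop body now references only non-volatile data.
  DO I=1,N
     E(I) = D_LOCAL*SIN(B(I))
  ENDDO

  PRINT *, E(N)
END PROGRAM LOOP
```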

Diagnostic 15532: Loop Was Not Vectorized: Compile Time Constraints Prevent Loop Optimization

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's optimization options ( /O2 /Qopt-report:2 ) states that compile time constraints prevent optimization.

Example:

The example below generates the following remark in the optimization report:

subroutine foo(a, n)
    
       implicit none
       integer, intent(in) :: n
       double precision, intent(inout) :: a(n)
       integer :: bar
       integer :: i
       
       i=0
 100   CONTINUE
       a(i)=0
       i=i+1
       if (i .lt. bar()) goto 100
       
  end subroutine foo

Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN
remark #15532: loop was not vectorized: compile time constraints prevent loop optimization. Consider using -O3.

LOOP END

Resolution:

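GOTO statements prevent vectorization because the loop iteration count cannot be computed. Rewriting the loop as a 'do - end do' loop whose iteration count is known before the loop executes avoids this problem.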

Diagnostic 15534: Loop was not vectorized: loop contains arithmetic if or computed goto

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's optimization options (/O3 or /O2 /Qopt-report:2) states that the loop was not vectorized because it contains an arithmetic IF or computed GOTO statement.

Example:

The example below generates the following remark in the optimization report:

subroutine foo(a, b, n) 
      implicit none 
      integer, intent(in)    :: n 
      real,    intent(inout) :: a(n) 
      integer, intent(in)    :: b(n) 
      integer:: i 
        
      do 500 i=1,n 
        goto (100,200,300,400,100,200) b(i) 
100     a(i) = 1. + a(i)**4 
        go to 500  
200     a(i) = 1. + log(a(i))
        go to 500
300     a(i) = 2. * a(i) 
        go to 500
400     a(i) = 1. - sin(a(i))**2
500   continue 

end subroutine foo

$ ifort -c -xavx f15534a.f90 -qopt-report-file=stderr -qopt-report-phase=vec

Report from: Vector optimizations [vec]

Non-optimizable loops:

LOOP BEGIN at f15534a.f90(8,10)

remark #15534: loop was not vectorized: loop contains arithmetic if or computed goto. Consider using if-then-else statement. [ f15534a.f90(9,9) ]

LOOP END

Resolution:

Complex flow control can prevent vectorization. Rewriting the code using IF statements instead of a computed GO TO will get this loop vectorized:

subroutine foo(a, b, n) 
      implicit none 
      integer, intent(in)    :: n 
      real,    intent(inout) :: a(n) 
      integer, intent(in)    :: b(n) 
      integer:: i 
        
      do i=1,n 
        if(b(i).eq.1 .or. b(I).eq.5) a(i) = 1. + a(i)**4 
        if(b(i).eq.2 .or. b(i).eq.6) a(i) = 1. + log(a(i))
        if(b(i).eq.3)                a(i) = 2. * a(i)
        if(b(i).eq.4)                a(i) = 1. - sin(a(i))**2           
      end do 

end subroutine foo

$ ifort -c -xavx f15534b.f90 -qopt-report-file=stderr -qopt-report-phase=vec

…

LOOP BEGIN at f15534b.f90(8,7)

LOOP END

LOOP BEGIN at f15534b.f90(8,7)

remark #15300: LOOP WAS VECTORIZED

LOOP END

LOOP BEGIN at f15534b.f90(8,7)

LOOP END

Diagnostic 15535: xxxx was not vectorized: loop contains switch statement. Consider using if-else statement.


Diagnostic 15537: Loop Was Not Vectorized: Implied FP Exception Model Prevents Usage of SVML Library

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's flags and optimization options (/O2 /fpe:0 /Qopt-report:2) states that the loop was not vectorized because of floating-point exception handling.

Example:

The example below generates the following remark in the optimization report:

subroutine foo (a, l, n)
       implicit none
       integer, intent(in) :: n
       double precision, intent(inout) :: a(n)
       integer :: l(n)
       integer :: i
       
       do i=1,n
           l(i) = mod(a(i), 1.0)
       end do
end subroutine foo

ifort -c /O2 /fpe:0 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15537.f90

(ifort -c -O2 -fpe=0 -qopt-report2 f15537.f90 for Linux)

Begin optimization report for: FOO

Report from: Vector optimizations [vec]

LOOP BEGIN at f15537.f90(8,8)
remark #15537: loop was not vectorized: implied FP exception model prevents usage of SVML library needed for truncation or integer divide/remainder. Consider changing compiler flags and/or directives in the source to enable fast FP model and to mask FP exceptions [ f15537.f90(9,19) ]
LOOP END

Resolution:

Masking floating-point exceptions with /fpe:1 and setting the vectorization threshold to 0 with /Qvec-threshold:0 will get the loop vectorized:

ifort -c /O2 /fpe:1 /Qvec-threshold:0 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15537.f90

(ifort -c -O2 -fpe=1 -vec-threshold=0 -qopt-report2 f15537.f90 for Linux)

LOOP BEGIN at f15537.f90(8,8)

remark #15300: LOOP WAS VECTORIZED
LOOP END

Diagnostic 15541: Loop Was Not Vectorized: Outer Loop Was Not Auto-Vectorized: Consider Using SIMD Directive

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

The vectorization report generated when using Visual Fortran Compiler's optimization options (/O2 /Qopt-report:2 /Qopt-report-phase:vec) states that the outer loop was not auto-vectorized; the inner loop carries a vector dependence that prevents its vectorization.

Example:

The example below generates the following remark in the optimization report:

subroutine foo(a, n1, n)
        implicit none
        integer :: n, n1
        integer :: i, j
        real :: a(n,n1)
      
        do i=1,n
            a(j,i) = a(j-1,i)+1 
			  
            do j=1,n
                a(j,i) = a(j-1,i)+1       
            end do
        end do
end subroutine foo

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15541.f90

(ifort -c -O2 -qopt-report2 f15541.f90 for Linux)

Begin optimization report for: FOO

Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (FOO) [1] f15541.f90(1,12)

Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN at f15541.f90(7,9)
remark #15541: outer loop was not auto-vectorized: consider using SIMD directive

LOOP BEGIN at f15541.f90(10,13)
<Multiversioned v1>
remark #25228: Loop multiversioned for Data Dependence
remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details
remark #15346: vector dependence: assumed FLOW dependence between line 11 and line 11
remark #25439: unrolled with remainder by 2
remark #25456: Number of Array Refs Scalar Replaced In Loop: 3
LOOP END

LOOP BEGIN at f15541.f90(10,13)
<Remainder, Multiversioned v1>
remark #25456: Number of Array Refs Scalar Replaced In Loop: 1
LOOP END

LOOP BEGIN at f15541.f90(10,13)
<Multiversioned v2>
remark #15304: loop was not vectorized: non-vectorizable loop instance from multiversioning
remark #25439: unrolled with remainder by 2
remark #25456: Number of Array Refs Scalar Replaced In Loop: 3
LOOP END

LOOP BEGIN at f15541.f90(10,13)
<Remainder, Multiversioned v2>
remark #25456: Number of Array Refs Scalar Replaced In Loop: 1
LOOP END
LOOP END

Resolution:

Adding the !DIR$ SIMD directive before the outer loop results in the outer loop being vectorized, as shown:

subroutine foo(a, n1, n)
        implicit none
        integer :: n, n1
        integer :: i, j
        real :: a(n,n1)
      !DIR$ SIMD  
        do i=1,n
            a(j,i) = a(j-1,i)+1 

            do j=1,n
                a(j,i) = a(j-1,i)+1       
            end do
        end do
end subroutine foo

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15541.f90
(ifort -c -O2 -qopt-report2 f15541.f90 for Linux)

Begin optimization report for: FOO

Report from: Interprocedural optimizations [ipo]

INLINE REPORT: (FOO) [1] f15541.f90(1,12)

Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN at f15541.f90(7,9)
remark #15301: SIMD LOOP WAS VECTORIZED

LOOP BEGIN at f15541.f90(10,13)
remark #25456: Number of Array Refs Scalar Replaced In Loop: 1
LOOP END
LOOP END

LOOP BEGIN at f15541.f90(7,9)
<Remainder loop for vectorization>

LOOP BEGIN at f15541.f90(10,13)
remark #25456: Number of Array Refs Scalar Replaced In Loop: 2
LOOP END
LOOP END
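
The same request can also be expressed with the OpenMP SIMD directive that this article uses for other diagnostics, compiled with -qopenmp-simd (/Qopenmp-simd on Windows). The following is a sketch under stated assumptions rather than a drop-in replacement: the stray assignment before the inner loop is omitted, the outer loop runs over the second extent n1, and the inner loop starts at j=2 so that a(j-1,i) stays within bounds. These bounds are assumptions about the intent of the example.

```fortran
subroutine foo(a, n1, n)
    implicit none
    integer, intent(in) :: n, n1
    real, intent(inout) :: a(n,n1)
    integer :: i, j

    ! Each column i is independent of the others, so the outer loop can be
    ! executed in SIMD fashion even though the inner loop carries a
    ! dependence from a(j-1,i) to a(j,i). The inner index j is made
    ! private explicitly so that each SIMD lane has its own copy.
    !$omp simd private(j)
    do i = 1, n1
        do j = 2, n
            a(j,i) = a(j-1,i) + 1
        end do
    end do
end subroutine foo
```

As with !DIR$ SIMD, the directive is an assertion by the programmer; the compiler does not check that the outer iterations are really independent.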

Diagnostic 15542: Loop was not vectorized: inner loop was already vectorized

Product Version: Intel® Visual Fortran Compiler XE 15.0 or a later version

Cause:

When using Visual Fortran Compiler's optimization options ( /O2 /Qopt-report:2 /Qopt-report-phase:vec ) the vectorization report indicates that the outer loop was not vectorized since the inner loop was vectorized.

Example:

The example below generates the following remark in the optimization report:

program f15542
implicit none
integer, parameter :: N=25
real :: a(N,N)=1.0, b(N) 
integer :: i, j 

do j=1,N 
  do i=1,N 
     a(i,j) = a(i,j) * i 
  end do 
  b(j) = 1.0 
end do

print*, a(3,3), b(3) 
end program f15542

ifort -c /O2 /Qopt-report:2 /Qopt-report-phase:vec /Qopt-report-file:stdout f15542.f90

(ifort -c -O2 -qopt-report2 f15542.f90 for Linux)

Begin optimization report for: F15542

Report from: Vector optimizations [vec]

LOOP BEGIN at f15542.f90(7,1)
remark #15542: loop was not vectorized: inner loop was already vectorized

LOOP BEGIN at f15542.f90(8,3)
remark #15300: LOOP WAS VECTORIZED
LOOP END

LOOP END

Diagnostic 15543: Loop Was Not Vectorized: Loop With Function Call Not Considered An Optimization Candidate

Cause:

A function call inside the loop is preventing auto-vectorization.

Example:

Program foo 
    implicit none
    integer, parameter  :: nx = 100000000
    real(8)             :: x, xp, sumx
    integer             :: i
    interface
       real(8) function bar(x, xp) 
          real(8), intent(in) :: x, xp
       end
    end interface 
  
    sumx = 0.
    xp   = 1.

    do i = 1,nx
       x = 1.D-8*real(i,8)
       sumx = sumx + bar(x,xp)
    enddo

    print *, 'Sum =',sumx
      
end


real(8) function bar(x, xp) 
  implicit none
  real(8), intent(in) :: x, xp

  bar = 1. - 2.*(x-xp) + 3.*(x-xp)**2 - 1.5*(x-xp)**3  + 0.2*(x-xp)**4
  bar = bar / sqrt(x**2 + xp**2)
  
end

> ifort -qopt-report-phase=vec -qopt-report-file=stderr bar.f90 foo.f90

( ifort /Qopt-report-phase:vec /Qopt-report-file:stderr bar.f90 foo.f90 on Windows*)

Non-optimizable loops:

LOOP BEGIN at foo.f90(18,5)
remark #15543: loop was not vectorized: loop with function call not considered an optimization candidate. [ foo.f90(17,22) ]
LOOP END

Resolution:

The loop and function call can be vectorized using the explicit vector programming capabilities of OpenMP 4.0 or Intel® Cilk™ Plus.

For example, adding an OpenMP DECLARE SIMD directive to the function bar() and compiling with -qopenmp-simd allows the compiler to generate a SIMD (vectorized) version of bar() as well as a scalar version. The same OpenMP directive must be added to the interface block for bar() inside program foo. The UNIFORM clause specifies that xp is a non-varying argument, i.e., it has the same value for each iteration of the loop in the caller that is being vectorized; thus x is the only vector argument. Without UNIFORM, the compiler would have to take account that xp could also be a vector argument.

real(8) function bar(x, xp) 
!$OMP DECLARE SIMD (bar) UNIFORM(xp)
  implicit none
  real(8), intent(in) :: x, xp

  bar = 1. - 2.*(x-xp) + 3.*(x-xp)**2 - 1.5*(x-xp)**3  + 0.2*(x-xp)**4
  bar = bar / sqrt(x**2 + xp**2)
  
end

> ifort -qopenmp-simd -qopt-report-phase=vec -qopt-report-file=stderr bar.f90 foo.f90
...

remark #15301: FUNCTION WAS VECTORIZED [ bar.f90(1,18) ]

Begin optimization report for: FOO

...

LOOP BEGIN at foo.f90(16,5)

remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details
remark #15346: vector dependence: assumed OUTPUT dependence between line 17 and line 18
LOOP END

A vectorized version (actually, two) of function bar() has been generated; however, the loop inside foo has still not been vectorized. This is because the compiler sees dependencies between loop iterations carried by both x and sumx. The compiler could figure out unaided how to autovectorize a loop with just these dependencies, or a loop with just the function call, but not everything at once. We can instruct the compiler to vectorize the loop by providing a SIMD directive that specifies the properties of x and sumx:

Program foo 
    implicit none
    integer, parameter  :: nx = 100000000
    real(8)             :: x, xp, sumx
    integer             :: i

    interface
       real(8) function bar(x, xp) 
       !$OMP DECLARE SIMD (bar) UNIFORM(xp)
          real(8), intent(in) :: x, xp
       end
    end interface 
  
    sumx = 0.
    xp   = 1. 

    !$OMP SIMD  private(x)  reduction(+:sumx)
    do i = 1,nx
       x = 1.D-8*real(i,8)
       sumx = sumx + bar(x,xp)
    enddo

    print *, 'Sum =',sumx
      
end

> ifort -qopenmp-simd -qopt-report-phase=vec -qopt-report-file=stderr bar.f90 foo.f90
...

remark #15301: FUNCTION WAS VECTORIZED [ bar.f90(1,18) ]

...

LOOP BEGIN at foo.f90(17,5)
remark #15301: OpenMP SIMD LOOP WAS VECTORIZED
LOOP END

The loop is now vectorized successfully; running and timing the program shows a speedup.

Note that if the DECLARE SIMD directive is omitted, the !$OMP SIMD directive will still cause the remaining parts of the loop in foo to be vectorized, but the call to bar() will be serialized, so any performance gain is likely to be small. In either case, the private and reduction clauses of this directive are mandatory; without them, the compiler will assume no loop-carried dependencies and results may be incorrect.

For small functions such as bar(), inlining may be a simpler and more efficient way to achieve vectorization of loops containing function calls. When the caller and callee are in separate source files, as above, the application should be built with interprocedural optimization (-ipo or /Qipo). When caller and callee are in the same source file, inlining of small functions is enabled by default at optimization levels of -O2 and above.

ifort -ipo -qopt-report-phase=vec -qopt-report-file=stderr bar.f90 foo.f90

...

LOOP BEGIN at foo.f90(17,5)
remark #15300: LOOP WAS VECTORIZED
LOOP END
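
For the case where caller and callee live in the same source file, one possible layout is sketched below: bar() becomes an internal procedure of the program, which makes it visible to the compiler as an inlining candidate without -ipo. This is an illustration of the idea, not a guarantee; whether the call is actually inlined and the loop vectorized still depends on the compiler version and options, so the optimization report should be checked.

```fortran
program foo
    implicit none
    integer, parameter  :: nx = 100000000
    real(8)             :: x, xp, sumx
    integer             :: i

    sumx = 0.
    xp   = 1.

    ! With bar() visible in the same file, the compiler may inline the
    ! call and then treat the loop as an ordinary reduction loop.
    do i = 1,nx
       x = 1.D-8*real(i,8)
       sumx = sumx + bar(x,xp)
    enddo

    print *, 'Sum =',sumx

contains

    ! Same body as the external version, now an internal procedure.
    real(8) function bar(x, xp)
        real(8), intent(in) :: x, xp

        bar = 1. - 2.*(x-xp) + 3.*(x-xp)**2 - 1.5*(x-xp)**3 + 0.2*(x-xp)**4
        bar = bar / sqrt(x**2 + xp**2)
    end function bar

end program foo
```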

Diagnostic 15552: loop was not vectorized with "simd"

Updated 9/13/2018

Product Version: Intel® Fortran Compiler 15.0 and above

Cause:

When a loop contains a conditional statement that controls the assignment of a scalar value and the scalar value is referenced after the loop exits, the vectorization report generated using Intel® Fortran Compiler's optimization and vectorization report options includes a non-vectorized loop instance:

Windows* OS: /O2 /Qopt-report:2 /Qopt-report-phase:vec

Linux OS or OS X: -O2 -qopt-report2 -qopt-report-phase=vec

Example:

The example below generates the following remark in the optimization report:

subroutine f13379( a, b, n )  
  implicit none 
  integer, intent(in)               :: n
  integer, intent(in),  dimension(n) :: a
  integer, intent(out),dimension(n) :: b
   
  integer                           :: i, x=10  
   
!$omp simd  
  do i=1,n  
    if( a(i) > 0 ) then 
     x = i                    !...here is the scalar assignment 
    end if 
    b(i) = x  
  end do 
!... reference the scalar outside of the loop  
  write(*,*) "last value of x: ", x  
end subroutine f13379

$ ifort -c -O2 -qopt-report2 -qopenmp-simd -qopt-report-file=stderr -qopt-report-phase=vec f13379.f90

…

LOOP BEGIN at f13379.f90(12,3)

remark #15316:simd loop was not vectorized: scalar assignment in simd loop is prohibited, consider private, lastprivate or reduction clauses

remark #15552: loop was not vectorized with "simd"

LOOP END

…

Resolution:

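Replacing !$omp simd with !$omp simd lastprivate(x) tells the compiler how to treat the scalar: each SIMD lane works on a private copy of x, and the value from the sequentially last iteration is copied back to x after the loop. With this clause the loop can be vectorized.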

Example

subroutine f13379( a, b, n )  
  implicit none 
  integer, intent(in)               :: n
  integer, intent(in),  dimension(n) :: a
  integer, intent(out),dimension(n) :: b
   
  integer                           :: i, x=10  
   
!$omp simd lastprivate(x)  
  do i=1,n  
    if( a(i) > 0 ) then 
     x = i                    !...here is the scalar assignment 
    end if 
    b(i) = x  
  end do 
!... reference the scalar outside of the loop  
  write(*,*) "last value of x: ", x  
end subroutine f13379

$ ifort -c -O2 -qopt-report2 -qopenmp-simd -qopt-report-file=stderr -qopt-report-phase=vec f13379.f90

Begin optimization report for: F13379

Report from: Vector optimizations [vec]

LOOP BEGIN at f13379.f90(10,3)
remark #15301: SIMD LOOP WAS VECTORIZED
LOOP END

Diagnostic 25463: Optimization for this routine was skipped to constrain compile time. Consider overriding limits (-qoverride-limits).


Diagnostic 25464: Some optimizations were skipped to constrain compile time. Consider overriding limits (-qoverride-limits).

The Intel® C++ Compiler and Intel® Fortran Compiler contain certain internal limits intended to prevent excessive memory usage or compile times for very large and/or complex compilation units. When such a limit is exceeded, some optimizations are skipped to reduce the memory footprint and compile time. In such cases, for the version 15.0 update 1 and later compilers, one of the following diagnostic remarks may be printed at the head of the optimization report:

remark #25463: Optimization for this routine was skipped to constrain compile time. Consider overriding limits (-qoverride-limits).

remark #25464: Some optimizations were skipped to constrain compile time. Consider overriding limits (-qoverride-limits).

Skipping some optimizations later in the optimization sequence will typically have less impact on performance than skipping optimization for an entire function. If memory footprint and compilation time are not a concern, or if you wish to make tests, the compiler may be asked to ignore the internal limits and continue optimizing by adding the command line switch -qoverride-limits (Linux* or OS X*) or /Qoverride-limits (Windows*). This may substantially increase compile time and/or memory usage. It is the user's responsibility to ensure that sufficient memory is available. This is not a general optimization switch; it should only be used where there is a specific need and is not recommended for inexperienced users.

Other possible ways to avoid restricting optimization due to internal limits include:

Splitting up very large functions that trigger the diagnostic into two or more smaller functions;
Reducing the number of functions within a single source file;
Restricting the amount of interprocedural optimization (including inlining) between source files or within a source file;
Disabling bounds checking (note that bounds checking is enabled by default for debug builds within Intel Visual Fortran);
Reducing the optimization level (e.g. from -O3 to -O2) either for the whole source file or, by using the "optimize" pragma or directive, for a particular function within a source file (see the compiler user and reference guide).

Compiler versions earlier than 15.0 update 1 accept the -qoverride-limits (/Qoverride-limits) switch, but they do not print diagnostic remarks 25463 and 25464. 15.0 update 1 and later compilers print these remarks only if the optimization report is enabled, e.g. with -qopt-report or /Qopt-report.

Resources

Requirements for Vectorizable Loops

Vectorization Essentials

Vectorization and Optimization Reports
