Intel® OpenCL™ Graphics Extensions

ID 689020
Updated 9/8/2016
Version Latest
Public

OpenCL Extensions available in Intel® SDK for OpenCL™ Applications

The following tables contain information about extensions to the Khronos Group OpenCL™ standard available for Intel processors.   

Notice: Not all extensions are available in all versions of the OpenCL drivers for each OS. Some features are only available on certain hardware platforms or in certain driver baselines. 
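
Because availability varies by driver and hardware, an application would normally check for an extension at run time before using it. The OpenCL host API reports a device's extensions as a single space-separated string (the CL_DEVICE_EXTENSIONS query of clGetDeviceInfo). The sketch below shows only the string-matching step in Python; the sample string and the helper names are hypothetical, and in a real application the string would come from the OpenCL runtime.

```python
def parse_extensions(extension_string):
    """Split a space-separated extension string into a set of names."""
    return set(extension_string.split())


def missing_extensions(extension_string, required):
    """Return the required names the device does not report, in input order."""
    available = parse_extensions(extension_string)
    return [name for name in required if name not in available]


# Hypothetical value, shaped like the result of a CL_DEVICE_EXTENSIONS query.
sample_extensions = (
    "cl_khr_fp16 cl_khr_icd cl_intel_subgroups "
    "cl_intel_required_subgroup_size cl_khr_gl_sharing"
)

required = ["cl_intel_subgroups", "cl_intel_subgroups_short"]
missing = missing_extensions(sample_extensions, required)
print(missing)
```

Whole-name matching is used instead of a substring search because some extension names are prefixes of others: a substring test for cl_intel_subgroups would also succeed on a device that reports only cl_intel_subgroups_short.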

 

Preview Extensions

General info on preview extensions: /content/www/cn/zh/develop/articles/intel-opencl-experimental-features.html

 

Media Extensions

These extensions enable video processing applications to access hardware features in Intel processors.

| Extension Name | Description | Supported HW | Notes |
| --- | --- | --- | --- |
| cl_intel_device_side_avc_motion_estimation | See below. Provides programmers with a macroblock-level interface to the motion estimation functionality available in the Intel graphics processor media sampler. It describes the specification of low-level built-in functions, callable from OpenCL kernels, to evaluate AVC motion estimation operations. It covers everything the host-side motion estimation extensions can do and more. Sample: Intro to Device Side AVC Motion Estimation | Gen9 | New in Linux SRB4; not yet available for Windows. |
| cl_intel_advanced_motion_estimation, cl_intel_motion_estimation | Provides a frame-level interface implemented as built-in kernels to accelerate motion estimation operations. Supports AVC block sizes, inter/intra estimation, skip checks, and motion vector costing. Notes: Version 2 of this spec was introduced in early 2016. Advanced VME allows access to a superset of the features of the original cl_intel_motion_estimation extension. For more info: https://software.intel.com/en-us/articles/intro-to-advanced-motion-estimation-extension-for-opencl Motion estimation samples are available in the Media Server Studio samples. More info: Spec | | |
| cl_intel_packed_yuv | YUV is usually a planar format. This extension provides support for a few specific formats of packed YUV images. More info: Spec | | |
| cl_intel_planar_yuv | Provides support for the Planar YUV (YCbCr) image formats. More info: Spec | Gen9 | New in Linux SRB4; not yet available for Windows. |
| cl_intel_media_block_io | Built-in functions to facilitate the reading and writing of flexible 2D regions from images. Augments the Intel vendor extensions cl_intel_subgroups and cl_intel_subgroups_short. More info: Spec | Gen9 | New in Linux SRB4; not yet available for Windows. |
| VEBox preview extensions: cl_intelx_video_enhancement, cl_intelx_video_enhancement_color_pipeline, cl_intelx_video_enhancement_camera_pipeline. More info on preview features | Built-in functions to work with VEBox. Samples: Minimal VEBox Samples. More info: OpenCL Preview Extensions for VEBox | Gen9 | New in Linux SRB4; not yet available for Windows. |

 

Sharing Extensions

This group of extensions enables interoperability between OpenCL and other APIs using Intel GPUs.

| Extension Name | Description | Supported HW | Notes |
| --- | --- | --- | --- |
| cl_intel_simultaneous_sharing | The OpenCL 1.2 Extension Spec forbids interoperability with multiple graphics APIs at clCreateContext or clCreateContextFromType time. It defines that CL_INVALID_OPERATION should be returned in such cases. The goal of this extension is to relax the restrictions and allow simultaneous use of API combinations as supported by a given OpenCL device. More info: Spec | | |
| cl_intel_va_api_media_sharing | Linux/Android media sharing. More info: Spec. See /content/www/cn/zh/develop/articles/tutorial-opencl-interoperability-with-video-acceleration-api-on-linux-os.html Used in Media Server Studio samples. | | |
| cl_intel_d3d11_nv12_media_sharing, cl_intel_dx9_media_sharing | Windows sharing APIs (created before the Khronos extensions below). See /content/www/cn/zh/develop/articles/d3d9-media-surface-sharing-between-intel-quick-sync-video-and-opencl-on-intel-hd-graphics.html Used in Media Server Studio samples. More info: d3d11 Spec, dx9 Spec | | |
| cl_khr_dx9_media_sharing, cl_khr_d3d10_sharing, cl_khr_d3d11_sharing | Sharing for DirectX 9, 10, and 11. See /content/www/cn/zh/develop/articles/opencl-and-intel-media-sdk.html More info: dx9 Spec, d3d10 Spec, d3d11 Spec | | |
| cl_khr_gl_sharing, cl_khr_gl_msaa_sharing, cl_khr_gl_depth_images, cl_khr_gl_event | Sample: https://software.intel.com/sites/default/files/managed/2c/79/intel_ocl_ogl_interop_win.zip Related pages: /content/www/cn/zh/develop/articles/opencl-and-opengl-interoperability-tutorial.html More info: gl_sharing Spec, gl_msaa_sharing Spec, gl_depth_images Spec, gl_event Spec | | |

Subgroups Extensions

Work items in a subgroup can share data without implementing shared local memory or using barriers. This extends the work group concept to allow more efficient data sharing.
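
To make the data-sharing idea concrete, here is a small conceptual model in Python of what two subgroup built-ins, sub_group_reduce_add and sub_group_broadcast, hand back to each work-item. It is not OpenCL code and says nothing about how the hardware implements these operations. The Python helper names, the subgroup size of 8, and the 16-item work-group are assumptions chosen for illustration.

```python
def split_into_subgroups(values, subgroup_size):
    """Partition per-work-item values into consecutive subgroups."""
    return [values[i:i + subgroup_size]
            for i in range(0, len(values), subgroup_size)]


def model_sub_group_reduce_add(values, subgroup_size):
    """Each work-item receives the sum of the values held in its own subgroup."""
    result = []
    for subgroup in split_into_subgroups(values, subgroup_size):
        total = sum(subgroup)
        result.extend([total] * len(subgroup))
    return result


def model_sub_group_broadcast(values, subgroup_size, source_lane):
    """Each work-item receives the value held by one lane of its own subgroup."""
    result = []
    for subgroup in split_into_subgroups(values, subgroup_size):
        result.extend([subgroup[source_lane]] * len(subgroup))
    return result


# Hypothetical work-group of 16 work-items holding the values 1..16.
work_group = list(range(1, 17))
reduced = model_sub_group_reduce_add(work_group, 8)
broadcast = model_sub_group_broadcast(work_group, 8, 0)
print(reduced)
print(broadcast)
```

In this model the two subgroups never see each other's values: the first eight work-items all receive 36 and the last eight all receive 100. In a kernel, the same exchange happens through the subgroup built-ins rather than through a local-memory buffer and a barrier.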

| Extension Name | Description | Supported HW | Notes |
| --- | --- | --- | --- |
| cl_intel_subgroups | Lets work-items in a work-group cooperate and share data without using local memory or barriers. Similar to OpenCL 2.0 workgroups. See /content/www/cn/zh/develop/articles/sgemm-ocl-opt.html /content/www/cn/zh/develop/articles/sgemm-for-intel-processor-graphics.html /content/www/cn/zh/develop/articles/box-blur-filter-using-intel-subgroup-extensions-in-opencl.html More info: Spec | | |
| cl_intel_required_subgroup_size | Lets programmers optionally specify the required subgroup size for a kernel function. This information is important for the correctness of many subgroup algorithms, and in some cases the compiler may use it to generate better-optimized code. More info: Spec | | |
| cl_khr_subgroups | Implementation-controlled division of a work-group that allows independent forward progress within the work-group. This feature was promoted to Core in OpenCL 2.1. More info: Spec | | |
| cl_intel_subgroups_short | Improves the performance of applications operating on 16-bit data types by extending the subgroup functions described in the cl_intel_subgroups extension to support 16-bit integer data types (shorts and ushorts). More info: Spec | | |

Other Extensions

| Extension Name | Description | Supported HW | Notes |
| --- | --- | --- | --- |
| cl_intel_accelerator | Basic accelerator support. The accelerator extension consists of a unified set of OpenCL runtime APIs to create, query, and manage the lifetime of objects that represent acceleration processors, engines, or algorithms. More info: | | |
| cl_intel_driver_diagnostics | Allows the driver to pass additional strings containing diagnostic information. The diagnostic messages can help explain how the driver works and can provide guidance on modifying an application to improve performance. More info: | | |
| cl_khr_3d_image_writes | Enables writes to 3D image objects. More info: Spec | | |
| cl_khr_byte_addressable_store | Removes restrictions on built-in types. Needed to write to elements of a pointer or struct of type char, uchar, char2, uchar2, short, ushort, and half. More info: Spec | | |
| cl_khr_spir | OpenCL Standard Portable Intermediate Representation (SPIR), a non-source representation of OpenCL. More info: Spec | | |
| cl_khr_fp16 | Half-precision floating-point support. More info: Spec | | |
| cl_khr_fp64 | IEEE-754 double-precision floating-point support. More info: Spec | | |
| cl_khr_global_int32_base_atomics | 32-bit integer base atomic operations in global memory. More info: Spec | | |
| cl_khr_global_int32_extended_atomics | 32-bit integer extended atomic operations in global memory. More info: Spec | | |
| cl_khr_icd | Access to the Khronos OpenCL installable client driver loader (ICD Loader). More info: Spec | | |
| cl_khr_image2d_from_buffer | Support for creating a 2D image from a buffer. More info: | | |
| cl_khr_mipmap_image, cl_khr_mipmap_image_writes | cl_khr_mipmap_image: ability to create and read mipmapped images. cl_khr_mipmap_image_writes: adds the ability to write mipmapped images; requires cl_khr_mipmap_image. More info: Spec | | |
| cl_khr_depth_images | Depth images. More info: Spec | | |
| cl_khr_throttle_hints | Extension to the OpenCL 2.1 API that allows the driver to implement throttling behavior. Throttling behavior is implementation-specific. More info: Spec | | |
| cl_khr_local_int32_base_atomics, cl_khr_local_int32_extended_atomics | More info: base_atomics Spec, extended_atomics Spec | | |

Deprecated Extensions

| Extension Name | Description |
| --- | --- |
| cl_intel_ctz | Built-in count trailing zeroes |
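
For readers unfamiliar with the operation, the following Python sketch shows what a count-trailing-zeroes function computes for a 32-bit value. It is a reference model only, not the extension's implementation. The function name and the 32-bit default are hypothetical, and returning the bit width for an input of zero is an assumption that follows the convention of the ctz built-in in OpenCL C 2.0.

```python
def count_trailing_zeros(x, bits=32):
    """Count the zero bits below the lowest set bit of an unsigned value."""
    x &= (1 << bits) - 1          # keep only the low `bits` bits
    if x == 0:
        return bits               # assumption: zero input yields the bit width
    lowest_set_bit = x & -x       # isolates the least significant 1 bit
    return lowest_set_bit.bit_length() - 1


print(count_trailing_zeros(12))   # 12 is binary 1100
```

The expression x & -x keeps only the lowest set bit, so its position is the number of zero bits beneath it.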

 
