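7.4. Example: Specifying Bank-Selection Bits for Local Memory Addresses
The (b0, b1, ..., bn) arguments refer to the positions of the local memory address bits that the Intel® HLS Compiler Pro Edition uses as bank-selection bits. Specifying the hls_bankbits(b0, b1, ..., bn) attribute implies that the number of banks equals 2^(number of bank bits). The table below illustrates this for a 5-bit address whose two most significant bits select one of four banks.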
Word | Bank 0 | Bank 1 | Bank 2 | Bank 3 |
Word 0 | 00 000 | 01 000 | 10 000 | 11 000 |
Word 1 | 00 001 | 01 001 | 10 001 | 11 001 |
Word 2 | 00 010 | 01 010 | 10 010 | 11 010 |
Word 3 | 00 011 | 01 011 | 10 011 | 11 011 |
Word 4 | 00 100 | 01 100 | 10 100 | 11 100 |
Word 5 | 00 101 | 01 101 | 10 101 | 11 101 |
Word 6 | 00 110 | 01 110 | 10 110 | 11 110 |
Word 7 | 00 111 | 01 111 | 10 111 | 11 111 |
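The address split in the table can be reproduced with a few shifts and masks. The following is a minimal sketch in plain C++, not HLS code. The names `kWordBits`, `kBankBits`, `num_banks`, `bank_of`, and `word_of` are hypothetical helpers introduced only for illustration, and the 5-bit layout (two bank bits above three word bits) is assumed from the table.

```cpp
#include <cassert>

// Address layout assumed from the table: 5-bit address, [b4 b3 | b2 b1 b0].
const unsigned kWordBits = 3;  // b0..b2 select the word within a bank
const unsigned kBankBits = 2;  // b3..b4 are the bank-selection bits

// Number of banks = 2^(number of bank-selection bits).
inline unsigned num_banks() { return 1u << kBankBits; }

// Bank index: the address bits above the word bits.
inline unsigned bank_of(unsigned addr) {
  assert(addr < (1u << (kWordBits + kBankBits)));
  return addr >> kWordBits;
}

// Word index: the low kWordBits bits of the address.
inline unsigned word_of(unsigned addr) {
  assert(addr < (1u << (kWordBits + kBankBits)));
  return addr & ((1u << kWordBits) - 1u);
}
```

For instance, the address 10 011 (0x13) decodes to bank 2, word 3, which matches the "Word 3" row under "Bank 2" in the table.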
Example of Implementing the hls_bankbits Attribute

```cpp
component int bank_arbitration (int raddr,
                                int waddr,
                                int wdata) {

  #define DIM_SIZE 4

  // Adjust memory geometry by preventing coalescing
  hls_numbanks(1)
  hls_bankwidth(sizeof(int)*DIM_SIZE)

  // Force each memory bank to have 2 ports for read/write
  hls_singlepump
  hls_max_replicates(1)

  int a[DIM_SIZE][DIM_SIZE][DIM_SIZE];

  // initialize array a…
  int result = 0;

  #pragma unroll
  for (int dim1 = 0; dim1 < DIM_SIZE; dim1++)
    #pragma unroll
    for (int dim3 = 0; dim3 < DIM_SIZE; dim3++)
      a[dim1][waddr&(DIM_SIZE-1)][dim3] = wdata;

  #pragma unroll
  for (int dim1 = 0; dim1 < DIM_SIZE; dim1++)
    #pragma unroll
    for (int dim3 = 0; dim3 < DIM_SIZE; dim3++)
      result += a[dim1][raddr&(DIM_SIZE-1)][dim3];

  return result;
}
```
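As the following figure shows, this code example generates multiple load and store instructions, and therefore multiple load/store units (LSUs) in the hardware. If the memory system is not split into multiple banks, there are fewer ports than memory access instructions, which leads to arbitrated accesses. This arbitration results in a high loop initiation interval (II) value. Avoid arbitration whenever possible because it increases the FPGA area use of your component and impairs its performance.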

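By default, the Intel® HLS Compiler Pro Edition splits the memory into banks if it determines that the split benefits the performance of your component. The compiler checks whether any address bits remain constant between accesses, and automatically infers the bank-selection bits.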
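To see what "bits that remain constant between accesses" means for the array in this example, the sketch below computes, for one unrolled access site a[d1][*][d3], which element-address bits never change as the runtime middle index varies. This is only an illustration of the idea and not the compiler's actual analysis. It assumes the usual row-major layout of a C array, and `kDimSize`, `element_address`, and `constant_bits_mask` are hypothetical names.

```cpp
#include <cassert>

const int kDimSize = 4;  // mirrors DIM_SIZE in the example

// Element address of a[d1][d2][d3], assuming row-major layout.
inline unsigned element_address(int d1, int d2, int d3) {
  return static_cast<unsigned>((d1 * kDimSize + d2) * kDimSize + d3);
}

// Mask of the 6 element-address bits that keep the same value for every
// runtime value of the middle index at the access site a[d1][*][d3].
inline unsigned constant_bits_mask(int d1, int d3) {
  assert(d1 >= 0 && d1 < kDimSize && d3 >= 0 && d3 < kDimSize);
  const unsigned base = element_address(d1, 0, d3);
  unsigned varying = 0;
  for (int d2 = 1; d2 < kDimSize; ++d2)
    varying |= base ^ element_address(d1, d2, d3);  // bits that differ from base
  return ~varying & 0x3Fu;                          // keep only bits 0..5
}
```

For every access site the mask is 0x33: bits 2 and 3 follow the runtime address, while bits 0 and 1 (set by dim3) and bits 4 and 5 (set by dim1) are fixed once the loops are unrolled.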
```cpp
component int bank_no_arbitration (int raddr,
                                   int waddr,
                                   int wdata) {

  #define DIM_SIZE 4

  // Adjust memory geometry by preventing coalescing and splitting memory
  hls_bankbits(4, 5)
  hls_bankwidth(sizeof(int)*DIM_SIZE)

  // Force each memory bank to have 2 ports for read/write
  hls_singlepump
  hls_max_replicates(1)

  int a[DIM_SIZE][DIM_SIZE][DIM_SIZE];

  // initialize array a…
  int result = 0;

  #pragma unroll
  for (int dim1 = 0; dim1 < DIM_SIZE; dim1++)
    #pragma unroll
    for (int dim3 = 0; dim3 < DIM_SIZE; dim3++)
      a[dim1][waddr&(DIM_SIZE-1)][dim3] = wdata;

  #pragma unroll
  for (int dim1 = 0; dim1 < DIM_SIZE; dim1++)
    #pragma unroll
    for (int dim3 = 0; dim3 < DIM_SIZE; dim3++)
      result += a[dim1][raddr&(DIM_SIZE-1)][dim3];

  return result;
}
```
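The following figure shows that this example code creates a memory configuration with four banks. Using bits 4 and 5 as the bank-selection bits ensures that each load/store access is directed to its own memory bank.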
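A quick arithmetic check shows why bits 4 and 5 direct each access to a fixed bank. The sketch below is plain C++ rather than HLS code. It assumes row-major element addresses, and `kDimSize`, `element_address`, `bank_from_bits_4_5`, and `bank_is_fixed_by_dim1` are hypothetical helper names.

```cpp
#include <cassert>

const int kDimSize = 4;  // mirrors DIM_SIZE in the example

// Element address of a[d1][d2][d3], assuming row-major layout.
inline unsigned element_address(int d1, int d2, int d3) {
  assert(d1 >= 0 && d1 < kDimSize);
  assert(d2 >= 0 && d2 < kDimSize);
  assert(d3 >= 0 && d3 < kDimSize);
  return static_cast<unsigned>((d1 * kDimSize + d2) * kDimSize + d3);
}

// Bank index formed from element-address bits 4 and 5.
inline unsigned bank_from_bits_4_5(unsigned elem_addr) {
  return (elem_addr >> 4) & 0x3u;
}

// Returns true if every access a[d1][addr][d3] lands in bank d1,
// whatever the runtime value of addr is.
inline bool bank_is_fixed_by_dim1() {
  for (int d1 = 0; d1 < kDimSize; ++d1)
    for (int d3 = 0; d3 < kDimSize; ++d3)
      for (int addr = 0; addr < kDimSize; ++addr)
        if (bank_from_bits_4_5(element_address(d1, addr, d3)) !=
            static_cast<unsigned>(d1))
          return false;
  return true;
}
```

Because dim1 is a compile-time constant in each unrolled iteration, the bank of every access is known statically, so accesses with different dim1 values never compete for the same bank.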

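In this code example, setting hls_numbanks(4) instead of hls_bankbits(4,5) results in the same memory configuration because the Intel® HLS Compiler Pro Edition automatically infers the optimal bank-selection bits.
In the Function Memory Viewer (in the High-Level Design Reports), the Address bit information shows the bank-selection bits as b6 and b7 instead of b4 and b5: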
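This difference occurs because the address bits reported in the Function Memory Viewer are based on byte addresses, not element addresses. Because each element of array a is four bytes, bits b4 and b5 of the element address correspond to bits b6 and b7 of the byte address.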
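The offset between the two numbering schemes is log2 of the element size in bytes. The sketch below shows that conversion in plain C++. The helper names `log2_pow2` and `byte_address_bit` are hypothetical, and the code assumes a power-of-two element size.

```cpp
#include <cassert>
#include <cstddef>

// log2 of a power-of-two value (for example, 4 -> 2).
inline unsigned log2_pow2(std::size_t n) {
  assert(n != 0 && (n & (n - 1)) == 0);  // must be a power of two
  unsigned shift = 0;
  while ((n >> shift) != 1) ++shift;
  return shift;
}

// Converts an element-address bit position into the corresponding
// byte-address bit position for elements of element_size bytes.
inline unsigned byte_address_bit(unsigned element_bit, std::size_t element_size) {
  return element_bit + log2_pow2(element_size);
}
```

With the 4-byte int elements of this example, `byte_address_bit(4, 4)` and `byte_address_bit(5, 4)` give 6 and 7, matching the b6 and b7 reported by the viewer.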