Introduction
memcpy is an important and often-used function of the standard C library. Its purpose is to move data in memory from one virtual or physical address to another, consuming CPU cycles to perform the data movement. Intel® I/O Acceleration Technology (Intel® I/OAT) allows offloading of data movement to dedicated hardware within the platform, reclaiming CPU cycles that would otherwise be spent on tasks like memcpy. This article describes a usage of the Storage Performance Development Kit (SPDK) with the Intel® I/OAT DMA engine, which is implemented through the Intel® QuickData Technology Driver. The SPDK provides an interface to Intel I/OAT hardware directly from user space, which greatly reduces the software overhead involved in using the hardware. Intel I/OAT can take advantage of PCI Express non-transparent bridging, which allows movement of memory blocks between two different PCIe-connected motherboards, effectively allowing data to move between two different computers at nearly the same speed as moving data within the memory of a single computer. We include a sample application that contrasts the performance of memcpy and the Intel I/OAT equivalent functionality when moving a series of differently sized chunks of data in memory. The benchmarks are logged and the results compared. To download the sample application, click on the button at the top of this article.
Figure 1: The SPDK is an end-to-end reference architecture for Storage.
Hardware and Software Configuration
See below for information about the hardware and software configuration of the system used to create and validate the technical content of this article and sample application.
CPU and Chipset | Intel® Xeon® processor E5-2699 v4, 2.2 GHz |
Platform | Intel® Server System R2000WT product family |
Memory | Memory size: 256 GB (16 x 16 GB) DDR4 2133P; Brand/model: Micron* – MTA36ASF2G72PZ2GATESIG |
Storage | Brand and model: 1 TB Western Digital* (WD1002FAEX) plus Intel® SSD Data Center P3700 Series (SSDPEDMD400G4) |
Operating System | Ubuntu* 16.04 LTS (Xenial Xerus*), Linux* kernel 4.4.0-21-generic |
Note: SPDK can run on various Intel® processor families with platform support for Intel I/OAT.
Why Use Intel® Storage Performance Development Kit?
Solid-state storage media is becoming a part of the storage infrastructure in the data center. Current-generation flash storage enjoys significant advantages in performance, power consumption, and rack density over rotational media. These advantages will continue to grow as next-generation media enter the marketplace.
The SPDK is all about efficiency and scalable performance. The development kit reduces both processing and development overhead, and ensures the software layer is optimized to take advantage of the performance potential of next-generation storage media, like Non-Volatile Memory Express* (NVMe) devices. The SPDK team has open-sourced the user mode NVMe driver and Intel I/OAT DMA engine to the community under a permissive BSD license. The code is available directly through the SPDK GitHub* page.
Prerequisites: Building the Sample Application (for Linux)
SPDK runs on Linux with a number of prerequisite libraries installed, which are listed below.
- Install the dependencies:
  - A C++14-compliant C++ compiler
  - CMake >= 3.1
  - git
  - make
  - CUnit library
  - AIO library
  - OpenSSL library
- Get the latest version of the SPDK using the get_spdk.bash script included with the sample application. The script downloads the SPDK from the official GitHub* repository, builds it, and then installs it in the “./spdk” directory.
- Build from the “ex4” directory:
  mkdir <build-dir>
  cd <build-dir>
  cmake -DCMAKE_BUILD_TYPE=Release $OLDPWD
  make
- Get the system ready for SPDK. The following command needs to be run once, as a privileged user, before running any SPDK application:
  (cd ./spdk && sudo scripts/setup.sh)
Getting Started with the Sample Application
The sample application contains the following:
Figure 2: List of files that are parts of the sample application
This example goes through the following steps to show the usage of the Intel I/OAT driver:
Program Setup
- In the “main.cpp” file, the program probes the system for Intel I/OAT devices and calls a probe callback function for each device. If the probe callback returns true, SPDK attaches the Intel I/OAT device and, on success, calls the attach callback function.
- Then, the main program defines each test and sets up the buffers.
- After setting up the buffers, the main program runs through three different memcpy routines in a for-loop. The first routine uses the regular memcpy from the standard C library.
- The second routine uses the Intel I/OAT driver to perform the sequential memory copy using the Intel I/OAT channels.
- The third routine uses the Intel I/OAT driver to perform the parallel memory copy using the Intel I/OAT channels.
- Once the three routines are complete, the main program displays the results for each for-loop iteration.
- Finally, after completing the for-loop, the main program releases the buffers.
BIOS Setup
Before running the application, the platform needs to enable the Intel I/OAT feature in the BIOS for each CPU socket; otherwise, the sample program will not run.
Figure 3: BIOS setting for Intel I/OAT function
SPDK Setup
After the BIOS setup is done, SPDK needs to be initialized for the application to recognize all of the Intel I/OAT channels.
Figure 4: Setting up the Intel I/OAT channels
Run the Example
Figure 5: Results of the memcpy and Intel I/OAT equivalent function
From the output, storage developers can use the results as a guide to determine the best combination of chunk size and buffer size for offloading memcpy from the CPU to the Intel I/OAT channels in their storage application. With the copy offloaded to the Intel I/OAT channels, the CPU can perform other tasks in parallel with the memcpy task.
Notes: 2x Intel® Xeon® processor E5-2699v4 (HT off), Intel Speed Step® enabled, Intel® Turbo Boost Technology disabled, 16x16GB DDR4 2133 MT/s, 1 DIMM per channel, Ubuntu* 16.04 LTS, Linux kernel 4.4.0-21-generic, 1 TB Western Digital* (WD1002FAEX), 1 Intel® SSD P3700 Series (SSDPEDMD400G4), 22x per CPU socket. Performance was measured with the sample application written for this article.
Conclusion
This tutorial and sample application show one way to incorporate SPDK and the Intel I/OAT feature into your storage application. The example shows how to prepare the buffers and perform the memory copy, along with hardware configuration and full build instructions. SPDK provides the Intel QuickData Technology drivers and helps you quickly adapt your application to run on Intel® architecture with Intel I/OAT.
Authors
- Thai Le is a software engineer who focuses on cloud computing and performance computing analysis at Intel.
- Steven Briscoe is an application engineer focusing on cloud computing within the Software Services Group at Intel Corporation (UK).
- Jonathan Stern is an applications engineer and solutions architect who works to support storage acceleration software at Intel.