Introduction
Distributed Asynchronous Object Storage (DAOS) is a high-performance storage system designed to push the limits of Intel hardware. It is designed around Intel Xeon processors, persistent memory, and NVMe SSDs. It has been awarded top spots in the IO500 and IO500 10-node Challenge multiple times in the past several years. For more information, please refer to DAOS: Revolutionizing High-Performance Storage with Intel® Optane™ Technology.
This article will show how to easily interface with DAOS from Python through the PyDAOS package. There are two key advantages of using a dictionary stored in DAOS as opposed to a regular Python dictionary. The first is that you can manipulate a gigantic Key-Value (KV) store, given that nothing is stored in local memory. The second is that your dictionary is persistent (no need to sync your data to disk), which means that if you quit your program and reload it, your data will still be there.
It is essential to point out that PyDAOS is a work in progress. For example, only keys and values of type string are currently supported. In addition, the only supported data structure is a dictionary (although arrays will be included in the near future).
Installing DAOS also installs PyDAOS automatically. The location (in Linux) is:
<DAOS_INSTALLATION_DIR>/lib64/python3.6/site-packages/
If Python does not find this path automatically, you can add it manually using the sys module:
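A minimal sketch of that step follows. The `<DAOS_INSTALLATION_DIR>` placeholder is kept from the path above and must be replaced with the real installation prefix; the `python3.6` component may also differ depending on the interpreter version DAOS was built against.

```python
import sys

# Placeholder path from the text above: substitute your real DAOS install prefix.
PYDAOS_SITE_PACKAGES = "<DAOS_INSTALLATION_DIR>/lib64/python3.6/site-packages/"

# Append only once so repeated runs (for example in a notebook) do not
# duplicate the entry.
if PYDAOS_SITE_PACKAGES not in sys.path:
    sys.path.append(PYDAOS_SITE_PACKAGES)

# After this, `import pydaos` can resolve from that directory if the package is there.
```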
This is usually not required if you install DAOS from repository packages.
Pools and Containers
The first thing you will need to use PyDAOS is an existing DAOS pool and container. At the moment, both have to exist beforehand; it is not possible to create pools or containers from the Python API. To create a pool, you run:
That command creates a 1 GiB pool labeled pydaos (you can choose any other name). Next, we can create our container inside this pool by running:
We pass the type PYTHON to indicate that the container will be used from a client written in Python. The type designates a pre-defined layout of the data in terms of the underlying DAOS object model. For example, other available types are HDF5 and POSIX. The label kvstore is arbitrary: you can choose any other name.
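As a rough sketch, the pool and container commands described above can be assembled from Python as shown below. The flag spellings are assumptions based on DAOS 2.0-era syntax and have changed between releases, so check `dmg pool create --help` and `daos cont create --help` on your system before relying on them. The snippet only prints the commands; it does not execute them.

```python
# Labels and size taken from the text above.
POOL_LABEL = "pydaos"
CONT_LABEL = "kvstore"

# Assumed flag spellings (DAOS 2.0-era); later releases may take the labels
# as positional arguments instead.
create_pool = ["dmg", "pool", "create", "--label=" + POOL_LABEL, "--size=1G"]
create_cont = ["daos", "cont", "create", "--type=PYTHON",
               "--pool=" + POOL_LABEL, "--label=" + CONT_LABEL]

for cmd in (create_pool, create_cont):
    # Print instead of executing. To run for real, pass each list to
    # subprocess.run(cmd, check=True) on a host with the DAOS tools installed.
    print(" ".join(cmd))
```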
PyDAOS Step by Step
First, we have to make sure that we import all the necessary classes:
DCont represents a container, DDict a dictionary, and DObjNotFound is used to catch exceptions raised when objects are not found in a container.
But before we can get or create objects, we have to create a Python container object by passing the pool and container labels:
We can also use the last parameter to create our object from the path to the container in the unified namespace:
Now we can get (or create) a dictionary object. We can use the DObjNotFound exception to create the dictionary if it doesn't exist:
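The get-or-create pattern can be sketched as below. Based on the description above, it assumes the container object exposes `get(name)`, which raises `DObjNotFound` for a missing object, and `dict(name)`, which creates a dictionary; verify both against your PyDAOS version. So that the snippet runs without a DAOS cluster, `StandInCont` and `StandInNotFound` are hypothetical in-memory substitutes and are not part of PyDAOS.

```python
class StandInNotFound(Exception):
    """Hypothetical substitute for pydaos.DObjNotFound."""


class StandInCont:
    """Hypothetical in-memory substitute for a pydaos.DCont object."""

    def __init__(self):
        self._objects = {}

    def get(self, name):
        # Mirror the behaviour described in the text: a missing object raises.
        if name not in self._objects:
            raise StandInNotFound(name)
        return self._objects[name]

    def dict(self, name):
        # Create an empty dictionary object under the given name.
        self._objects[name] = {}
        return self._objects[name]


def get_or_create_dict(cont, name, not_found_exc):
    """Return the dictionary called `name`, creating it on first use."""
    try:
        return cont.get(name)
    except not_found_exc:
        return cont.dict(name)


# With a real deployment this would presumably be (assumption, untested here):
#   from pydaos import DCont, DObjNotFound
#   daos_cont = DCont("pydaos", "kvstore", None)
#   daos_dict = get_or_create_dict(daos_cont, "dict-0", DObjNotFound)
cont = StandInCont()
first = get_or_create_dict(cont, "dict-0", StandInNotFound)   # created
second = get_or_create_dict(cont, "dict-0", StandInNotFound)  # found
print(first is second)
```

Passing the exception class as a parameter keeps the helper independent of the stand-in, so the same function works once the real classes are imported.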
Again, the name dict-0 is arbitrary. Now that we have our dictionary object, we can start inserting, reading, and deleting keys against our DAOS container.
Insert a New Key
To insert a new key, use put():
Get a Key
To get a key, use the [] interface (as in native Python dictionaries):
Delete a Key
To delete a key, use pop():
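The three single-key operations described so far (`put()`, the `[]` lookup, and `pop()`) can be tried together in one sketch. `StandInDict` is a hypothetical in-memory substitute that adds only `put()` to a native dictionary so the snippet runs without DAOS; with PyDAOS, the same calls would be made on the dictionary object obtained from the container. Keys and values are strings, matching the limitation noted in the introduction.

```python
class StandInDict(dict):
    """Hypothetical in-memory substitute for a PyDAOS dictionary."""

    def put(self, key, value):
        # Same method name as the insert call named in the text.
        self[key] = value


daos_dict = StandInDict()

# Insert a new key.
daos_dict.put("key1", "value1")

# Read it back with the [] interface.
value = daos_dict["key1"]
print(value)

# Delete it with pop().
daos_dict.pop("key1")
print("key1" in daos_dict)
```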
Iterate the Whole Dictionary
We can iterate the whole dictionary as we would do with a native Python dictionary:
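Since the text states that iteration works as for a native dictionary, the sketch below uses a plain `dict` as a stand-in for the PyDAOS dictionary and assumes that iterating yields keys, as it does for `dict`. The key and value names are illustrative.

```python
# A plain dict stands in for the PyDAOS dictionary (assumption: iterating
# the real object yields keys, as iterating a native dict does).
daos_dict = {"key1": "value1", "key2": "value2"}

pairs = []
for key in daos_dict:
    # Look each value up through the [] interface.
    pairs.append((key, daos_dict[key]))
    print("key=" + key + " value=" + daos_dict[key])
```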
Bulk Insertion
PyDAOS dictionaries allow us to also insert and read in bulk. We can do bulk insertion by passing a Python dictionary to bput():
Read in Bulk
To read in bulk, we pass a Python dictionary with the keys that we want to read to bget():
It is also possible to read all keys in bulk with dump():
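The bulk calls named above (`bput()`, `bget()`, and `dump()`) can be sketched together. `StandInDict` is a hypothetical in-memory substitute so the snippet runs without DAOS. The `bget()` semantics used here, filling in the values of the dictionary passed in and returning it, are an assumption to check against your PyDAOS version.

```python
class StandInDict(dict):
    """Hypothetical in-memory substitute for a PyDAOS dictionary."""

    def bput(self, items):
        # Insert every key/value pair of a native dict in one call.
        self.update(items)

    def bget(self, wanted):
        # Assumed semantics: fill in the value for each requested key
        # (None for a key that is absent) and return the same dict.
        for key in wanted:
            wanted[key] = self.get(key)
        return wanted

    def dump(self):
        # Return every stored pair as a native dict.
        return dict(self)


daos_dict = StandInDict()
daos_dict.bput({"key1": "value1", "key2": "value2", "key3": "value3"})

# Ask for two keys; the values are placeholders until bget() fills them.
wanted = {"key1": None, "key3": None}
daos_dict.bget(wanted)
print(wanted)

# Read everything at once.
everything = daos_dict.dump()
print(len(everything))
```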
Total Number of Keys
Finally, we can get the total number of keys stored in our dictionary with len():
A Complete Example
Now that we have all the pieces, let’s put them together to create a complete example. The example is a simple program (kvmanage.py) to manage a DAOS KV store interactively through a command line interface (CLI):
The program accepts multiple commands to manage a KV store: read a key, read all keys, insert a new key, delete a key, insert new keys in bulk, read keys in bulk, and quit. The program runs an infinite loop until the user selects the quit command.
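A compact sketch of such a command loop is shown below. It is not the original kvmanage.py: the command letters (`i`, `r`, `ra`, `d`, `q`) are hypothetical, the two bulk commands are left out for brevity, and `StandInDict` is an in-memory substitute so the loop runs without DAOS. The string concatenation also assumes that values come back as `str`.

```python
class StandInDict(dict):
    """Hypothetical in-memory substitute for a PyDAOS dictionary."""

    def put(self, key, value):
        self[key] = value

    def dump(self):
        return dict(self)


def run(store, lines):
    """Process one command per line until 'q'; yield the output lines."""
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        cmd, args = parts[0], parts[1:]
        if cmd == "q":                           # quit
            break
        elif cmd == "i" and len(args) == 2:      # insert: i <key> <value>
            store.put(args[0], args[1])
            yield "inserted " + args[0]
        elif cmd == "r" and len(args) == 1:      # read one key: r <key>
            try:
                yield args[0] + "=" + store[args[0]]
            except KeyError:
                yield "key not found: " + args[0]
        elif cmd == "ra" and not args:           # read all keys
            for key, value in sorted(store.dump().items()):
                yield key + "=" + value
        elif cmd == "d" and len(args) == 1:      # delete: d <key>
            try:
                store.pop(args[0])
                yield "deleted " + args[0]
            except KeyError:
                yield "key not found: " + args[0]
        else:
            yield "unknown command: " + line


for text in run(StandInDict(), ["i key1 value1", "ra", "d key1", "r key1", "q"]):
    print(text)
```

Because `run()` is a generator that consumes lines lazily, it can be driven interactively with `for text in run(store, iter(lambda: input("> "), "q")): print(text)`, or fed a fixed list of commands as above.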
For example, to insert a new key:
Now we can read all keys and see our newly inserted key:
An Additional Example Using JSON Files
Below is a simple example demonstrating the use of PyDAOS with JSON files. Traditionally, a user would perform such read/write operations in local memory, but with the PyDAOS API we can take advantage of DAOS’s performance through simple KV store operations.
First, create your pool and container (as shown above), and then verify their creation through some simple commands on the CLI.
In json_example.py, we take currency information from two JSON files (conversions.json and data.json) and use it to display simple exchange rates from the US Dollar to a requested currency. The file conversions.json contains the exchange rates from October 29, 2021. These rates are used in conjunction with data.json, which contains descriptive information about each of these currencies.
We begin by connecting to the pool and container created through the CLI; the pool is labeled “pydaos_json” and the container “kvstore”. Then we open data.json and store its contents in the “kvstore” container through put() operations.
data.json:
{
  "USD": { "symbol": "$", "name": "US Dollar", "symbol_native": "$", "decimal_digits": 2, "rounding": 0, "code": "USD", "name_plural": "US dollars" },
  "CAD": { "symbol": "CA$", "name": "Canadian Dollar", "symbol_native": "$", "decimal_digits": 2, "rounding": 0, "code": "CAD", "name_plural": "Canadian dollars" },
  "EUR": { "symbol": "€", "name": "Euro", "symbol_native": "€", "decimal_digits": 2, "rounding": 0, "code": "EUR", "name_plural": "euros" },
  "AUD": { "symbol": "AU$", "name": "Australian Dollar", "symbol_native": "$", "decimal_digits": 2, "rounding": 0, "code": "AUD", "name_plural": "Australian dollars" },
  "CNY": { "symbol": "CN¥", "name": "Chinese Yuan", "symbol_native": "CN¥", "decimal_digits": 2, "rounding": 0, "code": "CNY", "name_plural": "Chinese yuan" },
  "SGD": { "symbol": "S$", "name": "Singapore Dollar", "symbol_native": "$", "decimal_digits": 2, "rounding": 0, "code": "SGD", "name_plural": "Singapore dollars" }
}
conversions.json:
{
  "provider": "https://www.exchangerate-api.com",
  "WARNING_UPGRADE_TO_V6": "https://www.exchangerate-api.com/docs/free",
  "terms": "https://www.exchangerate-api.com/terms",
  "base": "USD",
  "date": "2021-10-29",
  "time_last_updated": 1635510901,
  "rates": { "USD": 1, "CAD": 1.23, "AUD": 1.33, "CNY": 6.39, "EUR": 0.857, "SGD": 1.34 }
}
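With the two files above, the flow described earlier (store each currency record with `put()`, then combine it with a rate) might look like the sketch below. It is not the original json_example.py. To stay self-contained it embeds a two-currency excerpt of each file as a string instead of reading from disk, and `StandInDict` is a hypothetical in-memory substitute for the PyDAOS dictionary. Because only string values are supported, each record is serialized back to JSON text before it is stored. The helper name `convert_from_usd` is illustrative.

```python
import json


class StandInDict(dict):
    """Hypothetical in-memory substitute for a PyDAOS dictionary."""

    def put(self, key, value):
        self[key] = value


# Excerpts of data.json and conversions.json (two currencies only).
DATA_JSON = """
{
  "CAD": {"symbol": "CA$", "name": "Canadian Dollar", "decimal_digits": 2},
  "AUD": {"symbol": "AU$", "name": "Australian Dollar", "decimal_digits": 2}
}
"""
CONVERSIONS_JSON = '{"base": "USD", "rates": {"CAD": 1.23, "AUD": 1.33}}'

store = StandInDict()

# Values must be strings, so store each currency record as JSON text.
for code, info in json.loads(DATA_JSON).items():
    store.put(code, json.dumps(info))

rates = json.loads(CONVERSIONS_JSON)["rates"]


def convert_from_usd(amount, code):
    """Format `amount` US dollars in the currency identified by `code`."""
    info = json.loads(store[code])          # decode the stored record
    converted = amount * rates[code]
    digits = info["decimal_digits"]
    return "{}{:.{}f} ({})".format(info["symbol"], converted, digits, info["name"])


print(convert_from_usd(10, "CAD"))
print(convert_from_usd(10, "AUD"))
```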
json_example.py:
Running it:
Summary
In this introductory article, we showed how to easily interface with DAOS from Python through the PyDAOS package. The PyDAOS dictionary API was presented, describing each operation in detail with small code snippets to ease understanding. After that, two complete working examples were presented. As mentioned in the introduction, the PyDAOS package is still a work in progress and more features will be supported in the future. Stay tuned.