How to find out the PredType of a dataset in HDF5 using the C++ library

Updated: 2023-10-16

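I just discovered that if I write an unsigned char array into a float dataset in an HDF5 file, the library does not complain. So before writing, I would like to check that the two are actually compatible. For the unsigned char array I have the corresponding PredType. But the dataset does not offer an obvious way to get its PredType, if I am not mistaken.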

Question: given an H5::DataSet, how do I get the PredType that was used to initialize it?

The example code at https://www.hdfgroup.org/HDF5/doc/cpplus_RM/readdata_8cpp-example.html demonstrates how to do this.

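In summary: you can find the "class" of the stored data using the DataSet::getTypeClass() function. However, this "class" does not fully define the data type, because it does not let you infer the size of the native type (e.g. 8 bit, 32 bit, etc.) or the sign representation (e.g. unsigned, two's complement).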

In the case of float, you also need to use DataSet::getFloatType() and FloatType::getSize() to infer whether the data type is PredType::NATIVE_FLOAT or PredType::NATIVE_DOUBLE, like so:

auto dataClass = dataSet.getTypeClass();
if(dataClass == H5T_FLOAT)
{
    auto floatType = dataSet.getFloatType();
    size_t byteSize = floatType.getSize();
    if(byteSize == 4) 
    {
         // use PredType::NATIVE_FLOAT to write
    }
    else if(byteSize == 8)
    { 
         // use PredType::NATIVE_DOUBLE to write
    }
}

For the sign representation of integers, you need to use IntType::getSign().
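The integer case can be handled the same way as the float case above: combine the size with the sign to pick a native type. The following is only a minimal sketch, not part of the HDF5 API. The helper name nativeIntPredTypeName is hypothetical, and it deliberately takes plain values so that it compiles without the HDF5 headers. It returns the name of a matching PredType rather than the PredType itself, and it assumes the dataset's integer type corresponds to one of the platform's C integer types.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the HDF5 API): maps an integer size in
// bytes and a signedness flag to the name of a matching native PredType.
// With HDF5, byteSize would come from IntType::getSize() and isSigned from
// IntType::getSign() (H5T_SGN_2 means signed, H5T_SGN_NONE means unsigned).
std::string nativeIntPredTypeName(std::size_t byteSize, bool isSigned)
{
    // Sizes are taken from the compiler because the NATIVE_* types follow
    // the platform's C types. The first matching size wins.
    if (byteSize == sizeof(char))
        return isSigned ? "PredType::NATIVE_SCHAR" : "PredType::NATIVE_UCHAR";
    if (byteSize == sizeof(short))
        return isSigned ? "PredType::NATIVE_SHORT" : "PredType::NATIVE_USHORT";
    if (byteSize == sizeof(int))
        return isSigned ? "PredType::NATIVE_INT" : "PredType::NATIVE_UINT";
    if (byteSize == sizeof(long long))
        return isSigned ? "PredType::NATIVE_LLONG" : "PredType::NATIVE_ULLONG";
    return "";  // no native integer type of that size in this sketch
}
```

In real code the inputs would be obtained after checking that dataSet.getTypeClass() == H5T_INTEGER, by calling dataSet.getIntType() and then getSize() and getSign() on the result. Since long and long long (or int and long) may have the same size on a given platform, the sketch simply skips long and reports the first type whose size matches.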

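Another way to solve the problem (i.e. to find out the data type of an HDF5 dataset) is to use HDFql from C++ as follows (this example assumes that the file example.h5 and the dataset my_dataset already exist):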

// include HDFql C++ header file (make sure it can be found by the C++ compiler)
#include <iostream>
#include "HDFql.hpp"
int main(int argc, char *argv[])
{
    int data_type;
    // get data type of dataset "my_dataset" from HDF5 file "example.h5" and populate HDFql default cursor with it
    HDFql::execute("SHOW DATA TYPE example.h5 my_dataset");
    // move HDFql default cursor to first position
    HDFql::cursorFirst();
    // retrieve data type from HDFql default cursor
    data_type = *HDFql::cursorGetInt();
    // print message according to data type
    if (data_type == HDFql::TinyInt || data_type == HDFql::VarTinyInt)
        std::cout << "Data type is a char";
    else if (data_type == HDFql::UnsignedTinyInt || data_type == HDFql::UnsignedVarTinyInt)
        std::cout << "Data type is an unsigned char";
    else if (data_type == HDFql::SmallInt || data_type == HDFql::VarSmallInt)
        std::cout << "Data type is a short";
    else if (data_type == HDFql::UnsignedSmallInt || data_type == HDFql::UnsignedVarSmallInt)
        std::cout << "Data type is an unsigned short";
    else if (data_type == HDFql::Int || data_type == HDFql::VarInt)
        std::cout << "Data type is an int";
    else if (data_type == HDFql::UnsignedInt || data_type == HDFql::UnsignedVarInt)
        std::cout << "Data type is an unsigned int";
    else if (data_type == HDFql::BigInt || data_type == HDFql::VarBigInt)
        std::cout << "Data type is a long long";
    else if (data_type == HDFql::UnsignedBigInt || data_type == HDFql::UnsignedVarBigInt)
        std::cout << "Data type is an unsigned long long";
    else if (data_type == HDFql::Float || data_type == HDFql::VarFloat)
        std::cout << "Data type is a float";
    else if (data_type == HDFql::Double || data_type == HDFql::VarDouble)
        std::cout << "Data type is a double";
    else if (data_type == HDFql::Char || data_type == HDFql::VarChar)
        std::cout << "Data type is a char";
    else if (data_type == HDFql::Opaque)
        std::cout << "Data type is an opaque";
    else if (data_type == HDFql::Enumeration)
        std::cout << "Data type is an enumeration";
    else if (data_type == HDFql::Compound)
        std::cout << "Data type is a compound";
    else
        std::cout << "Unknown data type";
    return 0;
}

Finally, if you need the endianness or the size of the dataset my_dataset, execute HDFql::execute("SHOW ENDIANNESS example.h5 my_dataset"); or HDFql::execute("SHOW SIZE example.h5 my_dataset"); respectively.