How to deserialize a ByteArray in C++ by reading it from a file

Keywords: C++, byte array, deserialization, reading from a file. Updated: 2023-10-16

I am working on a project in which I need to write a ByteArray to a file and then read the same file back with a C++ program.

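The ByteArray I write to the file is a combination of three byte arrays:

The first 2 bytes are my schemaId, which I represent with the short data type.
The next 8 bytes are my Last Modified Date, which I represent with the long data type.
The remaining bytes are of variable size and hold the actual value of my attribute.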
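
As a rough sketch, this layout can be written down as C++ constants. The names are made up for illustration; only the sizes come from the question. Note that the Java code further down also writes a 4-byte length in front of the attribute value, so the fixed part of each record is 14 bytes rather than 10.

```cpp
#include <cstddef>

// Byte layout of one record as produced by the Java writer in the question.
// All names are hypothetical; the sizes follow the Java types used there.
const std::size_t kSchemaIdOffset     = 0;  // Java short, 2 bytes
const std::size_t kSchemaIdSize       = 2;
const std::size_t kLastModifiedOffset = kSchemaIdOffset + kSchemaIdSize;
const std::size_t kLastModifiedSize   = 8;  // Java long, 8 bytes
const std::size_t kLengthOffset       = kLastModifiedOffset + kLastModifiedSize;
const std::size_t kLengthSize         = 4;  // Java int, 4 bytes
const std::size_t kValueOffset        = kLengthOffset + kLengthSize;

// Total size in bytes of a record whose attribute value has valueLength bytes.
std::size_t recordSize(std::size_t valueLength) {
    return kValueOffset + valueLength;
}
```

For the 14-character value "whatever os is" used in the question, recordSize(14) gives 28 bytes, which is a quick way to check the size of the file the Java program wrote.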

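After writing the resulting ByteArray to a file, I need to read that file from a C++ program, read the first line containing the ByteArray, and then split the resulting ByteArray as described above so that I can extract my schemaId, Last Modified Date and actual attribute value from it.

All of my coding has been in Java and I am new to C++. I am able to write a C++ program that reads the file, but I am not sure how to read the ByteArray in a way that lets me split it as described above.

Below is my Java code, which writes the resulting ByteArray to a file; I now need to read the same file back from C++.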

public static void main(String[] args) throws Exception {
    String os = "whatever os is";
    byte[] avroBinaryValue = os.getBytes();
    long lastModifiedDate = 1379811105109L;
    short schemaId = 32767;
    ByteArrayOutputStream byteOsTest = new ByteArrayOutputStream();
    DataOutputStream outTest = new DataOutputStream(byteOsTest);
    outTest.writeShort(schemaId);
    outTest.writeLong(lastModifiedDate);
    outTest.writeInt(avroBinaryValue.length);
    outTest.write(avroBinaryValue);
    byte[] allWrittenBytesTest = byteOsTest.toByteArray();
    DataInputStream inTest = new DataInputStream(new ByteArrayInputStream(allWrittenBytesTest));
    short schemaIdTest = inTest.readShort();
    long lastModifiedDateTest = inTest.readLong();
    int sizeAvroTest = inTest.readInt();
    byte[] avroBinaryValue1 = new byte[sizeAvroTest];
    inTest.read(avroBinaryValue1, 0, sizeAvroTest);

    System.out.println(schemaIdTest);
    System.out.println(lastModifiedDateTest);
    System.out.println(new String(avroBinaryValue1));
    writeFile(allWrittenBytesTest);
}
/**
 * Write the file in Java
 * @param byteArray the bytes to write
 */
public static void writeFile(byte[] byteArray) {
    // try-with-resources closes the stream even if the write fails
    try (FileOutputStream output = new FileOutputStream(new File("bytearrayfile"))) {
        IOUtils.write(byteArray, output);
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}

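Below is my C++ program, which reads the file above (written by Java). I am not sure what I should do to split the byte array so that I can read each individual byte array accordingly.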

#include "ReadFile.h"
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main () {
    string line;
    std::ifstream myfile("bytearrayfile", std::ios::binary);
    //check to see if the file is opened:
    if (myfile.is_open())
    {
        //while there are still lines in the
        //file, keep reading:
        while (! myfile.eof() )
        {
        // I am not sure what I am supposed to do here?
        }
        //close the stream:
        myfile.close();
    }
    else cout << "Unable to open file";
    return 0;
}

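After deserializing each ByteArray in the C++ program above, I should be able to extract schemaId as 32767, lastModifiedDate as 1379811105109 and my attribute value as whatever os is.

I am new to C++, so I am running into a lot of problems. Any example based on my code would help me understand better.

Can anyone help me? Thanks.

Update:

Below is my latest code, with which I am able to extract schemaId, lastModifiedDate and attributeLength.

But I am not sure how to extract the actual attribute value: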

int main() {
    string line;
    std::ifstream myfile("bytearrayfile", std::ios::binary);
    if (myfile.is_open()) {
        uint16_t schemaId;
        uint64_t lastModifiedDate;
        uint32_t attributeLength;
        char buffer[8]; // sized for the biggest read we want to do
        // read two bytes (will be in the wrong order)
        myfile.read(buffer, 2);
        // swap the bytes
        std::swap(buffer[0], buffer[1]);
        // only now convert bytes to an integer
        schemaId = *reinterpret_cast<uint16_t*>(buffer);
        cout<< schemaId <<endl;
        // read eight bytes (will be in the wrong order)
        myfile.read(buffer, 8);
        // swap the bytes
        std::swap(buffer[0], buffer[7]);
        std::swap(buffer[1], buffer[6]);
        std::swap(buffer[2], buffer[5]);
        std::swap(buffer[3], buffer[4]);
        // only now convert bytes to an integer
        lastModifiedDate = *reinterpret_cast<uint64_t*>(buffer);
        cout<< lastModifiedDate <<endl;
        // read 4 bytes (will be in the wrong order)
        myfile.read(buffer, 4);
        // swap the bytes
        std::swap(buffer[0], buffer[3]);
        std::swap(buffer[1], buffer[2]);
        // only now convert bytes to an integer
        attributeLength = *reinterpret_cast<uint32_t*>(buffer);
        cout<< attributeLength <<endl;
      // not sure how to extract the actual attribute value?
        //close the stream:
        myfile.close();
    }
    else
        cout << "Unable to open file";
    return 0;
}

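In Java, your program does the following:

Write the schema id
Write the last modified date
Write the Avro binary data length
Write the Avro binary data

So in C++ your program does the following:

Read the schema id
Read the last modified date
Read the Avro binary data length
Read the Avro binary data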
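
The four read steps above could be sketched roughly as follows. This is only one possible shape, not a definitive implementation: Record, readBigEndian, readRecord and parseRecord are names invented for this example, and it assumes the file was written by Java's DataOutputStream, which writes integers high byte first. The last step also shows how the variable-length attribute value can be read once its length is known.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical container for one decoded record.
struct Record {
    std::uint16_t schemaId;
    std::uint64_t lastModifiedDate;
    std::string   attributeValue;
};

// Reads `count` bytes (1 to 8) and combines them high byte first, the order
// Java's DataOutputStream writes them in. Because the value is built with
// shifts, the result does not depend on the byte order of the host machine.
bool readBigEndian(std::istream& in, int count, std::uint64_t& out) {
    unsigned char bytes[8];
    if (count < 1 || count > 8 ||
        !in.read(reinterpret_cast<char*>(bytes), count)) {
        return false;
    }
    out = 0;
    for (int i = 0; i < count; ++i) {
        out = (out << 8) | bytes[i];
    }
    return true;
}

// Performs the four read steps in the order the Java program wrote them.
bool readRecord(std::istream& in, Record& record) {
    std::uint64_t schemaId = 0, lastModified = 0, length = 0;
    if (!readBigEndian(in, 2, schemaId) ||
        !readBigEndian(in, 8, lastModified) ||
        !readBigEndian(in, 4, length)) {
        return false;
    }
    record.schemaId = static_cast<std::uint16_t>(schemaId);
    record.lastModifiedDate = lastModified;
    // The length prefix says how many bytes of attribute value follow.
    record.attributeValue.resize(static_cast<std::string::size_type>(length));
    if (length == 0) {
        return true;
    }
    in.read(&record.attributeValue[0], static_cast<std::streamsize>(length));
    return static_cast<bool>(in);  // false if the data ended early
}

// Convenience wrapper for trying the reader on bytes already in memory.
bool parseRecord(const std::string& bytes, Record& record) {
    std::istringstream in(bytes);
    return readRecord(in, record);
}
```

With the file from the question, this would be used by opening std::ifstream myfile("bytearrayfile", std::ios::binary) and calling readRecord(myfile, record). A corrupted length prefix could request a very large allocation, so real code may want to cap the length before calling resize.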

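For this program there is very little difference between C++ and Java, so if you can do it in Java you should (with a little research) be able to do it in C++.

Here is a start (item 1):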

short schemaId;
myfile.read(reinterpret_cast<char*>(&schemaId), sizeof(short));

The reinterpret_cast<char*> is required because the first parameter of the read function expects a char*. So if the first argument is not a pointer to char, you have to cast it.

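This does assume that sizeof(short) == 2 (always true in Java, usually true in C++) and that there are no endianness issues. It is hard to know this up front; you just have to try it and see.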
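
Both assumptions can be checked in code instead of by trial and error. The sketch below is only an illustration: the alias names and hostIsLittleEndian are invented here. It uses the fixed-width types from <cstdint> so that the sizes are checked at compile time, and a small runtime probe for the byte order of the machine the C++ program runs on. Java's DataOutputStream always writes the high byte first, so if the probe returns true the bytes read from the file have to be reversed.

```cpp
#include <climits>
#include <cstdint>
#include <cstring>

// Fixed-width aliases matching the Java types used by the writer.
// The alias names are hypothetical.
typedef std::int16_t JavaShort;  // Java short is always 16 bits
typedef std::int32_t JavaInt;    // Java int is always 32 bits
typedef std::int64_t JavaLong;   // Java long is always 64 bits

// Fail at compile time rather than silently misreading the file.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(sizeof(JavaShort) == 2 && sizeof(JavaInt) == 4 &&
              sizeof(JavaLong) == 8, "unexpected integer sizes");

// Returns true if the host stores the least significant byte first.
bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &probe, 1);  // look at the lowest-addressed byte
    return firstByte == 1;
}
```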

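When reading or writing binary integers, the Java and C++ sides may use different byte orders. This is called endianness. Java's DataOutputStream always writes the high byte first (big-endian), whereas x86 processors store integers low byte first (little-endian), so on such machines you have to swap the byte order when reading integers. Here is some code that does that (it is fairly tedious stuff, and there may be a cleaner way).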

uint16_t schemaId;
uint64_t lastModifiedDate;
uint32_t attributeLength;
char buffer[8]; // sized for the biggest read we want to do
// read two bytes (will be in the wrong order on a little-endian host)
myfile.read(buffer, 2);
// swap the bytes
std::swap(buffer[0], buffer[1]);
// only now convert bytes to an integer; std::memcpy (from <cstring>) avoids
// the alignment and aliasing problems of casting the char buffer
std::memcpy(&schemaId, buffer, sizeof(schemaId));
// read eight bytes (will be in the wrong order on a little-endian host)
myfile.read(buffer, 8);
// swap the bytes
std::swap(buffer[0], buffer[7]);
std::swap(buffer[1], buffer[6]);
std::swap(buffer[2], buffer[5]);
std::swap(buffer[3], buffer[4]);
// only now convert bytes to an integer
std::memcpy(&lastModifiedDate, buffer, sizeof(lastModifiedDate));

And so on...

You need #include <algorithm> to get the std::swap function (since C++11 it is declared in <utility>).
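
One possibly cleaner alternative to the hand-written swaps is a small helper that assembles the integer from the bytes with shifts. This is a sketch, not the only way to do it, and decodeBigEndian is a name invented for this example. Since no bytes are reordered in memory, the same code gives the same result on little-endian and big-endian hosts.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Decodes sizeof(T) bytes, stored high byte first, into an unsigned integer.
// Hypothetical helper; the caller must supply at least sizeof(T) bytes.
template <typename T>
T decodeBigEndian(const unsigned char* bytes) {
    static_assert(std::is_unsigned<T>::value, "use an unsigned integer type");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Shift what we have so far up one byte and append the next byte.
        value = static_cast<T>((value << 8) | bytes[i]);
    }
    return value;
}
```

With an unsigned char buffer[8], a read such as myfile.read(reinterpret_cast<char*>(buffer), 2) could then be followed by schemaId = decodeBigEndian<std::uint16_t>(buffer), and likewise with std::uint64_t and std::uint32_t for the other two fields.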