C++ variable length arrays in struct

Keywords: arrays, C++, struct. Last updated: 2023-10-16

I am writing a program for creating, sending, receiving and interpreting ARP packets. I have a struct representing the ARP header, like this:

struct ArpHeader
{
    unsigned short hardwareType;
    unsigned short protocolType;
    unsigned char hardwareAddressLength;
    unsigned char protocolAddressLength;
    unsigned short operationCode;
    unsigned char senderHardwareAddress[6];
    unsigned char senderProtocolAddress[4];
    unsigned char targetHardwareAddress[6];
    unsigned char targetProtocolAddress[4];
};

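This only works for hardware addresses of length 6 and protocol addresses of length 4. The address lengths are also given in the header, so to be correct the struct would have to look something like this: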

struct ArpHeader
{
    unsigned short hardwareType;
    unsigned short protocolType;
    unsigned char hardwareAddressLength;
    unsigned char protocolAddressLength;
    unsigned short operationCode;
    unsigned char senderHardwareAddress[hardwareAddressLength];
    unsigned char senderProtocolAddress[protocolAddressLength];
    unsigned char targetHardwareAddress[hardwareAddressLength];
    unsigned char targetProtocolAddress[protocolAddressLength];
};

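This obviously will not work, since the address lengths are not known at compile time. Template structs are not an option either, because I would like to fill in the values of the struct and then just cast it from (ArpHeader*) to (char*) to get a byte array that can be sent on the network, or cast a received byte array from (char*) to (ArpHeader*) to interpret it.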

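One solution would be to create a class with all header fields as member variables, a function that creates a byte array representing the ARP header that can be sent on the network, and a constructor that takes only a byte array (received from the network) and interprets it by reading all header fields and writing them to the member variables. This is not a nice solution, though, since it requires a lot more code.

By contrast, a similar struct for the UDP header, for example, is simple, because all header fields have a known, constant size. I use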

#pragma pack(push, 1)
#pragma pack(pop)

around the struct declaration, so that I can actually do a simple C-style cast to get a byte array to send on the network.
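For reference, the fixed-size case described above might look like the following minimal sketch. The field names and the helpers `asBytes` and `fromBytes` are illustrative assumptions, not code from the question; `#pragma pack` is a compiler extension (supported by GCC, Clang and MSVC), and the fields are assumed to already hold values in network byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Every field has a compile-time size, so the struct can mirror the wire
// layout exactly once padding is disabled.
#pragma pack(push, 1)
struct UdpHeader {
    std::uint16_t sourcePort;
    std::uint16_t destinationPort;
    std::uint16_t length;
    std::uint16_t checksum;
};
#pragma pack(pop)

static_assert(sizeof(UdpHeader) == 8, "UdpHeader must match the 8-byte wire format");

// View the header as raw bytes for sending (hypothetical helper).
inline const char* asBytes(const UdpHeader& h) {
    return reinterpret_cast<const char*>(&h);
}

// Copy received bytes into a header (hypothetical helper). memcpy avoids the
// alignment and aliasing concerns of casting a char buffer to UdpHeader*.
inline bool fromBytes(const char* data, std::size_t len, UdpHeader& out) {
    if (data == nullptr || len < sizeof(UdpHeader)) return false;
    std::memcpy(&out, data, sizeof(UdpHeader));
    return true;
}
```

This only works because `sizeof(UdpHeader)` is fixed; the ARP header has no such single size.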

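Is there any solution I could use here that would be close to a struct, or at least would not require much more code than a struct? I know that the last field in a struct (if it is an array) does not need a specific compile-time size. Can I use something similar for my problem? Simply leaving the sizes of those 4 arrays empty does compile, but I have no idea how that would actually behave. Logically it cannot work, since the compiler would not know where the second array begins if the size of the first array is unknown.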

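You want something fairly low level, an ARP packet, and you are trying to find a way to define a data structure properly so that you can cast the blob into that structure. Instead, you can use an interface over the blob.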

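#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <stdexcept>
#include <vector>

struct ArpHeader {
    // Raw packet bytes; mutable so the const accessors can write through.
    mutable std::vector<uint8_t> buf_;
    // Proxy for a field of type T stored at an arbitrary (possibly unaligned)
    // address. Values are read and written exactly as stored on the wire,
    // so multi-byte fields stay in network byte order.
    template <typename T>
    struct ref {
        uint8_t * const p_;
        ref (uint8_t *p) : p_(p) {}
        operator T () const { T t; memcpy(&t, p_, sizeof(t)); return t; }
        T operator = (T t) const { memcpy(p_, &t, sizeof(t)); return t; }
    };
    template <typename T>
    ref<T> get (size_t offset) const {
        if (offset + sizeof(T) > buf_.size())
            throw std::out_of_range("ARP field outside buffer");
        return ref<T>(buf_.data() + offset);
    }
    // Fixed-size fields live at constant offsets.
    ref<uint16_t> hwType() const { return get<uint16_t>(0); }
    ref<uint16_t> protType () const { return get<uint16_t>(2); }
    ref<uint8_t> hwAddrLen () const { return get<uint8_t>(4); }
    ref<uint8_t> protAddrLen () const { return get<uint8_t>(5); }
    ref<uint16_t> opCode () const { return get<uint16_t>(6); }
    // Variable-size fields: each offset is computed from the length fields.
    uint8_t *senderHwAddr () const { return buf_.data() + 8; }
    uint8_t *senderProtAddr () const { return senderHwAddr() + hwAddrLen(); }
    uint8_t *targetHwAddr () const { return senderProtAddr() + protAddrLen(); }
    uint8_t *targetProtAddr () const { return targetHwAddr() + hwAddrLen(); }
};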

If you need const correctness, remove mutable, create a const_ref, duplicate the accessors as non-const versions, and make the const versions return const_ref and const uint8_t *.
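A reduced sketch of that const-correct variant is shown below. The name `ArpView`, the `check` helper and the exception message are assumptions made for illustration, and only a few accessors are included; the remaining ones would follow the same pattern.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

struct ArpView {  // hypothetical name
    std::vector<std::uint8_t> buf_;  // no longer mutable

    template <typename T>
    struct const_ref {  // read-only proxy
        const std::uint8_t* p_;
        operator T() const { T t; std::memcpy(&t, p_, sizeof(t)); return t; }
    };
    template <typename T>
    struct ref {  // read/write proxy
        std::uint8_t* p_;
        operator T() const { T t; std::memcpy(&t, p_, sizeof(t)); return t; }
        T operator=(T t) const { std::memcpy(p_, &t, sizeof(t)); return t; }
    };

    void check(std::size_t offset, std::size_t size) const {
        if (offset + size > buf_.size())
            throw std::out_of_range("ARP field outside buffer");
    }
    template <typename T> const_ref<T> get(std::size_t offset) const {
        check(offset, sizeof(T));
        return const_ref<T>{buf_.data() + offset};
    }
    template <typename T> ref<T> get(std::size_t offset) {
        check(offset, sizeof(T));
        return ref<T>{buf_.data() + offset};
    }

    // Const accessors hand out read-only views, non-const ones writable views.
    const_ref<std::uint8_t> hwAddrLen() const { return get<std::uint8_t>(4); }
    ref<std::uint8_t> hwAddrLen() { return get<std::uint8_t>(4); }
    const_ref<std::uint8_t> protAddrLen() const { return get<std::uint8_t>(5); }
    ref<std::uint8_t> protAddrLen() { return get<std::uint8_t>(5); }

    const std::uint8_t* senderHwAddr() const { return buf_.data() + 8; }
    std::uint8_t* senderHwAddr() { return buf_.data() + 8; }
    const std::uint8_t* senderProtAddr() const {
        const std::uint8_t n = hwAddrLen();
        return senderHwAddr() + n;
    }
    std::uint8_t* senderProtAddr() {
        const std::uint8_t n = hwAddrLen();
        return senderHwAddr() + n;
    }
};
```

The cost is one extra overload per accessor; in exchange, a `const ArpView&` can no longer be used to modify the packet.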

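Short answer: you cannot have variably sized types in C++.

Every type in C++ must have a known (and stable) size at compile time, i.e. the sizeof operator must give a consistent answer. Note that you can have types that hold a variable amount of data by using the heap (e.g. std::vector<int>), but the size of the actual object is always constant.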
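A small sketch of that point: the function and struct names below are made up for illustration. The vector object itself has the same size no matter how many elements it manages, because the elements live in separately allocated storage.

```cpp
#include <cstddef>
#include <vector>

// A struct with fixed-size arrays has a size fixed at compile time.
struct FixedAddress {
    unsigned char bytes[6];
};
static_assert(sizeof(FixedAddress) == 6, "array members contribute a constant size");

// sizeof depends only on the type, never on how much data the object manages.
inline bool vectorObjectSizeIsConstant() {
    std::vector<int> small(1);
    std::vector<int> large(100000);
    return sizeof(small) == sizeof(large) && small.size() != large.size();
}
```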

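So you can never produce a type declaration that you could cast to and have the fields magically adjust. This goes deep into fundamental object layout: every member (a.k.a. field) must have a known (and stable) offset.

Usually the issue is solved by writing (or generating) member functions that parse the input data and initialize the object's data. This is basically the age-old data serialization problem, which has been solved countless times over the past 30 years or so.

Here is a mockup of a basic solution: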

class packet { 
public:
    // simple things
    uint16_t hardware_type() const;
    // variable-sized things
    size_t sender_address_len() const;
    bool copy_sender_address_out(char *dest, size_t dest_size) const;
    // initialization
    bool parse_in(const char *src, size_t len);
private:    
    uint16_t hardware_type_;    
    std::vector<char> sender_address_;
};

Notes:

The code above shows a very basic structure that lets you do the following:
```
packet p;
if (!p.parse_in(input, sz))
    return false;
```
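The mockup only declares its member functions. One way they might be implemented is sketched below, assuming the standard ARP layout in which the hardware type occupies bytes 0 and 1 in network byte order, the hardware address length is byte 4, and the sender hardware address starts at byte 8. The constant `kFixedPart` is an illustrative name, and the other ARP fields are omitted just as in the mockup.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

class packet {
public:
    // simple things
    std::uint16_t hardware_type() const { return hardware_type_; }
    // variable-sized things
    std::size_t sender_address_len() const { return sender_address_.size(); }
    bool copy_sender_address_out(char* dest, std::size_t dest_size) const {
        if (dest == nullptr || dest_size < sender_address_.size()) return false;
        if (!sender_address_.empty())
            std::memcpy(dest, sender_address_.data(), sender_address_.size());
        return true;
    }
    // initialization: returns false and leaves the object unchanged on bad input
    bool parse_in(const char* src, std::size_t len) {
        const std::size_t kFixedPart = 8;  // bytes before the first address
        if (src == nullptr || len < kFixedPart) return false;
        const unsigned char* p = reinterpret_cast<const unsigned char*>(src);
        const std::size_t hw_len = p[4];
        if (len < kFixedPart + hw_len) return false;
        // Assemble the big-endian 16-bit field independently of host byte order.
        hardware_type_ = static_cast<std::uint16_t>((p[0] << 8) | p[1]);
        sender_address_.assign(src + kFixedPart, src + kFixedPart + hw_len);
        return true;
    }
private:
    std::uint16_t hardware_type_ = 0;
    std::vector<char> sender_address_;
};
```

Every length read from the input is checked against the buffer size before it is used, which is the main thing a plain cast to a struct cannot give you.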

A modern way of doing the same thing, via RAII, would look like this:

if (!packet::validate(input, sz))
    return false;
packet p = packet::parse_in(input, sz);  // static function 
                                         // returns an instance or throws
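A possible shape for that static-function variant is sketched here, under the same layout assumptions (hardware type in bytes 0 and 1, hardware address length in byte 4, sender hardware address from byte 8). The namespace `raii_sketch` and the exception type are illustrative choices, not something the answer prescribes.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace raii_sketch {

class packet {
public:
    // True if the buffer holds the fixed part plus the sender hardware address.
    static bool validate(const char* src, std::size_t len) {
        if (src == nullptr || len < 8) return false;
        const std::size_t hw_len = static_cast<unsigned char>(src[4]);
        return len >= 8 + hw_len;
    }
    // Returns a fully initialized instance or throws; a half-parsed packet
    // can never be observed by the caller.
    static packet parse_in(const char* src, std::size_t len) {
        if (!validate(src, len)) throw std::invalid_argument("malformed ARP packet");
        const unsigned char* p = reinterpret_cast<const unsigned char*>(src);
        packet result;
        result.hardware_type_ = static_cast<std::uint16_t>((p[0] << 8) | p[1]);
        result.sender_address_.assign(src + 8, src + 8 + p[4]);
        return result;
    }
    std::uint16_t hardware_type() const { return hardware_type_; }
    std::size_t sender_address_len() const { return sender_address_.size(); }
private:
    packet() = default;  // instances come only from parse_in
    std::uint16_t hardware_type_ = 0;
    std::vector<char> sender_address_;
};

}  // namespace raii_sketch
```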

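If you want to keep simple access to the data, and keep the data itself public, there is a way to achieve what you want without changing how you access the data. First, you can use std::string instead of char arrays to store the addresses: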

#include <string>
using namespace std; // using this to shorten notation. Preferably put 'std::'
                     // everywhere you need it instead.
struct ArpHeader
{
    unsigned char hardwareAddressLength;
    unsigned char protocolAddressLength;
    string senderHardwareAddress;
    string senderProtocolAddress;
    string targetHardwareAddress;
    string targetProtocolAddress;
};

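Then you can overload the conversion operator operator const char*() and the constructor ArpHeader(const char*) (and, of course, preferably operator=(const char*) as well), to keep your current send/receive functions working, if that is what you need.

A simplified conversion operator (it skips some fields to keep things less complicated, but you should have no trouble adding them back) would look like this: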

operator const char*(){
    char* myRepresentation;
    unsigned char mySize
            = 2+ senderHardwareAddress.length()
            + senderProtocolAddress.length()
            + targetHardwareAddress.length()
            + targetProtocolAddress.length();
    // We need to store the size, since it varies
    myRepresentation = new char[mySize+1];
    myRepresentation[0] = mySize;
    myRepresentation[1] = hardwareAddressLength;
    myRepresentation[2] = protocolAddressLength;
    unsigned int offset = 3; // just to shorten notation
    memcpy(myRepresentation+offset, senderHardwareAddress.c_str(), senderHardwareAddress.size());
    offset += senderHardwareAddress.size();
    memcpy(myRepresentation+offset, senderProtocolAddress.c_str(), senderProtocolAddress.size());
    offset += senderProtocolAddress.size();
    memcpy(myRepresentation+offset, targetHardwareAddress.c_str(), targetHardwareAddress.size());
    offset += targetHardwareAddress.size();
    memcpy(myRepresentation+offset, targetProtocolAddress.c_str(), targetProtocolAddress.size());
    return myRepresentation;
}

The assignment operator and the constructor can then be defined like this:

ArpHeader& operator=(const char* buffer){
    hardwareAddressLength = buffer[1];
    protocolAddressLength = buffer[2];
    unsigned int offset = 3; // just to shorten notation
    senderHardwareAddress = string(buffer+offset, hardwareAddressLength);
    offset += hardwareAddressLength;
    senderProtocolAddress = string(buffer+offset, protocolAddressLength);
    offset += protocolAddressLength;
    targetHardwareAddress = string(buffer+offset, hardwareAddressLength);
    offset += hardwareAddressLength;
    targetProtocolAddress = string(buffer+offset, protocolAddressLength);
    return *this;
}
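// Declaring the constructor below suppresses the implicit default
// constructor, so provide one explicitly for "ArpHeader h1, h2;".
ArpHeader() : hardwareAddressLength(0), protocolAddressLength(0) {}
ArpHeader(const char* buffer){
    *this = buffer; // Re-using the operator=
}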

Using your class is then simple:

ArpHeader h1, h2;
h1.hardwareAddressLength = 3;
h1.protocolAddressLength = 10;
h1.senderHardwareAddress = "foo";
h1.senderProtocolAddress = "something1";
h1.targetHardwareAddress = "bar";
h1.targetProtocolAddress = "something2";
cout << h1.senderHardwareAddress << ", " << h1.senderProtocolAddress
<< " => " << h1.targetHardwareAddress << ", " << h1.targetProtocolAddress << endl;
const char* gottaSendThisSomewhere = h1;
h2 = gottaSendThisSomewhere;
cout << h2.senderHardwareAddress << ", " << h2.senderProtocolAddress
<< " => " << h2.targetHardwareAddress << ", " << h2.targetProtocolAddress << endl;
delete[] gottaSendThisSomewhere;

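It should give you the utility you need and keep your code working without changing anything outside the class.

Note, however, that if you are willing to change the rest of the code a bit (meaning the code you have already written, outside the class), jxh's answer should be just as fast as this one and more elegant internally.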
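One caveat with the conversion operator shown earlier is that it returns memory allocated with new[], which the caller must remember to delete[]. A possible variation, sketched here under assumptions, returns a std::vector instead so that ownership is automatic. The names `ArpAddresses`, `serialize` and `deserialize` are hypothetical, and the layout is the same simplified one minus the leading size byte, since the vector already knows its own length.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>

struct ArpAddresses {  // hypothetical name
    std::string senderHardwareAddress;
    std::string senderProtocolAddress;
    std::string targetHardwareAddress;
    std::string targetProtocolAddress;

    // Layout: [hwLen][protLen][senderHw][senderProt][targetHw][targetProt]
    std::vector<std::uint8_t> serialize() const {
        if (senderHardwareAddress.size() > 255 || senderProtocolAddress.size() > 255 ||
            targetHardwareAddress.size() != senderHardwareAddress.size() ||
            targetProtocolAddress.size() != senderProtocolAddress.size())
            throw std::length_error("address lengths inconsistent or too large");
        std::vector<std::uint8_t> out;
        out.push_back(static_cast<std::uint8_t>(senderHardwareAddress.size()));
        out.push_back(static_cast<std::uint8_t>(senderProtocolAddress.size()));
        for (const std::string* s : {&senderHardwareAddress, &senderProtocolAddress,
                                     &targetHardwareAddress, &targetProtocolAddress})
            out.insert(out.end(), s->begin(), s->end());
        return out;
    }

    // Returns false, leaving result untouched, if the buffer length does not
    // match the lengths announced in its first two bytes.
    static bool deserialize(const std::vector<std::uint8_t>& in, ArpAddresses& result) {
        if (in.size() < 2) return false;
        const std::size_t hw = in[0];
        const std::size_t prot = in[1];
        if (in.size() != 2 + 2 * (hw + prot)) return false;
        const char* p = reinterpret_cast<const char*>(in.data()) + 2;
        result.senderHardwareAddress.assign(p, hw);   p += hw;
        result.senderProtocolAddress.assign(p, prot); p += prot;
        result.targetHardwareAddress.assign(p, hw);   p += hw;
        result.targetProtocolAddress.assign(p, prot);
        return true;
    }
};
```

The bytes can be handed to a send function via `data()` and `size()` on the returned vector, and nothing needs to be freed by hand afterwards.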