Most Significant Byte Computation

Keywords: byte, computation, most significant. Last updated: 2023-10-16

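I am trying to build a four-byte value (most significant byte first) that holds the total length of the data. I found a code snippet that computes this, but the output does not contain 4 bytes. Instead, I only get a 2-byte value.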

char bytesLen[4] ;
unsigned int blockSize = 535;
bytesLen[0] = (blockSize & 0xFF);
bytesLen[1] = (blockSize >> 8) & 0xFF;
bytesLen[2] = (blockSize >> 16) & 0xFF;
bytesLen[3] = (blockSize >> 24) & 0xFF;
std::cout << "bytesLen: " << bytesLen << '\n';

Am I missing something in my code?

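No, you are not. You are outputting the array as a C string, which is NUL-terminated. For 535 (0x00000217) the third byte is NUL, so only two characters are displayed.

This is not a sensible way to output binary values.

Also, you are storing the least significant byte first, not the most significant. For most significant byte first, you have to reverse the order of the bytes.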
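To make the truncation concrete, here is a minimal sketch that stores the value the same way the question does and then counts how many bytes a C-string print would emit. The helper names `storeLsbFirst` and `visibleChars` are made up for this illustration and do not come from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill out[0..3] the way the question does: least significant byte first.
// (storeLsbFirst is a hypothetical helper name.)
void storeLsbFirst(std::uint32_t value, char out[4])
{
    out[0] = static_cast<char>(value & 0xFF);
    out[1] = static_cast<char>((value >> 8) & 0xFF);
    out[2] = static_cast<char>((value >> 16) & 0xFF);
    out[3] = static_cast<char>((value >> 24) & 0xFF);
}

// Number of bytes a C-string print would emit: everything before the first
// NUL byte. The scan is capped at `size` so it never leaves the buffer.
// (visibleChars is a hypothetical helper name.)
std::size_t visibleChars(const char *buf, std::size_t size)
{
    const void *nul = std::memchr(buf, 0, size);
    if (nul == nullptr) {
        return size;  // no NUL inside the buffer at all
    }
    return static_cast<std::size_t>(static_cast<const char *>(nul) - buf);
}
```

For `blockSize = 535` the array holds 0x17, 0x02, 0x00, 0x00, so `visibleChars` returns 2. Those are the two characters that reach the terminal, and neither is a readable digit.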

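This shows how to do the same thing without shift operators and bit masks. The bytes are printed in the host's memory order, so a little-endian machine shows the least significant byte first. Reading a union member other than the one last written is fine in C but formally undefined behavior in C++, although compilers such as GCC support it as an extension.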

#include <iostream>
#include <iomanip>
// C++11
#include <cstdint>
int main(void)
{
    // with union, the memory allocation is shared
    union {
        uint8_t bytes[4];
        uint32_t n;
    } length;
    // see htonl if needs to be in network byte order
    //  or ntohl if from network byte order to host
    length.n = 535;
    std::cout << std::hex;
    for(int i=0; i<4; i++) {
        std::cout << (unsigned int)length.bytes[i] << " ";
    }
    std::cout << std::endl;
    return 0;
}

If you want the MS byte first, then you have the order of the bytes reversed.
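As a sketch of most-significant-byte-first storage that does not depend on the host's byte order, the shifts can be driven by a loop. The names `toBigEndian` and `toHex` are invented for this example, and the space-separated two-digit hex format is just one possible way to display the result.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Split a 32-bit value into 4 bytes, most significant byte first.
// Shifting works on the value, not on memory, so the result is the same
// on little-endian and big-endian hosts. (Hypothetical helper name.)
std::array<std::uint8_t, 4> toBigEndian(std::uint32_t value)
{
    std::array<std::uint8_t, 4> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        // i = 0 takes bits 31..24, i = 3 takes bits 7..0
        out[i] = static_cast<std::uint8_t>(value >> (8 * (out.size() - 1 - i)));
    }
    return out;
}

// Render the bytes as two hex digits each, separated by spaces.
// (Hypothetical helper name.)
std::string toHex(const std::array<std::uint8_t, 4> &bytes)
{
    std::ostringstream os;
    os << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) {
            os << ' ';
        }
        // cast to an integer type so the byte is not printed as a character
        os << std::setw(2) << static_cast<unsigned int>(bytes[i]);
    }
    return os.str();
}
```

With these helpers, `toHex(toBigEndian(535))` yields `00 00 02 17`.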

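You get incorrect output because you treat everything as a C string even though it is not one. Get rid of the char type and fix the printing.

In C++, it would look something like this: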

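#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <iostream>
int main()
{
  uint8_t bytesLen[sizeof(uint32_t)];
  uint32_t blockSize = 535;
  // most significant byte first
  bytesLen[3] = (blockSize >>  0) & 0xFF;
  bytesLen[2] = (blockSize >>  8) & 0xFF;
  bytesLen[1] = (blockSize >> 16) & 0xFF;
  bytesLen[0] = (blockSize >> 24) & 0xFF;
  bool removeZeroes = true;  // true while leading zero bytes are being skipped
  std::cout << "bytesLen: 0x" << std::hex << std::setfill('0');
  for(size_t i=0; i<sizeof(bytesLen); i++)
  {
    // stop skipping at the first non-zero byte, or at the last byte so
    // that a value of 0 still prints something
    if(bytesLen[i] != 0 || i == sizeof(bytesLen)-1)
    {
      removeZeroes = false;
    }
    if(!removeZeroes)
    {
      // two digits per byte, otherwise 0x0105 would print as "15"
      std::cout << std::setw(2) << (int)bytesLen[i];
    }
  }
  std::cout << std::endl;
  return 0;
}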

Here is the fixed code, arranged as a complete program. The byte order is selected with #if, and the big-endian branch is the active one:

#include <iostream>

// Write the 4 bytes at src into dst as 8 uppercase hex digits plus a
// terminating NUL, so dst must have room for 9 chars.
char *
pretty_print(char *dst,const unsigned char *src)
{
    const char *hex = "0123456789ABCDEF";
    char *bp = dst;
    int chr;
    for (int idx = 0;  idx <= 3;  ++idx) {
        chr = src[idx];
        *bp++ = hex[(chr >> 4) & 0x0F];
        *bp++ = hex[(chr >> 0) & 0x0F];
    }
    *bp = 0;
    return dst;
}

int
main()
{
    unsigned char bytesLen[4];
    unsigned int blockSize = 535;
#if 0
    // little endian
    bytesLen[0] = (blockSize & 0xFF);
    bytesLen[1] = (blockSize >> 8) & 0xFF;
    bytesLen[2] = (blockSize >> 16) & 0xFF;
    bytesLen[3] = (blockSize >> 24) & 0xFF;
#else
    // big endian
    bytesLen[3] = (blockSize & 0xFF);
    bytesLen[2] = (blockSize >> 8) & 0xFF;
    bytesLen[1] = (blockSize >> 16) & 0xFF;
    bytesLen[0] = (blockSize >> 24) & 0xFF;
#endif
    char tmp[9];
    std::cout << "bytesLen: " << pretty_print(tmp,bytesLen) << '\n';
    return 0;
}

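Update:

Regarding your follow-up question: to concatenate binary data, we cannot use string-oriented functions such as sprintf, because the binary data may contain 0x00 bytes, and those would cut the copy short. Conversely, if the binary data contains no 0x00 at all, a string function will run past the end of the array looking for one, and bad things will happen. String functions also work on plain char, whose signedness is implementation-defined, whereas for raw binary data we want unsigned char.

You could try something like this: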

#include <cstring>  // memcpy
unsigned char finalData[1000];  // size is just example
unsigned char bytesLen[4];
unsigned char blockContent[300];
unsigned char *dst;
dst = finalData;
memcpy(dst,bytesLen,sizeof(bytesLen));
dst += sizeof(bytesLen);
memcpy(dst,blockContent,sizeof(blockContent));
dst += sizeof(blockContent);
// append more if needed in similar way ...

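Note: the above assumes that blockContent has a fixed size. If it holds a variable number of bytes, replace sizeof(blockContent) with (for example) bclen, where bclen is the number of bytes in blockContent.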
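For the variable-length case, one possible sketch uses std::vector instead of a fixed finalData array, so the buffer grows to fit. The function name `buildFrame` is hypothetical, and the assumption that the 4-byte prefix holds the number of payload bytes (most significant byte first) is mine; adjust it if your format counts something else.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build "4-byte length prefix + payload" in one buffer.
// Assumption: the prefix stores bclen, the number of payload bytes.
// (buildFrame is a hypothetical helper name.)
std::vector<unsigned char> buildFrame(const unsigned char *blockContent,
                                      std::uint32_t bclen)
{
    std::vector<unsigned char> finalData;
    finalData.reserve(4 + static_cast<std::size_t>(bclen));

    // length prefix, most significant byte first
    finalData.push_back(static_cast<unsigned char>((bclen >> 24) & 0xFF));
    finalData.push_back(static_cast<unsigned char>((bclen >> 16) & 0xFF));
    finalData.push_back(static_cast<unsigned char>((bclen >> 8) & 0xFF));
    finalData.push_back(static_cast<unsigned char>(bclen & 0xFF));

    // payload: copied by count, so embedded 0x00 bytes are kept
    finalData.insert(finalData.end(), blockContent, blockContent + bclen);
    return finalData;
}
```

Because the copy is driven by bclen rather than by a terminator, a payload such as {0xAA, 0x00, 0xBB} comes out as the 7 bytes 00 00 00 03 AA 00 BB.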