Deep copy of TCHAR array is truncated

Keywords: deep copy, array, TCHAR. Updated: 2023-10-16

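I've created a class to test some functionality I need to use. Essentially, the class takes a deep copy of the passed-in string and makes it available via a getter. I'm using Visual Studio 2012, and Unicode is enabled in the project settings.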

The problem is that the memcpy operation yields a truncated string. The output looks like this:

THISISATEST: InstanceDataConstructor: Testing testing 123
Testing te_READY

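The first line is a check on the passed-in TCHAR* string, and the second line is the output after populating the allocated memory with the memcpy operation. The expected output is "Testing testing 123".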

Can anyone explain what is going wrong here?

Note: I got the #ifndef UNICODE typedefs from here: How to convert a TCHAR array to std::string

#ifndef INSTANCE_DATA_H//if not defined already
#define INSTANCE_DATA_H//then define it
#include <string>
//TCHAR is just a typedef, that depending on your compilation configuration, either defaults to char or wchar.
//Standard Template Library supports both ASCII (with std::string) and wide character sets (with std::wstring).
//All you need to do is to typedef String as either std::string or std::wstring depending on your compilation configuration.
//To maintain flexibility you can use the following code:
#ifndef UNICODE  
  typedef std::string String; 
#else
  typedef std::wstring String; 
#endif
//Now you may use String in your code and let the compiler handle the nasty parts. String will now have constructors that lets you convert TCHAR to std::string or std::wstring.

class InstanceData
{
public: 
    InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor     
        {
        //Check to passed in string
        String outMsg(L"THISISATEST: InstanceDataConstructor: ");//L for wide character string literal
        outMsg += strMessage;//concatenate message
        const wchar_t* finalMsg = outMsg.c_str();//prepare for outputting
        OutputDebugStringW(finalMsg);//print the message    
        //Prepare TCHAR dynamic array.  Deep copy.
        charArrayPtr = new TCHAR[strMessage.size() +1];
        charArrayPtr[strMessage.size()] = 0;//null terminate
        std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());//copy characters from array pointed to by the passed in TCHAR*.
        OutputDebugStringW(charArrayPtr);//print the copied message to check    
        }
    ~InstanceData()//destructor
        {
            delete[] charArrayPtr;
        }
    //Getter
    TCHAR* getMessage() const
    {
        return charArrayPtr;
    }
private:
    TCHAR* charArrayPtr;
    String strMessage;//is used to conveniently ascertain the length of the passed in underlying TCHAR array.
};
#endif//header guard

Here is a solution without all of the manual dynamic memory allocation.

#include <tchar.h>
#include <vector>
//... (String is the std::string / std::wstring typedef from the question)
class InstanceData
{
    public: 
        InstanceData(TCHAR* strIn) : strMessage(strIn)
        {   
            // Copy the characters into the vector, then append the null terminator.
            charArrayPtr.insert(charArrayPtr.begin(), strMessage.begin(), strMessage.end());
            charArrayPtr.push_back(0);   
        }
        TCHAR* getMessage()
        { return &charArrayPtr[0]; }
    private:
        String strMessage;
        std::vector<TCHAR> charArrayPtr;
};

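This does what your class does, but the major difference is that it contains no hand-rolled dynamic allocation code. The class is also safely copyable, unlike the code with the dynamic allocation, which lacks a user-defined copy constructor and assignment operator.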

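The std::vector class has superseded the need for new[]/delete[] in almost all circumstances. The reason is that vector stores its data in contiguous memory, no different from what a call to new[] gives you.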
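To illustrate the point about copy safety, here is a minimal portable sketch. It assumes a C++11 compiler and uses plain wchar_t instead of TCHAR so that it does not depend on <tchar.h>; the names MessageHolder and copyIsIndependent are hypothetical and not part of the code above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the vector-based class, with wchar_t in place of TCHAR.
class MessageHolder
{
public:
    explicit MessageHolder(const wchar_t* text)
    {
        assert(text != nullptr);
        const std::wstring source(text);
        buffer.assign(source.begin(), source.end());
        buffer.push_back(L'\0'); // keep the buffer null terminated
    }

    // The vector's storage is contiguous, so data() can be used like a new[] array.
    wchar_t* getMessage() { return buffer.data(); }
    const wchar_t* getMessage() const { return buffer.data(); }

private:
    std::vector<wchar_t> buffer;
};

// Copies a holder, edits the copy, and reports whether the original is unaffected.
bool copyIsIndependent()
{
    MessageHolder original(L"Testing testing 123");
    MessageHolder copy = original;   // compiler-generated copy; the vector copies its elements
    copy.getMessage()[0] = L'X';     // modify only the copy
    return std::wstring(original.getMessage()) == L"Testing testing 123"
        && std::wstring(copy.getMessage()) == L"Xesting testing 123"
        && original.getMessage() != copy.getMessage();
}
```

With a raw TCHAR* member and no user-defined copy constructor, the same copy would leave both objects pointing at one buffer, and both destructors would call delete[] on it.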

Note the following lines in your code:

// Prepare TCHAR dynamic array.  Deep copy.
charArrayPtr = new TCHAR[strMessage.size() + 1];
charArrayPtr[strMessage.size()] = 0; // null terminate
// Copy characters from array pointed to by the passed in TCHAR*.
std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());

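The third argument passed to memcpy() is the number of bytes to copy.
If the string is a plain ASCII string stored in a std::string, the number of bytes is the same as the number of ASCII characters.

However, if the string is a wchar_t Unicode UTF-16 string, each wchar_t occupies 2 bytes in Visual C++ (GCC differs, but this is Windows Win32/C++ code compiled with VC++, so let's focus on VC++).
You must therefore scale the size count passed to memcpy() by the size of a wchar_t, e.g.: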

memcpy(charArrayPtr, strMessage.data(), strMessage.size() * sizeof(TCHAR));

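So, if you compile in Unicode (UTF-16) mode, TCHAR expands to wchar_t and sizeof(wchar_t) is 2, so the content of the original string is deep-copied in full. In the original code, the 19-character message occupies 38 bytes but only 19 bytes were copied, which is why only roughly the first half of the text appears in the output.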
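The difference between the two counts can be reproduced outside Visual C++ with a small sketch. This is an illustration under assumptions, not the original code: it assumes a C++11 compiler and uses char16_t as a stand-in for VC++'s 2-byte wchar_t, and the names Char, Str, copyWithByteCount, copyTruncated and copyComplete are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Assumption: char16_t stands in for VC++'s 2-byte wchar_t so the sketch
// behaves the same way on compilers where wchar_t is wider.
typedef char16_t Char;
typedef std::u16string Str;

// Copies `source` into a zero-filled buffer, passing `byteCount` to memcpy,
// and returns whatever null-terminated string ends up in the buffer.
Str copyWithByteCount(const Str& source, std::size_t byteCount)
{
    assert(byteCount <= source.size() * sizeof(Char));
    std::vector<Char> buffer(source.size() + 1, Char(0));
    std::memcpy(buffer.data(), source.data(), byteCount);
    return Str(buffer.data());
}

// The count from the question: number of characters, treated as bytes.
Str copyTruncated(const Str& s) { return copyWithByteCount(s, s.size()); }

// The corrected count: number of characters scaled by the character size.
Str copyComplete(const Str& s) { return copyWithByteCount(s, s.size() * sizeof(Char)); }
```

Calling copyTruncated on the 19-character test message returns only a leading part of it, while copyComplete returns the whole message. Unlike the original code, the buffer here is zero-filled first, so the truncated result ends cleanly instead of running into leftover memory.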

As an alternative, for Unicode UTF-16 strings in VC++ you can also use wmemcpy(), which treats wchar_t as its "unit of copy". In that case you don't have to scale the size by sizeof(wchar_t).
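A minimal sketch of the wmemcpy() variant follows. It assumes a C++11 compiler and works on std::wstring directly, so it applies to the Unicode build only; the helper name deepCopyWide is hypothetical.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <vector>

// Deep-copies a wide string into a fresh null-terminated buffer using wmemcpy,
// whose count is given in wchar_t units rather than bytes.
std::vector<wchar_t> deepCopyWide(const std::wstring& source)
{
    // Elements are value-initialized, so the last one is already L'\0'.
    std::vector<wchar_t> buffer(source.size() + 1);
    std::wmemcpy(buffer.data(), source.data(), source.size());
    assert(buffer.back() == L'\0');
    return buffer;
}
```

Because the count is in characters, the same call is correct whether wchar_t is 2 bytes (VC++) or larger on another compiler.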

As a side note, in your constructor you have:

InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor

Since strIn is an input string parameter, consider passing it by const pointer, i.e.:

InstanceData(const TCHAR* strIn)
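One practical benefit of the const signature is that callers can pass pointers they only have read access to, such as the result of c_str(). The sketch below assumes plain wchar_t instead of TCHAR for portability; ConstInstanceData and roundTrip are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical reduced version of the class that only stores the message.
class ConstInstanceData
{
public:
    explicit ConstInstanceData(const wchar_t* strIn) : strMessage(strIn) { }
    const std::wstring& message() const { return strMessage; }

private:
    std::wstring strMessage;
};

// Builds an instance from a const std::wstring and returns the stored copy.
std::wstring roundTrip(const std::wstring& text)
{
    // c_str() returns const wchar_t*; a non-const wchar_t* parameter would
    // reject this call at compile time.
    ConstInstanceData data(text.c_str());
    assert(data.message() == text);
    return data.message();
}
```

The const qualifier also documents that the constructor does not modify the caller's string.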