Memory is corrupted after pushing into the vector

Keywords: corrupted, vector, memory. Updated: 2023-10-16

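Why is memory corrupted after pushing into a vector? In the program below, I have a struct with a string member (not a pointer). Each time, I create a local struct object, assign a string value, and push it into the vector. After pushing, I make changes to the local struct object, but those changes show up in the string data of the struct object stored in the vector.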

    #include <iostream>
    #include <vector>
    #include <string>
    #include <cstring>
    using namespace std;
    void PushVector(string);
    struct thread_info
    {
            int  id;
            string threadname;
            bool bval;
    };
    std::vector<thread_info> myvector;

    int main ()
    {
            PushVector("Thread1"); // valid data into vector
            PushVector("Thread2");
            struct thread_info print;
            while(!myvector.empty())
            {
                    for(unsigned int index = 0; index < myvector.size(); ++index )
                    {
                            print = myvector.at(index);
                            cout<<"id : "<<print.id<<"\nthread name : "<<print.threadname<<"\nbool value : "<<print.bval<<endl;
                    }
                    myvector.clear();
            }
            return 0;
    }
    void PushVector(const string str)
    {
            std::cout << "Push the thread name to vector\n";
            struct thread_info thread;
            thread.id = 10;
            thread.threadname = str;
            thread.bval = true;
            myvector.push_back (thread); //copying struct obj to vector
            char* p =  (char* )thread.threadname.c_str();
            memcpy(p,"Wrong", 5); //==> Memory corrupting with invalid data after push back. Is it a limitation in C++? 
            thread.threadname = "blabla";  //trying to corrupt directly to string object
    }

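o/p: Push the thread name to vector
Push the thread name to vector
id : 10
thread name : Wrongd1 ==> Memory corrupted? Why not the string that was pushed?
bool value : 1
id : 10
thread name : Wrongd2 ==> Memory corrupted? Why not the string that was pushed?
bool value : 1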

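Calling memcpy on the result of .c_str() is wrong. Wasn't the fact that you had to cast away the const a hint? Which learning resource taught you to do this?

As paulm put it:

Stomping over the string's memory like that is only going to result in tears.

std::string::c_str() returns a pointer to a constant buffer that you must not modify. Because of optimizations in some toolchains (for example, the copy-on-write strings in GCC's libstdc++ before 5.0), that buffer may not even belong exclusively to the string you called it on, which appears to be the case here.

Forget memcpy.

At best, you could do this: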

thread.threadname.resize(5);
memcpy(&thread.threadname[0], "Wrong", 5);

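Or, in more C++-like code (std::copy comes from <algorithm>):

thread.threadname.resize(5);
const char src[] = "Wrong";  // one array, so src and src + 5 form a valid range
std::copy(src, src + 5, &thread.threadname[0]);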

But really, you should just write:

thread.threadname = "Wrong";
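To make the recommendation concrete, here is a minimal sketch of the question's PushVector rewritten so that the local object is only modified through the std::string interface. The signature (taking the vector by reference and returning the local name) and the CheckPushVector helper are assumptions made for illustration; they are not part of the original program.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct thread_info
{
    int id;
    std::string threadname;
    bool bval;
};

// Appends a record to `out`, then changes the local object afterwards.
// Returns the final name of the local object so a caller can compare it
// with what ended up in the vector.
std::string PushVector(std::vector<thread_info>& out, const std::string& name)
{
    thread_info thread{10, name, true};
    out.push_back(thread);           // the vector stores its own copy

    // Modify the local object through std::string's own interface only.
    thread.threadname = "Wrong";
    return thread.threadname;
}

// Hypothetical self-check: the stored element keeps the name that was pushed.
void CheckPushVector()
{
    std::vector<thread_info> v;
    const std::string local = PushVector(v, "Thread1");
    assert(local == "Wrong");
    assert(v.at(0).threadname == "Thread1");
}
```

Because the write goes through operator=, the string implementation gets the chance to separate the two objects first, whatever storage strategy it uses internally.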

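tl;dr

Your memcpy() on a const pointer (which is undefined behavior) collides with a copy-on-write optimization.

Yes, vector::push_back() pushes a copy of the object into the vector. After you push_back() the local thread_info object, changes to the local object should not affect the object in the vector, right?

However, std::string is allowed to assume that any access to it happens in a well-defined way. Doing a memcpy() to the (const) pointer returned by .c_str() is not well-defined.

So... suppose std::string took a shortcut when the thread_info object was copied into the vector: instead of copying the contained data, it copied only the pointer to the data, so both std::string objects refer to the same memory area.

It can then postpone the copy until it is actually needed (if ever), that is, until one of the strings is written to through any defined function (such as string::insert() or operator+=). This is called "copy-on-write", and it is a fairly common optimization.

By casting away the const from the return value of .c_str() and running memcpy() on it, you defeated this mechanism. Because you did not go through any string member function that could have performed the copy-on-write, the two objects (which should be distinct) still point to the same data memory.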
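The mechanism described above can be sketched with a toy class. CowString and both demo functions below are hypothetical and are not how any real std::string is implemented; they only show why a write that bypasses the class interface reaches every object sharing the buffer. In this toy the const_cast write is well-defined because the underlying std::vector<char> is not const, whereas doing the same to a real std::string is undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Toy copy-on-write string (illustration only).
class CowString
{
public:
    explicit CowString(const char* s)
        : buf_(std::make_shared<std::vector<char>>(s, s + std::strlen(s) + 1)) {}

    // Copying copies only the shared_ptr, so both objects share one buffer.
    CowString(const CowString&) = default;
    CowString& operator=(const CowString&) = default;

    const char* c_str() const { return buf_->data(); }

    // A "defined" write: detach from the shared buffer, then overwrite
    // the first n characters in place.
    void overwrite(const char* src, std::size_t n)
    {
        assert(n < buf_->size());    // keep the terminating '\0' intact
        detach();
        std::memcpy(buf_->data(), src, n);
    }

private:
    // Make a private copy of the buffer if anyone else still shares it.
    void detach()
    {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::vector<char>>(*buf_);
    }

    std::shared_ptr<std::vector<char>> buf_;
};

// Writes through the pointer from c_str(), bypassing detach():
// the copy sees the change too.
std::string RawWriteLeaksIntoCopy()
{
    CowString local("Thread1");
    CowString stored = local;        // shares the buffer with `local`
    std::memcpy(const_cast<char*>(local.c_str()), "Wrong", 5);
    return stored.c_str();
}

// Writes through the class interface: the copy is detached first
// and stays untouched.
std::string InterfaceWriteKeepsCopy()
{
    CowString local("Thread1");
    CowString stored = local;
    local.overwrite("Wrong", 5);
    return stored.c_str();
}
```

In this sketch, RawWriteLeaksIntoCopy() returns "Wrongd1", the same value seen in the question's output, while InterfaceWriteKeepsCopy() returns "Thread1".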

GDB output, with a breakpoint on the last line of PushVector():

(gdb) print &thread
$3 = (thread_info *) 0x7fffffffe240
(gdb) print &myvector[0]
$4 = (thread_info *) 0x605040

The two thread_info objects are different.

(gdb) print &thread.threadname
$5 = (std::string *) 0x7fffffffe248
(gdb) print &myvector[0].threadname
$6 = (std::string *) 0x605048

The two string objects are different as well.

(gdb) print thread.threadname.c_str()
$7 = 0x605028 "Wrongd1"
(gdb) print myvector[0].threadname.c_str()
$8 = 0x605028 "Wrongd1"

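But they point to the same memory area: neither string object was aware that a write access had happened, so no actual copy of the data ever took place.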
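The same check can be made without a debugger. The sketch below uses a hypothetical CopyReport struct and InspectCopy function (neither is part of the original program) to copy a string and compare the addresses reported by c_str(). The same_buffer result depends on the implementation: a copy-on-write std::string would be expected to report a shared buffer, as in the GDB session above, while implementations that follow the C++11 rules should report separate buffers.

```cpp
#include <string>

// Result of copying a string and comparing the original with the copy.
struct CopyReport
{
    bool same_content;  // the copy compares equal to the original
    bool same_buffer;   // both report the same c_str() address
};

// Copies `s` and reports whether the copy shares the original's buffer.
// Only reads through c_str(); nothing is written through the pointer.
CopyReport InspectCopy(const std::string& s)
{
    const std::string copy = s;
    return CopyReport{copy == s, copy.c_str() == s.c_str()};
}
```

Calling InspectCopy("Thread1") and printing same_buffer gives a quick indication of which strategy a given toolchain uses.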