Recursive insert of a chain into memory fails

This may be a long question, but I hope someone can help me figure out what is going wrong.

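I am inserting a JSON object into already allocated memory using my own datatype. It basically holds a union with the data and a ptrdiff_t offset to the next datatype, counted in 8-bit steps (bytes).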

#include <cstddef>  // std::ptrdiff_t
#include <cstring>  // memset

template <typename T>
class BaseType
{
public:
    BaseType();
    explicit BaseType(T& t);
    explicit BaseType(const T& t);
    ~BaseType();
    inline void setNext(const std::ptrdiff_t& next);
    inline std::ptrdiff_t getNext();
    inline void setData(T& t);
    inline void setData(const T& t);
    inline T getData() const;
protected:
    union DataUnion
    {
        T data;
        ::std::ptrdiff_t size;
        DataUnion()
        {
            memset(this, 0, sizeof(DataUnion));
        } //init with 0
        explicit DataUnion(T& t);
        explicit DataUnion(const T& t);
    } m_data;
    long long m_next;
};

The implementation is straightforward: nothing special happens, it just sets and gets the defined values. (I will skip the implementation here.)
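To make the layout concrete, here is a minimal sketch of a chain of nodes linked by byte offsets inside a preallocated buffer. It is not the author's BaseType: `Node`, `byteDist`, and `sumChain` are hypothetical names, and treating a `next` of 0 as the end of the chain is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for BaseType<T>: a payload plus a byte offset to the next node.
template <typename T>
struct Node {
    T data;
    std::ptrdiff_t next = 0;  // assumption: 0 marks the end of the chain
    explicit Node(const T& t) : data(t) {}
};

// Byte distance between two addresses, similar in spirit to dist() in the question.
inline std::ptrdiff_t byteDist(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

// Places three nodes into a buffer with placement new, links them by offset,
// then walks the chain and returns the sum of the payloads.
inline long long sumChain() {
    alignas(std::max_align_t) unsigned char page[256];
    using N = Node<long long>;
    N* a = new (page) N(1);
    N* b = new (page + 64) N(2);
    N* c = new (page + 128) N(3);
    assert(c->next == 0);  // the last node keeps the end marker
    a->next = byteDist(a, b);
    b->next = byteDist(b, c);

    long long sum = 0;
    for (N* cur = a;;) {
        sum += cur->data;
        if (cur->next == 0) break;
        // Follow the relative offset to reach the next node.
        cur = reinterpret_cast<N*>(reinterpret_cast<char*>(cur) + cur->next);
    }
    return sum;
}
```

Because the links are relative offsets rather than absolute pointers, the whole buffer could be copied or mapped at another address and the chain would still be walkable.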

So here is the code where things go wrong:

std::pair<void*, void*> Page::insertObject(const rapidjson::GenericValue<rapidjson::UTF8<>>& value,
         BaseType<size_t>* last)
 {
     //return ptr to the first element
     void* l_ret = nullptr;
     //prev element ptr
     BaseType<size_t>* l_prev = last;
     //position pointer
     void* l_pos = nullptr;
     //get the members
     for (auto it = value.MemberBegin(); it != value.MemberEnd(); ++it)
     {
         switch (it->value.GetType())
         {
             case rapidjson::kNullType:
                 LOG_WARN << "null type: " << it->name.GetString();
                 continue;
             case rapidjson::kFalseType:
             case rapidjson::kTrueType:
                 {
                     l_pos = find(sizeof(BaseType<bool>));
                     void* l_new = new (l_pos) BaseType<bool>(it->value.GetBool());
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             case rapidjson::kObjectType:
                 {
                     //pos for the obj id
                     //and insert the ID of the obj
                     l_pos = find(sizeof(BaseType<size_t>));
                     std::string name = it->name.GetString();
                     void* l_new = new (l_pos) BaseType<size_t>(common::FNVHash()(name));
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                     //TODO something strange happens here!
                     // pass the objid Object to the insertobj!
                     // now recursive insert the obj
                     // the second contains the last element inserted
                     // l_pos current contains the last inserted element and get set to the
                     // last element of the obj we insert
                     l_pos = (insertObject(it->value, reinterpret_cast<BaseType<size_t>*>(l_new)).second);
                 }
                 break;
             case rapidjson::kArrayType:
                 {//skip this at the moment till the bug is fixed
                 }
                 break;
             case rapidjson::kStringType:
                 {
                     // find pos where the string fits
                     // sometimes we end up here and the string does not fit,
                     // which should be impossible since we lock the whole page
                     l_pos = find(sizeof(StringType) + strlen(it->value.GetString()));
                     //add the String Type at the pos of the FreeType
                     auto* l_new = new (l_pos) StringType(it->value.GetString());
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             case rapidjson::kNumberType:
                 {
                     //long long and double have the same size on x64, so one lookup covers both
                     //find a position where the number fits
                     l_pos = find(sizeof(BaseType<long long>));
                     void* l_new;
                     if (it->value.IsInt())
                     {
                         //insert INT
                         l_new = new (l_pos) BaseType<long long>(it->value.GetInt64());
                     }
                     else
                     {
                         //INSERT DOUBLE
                         l_new = new (l_pos) BaseType<double>(it->value.GetDouble());
                     }
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             default:
                 LOG_WARN << "Unknown member Type: " << it->name.GetString() << ":" << it->value.GetType();
                 continue;
         }
         //so first element is set now, store it to return it.
         if(l_ret == nullptr)
         {
             l_ret = l_pos;
         }
         //prev is the l_pos now so cast it to this;
         l_prev = reinterpret_cast<BaseType<size_t>*>(l_pos);
     }
     //if we get here its in!
     return{ l_ret, l_pos };
 }

I start the insertion like this:

auto firstElementPos = insertObject(value.MemberBegin()->value, nullptr).first;

value.MemberBegin()->value is the object to insert, and ->name holds the name of the object. In the case below that is Person and everything between the { }.

The problem is this: if I insert a JSON object that contains one inner object, like this:

"Person":
{
    "age":25,
    "double": 23.23,
    "boolean": true,
    "double2": 23.23,
    "firstInnerObj":{
        "innerDoub": 12.12
    }   
}

it works fine and I can reproduce the object. But if I have more inner objects, like this:

"Person":
{
    "age":25,
    "double": 23.23,
    "boolean": true,
    "double2": 23.23,
    "firstInnerObj":{
        "innerDoub": 12.12
    },
    "secondInnerObj":{
        "secInnerDoub": 12.12
    }
}

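it fails and I lose data, so I think my recursion goes wrong somewhere, but I cannot see why. If you need more information, let me know. Take a look at the code here, and the client here.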

test.json needs to contain a JSON object like the one above, and the find only needs to contain {"oid__":2} to get the second inserted object.


I could trace the problem back to the point in my code where I recursively rebuild the object. Some of the next pointers seem to be incorrect:

    void* Page::buildObject(const size_t& hash, void* start, rapidjson::Value& l_obj,
                            rapidjson::MemoryPoolAllocator<>& aloc)
    {
        //get the meta information of the object type
        //to build it
        auto& l_metaIdx = meta::MetaIndex::getInstance();
        //get the meta dataset
        auto& l_meta = l_metaIdx[hash];
        //now we are already in an object here with l_obj!
        auto l_ptr = start;
        for (auto it = l_meta->begin(); it != l_meta->end(); ++it)
        {
            //create the name value
            rapidjson::Value l_name(it->name.c_str(), it->name.length(), aloc);
            //create the value we are going to add
            rapidjson::Value l_value;
            //now start building it up again
            switch (it->type)
            {
                case meta::OBJECT:
                    {
                        auto l_data = static_cast<BaseType<size_t>*>(l_ptr);
                        //get the hash to obtain the metadata
                        auto l_hash = l_data->getData();
                        //set to object and create the inner object
                        l_value.SetObject();
                        //get the start pointer which is the "next" element
                        //and call recursive
                        l_ptr = static_cast<BaseType<size_t>*>(buildObject(l_hash,
                                                               (reinterpret_cast<char*>(l_data) + l_data->getNext()), l_value, aloc));
                    }
                    break;
                case meta::ARRAY:
                    {
                        l_value.SetArray();
                        auto l_data = static_cast<ArrayType*>(l_ptr);
                        //get the number of elements stored in the array
                        auto l_size = l_data->size();
                        l_ptr = buildArray(l_size, static_cast<char*>(l_ptr) + l_data->getNext(), l_value, aloc);
                    }
                    break;
                case meta::INT:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<long long>*>(l_ptr);
                        //assign the stored integer to the JSON value
                        l_value = l_data->getData();
                    }
                    break;
                case meta::DOUBLE:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<double>*>(l_ptr);
                        //assign the stored double to the JSON value
                        l_value = l_data->getData();
                    }
                    break;
                case meta::STRING:
                    {
                        //create the data
                        auto l_data = static_cast<StringType*>(l_ptr);
                        //with length attribute it's faster
                        l_value.SetString(l_data->getString()->c_str(), l_data->getString()->length(), aloc);
                    }
                    break;
                case meta::BOOL:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<bool>*>(l_ptr);
                        l_value = l_data->getData();
                    }
                    break;
                default:
                    break;
            }
            l_obj.AddMember(l_name, l_value, aloc);
            //update the lptr
            l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();
        }
        //return l_ptr, which currently points to the next element (see the line above)
        return l_ptr;
    }

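After hours of debugging I found the small mistake that caused this. The method that rebuilds an object after insertion returns a pointer that has already been advanced past the last element it read (last element->next). After the switch I then applied ->next once more, which loses data because it skips one element in the singly linked list.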

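The fix is to apply the line

l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();

only in the switch cases that are not Object or Array. See the fix commit. This actually also let me fix the insertion of arrays.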
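The idea behind the fix can be illustrated with a small, self-contained sketch. It does not use the real Page, BaseType, or rapidjson types: `Elem`, `rebuild`, and `rebuildSample` are hypothetical, elements live in a `std::vector`, and `next` is an index offset rather than a byte offset. The point it shows is that the recursive call already returns the position after the inner object, so the caller advances the cursor only for plain values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat element: an object is stored as a header followed by its members.
struct Elem {
    bool isObject;
    int memberCount;   // used only when isObject is true
    int value;         // used only when isObject is false
    std::size_t next;  // offset (in elements) to the following element
};

// Rebuilds `count` members starting at `pos`, appending a textual form to `out`.
// Returns the index of the first element after the members that were read.
inline std::size_t rebuild(const std::vector<Elem>& mem, std::size_t pos,
                           int count, std::string& out) {
    for (int i = 0; i < count; ++i) {
        const Elem& e = mem[pos];
        if (e.isObject) {
            out += "{ ";
            // The recursive call returns the position past the inner object,
            // so the cursor must not be advanced a second time here.
            pos = rebuild(mem, pos + e.next, e.memberCount, out);
            out += "} ";
        } else {
            out += std::to_string(e.value) + " ";
            pos += e.next;  // advance only for plain values
        }
    }
    return pos;
}

// Sample data mirroring the failing case: one value followed by two inner objects.
inline std::string rebuildSample() {
    const std::vector<Elem> mem = {
        {false, 0, 25, 1},
        {true, 1, 0, 1}, {false, 0, 12, 1},
        {true, 1, 0, 1}, {false, 0, 13, 1},
    };
    std::string out;
    rebuild(mem, 0, 3, out);
    return out;
}
```

With an unconditional advance after the recursive call, the header of the second inner object would be skipped. That matches the symptom in the question, where a single trailing inner object worked but two did not.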

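Of course, nobody could have spotted the real problem here without digging deep into the code, but I still wanted to show the fix. Thanks to @sehe, who helped me figure out what was going wrong.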