C++ RapidXml - Traversing with first_node() to modify the value of a node in an XML file

Updated: 2023-10-16

I must be going crazy... This is my XML file, named Original.xml:

<root>
    <metadata>Trying to change this</metadata>
    <body>
        <salad>Greek Caesar</salad>
    </body>
</root>

I am trying to modify the content inside the metadata tag.

Here is my entire code, which WORKS:

#include <fstream>
#include <iostream>
#include <string>
#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>

int main()
{
    // Open 'Original.xml' to read from
    rapidxml::file<> xmlFile("Original.xml");
    rapidxml::xml_document<> doc;
    doc.parse<0>(xmlFile.data());

    // Get to <metadata> tag
    //                                       <root>        <metadata>    ???
    rapidxml::xml_node<>* metadataNode = doc.first_node()->first_node()->first_node();

    // Always correctly prints: 'Trying to change this'
    std::cout << "Before: " << metadataNode->value() << std::endl;

    // Modify the contents within <metadata>
    const std::string newMetadataValue = "Did the changing";
    metadataNode->value(newMetadataValue.c_str());

    // Always correctly prints: 'Did the changing'
    std::cout << "After: " << metadataNode->value() << std::endl;

    // Save output to 'New.xml'
    std::ofstream newXmlFile("New.xml");
    newXmlFile << doc;
    newXmlFile.close();
    doc.clear();
    return 0;
}

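New.xml now looks like this:

<root>
    <metadata>Did the changing</metadata>
    <body>
        <salad>Greek Caesar</salad>
    </body>
</root>

This is the behavior I want.

What I don't understand is why I need the third first_node() call to SAVE the information in metadata.

If I remove the third first_node() call, marked with ??? above, New.xml keeps the old <metadata> string: "Trying to change this".

However, in that case both std::cout calls on metadataNode->value() still print the expected strings: the first prints "Trying to change this" and the second correctly prints "Did the changing".

Why do I need n+1 calls to first_node() to SAVE the new value at the desired node, where n is the number of nodes traversed (from the root) to reach the desired node? And why, with only n first_node() calls, can I modify the desired node's value in RAM but not in the saved file?

A possible bug? On whose end?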

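In the XML tree model, text is a node too. This makes sense when you have mixed-content elements: <a>some<b/>text<c/>nodes</a>

The third first_node() call gets you the text node inside <metadata>, and that is the node that ends up in the output.

Basically: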

#include <fstream>
#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>

int main() {
    rapidxml::file<> xmlFile("Original.xml");
    rapidxml::xml_document<> doc;
    doc.parse<0>(xmlFile.data());

    auto root      = doc.first_node();       // <root>
    auto metadata  = root->first_node();     // <metadata>
    auto text_node = metadata->first_node(); // the text (data) node inside <metadata>

    text_node->value("Did the changing");

    std::ofstream newXmlFile("New.xml");
    newXmlFile << doc;
}
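
To make the difference between "changed in RAM" and "changed in the file" visible, here is a small toy model. It is not RapidXml itself. It assumes, as far as I can tell from reading rapidxml.hpp and rapidxml_print.hpp, that a default parse stores the text both as the element's own value and in a child data node, and that printing prefers the children whenever there are any. The names Node, make_parsed_element and to_xml are made up for this illustration.

```cpp
#include <string>
#include <vector>

// Toy stand-in for an XML DOM node; NOT RapidXml's real types.
struct Node {
    enum Kind { Element, Data } kind;
    std::string name;            // used by elements only
    std::string value;           // text payload
    std::vector<Node> children;  // child nodes, in document order
};

// Mimics what a default parse is assumed to do for <name>text</name>:
// the text goes into a child data node AND is copied into the element's value.
inline Node make_parsed_element(const std::string& name, const std::string& text) {
    Node element{Node::Element, name, text, {}};
    element.children.push_back(Node{Node::Data, "", text, {}});
    return element;
}

// Mimics the assumed printing rule: children win; the element's own value
// is only written out when the element has no children at all.
inline std::string to_xml(const Node& node) {
    if (node.kind == Node::Data) return node.value;
    std::string body;
    if (node.children.empty()) {
        body = node.value;
    } else {
        for (const Node& child : node.children) body += to_xml(child);
    }
    return "<" + node.name + ">" + body + "</" + node.name + ">";
}
```

In this model, assigning to the element's value changes what the element reports when you read it back, but to_xml() still emits the old text because the data child is untouched. Assigning to children[0].value changes the output. That is the same pattern the question describes with two versus three first_node() calls.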

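But wait, there's more

Sadly, this is a problem unless your input has exactly the expected properties.

Assume this sample is fine:

char sample1[] = R"(<root><metadata>Trying to change this</metadata></root>)";

If the metadata element is empty, metadata->first_node() returns a null pointer and the code crashes:

char sample2[] = R"(<root><metadata></metadata></root>)";
char sample3[] = R"(<root><metadata/></root>)";

Indeed, this triggers ASan/UBSan failures: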

/home/sehe/Projects/stackoverflow/test.cpp:17:25: runtime error: member access within null pointer of type 'struct xml_node'
/home/sehe/Projects/stackoverflow/test.cpp:17:25: runtime error: member call on null pointer of type 'struct xml_base'
/usr/include/rapidxml/rapidxml.hpp:762:24: runtime error: member call on null pointer of type 'struct xml_base'
/usr/include/rapidxml/rapidxml.hpp:753:21: runtime error: member access within null pointer of type 'struct xml_base'
AddressSanitizer:DEADLYSIGNAL

If there is a surprise, it will... do something surprising!

char sample4[] = R"(<root><metadata><surprise/></metadata></root>)";

This ends up erroneously generating:

<root>
    <metadata>
        <surprise>changed</surprise>
    </metadata>
</root>

And it doesn't end there:

#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>
#include <iostream>

namespace {
    char sample1[] = R"(<root><metadata>Trying to change this</metadata></root>)";
    char sample2[] = R"(<root><metadata><surprise/></metadata></root>)";
    char sample3[] = R"(<root><metadata>mixed<surprise/>bag</metadata></root>)";
    char sample4[] = R"(<root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root>)";
    char sample5[] = R"(<root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root>)";
    // These crash:
    //char sampleX[] = R"(<root><metadata></metadata></root>)";
    //char sampleY[] = R"(<root><metadata/></root>)";
}

int main() {
    for (char* xml : {sample1, sample2, sample3, sample4, sample5}) {
        std::cout << "\n=== " << xml << " ===\n";

        rapidxml::xml_document<> doc;
        doc.parse<0>(xml);

        // Blindly take the first child at each level, whatever it is
        auto root      = doc.first_node();
        auto metadata  = root->first_node();
        auto text_node = metadata->first_node();

        text_node->value("changed");

        print(std::cout << " --> ", doc, rapidxml::print_no_indenting);
        std::cout << "\n";
    }
}

Prints:

=== <root><metadata>Trying to change this</metadata></root> ===
--> <root><metadata>changed</metadata></root>
=== <root><metadata><surprise/></metadata></root> ===
--> <root><metadata><surprise>changed</surprise></metadata></root>
=== <root><metadata>mixed<surprise/>bag</metadata></root> ===
--> <root><metadata>changed<surprise/>bag</metadata></root>
=== <root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root> ===
--> <root><metadata><![CDATA[changed]]></metadata></root>
=== <root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root> ===
--> <root><metadata>changed</metadata></root>

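How to make it robust

  • First, use a query to find the target. Sadly, rapidxml does not support this; see "What XML parser should I use in C++?"

  • Second, check the node type before editing.

  • Third, replace the entire node if you can, which makes you independent of whatever was there before.

  • Finally, make sure to actually allocate the new node from the document, so you don't run into lifetime issues.

    auto root = doc.first_node();
    if (auto* old_meta = root->first_node()) {
        assert(old_meta->name() == std::string("metadata"));
        print(std::cout << "Removing metadata node: ", *old_meta, fmt);
        std::cout << "\n";
        root->remove_first_node();
    }
    auto newmeta = doc.allocate_node(rapidxml::node_element, "metadata", "changed");
    root->prepend_node(newmeta);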
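
On the lifetime point: as far as I know, RapidXml nodes keep plain pointers to their name and value and do not copy the characters, which is why the document's memory pool offers allocate_string(). The sketch below is a toy model of that idea using only the standard library, not RapidXml code. StringPool, ToyNode and set_value_copied are hypothetical names introduced for illustration.

```cpp
#include <deque>
#include <string>

// Toy stand-in for a document-owned string pool (hypothetical, not RapidXml).
// std::deque does not relocate existing elements on push_back, so pointers
// handed out earlier stay valid for as long as the pool lives.
class StringPool {
public:
    const char* allocate_string(const std::string& source) {
        storage_.push_back(source);      // the pool owns its own copy
        return storage_.back().c_str();
    }

private:
    std::deque<std::string> storage_;
};

// Toy node that, like the assumed RapidXml behaviour, stores only a pointer.
struct ToyNode {
    const char* value = "";
};

// Copy the text into the pool first, then point the node at the pooled copy.
inline void set_value_copied(StringPool& pool, ToyNode& node, const std::string& text) {
    node.value = pool.allocate_string(text);
}
```

With this, the node's value stays readable after the std::string it was set from has gone out of scope, because the pool keeps the copy alive. In the real library the equivalent would presumably be something like node->value(doc.allocate_string(text.c_str())). The string literals passed to allocate_node() in the snippet above are fine as they are, since literals live for the whole program.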
    

Putting it together:

#include <rapidxml/rapidxml.hpp>
#include <rapidxml/rapidxml_print.hpp>
#include <rapidxml/rapidxml_utils.hpp>
#include <iostream>
#include <string>

namespace {
    std::string cases[] = {
        R"(<root><metadata>Trying to change this</metadata></root>)",
        R"(<root><metadata><surprise/></metadata></root>)",
        R"(<root><metadata>mixed<surprise/>bag</metadata></root>)",
        R"(<root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root>)",
        R"(<root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root>)",
        R"(<root>
<metadata>Trying to change this</metadata>
<body>
<salad>Greek Caesar</salad>
</body>
</root>)",
        // These no longer crash:
        R"(<root><metadata></metadata></root>)",
        R"(<root><metadata/></root>)",
        // more edge-cases in the predecessor chain
        R"(<root></root>)",
        R"(<root><no-metadata/></root>)",
        R"(<bogus/>)",
    };
}

int main() {
    auto const fmt = rapidxml::print_no_indenting;

    for (auto& xml : cases) {
        std::cout << "Input: " << xml << "\n";

        rapidxml::xml_document<> doc;
        // Needs C++17, where std::string::data() returns a mutable char*
        doc.parse<0>(xml.data());

        if (auto root = doc.first_node()) {
            if (root->name() == std::string("root")) {
                // Remove the old <metadata> only if it really is the first child
                if (auto* old_meta = root->first_node()) {
                    if (old_meta->name() == std::string("metadata")) {
                        root->remove_first_node();
                    } else {
                        std::cout << "WARNING: Not removing '" << old_meta->name() << "' element where 'metadata' expected\n";
                    }
                }
                // Allocate the replacement from the document's own pool
                auto newmeta = doc.allocate_node(rapidxml::node_element, "metadata", "changed");
                root->prepend_node(newmeta);
            } else {
                std::cout << "WARNING: '" << root->name() << "' found where 'root' expected\n";
            }
        }

        print(std::cout << "Output: ", doc, fmt);
        std::cout << "\n--\n";
    }
}

Prints:

Input: <root><metadata>Trying to change this</metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata><surprise/></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata>mixed<surprise/>bag</metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata><![CDATA[mixed<surprise/>bag]]></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata><!-- comment please -->outloud<!-- hidden --></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root>
<metadata>Trying to change this</metadata>
<body>
<salad>Greek Caesar</salad>
</body>
</root> ===
Output: <root><metadata>changed</metadata><body><salad>Greek Caesar</salad></body></root>
--
Input: <root><metadata></metadata></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><metadata/></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root></root> ===
Output: <root><metadata>changed</metadata></root>
--
Input: <root><no-metadata/></root> ===
WARNING: Not removing 'no-metadata' element where 'metadata' expected
Output: <root><metadata>changed</metadata><no-metadata/></root>
--
Input: <bogus/> ===
WARNING: 'bogus' found where 'root' expected
Output: <bogus/>
--

Summary

XML is eXtensible. It's Markup. It's a Language. It's not simple :(