Splitting std::string and inserting into a std::set

Keywords: std, set, string, split, insert. Last updated: 2023-10-16

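At the request of the fantastic fellas in the C++ chat Lounge: what is a good way to break down a file (which in my case contains a string of roughly 100 lines, with about 10 words on each line) and insert all of those words into a std::set?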

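The easiest way to construct any container from a source that holds a sequence of its elements is to use the constructor that takes a pair of iterators. Use istream_iterator to iterate over a stream.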

#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
int main()
{
  //I create an iterator that retrieves `string` objects from `cin`
  auto begin = istream_iterator<string>(cin);
  //I create an iterator that represents the end of a stream
  auto end = istream_iterator<string>();
  //and iterate over the file, and copy those elements into my `set`
  set<string> myset(begin, end);
  //this line copies the elements in the set to `cout`
  //I have this to verify that I did it all right
  copy(myset.begin(), myset.end(), ostream_iterator<string>(cout, "\n"));
  return 0;
}

http://ideone.com/iz1q0
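
The same iterator-pair constructor also works when the text is already held in a std::string rather than arriving on cin. The following is a minimal sketch, assuming the words are separated by whitespace. The function name words_from_string is made up for illustration, and std::istringstream stands in for the input stream.

```cpp
#include <iterator>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: splits `text` on whitespace and returns the unique words.
std::set<std::string> words_from_string(const std::string& text)
{
    std::istringstream in(text);  // wrap the string so it can be read like a stream
    // `first` reads words from `in`; a default-constructed iterator marks end-of-stream.
    std::istream_iterator<std::string> first(in), last;
    return std::set<std::string>(first, last);
}
```

Calling words_from_string on a multi-line string returns each distinct word once, in sorted order, because that is how std::set stores its elements.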

Assuming you have already read your file into a string, boost::split will do the trick:

#include <iostream>
#include <set>
#include <string>
#include <boost/foreach.hpp>
#include <boost/algorithm/string.hpp>
int main()
{
    std::string astring = "abc 123 abc 123\ndef 456 def 456";  // your string
    std::set<std::string> tokens;                               // this will receive the words
    boost::split(tokens, astring, boost::is_any_of("\n "));    // split on space & newline
    // Print the individual words
    BOOST_FOREACH(const std::string& token, tokens){
        std::cout << "\n" << token << std::endl;
    }
    return 0;
}

If necessary, you can use a list or a vector instead of a set.

Also note that this is almost a dupe of: Split a string in C++?

#include <set>
#include <iostream>
#include <string>
int main()
{
  std::string temp, mystring;
  std::set<std::string> myset;
  while(std::getline(std::cin, temp))
      mystring += temp + ' ';
  temp = "";      
  for (size_t i = 0; i < mystring.length(); i++)
  {
    if (mystring.at(i) == ' ' || mystring.at(i) == '\n' || mystring.at(i) == '\t')
    {
      if (!temp.empty())   // consecutive delimiters would otherwise insert an empty string
        myset.insert(temp);
      temp = "";
    }
    else
    {
      temp.push_back(mystring.at(i));
    }
  }
  if (!temp.empty())   // add the last word, if any
    myset.insert(temp);
  for (std::set<std::string>::iterator i = myset.begin(); i != myset.end(); i++)
  {
    std::cout << *i << std::endl;
  }
  return 0;
}

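Let's start at the top. First off, you need a few variables to work with. temp is just a placeholder for the word while you build it up character by character from the string you are parsing. mystring is the string you want to split, and myset is where you will put the split strings.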

Next we read the file (fed in through < redirection) and append its contents to mystring.

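Now we iterate over the length of the string, looking for spaces, newlines, or tabs at which to split it. If we find one of those characters, we insert the placeholder string into the set and empty the placeholder; otherwise, we append the character to the placeholder, which builds up the word. Once the loop is finished, we need to add the last word to the set.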

Finally, we iterate over the set and print each string. This is only for verification, but it could be useful otherwise.
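
The scanning steps described above can also be sketched as a standalone function. This is only an illustration: the name split_on_whitespace is hypothetical, and it assumes that empty tokens (from consecutive delimiters) should be skipped rather than stored.

```cpp
#include <set>
#include <string>

// Hypothetical helper: scans `text` once and inserts every non-empty run of
// non-delimiter characters into the returned set.
std::set<std::string> split_on_whitespace(const std::string& text)
{
    std::set<std::string> words;
    std::string current;  // the word currently being built
    for (char c : text)
    {
        if (c == ' ' || c == '\n' || c == '\t')
        {
            if (!current.empty())  // consecutive delimiters yield no word
                words.insert(current);
            current.clear();
        }
        else
        {
            current.push_back(c);
        }
    }
    if (!current.empty())  // last word, when the text does not end in a delimiter
        words.insert(current);
    return words;
}
```

Taking the text as a parameter keeps the parsing separate from the I/O, so the same function can be used on a string read from a file or built in memory.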

Edit: Loki Astari suggested a significant improvement to my code in the comments, which I think should be integrated into the answer:

#include <set>
#include <iostream>
#include <string>
#include <utility>
int main()
{
  std::set<std::string> myset;
  std::string word;
  while(std::cin >> word)
  {
      myset.insert(std::move(word));
  }
  for(std::set<std::string>::const_iterator it=myset.begin(); it!=myset.end(); ++it)
    std::cout << *it << '\n';
}