How do I check a set of data on a set of rules / conditions for categorization?

Updated: 2023-10-16

I have a set of bank account entries, stored in instances of a class bankAccountEntry that I defined. The class bankAccountEntry has the data members

unsigned int year;
unsigned int month;
unsigned int day;
std::string name;
std::string accountNumberConsidered;
std::string  accountNumberContra;
std::string code;
double amount;
std::string sortOfMutation;
std::string note;

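I want to categorize these bank account entries.

For example, if std::string name contains the substring gasolineStation1 or gasolineStation2, the entry should be categorized under gasoline. To implement this categorization I could, for example, check the data member with the statement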

if (std::count(bae.name.begin(), bae.name.end(),"gasolineStation1")>0
|| 
std::count(bae.name.begin(), bae.name.end(),"gasolineStation2")>0)
{
bae.setCategory("gasoline");
}

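To categorize all my bank account entries, I have a large set of such predefined rules/conditions, which I would like to pass to the main program as an input parameter.

What strategies are there for checking each of my bank account entries against a set of rules/conditions until a hit is found?

If, and it's a big if here, all the rules are simple name-to-category mappings, this can be done fairly cleanly. If the rules vary... yuck.

Looking only at the simple case for now:

For ease of reading and explanation, define: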

struct mapping
{
    std::string name;
    std::string category;
};

Using std::pair<std::string, std::string> instead may have tactical advantages. Then define

std::vector<mapping> mappings;

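Read the name/category pairs from the rules file into mappings. I can't give any advice on that, because we don't know what the rules look like. Once that's done,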

bool bankAccountEntry::categorize()
{
    for (const mapping & kvp: mappings)
    {
        if (name.find(kvp.name) != std::string::npos)
        {
            setCategory(kvp.category);
            return true;
        }
    }
    return false;
}

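This is brute force. Depending on what the data looks like, for example if it strictly follows a naming scheme, you may well be able to speed this up.

If the rules are more complex, you end up with something more like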

struct mapping
{
    std::function<bool(const bankAccountEntry&)> rule;
    std::string category;
};

and

std::vector<mapping> mappings;

Each mapping::rule is a function that takes a bankAccountEntry and decides whether that bankAccountEntry matches the rule. For example:

bool gasolineStationRule(const bankAccountEntry& bae)
{
return std::count(bae.name.begin(), bae.name.end(),"gasolineStation1")>0 ||       
std::count(bae.name.begin(), bae.name.end(),"gasolineStation2")>0;
}

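That doesn't work, because std::count doesn't work that way: it compares each element of the range (here, each char) with a single value, so it cannot search for a substring.

Something like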
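To make the difference concrete, here is a small sketch of what each call can do. The helper names countChar and containsSubstring are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// std::count compares each element with one value, so on a std::string
// it can only count occurrences of a single character.
std::ptrdiff_t countChar(const std::string& s, char c)
{
    return std::count(s.begin(), s.end(), c);
}

// A substring test needs std::string::find instead.
bool containsSubstring(const std::string& s, const std::string& needle)
{
    return s.find(needle) != std::string::npos;
}
```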

bool gasolineStationRule(const bankAccountEntry& bae)
{
return (bae.name.find("gasolineStation1")!= std::string::npos) ||
(bae.name.find("gasolineStation2")!= std::string::npos);
}

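would, but it could be improved by searching once for "gasolineStation" and then, if "gasolineStation" is found, testing whether the character that follows is '1' or '2'.

How the rules get into the vector will be very interesting. It may take a horde of specialized functions, an army of lambdas, or a pair of partridges in a pear tree. There isn't enough specified in the question to be sure.

It might look like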
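A sketch of that improvement, assuming a trimmed-down stand-in for bankAccountEntry. One deviation from a strict "search once": the loop continues after a hit that is not followed by '1' or '2', so that a name such as "gasolineStation3 gasolineStation1" still matches, just as it would with the two separate find calls.

```cpp
#include <string>

// Trimmed-down stand-in for the real bankAccountEntry (assumption).
struct bankAccountEntry
{
    std::string name;
};

// Searches for the common prefix, then checks the character right after it.
bool gasolineStationRule(const bankAccountEntry& bae)
{
    static const std::string prefix = "gasolineStation";
    std::string::size_type pos = bae.name.find(prefix);
    while (pos != std::string::npos)
    {
        const std::string::size_type next = pos + prefix.size();
        if (next < bae.name.size() &&
            (bae.name[next] == '1' || bae.name[next] == '2'))
        {
            return true;
        }
        // This occurrence did not qualify; look for a later one.
        pos = bae.name.find(prefix, pos + 1);
    }
    return false;
}
```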

mappings.push_back(mapping{&gasolineStationRule, "gasoline"});

or, by adding a constructor to mapping,

mapping(std::function<bool(const bankAccountEntry&)> newrule,
std::string newcategory): rule(newrule), category(newcategory)
{
}

you might get a small performance improvement from

mappings.emplace_back(&gasolineStationRule, "gasoline");

Anyway...
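If the rules do end up as an army of lambdas, a small factory function can keep the army manageable. The following is only a sketch: nameContainsAny and makeExampleMappings are hypothetical helpers, the "large" rule is invented to show a rule on a different data member, and bankAccountEntry is again a trimmed-down stand-in.

```cpp
#include <functional>
#include <string>
#include <vector>

// Trimmed-down stand-in for the real bankAccountEntry (assumption).
struct bankAccountEntry
{
    std::string name;
    double amount;
};

struct mapping
{
    std::function<bool(const bankAccountEntry&)> rule;
    std::string category;
};

// Hypothetical helper: builds a rule that matches when the entry's name
// contains any of the given substrings.
std::function<bool(const bankAccountEntry&)>
nameContainsAny(std::vector<std::string> needles)
{
    return [needles](const bankAccountEntry& bae) -> bool
    {
        for (const std::string& needle : needles)
        {
            if (bae.name.find(needle) != std::string::npos)
                return true;
        }
        return false;
    };
}

// Builds an example rule set; the rules themselves are made up.
std::vector<mapping> makeExampleMappings()
{
    std::vector<mapping> mappings;
    mappings.push_back(mapping{
        nameContainsAny({"gasolineStation1", "gasolineStation2"}), "gasoline"});
    // A rule on a different data member, written as a plain lambda.
    mappings.push_back(mapping{
        [](const bankAccountEntry& bae) { return bae.amount > 1000.0; }, "large"});
    return mappings;
}
```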

bool bankAccountEntry::categorize()
{
    for (const mapping & kvp: mappings)
    {
        if (kvp.rule(*this))
        {
            setCategory(kvp.category);
            return true;
        }
    }
    return false;
}

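Again, the more you know about the rules and the more predictable they are, the more you can optimize.

Also consider std::find_if as a possible replacement for the guts of bankAccountEntry::categorize.

Documentation on std::function.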
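A sketch of what the std::find_if variant might look like. Here the rules are passed in as a parameter, which is an assumption made to keep the example self-contained; bankAccountEntry is a trimmed-down stand-in for the real class.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

struct bankAccountEntry;

struct mapping
{
    std::function<bool(const bankAccountEntry&)> rule;
    std::string category;
};

// Trimmed-down stand-in for the real bankAccountEntry (assumption).
struct bankAccountEntry
{
    std::string name;
    std::string category;
    void setCategory(const std::string& c) { category = c; }
    bool categorize(const std::vector<mapping>& mappings);
};

// Finds the first mapping whose rule accepts this entry and applies its category.
bool bankAccountEntry::categorize(const std::vector<mapping>& mappings)
{
    const auto hit = std::find_if(mappings.begin(), mappings.end(),
        [this](const mapping& kvp) { return kvp.rule(*this); });
    if (hit == mappings.end())
        return false;
    setCategory(hit->category);
    return true;
}
```

The behavior matches the hand-written loop: std::find_if stops at the first rule that returns true, so earlier rules take precedence.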