Automatically decide which class to use for data processing

Keywords: data processing, use, decide. Last updated: 2023-10-16

I have a large project and have run into a problem that can be stated briefly as follows:

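I have a class that is created temporarily to process and modify some data (let's call it a "worker"). I now have two workers and two corresponding data formats. The data array can contain mixed data. How can I make my program decide automatically which worker class to create and use for each piece of data? What is the best way to do this?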

To illustrate the problem, I wrote a small example program that resembles my project.

#include <iostream>
#include <vector>
using namespace std;
const int NInputs = 10;

struct TOutput {
  int i;
};
// The input and worker types come first: TProcess stores a vector<TInput>
// and copies it in its constructor, so it needs the complete type.
#if 0
struct TInput {
  int i;
};
// Worker variant 1: sums the inputs.
class TWorker{
 public:
  void Init( int i ) { fResult = i; }
  void Add( int i ) { fResult += i; }
  int  Result() { return fResult; }
 private:
  int fResult;
};
#else
struct TInput {
  int i;
};
// Worker variant 2: XORs the inputs.
class TWorker {
 public:
  void Init( int i ) { fResult = i; }
  void Add( int i ) { fResult ^= i; }
  int  Result() { return fResult; }
 private:
  int fResult;
};
#endif
class TProcess {
  public:
  TProcess( const vector<TInput>& i ){ fInput = i; }
  void Run();
  void GetOutput( TOutput& o ) { o = fOutput; }
  private:
  vector<TInput> fInput;
  TOutput fOutput;
};
void TProcess::Run() {
  TWorker worker;
  worker.Init(0);
  for( size_t i = 0; i < fInput.size(); ++i )
    worker.Add(fInput[i].i);
  fOutput.i = worker.Result();
}
int main()  {
  vector<TInput> input(NInputs);
  for  ( int i = 0; i < NInputs; i++ ) {
    input[i].i = i;
  }
  TProcess proc(input);
  proc.Run();
  TOutput output;
  proc.GetOutput(output);
  cout << output.i << endl;
}

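The example is very simple, but that does not mean it can simply be collapsed into a single function, because it stands in for a large project. Therefore it is not possible to: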

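delete classes or functions that already exist (but they can be modified, and new ones can be created)
make the worker static or create only one copy of it (each worker is a temporary inside many complex functions and loops)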

So how can it be modified to look like this:

 // TODO: TProcess declaration
struct TInput1 {
  int i;
};
class TWorker1{
 public:
  void Init( TInput1 i ) { fResult = i.i; }
  void Add( TInput1 i ) { fResult += i.i; }
  int  Result() { return fResult; } 
 private:
  int fResult;
};
struct TInput2 {
  int i;
};
class TWorker2 {
 public:
  void Init( TInput2 i ) { fResult = i.i; }
  void Add( TInput2 i ) { fResult ^= i.i; }
  int  Result() { return fResult; } 
 private:
  int fResult;
};
void TProcess::Run() { 
  for( int i = 0; i < fInput.size(); ++i ) {
    // TODO: choose and create a worker
    worker.Add(fInput[i].i);
    // TODO: get and save result
  }
  fOutput.i = worker.Result();
}
int main()  {
  vector<TInputBase> input(NInputs);
  // TODO: fill input
  TProcess proc(input);
  proc.Run();
  TOutput output;
  proc.GetOutput(output);
  cout << output.i << endl;
}

My initial idea was to use a base class and template functions, but virtual functions cannot be templates...

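You have the right idea with the vector<TInputBase> declaration in your second example: you need a common base class for all inputs, and likewise one for all workers. (The vector itself has to hold pointers to the base class, because storing derived objects by value would slice them.)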

class TInput {
public:
  virtual ~TInput() {}   // polymorphic base, deleted through base pointers
};
class TInput1 : public TInput { ... };
class TInput2 : public TInput { ... };
class TWorker {
public:
  virtual ~TWorker() {}
  virtual void Init(TInput *input) = 0;
  virtual void Add(TInput *input) = 0;
  virtual int Result() = 0;
};
class TWorker1 : public TWorker { ... };
class TWorker2 : public TWorker { ... };

Note, however, that this means every worker can take only a TInput * as input, and each worker class has to cast it to the correct input class.
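
To make the cast concrete, here is a minimal sketch of what the two workers could look like. It assumes C++11, gives each input a constructor for convenience, takes the input as const TInput* rather than TInput*, and uses a private Cast helper; all of these are illustrative choices, not part of the original code. The sum and XOR operations are taken from the question.

```cpp
#include <cassert>

// Common base for every input format; the virtual destructor makes it
// polymorphic, which dynamic_cast requires.
struct TInput {
  virtual ~TInput() {}
};
struct TInput1 : TInput {
  int i;
  explicit TInput1(int v) : i(v) {}
};
struct TInput2 : TInput {
  int i;
  explicit TInput2(int v) : i(v) {}
};

// Common worker interface: every worker only ever sees a TInput pointer.
class TWorker {
 public:
  virtual ~TWorker() {}
  virtual void Init(const TInput* input) = 0;
  virtual void Add(const TInput* input) = 0;
  virtual int Result() const = 0;
};

// Worker for TInput1: sums the values.
class TWorker1 : public TWorker {
 public:
  void Init(const TInput* input) override { fResult = Cast(input)->i; }
  void Add(const TInput* input) override { fResult += Cast(input)->i; }
  int Result() const override { return fResult; }
 private:
  // Downcast to the only format this worker understands (hypothetical helper).
  static const TInput1* Cast(const TInput* input) {
    const TInput1* p = dynamic_cast<const TInput1*>(input);
    assert(p != nullptr && "TWorker1 was handed something other than a TInput1");
    return p;
  }
  int fResult = 0;
};

// Worker for TInput2: XORs the values.
class TWorker2 : public TWorker {
 public:
  void Init(const TInput* input) override { fResult = Cast(input)->i; }
  void Add(const TInput* input) override { fResult ^= Cast(input)->i; }
  int Result() const override { return fResult; }
 private:
  static const TInput2* Cast(const TInput* input) {
    const TInput2* p = dynamic_cast<const TInput2*>(input);
    assert(p != nullptr && "TWorker2 was handed something other than a TInput2");
    return p;
  }
  int fResult = 0;
};
```

Here a mismatched input is treated as a programming error and caught by assert in debug builds. If mixed-up inputs can occur at run time, throwing an exception instead may be the safer choice.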

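The simplest way to decide which worker class to use for a given input is to ask the input itself! You can give the input class a virtual function that creates the right kind of worker: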

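class TInput {
public:
  virtual ~TInput() {}
  virtual TWorker *createWorker() = 0;
};
class TInput1 : public TInput {
public:
  TWorker *createWorker() {
    return new TWorker1();   // the caller owns the returned worker
  }
};
class TInput2 : public TInput {
public:
  TWorker *createWorker() {
    return new TWorker2();   // the caller owns the returned worker
  }
};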
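
Putting the pieces together, the loop in TProcess::Run could look roughly like the sketch below. It is only an illustration under several assumptions: C++11, std::unique_ptr in place of the raw pointers above so that each temporary worker is released automatically, references instead of pointers in the worker interface, and a hypothetical free function RunMixed standing in for TProcess::Run. Feeding each element to its worker twice (Init, then Add) is there only to make the two workers produce visibly different results.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct TInput;  // forward declaration: the worker interface only needs references

class TWorker {
 public:
  virtual ~TWorker() {}
  virtual void Init(const TInput& input) = 0;
  virtual void Add(const TInput& input) = 0;
  virtual int Result() const = 0;
};

struct TInput {
  virtual ~TInput() {}
  // Each input format knows which worker class can process it.
  virtual std::unique_ptr<TWorker> createWorker() const = 0;
};
struct TInput1 : TInput {
  int i;
  explicit TInput1(int v) : i(v) {}
  std::unique_ptr<TWorker> createWorker() const override;
};
struct TInput2 : TInput {
  int i;
  explicit TInput2(int v) : i(v) {}
  std::unique_ptr<TWorker> createWorker() const override;
};

// Sums TInput1 values; dynamic_cast on a reference throws std::bad_cast on a mismatch.
class TWorker1 : public TWorker {
 public:
  void Init(const TInput& in) override { fResult = dynamic_cast<const TInput1&>(in).i; }
  void Add(const TInput& in) override { fResult += dynamic_cast<const TInput1&>(in).i; }
  int Result() const override { return fResult; }
 private:
  int fResult = 0;
};
// XORs TInput2 values.
class TWorker2 : public TWorker {
 public:
  void Init(const TInput& in) override { fResult = dynamic_cast<const TInput2&>(in).i; }
  void Add(const TInput& in) override { fResult ^= dynamic_cast<const TInput2&>(in).i; }
  int Result() const override { return fResult; }
 private:
  int fResult = 0;
};

std::unique_ptr<TWorker> TInput1::createWorker() const {
  return std::unique_ptr<TWorker>(new TWorker1());
}
std::unique_ptr<TWorker> TInput2::createWorker() const {
  return std::unique_ptr<TWorker>(new TWorker2());
}

// Hypothetical stand-in for the loop in TProcess::Run:
// one temporary worker per element, chosen by the element itself.
std::vector<int> RunMixed(const std::vector<std::unique_ptr<TInput>>& inputs) {
  std::vector<int> results;
  for (const auto& in : inputs) {
    std::unique_ptr<TWorker> worker = in->createWorker();
    assert(worker && "createWorker must not return null");
    worker->Init(*in);
    worker->Add(*in);
    results.push_back(worker->Result());
  }
  return results;
}
```

The loop never names TWorker1 or TWorker2, so adding a third format means adding one input class and one worker class without touching the processing code.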

If that is not possible for some reason, you can use typeid to determine the type of the input and create the corresponding worker instance.
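
A minimal sketch of the typeid variant follows. The factory function MakeWorker is a hypothetical name, the workers are empty stubs, and std::unique_ptr (C++11) is assumed for ownership. Note that typeid reports the dynamic type only when the base class is polymorphic, which is why TInput has a virtual destructor here.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

// The base must have at least one virtual function, otherwise typeid(*input)
// would always report TInput.
struct TInput { virtual ~TInput() {} };
struct TInput1 : TInput { int i = 0; };
struct TInput2 : TInput { int i = 0; };

// Stub workers, just enough to show the selection.
class TWorker { public: virtual ~TWorker() {} };
class TWorker1 : public TWorker {};
class TWorker2 : public TWorker {};

// Hypothetical factory: picks the worker class from the dynamic type of the input.
// Returns an empty pointer for a format it does not know.
std::unique_ptr<TWorker> MakeWorker(const TInput* input) {
  assert(input != nullptr);  // typeid on a null polymorphic pointer would throw
  const std::type_info& type = typeid(*input);
  if (type == typeid(TInput1)) return std::unique_ptr<TWorker>(new TWorker1());
  if (type == typeid(TInput2)) return std::unique_ptr<TWorker>(new TWorker2());
  return std::unique_ptr<TWorker>();
}
```

The comparison is exact: a class derived from TInput1 would not match typeid(TInput1), and every new format needs another branch in the factory. That is the main reason to prefer the virtual createWorker approach where the input classes can be modified.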