Efficient way to access submatrices with single index

Keywords: efficient way, single index, access. Last updated: 2023-10-16

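I am trying to perform strided access to a submatrix using a single index. I need this for a library I am writing that uses expression templates. I have worked out the following class, in which the access is performed by the overloaded operator[]: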

template <class A, class Type>
class SubMatrixExpr
{
    private:
        int Rows_;                      // Rows of the SubMatrix
        int Columns_;                   // Columns of the SubMatrix
        int Rows_up_;                   // Rows of the original Matrix
        int Columns_up_;                // Columns of the original Matrix
        int a_, c_;                     // Starting indices of the SubMatrix as evaluated in the original Matrix
        int rowstep_, columnstep_;      // Stride along rows and columns for the original matrix
        A M_;                           // Underlying matrix (or expression) being viewed
    public:
        // Initializers are listed in declaration order, which is the order in which they run
        SubMatrixExpr(A &M, int Rows_up, int Columns_up, int Rows, int Columns, int a, int rowstep, int c, int columnstep) :
            Rows_(Rows), Columns_(Columns),
            Rows_up_(Rows_up), Columns_up_(Columns_up),
            a_(a), c_(c),
            rowstep_(rowstep), columnstep_(columnstep),
            M_(M) { }

        inline const Type& operator[](const int i) const
        {
            const int LocalRow = i/Columns_;                        // Row inside the SubMatrix
            const int LocalColumn = i%Columns_;                     // Column inside the SubMatrix
            const int GlobalRow = a_+rowstep_*LocalRow;             // Row in the original Matrix
            const int GlobalColumn = c_+columnstep_*LocalColumn;    // Column in the original Matrix
            return M_[IDX2R(GlobalRow,GlobalColumn,Columns_up_)];
        }

        inline Type& operator[](const int i)
        {
            // Same index computation as the const overload
            const int GlobalRow = a_+rowstep_*(i/Columns_);
            const int GlobalColumn = c_+columnstep_*(i%Columns_);
            return M_[IDX2R(GlobalRow,GlobalColumn,Columns_up_)];
        }
};

where

#define IDX2R(i,j,N) (((i)*(N))+(j))

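The overloaded operator[] works correctly, but it is computationally very expensive: every access performs an integer division and a modulo.

Is there any way to implement the overloaded operator[] more efficiently?

Thanks very much in advance.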

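The only way you might get a speedup is if you know the sizes of the matrix and the submatrix at compile time. Then using template/constexpr might speed things up. For example, if the size is known at compile time to be a power of 2, the compiler will be able to replace the division with a shift.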
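
As a rough illustration of that suggestion, here is a minimal sketch in which the submatrix column count is a template parameter instead of a runtime member. The class name FixedSubMatrixExpr, the Columns template parameter, and the choice to hold the matrix by const reference are all assumptions made for this example and do not appear in the question. With a constant divisor the compiler can strength-reduce i / Columns and i % Columns; when Columns is a power of 2 this typically becomes a shift and a mask (with a small sign correction for a signed int index). Whether this is measurably faster depends on the compiler and the access pattern, so it is worth benchmarking.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant of SubMatrixExpr: the number of submatrix columns is
// a compile-time constant, so the divisor in operator[] is known to the compiler.
template <class A, class Type, int Columns>
class FixedSubMatrixExpr
{
    static_assert(Columns > 0, "Columns must be positive");

    private:
        const A &M_;                // Original matrix (assumption: it outlives this view)
        int Columns_up_;            // Columns of the original Matrix
        int a_, c_;                 // Starting row and column in the original Matrix
        int rowstep_, columnstep_;  // Strides along rows and columns

    public:
        FixedSubMatrixExpr(const A &M, int Columns_up, int a, int rowstep, int c, int columnstep) :
            M_(M), Columns_up_(Columns_up),
            a_(a), c_(c),
            rowstep_(rowstep), columnstep_(columnstep) { }

        inline const Type& operator[](const int i) const
        {
            const int LocalRow = i / Columns;       // Constant divisor
            const int LocalColumn = i % Columns;    // Constant divisor
            const int GlobalRow = a_ + rowstep_ * LocalRow;
            const int GlobalColumn = c_ + columnstep_ * LocalColumn;
            return M_[GlobalRow * Columns_up_ + GlobalColumn];  // Row-major, same as IDX2R
        }
};
```

For instance, with a 4 x 8 row-major std::vector<double> m, a view declared as FixedSubMatrixExpr<std::vector<double>, double, 2> s(m, 8, 1, 2, 0, 4) would visit rows 1 and 3 and columns 0 and 4 of m. The trade-off is that every distinct column count instantiates a separate type, which may or may not suit an expression-template library.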