使用犰狳根据行和列索引从矩阵中提取元素

Extract elements from a matrix based on the row and column indices with Armadillo

本文关键字：索引元素提取更新时间：2023-10-16

在 R 中，我可以根据矩阵元素的索引提取矩阵元素，如下所示

> m <- matrix(1:6, nrow = 3)
> m
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> row_index <- c(1, 2)
> col_index <- c(2, 2)
> m[cbind(row_index, col_index)]
[1] 4 5

有没有一种本地方法是犰狳/Rcpp：：犰狳？我能做的最好的事情是使用行索引和列索引来计算元素索引的自定义函数（见下文）。我最担心自定义函数的性能不佳。

#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
NumericVector Rsubmatrix(arma::uvec rowInd, arma::uvec colInd, arma::mat m) {
  arma::uvec ind = (colInd - 1) * m.n_rows + (rowInd - 1);
  arma::vec ret = m.elem(ind);
  return wrap(ret);
}
/*** R
Rsubmatrix(row_index, col_index, m)
/

从文档中：

X.submat( vector_of_row_indices, vector_of_column_indices )

但这似乎只返回矩阵块。对于非简单连接的区域，我认为您的解决方案是最好的，但您实际上并不需要函数，

m.elem((colInd - 1) * m.n_rows + (rowInd - 1));

返回矢量没有任何问题。为了清楚起见，您可以定义一个函数来处理行+列到索引的转换，

inline arma::uvec arr2ind(arma::uvec c, arma::uvec r, int nrow) 
{ 
  return c * nrow + r;
}
// m.elem(arr2ind(colInd - 1, rowInd - 1, m.n_rows));

让我们试试这个...

特别是，可以通过rowInd和colInd来子集，通过编写自己的循环来使用 .(i,j) 子集运算符。否则，唯一的其他选择是您提出的开始问题的解决方案......

#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
// Optimized OP method
// [[Rcpp::export]]
arma::vec Rsubmatrix(const arma::mat& m, const arma::uvec& rowInd, const arma::uvec& colInd) {
  return  m.elem((colInd - 1) * m.n_rows + (rowInd - 1));
}
// Proposed Alternative
// [[Rcpp::export]]
arma::rowvec get_elements(const arma::mat& m, const arma::uvec& rowInd, const arma::uvec& colInd){
  unsigned int n = rowInd.n_elem;
  arma::rowvec out(n);
  for(unsigned int i = 0; i < n; i++){
    out(i) = m(rowInd[i]-1,colInd[i]-1);
  }
  return out;
}

哪里：

m <- matrix(1:6, nrow = 3) 
row_index <- c(1, 2)
col_index <- c(2, 2)
m[cbind(row_index, col_index)]

给：

[1] 4 5

我们有：

get_elements(m, row_index, col_index)

给：

     [,1] [,2]
[1,]    4    5

编辑

微基准：

microbenchmark(Rsubmatrix(m, row_index, col_index), get_elements(m, row_index, col_index), times = 1e4)

给：

Unit: microseconds
                                  expr   min    lq     mean median    uq      max neval
   Rsubmatrix(m, row_index, col_index) 2.836 3.111 4.129051  3.281 3.502 5016.652 10000
 get_elements(m, row_index, col_index) 2.699 2.947 3.436844  3.115 3.335  716.742 10000

这些方法都是接近时间的。请注意，后者应该更好，因为它避免了两个单独的循环（1.计算和2.子集）和一个额外的临时向量来创建来存储结果。

编辑

根据犰狳7.200.0发布，sub2ind()函数已获得采用矩阵符号的能力。此函数通过2 x n矩阵获取矩阵下标，其中n表示要子集的元素数，并将它们转换为元素表示法。

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::rowvec matrix_locs(arma::mat M, arma::umat locs) {
    arma::uvec eids = sub2ind( size(M), locs ); // Obtain Element IDs
    arma::vec v  = M.elem( eids );              // Values of the Elements
    return v.t();                               // Transpose to mimic R
}

在 R 中调用：

cpp_locs <- locs - 1       # Shift indices from R to C++
(cpp_locs <- t(cpp_locs))  # Transpose matrix for 2 x n form
matrix_locs(M, cpp_locs)   # Subset the matrix