LAPACK函数在第一次迭代后变慢

LAPACK function gets slower after first iteration

本文关键字:迭代 第一次 函数 LAPACK      更新时间:2023-10-16

我正在实现一个迭代算法,该算法使用LAPACK进行PSD投影(其实并不重要,关键是我反复调用这个函数(:

void useLAPACK(vector<double>& x, int N){
/* Locals */
int n = N, il, iu, m, lda = N, ldz = N, info, lwork, liwork;
double abstol;
double vl,vu;
int iwkopt;
int* iwork;
double wkopt;
double* work;
/* Local arrays */
int isuppz[N];
double w[N], z[N*N];
/* Negative abstol means using the default value */
abstol = -1.0;
/* Set il, iu to compute NSELECT smallest eigenvalues */
vl = 0;
vu = 1.79769e+308;
/* Query and allocate the optimal workspace */
lwork = -1;
liwork = -1;
dsyevr_( (char*)"Vectors", (char*)"V", (char*)"Upper", &n, &x[0], &lda, &vl, &vu, &il, &iu,
&abstol, &m, w, z, &ldz, isuppz, &wkopt, &lwork, &iwkopt, &liwork,
&info );
lwork = (int)wkopt;
work = (double*)malloc( lwork*sizeof(double) );
liwork = iwkopt;
iwork = (int*)malloc( liwork*sizeof(int) );
/* Solve eigenproblem */
dsyevr_( (char*)"Vectors", (char*)"V", (char*)"Upper", &n, &x[0], &lda, &vl, &vu, &il, &iu,
&abstol, &m, w, z, &ldz, isuppz, work, &lwork, iwork, &liwork,
&info );
/* Check for convergence */
if( info > 0 ) {
printf( "The dsyevr (useLAPACK) failed to compute eigenvalues.n" );
exit( 1 );
}
/* Print the number of eigenvalues found */
//printf( "n The total number of eigenvalues found:%2in", m );
//print_matrix( "Selected eigenvalues", 1, m, w, 1 );
//print_matrix( "Selected eigenvectors (stored columnwise)", n, m, z, ldz );
//Eigenvectors are returned as stacked columns (in total m)
//Outer sum calculation is fastest.
for(int i = 0; i < N*N; ++i) x[i] = 0;
double lambda;
double vrow1,vrow2;
for(int col = 0; col < m; ++col) {
lambda = w[col];
for (int row1 = 0; row1 < N; ++row1) {
vrow1 = z[N*col+row1];
for(int row2 = 0; row2 < N; ++row2){
vrow2 = z[N*col+row2];
x[row1*N+row2]  += lambda*vrow1*vrow2;
}
}
}
free( (void*)iwork );
free( (void*)work );
}

现在,我的时间测量显示,第一次通话大约需要4ms,但后来增加到100ms。在这个代码中对此有一个很好的解释吗?x每次都是相同的向量。

我想我已经解决了这个问题。我的算法从零矩阵开始,然后正特征值的数量或多或少是半正半负。dsyevr只计算具有这些自变量的正特征值和相应的特征向量。我想如果所有都为零,就不需要计算任何特征向量,这会使算法更快。感谢您的回答,并对丢失的信息表示歉意。