Sunday, 15 July 2012

performance - Row-wise differences between two large matrices in R




I would like to ask for opinions on how to speed up the following operation.

I have two matrices, say a and b, each with n rows and 3 columns. I want to compare each row vector of a with each row vector of b, that is, take the pairwise difference between all row vectors of the two matrices. The result is an n*n matrix. I then want to apply a function to each element of it, such as the biharm() function I wrote in the example. The problem is that, while small matrices cause no trouble, I need to apply this operation to big matrices such as 1000*3. In the sigm() function I first initialize the result matrix and then fill it with two nested loops. However, this is very slow for big matrices. Does anyone have a thought on how to speed this up? I think apply() could be used, but I cannot figure out the right way. Below is a reproducible example. Thanks in advance for any advice. Best, Paolo.

# r^2 * log(r), where r is the distance between the two vectors
biharm <- function(vec1, vec2) {
  reso <- norm(as.matrix(vec1) - as.matrix(vec2), type = "F")^2 *
    log(norm(as.matrix(vec1) - as.matrix(vec2), type = "F"))
  reso
}

sigm <- function(mat1, mat2 = NULL) {
  tt <- mat1
  if (is.null(mat2)) { yy <- mat1 } else { yy <- mat2 }
  k <- nrow(yy)
  m <- ncol(yy)
  sgmr <- matrix(rep(0, k^2), ncol = k)
  # one biharm() call per pair of rows
  for (i in 1:k) {
    for (j in 1:k) {
      sgmr[i, j] <- biharm(yy[i, ], tt[j, ])
    }
  }
  # identical rows give 0 * log(0) = NaN; set those entries to 0
  sgmr <- replace(sgmr, which(sgmr == "NaN", arr.ind = TRUE), 0)
  return(sgmr)
}

### little matrices example:
a <- matrix(rnorm(30), ncol = 3)
b <- matrix(rnorm(30), ncol = 3)
sigm(a, b)

### big matrices example:
a <- matrix(rnorm(900), ncol = 3)
b <- matrix(rnorm(900), ncol = 3)
sigm(a, b)

This is about 8 times faster on my system.

biharm.new <- function(vec1, vec2) {
  n <- sqrt(sum((vec1 - vec2)^2))   # Euclidean distance
  n^2 * log(n)
}

sigm.new <- function(mat1, mat2 = NULL) {
  tt <- mat1
  if (is.null(mat2)) { yy <- mat1 } else { yy <- mat2 }
  # outer apply() walks the rows of tt, inner apply() the rows of yy
  sgmr <- apply(tt, 1, function(t) apply(yy, 1, biharm.new, t))
  replace(sgmr, which(sgmr == "NaN", arr.ind = TRUE), 0)
}

### big matrices example:
set.seed(1)
a <- matrix(rnorm(900), ncol = 3)
b <- matrix(rnorm(900), ncol = 3)
system.time(result.1 <- sigm(a, b))
#    user  system elapsed
#    6.13    0.00    6.13
system.time(result.2 <- sigm.new(a, b))
#    user  system elapsed
#    0.81    0.00    0.81
all.equal(result.1, result.2)
# [1] TRUE

The use of apply(...) results in about a 3-fold improvement. The rest comes from optimizing biharm(...): since you are calling it 90,000 times in this example (300 × 300 row pairs), it pays to make it as efficient as possible.
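Because the cost is dominated by the number of function calls, a further option (not part of the answer above) is to avoid per-pair calls altogether. The following is a minimal sketch, assuming all squared distances are obtained at once from the identity |y - t|^2 = |y|^2 + |t|^2 - 2 y·t. The name sigm.vec is hypothetical and introduced here only for illustration.

```r
# Hypothetical helper, not from the original answer: builds the same matrix
# as sigm(mat1, mat2) using whole-matrix operations only.
sigm.vec <- function(mat1, mat2 = NULL) {
  tt <- mat1
  yy <- if (is.null(mat2)) mat1 else mat2
  # Squared distance between row i of yy and row j of tt:
  # |y_i|^2 + |t_j|^2 - 2 * (y_i . t_j)
  d2 <- outer(rowSums(yy^2), rowSums(tt^2), "+") - 2 * tcrossprod(yy, tt)
  d2[d2 < 0] <- 0            # clamp tiny negatives caused by rounding
  # n^2 * log(n) with n = sqrt(d2) is the same as 0.5 * d2 * log(d2)
  res <- 0.5 * d2 * log(d2)
  res[is.nan(res)] <- 0      # 0 * log(0) is NaN; map it to 0 as sigm() does
  res
}
```

The subtraction in d2 can lose precision when two rows are nearly identical, so results should be compared against the loop version with all.equal(...) rather than identical(...).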

Note that the Frobenius norm of a vector is just its Euclidean norm, so if that is what you want, use sqrt(sum(x^2)) rather than converting to matrices and using norm(...). The former is much faster.
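The equivalence of the two norms can be checked directly. This is a small sketch, with arbitrary test vectors x and y chosen only for illustration:

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(3)
y <- rnorm(3)
# Frobenius norm of the difference, as computed in biharm()
n.norm <- norm(as.matrix(x) - as.matrix(y), type = "F")
# Euclidean norm of the same difference, as computed in biharm.new()
n.sqrt <- sqrt(sum((x - y)^2))
all.equal(n.norm, n.sqrt)
```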

r performance matrix
