Friday, 15 June 2012

r - How to apply a summary function to multiple columns of a data table -




First off, let's make some fake data:

> library(data.table)
> dt = data.table(x=c('a','a','b','b'),y=c('x','y','x','y'),z=c(1,2,3,4))
> dt
   x y z
1: a x 1
2: a y 2
3: b x 3
4: b y 4
> df<-data.frame(dt)
> df
  x y z
1 a x 1
2 a y 2
3 b x 3
4 b y 4
> cols<-cbind('x','y')
> df[,cols]
  x y
1 a x
2 a y
3 b x
4 b y
> lapply(X=df[,cols],FUN=paste,sep=', ',collapse=', ')
$x
[1] "a, a, b, b"

$y
[1] "x, y, x, y"

This feels like it should be simple. How do I apply this to dt? I'm trying to stick with data.table so that it can run on big data sets (n > 1 mil). The closest I've been able to come has been:

> dt[,lapply(X=list(get(cols)),FUN=paste,sep=', ',collapse=', ')]
           V1
1: a, a, b, b

It's only applying the function to the first of the two columns specified.

One way is to use .SD together with .SDcols:

as.list(dt[, lapply(.SD, paste, collapse = ","), .SDcols = c('x','y')])
#$x
#[1] "a,a,b,b"
#
#$y
#[1] "x,y,x,y"
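As a minimal sketch of the same idea, the column names can be kept in a plain character vector and passed to .SDcols, and a `by` clause can be added to summarise within groups. The names `cols`, `res` and `by_x` are only illustrative, and `cols` is built with `c()` here rather than the `cbind()` used in the question.

```r
library(data.table)

dt <- data.table(x = c('a', 'a', 'b', 'b'),
                 y = c('x', 'y', 'x', 'y'),
                 z = c(1, 2, 3, 4))

# Column names held in a plain character vector (illustrative name)
cols <- c('x', 'y')

# .SD is dt restricted to the columns listed in .SDcols;
# lapply() then calls paste() once per column.
res <- dt[, lapply(.SD, paste, collapse = ', '), .SDcols = cols]

# Same pattern per group: collapse y within each value of x
by_x <- dt[, lapply(.SD, paste, collapse = ', '), by = x, .SDcols = 'y']

print(res)
print(by_x)
```

Because the work happens inside `dt[...]`, no intermediate data.frame copy is needed, which is the part that matters for large tables.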

r data.table
