R - Strange output with ddply when using different variable classes
I'm getting unusual output when using ddply to apply a function to two different variables. It completes the task correctly, but it assumes the format of the output based on whichever variable is named first in c(var1, var2).
All I'm trying to accomplish is grouping the data frame by conversion.id, finding the max date, and checking whether a click happened. I thought it would be simple.
    > class(wrk.ds$intr.date.time)
    [1] "POSIXct" "POSIXt"
    > class(wrk.ds$type.bin)
    [1] "numeric"
    > wrk.ds.1 <- ddply(wrk.ds, .(conversion.id), function(wrk.ds){
    +   click.check = as.numeric(max(wrk.ds$type.bin))
    +   max.intr.date.time = max(wrk.ds$intr.date.time)
    +   c(click.check, max.intr.date.time)})
    > head(wrk.ds.1)
      conversion.id V1         V2
    1  8.930874e+15  1 1406473200
    2  4.266128e+16  0 1407955140
    3  1.241770e+17  0 1409494260
    4  1.309763e+17  1 1407238560
    5  1.367159e+17  1 1408196760
    6  1.417151e+17  0 1409251260
    >
    > # reversing c() order
    > wrk.ds.1 <- ddply(wrk.ds, .(conversion.id), function(wrk.ds){
    +   click.check = as.numeric(max(wrk.ds$type.bin))
    +   max.intr.date.time = max(wrk.ds$intr.date.time)
    +   c(max.intr.date.time, click.check)})
    > head(wrk.ds.1)
      conversion.id                  V1                  V2
    1  8.930874e+15 2014-07-27 16:00:00 1970-01-01 01:00:01
    2  4.266128e+16 2014-08-13 19:39:00 1970-01-01 01:00:00
    3  1.241770e+17 2014-08-31 15:11:00 1970-01-01 01:00:00
    4  1.309763e+17 2014-08-05 12:36:00 1970-01-01 01:00:01
    5  1.367159e+17 2014-08-16 14:46:00 1970-01-01 01:00:01
    6  1.417151e+17 2014-08-28 19:41:00 1970-01-01 01:00:00
My workaround has been to do these in two steps, but I'm more curious to know whether this can be fixed.
I've tried the following, to no avail.
    wrk.ds.1 <- ddply(wrk.ds, .(conversion.id), function(wrk.ds){
      click.check = as.numeric(max(wrk.ds$type.bin))
      max.intr.date.time = max(wrk.ds$intr.date.time)
      c(click.check, as.POSIXct(max.intr.date.time))})
As a bonus question, can anyone tell me why the labels of the newly created variables aren't getting assigned?
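The anonymous function you pass to ddply should return a data.frame; yours is returning a vector. A vector built with c() can carry only one class, and c() dispatches on its first argument, which is why the output format followed whichever variable you named first. The vector is also unnamed, so ddply falls back to the default labels V1 and V2. Change it to this: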
    wrk.ds.1 <- ddply(wrk.ds, .(conversion.id), function(df){
      click.check = max(df$type.bin)
      max.intr.date.time = max(df$intr.date.time)
      data.frame(click.check, max.intr.date.time)})
Of course, you should really use summarise instead:
    wrk.ds.1 <- ddply(wrk.ds, .(conversion.id), summarise,
                      click.check = max(type.bin),
                      max.intr.date.time = max(intr.date.time))
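To see the coercion from the question in isolation, here is a minimal base-R sketch that does not need plyr. The values `click` and `when` are invented for illustration and are not taken from the asker's data. It only shows the numeric-first case, where the date-time class is dropped, and then the data.frame form, which keeps one class per column along with the column names.

```r
# Toy values, invented for illustration (not from the original data set)
click <- 1
when  <- as.POSIXct("2014-07-27 16:00:00", tz = "UTC")

# Numeric first: the default c() method runs, so the POSIXct class is
# dropped and the date-time becomes plain seconds since 1970-01-01
num_first <- c(click, when)
print(class(num_first))

# A data.frame stores each value in its own column, so each keeps its
# own class, and the columns are named after the variables
res <- data.frame(click.check = click, max.intr.date.time = when)
print(sapply(res, function(col) class(col)[1]))
```

Returning a one-row data.frame like `res` from the function passed to ddply is what lets each summary keep its own type and label.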
r class plyr